Part 1: Zero-to-One — Why DevOps Matters
Demystify the buzzword, understand the culture shift, and commit to owning the full build → deploy → observe lifecycle.
1. "Is DevOps a Good Career?"
Reframe DevOps as high-leverage engineering that collapses the wall between builders and operators.
Checkpoints
- Name the r/btechtards anxieties and counter them with the impact of owning production reliability.
- Explain the difference between the old throw-it-over-the-wall model and "you build it, you run it".
- List three business outcomes (speed, reliability, trust) that DevOps teams are accountable for.
2. The "It Works On My Machine" Problem
Understand dependency hell and why reproducible environments birthed the DevOps movement.
Checkpoints
- List three reasons your project runs locally but fails on a friend's laptop.
- Diagram how mismatched runtimes, libraries, and environment variables break deployments.
- Commit to shipping environments alongside code so teammates never chase ghost bugs again.
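One way to make the mismatches above visible is to capture a small "fingerprint" of each machine and diff the two. The sketch below is illustrative only: the function names `environment_fingerprint` and `diff_fingerprints` are made up for this example, and it covers only the Python runtime, OS, and installed Python packages, not environment variables or system libraries.

```python
import json
import platform
from importlib import metadata


def environment_fingerprint():
    """Collect details that commonly differ between two machines."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that report no name
            packages[name.lower()] = dist.version
    return {
        "python": platform.python_version(),
        "os": platform.system(),
        "machine": platform.machine(),
        "packages": dict(sorted(packages.items())),
    }


def diff_fingerprints(mine, theirs):
    """Return {key: (mine, theirs)} for every value that does not match."""
    diffs = {}
    for key in ("python", "os", "machine"):
        if mine.get(key) != theirs.get(key):
            diffs[key] = (mine.get(key), theirs.get(key))
    mine_pkgs = mine.get("packages", {})
    their_pkgs = theirs.get("packages", {})
    for name in sorted(set(mine_pkgs) | set(their_pkgs)):
        if mine_pkgs.get(name) != their_pkgs.get(name):
            diffs[f"package:{name}"] = (mine_pkgs.get(name), their_pkgs.get(name))
    return diffs


if __name__ == "__main__":
    # Save this output on both laptops, then compare the two JSON files.
    print(json.dumps(environment_fingerprint(), indent=2))
```

A diff that shows a different Python version or a missing package is usually the "ghost bug". Shipping the environment with the code is what later tiers of this roadmap use containers for.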
3. Ship Project 0: The DevOps Portfolio Repo
Set up the single GitHub repo that will capture every script, Dockerfile, Terraform module, and postmortem.
Checkpoints
- Create `my-devops-portfolio` with a README that defines DevOps in your own words.
- Outline every tier of this roadmap and commit to logging weekly learnings in the repo.
- Push your first commit (`git commit -m "Initial commit: My DevOps journey begins."`).
Part 2: Tier 0 — Steel-Reinforced Foundations
Learn the Linux, networking, and automation basics that turn "click-ops" into code-driven operations.
4. Linux Command Line Confidence
Treat the terminal as your control room so remote servers feel as friendly as your laptop.
Checkpoints
- Navigate, search, and manipulate files fluently with `ls`, `cd`, `grep`, `find`, `cp`, and `rm`.
- Automate repetitive chores with shell scripts that use pipes and redirection.
- Harden access with permissions, `sudo`, SSH keys, and audit trails.
5. Networking for Humans (Who Ship Systems)
Grasp the language of packets so load balancers, ports, and firewalls stop feeling mysterious.
Checkpoints
- Explain IP, DNS, and ports using the apartment-building analogy without peeking at notes.
- Compare TCP and UDP trade-offs through real-world app examples (web vs gaming).
- Sketch how a firewall rule protects your service and when to allow/deny traffic.
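The TCP/UDP and port checkpoints above can be explored with a few lines of standard-library Python. This is a minimal sketch, not a diagnostic tool: `is_port_open` and `udp_send` are hypothetical helper names. A TCP connect only succeeds when something is listening and no firewall drops the handshake, while a UDP send returns immediately with no confirmation that anyone received it.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP handshake with host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or filtered
        return False


def udp_send(host, port, payload):
    """Send one datagram and return the byte count; no handshake, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))
```

Calling `is_port_open("127.0.0.1", 22)` on a machine running an SSH server would typically return `True`. A closed or firewalled port returns `False`, which is the behaviour a deny rule is meant to produce.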
6. Automation with Python (or Go later)
Move beyond hello world scripts into the automation glue every DevOps engineer relies on.
Checkpoints
- Read, write, and refactor scripts that move files, call APIs, and parse JSON.
- Use virtual environments and `requirements.txt` to pin dependencies.
- Handle errors gracefully with logging and retries so scripts survive flaky networks.
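The last checkpoint, logging plus retries, might look like the sketch below. It is one possible shape rather than a standard recipe: the `retry` helper, its parameters, and the logger name `automation` are all assumptions for illustration, and it uses simple exponential backoff with no jitter.

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("automation")


def retry(func, attempts=3, base_delay=0.5, retry_on=(OSError,), sleep=time.sleep):
    """Call func(); on a listed exception, log it, wait, and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as exc:
            if attempt == attempts:
                log.error("giving up after %d attempts: %s", attempt, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            log.warning("attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay)
            sleep(delay)
```

Wrap any network call in a zero-argument function or `lambda` and pass it in, for example `retry(lambda: fetch_report(url))`, where `fetch_report` stands in for your own function. `ConnectionError` and `TimeoutError` are subclasses of `OSError`, so the default `retry_on` already covers typical flaky-network failures.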
7. Ship Project: Automated Backup Script
Bring Linux, scripting, and scheduling together to protect data without manual toil.
Checkpoints
- Write `backup.sh` (or `backup.py`) that archives a directory with timestamped tarballs.
- Rotate old backups automatically with `find` + `rm` or Python equivalents.
- Schedule nightly runs with cron and document setup steps in your portfolio repo.
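A possible starting point for `backup.py` is sketched below. The function names, the `backup-YYYYmmdd-HHMMSS.tar.gz` naming scheme, and the keep-the-newest-N rotation policy are assumptions for this example. The checkpoint's `find` + `rm` approach usually rotates by age instead, which is equally valid.

```python
import tarfile
import time
from pathlib import Path


def create_backup(source, dest_dir, now=None):
    """Archive `source` as dest_dir/backup-<timestamp>.tar.gz and return the path."""
    source, dest_dir = Path(source), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    archive = dest_dir / f"backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)
    return archive


def rotate_backups(dest_dir, keep=7):
    """Delete all but the `keep` newest backups and return the deleted paths."""
    # Timestamped names sort chronologically, so a plain sort is enough.
    backups = sorted(Path(dest_dir).glob("backup-*.tar.gz"))
    stale = backups[:-keep] if keep > 0 else backups
    for path in stale:
        path.unlink()
    return stale
```

Keep `dest_dir` outside `source`, otherwise each archive would include the previous ones. A nightly cron entry could then look like `0 2 * * * /usr/bin/python3 /path/to/backup.py`, where the path is a placeholder for wherever the script lives on your machine.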
Appreciate how heavyweight VMs exposed the need for portable containers.
Checkpoints
- Compare VM images and container images in terms of boot time, size, and resource use.
- Explain why "snapshotting your laptop" doesn't scale for collaboration.
- Document trade-offs of Vagrant or VirtualBox so you can defend Docker in interviews.
Write production-ready Dockerfiles and understand the lifecycle from image to container.
Checkpoints
- Compose multi-stage Dockerfiles that install dependencies then copy clean builds.
- Use `docker build`, `docker run`, and `docker exec` to iterate without guesswork.
- Publish images to Docker Hub with meaningful tags tied to git SHAs.
10. Ship Project: Dockerized Hello World
Package a Flask (or Node) app with a requirements file and run it anywhere using containers.
Checkpoints
- Write a Dockerfile that copies `requirements.txt`, installs dependencies, and runs the app.
- Expose the container with `docker run -d -p 5000:5000` and test locally.
- Compose app + Redis with docker-compose to prove multi-container networking.
11. The Pain of Manual Deploys
Quantify downtime and toil caused by SSH + `git pull` rituals.
Checkpoints
- Map the manual steps you currently take to deploy and highlight risks at each point.
- Explain why humans forget flags, stop commands, or roll back too slowly.
- Set a personal SLO: no production change should require more than one human step.
12. GitHub Actions Assembly Line
Use hosted runners to lint, test, and build Docker images automatically.
Checkpoints
- Author workflows that run on pull requests and main branch pushes.
- Cache dependencies and use matrix builds to speed up feedback loops.
- Authenticate to Docker Hub (or GHCR) securely with repository secrets.
13. Ship Project: Push-to-Deploy Pipeline
Trigger tests, build Docker images, and push to Docker Hub every time `main` updates.
Checkpoints
- Create `.github/workflows/ci-cd.yml` with lint + test + build stages.
- Use `docker/build-push-action` to publish versioned images tagged with `latest` and `${{ github.sha }}`.
- Document how to rotate secrets and validate the pipeline using badges in your README.
Spot the scaling, reliability, and update traps of single-node thinking.
Checkpoints
- Quantify how CPU, memory, and disk bottlenecks ruin user experience when traffic spikes.
- Describe failure domains and why redundancy plus health checks matter.
- Explain downtime caused by stop/start deployments and why rolling updates help.
Use the free tier responsibly and guard against surprise bills while learning core services.
Checkpoints
- Enable billing alerts and budgets before launching anything.
- Provision and secure an EC2 instance with SSH keys and security groups.
- Tear down unused resources immediately to stay inside the free tier.
16. Kubernetes Without Tears
Adopt declarative thinking and let the control plane reconcile desired state for you.
Checkpoints
- Install minikube (or kind) locally and run your first cluster.
- Create Deployments and Services that heal themselves when pods fail.
- Observe rolling updates and scale changes via `kubectl get pods -w`.
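The "declare desired state, let the control plane reconcile" idea can be felt in a toy model before touching a cluster. The sketch below is a deliberate oversimplification, not how Kubernetes is implemented: `reconcile` and `make_namer` are invented names, and pods are just strings in a list.

```python
import itertools


def make_namer(prefix):
    """Return a function that yields prefix-1, prefix-2, ... (hypothetical naming)."""
    ids = itertools.count(1)
    return lambda: f"{prefix}-{next(ids)}"


def reconcile(desired_replicas, running_pods, new_pod_name):
    """One pass of a toy control loop: return a pod list matching desired state."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:  # too few pods: start replacements
        pods.append(new_pod_name())
    del pods[desired_replicas:]          # too many pods: scale down
    return pods


namer = make_namer("web")
pods = reconcile(3, [], namer)    # first apply: three pods start
pods.remove("web-2")              # simulate deleting a pod by hand
pods = reconcile(3, pods, namer)  # the loop notices the gap and fills it
print(pods)                       # ['web-1', 'web-3', 'web-4']
```

You never tell the loop "start a pod"; you only state that three should exist. That is the shift from imperative commands to declarative manifests that Deployments build on.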
17. Ship Project: Manual Cloud + Local K8s
Deploy your container manually on EC2, then repeat with Kubernetes to feel the difference.
Checkpoints
- Launch a t2.micro, install Docker, and run your image on port 80 (terminate when done).
- Create `deployment.yml` and `service.yml`, apply to minikube, and test with `minikube service`.
- Delete a pod to watch Kubernetes self-heal and record the demo in your repo.
Highlight drift, inconsistency, and missing audit trails when environments are built by hand.
Checkpoints
- Document 50-click console workflows that cannot be repeated accurately.
- Explain infrastructure drift and how it causes "it worked in staging" bugs.
- Commit to version-controlling every network, server, and database change.
Use Terraform to declare resources, manage state, and orchestrate multi-environment deployments.
Checkpoints
- Write `main.tf` with providers, resources, and outputs for EC2 or similar compute.
- Run `terraform init`, `plan`, `apply`, and `destroy` while keeping state secure.
- Modularize common patterns and store state in remote backends with locking.
20. Configuration Management with Ansible
Install and configure software via idempotent playbooks that ride on SSH.
Checkpoints
- Write inventory files and playbooks that install Docker or configure Nginx.
- Use modules (`apt`, `service`, `docker_container`) instead of raw shell commands.
- Run playbooks repeatedly to prove idempotence and capture results in logs.
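Idempotence is easier to recognise once you have written one idempotent operation by hand. The sketch below imitates, in plain Python, the check-then-change pattern that Ansible modules follow and the "changed" result they report. `ensure_line` is a hypothetical helper written for this example, not part of Ansible.

```python
from pathlib import Path


def ensure_line(path, line):
    """Make sure `line` is in the file; return True only if the file changed."""
    path = Path(path)
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state: do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True
```

The first call returns `True` (changed) and every later call with the same arguments returns `False`. A raw shell `echo "..." >> file` would instead append a duplicate on every run, which is the reason the checkpoint above prefers modules over raw commands.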
21. Ship Project: One-Click Server
Combine Terraform + Ansible so `terraform apply` followed by `ansible-playbook` yields a ready app.
Checkpoints
- Provision EC2 (or Lightsail) plus security groups in Terraform with outputs for public IP.
- Run Ansible to install Docker and launch your containerised app automatically.
- Destroy and recreate the stack to prove repeatability and document the workflow in your repo.
Part 7: Tier 5 — Monitoring & Observability
Instrument metrics, logs, and traces so you can detect, debug, and narrate incidents with confidence.
22. Know When You're Flying Blind
Recognise the risks of running production without telemetry.
Checkpoints
- List symptoms (slow, erroring, unavailable) and map them to missing signals.
- Define SLIs/SLOs that tie to actual user happiness instead of vanity metrics.
- Set up incident alarms on meaningful thresholds with auto-notifications.
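The SLI/SLO checkpoint comes down to simple arithmetic, sketched below for a request-based availability SLI. The function names and the numbers in the usage note are illustrative assumptions, not targets this roadmap prescribes.

```python
def availability_sli(good_requests, total_requests):
    """Fraction of requests that succeeded; no traffic counts as fully available."""
    if total_requests == 0:
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(sli, slo_target):
    """Share of the error budget left: 1.0 is untouched, 0 or below is exhausted."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    allowed_failure = 1.0 - slo_target  # e.g. 0.1% of requests may fail
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure
```

For example, with a 99.9% target and 400 failures out of 1,000,000 requests, the SLI is 0.9996 and about 60% of the budget remains. Alerting on how much budget remains, or how quickly it is being spent, tends to track user pain more closely than alerting on raw CPU.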
23. Prometheus + Loki + Grafana
Adopt the free monitoring stack that powers countless production teams.
Checkpoints
- Expose application metrics via `/metrics` and scrape them with Prometheus.
- Aggregate structured logs with Loki and query via LogQL.
- Build Grafana dashboards that blend metrics, logs, and alert panels for storytelling.
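Before reaching for an exporter library, it can help to see how little a `/metrics` endpoint actually is: plain text in the Prometheus exposition format. The sketch below hand-rolls one counter using only the standard library. The metric name `http_requests_total`, the labels, the `serve` helper, and port 8000 are assumptions for this example; a real app would normally use a client library instead.

```python
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS = Counter()  # (path label, status code) -> number of requests


def render_metrics(requests):
    """Render the counter in Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (path, status), count in sorted(requests.items()):
        lines.append(f'http_requests_total{{path="{path}",status="{status}"}} {count}')
    return "\n".join(lines) + "\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body, status = render_metrics(REQUESTS), 200
        elif self.path == "/":
            body, status = "hello\n", 200
            REQUESTS[("/", status)] += 1
        else:
            body, status = "not found\n", 404
            REQUESTS[("other", status)] += 1  # one label for all unknown paths
        payload = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Blocking call: serve the app and its /metrics endpoint until interrupted."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Call `serve()` and point a Prometheus scrape job at the `/metrics` path on that port. Unknown paths are grouped under a single `other` label so that arbitrary URLs cannot create an unbounded number of time series.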
24. Ship Project: Single Pane of Glass
Spin up the PLG stack with docker-compose and monitor your containerised app end-to-end.
Checkpoints
- Add `prometheus-flask-exporter` (or similar) to expose metrics from your app.
- Write `prometheus.yml` to scrape the app container and wire logs to Loki.
- Design a Grafana dashboard showing CPU, request rate, error rate, and live logs.
Part 8: Capstone — End-to-End Platform
Combine every tier into a single repo that proves you can design, automate, and observe a production-grade system.
Select a real open-source app and outline user outcomes plus non-functional requirements.
Checkpoints
- Document personas, traffic expectations, and SLOs in your README.
- List every component (app, database, ingress, monitoring) before touching code.
- Sketch a high-level architecture in Excalidraw or draw.io and add it to the repo.
26. Codify the Architecture
Use Terraform for infrastructure, Kubernetes manifests for workloads, and GitHub Actions for delivery.
Checkpoints
- Provision a managed Kubernetes cluster (or K3s on EC2) plus managed database via Terraform.
- Build GitHub Actions workflows that test, build, and `kubectl apply` your manifests.
- Automate Prometheus/Grafana install via Helm and bootstrap dashboards as code.
Document runbooks, incident simulations, and before/after metrics to showcase impact.
Checkpoints
- Record a 5-minute Loom walk-through of the pipeline and monitoring stack.
- Publish a postmortem template and run at least one simulated incident.
- Add a hiring manager TL;DR section explaining outcomes, costs, and next steps.
Rewrite your resume to emphasise shipped systems and measurable outcomes.
Checkpoints
- Replace tool laundry lists with bullet points that quantify toil reduction or uptime gains.
- Link directly to repo folders (CI/CD, IaC, monitoring) as proof.
- Tailor one page for DevOps, another for SRE roles, highlighting relevant metrics.
29. System Design & "What Happens When..."
Practice narrating the request lifecycle using every tier you mastered.
Checkpoints
- Answer "What happens when you type google.com" with DNS → load balancer → Kubernetes detail.
- Whiteboard CI/CD flows that include security scans, SBOMs, and progressive delivery.
- Design on-call rotations and SLO dashboards to demonstrate empathy for teammates and users.
30. Salary & Growth Reality
Set expectations for compensation and career progression without falling for hype.
Checkpoints
- Compare service company vs product startup salary bands for DevOps roles.
- Plan a 2-5 year trajectory from junior engineer to platform lead with milestone skills.
- Track market data (Levels.fyi, LinkedIn) and negotiate with evidence, not guesswork.
Follow creators who teach with empathy and practical demos, not buzzwords.
Checkpoints
- Queue Kunal Kushwaha's DevOps Bootcamp and pick one module per week.
- Use TechWorld with Nana for quick refreshers when a concept feels intimidating.
- Pair freeCodeCamp marathon streams with note-taking inside your portfolio repo.
32. Hands-On Text Curriculum
Bookmark labs and documentation that keep you shipping, not just reading.
Checkpoints
- Progress through OverTheWire Bandit until Level 34 and journal tricky commands.
- Complete Terraform, Kubernetes, and GitHub Actions get-started guides inside sandboxes.
- Clone 90DaysOfDevOps and follow the days as optional side quests.