
DevOps Fundamentals - Interview Questions

Essential DevOps concepts and principles


Basic Questions

1. What is DevOps?

Answer: DevOps is a combination of culture, practices, and tools that brings software development (Dev) and IT operations (Ops) together to shorten the development lifecycle and deliver high-quality software continuously.

Key Points:
- Culture: Collaboration between Dev and Ops teams
- Automation: Automate repetitive tasks
- Continuous Integration/Delivery: Frequent code integration and deployment
- Monitoring: Continuous monitoring and feedback
- Infrastructure as Code: Manage infrastructure through code

Benefits:
- Faster time to market
- Improved collaboration
- Higher quality software
- Better reliability
- Increased efficiency


2. What are the key principles of DevOps?

Answer:

1. Collaboration - Breaking down silos between teams
2. Automation - Automate build, test, and deployment
3. Continuous Improvement - Iterative improvements
4. Customer-Centric - Focus on customer needs
5. Fail Fast - Quick feedback and recovery
6. Monitoring - Continuous monitoring and logging
7. Infrastructure as Code - Version-controlled infrastructure


3. Explain the DevOps lifecycle

Answer: The DevOps lifecycle consists of continuous phases:

Plan → Code → Build → Test → Release → Deploy → Operate → Monitor
  ↑                                                              ↓
  └──────────────────── Feedback Loop ────────────────────────┘

Phases:

1. Plan - Define requirements and features
2. Code - Write and version control code
3. Build - Compile and package application
4. Test - Automated testing (unit, integration, etc.)
5. Release - Prepare for deployment
6. Deploy - Deploy to production
7. Operate - Manage and maintain
8. Monitor - Track performance and issues


4. What is CI/CD?

Answer:

CI (Continuous Integration):
- Developers frequently merge code into a shared repository
- Automated builds and tests run on each commit
- Early detection of integration issues

CD (Continuous Delivery):
- Code is always in a deployable state
- Automated deployment to staging
- Manual approval for production

CD (Continuous Deployment):
- Fully automated deployment to production
- No manual intervention
- Every change goes through the pipeline

Pipeline Example:

Code Commit → Build → Unit Tests → Integration Tests → 
Deploy to Staging → Acceptance Tests → Deploy to Production
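The pipeline above can be sketched as an ordered list of stages that stops at the first failure. This is only an illustration: the stage names and the `run_pipeline` helper are hypothetical, and each lambda stands in for a real build, test, or deploy command.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            print(f"stage failed: {name}")
            break
        completed.append(name)
    return completed


# Hypothetical stages; the integration tests are made to fail on purpose.
stages = [
    ("build", lambda: True),
    ("unit-tests", lambda: True),
    ("integration-tests", lambda: False),
    ("deploy-staging", lambda: True),
]

result = run_pipeline(stages)
print(result)
```

Because the integration tests fail, the staging deployment never runs, which is the property a CI/CD pipeline is meant to guarantee.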


5. What is the difference between Agile and DevOps?

Answer:

| Aspect | Agile | DevOps |
|---|---|---|
| Focus | Software development | Development + Operations |
| Goal | Deliver working software | Deliver and operate software |
| Team | Development team | Dev + Ops + QA |
| Feedback | Sprint reviews | Continuous monitoring |
| Automation | Testing | Build, test, deploy, monitor |
| Duration | Sprints (2-4 weeks) | Continuous |

Relationship: DevOps extends Agile principles to operations


6. What is Infrastructure as Code (IaC)?

Answer: Infrastructure as Code is managing and provisioning infrastructure through machine-readable definition files rather than manual processes.

Benefits:
- Version Control - Track infrastructure changes
- Consistency - Same infrastructure every time
- Automation - Automated provisioning
- Documentation - Code serves as documentation
- Disaster Recovery - Quick infrastructure recreation

Tools:
- Terraform
- Ansible
- CloudFormation
- Pulumi

Example (Terraform):

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  tags = {
    Name = "WebServer"
  }
}


7. What is Configuration Management?

Answer: Configuration Management is the process of maintaining systems in a desired state through automation.

Key Concepts:
- Idempotency - Same result regardless of how many times applied
- Declarative - Define desired state, not steps
- Version Control - Track configuration changes

Tools:
- Ansible
- Puppet
- Chef
- SaltStack

Example (Ansible):

- name: Install and start nginx
  hosts: webservers
  become: true  # installing packages and managing services requires root
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
    - name: Start nginx
      service:
        name: nginx
        state: started
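Idempotency is easier to see in a few lines of code. The sketch below is not how Ansible is implemented; `ensure_line` is a hypothetical helper that, like a configuration management task, reports whether it changed anything and leaves the state alone on repeat runs.

```python
from typing import List


def ensure_line(lines: List[str], wanted: str) -> bool:
    """Make sure `wanted` is present in `lines`.

    Returns True if the list was changed, False if it was already
    in the desired state.
    """
    if wanted in lines:
        return False
    lines.append(wanted)
    return True


config = ["worker_processes 1;"]
first = ensure_line(config, "gzip on;")   # changes the config
second = ensure_line(config, "gzip on;")  # no-op: already in desired state
print(first, second, config)
```

Applying the same operation twice produces the same end state as applying it once, which is why such tasks are safe to re-run.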


8. What is Version Control and why is it important?

Answer: Version Control is a system that records changes to files over time, allowing you to recall specific versions later.

Benefits:
- Collaboration - Multiple developers work together
- History - Track all changes
- Branching - Work on features independently
- Rollback - Revert to previous versions
- Backup - Distributed copies

Popular Systems:
- Git (most popular)
- SVN
- Mercurial

Git Workflow:

git clone <repo>
git checkout -b feature-branch
# Make changes
git add .
git commit -m "Add new feature"
git push origin feature-branch
# Create pull request


9. What are the benefits of DevOps?

Answer:

Business Benefits:
- Faster time to market
- Increased revenue
- Better customer satisfaction
- Competitive advantage
- Cost reduction

Technical Benefits:
- Faster deployment
- Higher quality
- Better reliability
- Improved collaboration
- Automated processes

Team Benefits:
- Better communication
- Shared responsibility
- Continuous learning
- Job satisfaction
- Innovation


10. What is the role of automation in DevOps?

Answer: Automation is the backbone of DevOps, enabling:

Build Automation:
- Automated compilation
- Dependency management
- Artifact creation

Test Automation:
- Unit tests
- Integration tests
- Security scans
- Performance tests

Deployment Automation:
- Infrastructure provisioning
- Application deployment
- Configuration management
- Rollback procedures

Monitoring Automation:
- Log collection
- Metrics gathering
- Alert generation
- Incident response


Intermediate Questions

11. Explain the concept of "Shift Left" in DevOps

Answer: Shift Left means moving testing, security, and quality checks earlier in the development process.

Traditional Approach:

Dev → QA → Security → Ops → Production

Shift Left Approach:

Dev (with testing, security, quality) → Ops → Production

Benefits:
- Earlier bug detection
- Lower fix costs
- Faster feedback
- Better quality
- Reduced risk


12. What is a DevOps pipeline?

Answer: A DevOps pipeline is an automated sequence of steps that code goes through from development to production.

Typical Pipeline:

Source → Build → Test → Deploy → Monitor

Detailed Pipeline:

1. Code Commit (Git)
2. Trigger Build (Jenkins/GitHub Actions)
3. Compile Code
4. Run Unit Tests
5. Code Quality Analysis (SonarQube)
6. Security Scan
7. Build Docker Image
8. Push to Registry
9. Deploy to Staging
10. Run Integration Tests
11. Deploy to Production
12. Monitor and Alert


13. What is Blue-Green Deployment?

Answer: Blue-Green deployment is a strategy where two identical production environments (Blue and Green) exist, with only one active at a time.

Process:

1. Blue environment is live (serving traffic)
2. Deploy new version to Green environment
3. Test Green environment
4. Switch traffic from Blue to Green
5. Keep Blue as rollback option

Benefits:
- Zero downtime
- Instant rollback
- Full testing before switch
- Reduced risk
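The switch-and-rollback process can be modelled in a few lines. This is a toy sketch under stated assumptions: `BlueGreenRouter` and its methods are hypothetical names, and the health check is a plain callable rather than a real probe against a load balancer.

```python
class BlueGreenRouter:
    """Tracks which of two environments receives live traffic."""

    def __init__(self, live="blue"):
        self.live = live
        self.versions = {"blue": "v1", "green": None}

    @property
    def idle(self):
        # The environment that is not currently serving traffic.
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version):
        # New versions always go to the idle environment first.
        self.versions[self.idle] = version

    def switch(self, health_check):
        """Cut over to the idle environment only if it passes the check."""
        candidate = self.idle
        if not health_check(candidate):
            return False
        self.live = candidate
        return True

    def rollback(self):
        # The previous environment is untouched, so rollback is one switch.
        self.live = self.idle


router = BlueGreenRouter()
router.deploy_to_idle("v2")
switched = router.switch(lambda env: True)  # assume Green passed its tests
print(router.live, router.versions[router.live])
```

Rollback is cheap because the old environment keeps running the previous version until the next deployment overwrites it.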


14. What is Canary Deployment?

Answer: Canary deployment gradually rolls out changes to a small subset of users before full deployment.

Process:

1. Deploy to 5% of servers
2. Monitor metrics
3. If healthy, increase to 25%
4. Continue monitoring
5. Increase to 50%
6. Finally 100%

Benefits:
- Risk mitigation
- Real-world testing
- Quick rollback
- Gradual rollout
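A minimal sketch of the staged rollout above, assuming a hypothetical `error_rate_at` callable that reports the observed error rate at a given traffic percentage. The 1% threshold is an illustrative default, not a recommendation.

```python
from typing import Callable, List


def canary_rollout(
    steps: List[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> int:
    """Walk through traffic percentages, checking health at each step.

    Returns the final percentage served by the new version, or 0 if the
    rollout was rolled back because the error rate exceeded the threshold.
    """
    current = 0
    for percent in steps:
        current = percent
        if error_rate_at(percent) > max_error_rate:
            return 0  # roll back: send all traffic to the old version
    return current


healthy = canary_rollout([5, 25, 50, 100], lambda p: 0.001)
# Simulate a problem that only shows up under heavier load.
degraded = canary_rollout([5, 25, 50, 100], lambda p: 0.05 if p >= 50 else 0.001)
print(healthy, degraded)
```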


15. What is the difference between Continuous Delivery and Continuous Deployment?

Answer:

Continuous Delivery:
- Code is always deployable
- Manual approval for production
- Automated deployment to staging
- Human decision for release

Continuous Deployment:
- Fully automated to production
- No manual intervention
- Every change goes live automatically
- Requires high confidence in automation

When to use:
- Continuous Delivery: Regulated industries, critical systems
- Continuous Deployment: SaaS products, web applications


16. What is GitOps?

Answer: GitOps is a way of implementing Continuous Deployment where Git is the single source of truth for infrastructure and applications.

Principles:

1. Declarative - Entire system described declaratively
2. Versioned - Canonical state in Git
3. Pulled Automatically - Agents pull changes
4. Continuously Reconciled - Ensure actual state matches desired state

Tools:
- ArgoCD
- Flux
- Jenkins X

Workflow:

1. Developer commits to Git
2. GitOps operator detects change
3. Operator applies changes to cluster
4. Cluster state matches Git
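The core of steps 2 to 4 is a reconciliation loop: compare the desired state from Git with the actual state and apply the difference. The sketch below uses plain dictionaries in place of a Git repository and a cluster. The `reconcile` function and the resource names are hypothetical, and real operators such as ArgoCD or Flux are far more involved.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Change `actual` in place until it matches `desired`.

    Returns the list of (action, resource_name) pairs that were applied.
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
        else:
            continue  # already in the desired state
        actual[name] = spec
    # Anything running that Git no longer describes gets removed.
    for name in list(actual):
        if name not in desired:
            actions.append(("delete", name))
            del actual[name]
    return actions


# Desired state as it would be declared in Git (hypothetical resources).
desired = {
    "web": {"image": "web:2", "replicas": 3},
    "api": {"image": "api:1", "replicas": 2},
}
# Actual state observed in the cluster.
actual = {
    "web": {"image": "web:1", "replicas": 3},
    "old-job": {"image": "job:1", "replicas": 1},
}

first_pass = reconcile(desired, actual)
second_pass = reconcile(desired, actual)  # nothing left to do
print(first_pass, second_pass)
```

The second pass returns no actions, which is the "continuously reconciled" property: the loop can run forever and only acts when drift appears.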


17. What is Observability?

Answer: Observability is the ability to understand the internal state of a system by examining its outputs.

Three Pillars:

1. Metrics - Numerical measurements (CPU, memory, requests/sec)
2. Logs - Event records (application logs, error logs)
3. Traces - Request flow through a distributed system

Tools:
- Prometheus (metrics)
- Grafana (visualization)
- ELK Stack (logs)
- Jaeger (tracing)


18. What is Immutable Infrastructure?

Answer: Immutable Infrastructure means servers are never modified after deployment. Instead, new servers are created with changes.

Traditional Approach:

Deploy Server → Update → Patch → Configure → Update

Immutable Approach:

Deploy Server v1 → Deploy Server v2 → Replace v1 with v2

Benefits:
- Consistency
- No configuration drift
- Easy rollback
- Predictable deployments


19. What is a Service Mesh?

Answer: A Service Mesh is an infrastructure layer that handles service-to-service communication in microservices.

Features:
- Service discovery
- Load balancing
- Encryption
- Authentication
- Monitoring
- Traffic management

Popular Service Meshes:
- Istio
- Linkerd
- Consul Connect


20. What is the difference between Monitoring and Observability?

Answer:

Monitoring:
- Known unknowns
- Predefined metrics
- Alerts on thresholds
- "Is the system up?"

Observability:
- Unknown unknowns
- Explore and discover
- Understand why the system behaves the way it does
- "Why is the system slow?"

Relationship: Observability enables better monitoring


Advanced Questions

21. How do you implement DevOps in a legacy system?

Answer:

Step 1: Assessment
- Identify pain points
- Map current processes
- Assess team skills
- Evaluate tools

Step 2: Start Small
- Choose one application
- Implement CI/CD
- Automate testing
- Measure improvements

Step 3: Expand
- Add more applications
- Implement IaC
- Improve monitoring
- Enhance automation

Step 4: Culture Change
- Training and education
- Break down silos
- Encourage collaboration
- Celebrate wins

Challenges:
- Resistance to change
- Legacy technology
- Skill gaps
- Budget constraints


22. What metrics do you track in DevOps?

Answer:

DORA Metrics:

1. Deployment Frequency - How often you deploy
2. Lead Time for Changes - Time from commit to production
3. Mean Time to Recovery (MTTR) - Time to recover from failure
4. Change Failure Rate - Percentage of deployments causing failure

Additional Metrics:
- Build success rate
- Test coverage
- Code quality
- Infrastructure costs
- Incident count
- Customer satisfaction
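As a rough illustration, the four DORA metrics can be computed from a list of deployment records. The `Deployment` record and its fields are assumptions made for this sketch, and the sample data is invented. Real tooling would pull these timestamps from the CI/CD system and incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failure is resolved


def dora_metrics(deploys: List[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deploys:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployment_frequency_per_day": len(deploys) / period_days,
        "mean_lead_time": sum(lead_times, timedelta()) / len(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Invented sample data: four deployments over four days, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
deploys = [
    Deployment(t0, t0 + 2 * hour),
    Deployment(t0 + day, t0 + day + 4 * hour, failed=True,
               restored_at=t0 + day + 5 * hour),
    Deployment(t0 + 2 * day, t0 + 2 * day + 3 * hour),
    Deployment(t0 + 3 * day, t0 + 3 * day + 3 * hour),
]

metrics = dora_metrics(deploys, period_days=4)
print(metrics)
```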


23. How do you handle secrets in DevOps?

Answer:

Best Practices:

1. Never commit secrets to Git
2. Use secret management tools
3. Rotate secrets regularly
4. Encrypt secrets at rest
5. Use least privilege access

Tools:
- HashiCorp Vault
- AWS Secrets Manager
- Azure Key Vault
- Kubernetes Secrets

Example (Vault):

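# Store secret (illustrative value; literal values typed on the
# command line can end up in shell history)
vault kv put secret/db password=mysecret

# Retrieve the whole secret
vault kv get secret/db

# Retrieve only the password field
vault kv get -field=password secret/db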
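On the application side, a common pattern is to have the secret injected at runtime (for example as an environment variable populated by the secret manager) rather than stored in code. The sketch below assumes that setup. `get_secret`, `mask`, and the `DB_PASSWORD` variable name are hypothetical.

```python
import os


def get_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value


def mask(secret: str) -> str:
    """Return a log-safe form that shows at most the last four characters."""
    if len(secret) <= 4:
        return "*" * len(secret)
    return "*" * (len(secret) - 4) + secret[-4:]


# Usage, assuming DB_PASSWORD was injected by the deployment platform:
#   password = get_secret("DB_PASSWORD")
#   print("using password", mask(password))
```

Failing at startup when a secret is missing is usually preferable to falling back to a default, and masking keeps the value out of logs.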


24. What is Chaos Engineering?

Answer: Chaos Engineering is the practice of intentionally injecting failures to test system resilience.

Principles:

1. Define steady state
2. Hypothesize about failures
3. Introduce real-world events
4. Verify system remains stable

Tools:
- Chaos Monkey (Netflix)
- Gremlin
- Chaos Toolkit

Example Experiments:
- Terminate random instances
- Introduce network latency
- Fill disk space
- Simulate region failure


25. How do you ensure high availability in DevOps?

Answer:

Strategies:

1. Redundancy - Multiple instances
2. Load Balancing - Distribute traffic
3. Auto-Scaling - Scale based on demand
4. Health Checks - Monitor service health
5. Disaster Recovery - Backup and restore plans
6. Multi-Region - Deploy across regions

Architecture:

Load Balancer
Multiple Availability Zones
Auto-Scaling Groups
Health Checks
Monitoring & Alerts
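Redundancy, load balancing, and health checks work together: traffic is spread across instances, and unhealthy ones are skipped. The toy `LoadBalancer` class below is a hypothetical illustration of that interaction using round-robin selection. It is not a model of any particular product.

```python
class LoadBalancer:
    """Round-robin load balancer that skips backends marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # position of the next backend to try

    def mark(self, backend, is_healthy):
        # In a real system this would be driven by periodic health checks.
        if is_healthy:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def pick(self):
        """Return the next healthy backend, trying each at most once."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")


lb = LoadBalancer(["a", "b", "c"])
before = [lb.pick() for _ in range(3)]
lb.mark("b", False)  # simulate a failed health check on backend "b"
after = [lb.pick() for _ in range(4)]
print(before, after)
```

With one backend down, requests keep flowing to the remaining two, which is the availability benefit of running multiple instances.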


Scenario-Based Questions

26. Your deployment failed in production. What do you do?

Answer:

Immediate Actions:

1. Assess Impact - How many users affected?
2. Rollback - Revert to previous version
3. Communicate - Notify stakeholders
4. Monitor - Watch for recovery

Investigation:

1. Check logs
2. Review metrics
3. Identify root cause
4. Document findings

Prevention:

1. Improve testing
2. Add monitoring
3. Update runbooks
4. Conduct post-mortem


27. How do you handle a security vulnerability in production?

Answer:

Immediate Response:

1. Assess Severity - Critical, high, medium, low
2. Contain - Isolate affected systems
3. Patch - Apply security fix
4. Verify - Confirm vulnerability fixed
5. Monitor - Watch for exploitation attempts

Long-term:

1. Security scanning in CI/CD
2. Regular dependency updates
3. Security training
4. Incident response plan


28. How do you optimize CI/CD pipeline performance?

Answer:

Strategies:

1. Parallel Execution - Run tests in parallel
2. Caching - Cache dependencies
3. Incremental Builds - Build only changed code
4. Optimize Tests - Speed up slow tests and fix or quarantine flaky ones
5. Resource Allocation - Use appropriate instance sizes

Example:

# Before: 20 minutes
build → test → deploy

# After: 8 minutes
build (parallel) → test (parallel) → deploy
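The effect of parallel execution can be demonstrated with a small simulation. The suite names and durations below are invented, and `time.sleep` stands in for real test work. In an actual pipeline the parallelism would normally be configured in the CI tool rather than written by hand.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical suites and how long each takes (seconds, scaled down).
SUITES = {"unit": 0.1, "lint": 0.1, "integration": 0.1}


def run_suite(name, seconds):
    time.sleep(seconds)  # stand-in for running the real suite
    return name, True


def run_sequential(suites):
    return dict(run_suite(name, secs) for name, secs in suites.items())


def run_parallel(suites):
    # One worker per suite, so total time is roughly the slowest suite.
    with ThreadPoolExecutor(max_workers=len(suites)) as pool:
        futures = [pool.submit(run_suite, name, secs) for name, secs in suites.items()]
        return dict(f.result() for f in futures)


def timed(fn, suites):
    start = time.perf_counter()
    results = fn(suites)
    return results, time.perf_counter() - start


seq_results, seq_time = timed(run_sequential, SUITES)
par_results, par_time = timed(run_parallel, SUITES)
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

Sequential time grows with the sum of the suite durations, while parallel time is bounded by the slowest suite, provided the suites are independent of each other.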


29. How do you manage database changes in CI/CD?

Answer:

Strategies:

1. Database Migrations - Version-controlled schema changes
2. Backward Compatibility - Support old and new schema
3. Blue-Green for Databases - Keep the schema usable by both the old and new application versions during the switch, or cut over to a replicated copy once it is verified
4. Feature Flags - Toggle new database features

Tools:
- Flyway
- Liquibase
- Alembic (Python)

Example Migration:

-- V1__create_users_table.sql
CREATE TABLE users (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);

-- V2__add_email_column.sql
ALTER TABLE users ADD COLUMN email VARCHAR(100);
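To show the idea behind versioned migrations, here is a minimal runner built on Python's standard `sqlite3` module. It is a sketch of the general technique, not of how Flyway, Liquibase, or Alembic work internally. The `schema_version` table and the `migrate` function are names made up for this example.

```python
import sqlite3

# Versioned migrations, mirroring the two SQL files above.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name VARCHAR(100))"),
    (2, "ALTER TABLE users ADD COLUMN email VARCHAR(100)"),
]


def migrate(conn, migrations):
    """Apply migrations that have not run yet, in version order.

    Returns the list of versions applied during this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    newly_applied = []
    for version, sql in sorted(migrations):
        if version in applied:
            continue  # already applied on an earlier run
        conn.execute(sql)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        newly_applied.append(version)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn, MIGRATIONS)
second_run = migrate(conn, MIGRATIONS)  # nothing new to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(first_run, second_run, columns)
```

Recording applied versions in the database is what lets the same migration step run safely on every deployment.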


30. How do you implement disaster recovery?

Answer:

Components:

1. Backup Strategy
   - Regular automated backups
   - Multiple backup locations
   - Test restore procedures
2. Recovery Time Objective (RTO)
   - Maximum acceptable downtime
   - Example: 4 hours
3. Recovery Point Objective (RPO)
   - Maximum acceptable data loss
   - Example: 1 hour
4. DR Plan
   - Document procedures
   - Assign responsibilities
   - Regular DR drills
   - Update documentation

Architecture:

Primary Region (Active)
Continuous Replication
DR Region (Standby)
Automated Failover
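RTO and RPO become concrete when checked against the timeline of an incident or a DR drill. The sketch below uses the example targets from this answer (RTO of 4 hours, RPO of 1 hour). The helper names and the timestamps are invented for illustration.

```python
from datetime import datetime, timedelta

RTO = timedelta(hours=4)  # maximum acceptable downtime
RPO = timedelta(hours=1)  # maximum acceptable data loss


def meets_rpo(last_backup: datetime, failure: datetime, rpo: timedelta = RPO) -> bool:
    """Data written after the last backup is lost, so that gap must fit the RPO."""
    return failure - last_backup <= rpo


def meets_rto(failure: datetime, restored: datetime, rto: timedelta = RTO) -> bool:
    """Downtime runs from the failure until service is restored."""
    return restored - failure <= rto


# Invented drill timeline.
failure = datetime(2024, 1, 1, 12, 0)
rpo_ok = meets_rpo(datetime(2024, 1, 1, 11, 30), failure)   # 30 minutes of data lost
rpo_bad = meets_rpo(datetime(2024, 1, 1, 10, 0), failure)   # 2 hours of data lost
rto_ok = meets_rto(failure, datetime(2024, 1, 1, 15, 0))    # 3 hours of downtime
rto_bad = meets_rto(failure, datetime(2024, 1, 1, 17, 0))   # 5 hours of downtime
print(rpo_ok, rpo_bad, rto_ok, rto_bad)
```

The RPO check implies that the backup or replication interval must not exceed the RPO, which is why a 1-hour RPO usually calls for continuous replication or at least hourly backups.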



Total Questions: 30
Difficulty: Basic to Advanced
Topics Covered: DevOps culture, CI/CD, IaC, automation, best practices