If Part 1 was the “what” and Part 2 was the “how,” this part is the “oh no, it’s broken… okay, I fixed it.” Welcome to the deployment saga. This is the story of taking a perfectly functional local Docker setup and launching it into a live, secure, and fully automated production environment on the internet.
This is where the real DevOps work begins. It’s a process filled with challenges that test your understanding of networking, security, and automation. Let’s dive in.
Step 1: Hardening the Production Server
The first step was preparing the battlefield: a fresh Linux VPS. A default server is not secure. Before deploying any code, I performed essential server hardening:
- Created a Non-Root User: All operations are performed by a user with `sudo` privileges, never directly as `root`.
- Configured UFW (Uncomplicated Firewall): I immediately locked down all ports, allowing only essential traffic: SSH (initially), HTTP, and HTTPS.
- Implemented a “Zero-Trust” SSH Policy with Tailscale: Instead of leaving the SSH port (22) open to the world and relying on IP whitelisting, I took a more secure approach. I installed Tailscale on both my local machine and the VPS, creating a private, encrypted mesh network. UFW was then configured to only allow SSH connections originating from within this private Tailscale network (see the sketch after this list). To the public internet, my server’s SSH port is completely invisible, drastically reducing the attack surface.
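For reference, the firewall rules amounted to something like the sketch below. It’s a minimal approximation rather than my exact configuration, and it assumes Tailscale’s default interface name, `tailscale0`:

```bash
# Web traffic stays public; SSH is reachable only over the Tailscale interface.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow in on tailscale0 to any port 22 proto tcp  # SSH via the private mesh only
sudo ufw allow 80/tcp    # HTTP
sudo ufw allow 443/tcp   # HTTPS
sudo ufw enable
```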
Step 2: The CI/CD Pipeline – Automation is King
The goal was a true Continuous Deployment pipeline: every `git push` to the `main` branch should automatically and safely update the live application.
The Tool: GitHub Actions. The Challenge: How do you get GitHub Actions to securely connect to a server whose SSH port is locked down behind a VPN?
The Solution: A Self-Hosted Runner.
Instead of using GitHub’s cloud runners, I configured a self-hosted runner directly on the VPS. This brilliant piece of software runs as a service, polling GitHub for new jobs. This inverts the connection model: the server initiates the connection outwards, meaning no inbound ports need to be opened for the CI/CD pipeline.
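Setting one up is straightforward. The gist looks roughly like this, with placeholders for the repository URL and registration token (both come from the repo’s Settings → Actions → Runners page):

```bash
# On the VPS, inside the downloaded actions-runner directory:
# register the runner against the repository (placeholder URL and token).
./config.sh --url https://github.com/<owner>/<repo> --token <REGISTRATION_TOKEN>

# Install and start it as a systemd service so it keeps polling after reboots.
sudo ./svc.sh install
sudo ./svc.sh start
```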
Here is the final, resilient `deploy.yml` workflow file:
```yaml
name: Deploy to Production VPS

on:
  push:
    branches: [ "main" ]

jobs:
  deploy:
    runs-on: self-hosted
    steps:
      - name: Clean and Prepare Workspace
        run: |
          if [ -d "${{ github.workspace }}" ]; then
            sudo chown -R vanessa:vanessa "${{ github.workspace }}"
          fi

      - name: Checkout code
        uses: actions/checkout@v4

      - name: Create .env file
        run: echo "${{ secrets.DOT_ENV }}" > .env

      - name: Deploy Application
        run: |
          docker compose -f docker-compose.prod.yml up --build -d
          docker image prune -f
```
Step 3: Troubleshooting in the Trenches (The Real DevOps Work)
A green pipeline is a beautiful thing, but it’s usually built on the ashes of many red ones. Here are the key “boss battles” I fought and won during this deployment.
Battle 1: The 502 Bad Gateway Mystery
The pipeline ran successfully, `docker ps` showed all containers were “Up,” but the website showed a dreaded 502 error.
- Symptom: Nginx (the “gatekeeper”) was running but couldn’t communicate with the Flask app (the “upstream”).
- Investigation: The `docker logs app` output showed no errors, only successful startups. This was the clue. The app was crashing so fast that Gunicorn couldn’t log the error before Docker restarted it. The `docker logs nginx` output finally revealed the truth: `connect() failed (113: Host is unreachable)`.
- Solution: The app container wasn’t correctly registering on Docker’s internal network after the CI/CD run. A simple `docker compose restart app` forced the container to re-register and resolved the issue, confirming a transient network glitch within Docker (the diagnostic sketch after this list shows the checks that narrow it down).
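In practice the triage boils down to a few commands. This is a rough sketch rather than a transcript of my session; the network-inspect check is an extra verification step, and the network name follows Compose’s default `<project>_default` convention, so it’s a placeholder:

```bash
# 1. Ask Nginx what it actually sees when proxying to the app.
docker logs nginx --tail 50

# 2. Confirm the app container is attached to the Compose network
#    (placeholder network name; run `docker network ls` to find yours).
docker network inspect <project>_default --format '{{range .Containers}}{{.Name}} {{end}}'

# 3. Force the container to re-register on the network.
docker compose -f docker-compose.prod.yml restart app
```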
Battle 2: The Permission Denied Paradox
The second pipeline run failed spectacularly with an `EACCES: permission denied` error during the “Checkout code” step.
- Symptom: The runner couldn’t clean its own workspace.
- Investigation: The first `docker compose` run had created certificate folders (`certbot/`) owned by the `root` user. The runner, operating as the `vanessa` user, was now forbidden from touching these folders.
- Solution: This was a classic permissions battle. The fix was twofold: first, running `sudo chown -R vanessa:vanessa ~/actions-runner` to fix the immediate problem; second, adding the “Clean and Prepare Workspace” step to the `deploy.yml` file. This made the pipeline self-healing, ensuring it fixes its own permissions before every run.
Battle 3: The Case of the Vanishing Certificates
After fixing the permissions, the 502 error returned, but this time the Nginx logs were different: `cannot load certificate... No such file or directory`.
- Symptom: The SSL certificates were disappearing after every deploy.
- Investigation: I realized the “Clean and Prepare Workspace” step, while necessary, was too effective. It was wiping the entire project directory, including the `certbot` folder containing the live SSL certificates!
- Solution: This highlighted a critical architectural principle: separating stateful data from stateless code. The certificates are stateful data; they should not live inside the ephemeral project folder. I created a permanent, absolute path on the server (`/opt/komocred/certs`), moved the `certbot` folder there, and updated `docker-compose.prod.yml` to mount the certificates from this new, safe location (see the sketch after this list). The CI/CD pipeline could now clean its workspace without destroying critical data.
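The relevant change to `docker-compose.prod.yml` looked roughly like the excerpt below. The service name and container-side path are illustrative; only the host path `/opt/komocred/certs` is the real location on the server:

```yaml
# docker-compose.prod.yml (excerpt) -- service name and container path are illustrative
services:
  nginx:
    image: nginx:stable
    ports:
      - "80:80"
      - "443:443"
    volumes:
      # Stateful data lives on an absolute host path outside the project folder,
      # so the CI/CD workspace cleanup can never touch it.
      - /opt/komocred/certs:/etc/letsencrypt:ro
```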
Conclusion: A Resilient System Forged in Fire
Deploying an application is a journey. The final, stable production environment is a testament not just to a good plan, but to the ability to diagnose and solve the unexpected problems that arise along the way. Through systematic troubleshooting, I transformed a fragile deployment process into a secure, resilient, and fully automated CI/CD pipeline.
In the final post of this series, I’ll shift focus from the infrastructure to the user, showcasing the application’s key features, the UI/UX decisions I made, and the final polish that turned a tool into a product.