GitHub Actions CI/CD Tutorial: 3 Workflows That Ship + 5 Traps

Every GitHub Actions CI/CD tutorial shows you one workflow file. Then you ship it, add a second trigger, and watch the whole thing collide with itself at 11pm. The articles never explain how the workflows actually connect.

Real CI/CD on GitHub Actions is three files, not one. Test runs on every push. Build runs when something merges to main and produces an artifact. Deploy consumes that artifact behind an environment gate. Each one has its own trigger, its own permissions, and its own way to break.

Here’s all three, wired together, with the caching pattern that cut a real build from four minutes to ninety seconds — and the five YAML traps that look fine until they aren’t.

The Three-Workflow Pattern (and Why One File Doesn’t Work)

The short version: test on push, build on merge to main, deploy on a successful build. Three files in .github/workflows/, each triggered by a different event. SHA-pin your third-party actions, lock down GITHUB_TOKEN to least privilege, and you have a pipeline that’s hard to break by accident.

One mega-workflow is tempting because it looks simpler. It isn’t. A single file means every trigger re-runs every job unless you guard each one with an if: condition. Workflow-level permissions broad enough for deploy also apply to PR builds. Your concurrency rules fight each other. Debugging which step caused which side effect turns into archaeology.

Splitting works because each workflow has different needs:

ci.yml runs on every push and PR. Read-only permissions. Matrix across Node versions.
build.yml runs on pushes to main. Produces a versioned artifact.
deploy.yml runs when build succeeds. Pulls the artifact, deploys behind an environment with required reviews.

The artifact is the contract between build.yml and deploy.yml. actions/upload-artifact@v4 on one side, actions/download-artifact@v4 on the other. Or push a Docker image and pull the SHA tag. Either way, the handoff is explicit and auditable.

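workflow_run is the glue. The catch: the triggered workflow runs from the default branch, using that branch’s copy of the workflow file and that branch’s head as github.sha, not the commit that triggered the build. It also runs with the base repo’s privileges, which is the same hazard as Trap 1, but we’ll get there.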
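
A small sketch of what that context difference means in practice. The step below is hypothetical and not part of the deploy.yml shown later, which only needs the artifact. It assumes a workflow_run job that needs the source of the commit that was actually built:

```yaml
# Hypothetical step inside a job triggered by workflow_run
steps:
  - uses: actions/checkout@v6
    with:
      # Without this, checkout gives you the default branch head,
      # not the commit the triggering run built.
      ref: ${{ github.event.workflow_run.head_sha }}
```

Only do this when the triggering workflow runs on trusted branches, as build.yml does. Checking out untrusted code in a privileged context is exactly the problem Trap 1 describes.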

The shape makes sense. Now the test workflow.

Workflow 1: Test on Every Push (with a Real Node Matrix)

Here’s ci.yml in full:

name: CI
on:
  push:
  pull_request:
    branches: [main]

permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        node-version: [20, 22]
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm test

A few things in here aren’t optional.

permissions: contents: read at the top. Depending on your repository and organization settings, the default GITHUB_TOKEN can write to your repo, comment on PRs, and create releases. A test workflow doesn’t need any of that. Scope it down explicitly and you’ve closed an attack surface you didn’t know was open.

fail-fast: false on the matrix. The default behavior cancels every other job the moment one fails. That’s the opposite of what you want when you’re debugging — you want to see whether Node 22 fails the same way Node 20 does, not guess from one cancelled run.

cache: 'npm' on setup-node@v4 is the dependency-caching pattern most tutorials still skip. No separate actions/cache step. The action hashes package-lock.json, stores npm’s download cache (the ~/.npm directory, not node_modules) under a key built from the runner OS and that hash, and restores it on the next run. npm ci still runs every time, but it installs from the local cache instead of the network.

npm ci, not npm install. CI environments need deterministic installs. npm ci errors if the lockfile and package.json disagree. npm install quietly updates the lockfile and moves on, which is exactly the wrong behavior for an automated build.

The numbers: on a cold run, this takes about three and a half minutes per Node version. With the cache warm, around ninety seconds. That’s roughly a 60% drop on the slowest path, with no extra YAML.

Two versions in the matrix is honest. Add OSes only if you actually ship on them. On private repos, windows-latest runners are billed at a higher per-minute rate than Linux (historically a 2x multiplier), and most teams who add Windows to the matrix never look at the Windows logs.
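
If you do ship on Windows, here is one way to add it without paying for the full OS-by-Node grid. This is a sketch, not the ci.yml above: the os axis and the single Windows entry are assumptions for illustration.

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest]
        node-version: [20, 22]
        include:
          # One extra Windows job on Node 22 only,
          # instead of Windows x every Node version
          - os: windows-latest
            node-version: 22
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - run: npm ci
      - run: npm test
```

The include entry can’t merge into either Ubuntu combination without overwriting os, so it should become its own third job.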

Tests green across the matrix. Build’s next, and the cache story gets more interesting.

Workflow 2: Build and Package on Merge to Main

name: Build
on:
  push:
    branches: [main]

concurrency:
  group: build-${{ github.ref }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'
      - run: npm ci
      - run: npm run build
      - uses: actions/upload-artifact@v4
        with:
          name: build-${{ github.sha }}
          path: dist/
          retention-days: 7

Trigger is push to main only. Not pull_request. A pull_request run executes the PR’s code. GitHub withholds secrets from fork PRs by default, but PRs from branches inside the repo do get them, so a malicious or careless PR could harvest any build secrets you wired in here. Builds are for merged code.

concurrency with cancel-in-progress: true means a fast second merge kills the in-flight build for the older commit. You don’t waste a runner producing an artifact you’re about to overwrite.

The artifact name includes github.sha. That’s deliberate. One commit, one artifact name. When deploy downloads it, you know exactly what shipped.

Docker variant in one block (if you’re new to containers, the Docker tutorial for developers covers the fundamentals this assumes):

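- uses: docker/setup-buildx-action@v3   # the gha cache backend needs a Buildx builder
- uses: docker/login-action@v3          # needs packages: write in this workflow's permissions
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}
- uses: docker/build-push-action@v6
  with:
    push: true
    tags: ghcr.io/${{ github.repository }}:${{ github.sha }}   # image names must be lowercase
    cache-from: type=gha
    cache-to: type=gha,mode=max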

type=gha wires the Docker layer cache to the same GitHub Actions cache backend setup-node uses. On a real Next.js project, this dropped the build from four minutes to ninety seconds — same project as the test numbers above.

One cache-key gotcha worth flagging now: if you reach for actions/cache directly to cache your build output (not just deps), the key needs to include the source SHA. Otherwise a later commit gets an exact key hit on output built from older source (feature branches can also restore caches created on main), and you’ll debug a corrupt production deploy for hours. That’s trap #4, so bookmark it.
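
A minimal sketch of that key, assuming a Next.js project whose incremental build cache lives in .next/cache. The path and the build- prefix are assumptions; swap in whatever your build tool writes.

```yaml
- uses: actions/cache@v4
  with:
    path: .next/cache   # assumed build-cache location
    # Exact key is unique per commit, so no commit ever gets an
    # exact hit on another commit's build output.
    key: build-${{ runner.os }}-${{ hashFiles('package-lock.json') }}-${{ github.sha }}
    # Fallback: newest cache with the same OS and lockfile, used only
    # as a warm start for the incremental build.
    restore-keys: |
      build-${{ runner.os }}-${{ hashFiles('package-lock.json') }}-
```

The restore-keys fallback is only safe for caches the build tool revalidates itself. For finished output you intend to ship, drop restore-keys and accept the cold build.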

Artifact’s published. Deploy time — and this is where pipelines start hurting people.

Workflow 3: Deploy with Environment Protection

name: Deploy
on:
  workflow_run:
    workflows: ['Build']
    types: [completed]

permissions:
  id-token: write   # OIDC token for AWS
  contents: read
  actions: read     # required to download artifacts from the Build run

jobs:
  deploy:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    environment: production
    concurrency:
      group: production-deploy
      cancel-in-progress: false
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: build-${{ github.event.workflow_run.head_sha }}
          path: dist   # the artifact holds the contents of dist/, so restore them under dist/
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy
          aws-region: us-east-1
      - run: aws s3 sync ./dist s3://my-bucket --delete

The if: guard on workflow_run is non-negotiable. Without it, every completed build — including the failed ones — triggers a deploy attempt. Check conclusion == 'success' and let the rest fall through.

environment: production is the line that unlocks GitHub’s deployment protection rules. Required reviewers, wait timers, branch restrictions — you configure them in the repo’s Settings → Environments, not in YAML. The job pauses until someone approves it.

id-token: write turns on OIDC. This is the line that lets you delete every long-lived AWS access key from your repo secrets. configure-aws-credentials@v4 requests a signed OIDC token from GitHub and exchanges it with AWS STS for short-lived credentials, scoped to a role you control via trust policy. Nothing to rotate. Everything auditable. If you still have an AWS_ACCESS_KEY_ID in your repo secrets in 2026, fix it before you finish reading this article.
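
The trust policy is the half of this that lives on the AWS side. Here’s a rough CloudFormation sketch of what it might look like. It assumes the GitHub OIDC identity provider is already registered in the account, and my-org/my-repo is a placeholder, not a real repository.

```yaml
# CloudFormation sketch; org and repo names are placeholders
Resources:
  GitHubDeployRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: github-deploy
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action: sts:AssumeRoleWithWebIdentity
            Principal:
              Federated: !Sub 'arn:aws:iam::${AWS::AccountId}:oidc-provider/token.actions.githubusercontent.com'
            Condition:
              StringEquals:
                'token.actions.githubusercontent.com:aud': 'sts.amazonaws.com'
                # Jobs that declare `environment: production` present this subject
                'token.actions.githubusercontent.com:sub': 'repo:my-org/my-repo:environment:production'
```

Pinning the sub claim to the environment means only jobs that passed the production gate can assume the role. The role still needs its own permissions policy for the S3 bucket, which is omitted here.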

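cancel-in-progress: false on the concurrency group is deliberate. You want a new deploy to wait for the running one, not cancel it. Killing a half-finished deploy leaves the environment in an unknown state, and that’s the kind of incident that ruins weekends. One caveat: a concurrency group holds at most one pending run, so if several builds land during a deploy, only the newest one waits and the older pending runs are cancelled.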

Rollback: keep artifact retention long enough that re-running the previous deploy job is your rollback. Don’t build a separate rollback workflow on day one. The simpler version works for the first six months, and by then you’ll know what you actually need.

Pipeline ships end to end. Now the trapdoors.

5 YAML Traps That Break Teams

These are the failures that don’t show up in your test runs. They show up in incident reports.

Trap 1: pull_request_target with PR code checkout. The pull_request_target trigger runs in the base repo’s context, with access to your secrets. If you actions/checkout the PR’s HEAD inside that workflow, you’ve handed your secrets to whoever opened the PR. Use pull_request for anything that needs to execute PR code. Reserve pull_request_target for jobs that only read metadata, never run the PR’s scripts.
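
For contrast, a sketch of the kind of job pull_request_target is meant for: it labels the PR and never checks out or runs the PR’s code. It assumes a .github/labeler.yml config already exists on the base branch.

```yaml
name: Label PR
on:
  pull_request_target:
    types: [opened, synchronize]

permissions:
  contents: read
  pull-requests: write   # needed to apply labels

jobs:
  label:
    runs-on: ubuntu-latest
    steps:
      # No actions/checkout of the PR head anywhere in this job
      - uses: actions/labeler@v5
```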

Trap 2: Third-party actions pinned to tags. uses: some-org/some-action@v3 looks fine until the maintainer’s account gets compromised and v3 quietly gets retagged to a malicious commit. Pin to a full commit SHA: uses: some-org/some-action@a1b2c3d4e5f6.... Dependabot will bump the SHAs for you with full release notes. Tags are mutable. SHAs aren’t. This is the single fix that prevents the supply-chain incident that makes the news.
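
The Dependabot half is a short config file. A minimal sketch, saved as .github/dependabot.yml, with the weekly interval as an assumption:

```yaml
version: 2
updates:
  - package-ecosystem: github-actions
    directory: /          # scans .github/workflows/
    schedule:
      interval: weekly
```

Each bump arrives as a PR, so a changed SHA gets reviewed like any other dependency change.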

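Trap 3: Permissions wildcards. permissions: write-all is the obvious one, but no permissions: block at all can be just as broad. The workflow inherits the repository’s default GITHUB_TOKEN scope, and where that default is still the permissive setting, that means read/write access to repo contents, issues, pull requests, packages, and more. Declare permissions: at the top of every workflow, scoped to what it actually needs. contents: read is the default you want. Add others one at a time, with a comment explaining why.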
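
One way to add others one at a time is to grant them per job rather than per workflow. A sketch, where coverage-comment is a hypothetical job:

```yaml
permissions:
  contents: read             # workflow-wide default

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - run: npm test

  coverage-comment:          # hypothetical job
    runs-on: ubuntu-latest
    # A job-level block replaces the workflow-level one,
    # so restate everything this job needs.
    permissions:
      contents: read
      pull-requests: write   # posts the coverage summary as a PR comment
    steps:
      - run: echo "post the comment here"
```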

Trap 4: Cache key collisions. A key of hashFiles('package-lock.json') alone is shared across OSes and Node versions, and feature branches can restore caches created on the default branch. A cached node_modules with native addons compiled for a different platform or Node version gets restored as an exact hit, builds silently corrupt, and you debug for hours. Include ${{ runner.os }} and the Node version in the cache key. Use restore-keys for graceful fallback to a partial match: exact key first, then progressively less specific.
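
Spelled out with actions/cache, that key might look like the sketch below. It assumes the step runs inside the matrix job from ci.yml, so matrix.node-version is defined, and the npm- prefix is arbitrary.

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    # Exact match: same OS, same Node version, same lockfile
    key: npm-${{ runner.os }}-node${{ matrix.node-version }}-${{ hashFiles('package-lock.json') }}
    # Fallbacks, most specific first
    restore-keys: |
      npm-${{ runner.os }}-node${{ matrix.node-version }}-
      npm-${{ runner.os }}-
```

Caching ~/.npm rather than node_modules keeps the fallbacks safe: a partial match only saves downloads, and npm ci still builds node_modules fresh.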

Trap 5: Unquoted cron strings. on: schedule: - cron: 0 9 * * * happens to parse, but * is special in YAML: a value that starts with it, such as * * * * *, is read as an alias and the file fails to load. Worse, a cron expression that is valid but wrong never errors; the workflow just never runs. Quote the string: cron: '0 9 * * *'. And always pair schedule: with workflow_dispatch: so you can trigger it manually to confirm it actually fires. A scheduled workflow that silently fails to fire is the worst kind of bug, because you don’t notice for days.
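
Put together, a scheduled workflow could look like this. The name and the job body are placeholders.

```yaml
name: Nightly
on:
  schedule:
    - cron: '0 9 * * *'   # 09:00 UTC every day
  workflow_dispatch:       # adds a manual "Run workflow" button

permissions:
  contents: read

jobs:
  nightly:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - run: npm ci && npm test
```

Scheduled runs use the workflow file on the default branch, so merge it before you expect it to fire.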

Audit your YAML now. Traps 2, 3, and 4 map to specific lines in the workflows above.

The Bottom Line

Three workflows isn’t more complex than one. It’s less complex once it’s running. Each file has one job, one trigger, one permission scope. Failures stay where they happen, and you debug one thing at a time.

If you’re starting today, copy ci.yml first. Get it green across a Node matrix. Add build.yml with the cache once tests are stable. Layer in deploy.yml with environment protection last. Trying to ship all three on day one is how you end up debugging the artifact handoff at midnight.

Of the five traps, fix two this week: SHA-pin every third-party action, and lock down GITHUB_TOKEN permissions on every workflow. Those two alone prevent the incidents that get written up.

Copy the three YAML files into .github/workflows/, push, and watch the Actions tab. Pair it with a Git workflow that doesn’t fight your pipeline and your team will spend less time arguing about branches than writing CI.