Manual deployments are where bugs slip through and where teams lose confidence in their release process. Every manual step is an opportunity for human error: a skipped test, a wrong environment variable, or a deployment to the wrong cluster. For SaaS teams in Lebanon and the MENA region running backend services on AWS ECS, a properly built CI/CD pipeline is not a nice-to-have; it is a prerequisite for shipping with confidence.
This is how we build GitHub Actions pipelines that take Go services from a git push to a live rolling ECS deployment with zero manual steps and automatic rollback on failure.
What the pipeline needs to do
A complete pipeline for a Go service on ECS covers these stages in sequence:
1. Build and test the Go service
2. Run the dependency check (ensure no imports are missing from go.mod)
3. Build a minimal Docker image
4. Push the image to Amazon ECR
5. Update the ECS task definition with the new image
6. Trigger a rolling deployment on the ECS service
7. Monitor deployment health and roll back automatically on failure
Stage seven is what most basic pipeline tutorials skip, and it is the stage that matters most for production SaaS operations.
Repository structure for a Go service
A well-organized Go service for ECS deployment has a predictable layout:
/
├── .github/
│   └── workflows/
│       ├── test.yml          # runs on every PR
│       └── deploy.yml        # runs on push to main
├── cmd/
│   └── server/
│       └── main.go
├── internal/
├── Dockerfile
├── task-definition.json      # ECS task definition template
└── go.mod
The task-definition.json is a template checked into the repository. The pipeline fills in the image tag at deploy time.
The Dockerfile: building a minimal Go image
Small Docker images build faster, push faster, pull faster on ECS instances, and have a smaller attack surface. The multi-stage build is the standard approach for Go:
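# Build stage: compile a static binary
FROM golang:1.23-alpine AS builder
WORKDIR /app
# Copy module files first so the dependency layer is cached between builds
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags='-w -s' -o server ./cmd/server

# Runtime stage: no shell, no package manager
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /server
# distroless/static also ships its own CA bundle, so this copy is optional
COPY --from=builder /etc/ssl/certs /etc/ssl/certs
USER nonroot:nonroot
ENTRYPOINT ["/server"]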
The final image, built on distroless/static-debian12, contains little more than the Go binary and TLS certificates. No shell, no package manager, no extra libraries. The resulting image is typically 10 to 20 MB, compared to 100 to 200 MB for an ubuntu-based image. On ECS, smaller images mean faster task replacement during deployments.
The -ldflags='-w -s' flags strip the debug symbol table and DWARF data, which reduces binary size by 20 to 30% without affecting runtime behavior.
The test workflow: runs on every pull request
name: Test
on:
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_PASSWORD: test
          POSTGRES_DB: testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
          cache: true
      - name: Run tests
        env:
          DATABASE_URL: postgres://postgres:test@localhost:5432/testdb?sslmode=disable
        run: go test ./... -race -timeout 120s
      - name: Verify go.mod is tidy
        run: |
          go mod tidy
          git diff --exit-code go.mod go.sum
The embedded PostgreSQL service container runs the real database during tests. Testing against a real PostgreSQL instance rather than mocks catches SQL compatibility issues, constraint violations, and transaction behavior that SQLite or in-memory databases miss.
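-race enables Go's race detector. Race conditions in Go services show up under concurrent load, not in single-threaded unit tests, and the detector only reports races on code paths the tests actually run concurrently. Catching them in CI before they reach production is worth the slowdown in test execution, which the Go documentation puts at 2x to 20x for a typical program.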
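To make concrete what -race looks for, here is a minimal sketch. The type and function names are hypothetical and not taken from any real service. racyCounter performs an unsynchronized read-modify-write, which the race detector reports as soon as two goroutines call it concurrently; safeCounter guards the same field with a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCounter has a data race: Inc performs an unsynchronized
// read-modify-write on n. Concurrent Inc calls under `go test -race`
// or `go run -race` are reported by the detector.
type racyCounter struct{ n int }

func (c *racyCounter) Inc() { c.n++ }

// safeCounter protects n with a mutex, so concurrent Inc calls are race-free.
type safeCounter struct {
	mu sync.Mutex
	n  int
}

func (c *safeCounter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

func (c *safeCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently calls inc from `workers` goroutines, `perWorker` times each.
func countConcurrently(workers, perWorker int, inc func()) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				inc()
			}
		}()
	}
	wg.Wait()
}

func main() {
	var safe safeCounter
	countConcurrently(8, 1000, safe.Inc)
	fmt.Println("safe counter:", safe.Value())

	// Swapping in racyCounter here can lose increments and is flagged
	// by the race detector:
	//   var racy racyCounter
	//   countConcurrently(8, 1000, racy.Inc)
}
```

A test that only ever calls Inc from one goroutine would pass with either type, which is why the detector needs tests that exercise concurrent paths.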
The deployment workflow
name: Deploy
on:
  push:
    branches: [main]
permissions:
  id-token: write   # required for OIDC role assumption
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    # Same PostgreSQL service as test.yml, so the test step below has a database
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_PASSWORD: test
          POSTGRES_DB: testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{ vars.AWS_ACCOUNT_ID }}:role/github-deploy
          aws-region: ${{ vars.AWS_REGION }}
      - name: Login to ECR
        id: ecr-login
        uses: aws-actions/amazon-ecr-login@v2
      - uses: actions/setup-go@v5
        with:
          go-version-file: go.mod
          cache: true
      - name: Run tests
        env:
          DATABASE_URL: postgres://postgres:test@localhost:5432/testdb?sslmode=disable
        run: go test ./... -race -timeout 120s
      - name: Build and push image
        id: build
        env:
          REGISTRY: ${{ steps.ecr-login.outputs.registry }}
          REPO: ${{ vars.ECR_REPOSITORY }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t "$REGISTRY/$REPO:$IMAGE_TAG" -t "$REGISTRY/$REPO:latest" .
          docker push "$REGISTRY/$REPO:$IMAGE_TAG"
          docker push "$REGISTRY/$REPO:latest"
          echo "image=$REGISTRY/$REPO:$IMAGE_TAG" >> "$GITHUB_OUTPUT"
      - name: Render ECS task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: api
          image: ${{ steps.build.outputs.image }}
      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: ${{ vars.ECS_SERVICE }}
          cluster: ${{ vars.ECS_CLUSTER }}
          wait-for-service-stability: true
          wait-for-minutes: 10
          force-new-deployment: true
AWS authentication: OIDC instead of long-lived keys
The configuration above uses role-to-assume rather than AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. This is the OIDC (OpenID Connect) approach where GitHub Actions assumes an IAM role directly without storing long-lived credentials in GitHub Secrets.
The benefit is significant: there are no long-lived keys to rotate, no risk of a key leaking from a repository, and the temporary credentials issued for the assumed role expire on their own after a short session.
To configure this, create an IAM OIDC identity provider in AWS for GitHub Actions (one-time setup), then create an IAM role that the GitHub Actions workflows for your specific repository can assume, with permissions scoped to ECR push, ECS task definition registration, and ECS service update.
Zero-downtime rolling deployments
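ECS rolling deployments replace tasks in batches while keeping the service available. If the service has the deployment circuit breaker enabled with rollback, ECS also stops a deployment whose new tasks keep failing their health checks and returns to the last completed deployment; without it, the service keeps retrying the new task definition.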
For zero-downtime, your ECS service needs:
A properly configured health check. ECS waits for new tasks to pass their health check before terminating old tasks. Your application must respond to GET /health with a 200 within the health check timeout. If the health check endpoint itself requires database access, ensure the database connection is validated on startup.
Minimum healthy percent above zero. Set minimumHealthyPercent to at least 50 in your ECS service configuration. With two tasks, this means ECS will keep at least one task running during the deployment.
Adequate deployment timeout. The wait-for-minutes: 10 setting in the GitHub Actions step caps how long the step waits for the service to stabilize. If your service does not reach a stable state within 10 minutes, the step fails and GitHub Actions returns a non-zero exit code, even though ECS itself may still be attempting the rollout.
Automatic rollback on failure
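When wait-for-service-stability: true is set and the deployment does not stabilize, the GitHub Actions step fails. Unless the deployment circuit breaker is enabled with rollback on the service, ECS does not automatically revert to the previous task definition, but you can add a rollback step:
- name: Rollback on failure
  if: failure()
  run: |
    # After a failed rollout the service already points at the new task
    # definition; the previous one is still listed as an ACTIVE deployment.
    PREV_TASK_DEF=$(aws ecs describe-services \
      --cluster "${{ vars.ECS_CLUSTER }}" \
      --services "${{ vars.ECS_SERVICE }}" \
      --query "services[0].deployments[?status=='ACTIVE'] | [0].taskDefinition" \
      --output text)
    if [ -z "$PREV_TASK_DEF" ] || [ "$PREV_TASK_DEF" = "None" ]; then
      echo "No previous deployment found; nothing to roll back to."
      exit 0
    fi
    aws ecs update-service \
      --cluster "${{ vars.ECS_CLUSTER }}" \
      --service "${{ vars.ECS_SERVICE }}" \
      --task-definition "$PREV_TASK_DEF"
This step runs only when a previous step has failed. It does not read the service's top-level taskDefinition field, because once the deploy step has called UpdateService that field already names the new, failing revision. Instead it looks for the deployment still marked ACTIVE, which is the previous version whose tasks are still running, and updates the service back to it. If the failure happened before the deploy step (for example, in the tests), there is no such deployment and the step exits without changing anything.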
Environment-specific configuration
Environment variables for the running service should come from AWS Systems Manager Parameter Store or Secrets Manager, not from the task definition file checked into the repository. Secrets in the task definition file would be committed to version control.
In the task definition template:
{
  "secrets": [
    {
      "name": "DATABASE_URL",
      "valueFrom": "arn:aws:ssm:eu-west-1:123456789012:parameter/voxire/prod/database_url"
    }
  ]
}
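ECS pulls the secret value from SSM at task launch time and injects it into the container as an environment variable; the task execution role needs permission to read the parameter. Rotating a database password requires updating the SSM parameter and restarting the tasks, not modifying any code or configuration files.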
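On the service side, the injected value arrives as an ordinary environment variable. A minimal sketch of reading it at startup follows; the config struct and loadConfig function are hypothetical names, and only DATABASE_URL comes from the task definition shown here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config holds settings that ECS injects as environment variables.
type config struct {
	DatabaseURL string
}

// loadConfig reads the required settings through lookup, which is
// os.LookupEnv in production. Taking it as a parameter keeps the
// function easy to test without touching the process environment.
func loadConfig(lookup func(string) (string, bool)) (config, error) {
	dbURL, ok := lookup("DATABASE_URL")
	if !ok || dbURL == "" {
		return config{}, errors.New("DATABASE_URL is not set")
	}
	return config{DatabaseURL: dbURL}, nil
}

func main() {
	cfg, err := loadConfig(os.LookupEnv)
	if err != nil {
		// A real service would log this and exit non-zero so the task
		// fails fast instead of starting without a database.
		fmt.Println("config error:", err)
		return
	}
	// Never print the value itself; it contains credentials.
	fmt.Println("database URL loaded, length:", len(cfg.DatabaseURL))
}
```

Failing at startup when the variable is missing means a misconfigured task never passes its health check, so a rolling deployment does not replace healthy tasks with it.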
Key lessons from production
OIDC for AWS authentication eliminates the long-lived credential management problem permanently. The setup takes 20 minutes and the operational benefit is significant, particularly for teams across Lebanon and the MENA region where credential rotation practices are inconsistent.
Running tests in the deployment workflow, not just the PR workflow, catches cases where a dependency in the main branch changed between when a PR was reviewed and when it was merged.
Distroless base images are worth the learning curve. The build time difference over ubuntu-based images is meaningful at the frequency teams deploy.
The rollback step should be in every deployment workflow. Without it, a failed deployment leaves the service in a degraded state that requires manual intervention.
Not sure where to start?
Voxire builds deployment pipelines and AWS infrastructure for Go SaaS teams in Lebanon and across the MENA region. If you are building CI/CD from scratch or migrating from a manual deployment process, we can help design and implement the pipeline.
https://voxire.com/get-a-quote/



