fix: [#428] Docker vulnerability remediation pass 1 (all 8 images) #436

Merged
josecelano merged 39 commits into main from 428-remediate-docker-vulnerabilities-apr2026 on Apr 9, 2026

@josecelano
Member

Summary

This PR implements Docker vulnerability remediation pass 1 for all 8 images tracked in issue #428. Each image was scanned with Trivy, remediation was applied where possible, results were verified, and follow-up issues were created for remaining unresolved CVEs.

Closes #428
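
The per-image scans mentioned in the summary can be reproduced with the Trivy CLI. The following is a minimal sketch, not the project's actual workflow: the function name `scan_image` is hypothetical, and the severity filter is an assumption chosen to match the HIGH/CRITICAL counts reported in this PR.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): scan one image and report
# only HIGH and CRITICAL findings.
scan_image() {
    image="$1"
    trivy image --severity HIGH,CRITICAL "$image"
}

# Example usage (requires Trivy and access to the image):
# scan_image caddy:2.10.2
```

Running the same command before and after a change gives the before/after counts recorded in the scan files.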

Changes by Image

1. torrust/tracker-deployer (trixie) — partial remediation

2. torrust/tracker-backup (trixie) — remediation applied, no change in vulnerability counts

3. torrust/tracker-ssh-server (Alpine 3.23.3) — fully remediated ✅

  • Added apk upgrade --no-cache to base layer
  • Fixed malformed entrypoint script (echo → printf for multi-line output in Alpine)
  • After: 0 HIGH, 0 CRITICAL
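
The echo-to-printf change matters because `echo` handles backslash escapes differently across shells, while `printf` behaves the same everywhere. The sketch below illustrates the pattern only: the function name `write_lines` and the sample content are hypothetical and are not taken from the real entrypoint script.

```shell
#!/bin/sh
# Hypothetical example: write multi-line content portably.
# With the '%s\n' format, printf emits each argument on its own line,
# so no shell-specific escape handling is involved.
write_lines() {
    target="$1"
    shift
    printf '%s\n' "$@" > "$target"
}

# Example usage with illustrative content (not the real entrypoint):
# write_lines /tmp/example.conf "first setting" "second setting"
```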

4. torrust/tracker-provisioned-instance (Ubuntu 24.04) — fully remediated ✅

  • Added --no-install-recommends + apt-get upgrade -y to base layer
  • After: 0 HIGH, 0 CRITICAL
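
As a rough illustration of the base-layer change, the commands a Dockerfile `RUN` step might execute are sketched below. This is an assumption about the general shape, not the actual Dockerfile: the function name `install_base_packages` is hypothetical and the package list is left to the caller.

```shell
#!/bin/sh
# Sketch of a base-layer install step (hypothetical, not the real Dockerfile).
# Upgrading first picks up available security fixes; --no-install-recommends
# keeps optional packages, and their vulnerabilities, out of the image.
install_base_packages() {
    apt-get update &&
    apt-get upgrade -y &&
    apt-get install -y --no-install-recommends "$@"
}

# Example usage inside an image build (package name is a placeholder):
# install_base_packages openssh-server
```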

5. caddy (3rd-party) — partial remediation

6. prom/prometheus (3rd-party) — partial remediation

7. grafana/grafana (3rd-party) — partial remediation

8. mysql (3rd-party) — monitored, no safe upgrade

Documentation Updates

  • docs/security/docker/scans/ — added remediation pass 1 sections to all 8 scan files
  • docs/security/docker/scans/README.md — updated global summary table
  • docs/issues/428-docker-vulnerability-analysis-apr8-2026.md — all checklists and acceptance criteria complete

Follow-up Issues Created

| Issue | Image | Remaining |
|-------|-------|-----------|
| #429 | deployer | 44 HIGH, 1 CRITICAL |
| #431 | backup | 6 HIGH, 0 CRITICAL |
| #432 | caddy | 14 HIGH, 4 CRITICAL |
| #433 | prometheus | 6 HIGH, 4 CRITICAL |
| #434 | grafana | 4 HIGH, 0 CRITICAL |
| #435 | mysql | 7 HIGH, 1 CRITICAL |

- add/update Docker scan reports for April 8, 2026

- add issue specification and sequential per-image remediation plan

- rename spec file with issue prefix and set issue reference

- add cspell dictionary term for remediation planning
- install gnupg only for OpenTofu installation step

- purge gnupg/dirmngr after OpenTofu install

- apply package upgrade during runtime dependency install

- mark deployer remediation subtask complete
- confirm deployer build and smoke test after remediation

- update scan counts to 44 HIGH / 1 CRITICAL

- mark rebuild, re-scan, and docs subtasks complete
- add apt-get upgrade in backup base image

- mark backup remediation subtask complete
- backup image rebuilt and validated

- vulnerability counts unchanged at 6 HIGH / 0 CRITICAL

- mark backup verification and docs subtasks complete
- apply apk upgrade to pick latest security fixes

- remove duplicate ssh host-key generation step

- mark ssh remediation subtask complete
- record vuln scan improvement to 0 HIGH / 0 CRITICAL

- document expected secret-scan test key findings

- mark ssh image checklist complete
- use --no-install-recommends

- add apt-get upgrade in base install step

- mark provisioned-instance remediation subtask complete
- record scan improvement from 12 HIGH to 0

- mark image 4 checklist fully complete
- update compose template caddy image tag

- sync docker security workflow image matrix

- mark caddy remediation subtask complete
- update caddy scan baseline to 2.10.2

- document reduction from 18/6 to 14/4 (HIGH/CRITICAL)

- mark caddy verification/docs subtasks complete
- update domain config and renderer expectations

- align docs/examples in source comments

- mark Prometheus remediation subtask complete
- update scan baseline to prom/prometheus:v3.5.1

- document reduction from 16/4 to 6/4 (HIGH/CRITICAL)

- mark prometheus verification/docs subtasks complete
- update domain config and code references

- align source examples and tests

- mark Grafana remediation subtask complete
- update scan baseline to grafana/grafana:12.4.2

- document reduction from 18/6 to 4/0 (HIGH/CRITICAL)

- mark Grafana verification/docs subtasks complete
@josecelano josecelano marked this pull request as ready for review April 9, 2026 11:59
…ed root cause)

The SSH connectivity timeout in GitHub runners was caused by files checked out
with world/group-readable permissions. OpenSSH refuses to use private keys that
are accessible by group or others, so the keys must be owner-only (mode 0600).
The CI test failure when permissions normalization was disabled confirms this
is the actual root cause, not a flaky test.

Normalizing to 0600 ensures SSH keys work regardless of git checkout permissions.
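
The normalization described above amounts to a single `chmod` before the key is used. The following is a minimal sketch, assuming the key path is known to the caller; the function name `normalize_key_permissions` is hypothetical, and the project's actual fix may be implemented differently.

```shell
#!/bin/sh
# Hypothetical helper: force owner-only permissions on a private key so
# OpenSSH accepts it regardless of how git checked the file out.
normalize_key_permissions() {
    key_path="$1"
    chmod 600 "$key_path"
}

# Example usage (the path is illustrative):
# normalize_key_permissions ./path/to/private_key
```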
@josecelano
Member Author

ACK 7a44e51

@josecelano
Member Author

📋 For Existing Deployments: How to Apply Docker Image Remediation

If you've already deployed the tracker using a previous version of the deployer (before this PR), you'll need to manually update your Docker images to benefit from these vulnerability fixes.

Docker Images to Update

| Service | Previous Tag | New Tag | Vulnerability Impact |
|---------|--------------|---------|----------------------|
| prom/prometheus | v3.5.0 | v3.5.1 | 10 CVEs fixed (16 HIGH → 6 HIGH) |
| grafana/grafana | 12.3.1 | 12.4.2 | 14 CVEs fixed (18 HIGH, 6 CRITICAL → 4 HIGH, 0 CRITICAL) |
| caddy | 2.10 (or untagged) | 2.10.2 | 4 CVEs fixed (18 HIGH, 6 CRITICAL → 14 HIGH, 4 CRITICAL) |

All other services are already updated in this version of the deployer.

Update Steps

  1. Update docker-compose.yml in your deployment

    cd /opt/torrust
    nano docker-compose.yml  # or your preferred editor

    Find and update these lines:

    # Before
    prometheus:
      image: prom/prometheus:v3.5.0
    
    grafana:
      image: grafana/grafana:12.3.1
    
    caddy:
      image: caddy:2.10
    
    # After
    prometheus:
      image: prom/prometheus:v3.5.1
    
    grafana:
      image: grafana/grafana:12.4.2
    
    caddy:
      image: caddy:2.10.2
  2. Pull the new images

    docker compose pull prometheus grafana caddy
  3. Recreate containers with new images

    docker compose up -d prometheus grafana caddy
  4. Verify services are running

    docker compose ps
    docker compose logs prometheus -n 50  # Check for any startup issues
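
For deployments where editing the file by hand is inconvenient, the tag changes in step 1 could also be scripted. This is a sketch under assumptions: the function name `update_image_tags` is hypothetical, and it expects the `image:` lines to appear exactly as shown in the "Before" block with no trailing whitespace. A `.bak` copy is kept for rollback.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the three image tags in a compose file.
# The '$' anchors make the rewrite idempotent (caddy:2.10.2 is not
# matched again by the caddy:2.10 pattern).
update_image_tags() {
    compose_file="$1"
    cp "$compose_file" "$compose_file.bak"
    sed -e 's|image: prom/prometheus:v3\.5\.0$|image: prom/prometheus:v3.5.1|' \
        -e 's|image: grafana/grafana:12\.3\.1$|image: grafana/grafana:12.4.2|' \
        -e 's|image: caddy:2\.10$|image: caddy:2.10.2|' \
        "$compose_file.bak" > "$compose_file"
}

# Example usage:
# update_image_tags /opt/torrust/docker-compose.yml
```

Steps 2 to 4 stay the same after running it.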

Verification

After updating, you can verify that the new image tags have been pulled:

docker images | grep -E "prom/prometheus|grafana/grafana|caddy"

Expected output (or similar):

prom/prometheus                     v3.5.1    ...
grafana/grafana                     12.4.2    ...
caddy                               2.10.2    ...
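
Note that `docker images` lists every image present locally, including old tags that are no longer in use. To see which images the containers of these services were actually created from, `docker compose images` is an alternative. A minimal sketch, assuming it is run from the deployment directory (`/opt/torrust` in the steps above); the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: show the images backing the three updated services.
show_service_images() {
    docker compose images prometheus grafana caddy
}

# Example usage:
# cd /opt/torrust && show_service_images
```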

Rollback (if needed)

If you experience issues, you can quickly roll back by reverting the docker-compose.yml changes and running:

docker compose up -d prometheus grafana caddy

What About Future Deployments?

Any new deployments created with this version of the deployer (v0.1.0+) will automatically use the updated image versions. The deployer pins these versions in the code:

  • src/domain/prometheus/config.rs → v3.5.1
  • src/domain/grafana/config.rs → 12.4.2
  • templates/docker-compose/docker-compose.yml.tera → caddy:2.10.2

Related: See follow-up issues #429 and #431–#435 (listed in the PR description) for tracking remaining unresolved CVEs and future security improvements.

@josecelano
Member Author

Follow-up on the Trivy: 5 configurations not found warning:

Root cause identified in workflow logs: third-party SARIF uploads are failing with HTTP 422 because the custom gh api /code-scanning/sarifs call sends category, which is now rejected (Invalid input: "category" is not a permitted key).

This warning is therefore a code-scanning upload/configuration issue, not a regression in the Docker remediation work in this PR.

I opened a dedicated fix issue: #437.

Given all required CI checks are green and this warning is neutral, PR #436 can be merged safely, then #437 can fix the SARIF upload path in a focused follow-up.

@josecelano josecelano merged commit e300498 into main Apr 9, 2026
49 checks passed

Development

Successfully merging this pull request may close these issues.

Analyze and remediate Docker image vulnerabilities from April 8, 2026 scan