fix: ARC/DinD pre-create mount dirs, install docker compose, and pull build-tools image #42906

Merged
lpcox merged 15 commits into main from fix/arc-dind-mkdir-mount-paths on Jul 2, 2026

lpcox commented Jul 2, 2026

Collaborator

### Summary

Fixes ARC/DinD (Actions Runner Controller + Docker-in-Docker) topology failures by addressing path visibility constraints: on ARC/DinD, /tmp/gh-aw/ is not accessible to the Docker daemon, so all mounts, binaries, prompts, and log paths must reside under ${RUNNER_TEMP}/gh-aw/ (a bind-mounted, daemon-visible location). Also bumps DefaultFirewallVersion to v0.27.22.

### Changes

#### Runtime / mount fixups (pkg/workflow/awf_helpers.go)

- Pre-creates rw mount source directories (${RUNNER_TEMP}/gh-aw/home, ${RUNNER_TEMP}/gh-aw/sandbox/agent) before AWF invocation. AWF validates that mount source paths exist before starting containers; the parent dir is created by actions/setup but the subdirectories may not exist.
- Copies prompt files from /tmp/gh-aw/aw-prompts/ to ${RUNNER_TEMP}/gh-aw/aw-prompts/ so they are visible to the Docker daemon.

#### Node.js daemon-visibility (pkg/workflow/compiler_yaml_main_job.go)

- Inserts "Ensure Node.js is at daemon-visible path" setup step for ARC/DinD. setup-node may cache node under /home/runner/_work/_tool/node/..., which is not bind-mounted into the AWF container. The step copies node to ${RUNNER_TEMP}/gh-aw/tool-cache/node if needed and sets GH_AW_NODE_BIN.
- collectArtifactPaths: for ARC/DinD with firewall, uses ${{ runner.temp }}/gh-aw/... (Actions expressions) instead of /tmp/gh-aw/... constants, because actions/upload-artifact's with: block does not expand shell variables.

#### Copilot CLI daemon-visibility (pkg/workflow/nodejs.go, pkg/workflow/copilot_engine_execution.go)

- Adds "Copy Copilot CLI to daemon-visible path" step for ARC/DinD + firewall: copies /usr/local/bin/copilot to ${RUNNER_TEMP}/gh-aw/bin/copilot.
- Prompt file path in buildCopilotBaseCommand now switches to ${RUNNER_TEMP}/gh-aw/aw-prompts/prompt.txt for ARC/DinD (both sandbox and non-sandbox modes).

#### Docker Compose installation (pkg/workflow/copilot_engine_installation.go, pkg/workflow/nodejs.go)

- New generateDockerComposeInstallStep(): downloads Docker Compose CLI plugin v2.36.2 for the runner's arch (x86_64 or aarch64). Injected into the install sequence for ARC/DinD runners, which may not have it pre-installed. AWF requires Docker Compose to orchestrate the squid-proxy, agent, and api-proxy containers.

#### Build-tools sysroot image (pkg/workflow/docker.go)

- collectDockerImages: for ARC/DinD + firewall, appends build-tools:<awfImageTag> to the image pull list. AWF uses this as an init container to populate the sysroot volume with system binaries (gcc, make, libraries) that are not visible on the DinD daemon's filesystem.

#### Firewall log paths (pkg/workflow/engine_firewall_support.go)

- generateSquidLogsUploadStep: now accepts workflowData; uses ${{ runner.temp }}/gh-aw/sandbox/firewall/logs/ for ARC/DinD in the with: block.
- generateFirewallLogParsingStep: uses ${RUNNER_TEMP}/gh-aw/sandbox/firewall/logs in run: and ${{ runner.temp }}/gh-aw/sandbox/firewall/logs in env: for ARC/DinD (Actions expressions are required in env: blocks, shell vars in run: blocks).

#### Version bump (pkg/constants/version_constants.go)

- DefaultFirewallVersion: v0.27.20 → v0.27.22.

### Tests

- pkg/workflow/awf_helpers_test.go: TestBuildAWFCommand_ArcDindPreCreatesMountDirs verifies mkdir -p for mount source dirs and --mount flags for rw paths.
- pkg/workflow/docker_build_tools_test.go (new): TestCollectDockerImages_BuildToolsForArcDind with 4 sub-tests covering inclusion/exclusion of build-tools image across topology and firewall combinations.
- pkg/workflow/testdata/TestWasmGolden_*/: all golden files updated to reflect version bump and new ARC/DinD steps.

### Affected topologies

All changes are guarded by isArcDindTopology(). Standard (non-ARC/DinD) runner behaviour is unchanged.

Generated by PR Description Updater for #42906 · 108.6 AIC · ⌖ 9.31 AIC · ⊞ 4.7K ·

lpcox and others added 3 commits July 1, 2026 22:16
AWF validates that --mount host paths exist before starting containers.
On ARC/DinD, the compiler emits mounts for ${RUNNER_TEMP}/gh-aw/home and
${RUNNER_TEMP}/gh-aw/sandbox/agent which may not exist yet (only the parent
${RUNNER_TEMP}/gh-aw/ is created by actions/setup).

Add mkdir -p for these directories in the generated shell script, right
after the DOCKER_HOST detection probe and before the AWF invocation.

Error seen in canary run 28566765653:
  Invalid volume mount: ...gh-aw/home:...gh-aw/home:rw
  Reason: Host path does not exist: ...gh-aw/home

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
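
The pre-creation described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in awf_helpers.go: `mkdirPreamble` and `arcDindMountDirs` are hypothetical names, and only the two directory paths come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// arcDindMountDirs lists the rw mount sources named in the commit message.
var arcDindMountDirs = []string{
	"${RUNNER_TEMP}/gh-aw/home",
	"${RUNNER_TEMP}/gh-aw/sandbox/agent",
}

// mkdirPreamble is a hypothetical helper. It returns one shell line that
// creates every mount source directory. Paths are double-quoted so that
// ${RUNNER_TEMP} still expands in the shell while spaces in its value stay safe.
func mkdirPreamble(dirs []string) string {
	if len(dirs) == 0 {
		return ""
	}
	quoted := make([]string, 0, len(dirs))
	for _, d := range dirs {
		quoted = append(quoted, `"`+d+`"`)
	}
	return "mkdir -p " + strings.Join(quoted, " ")
}

func main() {
	// The generated line would be emitted before the AWF invocation.
	fmt.Println(mkdirPreamble(arcDindMountDirs))
}
```

Emitting a single `mkdir -p` keeps the step idempotent: it succeeds whether or not actions/setup or an earlier step already created the directories.
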
ARC/DinD runners may not have the Docker Compose v2 CLI plugin
pre-installed. AWF requires 'docker compose' to orchestrate its
containers (squid-proxy, agent, api-proxy).

Add a generated step that downloads and installs docker-compose
v2.36.2 into $DOCKER_CONFIG/cli-plugins/ before the AWF invocation.

Error seen in canary: 'unknown shorthand flag: -d in -d' when AWF
tried to run 'docker compose up -d'.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
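
A rough sketch of how the download target for the plugin might be chosen per architecture. The PR does not show the URL it uses, so the URL layout below is an assumption based on the asset naming of Docker Compose's GitHub releases, and `composeAssetURL` is a hypothetical helper.

```go
package main

import "fmt"

// composeAssetURL returns a download URL for the Docker Compose CLI plugin.
// Assumption: the layout mirrors Docker Compose's GitHub release assets
// (docker-compose-linux-<machine>); the generated step may differ.
func composeAssetURL(version, machine string) (string, error) {
	switch machine {
	case "x86_64", "aarch64": // the two architectures named in the PR description
		return fmt.Sprintf(
			"https://github.com/docker/compose/releases/download/%s/docker-compose-linux-%s",
			version, machine), nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", machine)
	}
}

func main() {
	url, err := composeAssetURL("v2.36.2", "aarch64")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The binary would then be saved as $DOCKER_CONFIG/cli-plugins/docker-compose
	// and made executable, per the commit message.
	fmt.Println(url)
}
```
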
AWF uses --pull never when starting containers, so all images must be
pre-downloaded. The build-tools image (used as the sysroot-stage init
container on arc-dind topology) was missing from the download list,
causing: 'No such image: ghcr.io/github/gh-aw-firewall/build-tools:0.27.21'

Add build-tools to collectDockerImages() when arc-dind topology is active.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings July 2, 2026 05:38

Copilot AI left a comment

Contributor

Pull request overview

This PR improves ARC/DinD runner topology support by ensuring required host mount directories exist before AWF runs, guaranteeing Docker Compose v2 availability for AWF orchestration, and pre-pulling the AWF build-tools image that ARC/DinD needs when --pull never is used.

Changes:

  • Pre-create ARC/DinD bind-mount source directories before invoking AWF.
  • Install the Docker Compose v2 CLI plugin on ARC/DinD runners.
  • Include the AWF build-tools image in the docker image pre-download set for ARC/DinD.

| File | Description |
| --- | --- |
| pkg/workflow/nodejs.go | Adds ARC/DinD-specific Docker Compose plugin install step to the AWF-injected npm engine install sequence. |
| pkg/workflow/docker.go | Ensures the AWF build-tools image is included in collected images for ARC/DinD + firewall-enabled workflows. |
| pkg/workflow/copilot_engine_installation.go | Introduces a workflow step generator that installs the Docker Compose v2 CLI plugin. |
| pkg/workflow/awf_helpers.go | Creates required ARC/DinD mount directories before AWF invocation. |
| pkg/workflow/awf_helpers_test.go | Adds a unit test asserting ARC/DinD mount directories are pre-created in the generated AWF command. |

Review details

  • Files reviewed: 5/5 changed files
  • Comments generated: 3
  • Review effort level: Low

Comment thread pkg/workflow/copilot_engine_installation.go Outdated
Comment thread pkg/workflow/docker.go
Comment on lines +135 to +146
// Add build-tools sysroot image for ARC/DinD topology.
// AWF uses this as an init container to populate the sysroot volume with
// system binaries (gcc, make, libraries) that are invisible on the DinD daemon's FS.
if isArcDindTopology(workflowData) {
	buildToolsImage := constants.DefaultFirewallRegistry + "/build-tools:" + awfImageTag
	if !setutil.Contains(imageSet, buildToolsImage) {
		images = append(images, buildToolsImage)
		imageSet[buildToolsImage] = struct{}{}
		dockerLog.Printf("Added AWF build-tools sysroot container for arc-dind: %s", buildToolsImage)
	}
}

Contributor

Added TestCollectDockerImages_BuildToolsForArcDind in pkg/workflow/docker_build_tools_test.go (commit $(git rev-parse --short HEAD)). It covers four cases: image included when arc-dind + firewall enabled, excluded without arc-dind topology, excluded when firewall is disabled, and correct tag used with a custom AWF version.

Contributor

Added TestCollectDockerImages_BuildToolsForArcDind in pkg/workflow/docker_build_tools_test.go (18d8420). Covers: image included when arc-dind + firewall enabled, excluded without arc-dind topology, excluded when firewall is disabled, and correct tag used with a custom AWF version.
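
The four cases listed above can be illustrated with a self-contained sketch. This is not the test in docker_build_tools_test.go: `collectImages` is a hypothetical stand-in for collectDockerImages, and the registry value is assumed from the image name quoted in the commit message.

```go
package main

import "fmt"

// Assumption: registry taken from the image name quoted in the commit message.
const firewallRegistry = "ghcr.io/github/gh-aw-firewall"

// collectImages is a hypothetical stand-in for collectDockerImages. It appends
// the build-tools image only when both arc-dind and the firewall are active,
// and never adds it twice.
func collectImages(base []string, arcDind, firewall bool, tag string) []string {
	images := append([]string(nil), base...) // copy so the caller's slice is untouched
	if !arcDind || !firewall {
		return images
	}
	buildTools := firewallRegistry + "/build-tools:" + tag
	for _, img := range images {
		if img == buildTools {
			return images
		}
	}
	return append(images, buildTools)
}

func main() {
	cases := []struct {
		name     string
		arcDind  bool
		firewall bool
		tag      string
	}{
		{"arc-dind + firewall", true, true, "0.27.22"},
		{"no arc-dind", false, true, "0.27.22"},
		{"firewall disabled", true, false, "0.27.22"},
		{"custom tag", true, true, "1.2.3"},
	}
	for _, c := range cases {
		fmt.Printf("%s: %v\n", c.name, collectImages(nil, c.arcDind, c.firewall, c.tag))
	}
}
```
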

Comment thread pkg/workflow/docker.go
Comment on lines +135 to +145
// Add build-tools sysroot image for ARC/DinD topology.
// AWF uses this as an init container to populate the sysroot volume with
// system binaries (gcc, make, libraries) that are invisible on the DinD daemon's FS.
if isArcDindTopology(workflowData) {
	buildToolsImage := constants.DefaultFirewallRegistry + "/build-tools:" + awfImageTag
	if !setutil.Contains(imageSet, buildToolsImage) {
		images = append(images, buildToolsImage)
		imageSet[buildToolsImage] = struct{}{}
		dockerLog.Printf("Added AWF build-tools sysroot container for arc-dind: %s", buildToolsImage)
	}

Contributor

Valid observation. The embedded action_pins.json is synced from .github/aw/actions-lock.json via make sync-action-pins, which is populated during the gh aw update / release cycle. The build-tools image has never been included in that cycle because it's a new image type — the existing pinned versions only cover agent, squid, api-proxy, cli-proxy, and agent-act. The release workflow (release.md) validates that all container images referenced by compiled workflows have SHA pins before releasing, so the first release that includes this will require the pins to be present in actions-lock.json. Adding the digest pins requires registry access to fetch them — that needs to happen as part of the gh aw update pin-refresh process, which is the appropriate path for all firewall images. Flagging this as a prerequisite for the first release that ships arc-dind support.

lpcox and others added 3 commits July 1, 2026 22:50
…RC/DinD

On ARC runners, setup-node may find a pre-cached node at the original
tool cache (/home/runner/_work/_tool/) which is NOT under RUNNER_TEMP
and therefore not bind-mounted into the AWF container. Add a step after
runtime setup that copies node to the redirected tool cache if needed.

Also add a step to copy the Copilot CLI binary from /usr/local/bin
(where install_copilot_cli.sh places it) to ${RUNNER_TEMP}/gh-aw/bin/
which is the daemon-visible path the AWF command references.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
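
The "copy if needed" decision in this commit comes down to whether a binary already lives under RUNNER_TEMP. A minimal sketch of that check, assuming hypothetical sample paths and a hypothetical helper `isUnder`; the actual step is a generated shell script and may test this differently.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isUnder reports whether path is root itself or lies inside root.
// It compares path components, so /x/_temp2 is not treated as being
// under /x/_temp.
func isUnder(path, root string) bool {
	rel, err := filepath.Rel(root, path)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	// Hypothetical sample values for illustration only.
	runnerTemp := "/home/runner/_work/_temp"
	nodeBin := "/home/runner/_work/_tool/node/bin/node"

	if isUnder(nodeBin, runnerTemp) {
		fmt.Println("node is already daemon-visible")
	} else {
		fmt.Println("node must be copied under", runnerTemp+"/gh-aw/tool-cache/node")
	}
}
```
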
…ARC/DinD

On ARC/DinD, /tmp/gh-aw/ is on the runner's filesystem but NOT visible to
the Docker daemon. The activation job writes prompt files to
/tmp/gh-aw/aw-prompts/, but AWF can only access daemon-visible paths.

Two fixes:
1. Copy /tmp/gh-aw/aw-prompts/ to ${RUNNER_TEMP}/gh-aw/aw-prompts/ before
   AWF invocation (in the arcDindDockerHostProbe preamble)
2. Change --prompt-file path in copilot engine from /tmp/gh-aw/... to
   ${RUNNER_TEMP}/gh-aw/... when arc-dind topology is active

Other engines (Claude, Codex) have the same hardcoded /tmp/ path and will
need similar fixes for arc-dind support.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
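
The two fixes in this commit can be sketched as a path switch plus a copy command. The names `promptFilePath` and `copyPromptsCommand` are hypothetical, and the non-ARC default path is an assumption inferred from the /tmp/gh-aw/aw-prompts/ directory named above.

```go
package main

import "fmt"

const (
	// Assumption: default prompt location, inferred from /tmp/gh-aw/aw-prompts/.
	defaultPromptPath = "/tmp/gh-aw/aw-prompts/prompt.txt"
	// Daemon-visible location named in the PR description.
	arcDindPromptPath = "${RUNNER_TEMP}/gh-aw/aw-prompts/prompt.txt"
)

// promptFilePath picks the path passed to --prompt-file for the topology.
func promptFilePath(arcDind bool) string {
	if arcDind {
		return arcDindPromptPath
	}
	return defaultPromptPath
}

// copyPromptsCommand is a hypothetical shell line for the preamble: it copies
// the contents of the activation job's prompt directory to the daemon-visible one.
func copyPromptsCommand() string {
	return `mkdir -p "${RUNNER_TEMP}/gh-aw/aw-prompts" && ` +
		`cp -r /tmp/gh-aw/aw-prompts/. "${RUNNER_TEMP}/gh-aw/aw-prompts/"`
}

func main() {
	fmt.Println(copyPromptsCommand())
	fmt.Println("--prompt-file", promptFilePath(true))
}
```
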

lpcox commented Jul 2, 2026

Collaborator Author

@copilot run pr-finisher skill

Copilot AI and others added 2 commits July 2, 2026 06:27
… arc-dind

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
The post-run steps (firewall log parsing, artifact upload) used hardcoded
/tmp/gh-aw/sandbox/firewall/ paths. On ARC/DinD, AWF writes logs to
${RUNNER_TEMP}/gh-aw/sandbox/firewall/ (daemon-visible). This mismatch
caused 'No log sources found' errors and empty artifact uploads.

Fix generateFirewallLogParsingStep(), generateSquidLogsUploadStep(), and
the unified artifact path list to use ${RUNNER_TEMP} paths when arc-dind
topology is active.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, make sure the branch is up to date with main, address any remaining review feedback, and rerun checks.

Generated by 👨‍🍳 PR Sous Chef · 10.5 AIC · ⌖ 16.1 AIC · ⊞ 6.4K ·

Copilot AI commented Jul 2, 2026

Contributor

@copilot please run the pr-finisher skill, make sure the branch is up to date with main, address any remaining review feedback, and rerun checks.

...

  • ✅ Reviews — Both open threads were already addressed in prior commits. Thread docker.go:135-146 (missing tests): TestCollectDockerImages_BuildToolsForArcDind was added in pkg/workflow/docker_build_tools_test.go (18d8420), covering all four branches. Thread docker.go:135-145 (missing embedded pin): replied with explanation that build-tools pin entries are added via make sync-action-pins / gh aw update during the release cycle — flagged as a prerequisite for the first arc-dind release. Both threads have @copilot replies. Programmatic thread resolution is blocked by the proxy in this environment; manual resolve by a maintainer is needed.
  • ✅ Checks (local) — make fmt, make lint, make test-unit, and make agent-report-progress all pass with no file changes.
  • ❓ Checks (CI) — Stale after agent push; needs human re-trigger. Prior agent job failure (84706356355) was EACCES: permission denied, lstat '/tmp/gh-aw/sandbox/firewall/logs' in the artifact upload post-step — infra-level permissions issue from the DinD firewall container leaving root-owned files, not caused by this PR's changes. Not reproducible locally.
  • ✅ Mergeable — Branch is 8 commits ahead of main, 0 behind. mergeable_state: unstable is driven by the agent CI failure, not a conflict.

Actions taken: No code changes needed in this pass; all three merge-ready conditions were already satisfied by prior commits.
Hand-off: A maintainer should (1) manually resolve the two open review threads, and (2) re-trigger CI (close/reopen or workflow_dispatch) to get a fresh green agent run before merging.

Copilot AI requested a review from gh-aw-bot July 2, 2026 08:20

github-actions Bot commented Jul 2, 2026

Contributor

Hey @lpcox 👋 — great work tracking down and fixing these three interrelated ARC/DinD pain points! The pre-create of rw mount directories, Docker Compose plugin installation, and inclusion of the build-tools image in collectDockerImages() are exactly the kind of targeted, well-understood fixes this subsystem needed.

The PR is well-scoped — all changes live in pkg/workflow/ and address a single topology, the description clearly explains each fix and why it's needed, and the new docker_build_tools_test.go file plus the additions to awf_helpers_test.go give solid coverage. This looks ready for review. ✅

Generated by ✅ Contribution Check · 133.2 AIC · ⌖ 18.4 AIC · ⊞ 6.3K ·

@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Unresolved review threads to address (newest first):

Generated by 👨‍🍳 PR Sous Chef · 8.92 AIC · ⌖ 16.3 AIC · ⊞ 6.4K ·

Copilot AI commented Jul 2, 2026

Contributor

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Unresolved review threads to address (newest first):...

  • ✅ Reviews — Both unresolved threads have @copilot replies from prior commits. Thread docker.go:135-146 (missing tests): TestCollectDockerImages_BuildToolsForArcDind was added in pkg/workflow/docker_build_tools_test.go (18d8420), covering all four branches (arc-dind+firewall, no arc-dind, firewall disabled, custom tag). Thread docker.go:135-145 (missing embedded pin): acknowledged and explained that build-tools pin entries are populated via make sync-action-pins / gh aw update during the release cycle — flagged as a prerequisite for the first arc-dind release. Programmatic thread resolution is proxy-blocked in this agent environment; a maintainer must resolve both threads manually.
  • ✅ Checks (local) — make fmt, make lint, make test-unit, and make agent-report-progress all pass on the current HEAD (d628540, which includes the merge from main).
  • ❓ Checks (CI) — A new CI run (28584906610) is in progress after the main-merge commit; results are not yet available. The prior agent job failure was EACCES: permission denied, lstat '/tmp/gh-aw/sandbox/firewall/logs' — an infra-level permissions issue from the DinD firewall container, unrelated to this PR's changes.
  • ✅ Mergeable — Branch was already merged up with main (merge commit d628540); 9 commits ahead, 0 behind.

Actions taken: No code changes in this pass. Branch is up to date and all local checks are green.
Hand-off: A maintainer should (1) manually resolve the two open review threads on docker.go, and (2) wait for the in-progress CI run (28584906610) to complete before merging.

lpcox and others added 2 commits July 2, 2026 06:58
…ith blocks

${RUNNER_TEMP} is a shell variable that only expands in `run:` blocks.
In `env:` and `with:` blocks, GitHub Actions only expands ${{ }}
expressions. The firewall log summary step and artifact upload paths
were using ${RUNNER_TEMP} which was passed as a literal string, causing
'No log sources found' even though AWF_LOGS_DIR was set.

Fix: use ${{ runner.temp }} in env/with contexts for ARC/DinD topology.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
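
The rule this commit applies can be sketched as a small helper that picks the reference form by YAML context. The type and function names are hypothetical; only the two reference forms and the log path come from the PR.

```go
package main

import "fmt"

// yamlContext names where in a workflow step a path will be written.
type yamlContext int

const (
	runBlock  yamlContext = iota // shell script: shell variables expand
	envBlock                     // env: only ${{ }} expressions expand
	withBlock                    // with: only ${{ }} expressions expand
)

// runnerTempRef returns the form of the runner temp directory that will
// actually be expanded in the given context, following the PR's convention.
func runnerTempRef(ctx yamlContext) string {
	if ctx == runBlock {
		return "${RUNNER_TEMP}"
	}
	return "${{ runner.temp }}"
}

// firewallLogsDir builds the ARC/DinD firewall log path for a context.
func firewallLogsDir(ctx yamlContext) string {
	return runnerTempRef(ctx) + "/gh-aw/sandbox/firewall/logs"
}

func main() {
	fmt.Println("run: ", firewallLogsDir(runBlock))
	fmt.Println("env: ", firewallLogsDir(envBlock))
	fmt.Println("with:", firewallLogsDir(withBlock))
}
```

Routing every path through one helper like this would make it harder for a new `env:` or `with:` entry to reintroduce the literal `${RUNNER_TEMP}` string.
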
@gh-aw-bot

Collaborator

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 5.87 AIC · ⌖ 21.7 AIC · ⊞ 6.4K ·

Copilot AI commented Jul 2, 2026

Contributor

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

...

  • ✅ Reviews: Both threads have @copilot replies from prior commits. Thread docker.go:135-146 (missing tests): addressed in 18d8420, which added TestCollectDockerImages_BuildToolsForArcDind to pkg/workflow/docker_build_tools_test.go. Thread docker.go:135-145 (missing embedded pin): acknowledged with explanation that build-tools pin entries are added via make sync-action-pins / gh aw update during the release cycle. Programmatic resolution is proxy-blocked; a maintainer must resolve both threads manually on GitHub.
  • ✅ Checks (local): make fmt and make agent-report-progress pass on HEAD 30da62a (second merge-from-main commit).
  • ❓ Checks (CI): New run 28596792313 is in progress as of triage; not yet complete. Prior agent failures were infra-level EACCES on DinD firewall sandbox dirs, unrelated to this PR.
  • ✅ Mergeable: Branch is up to date with main (merge commit 30da62a), no conflicts.

Actions taken: No code changes. Branch is current and local validation is green.
Hand-off: A maintainer must resolve the two open review threads on docker.go and wait for CI run 28596792313 to complete before merging.

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Picks up the pre-shutdown squid log chmod fix (#5817) which resolves
artifact permission errors on ARC/DinD runners.

lpcox commented Jul 2, 2026

Collaborator Author

@copilot resolve the merge conflicts in this pull request

Copilot AI and others added 2 commits July 2, 2026 16:44
…unt-paths

# Conflicts:
#	.github/workflows/plan.lock.yml

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot AI commented Jul 2, 2026

Contributor

@copilot resolve the merge conflicts in this pull request

Resolved. The only conflict was in .github/workflows/plan.lock.yml — a Copilot CLI version bump (1.0.67→1.0.68, SDK 1.0.4→1.0.5) from main that collided with the recompiled lock file on this branch. Took main's side (newer versions). Wasm golden files were regenerated post-merge. All unit tests pass.

Copilot finished work on behalf of lpcox July 2, 2026 16:51
lpcox merged commit 0b1173e into main Jul 2, 2026
lpcox deleted the fix/arc-dind-mkdir-mount-paths branch July 2, 2026 16:52

github-actions Bot commented Jul 2, 2026

Contributor

🎉 This pull request is included in a new release.

Release: v0.82.2

4 participants