Skip to content

feat(security): validate user-supplied strings flowing into shell context#109

Merged
VincentGuyader merged 2 commits into
masterfrom
feat/security-audit-interpolation
May 6, 2026
Merged

feat(security): validate user-supplied strings flowing into shell context#109
VincentGuyader merged 2 commits into
masterfrom
feat/security-audit-interpolation

Conversation

@VincentGuyader
Copy link
Copy Markdown
Member

Summary

Closes the post-#106 security audit follow-up. Adds 9 .validate_*
helpers in R/utils.R (@noRd) that reject inputs which would
otherwise be interpolated raw into the generated Dockerfile's
shell commands.

Validation now fires at function entry of dock_from_desc() and
dock_from_renv() for every public parameter that flows into a
Dockerfile RUN, COPY, FROM, ARG or ENV directive.

Sites closed

Param Site Vector
FROM FROM <X>:<r_version> [AS <Y>] newline-injection of new directives
AS FROM <X> AS <Y> same as FROM
repos values echo "options(repos = ...)" | tee Rprofile.site ' / " breaks R / shell quoting
repos names same RUN, via dput() backticks in names trigger shell command-substitution
extra_sysreqs apt-get install -y <pkg> ; injects arbitrary commands
renv_version R -e 'remotes::install_version("renv", version = "<x>")' '"); system("evil")... breaks the inner quoting
renv_paths_cache ARG RENV_PATHS_CACHE=<x> newline injects new directive
lockfile basename COPY <basename> renv.lock spaces break the COPY directive
use_pak echo "options(renv.config.pak.enabled = <%s>, ...)" non-logical breaks shell quoting
r_version (lockfile) FROM <X>:<r_version> defensive depth

Out of scope (per threat model)

  • R6 helpers in R/add.R (dock$RUN, dock$COPY, ...) -
    author-controlled API. Injection here is self-foot-shooting, not
    attack.
  • distro - deprecated, ignored at runtime.
  • github_pat - match.arg'd against a fixed enum.

Process note

This PR went through two rounds of pr-reviewer. The first round
caught 2 additional blocking vectors I had missed (use_pak
quote injection in the Rprofile.site echo, and names(repos)
backtick-substitution via dput()). Both are closed in this PR.

The discipline rule "spawn pr-reviewer before push, no exception"
(feedback_pr_reviewer_no_skip.md) saved this PR from shipping
with 2 holes the agent eventually catches.

Test plan

  • TDD red-first: 9 new expect_error() blocks across
    test-dock_from_desc.R and test-dock_from_renv.R. Each was
    asserted red-against-buggy by mental revert.
  • One existing test patched: renv_version = "banana" ->
    "1.2.3" since the new validator rejects non-semver strings.
    The test still exercises the same install_version code path.
  • Full suite: [ FAIL 0 | WARN 3 | SKIP 0 | PASS 302 ].
  • R CMD check --as-cran: 0 errors / 0 warnings / 1 note
    ("unable to verify current time" host glitch, unrelated).
  • All @param blocks updated to mention the validation
    contract per parameter.

Context

Tier-A item from the post-CRAN-prep roadmap (memory/project_dockerfiler_global_security_pass.md).
After this lands, master is ready for the next CRAN-readiness step
(check_win_devel, urlchecker, revdepcheck) or further feature
work per the maintainer's call.

The renv_paths_cache bullet still described the pre-#106 behaviour
(default = "/root/.cache/R/renv"). After #106, the signature default
is NULL with auto-derivation from `user`: /root for user = NULL,
/home/<user> otherwise. Update the bullet to reflect the actual
runtime, closing the doc-runtime drift introduced by the merge.

No code change. No NAMESPACE / Rd impact.
@VincentGuyader VincentGuyader requested a review from Copilot May 6, 2026 11:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens dock_from_desc() and dock_from_renv() against shell/Dockerfile injection by validating user-controlled inputs that are interpolated into Dockerfile directives and RUN commands.

Changes:

  • Added internal (@noRd) validation helpers for Docker image refs, build-stage names, repo URLs/names, package names, version tokens, cache paths, lockfile basenames, and scalar logicals.
  • Wired validation to run at function entry in dock_from_desc() / dock_from_renv(), plus validated the R version string read from renv.lock.
  • Added targeted test coverage and updated documentation/NEWS to describe the new validation contract.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
R/utils.R Introduces .validate_* helpers used to reject shell-unsafe inputs early.
R/dock_from_renv.R Calls validators at entry and validates lockfile-derived r_version; expands parameter documentation.
R/dock_from_desc.R Replaces inline strict_install validation with shared helper and validates other shell-sensitive inputs.
tests/testthat/test-dock_from_renv.R Updates an existing renv-version test case and adds security-focused rejection tests.
tests/testthat/test-dock_from_desc.R Adds security-focused rejection tests for shell/Dockerfile injection vectors.
NEWS.md Documents the security validation behavior and expands renv_paths_cache release notes.
man/dockerfiles.Rd Updates generated docs for dock_from_desc() parameter validation contract.
man/dock_from_renv.Rd Updates generated docs for dock_from_renv() parameter validation contract.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread R/utils.R Outdated
Comment on lines +154 to +163
bad <- !grepl(
"^https?://[A-Za-z0-9._~:/?#@!$&()*+,;=%-]+$",
x
)
if (any(bad)) {
stop(
"`repos` entries must be http(s) URLs without quotes, ",
"spaces or newlines; invalid: ",
paste(vapply(x[bad], deparse, character(1)), collapse = ", ")
)
Comment thread R/utils.R
Comment on lines +169 to +177
nms <- names(x)
if (!is.null(nms)) {
bad_nms <- !grepl("^[A-Za-z][A-Za-z0-9._-]*$", nms)
if (any(bad_nms)) {
stop(
"`names(repos)` must be simple identifiers ",
"(`^[A-Za-z][A-Za-z0-9._-]*$`); invalid: ",
paste(vapply(nms[bad_nms], deparse, character(1)), collapse = ", ")
)
Comment thread R/utils.R
Comment on lines +224 to +236
if (!is.character(x)) {
stop(
"`extra_sysreqs` must be a character vector, got: ",
deparse(x)
)
}
bad <- !grepl("^[a-z0-9][a-z0-9.+-]+$", x)
if (any(bad)) {
stop(
"`extra_sysreqs` entries must be Debian package names ",
"matching `^[a-z0-9][a-z0-9.+-]+$`; invalid: ",
paste(vapply(x[bad], deparse, character(1)), collapse = ", ")
)
Comment thread R/dock_from_renv.R
Comment on lines +159 to +167
.validate_lockfile(lockfile)
.validate_FROM(FROM)
.validate_AS(AS)
.validate_repos(repos)
.validate_extra_sysreqs(extra_sysreqs)
.validate_scalar_logical(use_pak, "use_pak")
if (!missing(renv_version)) {
.validate_renv_version(renv_version)
}
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Comment thread R/utils.R Outdated
Comment on lines +155 to +161
"^https?://[A-Za-z0-9._~:/?#@!$&()*+,;=%-]+$",
x
)
if (any(bad)) {
stop(
"`repos` entries must be http(s) URLs without quotes, ",
"spaces or newlines; invalid: ",
Comment thread R/utils.R Outdated
Comment on lines +111 to +118
"^[a-zA-Z0-9][a-zA-Z0-9._/-]*(:[a-zA-Z0-9._-]+)?(@sha256:[a-fA-F0-9]+)?$",
x
)
) {
stop(
"`FROM` must be a valid Docker image reference ",
"(alphanumerics, dot, slash, dash, underscore; optional `:tag` ",
"and / or `@sha256:<hex>`; no newlines or shell metacharacters), got: ",
Comment thread R/dock_from_renv.R Outdated
#' dash, underscore, optional `:tag` and / or `@sha256:<hex>`); other
#' values raise an error to prevent shell-metacharacter injection
#' into the generated FROM directive.
#' @param AS The AS of the Dockerfile. Default it `NULL`. When non-NULL,
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread R/utils.R Outdated
Comment on lines +175 to +182
"^https?://[A-Za-z0-9._~:/?#@!&*+,;=%-]+$",
x
)
if (any(bad)) {
stop(
"`repos` entries must be http(s) URLs containing only ",
"URL-safe characters (no quotes, parentheses, dollar, ",
"backtick, backslash, spaces or newlines); invalid: ",
Comment thread R/dock_from_renv.R
Comment on lines 158 to +164
github_pat <- match.arg(github_pat)
.validate_lockfile(lockfile)
.validate_FROM(FROM)
.validate_AS(AS)
.validate_repos(repos)
.validate_extra_sysreqs(extra_sysreqs)
.validate_scalar_logical(use_pak, "use_pak")
…text

Closes the post-#106 security audit follow-up. Adds 9 .validate_*
helpers in R/utils.R (@nord) that reject inputs which would
otherwise be interpolated raw into the generated Dockerfile's shell
commands. Each helper raises stop() with a clear message naming the
offending parameter.

Validators added:

- .validate_FROM (image reference: `^[a-zA-Z0-9][a-zA-Z0-9._/-]*(:tag)?(@sha256:hex)?$`)
- .validate_AS (build-stage name: `^[a-zA-Z0-9][a-zA-Z0-9._-]*$`)
- .validate_repos (https URLs + simple-identifier names; closes the
  `dput()`-backtick injection vector on names)
- .validate_extra_sysreqs (Debian package name: `^[a-z0-9][a-z0-9.+-]+$`)
- .validate_renv_version (semver-like, NULL allowed)
- .validate_lockfile (basename: alphanumerics + `._-`)
- .validate_renv_paths_cache (absolute path, no metacharacters)
- .validate_r_version (defensive depth on lockfile-derived value)
- .validate_scalar_logical (factored out from strict_install; used
  for use_pak too, closing a real injection vector via
  `echo "options(renv.config.pak.enabled = %s, ...)"`)

dock_from_desc() and dock_from_renv() call the relevant validators
at function entry, just after match.arg(github_pat).

The pr-reviewer agent flagged two additional vectors I missed in
the first pass: use_pak (B1, blocked) and names(repos) (B2, blocked
via the dput() backtick path). Both are closed by this commit.

Tests: 9 new expect_error() blocks across test-dock_from_desc.R
and test-dock_from_renv.R, each TDD red-against-buggy. One
existing test patched: renv_version = "banana" -> "1.2.3" since
the new validator rejects non-semver strings (the test was
exercising the install_version code path, which "1.2.3" still
covers).

devtools::test() -> [ FAIL 0 | WARN 3 | SKIP 0 | PASS 302 ].
R CMD check --as-cran: 0 errors / 0 warnings / 1 note (the
"unable to verify current time" host note, unrelated).

Out of scope for this PR (per the threat model agreed with the
maintainer):
- R6 helpers in R/add.R (dock$RUN, dock$COPY, ...) are
  author-controlled API; if the author writes a malicious RUN it
  is self-foot-shooting, not injection.
- distro is deprecated and ignored at runtime.
- github_pat is match.arg'd against a fixed enum.

NEWS.md gains a `## Security` section above `## New features`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants