Commit de5bf23

Merge pull request #3716 from divine7022/docs-joint-input-design
add input design documentation to book source
2 parents 6ee5086 + 4dc275d commit de5bf23

8 files changed: +86 -6 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -34,6 +34,10 @@ For more information about this file see also [Keep a Changelog](http://keepacha
 - Support for inspecting and plotting NetCDF output variables within the notebook workflow.
 - added support for soil temperature, relative humidity, soil moisture, and PPFD downscaling to `met_temporal_downscale.Gaussian_ensemble`
 - The PEcAn uncertainty analysis tutorial ("Demo 2") has been updated and reimplemented as a Quarto notebook at `documentation/tutorials/Demo_02_Uncertainty_Analysis/uncertainty.qmd`. (#3570)
+- Added the shared `input_design` matrix, generated via
+  `runModule.run.write.configs()`/`generate_joint_ensemble_design()`, that keeps
+  parameter draws and sampled inputs aligned across `run.write.configs()` and
+  `write.ensemble.configs()` (#3535, #3634, #3677).

 ### Fixed

base/workflow/R/run.write.configs.R

Lines changed: 11 additions & 1 deletion
@@ -8,7 +8,8 @@
 #'
 #' @param settings a PEcAn settings list
 #' @param ensemble.size number of ensemble runs
-#' @param input_design input indices for samples
+#' @param input_design data frame containing the design matrix describing parameter and input indices, as
+#' documented in \code{runModule.run.write.configs()}.
 #' @param write should the runs be written to the database?
 #' @param posterior.files Filenames for posteriors for drawing samples for ensemble and sensitivity
 #' analysis (e.g. post.distns.Rdata, or prior.distns.Rdata)
@@ -28,6 +29,15 @@
 run.write.configs <- function(settings, ensemble.size, input_design, write = TRUE,
                               posterior.files = rep(NA, length(settings$pfts)),
                               overwrite = TRUE) {
+
+  # Validate that input_design matches ensemble.size
+  if (nrow(input_design) != ensemble.size) {
+    stop(
+      "input_design has ", nrow(input_design), " rows, but ensemble.size is ",
+      ensemble.size, ". The design matrix must have exactly one row for each run."
+    )
+  }
+
   ## Skip database connection if settings$database is NULL or write is False
   if (!isTRUE(write) && is.null(settings$database)) {
     PEcAn.logger::logger.info("Not writing this run to database, so database connection skipped")
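
The row-count check added to `run.write.configs()` can be tried in isolation. The following is a minimal sketch in base R; `check_design_rows()` is a hypothetical helper written only for this illustration and is not part of PEcAn, and the small `design` data frame is made up.

```r
# Hypothetical helper mirroring the validation shown in the hunk above.
# It is NOT a PEcAn function; it only reproduces the row-count rule.
check_design_rows <- function(input_design, ensemble.size) {
  if (nrow(input_design) != ensemble.size) {
    stop(
      "input_design has ", nrow(input_design), " rows, but ensemble.size is ",
      ensemble.size, ". The design matrix must have exactly one row for each run."
    )
  }
  invisible(TRUE)
}

# Made-up three-run design: one parameter index and one met index per run.
design <- data.frame(param = 1:3, met = c(1, 2, 1))

# Matching size: the check passes silently.
ok <- check_design_rows(design, ensemble.size = 3)

# Mismatched size: capture the error message instead of aborting the script.
msg <- tryCatch(
  check_design_rows(design, ensemble.size = 5),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Failing early like this keeps a mismatched design from silently pairing runs with the wrong parameter draws or input files.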

base/workflow/R/runModule.run.write.configs.R

Lines changed: 22 additions & 1 deletion
@@ -2,7 +2,14 @@
 #'
 #' @param settings a PEcAn Settings or MultiSettings object
 #' @param overwrite logical: Replace config files if they already exist?
-#' @param input_design the input indices for samples
+#' @param input_design data.frame design matrix linking parameter draws and any
+#' sampled inputs across runs. Include a `param` column whose values select
+#' rows from `trait.samples`/`ensemble.samples` plus optional columns named for
+#' `settings$run$inputs` tags (e.g. `met`, `soil`) with index (i.e., row number)
+#' into each input's `path` list. Provide at least one row per planned run
+#' (median + all SA members and/or `ensemble.size`). Usually generated by
+#' `generate_joint_ensemble_design()` but custom designs may be supplied.
+#' If NULL, `generate_joint_ensemble_design()` will be called internally.
 #' @return A modified settings object, invisibly
 #' @importFrom dplyr %>%
 #' @export
@@ -24,6 +31,13 @@ runModule.run.write.configs <- function(settings,
       )
       input_design <- design_result$X
     }
+
+    # Validate design matrix size for MultiSettings
+    if (!is.null(settings$ensemble$size) && nrow(input_design) != settings$ensemble$size) {
+      PEcAn.logger::logger.severe("Input_design has", nrow(input_design), "rows but settings$ensemble$size is",
+                                  settings$ensemble$size, ". Design matrix must have exactly one row per run.")
+    }
+
     return(PEcAn.settings::papply(settings,
                                   runModule.run.write.configs,
                                   overwrite = FALSE,
@@ -41,6 +55,13 @@ runModule.run.write.configs <- function(settings,
       )
       input_design <- design_result$X
     }
+
+    # Validate design matrix size for Settings
+    if (!is.null(settings$ensemble$size) && nrow(input_design) != settings$ensemble$size) {
+      PEcAn.logger::logger.severe("Input_design has", nrow(input_design), "rows but settings$ensemble$size is",
+                                  settings$ensemble$size, ". Design matrix must have exactly one row per run.")
+    }
+
     ensemble_size <- nrow(input_design)


base/workflow/man/run.write.configs.Rd

Lines changed: 2 additions & 1 deletion
Some generated files are not rendered by default.

base/workflow/man/runModule.run.write.configs.Rd

Lines changed: 8 additions & 1 deletion
Some generated files are not rendered by default.

book_source/03_topical_pages/03_pecan_xml.Rmd

Lines changed: 29 additions & 0 deletions
@@ -636,6 +636,35 @@ This information is currently used by the following PEcAn workflow functions:
 - `PEcAn.<MODEL>::write.configs.<MODEL>` -- See [above](#pecan-write-configs)
 - `PEcAn.uncertainty::run.sensitivity.analysis` -- Executes the uncertainty analysis

+#### Coordinating inputs with the `input_design` design matrix {#xml-input-design}
+
+Multi-site ensembles that sample over input files use an `input_design`
+data.frame to keep parameter draws and input files aligned across runs. The
+design is created up front (typically via `generate_joint_ensemble_design()`)
+and passed to `runModule.run.write.configs()`. It is not saved automatically to
+`samples.Rdata`, so keep your copy if you need to reuse it.
+
+- **Parameter column:** `param` gives the index (i.e. row number) of the
+  posterior draw to use for this run. For example, `param = 5` means use the 5th
+  parameter sample from `samples.Rdata`.
+- **Input columns:** any name that matches a tag under `run/inputs` (for
+  example `met`, `soil`, `veg`, `poolinitcond`). Values are indices into that
+  input’s `path` list. Leaving a column out keeps that input fixed across runs.
+- **Row count and order:** must include exactly one row per run. For ensembles
+  this means `ensemble.size` rows.
+
+Example layout (CSV or `data.frame`):
+
+| param | met | soil |
+|------:|----:|-----:|
+| 1 | 1 | 1 |
+| 2 | 2 | 1 |
+| 3 | 1 | 2 |
+| 4 | 2 | 2 |
+
+In this example, run 2 would use the second parameter draw and also switch to
+the second met driver while keeping the first soil file.
+
 ### Parameter Data Assimilation {#xml-parameter-data-assimilation}

 The following tags can be used for parameter data assimilation. More detailed information can be found here: [Parameter Data Assimilation Documentation](#pda)
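
The example layout from the `input_design` section added in this hunk can be written down as a plain `data.frame`. This is a minimal base R sketch of that table only; `ensemble_size` is an assumed value for illustration, and in a real workflow the design would typically come from `generate_joint_ensemble_design()`, whose arguments are not shown in this diff.

```r
# The four-run example table from the documentation, one row per run.
input_design <- data.frame(
  param = 1:4,            # index of the parameter draw used by each run
  met   = c(1, 2, 1, 2),  # index into the met input's path list
  soil  = c(1, 1, 2, 2)   # index into the soil input's path list
)

# Assumed ensemble size for this illustration.
ensemble_size <- 4

# Same rule the commit enforces: exactly one row per run.
stopifnot(nrow(input_design) == ensemble_size)

# Run 2: second parameter draw, second met driver, first soil file.
run2 <- input_design[2, ]
print(run2)
```

Because the design is not saved to `samples.Rdata`, writing a copy to disk (for example with `write.csv()`) is one way to keep it for reuse.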

modules/uncertainty/R/ensemble.R

Lines changed: 5 additions & 1 deletion
@@ -193,7 +193,11 @@ get.ensemble.samples <- function( ensemble.size, pft.samples, env.samples,
 ##' Given a pft.xml object, a list of lists as supplied by get.sa.samples,
 ##' a name to distinguish the output files, and the directory to place the files.
 ##'
-##' @param input_design the input indices for samples
+##' @param input_design design matrix describing sampled inputs (see
+##' `run.write.configs()`). Columns named after `settings$run$inputs` tags give
+##' 1-based indices into each input's `path` list and rows follow run order.
+##' Requires `nrow(input_design) >= ensemble.size`;
+##' extra rows are ignored.
 ##' @param ensemble.size size of ensemble
 ##' @param defaults pft
 ##' @param ensemble.samples list of lists supplied by \link{get.ensemble.samples}
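
To illustrate the 1-based indexing described in the `input_design` parameter documentation above, the sketch below maps one design row onto file paths. Everything here is an assumption for illustration: the `inputs` list only imitates the shape of `settings$run$inputs`, the file names are made up, and `resolve_inputs()` is a hypothetical helper, not a PEcAn function.

```r
# Stand-in for settings$run$inputs: each tag holds a `path` list.
# File names are invented for this example.
inputs <- list(
  met  = list(path = list("met_a.nc", "met_b.nc")),
  soil = list(path = list("soil_a.nc", "soil_b.nc"))
)

input_design <- data.frame(
  param = 1:4,
  met   = c(1, 2, 1, 2),
  soil  = c(1, 1, 2, 2)
)

# Hypothetical helper: for every design column that matches an input tag,
# pick the path at the 1-based index stored in that column.
resolve_inputs <- function(design_row, inputs) {
  tags <- intersect(names(design_row), names(inputs))
  vapply(
    tags,
    function(tag) inputs[[tag]]$path[[design_row[[tag]]]],
    character(1)
  )
}

# Paths selected for run 3 (met index 1, soil index 2).
run3_paths <- resolve_inputs(input_design[3, ], inputs)
print(run3_paths)
```

Columns without a matching tag, such as `param`, are skipped by the `intersect()` call, which matches the documented behaviour that a missing input column leaves that input fixed.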

modules/uncertainty/man/write.ensemble.configs.Rd

Lines changed: 5 additions & 1 deletion
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
