easystats
diff --git a/R/estimate_means.R b/R/estimate_means.R
Lines changed: 26 additions & 59 deletions
diff --git a/man/estimate_contrasts.Rd b/man/estimate_contrasts.Rd
Lines changed: 26 additions & 65 deletions
diff --git a/man/estimate_means.Rd b/man/estimate_means.Rd
Lines changed: 26 additions & 65 deletions
@@ -58,65 +58,32 @@
 #' might produce biased predictions. In particular for mixed models, using
 #' `"response"` is recommended, because averaging across random effects groups
 #' is more accurate.
-#' @param estimate Character string, indicating the type of target population
-#' predictions refer to. This dictates how the predictions are "averaged" over
-#' the non-focal predictors, i.e. those variables that are not specified in
-#' `by` or `contrast`. We can roughly distinguish between "modelbased" and
-#' "empirical" predictions.
-#' - `"typical"` (default): Predictions are made for observations that are
-#'   represented by a data grid, which is built from all combinations of the
-#'   predictor levels in `by` (the focal predictors). `"typical"` then takes the
-#'   mean value for non-focal numeric predictors and marginalizes over the
-#'   factor levels of non-focal predictors, which computes a kind of "weighted
-#'   average" for the values at which these terms are hold constant. These
-#'   predictions are useful for comparing defined "groups" and are still a good
-#'   representation of the sample, because all possible values and levels of the
-#'   non-focal predictors are considered (averaged over). It answers the
-#'   question, "What would be the average outcome for a 'typical' observation?",
-#'   where 'typical' refers to subjects represented by (i.e., that share the
-#'   characteristics from) the data grid. This approach is the one taken by
-#'   default in the `emmeans` package.
-#' - `"average"`: Predictions are made for each observation in the sample. Then,
-#'   the average of all predictions is calculated within all groups (or levels)
-#'   of the focal predictors defined in `by`. These predictions are the closest
-#'   representation of the sample, because `estimate = "average"` averages
-#'   across the full sample, where groups (in `by`) are not represented by a
-#'   balanced data grid, but rather the empirical distributions of the
-#'   characteristics of the sample. It answers the question, "What is the
-#'   predicted value for an average observation (from a certain group in `by`)
-#'   in my data?".
-#' - `"population"`: Each observation is "cloned" multiple times, where each
-#'   duplicate gets one of the levels from the focal predictors in `by`. We then
-#'   have one "original" and several copies of that original, each varying in
-#'   the levels of the focal predictors. Hence, the sample is replicated
-#'   multiple times to produce "counterfactuals" and then takes the average of
-#'   these predicted values (aggregated/grouped by the focal predictors). It can
-#'   be considered as extrapolation to a hypothetical target population.
-#'   Counterfactual predictions are useful, insofar as the results can also be
-#'   transferred to other contexts (Dickerman and Hernan, 2020). It answers the
-#'   question, "What is the predicted response value for the 'average'
-#'   observation in *the broader target population*?". It does not only refer to
-#'   the actual data in your observed sample, but also "what would be if" we had
-#'   more data, or if we had data from a different sample.
-#'
-#' In other words, the distinction between estimate types resides in whether
-#' the prediction are made for:
-#' - *modelbased predictions* (focus lies on _predictors_), which are useful to
-#'   look at differences between typical groups, or for visualization
-#'   - A specific individual from the sample (i.e., a specific combination of
-#'     predictor values for focal and non-focal predictors): this is what is obtained
-#'     when using [`estimate_relation()`] and the other prediction functions.
-#'   - A typical individual from the sample: obtained with
-#'     `estimate_means(..., estimate = "typical")`
-#' - *empirical predictions* (focus lies on _predictions_ of the outcome), which
-#'   are useful if you want realistic predictions of your outcome, assuming that
-#'   the sample is representative for a special population (option `"average"`),
-#'   or useful for "what-if" scenarios, especially if you want to make unbiased
-#'   comparisons (G-computation, option `"population"`)
-#'   - The average individual from the sample: obtained with
-#'     `estimate_means(..., estimate = "average")`
-#'   - The broader, hypothetical target population: obtained with
-#'     `estimate_means(..., estimate = "population")`
+#' @param estimate The `estimate` argument determines how predictions are
+#' averaged ("marginalized") over variables not specified in `by` or `contrast`
+#' (non-focal predictors). It controls whether predictions represent a "typical"
+#' individual, an "average" individual from the sample, or an "average"
+#' individual from a broader population.
+#' - `"typical"` (Default): Calculates predictions for a balanced data grid
+#'   representing all combinations of focal predictor levels (specified in `by`).
+#'   For non-focal numeric predictors, it uses the mean; for non-focal
+#'   categorical predictors, it marginalizes (averages) over the levels. This
+#'   represents a "typical" observation based on the data grid and is useful for
+#'   comparing groups. It answers: "What would the average outcome be for a
+#'   'typical' observation?". This is the default approach when estimating
+#'   marginal means using the *emmeans* package.
+#' - `"average"`: Calculates predictions for each observation in the sample and
+#'   then averages these predictions within each group defined by the focal
+#'   predictors. This reflects the sample's actual distribution of non-focal
+#'   predictors, not a balanced grid. It answers: "What is the predicted value
+#'   for an average observation in my data?"
+#' - `"population"`: "Clones" each observation, creating copies with all
+#'   possible combinations of focal predictor levels. It then averages the
+#'   predictions across these "counterfactual" observations (non-observed
+#'   permutations) within each group. This extrapolates to a hypothetical
+#'   broader population, considering "what if" scenarios. It answers: "What is
+#'   the predicted response for the 'average' observation in a broader possible
+#'   target population?" This approach entails more assumptions about the
+#'   likelihood of different combinations, but may generalize better.
 #'
 #' You can set a default option for the `estimate` argument via `options()`,
 #' e.g. `options(modelbased_estimate = "average")`
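
The three `estimate` options described in the new documentation can be illustrated with a minimal base R sketch. This is not the modelbased implementation: the model, the `mtcars` data, and all variable names below are assumptions chosen for illustration, and the equal weighting of non-focal factor levels under "typical" is likewise an assumption (it mirrors the balanced-grid description above).

```r
# Illustrative sketch only (base R, no modelbased calls).
# Focal predictor: cyl. Non-focal predictors: wt (numeric), am (factor).
dat <- transform(mtcars, cyl = factor(cyl), am = factor(am))
model <- lm(mpg ~ cyl + wt + am, data = dat)
focal_levels <- levels(dat$cyl)

# "typical": balanced grid over the focal levels; the numeric non-focal
# predictor is held at its mean, and predictions are averaged with equal
# weight over the levels of the non-focal factor (assumed weighting).
grid <- expand.grid(
  cyl = focal_levels,
  am = levels(dat$am),
  wt = mean(dat$wt)
)
grid$pred <- predict(model, newdata = grid)
typical <- tapply(grid$pred, grid$cyl, mean)

# "average": predict for every observed row, then average the predictions
# within the observed groups of the focal predictor.
average <- tapply(predict(model, newdata = dat), dat$cyl, mean)

# "population": clone the whole sample once per focal level, overwrite the
# focal predictor with that level, predict, and average over all rows.
population <- sapply(focal_levels, function(lvl) {
  cloned <- dat
  cloned$cyl <- factor(lvl, levels = focal_levels)
  mean(predict(model, newdata = cloned))
})

print(rbind(typical = typical, average = average, population = population))
```

In this additive linear model, "typical" and "population" give the same differences between `cyl` levels and differ only by a constant shift, because the grid weights the `am` levels equally while the cloned sample keeps their observed proportions. "average" reproduces the observed group means of `mpg`, since it retains each group's own distribution of `wt` and `am`.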