76 commits
34b157c
Updates nipah map serology forest plot
tristan-myles Feb 10, 2026
6158a72
orderly updates
sangeetabhatia03 Feb 17, 2026
53f0e59
Linting
sangeetabhatia03 Feb 18, 2026
cf44c1c
Rehome stray code
sangeetabhatia03 Feb 18, 2026
87d8ac4
Deduplication of outbreaks
sangeetabhatia03 Feb 18, 2026
5907457
Data processing prep for mapping
sangeetabhatia03 Feb 18, 2026
e66af66
Rationalise code + categorical scale on map
sangeetabhatia03 Feb 18, 2026
6632da4
Updates serology main text plot
tristan-myles Feb 18, 2026
797fcf9
Fixes bugs - col pal name & init plot_alpha sooner
tristan-myles Feb 18, 2026
de6abc0
Applies formatting - no spaces around =
tristan-myles Feb 18, 2026
942ea1f
Adds antibodies to the sero plot col names
tristan-myles Feb 19, 2026
b7ae3d5
Adds hyphen to sero plot col names
tristan-myles Feb 19, 2026
052d7dd
More map prep
sangeetabhatia03 Feb 19, 2026
de0ee20
Merge branch 'nipah_internal_review_updates' of https://github.com/mr…
sangeetabhatia03 Feb 19, 2026
b7a4481
orderly changes
sangeetabhatia03 Feb 19, 2026
5fe8580
Existing outputs with refactored code
sangeetabhatia03 Feb 19, 2026
49ca6c0
Filters parameters to human delays
tristan-myles Feb 24, 2026
47d77d1
Reorders countries for delays by 1st outbreak
tristan-myles Feb 24, 2026
d0f5899
Updates delay colour country mapping
tristan-myles Feb 24, 2026
a51cb71
Updates delay forest lower xlim to -0.5
tristan-myles Feb 24, 2026
c7376d4
Fixes bug - incorrect bsl plot for qa plots
tristan-myles Feb 24, 2026
e052ff9
Updates incubation period legend position
tristan-myles Feb 24, 2026
7ca785b
Adds an arrow to indicate dis/recov is clipped
tristan-myles Feb 24, 2026
c502806
Updates trans pop group fact levels & colour pals
tristan-myles Feb 24, 2026
6c988a8
Adds arrow to long genomic central range
tristan-myles Feb 24, 2026
3693362
Updates tranmission plot legend spacing
tristan-myles Feb 24, 2026
071a1e6
First pass at assembling using tikz
sangeetabhatia03 Feb 25, 2026
4086026
Fine-tuning maps
sangeetabhatia03 Feb 26, 2026
ed93639
Separate task for inc period meta analysis
sangeetabhatia03 Feb 26, 2026
b215af1
Minor edits
sangeetabhatia03 Feb 26, 2026
f00f821
Updates extraction for estimated Nikolay delays
tristan-myles Mar 4, 2026
5866b5c
Updates and cleans curation
tristan-myles Mar 4, 2026
91ce58b
Comments out uncert from custom_SE in curation
tristan-myles Mar 4, 2026
413e48d
stray changes
sangeetabhatia03 Mar 5, 2026
0e4236b
Updates incp meta to match cfr approach
tristan-myles Mar 5, 2026
e3e512f
Updates nipah_serology patchwork plot
tristan-myles Mar 17, 2026
7b76eab
Updates nipah alternate maps text size
tristan-myles Mar 17, 2026
8ee3b8a
Updates alternate maps title
tristan-myles Mar 17, 2026
14344e2
Updates alternate map xmax
tristan-myles Mar 17, 2026
72e6411
Updates alternate map patchwork format
tristan-myles Mar 17, 2026
e92afea
Fix single outbreak being split over multiple rows
sangeetabhatia03 Mar 19, 2026
94d83f5
Removes unnecessary code
tristan-myles Mar 20, 2026
5f6d040
Adds specific square colour to metaprop
tristan-myles Mar 20, 2026
9c55c46
Updates Siliguri to map to Darjeeling state
tristan-myles Mar 20, 2026
4f6c68f
Fixes incorrect mapping of shared city name
tristan-myles Mar 20, 2026
20e30d2
Updates map boundaries and adds boundary lines
tristan-myles Mar 20, 2026
28483ac
Updates Nipah latex tables
tristan-myles Mar 20, 2026
20d877f
Updates NiV transmission plot format
tristan-myles Mar 20, 2026
7335253
Overhauls NiV severity and removes supp data
tristan-myles Mar 20, 2026
85362e9
Adds extracted outbreak severity as a sep task
tristan-myles Mar 20, 2026
0ef614c
Adds IEDCR severity as a sep task
tristan-myles Mar 20, 2026
9a349e4
Updates plot colours for NiV risk factors
tristan-myles Mar 20, 2026
6c23ba9
Updates incp meta plot format & removes old code
tristan-myles Mar 20, 2026
50458a5
Renames NiV severity to indicate extracted params
tristan-myles Mar 20, 2026
a496e56
Renames saved incp meta analysis artefacts
tristan-myles Mar 20, 2026
d547ba8
Major update to NiV delays
tristan-myles Mar 20, 2026
93f450c
Adds more workflow tasks and sets down dep to 0
tristan-myles Mar 20, 2026
b8eb29f
Removes pathogen param from serology task
tristan-myles Mar 20, 2026
e14fbde
Removes explicit select & area assignment
tristan-myles Mar 20, 2026
f90224f
Updates extracted param severity fig name
tristan-myles Mar 20, 2026
7c561ab
Removes orderly_artefact from severity supp data
tristan-myles Mar 20, 2026
33e715c
Applies bug fixes; missing param & wrong group var
tristan-myles Mar 20, 2026
3de98ff
Assigns exploratory plots to a var
tristan-myles Mar 20, 2026
d96ec61
Applies bug fix; meta can't have rows with 0 Cases
tristan-myles Mar 20, 2026
c5ad2f4
Deletes plot var line
tristan-myles Mar 20, 2026
f7ed8d4
Sets debug mode to FALSE in db_cleaning
tristan-myles Mar 23, 2026
d076577
Adds task to prep iedcr supp map data
tristan-myles Apr 10, 2026
ad994de
Adds task to make iedcr supp map
tristan-myles Apr 10, 2026
0929e23
Fixes bug - updates nipah_raw.csv version used
tristan-myles Apr 10, 2026
39fc88a
Adds iedcr alternate map tasks to workflow
tristan-myles Apr 10, 2026
93175de
Updates delays name to fig 4 & adds lancet panels
tristan-myles Apr 10, 2026
0a7cb26
Adds individual lancet panels
tristan-myles Apr 10, 2026
f24b548
Updates to save figures as pdfs
tristan-myles Apr 10, 2026
b1e9ff7
Updates to save individual panels for lancet
tristan-myles Apr 10, 2026
227c26b
Updates nipah incp meta to use 3 digits
tristan-myles Apr 10, 2026
3ab9cd3
Updates to save as pdf & saves individual cols
tristan-myles Apr 10, 2026
43 changes: 27 additions & 16 deletions nipah_workflow.R
@@ -6,7 +6,10 @@ library(orderly2)
# - A REDCap api
# - nipah_config.yaml that specifies how to run the task
# (relative path: src/db_redcap_download/download_config/nipah_config.yaml)
orderly_run("db_redcap_download",list(pathogen="NIPAH"))

# NOTE: IF YOU ARE A PERG MEMBER WITH A REDCAP API KEY PLEASE UNCOMMENT THE
# LINE BELOW AND SET 'orderly_download_dependency=TRUE' IN LINE 24
# orderly_run("db_redcap_download",list(pathogen="NIPAH"))

# *----------------- Prepare data to generate extraction csvs -----------------*
# Prepares the REDCap data so that double and single extraction csvs can be
@@ -18,7 +21,7 @@ orderly_run("db_redcap_download",list(pathogen="NIPAH"))
# - config.yaml that specifies how to run the task
# (relative path: src/db_extraction_prep/redcap_task/nipah/config.yaml)
orderly_run("db_extraction_prep",list(pathogen="NIPAH",
orderly_download_dependency=TRUE))
orderly_download_dependency=FALSE))

# *------------------------- Generate extraction csvs -------------------------*
# Extracts double and single extraction csvs using the .rds file from
@@ -35,34 +38,42 @@ orderly_run("db_double",list(pathogen="NIPAH"))
orderly_run("db_compilation", list(pathogen="NIPAH"))

# *-------------------------------- Clean data --------------------------------*
orderly_run("db_cleaning",list(pathogen="NIPAH", debug_mode=TRUE))
orderly_run("db_cleaning",list(pathogen="NIPAH", debug_mode=FALSE))

# *------------------------------- Latex tables -------------------------------*
# Add cleaning mode
orderly_run("nipah_latex_tables", list(pathogen="NIPAH"))

# *---------------------------- Plots and analysis ----------------------------*
orderly_run("nipah_serology", list(pathogen="NIPAH"))
# Serology
orderly_run("nipah_serology")

# orderly_run("nipah_map", list(pathogen="NIPAH"))
# Maps
orderly_run("nipah_deduplicate_outbreaks")
orderly_run("nipah_map_prep")
orderly_run("nipah_map_alternate")

orderly_run("nipah_transmission", list(pathogen="NIPAH"))
orderly_run("nipah_iedcr_map_prep")
orderly_run("nipah_map_alternate_iedcr")

orderly_run("nipah_severity", list(pathogen="NIPAH"))
# Severity
orderly_run("nipah_severity_extracted_params", list(pathogen="NIPAH"))
orderly_run("nipah_severity_extracted_outbreaks", list(pathogen="NIPAH"))
orderly_run("nipah_severity_IEDCR", list(pathogen="NIPAH"))

orderly_run("nipah_bsl_data_synthesis", list(pathogen="NIPAH"))
# Transmission
orderly_run("nipah_transmission", list(pathogen="NIPAH"))

# I assume the issue below is caused by the BSL library and other packages will
# explicitly reference MASS when a function is needed
# MASS::select masks dplyr::select
# MASS::area masks patchwork::area
select <- dplyr::select
area <- patchwork::area
# Delays
orderly_run("nipah_inc_period_meta")
orderly_run("nipah_delays")

orderly_run("nipah_delays", list(pathogen="NIPAH"))
# Risk factors
orderly_run("nipah_risk_factors", list(pathogen="NIPAH"))

# SI summary plots
orderly_run("nipah_summary", list(pathogen="NIPAH"))

# SI summary tables
orderly_run("nipah_supp_tables", list(pathogen="NIPAH"))

orderly_run("nipah_risk_factors", list(pathogen="NIPAH"))
57 changes: 37 additions & 20 deletions shared/nipah_functions.R
@@ -54,23 +54,38 @@ data_curation <- function(articles, outbreaks, models, parameters, plotting,swit
ifelse(parameter_unit %in% "Weeks", 7, 1))
) |>
mutate(parameter_unit = ifelse(parameter_unit %in% "Weeks", "Days", parameter_unit)) |>
mutate(no_unc = is.na(parameter_uncertainty_lower_value) & is.na(parameter_uncertainty_upper_value), #store uncertainty in pu_lower and pu_upper
custom_se = case_when(str_detect(str_to_lower(parameter_2_value_type),"standard deviation") & no_unc & !is.na(population_sample_size) ~ parameter_2_value/sqrt(population_sample_size),
TRUE ~ NA),
mutate(no_unc = (is.na(parameter_uncertainty_lower_value) &
is.na(parameter_uncertainty_upper_value)),
custom_se = ifelse(
str_detect(str_to_lower(parameter_2_value_type),"standard deviation") &
no_unc &
!is.na(population_sample_size),
parameter_2_value/sqrt(population_sample_size),
NA),
# needs to be in order for case when - assumes that if there is an SE
# that will be used first else use the custom SE.
unc_inferred_from_se = str_detect(str_to_lower(parameter_uncertainty_single_type), "standard error") & no_unc,
unc_inferred_from_custom = !is.na(custom_se) & no_unc,
parameter_uncertainty_lower_value = case_when(
str_detect(str_to_lower(parameter_uncertainty_single_type),"standard error") & no_unc ~ parameter_value-parameter_uncertainty_single_value,
!is.na(custom_se) & no_unc ~ parameter_value-custom_se,
str_detect(str_to_lower(distribution_type),"gamma") & no_unc ~ qgamma(0.05, shape = (distribution_par1_value/distribution_par2_value)^2, rate = distribution_par1_value/distribution_par2_value^2),
TRUE ~ parameter_uncertainty_lower_value),
unc_inferred_from_se ~ parameter_value - parameter_uncertainty_single_value,
# unc_inferred_from_custom ~ parameter_value - custom_se,
TRUE ~ parameter_uncertainty_lower_value
),
parameter_uncertainty_upper_value = case_when(
str_detect(str_to_lower(parameter_uncertainty_single_type),"standard error") & no_unc ~ parameter_value+parameter_uncertainty_single_value,
!is.na(custom_se) & no_unc ~ parameter_value+custom_se,
str_detect(str_to_lower(distribution_type),"gamma") & no_unc ~ qgamma(0.95, shape = (distribution_par1_value/distribution_par2_value)^2, rate = distribution_par1_value/distribution_par2_value^2),
TRUE ~ parameter_uncertainty_upper_value)) |>
unc_inferred_from_se ~ parameter_value + parameter_uncertainty_single_value,
# unc_inferred_from_custom ~ parameter_value + custom_se,
TRUE ~ parameter_uncertainty_upper_value
),
uncertainty_inferred_flag = case_when(
unc_inferred_from_se ~ "inferred from SE",
# unc_inferred_from_custom ~ "inferred from custom SE",
TRUE ~ NA
)) |>
mutate(central = coalesce(parameter_value,
100*cfr_ifr_numerator/cfr_ifr_denominator,
0.5*(parameter_lower_bound+parameter_upper_bound))) |>
dplyr::select(-c(no_unc))
100*cfr_ifr_numerator/cfr_ifr_denominator),
central_range_midpoint = 0.5*(parameter_lower_bound+parameter_upper_bound)) |>
dplyr::select(-c(no_unc, unc_inferred_from_se, unc_inferred_from_custom,
custom_se))

if (plotting) {
parameters <- param4plot
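
The uncertainty handling in the hunk above can be illustrated on a toy data frame. This is a minimal sketch, not the project's `data_curation()` function: the column names follow the diff, but the values are invented, and `NA_character_` is used in place of the bare `NA` so the flag column is typed consistently across dplyr versions.

```r
library(dplyr)
library(stringr)

# Toy rows (invented values): only row 1 lacks explicit bounds and reports an SE.
toy <- data.frame(
  parameter_value = c(10, 5, 8),
  parameter_uncertainty_single_type = c("Standard Error (SE)", NA,
                                        "Standard Error (SE)"),
  parameter_uncertainty_single_value = c(2, NA, 1),
  parameter_uncertainty_lower_value = c(NA, 4, 7),
  parameter_uncertainty_upper_value = c(NA, 6, 9)
)

curated <- toy |>
  mutate(
    # TRUE when neither bound was extracted
    no_unc = is.na(parameter_uncertainty_lower_value) &
      is.na(parameter_uncertainty_upper_value),
    # TRUE when bounds are missing and a standard error is available
    unc_inferred_from_se = str_detect(
      str_to_lower(parameter_uncertainty_single_type), "standard error"
    ) & no_unc,
    parameter_uncertainty_lower_value = case_when(
      unc_inferred_from_se ~ parameter_value - parameter_uncertainty_single_value,
      TRUE ~ parameter_uncertainty_lower_value
    ),
    parameter_uncertainty_upper_value = case_when(
      unc_inferred_from_se ~ parameter_value + parameter_uncertainty_single_value,
      TRUE ~ parameter_uncertainty_upper_value
    ),
    # Record which rows had their bounds filled in
    uncertainty_inferred_flag = case_when(
      unc_inferred_from_se ~ "inferred from SE",
      TRUE ~ NA_character_
    )
  ) |>
  select(-c(no_unc, unc_inferred_from_se))

print(curated)
```

Rows that already carry extracted bounds are left untouched, so the flag column makes it possible to filter out inferred intervals later.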
@@ -79,6 +94,7 @@ data_curation <- function(articles, outbreaks, models, parameters, plotting,swit
if(sum(check_param_id)==dim(parameters)[1])
{
parameters$central <- param4plot$central
parameters$central_range_midpoint <- param4plot$central_range_midpoint
} else {
stop('parameters not in right order to match')
}
@@ -89,9 +105,6 @@ data_curation <- function(articles, outbreaks, models, parameters, plotting,swit
outbreaks <- outbreaks |> mutate(outbreak_location = str_replace_all(outbreak_location, "\xe9" , "é"))
}

# parameters <- parameters |> mutate(parameter_type = str_replace_all(parameter_type, "\x96" , "–"),
# population_country = str_replace_all(population_country, c("昼㸴" = "ô", "�" = "ô")))

if(switch_first_surname) # this is due to legacy access database issue
{
articles <- articles |> rename(first_author_first_name=first_author_surname,first_author_surname=first_author_first_name)
@@ -551,7 +564,11 @@ metagen_wrap <- function(dataframe, estmeansd_method,
metaprop_wrap <- function(dataframe, subgroup,
plot_pooled, sort_by_subg, plot_study, digits, colour,
width, height, resolution,
at = seq(0,1,by=0.2), xlim = c(0,1)){
at = seq(0,1,by=0.2), xlim = c(0,1),
colour_square=NA){
if (is.na(colour_square)){
colour_square=colour
}

stopifnot(length(unique(dataframe$parameter_unit[!is.na(dataframe$parameter_unit)])) == 1)#values must have same units

@@ -581,7 +598,7 @@ metaprop_wrap <- function(dataframe, subgroup,
digits = digits,
col.diamond.lines = "black",col.diamond.common = colour,
col.diamond.random = colour,
col.square = colour, col.square.lines = "black",
col.square = colour_square, col.square.lines = "black",
col.study = "black", col.subgroup = "black",
col.inside = "black", weight.study = "same",
at = at, xlim = xlim, xlab="Case Fatality Ratio",
@@ -608,7 +625,7 @@ metaprop_wrap <- function(dataframe, subgroup,
digits = digits,
col.diamond.lines = "black",col.diamond.common = colour,
col.diamond.random = colour,
col.square = colour, col.square.lines = "black",
col.square = colour_square, col.square.lines = "black",
col.subgroup = "black", col.inside = "black", weight.study = "same",
at = at, xlim = xlim, xlab="Case Fatality Ratio",
fs.predict.labels = 11.5,
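
The `colour_square` argument added above relies on an `NA` default that falls back to `colour`. A minimal sketch of that pattern follows; `pick_square_colour()` is a hypothetical helper written for illustration and is not part of `shared/nipah_functions.R`.

```r
# Hypothetical helper: returns the colour to use for forest plot squares.
# If no square colour is supplied, fall back to the main plot colour.
pick_square_colour <- function(colour, colour_square = NA) {
  if (is.na(colour_square)) {
    colour_square <- colour
  }
  colour_square
}

default_square <- pick_square_colour("steelblue")
custom_square <- pick_square_colour("steelblue", "grey40")

print(default_square)
print(custom_square)
```

The fallback only takes effect if the name assigned inside the `if` block matches the argument name exactly; a misspelt name would create a new local variable and leave the argument as `NA`.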
290 changes: 290 additions & 0 deletions shared/nipah_raw_data.csv

Large diffs are not rendered by default.

71 changes: 55 additions & 16 deletions src/db_cleaning/nipah/nipah_cleaning.R
@@ -709,26 +709,51 @@ param_cleaning <- function(df){

# Split 2931 incubation row
hd_2931_incp_row_filter <- df$access_param_id=="138_016"
# TODO:
df[hd_2931_incp_row_filter, "parameter_notes"] <-
"Mean is also 13 days. 95% quantile is 16 days which corresponds to Gamma
distribution provided. Note, this relates to the empirical distribution.
The Gamma distribution is extracted from supp section 5.
Sample is secondary cases for whom a single infector could be identified.
Figure 1 in the main text has a bar chart of the serial interval."

new_2931_incp_row <- df[hd_2931_incp_row_filter, ]
new_2931_incp_row$parameter_data_id <- generate_new_id(
df, "parameter_data_id", 10)
# No corresponding redcap entry so make an ID

new_2931_incp_row$access_param_id <- "138_3141"
new_2931_incp_row$parameter_value <- 9.7
new_2931_incp_row$parameter_value_type <- "Mean"

new_2931_incp_row$parameter_statistical_approach <- "Estimated model parameter"
new_2931_incp_row$parameter_paired <- "No"
new_2931_incp_row$parameter_2_unit <- NA
new_2931_incp_row$method_2_from_supplement <- NA
new_2931_incp_row$parameter_2_statistical_approach <- NA
new_2931_incp_row$parameter_2_uncertainty_type <- ""
new_2931_incp_row$parameter_uncertainty_lower_value <- 9.1
new_2931_incp_row$parameter_uncertainty_upper_value <- 10.4

new_2931_incp_row$distribution_par1_type <- NA
new_2931_incp_row$distribution_par1_value <- NA
new_2931_incp_row$distribution_par1_uncertainty <- NA
new_2931_incp_row$distribution_par2_type <- NA
new_2931_incp_row$distribution_par2_value <- NA
new_2931_incp_row$distribution_par2_uncertainty <- NA

new_2931_incp_row$parameter_paired <- "Yes"
new_2931_incp_row$parameter_2_value_type <- "Standard deviation (Sd)"
new_2931_incp_row$parameter_2_value <- 2.2
new_2931_incp_row$parameter_2_unit <- "Days"
new_2931_incp_row$method_2_from_supplement <- "Yes"
new_2931_incp_row$parameter_2_statistical_approach <- "Estimated model parameter"
new_2931_incp_row$parameter_2_uncertainty_type <- "CRI95%"
new_2931_incp_row$parameter_2_uncertainty_lower_value <- 1.7
new_2931_incp_row$parameter_2_uncertainty_upper_value <- 2.8
new_2931_incp_row$distribution_2_type <- new_2931_incp_row$distribution_type
# TODO
# new_2931_incp_row$parameter_notes <- NA

new_2931_incp_row$parameter_2_value_type <- NA
new_2931_incp_row$parameter_2_lower_bound <- NA
new_2931_incp_row$parameter_2_upper_bound <- NA

# Remove dist param values
# Remove dist param values from existing row
df[hd_2931_incp_row_filter, "distribution_type"] <- NA
df[hd_2931_incp_row_filter, "distribution_par1_type"] <- NA
df[hd_2931_incp_row_filter, "distribution_par1_value"] <- NA
@@ -742,26 +767,40 @@ param_cleaning <- function(df){
# Split 2931 serial interval row
hd_2931_si_row_filter <- df$access_param_id=="138_015"

# Missing method from supplement
df[hd_2931_si_row_filter, "method_from_supplement"] <- "Yes"

new_2931_si_row <- df[hd_2931_si_row_filter, ]
new_2931_si_row$parameter_data_id <- generate_new_id(
df, "parameter_data_id", 10)

# No corresponding redcap entry so make an ID
new_2931_si_row$access_param_id <- "138_2718"
new_2931_si_row$parameter_value <- 13
new_2931_si_row$parameter_value_type <- "Median"

new_2931_si_row$parameter_value <- 12.7
new_2931_si_row$parameter_value_type <- "Mean"
new_2931_si_row$parameter_statistical_approach <- "Estimated model parameter"
new_2931_si_row$parameter_paired <- "No"
new_2931_si_row$parameter_2_unit <- NA
new_2931_si_row$method_2_from_supplement <- NA
new_2931_si_row$parameter_2_statistical_approach <- NA

new_2931_si_row$parameter_2_value_type <- NA
new_2931_si_row$distribution_par1_type <- NA
new_2931_si_row$distribution_par1_value <- NA
new_2931_si_row$distribution_par1_uncertainty <- NA
new_2931_si_row$distribution_par2_type <- NA
new_2931_si_row$distribution_par2_value <- NA
new_2931_si_row$distribution_par2_uncertainty <- NA

new_2931_si_row$parameter_paired <- "Yes"
new_2931_si_row$parameter_2_value_type <- "Standard deviation (Sd)"
new_2931_si_row$parameter_2_value <- 3
new_2931_si_row$parameter_2_unit <- "Days"
new_2931_si_row$method_2_from_supplement <- "Yes"
new_2931_si_row$parameter_2_statistical_approach <- "Estimated model parameter"
new_2931_si_row$distribution_2_type <- new_2931_si_row$distribution_type

new_2931_si_row$parameter_2_lower_bound <- NA
new_2931_si_row$parameter_2_upper_bound <- NA

# Remove dist param values
# Remove dist param values from existing row
df[hd_2931_si_row_filter, "method_2_from_supplement"] <- "Yes"

df[hd_2931_si_row_filter, "distribution_type"] <- NA
df[hd_2931_si_row_filter, "distribution_par1_type"] <- NA
df[hd_2931_si_row_filter, "distribution_par1_value"] <- NA