---
title: Causal Effects of Helicopter Emergency Medical Services in Interfacility Transfers to Neuro ICUs - A Study Using Swedish National Registries (2019–2024)

author:
  - name: Johan Olsson
    orcid: 0000-0003-3297-5489
    corresponding: true
    email: johan.olsson@ki.se
    roles:
      - Conceptualization
      - Preprocessing of data
      - Analysis of data 
      - Software
      - Drafting the work
      - Writing the paper
      - Final approval of the version to be published
    affiliations:
      - Department of Pediatric Perioperative Medicine and Intensive Care, ECMO Centre Karolinska, Karolinska University Hospital, Stockholm, Sweden; Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
  - name: Ryan Falck-Jones
    orcid: 0000-0002-0985-8658
    corresponding: false
    roles:
      - Conceptualization
      - Preprocessing of data
      - Analysis of data
      - Software
      - Critical review of important intellectual content
      - Final approval of the version to be published
    affiliations:
      - Department of Perioperative Medicine and Intensive Care, Karolinska University Hospital, Stockholm, Sweden; Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
  - name: Anders Holst
    orcid: 0000-0001-8577-6745
    corresponding: false
    roles:
      - Conceptualization
      - Critical review of important intellectual content
      - Final approval of the version to be published
    affiliations:
      - School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden; RISE Research Institutes of Sweden, Stockholm, Sweden.
  - name: Cecilia Åkerlund
    orcid:  0000-0003-4918-1482
    corresponding: false
    roles:
      - Conceptualization
      - Drafting the work
      - Writing the paper
      - Final approval of the version to be published
    affiliations:
      - Department of Perioperative Medicine and Intensive Care, Karolinska University Hospital, Stockholm, Sweden; Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
  - name: Logan Froese
    orcid: 0000-0002-6076-0189
    corresponding: false
    roles:
      - Critical review of important intellectual content
      - Final approval of the version to be published
      - Software review
    affiliations:
      - Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden.
  - name: David Nelson 
    orcid:  0000-0003-2530-8207
    corresponding: false
    roles:
      - Conceptualization
      - Drafting the work
      - Writing the paper
      - Final approval of the version to be published
    affiliations:
      - Department of Perioperative Medicine and Intensive Care, Karolinska University Hospital, Stockholm, Sweden; Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
  - name: Linn Hallqvist
    orcid: 0000-0002-0968-6403
    corresponding: false
    roles:
      - Conceptualization
      - Critical review of important intellectual content
      - Final approval of the version to be published
    affiliations:
      - Department of Perioperative Medicine and Intensive Care, Karolinska University Hospital, Stockholm, Sweden; Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
keywords:
  - Neurocritical care
  - Neurointensive care
  - Intensive care
  - Interfacility transfer
  - Causal inference
  - Instrumental variable analysis
  - Helicopter emergency medical services
  - HEMS
  - Patient transfer

plain-language-summary: |
  This be the plain language thing
date: last-modified
bibliography: references.bib
csl: critical-care-medicine.csl
number-sections: false
---

```{r setup} 
# Load dependencies 
#| output: false
#| echo: false
#| warning: false
library(tidyverse)
library(lubridate)
library(brms)
library(DBI)
library(RSQLite)
library(flowchart)
library(gtsummary)
library(ggplot2)
library(bayestestR)
library(rstan)
library(flowchart)
library("devtools")
library(rcmswe)
library(NatParksPalettes)
library(showtext)
library(insight)

## Define a function for proper calculation of posterior causal contrast of linear model
# Uses epred (expectation in predictions) rather than posterior_pred so as not to model the uncertainty of the
# p = 0.xxx -> Bern(p) step for all instances. Takes a model and, assuming that the treatment is hems_ift
# returns a summary with mean, median etc estimates as well as pr_gt0 which is the posterior probablity that
# the the ATE is >0 (eg. mortality increase)

ate_cf <- function(model, ndraws = 4000) {
  dat  <- insight::get_data(model)
  dat0 <- transform(dat, hems_ift = 0L)
  dat1 <- transform(dat, hems_ift = 1L)

  ep0 <- posterior_epred(model, newdata = dat0, ndraws = ndraws, re_formula = NULL)
  ep1 <- posterior_epred(model, newdata = dat1, ndraws = ndraws, re_formula = NULL)

  ate   <- rowMeans(ep1 - ep0)
  mean0 <- rowMeans(ep0)
  mean1 <- rowMeans(ep1)

  summ <- data.frame(
    mean   = mean(ate),
    median = median(ate),
    sd     = sd(ate),
    q025   = quantile(ate, 0.025),
    q975   = quantile(ate, 0.975),
    pr_gt0 = mean(ate > 0)
  )

  list(summary = summ, draws = ate, mean0 = mean0, mean1 = mean1)
}

font_add_google("Noto Sans", "Noto Sans")  # Downloads from Google Fonts
showtext_auto()

# Set gtsummart theme
theme_gtsummary_compact()

# Load data
d <- read_delim(file = "~/PhD/nsicu-transfers/data/pre-processed-data/patient-df-2025-05-25 22:12:51.563534.csv", delim = ";")[ , -1]

# Prune data
d_pruned <- d %>%
  filter(sir_dsc_time >= "2019-10-01" & sir_dsc_time <= "2024-05-15") %>% # 2019-10-01 -- 2024-05-15
  filter(!is.na(TERTIARY_HADM_ID)) %>%                  # Keep only rows with a tertiary HADM match
  filter(sir_total_time <= 720) %>%                     # Keep only rows with a primary ICU stay 12 hours or shorter
  filter(SIR_PAR_OFFSET_TIGHT %in% (c(-1,0,1))) %>%     # Keep only rows with a recent PAR admit
  filter(road_distance >= 45) %>%                       # Keep only rows with transfers 45 km or longer
  filter(DX_GROUP %in% c("ASAH", "ICH", "AIS", "TBI")) %>% #Keep only admits with ASAH, ICH, AIS or TBI
  filter(min_rank(DX_RANK) == 1, .by = VtfId_LopNr) %>% # Within each unique VtfId_LopNr, keep only the lowest ranking diagnostic PAR admit
  slice_max(sir_dsc_time, by = LopNr, n = 1, with_ties = FALSE) # This gives the latest date by default (if ties, may give more than one)

heavy_users <- d_pruned %>% group_by(formatted_icu_name) %>%
  summarise(
    proportion_hems_ift = mean(hems_ift == TRUE, na.rm = TRUE),
    n = n()
  ) %>%
  filter(proportion_hems_ift >= 0.1) %>%
  select(formatted_icu_name) %>%
  unique()

light_users <- d_pruned %>% group_by(formatted_icu_name) %>%
  summarise(
    proportion_hems_ift = mean(hems_ift == TRUE, na.rm = TRUE),
    n = n()
  ) %>%
  filter(proportion_hems_ift < 0.05) %>%
  select(formatted_icu_name) %>%
  unique()

icu_lookup <- tribble(
  ~formatted_icu_name,    ~icu_category,
  "Hudiksvall IVA", "II",
  "Östersund IVA", "II",
  "Skövde IVA", "II",
  "Eskilstuna IVA", "II",
  "Falun IVA", "II",
  "Kalix IVA", "I",
  "Lycksele IVA", "I",
  "Karlskoga IVA", "II",
  "Lidköping IVA", "II",
  "Skellefteå IVA", "II",
  "Sunderby IVA", "II",
  "Växjö IVA", "II",
  "Trollhättan IVA", "II",
  "Västervik IVA", "II",
  "Karlskrona IVA", "II",
  "Alingsås IVA", "I",
  "Jönköping IVA", "II",
  "Mora IVA", "I",
  "Gävle IVA", "II",
  "Kristianstad IVA", "II",
  "Helsingborg IVA", "II",
  "Kalmar IVA", "II",
  "Nyköping IVA", "II",
  "Sundsvall IVA", "II",
  "Gällivare IVA", "I",
  "Sollefteå IVA", "I",
  "Västerås IVA", "II",
  "Borås IVA", "II",
  "Visby IVA", "II",
  "Lindesberg IVA", "I",
  "Karlstad IVA", "II",
  "Bollnäs IVA", "I",
  "Norrtälje IVA", "I",
  "Värnamo IVA", "II",
  "Örebro IVA", "II",
  "Halmstad IVA", "II",
  "Örnsköldsvik IVA", "II",
  "Piteå IVA", "I",
  "Norrköping IVA", "II",
  "Ystad IVA", "II",
  "Varberg IVA", "II",
  "Ljungby IVA", "I",     
  "Torsby IVA", "I",
  "Arvika IVA", "I",
  "Eksjö IVA", "I",
  "Södertälje IVA", "II"
)

d_pruned <- d_pruned %>%
  left_join(icu_lookup, by = "formatted_icu_name")

```

```{r univariate-model-iv-relevance}
# Check IV relevance in uv model
#| output: false
#| echo: false
#| warning: false

m.modality.univ <- d_pruned %>%
  filter(!formatted_icu_name %in% light_users$formatted_icu_name) %>%
  mutate(z_SAPS_score_minus_age_and_gcs = scale(SAPS_score_minus_age_and_gcs)) %>%
  mutate(z_age = scale(age)) %>%
  mutate(
    DX_GROUP = forcats::fct_relabel(DX_GROUP, ~ gsub("_", "-", .)),
    sir_consciousness_level = forcats::fct_relabel(sir_consciousness_level, ~ gsub("≤", "le", .))  # or use ASCII-safe levels
  ) %>% brm(
  formula = hems_ift ~ hems_minima_both,
  data = .,
  family = bernoulli(),
  cores = 4,
  chains = 4,
  iter = 4000,
)

m.modality.univ_exp_vals <- summary(m.modality.univ)$fixed %>%
  slice(2) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()
```

```{r univariate-model-iv-exclusion-restriction}
# Check iv model exclusion restriction assumtion
#| output: false
#| echo: false
#| warning: false

m.weather.mort <- d_pruned %>% 
  filter(formatted_icu_name %in% light_users$formatted_icu_name) %>%
  brm(
    formula = MORTALITY_30D ~ hems_minima_both,
    data = .,
    family = bernoulli(),
    cores = 4,
    chains = 4,
    iter = 4000
  )

m.weather.mort_exp_vals <- summary(m.weather.mort)$fixed %>%
  slice(2) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()
```

```{r night-admit-association-mortality}
# Uv model of of nighttime admit vs mortality 
#| output: false
#| echo: false
#| warning: false
m.night <- d_pruned %>%
 brm(
    formula = MORTALITY_30D ~ icu_admit_nighttime,
    data = .,
    family = bernoulli(),
    cores = 4,
    chains = 4
  )

m.night.mort_exp_vals <- summary(m.night)$fixed %>%
  slice(2) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()
```

```{r cov-balance-iv}
# build cov balance table
#| label: tbl-cov-balance-iv
#| output: false
#| echo: false
#| warning: false
cov_variance_tbl <- d_pruned %>%
  mutate(
    winter = case_when(
      month(sir_adm_time_UTC) %in% c(11, 12, 1, 2) ~ TRUE,
      month(sir_adm_time_UTC) == 3 & day(sir_adm_time_UTC) <= 15 ~ TRUE,
      TRUE ~ FALSE
    ) 
  ) %>%
  mutate(hems_minima_both = case_when(
    hems_minima_both == 1 ~ "Weather minima *met*",
    hems_minima_both == 0 ~ "Weather minima *not met*"
  )) %>%
  filter(!is.na(hems_minima_both)) %>%
  select(hems_minima_both, age, sex_female, DX_GROUP, SAPS_obtunded, any_AMV, SAPS_total_score, SAPS_score_minus_age_and_gcs, icu_admit_nighttime, MORTALITY_30D, MORTALITY_90D, DAOH_90) %>% 
  tbl_summary(by=hems_minima_both,
              label = list(hems_minima_both ~ "HEMS weather minima met at discharge",
                           age ~ "Age, years",
                           sex_female ~ "Female",
                           DX_GROUP ~ "Inferred diagnosis",
                           SAPS_obtunded ~ "SAPS GCS <15",
                           any_AMV ~ "Pre-transfer Mechanical Ventilation",
                           SAPS_total_score ~ "SAPS Score",
                           SAPS_score_minus_age_and_gcs ~ "SAPS Score excl. age and GCS components",
                           icu_admit_nighttime ~ "Primary ICU admission between 22:00 - 07:00",
                           MORTALITY_30D ~ "30-day mortality",
                           MORTALITY_90D ~ "90-day mortality",
                           DAOH_90 ~ "Days alive and out of hospital 90"
                           ),
              missing_text="Missing") %>%
    modify_caption(caption = "**Patient characterstics given weather conditions**")

```

```{r create-fig-flowchart}
# create patient flowchart
#| label: fig-flowchart
#| output: false
#| echo: false
#| warning: false
# Start from the full dataset
vtf_ids <- unique(d$VtfId_LopNr)

# Create a tracking dataframe
tracking <- tibble(VtfId_LopNr = vtf_ids)

# Step-by-step filters
d1 <- d %>% filter(sir_dsc_time >= as.Date("2019-10-01") & sir_dsc_time <= as.Date("2024-05-15"))
tracking <- tracking %>%
  mutate(pass_date_range = as.integer(VtfId_LopNr %in% d1$VtfId_LopNr))

d2 <- d1 %>% filter(!is.na(TERTIARY_HADM_ID))
tracking <- tracking %>%
  mutate(pass_tertiary = as.integer(VtfId_LopNr %in% d2$VtfId_LopNr))

d3 <- d2 %>% filter(sir_total_time <= 720)
tracking <- tracking %>%
  mutate(pass_short_icu = as.integer(VtfId_LopNr %in% d3$VtfId_LopNr))

d4 <- d3 %>% filter(SIR_PAR_OFFSET_TIGHT %in% c(-1, 0, 1))
tracking <- tracking %>%
  mutate(pass_recent_par = as.integer(VtfId_LopNr %in% d4$VtfId_LopNr))

d5 <- d4 %>% filter(road_distance >= 45)
tracking <- tracking %>%
  mutate(pass_distance = as.integer(VtfId_LopNr %in% d5$VtfId_LopNr))

d6 <- d5 %>% filter(DX_GROUP %in% c("ASAH", "ICH", "AIS", "TBI"))
tracking <- tracking %>%
  mutate(pass_dx_group = as.integer(VtfId_LopNr %in% d6$VtfId_LopNr))

d7 <- d6 %>%
  group_by(VtfId_LopNr) %>%
  filter(min_rank(DX_RANK) == 1) %>%
  ungroup()
tracking <- tracking %>%
  mutate(pass_dx_rank = as.integer(VtfId_LopNr %in% d7$VtfId_LopNr))

d8 <- d7 %>%
  group_by(LopNr) %>%
  slice_max(sir_dsc_time, n = 1, with_ties = FALSE) %>%
  ungroup()
tracking <- tracking %>%
  mutate(pass_latest_admit = as.integer(VtfId_LopNr %in% d8$VtfId_LopNr))


# Generate flowchart object
d_fc <- tracking %>%
  filter(pass_date_range == 1) %>%
  as_fc(label = "Eligible primary ICU admissions") %>%
  fc_filter(pass_short_icu == 1, label = "ICU stay < 12 h", show_exc = TRUE, label_exc = "ICU stay > 12 h", text_pattern_exc = "{label}\n {n}", text_pattern = "{label}\n {n}") %>%
  fc_filter(pass_tertiary == 1 & pass_recent_par == 1, label = "Transferred to tertiary center", show_exc = TRUE, label_exc = "No associated tertiary center admit", text_pattern_exc = "{label}\n {n}", text_pattern = "{label}\n {n}") %>%
  fc_filter(pass_distance == 1, label = "Transfer distance ≥45 km", show_exc = TRUE, label_exc = "Transfer distance <45 km", text_pattern_exc = "{label}\n {n}", text_pattern = "{label}\n {n}") %>%
  fc_filter(pass_dx_group == 1, label = "Diagnosis: ASAH, AIS, ICH, TBI", show_exc = TRUE, label_exc = "Other diagnosis", text_pattern_exc = "{label}\n {n}", text_pattern = "{label}\n {n}") %>%
  fc_filter(pass_dx_rank == 1 & pass_latest_admit == 1, label = "Latest admit per patient", show_exc = TRUE, label_exc = "Not latest admit per patient", text_pattern_exc = "{label}\n {n}", text_pattern = "{label}\n {n}")
```

```{r create-map-of-transfers}
# create map of transfers
#| output: false
#| echo: false
#| warning: false
library(lwgeom)
library(ggspatial)
library(geosphere)
library(rnaturalearth)
library(sf)

# Create a dataframe of routes in the dataset and the frequencies
routes <- d_pruned %>%
  group_by(formatted_icu_name, hospital_name_receiving) %>%
  count() %>%
  ungroup() %>%
  arrange(n)

# Create a tibble of longitude and latitude values of sources
sources_tbl <- d_pruned %>% 
  select(start_longitude, start_latitude)
# Create a tibble of longitude and latitude values of destinations
destinations_tbl <- d_pruned %>%
  select(end_longitude, end_latitude)

# Calculate great circles
sl_routes <- gcIntermediate(sources_tbl, destinations_tbl, 
                            n = 50, addStartEnd = TRUE, 
                            sp = TRUE)

# Get mapdata for Sweden from Naturalearth
swe <- ne_countries(scale = "medium", country="Sweden", type="countries", returnclass="sf")

routes_id <- rowid_to_column(routes, var = "id")
routes_long <- routes_id %>% 
  gather(key = "type", value = "place", formatted_icu_name, hospital_name_receiving)

end <- d_pruned %>% select(hospital_name_receiving, end_latitude, end_longitude) %>% rename("place" = "hospital_name_receiving", "latitude" = "end_latitude", "longitude" = "end_longitude")
start <- d_pruned %>% select(formatted_icu_name, start_latitude, start_longitude) %>% rename("place" = "formatted_icu_name", "latitude" = "start_latitude", "longitude" = "start_longitude")
locations <- bind_rows(end, start) %>% distinct()
routes_long_geo <- left_join(routes_long, locations, by = "place")

routes_long_sf <- st_as_sf(routes_long_geo,
                           coords = c("longitude", "latitude"),
                           crs = 4326)

routes_lines <- routes_long_sf %>% 
  group_by(id) %>% 
  summarise(do_union = FALSE) %>% 
  st_cast("LINESTRING")

routes_lines <- left_join(routes_lines, routes_id, by = "id")

routes_sf_tidy <- routes_lines %>% 
  st_segmentize(units::set_units(20, km))

routes_sf_tidy <- routes_lines %>% 
  st_segmentize(units::set_units(20, km))

plot_transfers <- ggplot() +
  geom_sf(data = swe, fill = gray(0.95), color = gray(0.3)) +
  geom_sf(data = routes_sf_tidy, aes(alpha = routes_sf_tidy$n), show.legend = FALSE) +
  scale_alpha_continuous(range = c(0.1, 1), breaks = pretty(range(routes_sf_tidy$n), n = 4)) +
  geom_sf(data = routes_long_sf) +
  geom_point(data = filter(locations, place %in% unique(d_pruned$hospital_name_receiving)), aes(x = longitude, y = latitude), color = "maroon", size = 2) +
  geom_point(shape=21, data = filter(locations, place %in% heavy_users$formatted_icu_name), aes(x = longitude, y = latitude), color = "#004080", size = 3.5) +
  theme_minimal() + 
  ggspatial::annotation_scale(style="ticks", location="br")
```

```{r create-barplot-of-transfers-summary}
#| output: false
#| echo: false
#| warning: false
# Get palette (manually select better contrast)
clrs <- natparks.pals("Yosemite")
hems_col <- clrs[5]       
non_hems_col <- clrs[20]   
total_col <- "grey30"     # Neutral gray for total bars

# Custom theme using Noto Sans
theme_nice <- function() {
  theme_minimal(base_family = "Noto Sans") +
    theme(
      plot.background = element_rect(fill = "white", color = NA),
      panel.grid.minor = element_blank(),
      panel.grid.major.y = element_line(color = "grey90"),
      plot.title = element_text(face = "bold", size = 16),
      plot.subtitle = element_text(size = 13, margin = margin(b = 10)),
      strip.text = element_text(face = "bold", size = 13),
      strip.background = element_rect(fill = "grey90", color = NA),
      legend.title = element_text(face = "bold", size = 12),
      legend.text = element_text(size = 11),
      axis.text.x = element_text(angle = 45, hjust = 1),
      axis.title.y = element_text(margin = margin(r = 10)),
      plot.margin = margin(10, 15, 10, 15)
    )
}

# Define bins
breaks <- c(seq(0, 300, by = 100), Inf)
labels <- c(paste(head(breaks, -2), breaks[-c(1, length(breaks))], sep = "–"), ">300")

# Prepare data
d_plot <- d_pruned %>%
  mutate(
    distance_bin = cut(road_distance, breaks = breaks, labels = labels, right = FALSE),
    distance_bin = factor(distance_bin, levels = labels),
    hems_label = factor(hems_ift, levels = c(TRUE, FALSE), labels = c("HEMS", "Non-HEMS"))
  )

# Count plot
total_counts <- d_plot %>%
  count(distance_bin) %>%
  mutate(
    plot_type = "Counts",
    hems_label = "Total"
  )

# Proportion plot
proportions <- d_plot %>%
  count(distance_bin, hems_label) %>%
  group_by(distance_bin) %>%
  mutate(
    prop = n / sum(n),
    plot_type = "Proportion"
  ) %>%
  ungroup() %>%
  select(distance_bin, hems_label, n = prop, plot_type)

# Combine and fix stacking order
plot_data <- bind_rows(total_counts, proportions) %>%
  mutate(
    hems_label = factor(hems_label, levels = c("Non-HEMS", "HEMS", "Total"))  # HEMS plotted first
  )

# Final plot
transports_summary_plot <- ggplot(plot_data, aes(x = distance_bin, y = n, fill = hems_label)) +
  geom_col(position = "stack", width = 0.75, color = "white", linewidth = 0.2) +
  facet_grid(plot_type ~ ., scales = "free_y") +
  scale_fill_manual(
    values = c(
      "HEMS" = hems_col,
      "Non-HEMS" = non_hems_col,
      "Total" = total_col
    ),
    breaks = c("HEMS", "Non-HEMS")
  ) +
  labs(
    x = "Road Distance (km)",
    y = NULL,
    fill = "Transport Mode"
  ) +
  theme_nice()
```

```{r create-cohort_description}
#| output: false
#| echo: false
#| warning: false
# Create table with cohort description

cohort_description <- d_pruned %>%
  mutate(hems_ift = case_when(
hems_ift == TRUE ~ "HEMS transfer",
hems_ift == FALSE ~ "Non-HEMS transfer"
  )) %>%
  select(hems_ift, age, sex_female, DX_GROUP, DNR, sir_consciousness_level, SAPS_obtunded, any_AMV, SAPS_total_score, SAPS_score_minus_age_and_gcs, icu_admit_afterhours, icu_admit_nighttime, sir_total_time, road_distance, transit_time_hems, hems_minima_both, daylight, metar_cloud_base, metar_visibility, MORTALITY_30D, MORTALITY_90D, DAOH_90) %>% 
  tbl_summary(by=hems_ift,
              label = list(hems_ift ~ "Interfacility transfer by HEMS",
                           age ~ "Age (years)",
                           sex_female ~ "Female",
                           DX_GROUP ~ "Inferred diagnosis",
                           DNR ~ "DNR order",
                           sir_consciousness_level ~ "SAPS consciousness level",
                           SAPS_obtunded ~ "SAPS GCS <15",
                           any_AMV ~ "Pre-transfer Mechanical Ventilation",
                           SAPS_total_score ~ "SAPS Score",
                           SAPS_score_minus_age_and_gcs ~ "SAPS Score excl. age and GCS components",
                           icu_admit_afterhours ~ "Primary ICU admission outside of office hours",
                           icu_admit_nighttime ~ "Primary ICU admission between 22:00 - 07:00",                     
                           sir_total_time ~ "Time in primary ICU (minutes)",
                           road_distance ~ "Road distance (km)",
                           transit_time_hems ~ "Inferred flight time for HEMS transfers (minutes)",
                           hems_minima_both ~ "HEMS weather minima met at discharge",
                           daylight ~ "Daylight or civil twillight at sending hospital",
                           metar_cloud_base ~ "Cloud base (ft) at sending hospital",
                           metar_visibility ~ "Horizontal visibility (m) at sending hospital",
                           MORTALITY_30D ~ "30-day mortality",
                           MORTALITY_90D ~ "90-day mortality",
                           DAOH_90 ~ "Days alive and out of hospital 90"
                           ),
              statistic = c("transit_time_hems") ~ "{mean}",
              missing_text="Missing"
                        ) %>%
    remove_row_type(variables = c(transit_time_hems, metar_cloud_base, metar_visibility), type = "missing") %>%
    modify_caption(caption = "**Patient characteristics**")
```

```{r create-modality-model}
#| output: false
#| echo: false
#| warning: false
m.modality <- d_pruned %>% 
  mutate(z_SAPS_score_minus_age_and_gcs = scale(SAPS_score_minus_age_and_gcs)) %>%
  mutate(z_age = scale(age)) %>%
    mutate(
    DX_GROUP = forcats::fct_relabel(DX_GROUP, ~ gsub("_", "-", .)),
    sir_consciousness_level = forcats::fct_relabel(sir_consciousness_level, ~ gsub("≤", "le", .))  # or use ASCII-safe levels
  ) %>%
  brm(
  formula = hems_ift ~ DX_GROUP + z_age + SAPS_obtunded + z_SAPS_score_minus_age_and_gcs + icu_admit_nighttime + hems_minima_both + DNR + (1 | formatted_icu_name),
  data = .,
  family = bernoulli(),
  cores = 4,
  chains = 4,
  iter = 4000
)

m.modality.exp_vals <- summary(m.modality)$fixed %>%
  slice(9) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()

```

```{r create-modality-model-tbl}
#| output: false
#| echo: false
#| warning: false
m.modality.summary <- m.modality %>% tbl_regression(exponentiate = TRUE, conf.level = .95,
                              label = c(DX_GROUP ~ "Inferred diagnosis",
                                        icu_admit_nighttime ~ "Primary ICU admission between 22:00 - 07:00",  
                                        SAPS_obtunded ~ "SAPS GCS <15",
                                        DNR ~ "DNR order",
                                        z_age ~ "Age, std",
                                        z_SAPS_score_minus_age_and_gcs ~ "SAPS score (excl. age/GCS components), std",
                                        hems_minima_both ~ "HEMS weather minima met at discharge")
                              ) %>%
  modify_header(estimate = "**aOR**") %>%
  modify_header(label = "**Predictor**") %>%
  modify_caption(caption = "**Multivariate Bayesian mixed effect generalized linear model predicting HEMS transfer**")
```


```{r glm-causal-plain}
#| output: false
#| echo: false
#| warning: false

m.causal.I <- d_pruned %>%
  mutate(
    z_SAPS_score_minus_age_and_gcs = as.numeric(scale(SAPS_score_minus_age_and_gcs)),
    DX_GROUP = forcats::fct_relabel(DX_GROUP, ~ gsub("_", "-", .x)),
    hems_ift = as.integer(hems_ift),
    z_age    = as.numeric(scale(age)),
    sir_consciousness_level = forcats::fct_relevel(sir_consciousness_level, "I (GCS ≥13)")
  ) %>%
  brm(
    formula = MORTALITY_30D ~ hems_ift + DX_GROUP + DNR +
      sir_consciousness_level + z_age + z_SAPS_score_minus_age_and_gcs,
    data   = .,
    family = bernoulli(),
    cores  = 4,
    iter   = 4000
  )

m.causal.I.exp_vals <- summary(m.causal.I)$fixed %>%
  slice(2) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()
```

```{r glm-causal-sending}
#| output: false
#| echo: false
#| warning: false

m.causal.II <- d_pruned %>%
  mutate(
    z_SAPS_score_minus_age_and_gcs = as.numeric(scale(SAPS_score_minus_age_and_gcs)),
    DX_GROUP = forcats::fct_relabel(DX_GROUP, ~ gsub("_", "-", .x)),
    hems_ift = as.integer(hems_ift),
    z_age    = as.numeric(scale(age)),
    sir_consciousness_level = forcats::fct_relevel(sir_consciousness_level, "I (GCS ≥13)")
  ) %>%
 brm(
    formula = MORTALITY_30D ~ hems_ift + (1 + hems_ift | formatted_icu_name) + DX_GROUP + DNR + sir_consciousness_level + z_age + z_SAPS_score_minus_age_and_gcs, 
    data = .,
    family = bernoulli(),
    cores = 4,
    iter = 4000
  )

m.causal.II.exp_vals <- summary(m.causal.II)$fixed %>%
  slice(2) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()
```

```{r glm-causal-receiving}
#| output: false
#| echo: false
#| warning: false

m.causal.III <- d_pruned %>%
  mutate(
    z_SAPS_score_minus_age_and_gcs = as.numeric(scale(SAPS_score_minus_age_and_gcs)),
    DX_GROUP = forcats::fct_relabel(DX_GROUP, ~ gsub("_", "-", .x)),
    hems_ift = as.integer(hems_ift),
    z_age    = as.numeric(scale(age)),
    sir_consciousness_level = forcats::fct_relevel(sir_consciousness_level, "I (GCS ≥13)")
  ) %>%
 brm(
    formula = MORTALITY_30D ~ hems_ift + hospital_name_receiving + DX_GROUP + DNR + sir_consciousness_level + z_age + z_SAPS_score_minus_age_and_gcs, 
    data = .,
    family = bernoulli(),
    cores = 4,
    iter = 4000
  )

m.causal.III.exp_vals <- summary(m.causal.III)$fixed %>%
  slice(2) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()
```

```{r glm-causal-w-night}
#| output: false
#| echo: false
#| warning: false
m.causal.IV <- d_pruned %>%
  mutate(
    z_SAPS_score_minus_age_and_gcs = as.numeric(scale(SAPS_score_minus_age_and_gcs)),
    DX_GROUP = forcats::fct_relabel(DX_GROUP, ~ gsub("_", "-", .x)),
    hems_ift = as.integer(hems_ift),
    z_age    = as.numeric(scale(age)),
    sir_consciousness_level = forcats::fct_relevel(sir_consciousness_level, "I (GCS ≥13)")
  ) %>%
 brm(
    formula = MORTALITY_30D ~ hems_ift + (1 + hems_ift | formatted_icu_name) + DX_GROUP + DNR + sir_consciousness_level + z_age + z_SAPS_score_minus_age_and_gcs + icu_admit_nighttime,
    data = .,
    family = bernoulli(),
    cores = 4,
    iter = 4000
  )

m.causal.IV.exp_vals <- summary(m.causal.IV)$fixed %>%
  slice(2) %>%
  select("Estimate", "l-95% CI", "u-95% CI") %>%
  unlist() %>%
  exp()
```

```{r glm-causal-daoh}
#| output: false
#| echo: false
#| warning: false
m.causal.V <- d_pruned %>%
   mutate(
    z_SAPS_score_minus_age_and_gcs = as.numeric(scale(SAPS_score_minus_age_and_gcs)),
    DX_GROUP = forcats::fct_relabel(DX_GROUP, ~ gsub("_", "-", .x)),
    hems_ift = as.integer(hems_ift),
    z_age    = as.numeric(scale(age)),
    sir_consciousness_level = forcats::fct_relevel(sir_consciousness_level, "I (GCS ≥13)")
  ) %>%
  mutate(DAOH_90 = DAOH_90/90) %>%
 brm(formula =  bf(DAOH_90 ~ hems_ift + (1 | formatted_icu_name) + DX_GROUP + sir_consciousness_level + DNR + z_age + z_SAPS_score_minus_age_and_gcs,
     zi ~ hems_ift + (1 | formatted_icu_name) + DX_GROUP + sir_consciousness_level + DNR + z_age +  z_SAPS_score_minus_age_and_gcs,
     phi ~ hems_ift + (1 | formatted_icu_name) + DX_GROUP + sir_consciousness_level + DNR + z_age + z_SAPS_score_minus_age_and_gcs),
    data = .,
    family = zero_inflated_beta(),
    chains = 4,
    iter = 4000
  )
```



```{r create-tbl-glm-outcome}
#| output: false
#| echo: false
#| warning: false
library(dplyr)
library(stringr)
library(gtsummary)

# Labels
labels_common <- c(
  DX_GROUP ~ "Inferred diagnosis",
  z_age ~ "Age, std",
  sir_consciousness_level ~ "GCS",
  DNR ~ "DNR order",
  hems_ift ~ "HEMS transfer",
  z_SAPS_score_minus_age_and_gcs ~ "SAPS score (excl. age/GCS components), std"
)
labels_iv <- c(
  labels_common,
  icu_admit_nighttime ~ "Primary ICU admission between 22:00 - 07:00"
)

# ------------------ TABLES ------------------

tbl.glm.causal.I <-
  tbl_regression(
    m.causal.I, exponentiate = TRUE, conf.level = 0.95,
    include = c(hems_ift, DX_GROUP, z_age, sir_consciousness_level, DNR, z_SAPS_score_minus_age_and_gcs),
    label = labels_common
  ) %>%
  modify_header(estimate = "**aOR**", label = "**Predictor**") %>%
  modify_table_body(~ .x %>%
    mutate(
      # de-glue GCS level rows
      label = if_else(
        variable == "sir_consciousness_level" & row_type != "label",
        str_remove(label, "^sir_consciousness_level") |> str_trim(),
        label
      ),
      label = case_when(
        variable == "sir_consciousness_level" & label %in% c("IIGCS7M12","IIGCS7-12","IIGCS7–12") ~ "II (GCS 7–12)",
        variable == "sir_consciousness_level" & label == "IIIGCS6" ~ "III (GCS 6)",
        variable == "sir_consciousness_level" & label == "IVGCS5"  ~ "IV (GCS 5)",
        variable == "sir_consciousness_level" & label %in% c("VGCSle4","VGCS≤4") ~ "V (GCS ≤4)",
        TRUE ~ label
      )
    )
  )

tbl.glm.causal.II <-
  tbl_regression(
    m.causal.II, exponentiate = TRUE, conf.level = 0.95,
    include = c(hems_ift, DX_GROUP, z_age, sir_consciousness_level, DNR, z_SAPS_score_minus_age_and_gcs),
    label = labels_common
  ) %>%
  modify_header(estimate = "**aOR**", label = "**Predictor**") %>%
  modify_table_body(~ .x %>%
    mutate(
      label = if_else(
        variable == "sir_consciousness_level" & row_type != "label",
        str_remove(label, "^sir_consciousness_level") |> str_trim(),
        label
      ),
      label = case_when(
        variable == "sir_consciousness_level" & label %in% c("IIGCS7M12","IIGCS7-12","IIGCS7–12") ~ "II (GCS 7–12)",
        variable == "sir_consciousness_level" & label == "IIIGCS6" ~ "III (GCS 6)",
        variable == "sir_consciousness_level" & label == "IVGCS5"  ~ "IV (GCS 5)",
        variable == "sir_consciousness_level" & label %in% c("VGCSle4","VGCS≤4") ~ "V (GCS ≤4)",
        TRUE ~ label
      )
    )
  )

tbl.glm.causal.III <-
  tbl_regression(
    m.causal.III, exponentiate = TRUE, conf.level = 0.95,
    include = c(hems_ift, DX_GROUP, z_age, sir_consciousness_level, DNR, z_SAPS_score_minus_age_and_gcs),
    label = labels_common
  ) %>%
  modify_header(estimate = "**aOR**", label = "**Predictor**") %>%
  modify_table_body(~ .x %>%
    mutate(
      label = if_else(
        variable == "sir_consciousness_level" & row_type != "label",
        str_remove(label, "^sir_consciousness_level") |> str_trim(),
        label
      ),
      label = case_when(
        variable == "sir_consciousness_level" & label %in% c("IIGCS7M12","IIGCS7-12","IIGCS7–12") ~ "II (GCS 7–12)",
        variable == "sir_consciousness_level" & label == "IIIGCS6" ~ "III (GCS 6)",
        variable == "sir_consciousness_level" & label == "IVGCS5"  ~ "IV (GCS 5)",
        variable == "sir_consciousness_level" & label %in% c("VGCSle4","VGCS≤4") ~ "V (GCS ≤4)",
        TRUE ~ label
      )
    )
  )

tbl.glm.causal.IV <-
  tbl_regression(
    m.causal.IV, exponentiate = TRUE, conf.level = 0.95,
    include = c(hems_ift, DX_GROUP, z_age, sir_consciousness_level, DNR, icu_admit_nighttime, z_SAPS_score_minus_age_and_gcs),
    label = labels_iv
  ) %>%
  modify_header(estimate = "**aOR**", label = "**Predictor**") %>%
  modify_table_body(~ .x %>%
    mutate(
      label = if_else(
        variable == "sir_consciousness_level" & row_type != "label",
        str_remove(label, "^sir_consciousness_level") |> str_trim(),
        label
      ),
      label = case_when(
        variable == "sir_consciousness_level" & label %in% c("IIGCS7M12","IIGCS7-12","IIGCS7–12") ~ "II (GCS 7–12)",
        variable == "sir_consciousness_level" & label == "IIIGCS6" ~ "III (GCS 6)",
        variable == "sir_consciousness_level" & label == "IVGCS5"  ~ "IV (GCS 5)",
        variable == "sir_consciousness_level" & label %in% c("VGCSle4","VGCS≤4") ~ "V (GCS ≤4)",
        TRUE ~ label
      )
    )
  )

## DAOH model (strip to mean/"b_" params) -------------------------------
model_mean_only <- m.causal.V
keep_pars <- names(model_mean_only$fit)[
  grepl("^b_", names(model_mean_only$fit)) &
  !grepl("^b_phi_", names(model_mean_only$fit)) &
  !grepl("^b_zi_", names(model_mean_only$fit))
]
model_mean_only$fit@sim$samples   <- lapply(model_mean_only$fit@sim$samples, \(ch) ch[keep_pars])
model_mean_only$fit@sim$pars_oi   <- keep_pars
model_mean_only$fit@sim$fnames_oi <- keep_pars
model_mean_only$fit@model_pars    <- keep_pars
model_mean_only$fit@par_dims      <- model_mean_only$fit@par_dims[keep_pars]

m.causal.V.exp_vals <- as.data.frame(summary(model_mean_only$fit)$summary) %>%
  slice(2) %>%
  select(mean, `2.5%`, `97.5%`) %>%
  mutate(across(everything(), exp))

tbl.glm.causal.V <-
  tbl_regression(
    model_mean_only, exponentiate = TRUE, conf.level = 0.95,
    include = c(hems_ift, DX_GROUP, z_age, sir_consciousness_level, DNR, z_SAPS_score_minus_age_and_gcs),
    label = labels_common
  ) %>%
  modify_header(estimate = "**aOR**", label = "**Predictor**") %>%
  modify_table_body(~ .x %>%
    mutate(
      label = if_else(
        variable == "sir_consciousness_level" & row_type != "label",
        str_remove(label, "^sir_consciousness_level") |> str_trim(),
        label
      ),
      label = case_when(
        variable == "sir_consciousness_level" & label %in% c("IIGCS7M12","IIGCS7-12","IIGCS7–12") ~ "II (GCS 7–12)",
        variable == "sir_consciousness_level" & label == "IIIGCS6" ~ "III (GCS 6)",
        variable == "sir_consciousness_level" & label == "IVGCS5"  ~ "IV (GCS 5)",
        variable == "sir_consciousness_level" & label %in% c("VGCSle4","VGCS≤4") ~ "V (GCS ≤4)",
        TRUE ~ label
      )
    )
  )

# ------------------ MERGE + FINAL TWEAKS ------------------

glm_table <-
  tbl_merge(
    tbls = list(
      tbl.glm.causal.I,
      tbl.glm.causal.II,
      tbl.glm.causal.III
    ),
    tab_spanner = c("**Model I**","**Model II**","**Model III**")
  ) %>%
  modify_caption("**Multivariate Bayesian mixed effect generalized linear models predicting 30-day mortality (I–III)**")

glm_table <- glm_table %>%
  modify_table_body(~ .x %>%
    dplyr::group_by(variable) %>%
    dplyr::mutate(.is_header = row_type == "label" & dplyr::n() > 1) %>%  # only multi-row vars
    dplyr::ungroup() %>%
    dplyr::mutate(
      dplyr::across(
        dplyr::starts_with("ci"),
        ~ dplyr::case_when(
            .is_header           ~ "",                                        # blank only true headers
            is.na(.) | . == ""   ~ .,                                         # leave real NA/blank
            TRUE                 ~ paste0("(", gsub(",\\s+", ",\u00A0", .), ")")  # NBSP + parens
        )
      )
    ) %>%
    dplyr::select(-.is_header)
  )
```

```{r causal-contrasts-glm}
#| output: false
#| echo: false
#| warning: false
res.causal.I <- ate_cf(m.causal.I)
res.causal.II <- ate_cf(m.causal.II)
res.causal.III <- ate_cf(m.causal.III)
res.causal.IV <- ate_cf(m.causal.IV)
```

```{r prep-minimal-dataset}
# prep a minimal dataset for iv models 
#| output: false
#| echo: false
#| warning: false

d_pruned_minimal <- d_pruned %>%
  select(hems_minima_both, hems_ift, MORTALITY_30D, icu_admit_nighttime) %>%
  mutate(hems_minima_both = case_when
          (hems_minima_both == 1 ~ TRUE,
           hems_minima_both == 0 ~ FALSE)) %>%
  mutate(MORTALITY_30D = case_when
          (MORTALITY_30D == 1 ~ TRUE,
           MORTALITY_30D == 0 ~ FALSE
            )) %>%
  drop_na()

n_missing_d_pruned_minimal <- d_pruned %>%
  summarise(n_missing = sum(is.na(hems_minima_both))) %>%
  pull(n_missing)

# Save anonymized minimal dataset for IV models
write_csv(d_pruned_minimal, "data/minimal_iv_dataset.csv")

# Never-takers: among those assigned to treatment (W == 1), was H == 0?
H0_W1_obs <- as.integer(d_pruned_minimal$hems_ift[d_pruned_minimal$hems_minima_both == 1] == 0)

# Always-takers: among those assigned to control (W == 0), was H == 1?
H1_W0_obs <- as.integer(d_pruned_minimal$hems_ift[d_pruned_minimal$hems_minima_both == 0] == 1)

N_obs <- dim(d_pruned_minimal)[1]
W <- as.integer(d_pruned_minimal$hems_minima_both)
H <- as.integer(d_pruned_minimal$hems_ift)
Y <- as.integer(d_pruned_minimal$MORTALITY_30D)
N <- as.integer(d_pruned_minimal$icu_admit_nighttime)

H0_W1_night <- as.integer(d_pruned_minimal$icu_admit_nighttime[d_pruned_minimal$hems_minima_both == 1])
H1_W0_night <- as.integer(d_pruned_minimal$icu_admit_nighttime[d_pruned_minimal$hems_minima_both == 0])

data_list <- list(
  N_i = nrow(d_pruned_minimal),
  N_obs = as.integer(d_pruned_minimal$icu_admit_nighttime),
  W_obs = as.integer(d_pruned_minimal$hems_minima_both),
  H_obs = as.integer(d_pruned_minimal$hems_ift),
  D_obs = as.integer(d_pruned_minimal$MORTALITY_30D),
  H0_W1_obs = H0_W1_obs,
  H1_W0_obs = H1_W0_obs,
  H0_W1_n = length(H0_W1_obs),
  H1_W0_n = length(H1_W0_obs),
  H0_W1_night = H0_W1_night,
  H1_W0_night = H1_W0_night
)
```

```{r define-iv-models}
# fit iv models
#| output: false
#| echo: false
#| warning: false

model_late <-
  "data {
  int<lower=1> N_i;                         // Total observations
  int<lower=0, upper=1> W_obs[N_i];         // Instrument (W)
  int<lower=0, upper=1> H_obs[N_i];         // Treatment (H)
  int<lower=0, upper=1> D_obs[N_i];         // Outcome (Y)

  int<lower=1> H0_W1_n;                     // Number where W=1
  int<lower=0, upper=1> H0_W1_obs[H0_W1_n]; // Was H=0 in those W=1?

  int<lower=1> H1_W0_n;                     // Number where W=0
  int<lower=0, upper=1> H1_W0_obs[H1_W0_n]; // Was H=1 in those W=0?
}

parameters {
  simplex[3] pi;                            // [NT, C, AT]
  real<lower=0, upper=1> P_T_Z[2];        // P(H | W)
  real<lower=0, upper=1> P_Y_Z[2];        // P(Y | W)
}

model {
  // Priors
  pi ~ dirichlet(rep_vector(0.5, 3));
  for (w in 1:2) {
    P_T_Z[w] ~ beta(0.5, 0.5);
    P_Y_Z[w] ~ beta(0.5, 0.5);
  }

  // Likelihoods
  for (i in 1:N_i) {
    int w = W_obs[i] + 1;

    H_obs[i] ~ bernoulli(P_T_Z[w]);
    D_obs[i] ~ bernoulli(P_Y_Z[w]);
  }

  for (i in 1:H0_W1_n)
    H0_W1_obs[i] ~ bernoulli(pi[1]);  // P(H=0 | W=1): Never-taker

  for (i in 1:H1_W0_n)
    H1_W0_obs[i] ~ bernoulli(pi[3]);  // P(H=1 | W=0): Always-taker
}

generated quantities {
  real LATE;
  real P_H0_W1 = pi[1];
  real P_COMP = pi[2];
  real P_H1_W0 = pi[3];

  LATE = (P_Y_Z[2] - P_Y_Z[1]) / P_COMP;
}"

model_late_conditional <-" 
  data {
  int<lower=1> N_i;                            // Total observations
  int<lower=0, upper=1> W_obs[N_i];            // Instrument (W), 0 or 1
  int<lower=0, upper=1> N_obs[N_i];            // Nighttime indicator (N), 0 or 1
  int<lower=0, upper=1> H_obs[N_i];            // Treatment (H)
  int<lower=0, upper=1> D_obs[N_i];            // Outcome (Y)

  int<lower=1> H0_W1_n;
  int<lower=0, upper=1> H0_W1_obs[H0_W1_n];    // H=0 in W=1 group
  int<lower=0, upper=1> H0_W1_night[H0_W1_n];  // Night indicator for those

  int<lower=1> H1_W0_n;
  int<lower=0, upper=1> H1_W0_obs[H1_W0_n];    // H=1 in W=0 group
  int<lower=0, upper=1> H1_W0_night[H1_W0_n];  // Night indicator for those
}

parameters {
  simplex[3] pi[2];                             // Compliance types [NT, C, AT] by N (0,1)
  real<lower=0, upper=1> P_T_Z[2, 2];           // P(H | W, N) — rows: W=0/1, cols: N=0/1
  real<lower=0, upper=1> P_Y_Z[2, 2];           // P(Y | W, N)
  real<lower=0, upper=1> P_N;                   // P(N = 1) — i.e. nighttime
}

model {
  // Priors
  for (n in 1:2) {
    pi[n] ~ dirichlet(rep_vector(0.5, 3));
  }
  for (w in 1:2) {
    for (n in 1:2) {
      P_T_Z[w, n] ~ beta(0.5, 0.5);
      P_Y_Z[w, n] ~ beta(0.5, 0.5);
    }
  }
  P_N ~ beta(0.5, 0.5);

  // Likelihoods
  for (i in 1:N_i) {
    int w = W_obs[i] + 1;
    int n = N_obs[i] + 1;

    H_obs[i] ~ bernoulli(P_T_Z[w, n]);
    D_obs[i] ~ bernoulli(P_Y_Z[w, n]);
    N_obs[i] ~ bernoulli(P_N);  // This is fine with 0/1
  }

  for (i in 1:H0_W1_n) {
    int n = H0_W1_night[i] + 1;
    H0_W1_obs[i] ~ bernoulli(pi[n][1]);  // NT
  }

  for (i in 1:H1_W0_n) {
    int n = H1_W0_night[i] + 1;
    H1_W0_obs[i] ~ bernoulli(pi[n][3]);  // AT
  }
}

generated quantities {
  vector[2] LATE;
  vector[2] P_H0_W1;
  vector[2] P_COMP;
  vector[2] P_H1_W0;

  for (n in 1:2) {
    P_H0_W1[n] = pi[n][1];
    P_COMP[n] = pi[n][2];
    P_H1_W0[n] = pi[n][3];
    LATE[n] = (P_Y_Z[2, n] - P_Y_Z[1, n]) / P_COMP[n];
  }

  real LATE_marginalized = LATE[2] * P_N + LATE[1] * (1 - P_N);
}
"

```

```{r fit-iv-models}
# fit iv models
#| output: false
#| echo: false
#| warning: false
#| 
fit.late <- stan(
  model_code = model_late,
  data = data_list,
  chains = 4,
  iter = 4000,
  seed = 1989
)

fit.late.conditional <- stan(
  model_code = model_late_conditional,
  data = data_list,
  chains = 4,
  iter = 4000,
  seed = 1989
)

# Extract posteriors
posterior.late <- extract(fit.late)
posterior.late <- as.data.frame(rstan::extract(fit.late))

posterior.conditional <- extract(fit.late.conditional)
posterior.late.conditional <- as.data.frame(rstan::extract(fit.late.conditional))

```

```{r}
# calc iv bounds
#| output: false
#| echo: false
#| warning: false
library(ivtools)

iv.b <- ivbounds(data = d_pruned_minimal, Z = "hems_minima_both", X = "hems_ift", Y = "MORTALITY_30D", monotonicity = TRUE)
CRD_lower <- iv.b$p1["min"] - iv.b$p0["max"]
CRD_upper <- iv.b$p1["max"] - iv.b$p0["min"]
```

## Abstract
**Background:** The optimal transfer team setup and transportation modality in long distance transfers of neuro ICU patients is unknown. Rigorous causal inference in observational intensive care research requires well thought through approaches to address selection bias. The purpose of this study was to estimate the effect of helicopter emergency medical services (HEMS) on 30-day mortality in neuro ICU patients transferred to tertiary centers for definitive care. 

**Methods:** We performed a retrospective cohort study using Swedish healthcare registries and flight movement data. Patients transferred by HEMS were compared to those transferred by other modalities. Weather conditions affecting flight feasibility were used as a natural randomization mechanism in an instrumental variable analysis.

**Results:** A total of `r dim(d_pruned)[1]` patients were transferred from `r length(unique(d_pruned$formatted_icu_name))` Swedish ICUs. There were `r sum(d_pruned$hems_ift, na.rm = TRUE)` HEMS transfers (`r round(mean(d_pruned$hems_ift, na.rm = TRUE) * 100, 1)`%). The core assumptions for instrumental variable analysis using weather observations were met. The local average treatment effect of HEMS versus non-HEMS transfer on 30-day mortality was estimated at `r round(mean(posterior.late$LATE), 3) * 100`% (95% CI: `r round(hdi(posterior.late$LATE, ci = 0.95)$CI_low, 3) * 100`% to `r round(hdi(posterior.late$LATE, ci = 0.95)$CI_high, 3) * 100`%). The posterior probability that HEMS increases 30-day mortality in the complier group is `r round(mean(posterior.late$LATE > 0), 3) * 100`%. In a risk-adjusted logistic model, the estimated marginal risk difference in 30-day mortality associated with HEMS was `r round(res.causal.I$summary$mean, 3) * 100`% (95% CI: `r round(res.causal.I$summary$q025, 3) * 100`% to `r round(res.causal.I$summary$q975, 3)`%), but including receiving center in the model removed all association.

**Conclusion:** Findings are weakly suggestive of a potential mortality benefit associated with HEMS transfers in the complier subgroup. The estimates are limited by statistical uncertainty. Weather-based instrumental variables could provide a transferable framework for causal inference in transport research.

## Keywords
"Neurocritical care", "Neurointensive care", "Intensive care", "Interfacility transfer", "Causal inference", "Instrumental variable analysis", "Helicopter emergency medical services", "HEMS", "Patient transfer"

## Introduction

Interfacility transfer is often necessary for critically ill patients with neurological emergencies admitted to community or regional ICUs. In Sweden, as in many other countries, neurosurgical and neurocritical care expertise is centralized to tertiary centers, requiring patients to be transferred for definitive care.

Transporting a critically ill patient between hospitals is challenging. These patients require ongoing treatment and monitoring during transit — ideally at a level at least equivalent to the referring ICU. Environmental factors such as changes in barometric pressure, temperature, and acceleration can exert physiological stress. Operational challenges — limited access to the patient, low lighting, noise, and confined workspace — further complicate care. The transport team must also, concurrent with patient management, solve several non-clinical tasks such as planning the logistics with dispatch centers and communicating with ambulance personnel, receiving providers, and flight crews.

Such challenges may explain the high complication rates observed during interhospital transfers, reported in some studies to affect up to 25% of patients [@singh_incidence_2009; @van_lieshout_nurses_2016; @lyphout_patient_2018; @sundbom_total_2021; @seymour_adverse_2008; @swickard_patient_2018; @wiegersma_quality_2011; @grier_critical_2020; @jeyaraju_safety_2021]. Actual rates may be higher due to reliance on self-reporting [@grier_critical_2020; @archer_development_2017]. Longer transport times are consistently linked to increased complication risk [@singh_incidence_2009; @sundbom_total_2021; @swickard_patient_2018].

Some studies suggest that interhospital transfers conducted by specialized retrieval teams may be associated with better care and reduced mortality in adult populations [@van_lieshout_nurses_2016; @wiegersma_quality_2011; @kennedy_impact_2015; @bellingan_comparison_2000]. Similar findings have been reported in pediatric cohorts [@ramnarayan_effect_2010; @orr_pediatric_2009; @vos_comparison_2004; @meyer_pediatric_2016]. Only one single-center randomized controlled trial has been conducted to assess the impact of specialized transport teams on patient outcome, which could not establish non-inferiority of nurse led transports [@van_lieshout_nurses_2016]. Despite the limited evidence base, several expert groups and professional societies recommend that interfacility transfers of critically ill patients be conducted by specialized teams [@droogh_transferring_2015; @foex_guidance_2019; @warren_guidelines_2004; @nathanson_guidelines_2020].

However, few studies have specifically examined interhospital transfers of neurocritical care patients [@nuno_effect_2012;@safaee_interfacility_2019], and only one has evaluated the impact of transport modality on outcomes in this population, finding no difference between air and ground transfers [@weyhenmeyer_effects_2018].

In Sweden, two similar patients may be transferred by different means, and under markedly different standards of care. Broadly, there are two provider configurations for interhospital transfers. Specialized retrieval teams — typically composed of an anesthesiologist-intensivist and a nurse trained in anesthesia and/or intensive care — operate at a regional level and have access to both air and ground transport. In contrast, local solutions are based at the referring hospital. These may consist of a single provider, typically a nurse with anesthetic training or a physician in training. Local providers or teams rely solely on ground ambulances. All patients transferred between hospitals via HEMS in Sweden are managed by specialized retrieval teams.

The factors influencing the choice of transport modality in interhospital ICU transfers have not been described. It is possible that estimated transport time, clinical urgency, weather, patient characteristics and local practice patterns play a role in the decision-making process. Some of these factors might be confounding variables. Establishing causal relationships from observational transport data is challenging, as most previous studies may be limited by residual confounding from unmeasured factors that influence both transport decisions and patient outcomes.

In this study, we describe a national cohort of neurocritically ill patients transferred to tertiary centers. We examine predictors of HEMS use and assess whether HEMS — serving in part as a proxy for specialized teams — is associated with 30-day mortality. To strengthen causal inference and address potential residual confounding, we use weather data as an instrumental variable to estimate the effect of transport modality in the presence of potentially uncontrolled confounders, following systematic validation of its validity as an instrumental variable.

## Methods

### Ethics Approval

Ethical approval was granted by the Swedish Ethical Review Authority (Dnr 2022-03054-01, Dnr 2024-00776-02). The manuscript follows the STROBE reporting guidelines for observational studies [@elm_strengthening_2007].

### Data Sources

Clinical data were obtained from the Swedish Intensive Care Registry (SIR), a national quality registry covering over 80 ICUs, representing most community and regional hospitals in Sweden. SIR includes detailed information on ICU admissions, discharge times, diagnostic categories, interventions, and physiological scores such as SAPS 3 and SOFA.

These data were linked to the National Patient Register (NPR), which contains administrative records, procedure codes, and ICD-coded discharge diagnoses. The NPR has near-complete national coverage and fair diagnostic validity for several acute neurological conditions [@koster_refinement_2013].

Mortality data was retrieved from the National Cause of Death Register (CDR), which includes the date and, when available, the cause of death, with near-complete national coverage.

Additional non-clinical data sources were integrated. Flight movement data were provided by FlightRadar24 AB, Stockholm. Historical weather data were sourced from the Swedish Civil Aviation Administration via Ogimet (www.ogimet.com). Geographic coordinates for hospitals, helipads, and airports — as well as road distances between facilities — were obtained using the openrouteservice.org API (HeiGIT), based on OpenStreetMap data.

Data preprocessing and analyses were performed using SQLite version 3.49.1, `r version$version.string`, Python version 3.11, and a set of open source software packages [@jordahl_geopandasgeopandas_2020; @pollard_metar_nodate; @stan_development_team_rstan_2025; @burkner_brms_2017; @sjoberg_reproducible_2021; @arvid_sjolander_elisabeth_dahlqwist_torben_martinussen_ivtools_2018].

### Patients and Clinical Data
ICU admissions were eligible if they met the following criteria: (1) admission to a non-tertiary ICU between October 1, 2019, and May 15, 2024; (2) patient age ≥18 years; (3) admission belonging to a patient who had at least one ICU admission during the study period with a diagnosis relevant to neurointensive care; and (4) a primary ICU stay shorter than 12 hours; (5) a transfer to a tertiary center ≥ 45 km away from the primary ICU; (6) a diagnosis of aneurysmal subarachnoid hemorrhage (ASAH), acute ischemic stroke (AIS), intracerebral hemorrhage (ICH), or traumatic brain injury (TBI)—the most common indications for neurocritical care. Finally, only the first such ICU admission per patient was included in the final cohort.

Because not all tertiary ICUs report to the Swedish Intensive Care Registry (SIR), interhospital transfers were identified via the National Patient Register (NPR). The start date corresponds to the introduction of a new helicopter at a major transport organization, improving flight tracking accuracy. Follow-up on mortality and hospital stays was available through December 31, 2024.

Transfers were defined as admissions to a tertiary center on the same calendar day as discharge from the initial ICU. To account for late-day transitions, we also included tertiary admissions on the day before or after ICU discharge if consistent with a continuous care episode. Long-distance transfers were defined as those exceeding 45 km. Based on an estimated road ambulance speed of 90 km/h and 15 minutes for embarking and disembarking, this threshold implies at least one hour spent outside the safety of an ICU during road transport.

Diagnostic data were sourced from the NPR based on ICD-10 and procedural codes (detailed in the code supplement).

Physiological data and pre-transfer interventions were retrieved from the SIR record for the initial ICU stay. Glasgow Coma Scale (GCS) scores reflected pre-sedation consciousness and were taken from SAPS assessments. To avoid strong collinearity in regression models, we also calculated a modified SAPS excluding the age and GCS components. 

Do-not-resuscitate (DNR) status was based on SIR records from the initial ICU admission. SIR timestamps were used to classify admissions as occurring outside office-hours (office-hours being defined as Monday to Friday, 08:00–17:00) or during nighttime (22:00–07:00).

### Exposure
Helicopter flight movements were parsed from Automatic Dependent Surveillance–Broadcast (ADS-B) data. Details on the flight-matching algorithm and processing steps are provided in Supplement A.

Transfers were classified as carried out by HEMS if a matching flight was identified along the same hospital-to-hospital route, with a departure time within ±120 minutes of the ICU discharge. This window accounts for potential clerical discrepancies in discharge times. See the supplementary material for details. 

### Instrumental Variable

Automated weather reports (METeorological Aerodrome Reports, METARs) were used to assess whether meteorological conditions met the minimum requirements for HEMS operations, assuming two-pilot crews operating under performance class 1 or 2, in accordance with European Union Aviation Safety Agency (EASA) regulations [@noauthor_regulation_nodate]. Details on weather data parsing and classification criteria are provided in the Supplement A.

Each hospital was matched to the nearest consistently METAR-reporting airport. Weather conditions were evaluated at both sending and receiving sites at the time of ICU discharge. HEMS was considered technically feasible only if EASA weather minima were met at both locations.

### Outcomes

The primary outcome was 30-day mortality following initial ICU admission. The secondary outcome was days alive and out of hospital (DAOH) within 90 days. Mortality data were obtained from the National Cause of Death Register, and hospitalization data from the National Patient Register. For patients who died within each time window, DAOH was set to zero.

### Statistical Methods

In the descriptive part of the study, continuous variables were presented as medians with interquartile ranges (IQRs). No univariate hypothesis testing was performed between HEMS and non-HEMS groups to avoid overinterpreting descriptive differences as causal. Transfer routes were visualized using a adapted code from J. Sadler [@sadler_great_2018].

All analyses used Bayesian modeling with non-informative priors, given the limited evidence base for intensive care transfers in Sweden. Model estimates were reported as odds ratios or marginal effects with 95% credible intervals (CI).

Predictors of transport modality were assessed using a Bayesian generalized linear model with Bernoulli likelihood, group-level ("random intercept") effects for sending ICU and a logit link function, implemented in `brms`. We examined the relationship between transport modality and patient outcomes using three complementary approaches. First, to describe outcome differences while adjusting for measured confounders, we fit Bayesian generalized linear models with Bernoulli likelihood for 30-day mortality and zero-inflated beta likelihood for days alive and out of hospital at 90 days (DAOH-90). The models used a a logit link function. One of the models (model II) employed partial pooling across sending ICUs, allowing each ICU to have its own intercept and HEMS effect parameter drawn from common priors in order to regularize esimated coefficients for the >40 centers. Clinical and demographic covariates were selected based on clinical expertise.

Second, the local average treatment effect (LATE) was estimated using an instrumental variable (IV) approach, leveraging weather-based HEMS operating minima as a pseudo-randomization mechanism [@angrist_identification_1996; @hernan_causal_2020]. IV analysis allows for causal inference estimation in the presence of unmeasured confounders. LATE describes the effect of treatment in the "complier" subset of patients—those whose treatment status is influenced by the instrumental variable [@hernan_causal_2020]. LATE was estimated using a model implemented in `rstan` with non-informative Jeffreys priors. Details of the estimand and estimator are provided in the methodological supplement, with code available online.

Third, nonparametric Balke-Pearl bounds for the average treatment effect (ATE) were estimated under minimal assumptions using the `ivtools` package [@balke_bounds_1997;@arvid_sjolander_elisabeth_dahlqwist_torben_martinussen_ivtools_2018].

#### Instrumental Variable Model Assumptions
Identification of LATE requires several assumptions: (1) instrumental variable relevance, (2) exclusion restriction, (3) exchangeability, and (4) monotonicity [@angrist_identification_1996; @balke_bounds_1997; @swanson_commentary_2013; @labrecque_understanding_2018; @hernan_causal_2020; @lousdal_introduction_2018].

Instrumental variable relevance (*assumption 1*) was assessed by fitting a generalized linear model with HEMS minima as the sole predictor of transport modality. The estimated odds ratio (OR) was `r round(m.modality.univ_exp_vals["Estimate"], 1)` (95% CI: `r round(m.modality.univ_exp_vals["l-95% CI"], 1)` to `r round(m.modality.univ_exp_vals["u-95% CI"], 1)`).

While untestable, the exclusion restriction (*assumption 2*) is plausible given that weather at ICU discharge is unlikely to affect mortality directly. Further, the assumption can be challenged by instrumental inequalities or by detecting an association between the instrumental variable and outcome in subgroups where the instrumental variable cannot influence exposure [@labrecque_understanding_2018; @balke_bounds_1997]. In our analysis, the instrumental inequality was not violated. Additionally, in a subgroup of patients from ICUs with low HEMS utilization (10%), the instrumental variable showed no clear association with outcome: the univariate GLM yielded an odds ratio of `r round(m.weather.mort_exp_vals["Estimate"], 1)` (95% CI: `r round(m.weather.mort_exp_vals["l-95% CI"], 1)` to `r round(m.weather.mort_exp_vals["u-95% CI"], 1)`).

```{r @tbl-cov-balance-iv}
#| label: tbl-cov-balance-iv

cov_variance_tbl
```

The exchangeability assumption (*assumption 3*), also known as the no unmeasured confounders condition, cannot be formally tested. One proposed falsification strategy is to assess covariate balance across levels of the instrumental variable [@labrecque_understanding_2018]. As shown in @tbl-cov-balance-iv, selected covariates appear balanced, with no strong evidence of systematic differences. Covariance of weather and patient outcomes across regions could introduce confounding. As shown in Supplementary Figure 1, regional variability in weather is modest, with departures from the national mean observed at only a few sites and confined to brief periods of the year, suggesting limited potential for substantial confounding.

Since weather varies not only randomly, but also diurnally and seasonally, the timing of ICU discharge may influence instrumental variable status. Diurnal variation in ICU strain, staffing, or case mix could affect outcomes through site-specific mechanisms, potentially confounding results. In our cohort, nighttime admission (22:00–07:00) was associated with lower rates of 30-day mortality: OR `r round(m.night.mort_exp_vals["Estimate"], 1)` (95% CI: `r round(m.night.mort_exp_vals["l-95% CI"], 1)` to `r round(m.night.mort_exp_vals["u-95% CI"], 1)`). Therefore, in sensitivity analyses, we included nighttime admission as a covariate in the regression model and used weather as an instrumental variable conditional on admission time.  

The monotonicity assumption (*assumption 4*) requires that the instrumental variable influences exposure in only one direction [@labrecque_understanding_2018]. In this context, poor weather cannot increase HEMS transfer likelihood given regulatory and operational constraints, and we consider this assumption satisfied. Thus, the four core assumptions for identifying the local average treatment effect (LATE) appear to be met.

#### Missing data
Patients with missing relevant variables were excluded prior to statistical modelling. The rate of missingness was low. For the linear models up to `r round( (nrow(d_pruned) - nrow(m.modality$data)) / nrow(d_pruned), 3 ) * 100`% of patients were excluded due to a missing variable. For the LATE estimation, only `r n_missing_d_pruned_minimal` of `r nrow(d_pruned)` patients (`r round( n_missing_d_pruned_minimal / nrow(d_pruned) , 3 ) * 100`%
) missing weather data, were excluded.

### Utilization of Large Language Models (AI)
Language editing, proofreading, code review, and code documentation were assisted by Claude Sonnet 4 (Anthropic, 2025) and GPT-4o (OpenAI, 2024). No AI assistance was used in study planning, conceptualization, data analysis, interpretation, or conclusions.

### Code and Data Availability
All code used for data preprocessing and analysis is publicly available [@mycode]. A fully anonymized, abbreviated dataset used for the instrumental variable analysis is also archived and accessible at reasonable request. (Use Zenodo to create doi when repo is finalized.)

## Results

### Patient Selection

```{r fig-flowchart, echo=FALSE, warning=FALSE, fig.width=8, fig.height=10, fig.cap="Flowchart of patient inclusion", fig.popup=TRUE}
#| label: fig-flowchart
#| echo: false
#| warning: false
#| fig-cap: "Flowchart of patient inclusion. ICU admissions were eligible if they met the following criteria: (1) admission to a non-tertiary ICU between October 1, 2019, and May 15, 2024; (2) patient age ≥18 years; (3) admission belonging to a patient who had at least one ICU admission during the study period with a diagnosis relevant to neurointensive care. Next, all primary ICU stays longer than 12 hours were excluded. ICU admissions without a transfer to a tertiary center were excluded in the next step. ICU admissions with a transfer to a tertiary center less than 45 km away were excluded. ICU admissions with diagnoses at the receiving center other than aneurysmal subarachnoid hemorrhage (ASAH), acute ischemic stroke (AIS), intracerebral hemorrhage (ICH), or traumatic brain injury (TBI) were excluded next. Finally, only the first such ICU admission per patient was included in the final cohort."
#| fig-popup: true

# Render flowchart
d_fc %>% fc_draw(title = "Cohort selection flowchart")
```

Patient selection is described in @fig-flowchart. A total of `r dim(d_pruned)[1]` patients were included in the cohort.

### Travel Routes

The paths traveled by patients in the cohort are plotted in @fig-sweden. The distribution of transport distances and transport modality distribution per distance bin are shown in @fig-barplot-summary-both. We note that the proportion of HEMS transfers increase with longer transfer distances, however it is likely that some long distance transfers (>300 km) are carried out by fixed-wing.

```{r map_of_transfers, fig.popup=TRUE, fig.cap="Routes travelled by patients in the full cohort", fig.width=14, fig.height=7}
#| label: fig-sweden
#| fig-cap: "Routes travelled by patients in the full cohort by all modes of transport. The opacity of the line denotes the relative number of transfers. Receiving tertiary centers are maroon. Centers where >10% of patients are transferred by HEMS are circled. Note that one of the receiving tertiary centers, Örebro University Hospital, also is also acting as a sending center."
#| output: true
#| echo: false
#| warning: false
#| fig-popup: true

plot_transfers
```

```{r transports_summary_plot_both, fig.popup=TRUE, fig.cap="Distribution of travel distances and transportation mode in the cohort"}
#| label: fig-barplot-summary-both
#| fig-cap: "Distribution of transport distances (calculated road distance between sending and receiving center) in the cohort and the proportion of transport modality per distance bin."
#| output: true
#| echo: false
#| warning: false
#| fig-popup: true

transports_summary_plot
```

### Description of the Cohort

```{r tbl-cohort-description}
#| label: tbl-cohort-description

cohort_description
```

Minor differences between the HEMS and non-HEMS groups were observed: the HEMS group has a slightly lower median age, fewer DNR orders, and longer transport distances. Overall, the groups appear similar across median and mean values of the presented covariates.

### Factors Associated With HEMS Transfer

```{r tbl-modality-model}
#| label: tbl-modality-model

m.modality.summary
```

In the multivariate predictive model of transfer modality, mainly HEMS minima being met clearly predicted the choice of HEMS (aOR `r round(m.modality.exp_vals["Estimate"], 1)`, 95% CI `r round(m.modality.exp_vals["l-95% CI"], 1)` to `r round(m.modality.exp_vals["u-95% CI"], 1)`), see @tbl-modality-model. Also, having a DNR order was negatively associated with HEMS transfers.

### Risk Adjusted Models of HEMS Transfer vs. 30-day Mortality and DAOH-90

```{r tbl-glm-outcome}
#| label: tbl-glm-outcome
glm_table
```

Table @tbl-glm-outcome presents the adjusted odds ratios (aORs) from a set of regression models. Models I–III estimate 30-day mortality using a Bernoulli likelihood with a logit link (i.e. logistic regression). In addition to the base predictors of Model I, Model II adds *sending center* as a variable with partial pooling of center level intercepts and HEMS effect parameters. Model III instead adds *receving center* as a variable in the regression. Coefficients for individual centers are omitted in the table for brevity.

The fitted models I-III were used to estimate the effect of HEMS transfer on 30-day mortality, while conditioning on the observed distribution of covariates ($C$). The estimated mean risk difference, $\mathbb{E}[Y^{H=1, C}] - \mathbb{E}[Y^{H=0,C}]$, for model I was `r round(res.causal.I$summary$mean, 3) * 100`% (95% CI: `r round(res.causal.I$summary$q025, 3) * 100`% to `r round(res.causal.I$summary$q975, 3) * 100`%). The posterior probability that HEMS transfer increases 30-day mortality was `r round(mean(res.causal.I$summary$pr_gt0), 3) * 100` %.

Adding sending center as a variable in model II changed estimates slightly, here the estimated mean risk difference was `r round(res.causal.II$summary$mean, 3) * 100`% (95% CI: `r round(res.causal.II$summary$q025, 3) * 100`% to `r round(res.causal.II$summary$q975, 3) * 100`%). The probability of HEMS transfer increasing 30-day mortality was `r round(mean(res.causal.II$summary$pr_gt0), 3) * 100`%. However, having receiving center as a variable removed the positive effect of HEMS transfers with an estimated mean risk difference of `r round(res.causal.III$summary$mean, 3) * 100`% (95% CI: `r round(res.causal.III$summary$q025, 3) * 100`% to `r round(res.causal.III$summary$q975, 3) * 100`%). The probability of HEMS transfer increasing 30-day mortality was `r round(mean(res.causal.III$summary$pr_gt0), 3) * 100`%.

In a sensitivity analysis including ICU nighttime admission as an additional covariate to model I, the estimated mean risk difference was `r round(res.causal.IV$summary$mean, 3) * 100`% (95% CI: `r round(res.causal.IV$summary$q025, 3) * 100`% to `r round(res.causal.IV$summary$q975, 3) * 100`%), i.e. yielding results that were essentially unchanged. Finally a model using a zero-inflated beta likelihood with a logit link function was fitted. Here, the odds ratios from the count part of the model represent the multiplicative change in the expected number of days alive and out of hospital (DAOH-90) associated with a one-unit increase in the predictor, holding other variables constant. HEMS transfer was not associated with DAOH-90 (aOR `r round(m.causal.V.exp_vals[["mean"]], 1)`, 95% CI `r round(m.causal.V.exp_vals[["2.5%"]], 1)` to `r round(m.causal.V.exp_vals[["97.5%"]], 1)`).

### Instrumental Variable Analysis

#### Estimation of the Local Average Treatment Effect (LATE)

The local average treatment effect (LATE) of being transferred by HEMS versus non-HEMS on 30-day mortality was estimated at `r round(mean(posterior.late$LATE), 3) * 100`% (95% CI: `r round(hdi(posterior.late$LATE, ci = 0.95)$CI_low, 3) * 100`% to `r round(hdi(posterior.late$LATE, ci = 0.95)$CI_high, 3) * 100`%). 

Estimates of the LATE using weather minima as a conditional instrumental variable (given nighttime ICU admission) did not differ meaningfully: `r round(mean(posterior.late.conditional$LATE_marginalized), 3) * 100`% (95% CI: `r round(hdi(posterior.late.conditional$LATE_marginalized, ci = 0.95)$CI_low, 3) * 100`% to `r round(hdi(posterior.late.conditional$LATE_marginalized, ci = 0.95)$CI_high, 3) * 100`%).

Given the data, model, and assumptions, the posterior probability that HEMS reduces 30-day mortality by at least 2.5 percentage points in the complier group (patients whose transport modality was influenced by weather conditions) is `r round(mean(posterior.late$LATE < -0.025), 3) * 100`%. We propose this threshold as a rough benchmark for a clinically and economically meaningful benefit in typical neuro ICU patients, given the high cost and limited availability of helicopter transfers. The posterior probability that HEMS increases 30-day mortality in the complier group is `r round(mean(posterior.late$LATE > 0), 3) * 100`%.

#### Estimation of Bounds on the Average Treatment Effect (ATE)

Balke-Pearl bounds for the risk difference between HEMS and non-HEMS transfers were calculated. The lower bound was `r round(CRD_lower, 2) * 100`% and the upper bound was `r round(CRD_upper, 2) * 100`%

## Discussion

### Key findings
In this study aimed to evaluate outcome effects of HEMS transports in a Swedish cohort of neuro ICU patients requiring urgent transfer to a higher level of care, weather minima emerged as the strongest predictor of whether HEMS was utilized. Given the random nature of weather this was recognized as a possibly ideal candidate for IV analysis and thus a means to more closely research causal effects. Younger age and higher SAPS scores were also associated with an increased likelihood of HEMS use, whereas having a DNR order decreased the likelihood of HEMS transfer.

Both mixed model generalized linear models (GLMs), assessing the adjusted association between HEMS transfers and 30-day mortality, and IV analyses estimating the local average treatment effect (LATE), suggested a possible reduction in 30-day mortality associated with HEMS transfers. Adding receiving center as a covariate in a GLM attenuated the weak association between HEMS transfer and mortality. However, wide uncertainty intervals warrant cautious interpretation.

### Findings in Relation to Previous Research
Previous studies examining the association between HEMS interfacility transfers and outcomes have reported mixed findings, with some suggesting benefits and others showing no clear effect [@van_lieshout_nurses_2016; @wiegersma_quality_2011; @kennedy_impact_2015; @bellingan_comparison_2000]. Our estimates, though imprecise, are consistent with a potential mortality benefit.

Our secondary finding of an unadjusted association between nighttime ICU admission and lower 30-day mortality, sits within a body of conflicting evidence. A 2018 meta-analysis found no association between nighttime ICU admissions and mortality [@galloway_effect_2018]. A more recent study reported a lower mortality risk for out-of-hours ICU admissions [@namikata_association_2022]. 

While previous studies provide valuable evidence, most face methodological challenges in establishing causal relationships. Several rely on unadjusted comparisons [@bellingan_comparison_2000; @vos_comparison_2004] or use statistical significance to guide variable selection [@kennedy_impact_2015; @orr_pediatric_2009], which may not adequately control for confounding. Other studies do not specify the selection process for covariates in multivariable regression [@ramnarayan_effect_2010; @weyhenmeyer_effects_2018] or propensity score methods [@meyer_pediatric_2016], which decreases transparency and may limit inferences due to lingering unmeasured confounders.

Adjusting for receiving center in a GLM removed the apparent effect of HEMS on mortality. Three of seven centers rarely (<1%) or never received HEMS patients, and these centers also had higher unadjusted mortality. This pattern could explain why adjustment eliminated the observed effect. These differences, potentially reflecting unmeasured variations in case-mix or quality of care, are themselves an interesting finding.

However, this explanation assumes that receiving center is a confounder rather than an instrumental variable. If receiving center primarily influences mortality through its effect on HEMS transfer, including it as a covariate could bias causal estimates, particularly in the presence of unmeasured confounding [@myers_effects_2011]. Prior neuro-ICU research has leveraged center-level differences as a quasi-randomization source, effectively treating center as an instrument rather than adjusting for it [@volovici_comparative_2022]. Moreover, the sparse overlap in some centers indicates a violation of the positivity assumption, or lack of common support [@hernan_causal_2020; @igelstrom_causal_2022], highlighting the limitations of regression for causal inference and the potential value of alternative approaches such as instrumental variable analysis.

Instrumental variable methods remain relatively uncommon in intensive care research. A few notable exceptions have applied parametric two-stage regression models to estimate treatment effects using IVs [@wilson_clinician_2022; @harris_impact_2018; @tsuchiya_outcomes_2016]. These models rely on assumptions about the functional form of the model and the distribution of error terms, which, if violated, can compromise estimates.

The instrumental variables used in those studies include: prior regional HEMS usage (as an instrument for prehospital HEMS vs. ground EMS selection), ICU bed availability (as a proxy for ICU admission promptness), and provider preference (as an instrument for prophylactic magnesium use to prevent atrial fibrillation). All of these IVs are, in effect, variants of region-, hospital-, or provider-level preferences. However, such instruments risk violating the exchangeability assumptions, since they may influence patient outcomes through multiple pathways—not solely through the treatment of interest.

While we commend these studies for advancing causal inference in intensive care research, they also underscore the challenge of identifying valid instrumental variables. In contrast, weather conditions may provide a defensible instrumental variable for studying the effectiveness of HEMS. To our knowledge, this is the first study to use weather minima as an instrumental variable to examine the effect of HEMS interfacility transfers.

### Limitations
Patients who die during transfer may be underrepresented, as such patients may not be admitted at the receiving center. Nevertheless, in-transfer deaths are rare in ICU interfacility transfers [@singh_incidence_2009; @jeyaraju_safety_2021], making substantial bias from this source unlikely.

Another limitation concerns the studied exposure. We believe that the composition and experience of the transfer team, rather than the choice of vehicle, is the key factor influencing patient outcomes. While HEMS transfers serve as a proxy for specialized team involvement, this is an imperfect surrogate. Ground-based transfers may also involve specialized teams, though more commonly they are conducted by local providers. Fixed-wing transfers are performed either by specialized teams or by a single retrieval nurse with anesthetic training. Importantly, transport modality reflects not only team composition but also unmeasured factors such as transit time. Finally, the exposure was inferred from flight records rather than systematically collected clinical data.

The primary outcome used in this study, 30-day mortality, is relatively coarse. For patients with severe brain injury, functional outcomes are more informative. As a proxy, we included Days Alive and Out of Hospital (DAOH) within 90 days as a secondary outcome. However, prior research shows mixed associations between DAOH and functional outcome [@granholm_association_2023; @delaney_association_2023; @donnelly_days_2024]. It is possible that HEMS transfers affect functional outcome more than mortality rates. Future studies should preferably include functional outcomes as an endpoint.

In the IV analysis, the estimand was the LATE—reflecting the causal effect among compliers, i.e., patients whose transport modality would vary based on weather minima. The generalizability of this estimate depends on how representative this group is of the broader population and whether effect sizes are consistent across subgroups.

The modest sample size limits precision, particularly given that even optimistic effect sizes would be small (on the order of a few percentage points). Notably, as we moved from GLMs to IV analyses to ATE bounds estimation — each requiring fewer assumptions — uncertainty intervals widened. This reflects the trade-off between statistical certainty and the strength of modeling assumptions.

### Strengths
A key strength of this study is the use of a transparent and reproducible causal inference framework. By leveraging observed weather data to approximate random assignment of transport modality, we strengthen causal inference beyond what is typically achievable with conventional regression methods. Although tailored to our setting, this approach may be applicable in other countries and healthcare systems where HEMS operations are subject to well-defined weather criteria or similar external constraints. In such contexts, comparable methods could support causal inference using observational data to evaluate the effectiveness of HEMS in both interfacility and prehospital care.

Our study integrates high-quality national registry data, which strengthens external validity and supports the generalizability of findings to ICU settings across Sweden.

### Implications
Our findings may modestly shift prior beliefs in future studies of interfacility ICU transfer modalities, suggesting a positive effect of HEMS on patient outcomes. As a secondary observation, we found an unadjusted association between nighttime ICU admission and lower 30-day mortality, which may warrant further investigation.

If the LATE estimates reflect a real mortality benefit, practical implications include prioritizing HEMS availability for patients most likely to benefit. This could involve ensuring adequate funding and implementing Point-in-Space (PinS) procedures to expand HEMS operational capacity in poor weather conditions.

We propose that weather minima can serve as a valid instrumental variable (IV) for causal inference regarding the effect of HEMS transfers. Where testable, the core IV assumptions appear to be met. Beyond this specific clinical context, the study illustrates how natural experiments can be leveraged to strengthen causal inference in observational healthcare research. The framework demonstrates that valid instruments can be identified even in complex clinical settings, potentially inspiring similar approaches across intensive care research.

Future steps to strengthen inferences about the choice of transfer modality in ICU patients include expanding the cohort to encompass all ICU diagnoses, which—using Swedish registries—could increase the sample to approximately 10,000 patients from 2019 to 2025. Incorporating fixed-wing transport data would help disentangle the effects of team composition from transport mode. Ideally, prospective collection of data on transfer team composition and operational metrics is needed to better characterize the ICU transfer process.

A central principle of neurocritical care is the prevention and management of “secondary insults” — physiological disturbances that exacerbate primary brain injury [@signorini_adding_1999]. Hypotension and hypoxia, for example, are such insults that are also commonly reported complications during interfacility transfers [@lyphout_patient_2018; @sundbom_total_2021; @seymour_adverse_2008; @swickard_patient_2018]. Future studies should, in addition to examining transport logistics and team composition, ideally incorporate detailed records of interventions and physiological parameters during transfer. This would enable a better understanding of how care delivery during transport affects patient outcomes and help identify opportunities for improvement. Expanding the Swedish Intensive Care Registry to include data on the transfer process could be a feasible and impactful way to collect a large, generalizable dataset. Since the infrastructure for entering data into the registry is already broadly available, such data gathering could be implemented relatively quickly.

## Conclusion
In this national Swedish cohort of emergency interfacility neuro ICU transfers, we demonstrate a rigorous approach to causal inference using weather-based instrumental variables. Findings weakly suggest a potential mortality benefit associated with HEMS. We establish and present a transferable methodological framework that addresses fundamental challenges in observational transport research. The systematic validation of IV assumptions and transparent analytical approach provide a foundation for future causal inference studies in the field.

## List of Abbreviations
ADS-B   Automatic Dependent Surveillance–Broadcast

AIS     Acute Ischemic Stroke

AMV     Assisted Mechanical Ventilation

ASAH    Aneurysmal Subarachnoid Hemorrhage

DAOH    Days Alive and Out of Hospital

EASA    European Union Aviation Safety Agency

GLM     Generalized Linear Model

HEMS    Helicopter Emergency Medical Services

ICH     Intracerebral Hemorrhage

IV      Instrumental Variable

LATE    Local Average Treatment Effect

METAR   Meteorological Aerodrome Report

NPR     National Patient Register

SIR     Swedish Intensive Care Registry

TBI     Traumatic Brain Injury

## Funding
Cecilia Åkerlund was supported by Region Stockholm (clinical postdoctoral appointment).

## References

::: {#refs}
:::
