import geopandas
import pandas as pd
import numpy as np
import os
import re
import matplotlib.pyplot as plt
import seaborn as snsIn [1]:
Outline
This supplementary material provides information about the parsing of flight data, the case ascertainment strategy and parsing of weather data.
Inferring Flights Using ADS-B Data
The primary objective was to infer flights for all helicopters in the dataset. The data provided by FlightRadar24 AB was delivered as comma-separated value (CSV) files, derived from aircraft transmissions. Each data point includes an aircraft identifier, timestamp, position, altitude, and so forth. Departure and destination information from flight plans can be transmitted; however, most HEMS flights do not routinely file a destination. Complicating data parsing further, signals can be lost mid-flight when helicopters traverse areas with poor ground receiver coverage—typically in sparsely populated regions—resulting in a single flight being fragmented into multiple segments within the dataset.
Our algorithm utilizes defined areas of interest around known hospital landing sites and HEMS bases to infer inter-hospital flights. In essence, the algorithm detects movements into and out of these zones and infers a flight if the elapsed time between an aircraft exiting one hospital zone and entering another is realistic relative to the distance between the zones. A time threshold is imposed to avoid misclassifying prehospital HEMS missions as interfacility flights. Some time variations are permitted to account for factors such as refueling stops en route and holding patterns prior to landing.
First, let’s examine the defined landing zones on the interactive map below. By hovering over the zones, readers can identify the associated hospitals. Some hospitals have multiple zones (e.g., a hospital helipad and a nearby airport). Notably, the sizes of these zones vary significantly to accommodate areas with poor receiver coverage in certain parts of the country. Larger zones increase the risk of a passing helicopter being incorrectly classified as having landed within a zone. To address this, an estimated minimum “dwell time” was established for each zone, representing the minimal time an aircraft would spend in the area after entering it if a patient were to be picked up. This estimation considers the zone’s radius and the minimal time required for landing and takeoff. Both “dwell times” and zone radii were adjusted following manual inspection of hundreds of flights.
In [2]:
airports = pd.read_csv("/Users/JO/PhD/hemspy/data/helipad-data/raw-data/helipad-coordinates.csv", sep=";")
airports_gdf = geopandas.GeoDataFrame(airports, geometry=geopandas.points_from_xy(airports.longitude, airports.latitude), crs="EPSG:4326")
airports_gdf = airports_gdf.to_crs("EPSG:32634") #to metric coords
airports_gdf.geometry = airports_gdf.geometry.buffer(distance=airports.radius)
airports_gdf = airports_gdf.to_crs("EPSG:4326") #back to conventional
airports_gdf.explore()Next, let’s examine a real-world example of flight data. Here, we will look at data points from the helicopter based at Karlstad Airport over a period of approximately 30 hours. All data points are plotted on the map.
In [3]:
def create_flight_df(csv_path, airports):
flight = pd.read_csv(csv_path)
flight[['latitude', 'longitude']] = flight['Position'].str.split(',', expand=True)
flight = geopandas.GeoDataFrame(flight, geometry=geopandas.points_from_xy(flight.longitude, flight.latitude), crs="EPSG:4326")
flight = geopandas.sjoin(flight, airports, how="left", predicate="within")
flight.drop(['Position', 'index_right', 'latitude_left', 'latitude_right', 'latitude_right', 'longitude_right', 'longitude_left'], axis=1, inplace=True)
return flight
def plot_flight_df(flight_df, airports):
m = flight_df.explore()
return airports.explore(m=m, color='maroon', style_kwds={'fillOpacity': 0.1, 'weight':2})
# Define the folder path containing the CSV files
folder_path = '/Users/JO/PhD/hemspy/data/sample-data/karlstad-examples'
# Get a list of all CSV files in the folder and read them into DataFrames
data_frames = [pd.read_csv(os.path.join(folder_path, f)) for f in os.listdir(folder_path) if f.endswith('.csv')]
# Concatenate all DataFrames into a single DataFrame
combined_df = pd.concat(data_frames, ignore_index=True)
def create_flight_df(df, airports):
flight = df.copy()
flight[['latitude', 'longitude']] = flight['Position'].str.split(',', expand=True)
flight = geopandas.GeoDataFrame(flight, geometry=geopandas.points_from_xy(flight.longitude, flight.latitude), crs="EPSG:4326")
flight = geopandas.sjoin(flight, airports, how="left", predicate="within")
flight.drop(['Position', 'index_right', 'latitude_left', 'latitude_right', 'latitude_right', 'longitude_right', 'longitude_left'], axis=1, inplace=True)
return flight
k = create_flight_df(combined_df, airports_gdf)
plot_flight_df(k, airports_gdf)The data likely represents, first, a prehospital mission heading west to Årjäng. The helicopter then returns to base. Next, it flies north to the hospital in Torsby (a primary hospital), where it lands. It subsequently flies to the airport—presumably to refuel—and returns to the hospital before flying west to Uppsala University Hospital (a tertiary hospital). After a stop in Uppsala, the helicopter returns to its base at Karlstad Airport via Örebro Airport, again presumably for refueling.
During this leg, the helicopter passes through the zones for the hospitals in Västerås, Örebro, and Karlskoga. This highlights the necessity of incorporating the concept of “dwell time” to avoid falsely inferring patient pickups at hospitals that are merely overflown.
Next, we will examine the resulting inferred flights from primary to tertiary hospitals for this aircraft on 2024-04-29.
In [4]:
flights_path = '/Users/JO/PhD/hemspy/data/fr24-data/raw-data-unzipped-rearranged/flights'
flights_files = [f for f in os.listdir(flights_path) if f.endswith('.csv')]
flights_df_list, positions_df_list = [], []
for file in flights_files:
flights_df_list.append(pd.read_csv(flights_path+'/'+file))
flights_df = pd.concat(flights_df_list, ignore_index=True)
flights_df['flight_id'] = flights_df['flight_id'].astype(int)
positions_path = '/Users/JO/PhD/hemspy/data/fr24-data/raw-data-unzipped-rearranged/positions'
positions_files = [os.path.join(positions_path, file) for file in os.listdir(positions_path) if file.endswith('.csv')]
# List to store DataFrames
positions_df_list = []
regex_pattern = r'_(.*?)\.'
regex = re.compile(regex_pattern)
# Iterate over each file
for file in positions_files:
# Extract the flight_id from the file name
flight_id_match = regex.search(os.path.basename(file))
if flight_id_match:
flight_id = flight_id_match.group(1)
# Read the CSV file into a DataFrame
df = pd.read_csv(file, usecols=['snapshot_id', 'altitude', 'latitude', 'longitude', 'speed'])
# Add a column for flight_id
df['flight_id'] = flight_id
# Append the DataFrame to the list
positions_df_list.append(df)
# Combine all DataFrames into a single DataFrame
positions_df = pd.concat(positions_df_list, ignore_index=True)
positions_df['flight_id'] = positions_df['flight_id'].astype(int)
d = pd.merge(positions_df, flights_df, on='flight_id', how='left')
d['UTC'] = pd.to_datetime(d['snapshot_id'], unit='s', utc=True)
d['date'] = d['UTC'].dt.date
d['year'] = d['UTC'].dt.year
d = d[d['equip'].isin(['EC45', 'A139', 'A169', 'S76', 'AS65'])]
# drop sthlm, and HMF
d = d[~(d['reg'].isin(['SEJSR', 'SEJSP', 'SEJRH', 'SEJRI', 'SERJR', 'SEJRK', 'SEJRL', 'SEJRM', 'SEJRN']))]In [5]:
d = geopandas.GeoDataFrame(d, geometry=geopandas.points_from_xy(d.longitude, d.latitude), crs="EPSG:4326")
d = geopandas.sjoin(d, airports_gdf, how="left", predicate="within")
d.drop(['index_right', 'icao', 'is_primary_helipad', 'latitude_right', 'reserved', 'flight', 'callsign', 'longitude_right', 'real_to', 'schd_from', 'schd_to'], axis=1, inplace=True)
d = d.groupby('aircraft_id').apply(lambda x: x.sort_values('UTC'), include_groups=False)
# Identify transitions into and out of helipad zones
d['in_helipad_zone'] = ~d['zone_name'].isna()
d['zone_change'] = d['in_helipad_zone'].ne(d['in_helipad_zone'].shift())
# Extract the entry and exit times
d['entry'] = (d['zone_change'] == True) & (d['in_helipad_zone'] == True)
d['exit'] = (d['zone_change'] == True) & (d['in_helipad_zone'] == False)
d['UTC_str'] = d['UTC'].astype(str)
# Keep only entries and exits
entries_and_exits = d[d['entry'] | d['exit']]
entries_and_exits = entries_and_exits.reset_index().drop(columns='level_1')
# Create columm for UTC_out of landing zone
def UTC_out(x):
return x.shift(-1)['UTC']
utc_out = entries_and_exits.groupby('aircraft_id').apply(lambda x: UTC_out(x), include_groups=False)
utc_out = utc_out.reset_index(drop=True)
entries_and_exits['UTC_out'] = utc_out
# Keep only the zone dwellings with time longer > min_dwell_time
min_dwell_time = 15
entries_and_exits['time_in_zone'] = (entries_and_exits['in_helipad_zone'] * (entries_and_exits['UTC_out'] - entries_and_exits['UTC'])).dt.seconds / 60
entries_and_exits = entries_and_exits[entries_and_exits['time_in_zone'] > min_dwell_time]
# Clean up primary hospital row
entries_and_exits['is_primary_hospital'] = entries_and_exits['is_primary_hospital'].map({1: True, 0: False})
# Identify primary -> tertiary transfers within a time frame, assign ID number
def find_transfers(df, time_window_hours):
df['transfer_id'] = np.nan # Initialize the transfer_id column
# Filter for primary and university hospital landings
df_primary = df[df['is_primary_hospital']]
df_university = df[~df['is_primary_hospital']]
transfer_id = 0
# Iterate over each primary hospital landing
for idx_primary, row_primary in df_primary.iterrows():
# Find matching university hospital landings for the same aircraft within the time window
valid_university_landings = df_university[
(df_university['aircraft_id'] == row_primary['aircraft_id']) &
(df_university['UTC'] >= row_primary['UTC_out']) &
(df_university['UTC'] <= row_primary['UTC_out'] + pd.Timedelta(hours=time_window_hours))
]
if not valid_university_landings.empty:
# Pick the first valid university landing
first_valid_idx = valid_university_landings.index[0]
transfer_id += 1
df.at[idx_primary, 'transfer_id'] = transfer_id
df.at[first_valid_idx, 'transfer_id'] = transfer_id
return df
max_transit_time = 2.5
res = find_transfers(df=entries_and_exits, time_window_hours=max_transit_time)
transfers = res.dropna(subset='transfer_id')
# Make a new df with one transfer per row
## Split into sending and receiving dataframe
transfers_sending = transfers[transfers['is_primary_hospital'] == True]
transfer_receiving = transfers[transfers['is_primary_hospital'] == False]
## Left join in transfer id, remove columns
transfers_merged = pd.merge(transfers_sending, transfer_receiving, on='transfer_id', suffixes=('_sending', '_receiving'))
final = transfers_merged[['transfer_id', 'hospital_name_sending', 'ambulance_meetup_sending', 'hospital_name_receiving', 'year_sending',
'reg_sending', 'UTC_sending', 'UTC_out_sending', 'time_in_zone_sending', 'UTC_receiving', 'zone_name_sending','zone_name_receiving', 'flight_id_receiving', 'aircraft_id_receiving']]
## Calculate transit time (this will be the time from exiting a zone to entering a zone)
final['transit_time'] = (final['UTC_receiving'] - final['UTC_out_sending']).dt.total_seconds() / 60
# This function takes a transfer and from the raw data plots all flights for that aircraft within the times of starting the transfer
# and ending it, i.e. if needed all legs will be included
def plot_transfer(raw_df, transfer_df, transfer_id):
transfer = transfer_df[transfer_df['transfer_id'] == transfer_id]
aircraft = transfer['aircraft_id_receiving'].iloc[0]
start_time = transfer['UTC_out_sending'].iloc[0]
end_time = transfer['UTC_receiving'].iloc[0]
filtered_df = raw_df[raw_df.index.get_level_values(0) == aircraft]
filtered_df = filtered_df[filtered_df['UTC'] > start_time]
filtered_df = filtered_df[filtered_df['UTC'] < end_time]
flights_set = filtered_df['flight_id'].unique()
transfer_flights = raw_df[raw_df['flight_id'].isin(flights_set)]
transfer_flights = transfer_flights[['geometry', 'UTC_str', 'flight_id', 'reg', 'zone_name', 'altitude', 'speed', 'snapshot_id']]
n_legs = len(flights_set)
return transfer_flights.explore()
# Filter for rows where reg_sending is 'SEJSK' and the date is 2024-04-29
filtered = final[
(final['reg_sending'] == 'SEJSK') &
(final['UTC_sending'].dt.date == pd.to_datetime('2024-04-29').date())
]/var/folders/s_/hnbjtb757x10k0pyf6_x_9mw0000gn/T/ipykernel_90906/1893652255.py:74: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
final['transit_time'] = (final['UTC_receiving'] - final['UTC_out_sending']).dt.total_seconds() / 60
In [8]:
filtered| transfer_id | hospital_name_sending | ambulance_meetup_sending | hospital_name_receiving | year_sending | reg_sending | UTC_sending | UTC_out_sending | time_in_zone_sending | UTC_receiving | zone_name_sending | zone_name_receiving | flight_id_receiving | aircraft_id_receiving | transit_time | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5317 | 5928.0 | Torsby sjukhus | 0.0 | Akademiska sjukhuset | 2024 | SEJSK | 2024-04-29 16:54:18+00:00 | 2024-04-29 18:15:42+00:00 | 81.4 | 2024-04-29 19:02:41+00:00 | Torsby sjukhus | Akademiska sjukhuset | 888952060 | 4893291 | 46.983333 |
Here, the 1,700+ rows of flight data for the aircraft on 2024-04-29 are distilled into a single row representing the presumed interfacility transfer from Torsby Hospital to Uppsala University Hospital.
Case Ascertainment
Diagnostic information was drawn from the National Patient Register. The query is relatively complex to reduce the risk of ambiguity in the case ascertainment. In short, patients with a main diagnosis of I60, a procedural code denoting aneurysm coiling or clipping, or an aneurysmal subarachnoid hemorrhage (ASAH) as the cause of death were classified as having an ASAH. A diagnosis of I61 was deemed to indicate intracerebral hemorrhage (ICH). Traumatic brain injury (TBI) cases were identified by having S06, S020, S021, S028, S029, S071, S04, S09, or S12 as the main diagnosis. Acute ischemic stroke was identified by a main diagnosis of I63.
In [9]:
import requests
from IPython.display import display, Markdown
# Fetch SQL content
url = "https://raw.githubusercontent.com/johol355/neuro-ascertainment/refs/heads/main/candidate-queries/transferred-nsicu-cohort-daoh/transferred-cohort-daoh.sql"
response = requests.get(url)
if response.status_code == 200:
sql_code = response.text
# Triple backticks must be escaped to avoid breaking the f-string
backticks = "```"
collapsible_md = (
"<details>\n"
"<summary>Click to view SQL source</summary>\n\n"
f"{backticks}sql\n"
f"{sql_code}\n"
f"{backticks}\n"
"</details>"
)
display(Markdown(collapsible_md))
else:
print(f"Failed to fetch SQL file: {response.status_code}")Click to view SQL source
-- OUTLINE OF QUERY
-- 1. Define PAR_HADM as only PAR admits at tertiary centers
-- 2. Define all SIR admits at primary ICUs
-- 3. Match all PAR_HADM on SIR admits via Patient ID (LopNr),
-- this will give irrelevant matches (from years back)
-- as well.
-- 4. Create columns PAR_SIR_OFFSET and, more importantly,
-- PAR_SIR_OFFSET_TIGHT, that gives you the number of
-- calendar days between the SIR dsc and PAR adm.
-- The _TIGHT column only allows the value to be -1 or 1
-- if the dsc is late/early in the day respectively.
-- 5. Find and flag Örebro SIR - Örebro PAR matches in OREBRO_INTERNAL
-- to be able to get rid of these later
-- 6. Add diagnostic data, limited to a few neuro ICU dx
-- 7. Add clinical data per SIR ICU
-- 8. Add outcome data per SIR ICU
-- HOW TO USE FURTHER
-- In a flow chart:
-- Filter out on SIR_OFFSET_TIGHT != -1,0,1
-- Select lowest DX_RANK per SIR vtf
-- Filter out on OREBRO_INTERNAL
-- Keep only PRICU dsc latest within patient LopNr
--- PAR_HADM is a redefined PAR table adding a unique HADM_ID to each admission
-- Keep only admissions where the patient was >= 18 yrs at admission
-- Keep only admissions after 2010/01/01
-- Keep only admissions at a tertitiary center (inlcunding Örebro from 2014-01-01)
WITH PAR_HADM AS (
SELECT *,
ROW_NUMBER() OVER (
ORDER BY LopNr,
INDATUM,
UTDATUM,
CASE
WHEN SJUKHUS NOT IN ('11001', '11003', '51001', '12001', '21001', '64001', '41001', '41002', '55010')
THEN 0 ELSE 1
END
) AS HADM_ID
FROM PAR
WHERE ALDER >= 18
AND (
(INDATUM >= 14610 AND SJUKHUS IN ('11001', '11003', '51001', '12001', '21001', '64001', '41001', '41002'))
OR (SJUKHUS = '55010' AND INDATUM > 16071)
)
),
-- Also, for DAOH calcuations, we need PAR admits at not only tertiary centers
PAR_HADM_ALL AS (
SELECT *,
ROW_NUMBER() OVER (
ORDER BY LopNr,
INDATUM,
UTDATUM,
CASE
WHEN SJUKHUS NOT IN ('11001', '11003', '51001', '12001', '21001', '64001', '41001', '41002', '55010')
THEN 0 ELSE 1
END
) AS HADM_ID
FROM PAR
WHERE ALDER >= 18
),
------------------------------------------------------------------------------
-- CTE PAR_HADM_LAST_DSC:
-- Calculates days since last discharge for patients who have a previous
-- admission
------------------------------------------------------------------------------
PAR_HADM_LAST_DSC AS (
SELECT
*,
INDATUM - (MAX(UTDATUM) OVER (PARTITION BY LopNr ORDER BY INDATUM ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)) AS DAYS_SINCE_LAST_DSC
FROM PAR_HADM
),
------------------------------------------------------------------------------
-- CTE PAR_HADM_CONT:
-- Flags the occurrence of a new admission to hospital if re-admission date is
-- more than 1 day after previous discharge.
-- Using this, a unique ID for each coherent hospital admission is created.
------------------------------------------------------------------------------
PAR_HADM_CONT AS (
SELECT *,
CASE
WHEN DAYS_SINCE_LAST_DSC IS NULL OR DAYS_SINCE_LAST_DSC > 1 THEN 1
ELSE 0
END AS ADMISSION_FLAG,
SUM(CASE WHEN DAYS_SINCE_LAST_DSC IS NULL OR DAYS_SINCE_LAST_DSC > 1 THEN 1 ELSE 0 END)
OVER (PARTITION BY LopNr ORDER BY INDATUM ROWS UNBOUNDED PRECEDING) AS AdmissionSequence,
-- Explicit identifier combining patient and admission sequence:
LopNr || '-CONT_PAR_ADM-' || SUM(CASE WHEN DAYS_SINCE_LAST_DSC IS NULL OR DAYS_SINCE_LAST_DSC > 1 THEN 1 ELSE 0 END)
OVER (PARTITION BY LopNr ORDER BY INDATUM ROWS UNBOUNDED PRECEDING) AS CONT_HADM_ID
FROM PAR_HADM_LAST_DSC
),
------------------------------------------------------------------------------
-- CTE PAR_HADM_CONT_DATES:
-- Summarises coherent hospital admissions and shows its constituents (i.e.
-- the included HADM_ID's for each CONT_HADM_ID)
------------------------------------------------------------------------------
PAR_HADM_CONT_DATES AS (
SELECT
LopNr,
CONT_HADM_ID,
MIN(INDATUM) OVER (PARTITION BY CONT_HADM_ID) AS CONT_HADM_ADM_DATE,
MAX(UTDATUM) OVER (PARTITION BY CONT_HADM_ID) AS CONT_HADM_DSC_DATE,
HADM_ID,
INDATUM,
UTDATUM
FROM PAR_HADM_CONT
),
-- PR_ICU_ADMISSIONS has some basic information about all
-- ICU admissions to a primary ICU
-- And admitted less than 24 hrs (1440 minutes)
PR_ICU_ADMISSIONS AS (
SELECT
S.VtfId_LopNr,
S.LopNr,
S.InskrTidpunkt,
S.UtskrTidpunkt,
S.AvdNamn,
S.Sjukhus
FROM SIR_BASDATA S
WHERE ((S.AvdNamn NOT IN (
'S-CIVA',
'S-NIVA',
'KS/THIVA',
'KS ECMO',
'Astrid Lindgren',
'IVA Lund',
'Lund - BIVA',
'Lund - NIVA',
'Linköping',
'Linköping NIVA',
'Linköping BRIVA',
'SU/NIVA',
'SU/CIVA',
'SU/TIVA',
'Umeå IVA',
'Umeå - Thorax',
'Uppsala',
'Uppsala BRIVA',
'Uppsala TIVA',
'Uppsala BIVA',
'Uppsala NIVA'
)))
),
-- Match SIR admits with PAR admits at tertiary centers
-- Creates 3 columns:
-- SIR_PAR_OFFSET, the number of calendar days between SIR dsc and PAR adm
-- SIR_PAR_OFFSET, the same as above but truncated at (-1,1) with the
-- additional rule that admits the day after SIR dsc must have SIR dsc in the evening
-- and vice versa.
-- OREBRO_INTERNAL, since Örebro acts both as a primary and tertiary center,
-- depending on case severity, "internal" matches between
-- IVAUSÖ and Örebro in PAR are flagged
ICU_ADMISSIONS_MATCHED_WITH_PAR AS (
SELECT
S.VtfId_LopNr,
P.HADM_ID,
S.LopNr,
S.InskrTidpunkt,
S.UtskrTidpunkt,
S.AvdNamn,
P.INDATUM,
P.UTDATUM,
P.SJUKHUS,
P.MVO,
S.UtskrTidpunkt/86400 - P.INDATUM AS SIR_PAR_OFFSET,
CASE
WHEN S.UtskrTidpunkt/86400 - P.INDATUM = 0 THEN 0
WHEN S.UtskrTidpunkt/86400 - P.INDATUM = 1
AND strftime('%H', datetime(S.UtskrTidpunkt, 'unixepoch')) BETWEEN '00' AND '06' THEN 1
WHEN S.UtskrTidpunkt/86400 - P.INDATUM = -1
AND strftime('%H', datetime(S.UtskrTidpunkt, 'unixepoch')) BETWEEN '18' AND '23' THEN -1
ELSE NULL
END AS SIR_PAR_OFFSET_TIGHT,
CASE
WHEN S.AvdNamn IN ('IVAUSÖ', 'Örebro - Thorax') AND P.SJUKHUS = 55010 THEN 1 ELSE 0 END AS OREBRO_INTERNAL
FROM PR_ICU_ADMISSIONS S
LEFT JOIN PAR_HADM P ON S.LopNr == P.LopNr
),
----------------------------------------------------------------------------
-- Creation of CTE's for each diagnosis
------------------------------------------------------------------------------
-- Each of the diagnosis-groups is subdivided into a CTE called {dx}
------------------------------------------------------------------------------
-- aSAH ----------------------------------------------------------------------
asah AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
LEFT JOIN
DORS D ON P.LopNr = D.LopNr
WHERE (
(
(P.Diagnos LIKE "I60%" -- Note the placement of the wildcard, i.e. the regex will search for the main diagnosis
OR ((P.Op LIKE "%AAC00%" OR P.Op LIKE "%AAL00%") AND P.Diagnos NOT LIKE "I671%")) -- I671 is NOT included
AND P.Diagnos NOT LIKE "%S06%")
OR (P.Diagnos LIKE "I60%" AND P.Diagnos LIKE "%S06%" AND (P.Op LIKE "%AAC00%" OR P.Op LIKE "%AAL00%"))
-- AND P.Diagnos NOT LIKE "%Q28%" There are tens of cases where Q28 is a 2nd dx, I60 a first and patient coiled,
-- AND P.Diagnos NOT LIKE "I671%" Hey, if you are sick enough to get admitted to an ICU, it is still an emergency
--)
OR
(
(D.Ulorsak LIKE "I60%")
AND JULIANDAY(strftime('%Y-%m-%d', substr(D.DODSDAT, 1, 4) || '-' || substr(D.DODSDAT, 5, 2) || '-' || substr(D.DODSDAT, 7, 2) || ' 00:00:00')) - JULIANDAY(date(P.INDATUM * 86400, 'unixepoch')) <= 30
AND P.Diagnos NOT LIKE "%S06%"
AND P.Diagnos NOT LIKE "%Q28%"
)
)
),
-- Non-ruptured aneurysm -----------------------------------------------------
ane AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
P.Diagnos LIKE "I671%"
AND
P.Diagnos NOT LIKE "%I60%" -- To decrease risk of mixing in asah
),
------------------------------------------------------------------------------
-- TBI -----------------------------------------------------------------------
tbi AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
((P.Diagnos LIKE "S06%") OR
((P.Diagnos LIKE "S020%" OR
P.Diagnos LIKE "S021%" OR
P.Diagnos LIKE "S028%" OR
P.Diagnos LIKE "S029%" OR
P.Diagnos LIKE "S071%" OR
P.Diagnos LIKE "S04%" OR
P.Diagnos LIKE "S09%" OR
P.Diagnos LIKE "S12%"))) -- Previously had "AND (P.Diagnos LIKE "%S06%")""
),
------------------------------------------------------------------------------
-- Cerebral venous thrombosis ------------------------------------------------
cvt AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(P.Diagnos LIKE "G08%"
OR P.Diagnos LIKE "I676%"
OR P.Diagnos LIKE "I636%"
OR P.Diagnos LIKE "O225%"
OR P.Diagnos LIKE "O873%")
-- Some aSAH that fulfill the above criteria will need to be filtered out as such:
AND (P.Op NOT LIKE "%AAC00%")
AND (P.Op NOT LIKE "%AAL00%")
AND (P.Diagnos NOT LIKE "%I60%")
),
------------------------------------------------------------------------------
-- ICH -----------------------------------------------------------------------
ich AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(P.Diagnos LIKE "I61%")
-- A few likely aSAH that fulfill the above criteria will need to be filtered out as such:
AND (P.Op NOT LIKE "%AAC00%")
AND (P.Op NOT LIKE "%AAL00%")
),
------------------------------------------------------------------------------
-- AVM -----------------------------------------------------------------------
avm AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(P.Diagnos LIKE "Q28%")
AND (P.Op NOT LIKE "%AAL00%")
AND (P.Op NOT LIKE "%AAC00%")
),
------------------------------------------------------------------------------
-- Acute ischemic stroke -----------------------------------------------------
ais AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(P.Diagnos LIKE "I63%")
AND (P.Diagnos NOT LIKE "I636%")
-- a few likely aSAH that fulfill the above criteria will need to be filtered out as such:
AND (P.Op NOT LIKE "%AAC00%")
AND (P.Op NOT LIKE "%AAL00%")
),
------------------------------------------------------------------------------
-- Acute bacterial meningitis ------------------------------------------------
abm AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(
P.Diagnos LIKE "G00%" OR
P.Diagnos LIKE "A390%" OR
-- Also include intracranial abcess and "abscess i skalle eller ryggradskanal"
P.Diagnos LIKE "G06%" OR
P.Diagnos LIKE "G039%" -- also include Meningitis uns, again, if they are sick enough to be in the icu...
)
),
------------------------------------------------------------------------------
-- Encephalitis --------------------------------------------------------------
ence AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(
P.Diagnos LIKE "G04%" OR
P.Diagnos LIKE "G05%" OR
P.Diagnos LIKE "B004%" OR
P.Diagnos LIKE "B020%" OR
P.Diagnos LIKE "A841%"
)
),
------------------------------------------------------------------------------
-- Status epilepticus --------------------------------------------------------
se AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(
P.Diagnos LIKE "G41%" OR
-- also include epilepsy
P.Diagnos LIKE "G40%"
)
),
------------------------------------------------------------------------------
-- "Isolated" cervical spine frx ---------------------------------------------
cfx AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(
P.Diagnos LIKE 'S12%' OR
P.Diagnos LIKE 'S13%' OR
P.Diagnos LIKE 'S14%'
)
AND P.Diagnos NOT LIKE '%S06%'
),
------------------------------------------------------------------------------
-- SDH -----------------------------------------------------------------------
-- Turns out there are loads of I62 (non traumatic SDH), many get evacuated,
-- some have a traumatic sdh as secondary dx... let's pick the "clean"
-- Most get AD005 (evac chron sdh) or AD010 (evac acute sdh).
-- Not sure it's reasonable to "split" on that.
sdh AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(
P.Diagnos LIKE 'I62%'
)
AND P.Diagnos NOT LIKE '%S06%'
--- exclude a few asah
AND P.Op NOT LIKE "%AAL00%"
AND P.Op NOT LIKE "%AAC00%"
),
------------------------------------------------------------------------------
-- "Isolated" hydrocephalus (shunt dysfunctions et al) -----------------------
hc AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(
P.Diagnos LIKE 'G91%'
)
-- several likely aSAH that fulfill the above criteria will need to be filtered out as such:
AND (P.Op NOT LIKE "%AAC00%")
AND (P.Op NOT LIKE "%AAL00%")
),
------------------------------------------------------------------------------
-- Tumours ------------------------------------------------------------------
tum AS (
SELECT
P.HADM_ID
FROM
PAR_HADM P
WHERE
(
P.Diagnos LIKE 'D43%'
OR P.Diagnos LIKE 'C71%'
OR P.Diagnos LIKE 'C793%'
OR P.Diagnos LIKE 'D33%'
)
--- exclude a few asah
AND P.Op NOT LIKE "%AAL00%"
AND P.Op NOT LIKE "%AAC00%"
),
------------------------------------------------------------------------------
-- CTE DX:
-- Finds the primary diagnostic group for each entry in PAR_HADM_CONT_DATES.
-- Note that one HADM could theoretically have several diagnostic criteria fulfilled (although
-- it is rare since the criteria are designed to define a PRIMARY diagnosis.)
-- In practice, this only happens for patients that are diagnosed with
-- ICH and die within 30 days and have their case of death assigned to ASAH
-- This CTE will try to assign a PAR HADM with one main diagnosis.
-- The CASE WHEN statement will identify the FIRST fulfilled diagnostic criteria only.
-- Also note, that this CTE does not identify ANY secondary diagnosis.
------------------------------------------------------------------------------
DX AS (
SELECT DISTINCT
P.HADM_ID,
P.CONT_HADM_ID,
P.LopNr,
P.Diagnos,
CASE
WHEN P.HADM_ID IN (SELECT HADM_ID FROM asah) THEN 'ASAH'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM ane) THEN 'ANE'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM avm) THEN 'AVM'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM ich) THEN 'ICH'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM tbi) THEN 'TBI'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM ais) THEN 'AIS'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM abm) THEN 'ABM'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM cvt) THEN 'CVT'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM ence) THEN 'ENC'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM se) THEN 'SEP'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM cfx) THEN 'CFX'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM sdh) THEN 'SDH'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM hc) THEN 'HC'
WHEN P.HADM_ID IN (SELECT HADM_ID FROM tum) THEN 'TUM'
ELSE 'OTHER'
END AS DX_GROUP
FROM
PAR_HADM_CONT P
),
ICU_ADMISSIONS_MATCHED_WITH_PAR_WITH_DX AS (
SELECT
T.*,
D.DX_GROUP
FROM ICU_ADMISSIONS_MATCHED_WITH_PAR T
LEFT JOIN DX D ON T.HADM_ID = D.HADM_ID
),
-- PR_ICU_ADMISSIONS_MATCHED_WITH_PAR_WITH_DX_TIME_HIERARCHY helps resolving tie situations
-- where one SIR admission is associated with several PAR admissions.
-- In this case the earliest (within the time window) admissions is chosen,
-- if there still is a tie a hierarchical ordering of diagnosis will choose one admission only
ICU_ADMISSIONS_MATCHED_WITH_PAR_WITH_DX_RANKED AS (
SELECT *,
CASE DX_GROUP
WHEN 'ASAH' THEN 1
WHEN 'ANE' THEN 2
WHEN 'AVM' THEN 3
WHEN 'ICH' THEN 4
WHEN 'TBI' THEN 5
WHEN 'AIS' THEN 6
WHEN 'ABM' THEN 7
WHEN 'CVT' THEN 8
WHEN 'ENC' THEN 9
WHEN 'SEP' THEN 10
WHEN 'CFX' THEN 11
WHEN 'SDH' THEN 12
WHEN 'HC' THEN 13
WHEN 'TUM' THEN 14
ELSE 15
END AS DX_RANK
FROM ICU_ADMISSIONS_MATCHED_WITH_PAR_WITH_DX
)
,
TRANSFERS AS (
SELECT
S.VtfId_LopNr,
S.HADM_ID AS TERTIARY_HADM_ID,
S.LopNr,
S.SIR_PAR_OFFSET,
S.SIR_PAR_OFFSET_TIGHT,
S.OREBRO_INTERNAL,
S.DX_GROUP,
S.DX_RANK
FROM ICU_ADMISSIONS_MATCHED_WITH_PAR_WITH_DX_RANKED S
),
--------------------------------------------------------------------------------
-- DESCRIPTIVE_SIR collects data from several SIR tables (including SAPS, SOFA)
-- joined on SIR admission ID
--------------------------------------------------------------------------------
DESCRIPTIVE_SIR AS (
SELECT
-----------------------------------------
------------- DEMOGRAPHICS --------------
-----------------------------------------
S.VtfId_LopNr,
S.AvdNamn AS sir_icu_name,
CASE S.SjukhusTyp
WHEN 'Länssjukhus' THEN 'Regional Hospital'
WHEN 'Länsdelssjukhu' THEN 'Community Hospital'
WHEN 'Regionsjukhus' THEN 'University Hospital'
END AS sir_hospital_type,
S.InskrTidpunkt AS sir_adm_time,
S.UtskrTidPunkt AS sir_dsc_time,
S.VardTidMinuter AS sir_total_time,
--- Height, weight, BMI ---
S.Lengd AS admission_height,
S.AnkIvaVikt AS admission_weight,
S.AnkIvaVikt / (S.Lengd * S.Lengd) AS BMI,
--- DNR orders (in the SIR_BEGRANSNINGAR VtfId_LopNr are only present if there is a DNR order) ---
CASE WHEN S.VtfId_LopNr IN (SELECT VtfId_LopNr FROM SIR_BEGRANSNINGAR) THEN 1 ELSE 0 END AS DNR,
--- After hours discharge ---
CASE
WHEN strftime('%H', datetime(S.UtskrTidPunkt, 'unixepoch')) IN ("8","9","10","11","12","13","14","15","16") THEN 1 ELSE 0 END AS icu_discharge_daytime,
CASE
WHEN strftime('%H', datetime(S.UtskrTidPunkt, 'unixepoch')) IN ("22","23","00","01","02","03","04","05","06") THEN 1 ELSE 0 END AS icu_discharge_nighttime,
CASE
WHEN (
strftime('%H', datetime(S.UtskrTidPunkt, 'unixepoch')) NOT IN ("8","9","10","11","12","13","14","15","16")
OR
strftime('%w', datetime(S.UtskrTidPunkt, 'unixepoch')) IN ("0","6")
)
THEN 1 ELSE 0 END AS icu_discharge_afterhours
,
--- After hours admit ---
CASE
WHEN strftime('%H', datetime(S.InskrTidpunkt, 'unixepoch')) IN ("8","9","10","11","12","13","14","15","16") THEN 1 ELSE 0 END AS icu_admit_daytime,
CASE
WHEN strftime('%H', datetime(S.InskrTidpunkt, 'unixepoch')) IN ("22","23","00","01","02","03","04","05","06") THEN 1 ELSE 0 END AS icu_admit_nighttime,
CASE
WHEN (
strftime('%H', datetime(S.InskrTidpunkt, 'unixepoch')) NOT IN ("8","9","10","11","12","13","14","15","16")
OR
strftime('%w', datetime(S.InskrTidpunkt, 'unixepoch')) IN ("0","6")
)
THEN 1 ELSE 0 END AS icu_admit_afterhours
,
-----------------------------------------
--- PHYSIOLOGY AND SEVERITY OF ILLNESS---
-----------------------------------------
--- Conciousness level ---
-- Get SAPS3 values
SAPS.SAPS3_GCS AS SAPS_GCS,
SAPS.SAPS3_GCS_Motorik AS SAPS_GCSm,
SAPS.SAPS3_RLS85 AS SAPS_RLS85,
-- Get worst SOFA conciousness values --
SOFA.MAX_RLS85 as SOFA_worst_RLS85,
SOFA.MIN_GCS as SOFA_worst_GCS,
SOFA.MIN_GCS_Motorik as SOFA_worst_GCSm,
-- Overall worst conciousness recorded in either SOFA or SAPS
CASE
WHEN SAPS.SAPS3_RLS85 IS NULL AND SOFA.MAX_RLS85 IS NULL THEN NULL
WHEN SAPS.SAPS3_RLS85 IS NULL THEN SOFA.MAX_RLS85
WHEN SOFA.MAX_RLS85 IS NULL THEN SAPS.SAPS3_RLS85
ELSE MAX(SAPS.SAPS3_RLS85, SOFA.MAX_RLS85)
END AS overall_worst_RLS85,
CASE
WHEN SAPS.SAPS3_GCS IS NULL AND SOFA.MIN_GCS IS NULL THEN NULL
WHEN SAPS.SAPS3_GCS IS NULL THEN SOFA.MIN_GCS
WHEN SOFA.MIN_GCS IS NULL THEN SAPS.SAPS3_GCS
ELSE MIN(SAPS.SAPS3_GCS, SOFA.MIN_GCS)
END AS overall_worst_GCS,
CASE
WHEN SAPS.SAPS3_GCS_Motorik IS NULL AND SOFA.MIN_GCS_Motorik IS NULL THEN NULL
WHEN SAPS.SAPS3_GCS_Motorik IS NULL THEN SOFA.MIN_GCS_Motorik
WHEN SOFA.MIN_GCS_Motorik IS NULL THEN SAPS.SAPS3_GCS_Motorik
ELSE MIN(SAPS.SAPS3_GCS_Motorik, SOFA.MIN_GCS_Motorik)
END AS overall_worst_GCSm,
-- Define if obtunded as per SAPS3 or SAPS3 and SOFA recording
CASE
WHEN ((SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS != 15) OR (SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 != 1)) THEN 1
WHEN (SAPS.SAPS3_GCS IS NULL AND SAPS.SAPS3_RLS85 IS NULL) THEN NULL
ELSE 0
END AS SAPS_obtunded,
CASE
WHEN (
(SOFA.MIN_GCS IS NOT NULL AND SOFA.MIN_GCS != 15) OR
(SOFA.MAX_RLS85 IS NOT NULL AND SOFA.MAX_RLS85 != 1) OR
(SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS != 15) OR
(SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 != 1)
) THEN 1
WHEN (SAPS.SAPS3_GCS IS NULL AND SAPS.SAPS3_RLS85 IS NULL AND SOFA.MIN_GCS IS NULL AND SOFA.MAX_RLS85 IS NULL) THEN NULL
ELSE 0
END AS overall_obtunded,
-- Define unconciousness as per SAPS3 or SAPS3 and SOFA recording
-- Note that a few ICUs report both RLS and GCS for some patients, this means that simple checks
-- like number SAPS_RLS > 3 + SAPS_GCS <9 != SAPS_UNCONCIOUS will fail
CASE
WHEN ((SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS < 9) OR (SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 > 3)) THEN 1
WHEN (SAPS.SAPS3_GCS IS NULL AND SAPS.SAPS3_RLS85 IS NULL) THEN NULL
ELSE 0
END AS SAPS_unconcious,
CASE
WHEN (
(SOFA.MIN_GCS IS NOT NULL AND SOFA.MIN_GCS < 9) OR
(SOFA.MAX_RLS85 IS NOT NULL AND SOFA.MAX_RLS85 > 3) OR
(SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS < 9) OR
(SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 > 3)
) THEN 1
WHEN (SAPS.SAPS3_GCS IS NULL AND SAPS.SAPS3_RLS85 IS NULL AND SOFA.MIN_GCS IS NULL AND SOFA.MAX_RLS85 IS NULL) THEN NULL
ELSE 0
END AS overall_unconcious,
CASE
WHEN (SAPS.SAPS3_RLS85 IS NULL AND SAPS.SAPS3_GCS IS NULL) THEN NULL
WHEN ((SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 IN ('1', '2'))
OR (SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS IN ('13', '14', '15'))) THEN "I (GCS ≥13)"
WHEN ((SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 IN ('3', '4'))
OR (SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS IN ('7', '8', '9', '10', '11', '12'))) THEN "II (GCS 7-12)"
WHEN ((SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 IN ('5'))
OR (SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS IN ('6'))) THEN "III (GCS 6)"
WHEN ((SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 IN ('6'))
OR (SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS IN ('5'))) THEN "IV (GCS 5)"
WHEN ((SAPS.SAPS3_RLS85 IS NOT NULL AND SAPS.SAPS3_RLS85 IN ('7', '8'))
OR (SAPS.SAPS3_GCS IS NOT NULL AND SAPS.SAPS3_GCS IN ('3', '4'))) THEN "V (GCS ≤4)"
ELSE 0
END AS sir_consciousness_level,
--- Assisted ventilation ---
CASE
WHEN SAPS.SAPS3_Ventilator = "Ja" THEN 1
WHEN SAPS.SAPS3_Ventilator IS NULL THEN NULL
ELSE 0 END as SAPS_AMV,
CASE
WHEN S.VtfId_LopNr IN (SELECT VtfId_LopNr FROM SIR_ATGARDER WHERE KvaKod = "DG021") THEN 1 ELSE 0 END AS KVA_IMV,
CASE
WHEN S.VtfId_LopNr IN (SELECT VtfId_LopNr FROM SIR_ATGARDER WHERE KvaKod = "DG023") THEN 1 ELSE 0 END AS KVA_NIV,
CASE
WHEN
SAPS.SAPS3_Ventilator = "Ja"
OR
S.VtfId_LopNr IN (SELECT VtfId_LopNr FROM SIR_ATGARDER WHERE KvaKod IN ('DG021', 'DG023'))
THEN 1 ELSE 0 END AS any_AMV,
--- Respiratory status ---
-- per SAPS3 --
SAPS.SAPS3_PaO2 / (SAPS.SAPS3_FiO2 / 100) as SAPS_PFI,
SAPS.SAPS3_PaO2 as SAPS_PAO2,
CASE
WHEN SAPS.SAPS3_PaO2 IS NULL THEN NULL
WHEN SAPS.SAPS3_PaO2 < 8 THEN 1
WHEN SAPS.SAPS3_PaO2 >= 8 THEN 0
END AS SAPS_hypoxia,
-- ARDS criteria per SAPS3, conditional assisted mechanical ventilation --
CASE
WHEN SAPS.SAPS3_PaO2 / (SAPS.SAPS3_FiO2 / 100) < 26.6 AND SAPS.SAPS3_Ventilator = 'Ja' AND SAPS.SAPS3_PaO2 / (SAPS.SAPS3_FiO2 / 100) IS NOT NULL THEN 1
WHEN SAPS.SAPS3_PaO2 / (SAPS.SAPS3_FiO2 / 100) OR SAPS.SAPS3_Ventilator IS NULL THEN NULL
ELSE 0
END AS ARDS,
--
CASE
WHEN S.VtfId_LopNr IN (
SELECT VtfId_LopNr
FROM SIR_DIAGNOSER
WHERE ICD10 LIKE 'J96%'
OR ICD10 LIKE 'J80%')
OR
S.VtfId_LopNr IN (
SELECT VtfId_LopNr
FROM SIR_ATGARDER
WHERE KvaKod IN ("GAA10", "TGA35", "TGA30", 'UGC12')
)
THEN 1 ELSE 0 END AS respiratory_instability_markers,
--- Cardiovascular ---
-- SAPS3 min SBP
SAPS.SAPS3_SystBTMin as SAPS_min_SBP,
-- SAPS3 max HR
SAPS.SAPS3_HjartfrekvMax as SAPS_max_HR,
-- SAPS3 tachycardia HR > 110
CASE
WHEN SAPS.SAPS3_HjartfrekvMax > 110 THEN 1
WHEN SAPS.SAPS3_HjartfrekvMax IS NULL THEN NULL
ELSE 0 END AS SAPS_tachycardia,
-- SAPS3 bradycardia HR < 50
CASE
WHEN SAPS.SAPS3_HjartfrekvMax < 50 THEN 1
WHEN SAPS.SAPS3_HjartfrekvMax IS NULL THEN NULL
ELSE 0 END AS SAPS_bradycardia,
-- SAPS3 hypotension SBP <90 --
CASE
WHEN SAPS.SAPS3_SystBTMin < 90 THEN 1
WHEN SAPS.SAPS3_SystBTMin IS NULL THEN NULL
ELSE 0 END AS SAPS_hypotension,
-- SAPS3 hypertension SBP >180 --
CASE
WHEN SAPS.SAPS3_SystBTMin > 180 THEN 1
WHEN SAPS.SAPS3_SystBTMin IS NULL THEN NULL
ELSE 0 END AS SAPS_hypertension,
-- ICD10 and KVÅ makers for CV instability -
CASE
WHEN S.VtfId_LopNr IN (
SELECT VtfId_LopNr
FROM SIR_DIAGNOSER
WHERE ICD10 LIKE 'I46%'
OR ICD10 LIKE 'I490%'
OR ICD10 LIKE 'I47%'
OR ICD10 LIKE 'I21%'
OR ICD10 LIKE 'R57%'
OR ICD10 LIKE 'I71%'
OR ICD10 LIKE 'I441%'
OR ICD10 LIKE 'I442%'
OR ICD10 LIKE 'I26%'
OR ICD10 LIKE 'I31%'
OR ICD10 LIKE 'I42%'
OR ICD10 LIKE 'I50%'
)
OR
S.VtfId_LopNr IN (
SELECT VtfId_LopNr
FROM SIR_ATGARDER
WHERE KvaKod IN ('SQ351','SS199','DF025', 'DF027', 'DF028', 'FPE96', 'TFE00')
)
THEN 1
ELSE 0
END AS hemodynamic_instability_markers,
CASE
WHEN S.VtfId_LopNr IN (
SELECT VtfId_LopNr
FROM SIR_SOFA
WHERE Noradrenalin = "> 0,1"
) THEN 1 ELSE 0 END AS SOFA_high_norepi_dose,
--- General SAPS3 data ---
SAPS.SAPS3_Score as SAPS_total_score,
SAPS.SAPS3_pHMin as SAPS_min_pH,
SAPS.SAPS3_KroppstempMax as SAPS_max_temp,
--- SAPS 3 acidosis (pH <7.25) ---
CASE
WHEN SAPS.SAPS3_pHMin < 7.25 THEN 1
WHEN SAPS.SAPS3_pHMin IS NULL THEN NULL
ELSE 0 END AS SAPS_acidosis,
--- SAPS 3 hypothermia (t <35)---
CASE
WHEN SAPS.SAPS3_KroppstempMax < 35 THEN 1
WHEN SAPS.SAPS3_KroppstempMax IS NULL THEN NULL
ELSE 0 END AS SAPS_hypothermia
FROM SIR_BASDATA S
LEFT JOIN SIR_SAPS3 SAPS on S.VtfId_LopNr= SAPS.VtfId_LopNr
LEFT JOIN (
SELECT
VtfId_LopNr,
MAX(RLS85) as MAX_RLS85,
MIN(GCS_Motorik + GCS_Ogon + GCS_Verbal) as MIN_GCS,
MIN(GCS_Motorik) as MIN_GCS_Motorik
FROM SIR_SOFA
GROUP BY VtfId_LopNr
) AS SOFA on S.VtfId_LopNr = SOFA.VtfId_LopNr
LEFT JOIN DORS DO ON S.LopNr = DO.LopNr
),
--------------------------------------------------------------------------------
-- DESCRIPTIVE_PAR collects data from PAR and DORS (data on death date) on PAR_HADM id and in
-- the case of death date data on patient ID + admission date in PAR.
--------------------------------------------------------------------------------
DESCRIPTIVE_PAR AS (
SELECT
P.HADM_ID,
P.Alder AS age,
CASE P.Sjukhus
WHEN '11001' THEN 'Karolinska universitetssjukhuset, Solna'
WHEN '11003' THEN 'Karolinska universitetssjukhuset, Solna'
WHEN '51001' THEN 'Sahlgrenska universitetssjukhuset'
WHEN '12001' THEN 'Akademiska sjukhuset'
WHEN '21001' THEN 'Universitetssjukhuset i Linköping'
WHEN '64001' THEN 'Norrlands universitetssjukhus'
WHEN '41001' THEN 'Universitetssjukhuset i Lund'
WHEN '41002' THEN 'Universitetssjukhuset i Lund'
WHEN '55010' THEN 'Universitetssjukhuset i Örebro'
ELSE P.Sjukhus -- If none of the above cases match, keep the original value
END AS par_tertiary_center,
CASE WHEN P.Kon = '1' THEN 0 ELSE 1 END AS sex_female
FROM PAR_HADM P
),
PROCESSED_DORS AS (
SELECT
D.LopNr,
-- Keep the original date as a string but remove trailing zeroes to enable
-- future parsing as YMD YM and Y
CASE
WHEN SUBSTR(D.DODSDAT, -4, 2) = '00' THEN SUBSTR(D.DODSDAT, 1, 4)
WHEN SUBSTR(D.DODSDAT, -4, 2) != '00' AND SUBSTR(D.DODSDAT, -2) = '00' THEN SUBSTR(D.DODSDAT, 1, 6)
ELSE D.DODSDAT END AS DODSDAT_CLEAN,
-- Keep correct dates only
CASE
WHEN SUBSTR(D.DODSDAT, -2) != '00'
THEN DATE(SUBSTR(D.DODSDAT, 1, 4) || '-' || SUBSTR(D.DODSDAT, 5, 2) || '-' || SUBSTR(D.DODSDAT, 7, 2))
ELSE NULL
END AS DODSDAT_DATE,
-- Keep all dates but for incorrect dates, round to the 1st day of the observation period
CASE
WHEN SUBSTR(D.DODSDAT, -4, 2) = '00'
THEN DATE(SUBSTR(D.DODSDAT, 1, 4) || '-01-01')
WHEN SUBSTR(D.DODSDAT, -4, 2) != '00' AND SUBSTR(D.DODSDAT, -2) = '00'
THEN DATE(SUBSTR(D.DODSDAT, 1, 4) || '-' || SUBSTR(D.DODSDAT, 5, 2) || '-01')
ELSE DATE(SUBSTR(D.DODSDAT, 1, 4) || '-' || SUBSTR(D.DODSDAT, 5, 2) || '-' || SUBSTR(D.DODSDAT, 7, 2))
END AS DODSDAT_ROUND_DOWN,
-- Keep all date but for incorrect dates, round up to the last day of observation period
CASE
WHEN SUBSTR(D.DODSDAT, -4, 4) = '0000'
THEN DATE(SUBSTR(D.DODSDAT, 1, 4) || '-12-31')
WHEN SUBSTR(D.DODSDAT, -4, 2) != '00' AND SUBSTR(D.DODSDAT, -2) = '00'
THEN DATE(SUBSTR(D.DODSDAT, 1, 4) || '-' || SUBSTR(D.DODSDAT, 5, 2) || '-01', 'start of month', '+1 month', '-1 day')
ELSE DATE(SUBSTR(D.DODSDAT, 1, 4) || '-' || SUBSTR(D.DODSDAT, 5, 2) || '-' || SUBSTR(D.DODSDAT, 7, 2))
END AS DODSDAT_ROUND_UP,
-- Keep all dates but for incorrect dates, round to july 1st for dates with
-- YYYY0000 and to the 15th of the month for dates with YYYYMM00
CASE
WHEN SUBSTR(D.DODSDAT, -4, 2) = '00'
THEN DATE(SUBSTR(D.DODSDAT, 1, 4) || '-07-01')
WHEN SUBSTR(D.DODSDAT, -4, 2) != '00' AND SUBSTR(D.DODSDAT, -2) = '00'
THEN DATE(SUBSTR(D.DODSDAT, 1, 4) || '-' || SUBSTR(D.DODSDAT, 5, 2) || '-15')
ELSE DATE(SUBSTR(D.DODSDAT, 1, 4) || '-' || SUBSTR(D.DODSDAT, 5, 2) || '-' || SUBSTR(D.DODSDAT, 7, 2))
END AS DODSDAT_ROUND_MID,
-- Add a flag for date with an error in formatting
CASE WHEN SUBSTR(D.DODSDAT, -2) = '00' THEN 1 ELSE 0 END AS ERROR_DATE
FROM DORS D
LEFT JOIN DORS_AVI DA ON D.LopNr = DA.LopNr
),
-- DAOH_STEP_1_90, DAOH_STEP_2_90, DAOH_STEP_3_90 and finally DAOH_90 calculates "Days Alive and Out of Hospital",
-- i.e. the number of hospital free days alive within a 90 day time period starting from a given ICU admission in SIR
-- The first 3 steps are cumbersome preprocessing steps
-- STEP 1: Joins the PAR admissions on all SIR admissions for a given patient, some "helper columns" such as SIR adm date + 89 days are created
-- PAR admissions that do not overlap the SIR admission date or +89 date are discarded in this step.
-- STEP 2: Works per SIR ICU admission over all joined rows with unique PAR admissions. Here the time-cohorent PAR admissions will be
-- grouped together and given a common index. The code should be able to handle the odd overlapping admissions (eg. cases where there are
-- admissions ranging from Unix epoch days 17000-17030, 17010-17020, 17030-17031). Also if a patient has a PAR discharge on day D and
-- an admission on day D + 1, it will be counted as a continious admission, allowing for some clerical error and clinic transfers around midnight.
-- While it is possible that the patient actually was at her home, the significance of such a short stay at home should not be overestimated.
-- STEP 3: This "caps" the PAR admission dates within the bounds of the SIR admission date + 89 days. Next, all days admitted are summed.
-- DAOH_90: Finally, in DAOH_90 the number of DAOH_90 are calculated. If the patient dies within 90 days, DAOH_90 is set to zero.
DAOH_90_STEP_1 AS (
SELECT
S.LopNr,
S.VtfId_LopNr,
S.InskrTidPunkt,
S.InskrTidPunkt / 86400 as S_INDATUM,
-- Add end date for DAOH-90
S.InskrTidPunkt / 86400 + 89 as S_INDATUM_90,
S.AvdNamn,
P.INDATUM,
P.UTDATUM,
P.HADM_ID,
P.Diagnos,
P.Op,
P.MVO,
P.Sjukhus
FROM SIR_BASDATA S
LEFT JOIN PAR_HADM_ALL P on S.LopNr = P.LopNr
-- Keep only joined PAR admits with discharge on the same day or later as ICU admit
WHERE S.InskrTidPunkt / 86400 - P.UTDATUM <= 0
-- Keep only joined PAR admimts with admission date within the DAOH-90 end date (inclusive of end date)
AND (S.InskrTidPunkt / 86400 + 89) - P.INDATUM >= 0
),
DAOH_90_STEP_2 AS (
SELECT *,
SUM(new_admission) OVER (PARTITION BY VtfId_LopNr ORDER BY INDATUM) AS ADM_GROUP
FROM (
SELECT *,
CASE
-- "+ 1" allows to group admissions where the first ends at day D and the second starts at D + 1
WHEN INDATUM <= MAX(prev_UTDATUM) OVER (PARTITION BY VtfId_LopNr ORDER BY INDATUM ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + 1
THEN 0
ELSE 1
END AS new_admission
-- This part gets the latest discharge date among all previous admissions for a given patient (technically: for a given ICU admit)
-- This is necessary because due to overlapping admissions, the preceding row could represent an overlapping admission that ends well before the one before it
FROM (
SELECT *,
MAX(UTDATUM) OVER (PARTITION BY VtfId_LopNr ORDER BY INDATUM ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_UTDATUM
FROM DAOH_90_STEP_1
)
)
),
-- WITHIN VtfId_LopNr, merge dates for all groups of par admissions. Also HOSPITAL_LOS is calculated for the respective ADM_GROUP.
-- To clean up cases where patients die before PAR discharge, DORS is joined and the MIN(discharge date, dod) is chosen,
-- thus if a patient dies before their discharge their day of death will be chosen for the calculaton of hospital LOS.
DAOH_90_STEP_3 AS (
SELECT
D.VtfId_LopNr,
D.LopNr,
D.InskrTidPunkt,
D.AvdNamn,
D.S_INDATUM,
D.S_INDATUM_90,
D.INDATUM,
D.UTDATUM,
D.HADM_ID,
D.UTDATUM - D.S_INDATUM + 1 as HOSPITAL_LOS,
MIN(
D.UTDATUM,
(julianday(SUBSTR(DO.DODSDAT, 1, 4) || '-' ||
SUBSTR(DO.DODSDAT, 5, 2) || '-' ||
SUBSTR(DO.DODSDAT, 7, 2)) - julianday('1970-01-01')))
AS PROPER_UTDATUM,
(julianday(SUBSTR(DO.DODSDAT, 1, 4) || '-' ||
SUBSTR(DO.DODSDAT, 5, 2) || '-' ||
SUBSTR(DO.DODSDAT, 7, 2)) - julianday('1970-01-01')) AS dod,
MIN(
D.UTDATUM,
(julianday(SUBSTR(DO.DODSDAT, 1, 4) || '-' ||
SUBSTR(DO.DODSDAT, 5, 2) || '-' ||
SUBSTR(DO.DODSDAT, 7, 2)) - julianday('1970-01-01')))
AS PROPER_UTDATUM,
(julianday(SUBSTR(DO.DODSDAT, 1, 4) || '-' ||
SUBSTR(DO.DODSDAT, 5, 2) || '-' ||
SUBSTR(DO.DODSDAT, 7, 2)) - julianday('1970-01-01')) - D.S_INDATUM + 1 AS HOSPITAL_LOS_ALIVE,
D.ADM_GROUP,
MAX(MIN(D.INDATUM), D.S_INDATUM) AS min_INDATUM,
MIN(MAX(D.UTDATUM), D.S_INDATUM_90) AS max_UTDATUM, -- Cap max_UTDATUM at S_INDATUM_90
MIN(MAX(D.UTDATUM), D.S_INDATUM_90) - MAX(MIN(D.INDATUM), D.S_INDATUM) + 1 as ADMITTED_DAYS,
D.Diagnos,
D.Op,
D.MVO,
D.Sjukhus
FROM DAOH_90_STEP_2 D
LEFT JOIN DORS DO ON D.LopNr = DO.LopNr
GROUP BY VtfId_LopNr, ADM_GROUP
),
-- A CTE that gets the Hospital LOS for the relevant ADM GROUP above, i.e. the Hospital LOS from the ICU admission at hand
DAOH_H_LOS AS (
SELECT
VtfId_LopNr,
LopNr,
HOSPITAL_LOS,
HOSPITAL_LOS_ALIVE
FROM DAOH_90_STEP_3
WHERE ADM_GROUP == 1
),
DAOH_90 AS (
SELECT
VtfId_LopNr,
-- Calculate days_alive
JULIANDAY(strftime('%Y-%m-%d', substr(DO.DODSDAT, 1, 4) || '-' || substr(DO.DODSDAT, 5, 2) || '-' || substr(DO.DODSDAT, 7, 2) || ' 00:00:00'))
- JULIANDAY(date(D.InskrTidPunkt, 'unixepoch')) AS days_alive,
-- Calculate DAOH_90 using a CASE statement
CASE
WHEN JULIANDAY(strftime('%Y-%m-%d', substr(DO.DODSDAT, 1, 4) || '-' || substr(DO.DODSDAT, 5, 2) || '-' || substr(DO.DODSDAT, 7, 2) || ' 00:00:00'))
- JULIANDAY(date(D.InskrTidPunkt, 'unixepoch')) <= 90
THEN 0
ELSE 90 - SUM(ADMITTED_DAYS)
END AS DAOH_90
FROM DAOH_90_STEP_3 D
LEFT JOIN DORS DO ON D.LopNr = DO.LopNr
GROUP BY VtfId_LopNr
),
-- DAOH_180: repeat the same as above
DAOH_180_STEP_1 AS (
SELECT
S.LopNr,
S.VtfId_LopNr,
S.InskrTidPunkt,
S.InskrTidPunkt / 86400 as S_INDATUM,
-- Add end date for DAOH-180
S.InskrTidPunkt / 86400 + 179 as S_INDATUM_180,
P.INDATUM,
P.UTDATUM
FROM SIR_BASDATA S
LEFT JOIN PAR_HADM_ALL P on S.LopNr = P.LopNr
-- Keep only joined PAR admits with discharge on the same day or later as ICU admit
WHERE S.InskrTidPunkt / 86400 - P.UTDATUM <= 0
-- Keep only joined PAR admimts with admission date within the DAOH-90 end date (inclusive of end date)
AND (S.InskrTidPunkt / 86400 + 179) - P.INDATUM >= 0
),
DAOH_180_STEP_2 AS (
SELECT *,
SUM(new_admission) OVER (PARTITION BY VtfId_LopNr ORDER BY INDATUM) AS ADM_GROUP
FROM (
SELECT *,
CASE
-- "+ 1" allows to group admissions where the first ends at day D and the second starts at D + 1
WHEN INDATUM <= MAX(prev_UTDATUM) OVER (PARTITION BY VtfId_LopNr ORDER BY INDATUM ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + 1
THEN 0
ELSE 1
END AS new_admission
-- This part gets the latest discharge date among all previous admissions for a given patient (technically: for a given ICU admit)
-- This is necessary because due to overlapping admissions, the preceding row could represent an overlapping admission that ends well before the one before it
FROM (
SELECT *,
MAX(UTDATUM) OVER (PARTITION BY VtfId_LopNr ORDER BY INDATUM ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_UTDATUM
FROM DAOH_180_STEP_1
)
)
),
-- WITHIN VtfId_LopNr, merge dates for all groups of par admissions
DAOH_180_STEP_3 AS (
SELECT
VtfId_LopNr,
LopNr,
InskrTidPunkt,
S_INDATUM,
S_INDATUM_180,
ADM_GROUP,
MAX(MIN(INDATUM), S_INDATUM) AS min_INDATUM,
MIN(MAX(UTDATUM), S_INDATUM_180) AS max_UTDATUM, -- Cap max_UTDATUM at S_INDATUM_180
MIN(MAX(UTDATUM), S_INDATUM_180) - MAX(MIN(INDATUM), S_INDATUM) + 1 as ADMITTED_DAYS
FROM DAOH_180_STEP_2
GROUP BY VtfId_LopNr, ADM_GROUP
),
DAOH_180 AS (
SELECT
VtfId_LopNr,
-- Calculate days_alive
JULIANDAY(strftime('%Y-%m-%d', substr(DO.DODSDAT, 1, 4) || '-' || substr(DO.DODSDAT, 5, 2) || '-' || substr(DO.DODSDAT, 7, 2) || ' 00:00:00'))
- JULIANDAY(date(D.InskrTidPunkt, 'unixepoch')) AS days_alive,
-- Calculate DAOH_90 using a CASE statement
CASE
WHEN JULIANDAY(strftime('%Y-%m-%d', substr(DO.DODSDAT, 1, 4) || '-' || substr(DO.DODSDAT, 5, 2) || '-' || substr(DO.DODSDAT, 7, 2) || ' 00:00:00'))
- JULIANDAY(date(D.InskrTidPunkt, 'unixepoch')) <= 180
THEN 0
ELSE 180 - SUM(ADMITTED_DAYS)
END AS DAOH_180
FROM DAOH_180_STEP_3 D
LEFT JOIN DORS DO ON D.LopNr = DO.LopNr
GROUP BY VtfId_LopNr
),
FINAL AS (
SELECT
T.*,
P.*,
S.*,
S.SAPS_total_score - CASE
WHEN S.sir_consciousness_level = "I (GCS ≥13)" THEN 0
WHEN S.sir_consciousness_level = "II (GCS 7-12)" THEN 2
WHEN S.sir_consciousness_level = "III (GCS 6)" THEN 7
WHEN S.sir_consciousness_level = "IV (GCS 5)" THEN 10
WHEN S.sir_consciousness_level = "V (GCS ≤4)" THEN 15
ELSE NULL
END AS SAPS_score_minus_gcs,
S.SAPS_total_score - CASE
WHEN P.age < 40 THEN 0
WHEN P.age BETWEEN 40 AND 59 THEN 5
WHEN P.age BETWEEN 60 AND 69 THEN 9
WHEN P.age BETWEEN 70 AND 74 THEN 13
WHEN P.age BETWEEN 75 AND 79 THEN 15
WHEN P.age >= 80 THEN 18
ELSE NULL
END AS SAPS_score_minus_age,
S.SAPS_total_score - CASE -- Age points
WHEN P.age < 40 THEN 0
WHEN P.age BETWEEN 40 AND 59 THEN 5
WHEN P.age BETWEEN 60 AND 69 THEN 9
WHEN P.age BETWEEN 70 AND 74 THEN 13
WHEN P.age BETWEEN 75 AND 79 THEN 15
WHEN P.age >= 80 THEN 18
ELSE NULL
END
- CASE -- GCS points
WHEN S.sir_consciousness_level = "I (GCS ≥13)" THEN 0
WHEN S.sir_consciousness_level = "II (GCS 7-12)" THEN 2
WHEN S.sir_consciousness_level = "III (GCS 6)" THEN 7
WHEN S.sir_consciousness_level = "IV (GCS 5)" THEN 10
WHEN S.sir_consciousness_level = "V (GCS ≤4)" THEN 15
ELSE NULL
END AS SAPS_score_minus_age_and_gcs,
DO.DODSDAT_ROUND_UP,
DA90.DAOH_90,
DA180.DAOH_180,
CASE
WHEN DATE(DO.DODSDAT_ROUND_UP)
BETWEEN DATE(S.sir_dsc_time, 'unixepoch')
AND DATE(S.sir_dsc_time, 'unixepoch', '+30 days')
THEN 1
ELSE 0
END AS MORTALITY_30D,
CASE
WHEN DATE(DO.DODSDAT_ROUND_UP)
BETWEEN DATE(S.sir_dsc_time, 'unixepoch')
AND DATE(S.sir_dsc_time, 'unixepoch', '+90 days')
THEN 1
ELSE 0
END AS MORTALITY_90D
FROM TRANSFERS T
LEFT JOIN DESCRIPTIVE_PAR P ON T.TERTIARY_HADM_ID = P.HADM_ID
LEFT JOIN DESCRIPTIVE_SIR S ON T.VtfId_LopNr = S.VtfId_LopNr
LEFT JOIN PROCESSED_DORS DO ON T.LopNr = DO.LopNr
LEFT JOIN DAOH_90 DA90 ON S.VtfId_LopNr = DA90.VtfId_LopNr
LEFT JOIN DAOH_180 DA180 ON S.VtfId_LopNr = DA180.VtfId_LopNr
)Parsing Weather Data
Weather data for the relevant airports were fetched from www.ogimet.net and parsed using the metar and ephem packages. The metar package parses METAR strings, and ephem derives the light conditions at the airport at each time point. This information was checked against the logic for HEMS operating minima published by EASA.
Code availability
All code necessary for parsing flight data of this format is freely available on https://github.com/johol355/hemspy.git
All code for case ascertainment is freely available on https://github.com/johol355/neuro-ascertainment/
The code for infering HEMS minima from the parsed data by
metarandephemcan be provided on request.