Extracting a Prediction Task
To easily extract prediction tasks from MEDS Data, we will use the ACES package. This package allows you to define simple configuration files that specify the inclusion/exclusion criteria for tasks you want to extract and have them be automatically extractable from MEDS data via a command line interface. See the ACES documentation for more information.
In this tutorial, we'll explore both the motivation behind the ACES language and how it works technically in the Jupyter notebook below. You can also check it out online on Google Colab or in our GitHub repository.
Now that we know how to convert a dataset into MEDS, let's do something useful with it! For this section of the tutorial, we'll demonstrate how to extract a set of labels for a prediction task for a health AI model.
What's the first step in extracting this prediction task cohort? Figuring out what we actually want our task to be!
We said at the start of this tutorial that our goal was to:
Identify patients who will have a long length of stay in the ICU
Simple, right? Well, not quite. This goal is clinically motivated, but vague from a computational and modeling standpoint. For whom exactly? How long is "long"? When should the prediction be made?
To answer these questions and operationalize this task definition, you must work with clinical and local dataset experts to understand whether a given patient should be included in the cohort and even what their label should be.
For this tutorial, we'll simulate a dialogue between yourself (acting in the role of the implementing AI developer) and a clinical and local dataset domain expert. The purpose of this simulated dialogue is to show (1) what kinds of questions to ask when refining your task definition and (2) how to translate answers to those questions into operationalized criteria on a task of interest. To make this clear, after each question and answer exchange, we'll curate our evolving "formal definition" of the task in a semi-structured format to help keep our thinking clear. Much like our "conceptual spec" turned out to be very similar to the actual MEDS Extract specification file, we'll see that our conceptual model here will help us define our true final task definition file, which will use the ACES language to make task extraction deterministic and reproducible.
Note 1: This is just a simulated dialogue, and is not intended to be an accurate reflection of how you really would or should work with clinical collaborators or what their opinions or answers might be. No clinical experts were consulted in making this tutorial, as we were constrained in what tasks can be used that permit meaningful prediction over the demo dataset in use.
Note 2: You do not need to understand the clinical context of this example to follow along -- just how to translate what the domain expert requests into a concrete task specification.
Question 1: Why predict this task?
The first thing to understand when defining a prediction task is why we care about that task. This question is easy to answer superficially, but is surprisingly hard to answer in a more concrete way. A good way to ensure you're being sufficiently concrete is to reframe your question in a more quantitative, rather than qualitative, framework, such as the Value of Information framing. Under this framing, instead of asking "why do we care about this task?", we ask "What would the quantitative benefit be of being able to predict this task perfectly?" -- and, ideally, "How would that benefit be realized?". A similar approach is to leverage the prediction-action dyad, asking what actions a prediction will enable or prompt and the benefits those actions will offer.
Done properly, understanding why we care about a task will help reveal:
- When would a prediction for this task be useful?
- For what patients would predicting this task be useful?
- What metrics do we care about in this prediction? What is our best proxy for eventual utility?
- What trade-offs are we navigating in making this prediction? E.g., what are our constraints in trying to predict this outcome?
So, let's explore this question in our hypothetical scenario!
Q (You): What's the goal here? What would you do differently if you knew a patient would have a long ICU stay?
A (Domain Expert): There are a lot of things that prediction could be used for. It could be used to help us optimize patient flow, for example, by allowing us to know in advance when ICU beds would be available or not. For this case, though, I'm most interested in what a prolonged length of stay (LOS) means about the patient's acuity. The idea here is that patients with prolonged ICU stays often experience worse outcomes—more complications, hospital-acquired infections, and higher mortality risk. If we can predict early on which patients are likely to have extended ICU stays, we could proactively intervene to prevent complications. Specifically, we're thinking about applying interventions like early mobilization, targeted nutritional support, or additional clinical monitoring resources.
Question 1.A: When are we predicting this?
Q (You): Ok, given that goal, when would understanding that these interventions might be needed be most useful?
A (Domain Expert): Obviously the earlier the better, but critically for some of these interventions, what really matters is beginning them sufficiently early in the course to improve long term outcomes. So, we'd want to predict these as far in advance as we can.
Q (You): Do we need to predict that the stay will remain ongoing for a long time continuously throughout the stay? E.g., are we predicting that the remaining time in the ICU will be long? Or is this a one-time prediction?
A (Domain Expert): Often it will become clear pretty quickly whether a patient is going to be in this state or not -- but the issue is that we want to know and be able to plan for these interventions even earlier than that. So, rather than predicting that the remaining time will be long, the biggest impact would be if we could predict earlier that the patient will remain in the ICU for a long time. Maybe if we could use their first day in the ICU as an input before making our prediction that their stay would be long, that'd be a good balance?
With this, we have our first piece of operationalization of our task: we want to make the prediction using data up to and including the first 24 hours of their ICU stay. Let's record that for now:
```yaml
input: ICU admission + 24h
```
Note that this actually implies something about our patient population: Let's clarify further.
Q (You): Ok, sounds great, but what about patients who are in the ICU for less than 24 hours?
A (Domain Expert): We should omit those patients. If we're trying to plan for interventions, we don't need to plan around those patients who are already discharged.
So, this has revealed our first constraint in defining our task. Namely, the patient's ICU stay can't end within the first 24 hours. To represent that in our "conceptual specification", we'll add a note that there can't be an "ICU discharge" event in the input window:
```yaml
input:
  start: ICU admission
  end: ICU admission + 24h
  has: no ICU discharge
```
Question 1.B: For which patients would this task actually help?
Our last exchange indicated that we want to exclude some patients from our cohort, because predicting this task label on those patients would not be meaningful. Are there more patients like this?
Q (You): Are there other patients for whom we wouldn't use this predictor, beyond those who aren't in the ICU for long enough?
A (Domain Expert): Great question. Some patients come in and are already so ill that the care team knows what to expect in advance. The most obvious examples are patients who actually have do not resuscitate (DNR) or comfort measures only (CMO) orders on record, or add them in their first day -- these patients are under palliative care, and so we already know what kinds of interventions are most suited and may be needed in that setting. We should probably exclude them too.
We can add this to our specification too:
```yaml
input:
  start: ICU admission
  end: ICU admission + 24h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
```
Note: In a real task, we might have many more exclusion or inclusion criteria than just this.
Question 1.C: What does "long LOS" actually mean?
Q (You): What threshold defines a "long" ICU stay?
A (Domain Expert): Generally speaking, if a patient is in the ICU for more than a few days, something is wrong. We can use three days as a threshold for our purposes here.
This lets us define not just our input window, but also the window that defines our prediction target.
```yaml
input:
  start: ICU admission
  end: ICU admission + 24h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
target:
  start: ICU admission
  end: ICU discharge
  label: Longer than 3 days?
```
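As a sanity check, this "longer than 3 days" target is straightforward to compute directly once you have paired admission and discharge timestamps. Here's a minimal pandas sketch; the per-stay table and its column names are hypothetical, purely for illustration:

```python
import pandas as pd

# Hypothetical per-stay table; column names are illustrative, not from MEDS.
stays = pd.DataFrame({
    "subject_id": [1, 2, 3],
    "icu_admission": pd.to_datetime(
        ["2024-01-01 08:00", "2024-01-02 12:00", "2024-01-03 00:00"]
    ),
    "icu_discharge": pd.to_datetime(
        ["2024-01-05 09:00", "2024-01-03 10:00", "2024-01-06 00:30"]
    ),
})

# "Long LOS" = more than 3 days between ICU admission and discharge.
stays["long_los"] = (
    stays["icu_discharge"] - stays["icu_admission"]
) > pd.Timedelta(days=3)

print(stays[["subject_id", "long_los"]])
```

Of course, this naive version ignores all of the inclusion/exclusion criteria we've been accumulating -- which is exactly why we want a declarative task language rather than ad-hoc scripts.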
Question 2: Accounting for Edge Cases and Messy Data
So far, our questions have been very focused on an "idealized" view of the task and the dataset. But, in reality, there are a lot of complications we need to consider in clinical predictions; things like censoring, data leakage, incomplete labels, and more. We can explore some of these too to refine our task definition.
Question 2.A: Future Leakage and Gap Windows
Q (You): How reliable is our data? Is it ever possible that a patient would be recorded as being discharged from the ICU at a certain time, but in reality the clinical team would all know they were getting discharged or the patient might have actually been discharged earlier, for example?
A (Domain Expert): Oh, that's definitely possible. There could be a deviation of up to a few hours where that might be known.
This sort of error (the possibility of future leakage) suggests we should consider adding a gap window to our task. This is a window that extends some of our constraints out beyond the end of our input time to avoid having very easy patients in our training set who are only so easy because in reality the care team already knows the answer. We'll add one for 6 hours here:
```yaml
input:
  start: ICU admission
  end: ICU admission + 24h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
target:
  start: ICU admission
  end: ICU discharge
  label: Longer than 3 days?
gap:
  start: input.end
  end: input.end + 6h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
```
Question 2.B: Conditional Predictions
Q (You): What about patients who don't stay a long time in the ICU because they die? Or what about patients who leave the ICU quickly, but then die shortly thereafter? In both cases, these patients clearly had severe illness, but our current task would count them as negative samples. What do we want to do about that?
A (Domain Expert): Oh, that's a big issue. We should really predict that separately -- if a patient is at severe risk of mortality, then we'd take a potentially different set of actions than we would if we thought they were going to stay in the ICU for a long time, but survive.
Q (You): So for this task, is it acceptable to think of it as prediction of prolonged length of stay conditioned on the patient not dying?
A (Domain Expert): Yes, that's what we're going for here.
```yaml
input:
  start: ICU admission
  end: ICU admission + 24h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
target:
  start: ICU admission
  end: ICU discharge
  label: Longer than 3 days?
  has:
    death: None
gap:
  start: input.end
  end: input.end + 6h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
```
Other questions
We're going to stop here for this task, but there are many more questions and refinements you could make! You could ask about:
- How should we handle censored patients (patients whose data ends before the end of our period of interest)?
- What about patients who have very little data recorded?
- Are there properties that we should want patients to have in order to be considered for this task at all?
Operationalizing our "conceptual task description": The ACES Language
So far, we've put together this description of our task:
```yaml
input:
  start: ICU admission
  end: ICU admission + 24h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
target:
  start: ICU admission
  end: ICU discharge
  label: Longer than 3 days?
  has:
    death: None
gap:
  start: input.end
  end: input.end + 6h
  has:
    ICU discharge: None
    CMO: None
    DNR: None
```
How can we turn this into something that is precise and operationalizable? The key idea we'll use here is formalizing some of the notions of the "building blocks" of this conceptual specification. As we do so, what we'll really be doing is building up the formal ACES task configuration language -- so if you want to peek ahead, check out the documentation!
Building Block 1: Windows
The first and most important such building block is that of the windows we've specified. Here, we have an `input` window, a `target` window, and a `gap` window -- but the names of the windows themselves aren't important. Really, what's important here is that we're using these to indicate specific relative parts of a patient's timeline -- parts that are dependent on one another and could possibly be realized multiple times over a patient record. Here, these windows are connected in that we have:
- A "root" or "trigger" event of an ICU admission.
- Our `input` window spanning from that trigger event until 24 hours later.
- Our `gap` window spanning from the end of that `input` window until 6 hours later.
- Our `target` window spanning from that trigger event until the next "ICU discharge" event.
Let's re-write this specification to make these connections between windows more apparent.
```yaml
trigger: ICU admission
windows:
  input:
    start: trigger
    end: start + 24h
    has:
      ICU discharge: None
      CMO: None
      DNR: None
  target:
    start: trigger
    end: start -> ICU discharge
    label: Longer than 3 days?
    has:
      death: None
  gap:
    start: input.end
    end: start + 6h
    has:
      ICU discharge: None
      CMO: None
      DNR: None
```
Building Block 2: Predicates
The next building block of our language we'll highlight is that of the key concepts or "predicates" we rely on in order to define both our relative windows and their constraints. Namely, here, we need to know how to identify within our given dataset all of the following:
- An `ICU admission`
- An `ICU discharge`
- A `death`
- A `CMO` event
- A `DNR` event
If we don't know how to recognize those events within our data, our nice description of our windows isn't helpful! This is where we need that "local data expertise" we described above -- understanding how we can recognize these key events within our dataset.
Note: If your input data source were from a harmonized standard, such as OHDSI OMOP, you could also use those standard vocabularies and existing tools; unfortunately, however, not all datasets are harmonized at the start.
What if you didn't have a collaborator with local data expertise? Then there's a reasonable question to be asked about why you are working with that dataset! But, in the event that you are, you can always inspect the MEDS dataset directly across three easy-to-use axes to see if you can figure out which codes are reasonable starting points:
- The code strings themselves may be human readable (e.g., in MEDS, death events often use codes that begin with `MEDS_DEATH`).
- The `text_value` field may indicate the events of interest (see below for an example).
- The `metadata/codes.parquet` file may contain free-text descriptions for the codes or links to external ontologies that can be used.
To use these strategies here, let's pull in our dataset from the tutorial resources, then we can dive in!
```bash
%%bash
set -e
wget -q -c https://github.com/Medical-Event-Data-Standard/MEDS_KDD_2025_Tutorial/raw/refs/heads/main/MEDS_data.zip
unzip -q -o MEDS_data.zip
rm MEDS_data/labels -r  # We don't need that here
apt-get -qq install tree
tree MEDS_data
```
```
MEDS_data
├── data
│   ├── held_out
│   │   └── 0.parquet
│   ├── train
│   │   └── 0.parquet
│   └── tuning
│       └── 0.parquet
└── metadata
    ├── codes.parquet
    ├── dataset.json
    └── subject_splits.parquet

5 directories, 6 files
```
Now that we have the data, how can we look through it? To help here, we'll use a simple function that will let us search through a dataframe via a text string. It's far from perfect, but will help us here:
```python
import pandas as pd
import ipywidgets as widgets
from IPython.display import display
from functools import partial


def df_search(df: pd.DataFrame):
    # Create a Text widget for the search input
    search_input = widgets.Text(
        value='',
        placeholder='Enter search term...',
        description='Search:',
        disabled=False,
    )

    # Function to filter the DataFrame based on the search term
    def filter_dataframe(df: pd.DataFrame, search_term: str):
        if not search_term:
            display(df)
        else:
            filtered_df = df[
                df.apply(
                    lambda row: row.astype(str).str.contains(search_term, case=False).any(),
                    axis=1,
                )
            ]
            display(filtered_df)

    # Create an interactive output to display the filtered DataFrame
    output = widgets.interactive_output(
        partial(filter_dataframe, df=df), {'search_term': search_input}
    )

    # Display the search input and the output
    display(search_input, output)
```
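If you're working outside a notebook (or widgets aren't available), the same filtering logic works as a plain function that returns the matching rows. This is our own sketch, not part of the tutorial's code, shown here on a toy frame:

```python
import pandas as pd


def df_search_plain(df: pd.DataFrame, search_term: str) -> pd.DataFrame:
    """Return rows where any column's string form contains `search_term` (case-insensitive)."""
    if not search_term:
        return df
    mask = df.apply(
        lambda row: row.astype(str).str.contains(search_term, case=False).any(),
        axis=1,
    )
    return df[mask]


# Example with a toy frame mimicking a codes table (contents are made up):
toy = pd.DataFrame({
    "code": ["ICU_ADMISSION//MICU", "LAB//220001//UNK", "MEDS_DEATH"],
    "description": ["ICU admission", "Code status", "Death record"],
})
print(df_search_plain(toy, "death"))
```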
Let's see it in action by searching through the metadata first!
```python
metadata_df = pd.read_parquet("MEDS_data/metadata/codes.parquet")[["code", "description", "parent_codes"]]
data_df = pd.read_parquet("MEDS_data/data/train/0.parquet")[["code", "text_value"]].drop_duplicates()

df_search(metadata_df)
```
In this case, the metadata doesn't seem that helpful. What about the data itself, via the code strings and text value fields?
Note: The search operation may take a long time and may appear to have completed while only showing results from a partial search.
```python
df_search(data_df)
```
In this case, for this tutorial, we'll just tell you how to define each of these predicates for our tutorial dataset (the MIMIC demo dataset):
- ICU admissions have codes of the form `ICU_ADMISSION//*`
- ICU discharges have codes of the form `ICU_DISCHARGE//*`
- Death events have codes of the form `MEDS_DEATH*`
- CMO events are recorded with codes of the form `LAB//220001//UNK` or `LAB//223758//UNK` and with text values of either `"Comfort measures only"` or `"Comfort care (CMO, Comfort Measures)"`.
- DNR events are recorded with the same two codes used for CMO events, but with text values of any of the following forms:
  - `"DNR / DNI"`
  - `"DNAR (Do Not Attempt Resuscitation) [DNR]"`
  - `"DNAR (Do Not Attempt Resuscitation) [DNR] / DNI"`
  - `"DNR (do not resuscitate)"`
So, how can we represent this in our specification? Here, we'll introduce a bit of the formal ACES language, and show you how to represent all of that in an ACES predicate block -- it isn't quite as clean as the windows specification, but it should be transparent when you see it.
```yaml
predicates:
  icu_admission:
    code: { regex: "^ICU_ADMISSION//.*" }
  icu_discharge:
    code: { regex: "^ICU_DISCHARGE//.*" }
  death:
    code: { regex: "MEDS_DEATH.*" }

  # CMO predicates
  cmo_1:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "Comfort measures only"
  cmo_2:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "Comfort care (CMO, Comfort Measures)"

  # DNR predicates
  dnr_1:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNR / DNI"
  dnr_2:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNAR (Do Not Attempt Resuscitation) [DNR]"
  dnr_3:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNAR (Do Not Attempt Resuscitation) [DNR] / DNI"
  dnr_4:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNR (do not resuscitate)"

  # derived predicates
  cmo:
    expr: or(cmo_1, cmo_2)
  dnr:
    expr: or(dnr_1, dnr_2, dnr_3, dnr_4)
  death_or_discharge:
    expr: or(icu_discharge, death)
```
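To make concrete what a predicate block like this does conceptually, here is a rough pandas sketch of evaluating one regex predicate and one code-plus-text-value predicate over a MEDS-style events table. This is our own illustration of the idea, not how ACES actually implements it, and the toy rows are made up:

```python
import pandas as pd

# Toy MEDS-style events; real data has more columns (subject_id, time, etc.).
events = pd.DataFrame({
    "code": ["ICU_ADMISSION//MICU", "LAB//223758//UNK", "LAB//223758//UNK", "MEDS_DEATH"],
    "text_value": [None, "Comfort measures only", "Full code", None],
})

# A regex predicate: does the code match "^ICU_ADMISSION//.*"?
events["icu_admission"] = events["code"].str.match(r"^ICU_ADMISSION//.*")

# A value predicate: code in a fixed set AND an exact text_value match (like cmo_1).
cmo_codes = {"LAB//220001//UNK", "LAB//223758//UNK"}
events["cmo_1"] = events["code"].isin(cmo_codes) & (
    events["text_value"] == "Comfort measures only"
)

print(events[["code", "icu_admission", "cmo_1"]])
```

A derived predicate like `cmo: or(cmo_1, cmo_2)` would then just be the element-wise `|` of the corresponding boolean columns.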
Putting it all together
Lastly, we need to clean up a few small aspects of our configuration file to put it all together. Let's see what the final config looks like:
```yaml
predicates:
  icu_admission:
    code: { regex: "^ICU_ADMISSION//.*" }
  icu_discharge:
    code: { regex: "^ICU_DISCHARGE//.*" }
  death:
    code: { regex: "MEDS_DEATH.*" }

  # CMO predicates
  cmo_1:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "Comfort measures only"
  cmo_2:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "Comfort care (CMO, Comfort Measures)"

  # DNR predicates
  dnr_1:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNR / DNI"
  dnr_2:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNAR (Do Not Attempt Resuscitation) [DNR]"
  dnr_3:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNAR (Do Not Attempt Resuscitation) [DNR] / DNI"
  dnr_4:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNR (do not resuscitate)"

  # derived predicates
  cmo:
    expr: or(cmo_1, cmo_2)
  dnr:
    expr: or(dnr_1, dnr_2, dnr_3, dnr_4)

trigger: icu_admission

windows:
  input:
    start: null
    end: trigger + 24h
    start_inclusive: True
    end_inclusive: True
    index_timestamp: end
    has:
      cmo: (None, 0)  # Exclude patients on comfort measures only
      dnr: (None, 0)  # Exclude patients with DNR orders
  gap:
    start: trigger
    end: start + 30h
    start_inclusive: False
    end_inclusive: True
    has:
      cmo: (None, 0)
      dnr: (None, 0)
      icu_discharge: (None, 0)
  target:
    start: trigger
    end: start + 3d
    start_inclusive: True
    end_inclusive: True
    label: icu_discharge
    has:
      death: (None, 0)
```
We can see there are a few small changes here.
- Our syntax for specifying constraints is a bit different -- we give a range with a lower and an upper bound for the number of times a predicate can occur, instead of just saying "None".
- We've revised the gap window to encompass the discharge constraint over the full first 30 hours, and have it link to the trigger directly, rather than the input. This lets us also make the input window restrict the CMO or DNR constraint over the full prior record, not just the 24 hours since the ICU stay started.
- We've added an `index_timestamp` key to the `input` window. This tells ACES the point in time, among the window endpoints, up to which data may be used when making the prediction.
- We've swapped the nature of the prediction for the target window. As ACES doesn't support a label defined by the length of a window, only by predicates being present or not, we now define the window to span 3 days after the trigger and predict whether or not there is a discharge in that period. If there is, the ICU stay must end within 3 days (and is therefore a short ICU stay). Note this inverts our notion of positive vs. negative label, but otherwise makes no change. This also lets us appropriately exclude patients who die in that period.
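Because the extracted label now means "short LOS", recovering the original "long LOS" framing downstream is just a boolean negation on the labels frame. A sketch on a toy frame, assuming the label lives in a boolean column we'll call `boolean_value` here:

```python
import pandas as pd

# Toy extracted labels; in practice these come from the ACES output parquet.
labels = pd.DataFrame({
    "subject_id": [1, 2, 3],
    "boolean_value": [True, False, True],  # True = discharged within 3 days (short LOS)
})

# Invert to recover the original "long LOS" framing.
labels["long_los"] = ~labels["boolean_value"]

print(labels)
```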
Running ACES Extraction
Now that we have a fully defined task in the form of an ACES configuration file, how can we actually extract the labels for this task? Well, we can use the ACES CLI, of course!
To use this, all we have to do is write our task configuration YAML specification to a file on disk, then run the ACES CLI in the right way. Let's see it!
First, to write the task config:
```python
from pathlib import Path

task_cfg_yaml = """
predicates:
  icu_admission:
    code: { regex: "^ICU_ADMISSION//.*" }
  icu_discharge:
    code: { regex: "^ICU_DISCHARGE//.*" }
  death:
    code: { regex: "MEDS_DEATH.*" }

  # CMO predicates
  cmo_1:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "Comfort measures only"
  cmo_2:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "Comfort care (CMO, Comfort Measures)"

  # DNR predicates
  dnr_1:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNR / DNI"
  dnr_2:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNAR (Do Not Attempt Resuscitation) [DNR]"
  dnr_3:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNAR (Do Not Attempt Resuscitation) [DNR] / DNI"
  dnr_4:
    code: { any: ["LAB//220001//UNK", "LAB//223758//UNK"] }
    text_value: "DNR (do not resuscitate)"

  # derived predicates
  cmo:
    expr: or(cmo_1, cmo_2)
  dnr:
    expr: or(dnr_1, dnr_2, dnr_3, dnr_4)

trigger: icu_admission

windows:
  input:
    start: null
    end: trigger + 24h
    start_inclusive: True
    end_inclusive: True
    index_timestamp: end
    has:
      cmo: (None, 0)  # Exclude patients on comfort measures only
      dnr: (None, 0)  # Exclude patients with DNR orders
  gap:
    start: trigger
    end: start + 30h
    start_inclusive: False
    end_inclusive: True
    has:
      cmo: (None, 0)
      dnr: (None, 0)
      icu_discharge: (None, 0)
  target:
    start: trigger
    end: start + 3d
    start_inclusive: True
    end_inclusive: True
    label: icu_discharge
    has:
      death: (None, 0)
"""

Path("task_config.yaml").write_text(task_cfg_yaml);
```
Now to run the command -- note that we've renamed the task to `short_LOS` here, because we had to invert the label. A `True` label indicates the subject has a length of stay of less than 3 days, and a `False` label indicates it was longer.
```bash
%%bash
pip install --quiet es-aces==0.7.1

aces-cli \
    config_path=task_config.yaml \
    cohort_name="short_LOS" \
    cohort_dir="MEDS_data/labels" \
    data=sharded \
    data.standard=meds \
    data.root=MEDS_data/data \
    data.shard=$(expand_shards train/1 tuning/1 held_out/1) \
    -m
```
[2025-07-31 15:10:22,677][HYDRA] Launching 3 jobs locally [2025-07-31 15:10:22,677][HYDRA] #0 : config_path=task_config.yaml cohort_name=short_LOS cohort_dir=MEDS_data/labels data=sharded data.standard=meds data.root=MEDS_data/data data.shard=train/0 [2025-07-31 15:10:22,803][aces.run][INFO] - Loading config from 'task_config.yaml' [2025-07-31 15:10:22,815][aces.config][INFO] - Parsing windows... [2025-07-31 15:10:22,816][aces.config][INFO] - Parsing trigger event... [2025-07-31 15:10:22,816][aces.config][INFO] - Parsing predicates... [2025-07-31 15:10:22,827][aces.run][INFO] - Attempting to get predicates dataframe given: standard: meds ts_format: '%m/%d/%Y %H:%M' root: MEDS_data/data shard: train/0 path: ${data.root}/${data.shard}.parquet _prefix: /${data.shard} [2025-07-31 15:10:22,829][aces.predicates][INFO] - Loading MEDS data... [2025-07-31 15:10:23,246][aces.predicates][INFO] - Generating plain predicate columns... [2025-07-31 15:10:23,279][aces.predicates][INFO] - Added predicate column 'icu_admission'. [2025-07-31 15:10:23,312][aces.predicates][INFO] - Added predicate column 'icu_discharge'. [2025-07-31 15:10:23,359][aces.predicates][INFO] - Added predicate column 'death'. [2025-07-31 15:10:23,376][aces.predicates][INFO] - Added predicate column 'cmo_1'. 
[2025-07-31 15:10:23,392][aces.predicates][INFO] - Added predicate column 'cmo_2'. [2025-07-31 15:10:23,408][aces.predicates][INFO] - Added predicate column 'dnr_1'. [2025-07-31 15:10:23,424][aces.predicates][INFO] - Added predicate column 'dnr_2'. [2025-07-31 15:10:23,441][aces.predicates][INFO] - Added predicate column 'dnr_3'. [2025-07-31 15:10:23,457][aces.predicates][INFO] - Added predicate column 'dnr_4'. [2025-07-31 15:10:23,457][aces.predicates][INFO] - Cleaning up predicates dataframe... [2025-07-31 15:10:23,663][aces.predicates][INFO] - Loaded plain predicates. Generating derived predicate columns... [2025-07-31 15:10:23,664][aces.predicates][INFO] - Added predicate column 'cmo'. [2025-07-31 15:10:23,666][aces.predicates][INFO] - Added predicate column 'dnr'. [2025-07-31 15:10:23,666][aces.predicates][INFO] - Generating special predicate columns... [2025-07-31 15:10:23,666][aces.query][INFO] - Checking if '(subject_id, timestamp)' columns are unique... [2025-07-31 15:10:23,677][aces.utils][INFO] - trigger ┣━━ input.end ┃ ┗━━ input.start ┣━━ gap.end ┗━━ target.end [2025-07-31 15:10:23,677][aces.query][INFO] - Beginning query... [2025-07-31 15:10:23,677][aces.query][INFO] - No static variable criteria specified, removing all rows with null timestamps... [2025-07-31 15:10:23,680][aces.query][INFO] - Identifying possible trigger nodes based on the specified trigger event... [2025-07-31 15:10:23,681][aces.constraints][INFO] - Excluding 72,774 rows as they failed to satisfy '1 <= icu_admission <= None'. [2025-07-31 15:10:23,682][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'input.end'... [2025-07-31 15:10:23,807][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'input.start'... [2025-07-31 15:10:24,044][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= cmo <= 0'. [2025-07-31 15:10:24,045][aces.constraints][INFO] - Excluding 3 rows as they failed to satisfy 'None <= dnr <= 0'. 
[2025-07-31 15:10:24,049][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'gap.end'...
[2025-07-31 15:10:24,178][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= cmo <= 0'.
[2025-07-31 15:10:24,179][aces.constraints][INFO] - Excluding 1 rows as they failed to satisfy 'None <= dnr <= 0'.
[2025-07-31 15:10:24,179][aces.constraints][INFO] - Excluding 35 rows as they failed to satisfy 'None <= icu_discharge <= 0'.
[2025-07-31 15:10:24,182][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'target.end'...
[2025-07-31 15:10:24,315][aces.constraints][INFO] - Excluding 3 rows as they failed to satisfy 'None <= death <= 0'.
[2025-07-31 15:10:24,319][aces.query][INFO] - Done. 74 valid rows returned corresponding to 57 subjects.
[2025-07-31 15:10:24,319][aces.query][INFO] - Extracting label 'icu_discharge' from window 'target'...
[2025-07-31 15:10:24,320][aces.query][INFO] - Setting index timestamp as 'end' of window 'input'...
[2025-07-31 15:10:24,323][aces.run][WARNING] - Output contains columns that are not valid MEDS label columns. For now, we are dropping them. If you need these columns, please comment on https://github.com/justin13601/ACES/issues/97
  Columns:
    - trigger
    - input.end_summary
    - input.start_summary
    - gap.end_summary
    - target.end_summary
[2025-07-31 15:10:24,331][aces.run][INFO] - Completed in 0:00:01.526181. Results saved to 'MEDS_data/labels/short_LOS/train/0.parquet'.
[2025-07-31 15:10:24,332][HYDRA] #1 : config_path=task_config.yaml cohort_name=short_LOS cohort_dir=MEDS_data/labels data=sharded data.standard=meds data.root=MEDS_data/data data.shard=tuning/0
[2025-07-31 15:10:24,467][aces.run][INFO] - Loading config from 'task_config.yaml'
[2025-07-31 15:10:24,477][aces.config][INFO] - Parsing windows...
[2025-07-31 15:10:24,478][aces.config][INFO] - Parsing trigger event...
[2025-07-31 15:10:24,478][aces.config][INFO] - Parsing predicates...
[2025-07-31 15:10:24,479][aces.run][INFO] - Attempting to get predicates dataframe given:
  standard: meds
  ts_format: '%m/%d/%Y %H:%M'
  root: MEDS_data/data
  shard: tuning/0
  path: ${data.root}/${data.shard}.parquet
  _prefix: /${data.shard}
[2025-07-31 15:10:24,480][aces.predicates][INFO] - Loading MEDS data...
[2025-07-31 15:10:24,520][aces.predicates][INFO] - Generating plain predicate columns...
[2025-07-31 15:10:24,524][aces.predicates][INFO] - Added predicate column 'icu_admission'.
[2025-07-31 15:10:24,530][aces.predicates][INFO] - Added predicate column 'icu_discharge'.
[2025-07-31 15:10:24,539][aces.predicates][INFO] - Added predicate column 'death'.
[2025-07-31 15:10:24,542][aces.predicates][INFO] - Added predicate column 'cmo_1'.
[2025-07-31 15:10:24,545][aces.predicates][INFO] - Added predicate column 'cmo_2'.
[2025-07-31 15:10:24,548][aces.predicates][INFO] - Added predicate column 'dnr_1'.
[2025-07-31 15:10:24,552][aces.predicates][INFO] - Added predicate column 'dnr_2'.
[2025-07-31 15:10:24,557][aces.predicates][INFO] - Added predicate column 'dnr_3'.
[2025-07-31 15:10:24,561][aces.predicates][INFO] - Added predicate column 'dnr_4'.
[2025-07-31 15:10:24,562][aces.predicates][INFO] - Cleaning up predicates dataframe...
[2025-07-31 15:10:24,574][aces.predicates][INFO] - Loaded plain predicates. Generating derived predicate columns...
[2025-07-31 15:10:24,575][aces.predicates][INFO] - Added predicate column 'cmo'.
[2025-07-31 15:10:24,575][aces.predicates][INFO] - Added predicate column 'dnr'.
[2025-07-31 15:10:24,575][aces.predicates][INFO] - Generating special predicate columns...
[2025-07-31 15:10:24,575][aces.query][INFO] - Checking if '(subject_id, timestamp)' columns are unique...
[2025-07-31 15:10:24,577][aces.utils][INFO] - trigger
  ┣━━ input.end
  ┃   ┗━━ input.start
  ┣━━ gap.end
  ┗━━ target.end
[2025-07-31 15:10:24,577][aces.query][INFO] - Beginning query...
[2025-07-31 15:10:24,577][aces.query][INFO] - No static variable criteria specified, removing all rows with null timestamps...
[2025-07-31 15:10:24,578][aces.query][INFO] - Identifying possible trigger nodes based on the specified trigger event...
[2025-07-31 15:10:24,578][aces.constraints][INFO] - Excluding 6,242 rows as they failed to satisfy '1 <= icu_admission <= None'.
[2025-07-31 15:10:24,579][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'input.end'...
[2025-07-31 15:10:24,593][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'input.start'...
[2025-07-31 15:10:24,617][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= cmo <= 0'.
[2025-07-31 15:10:24,618][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= dnr <= 0'.
[2025-07-31 15:10:24,622][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'gap.end'...
[2025-07-31 15:10:24,636][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= cmo <= 0'.
[2025-07-31 15:10:24,636][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= dnr <= 0'.
[2025-07-31 15:10:24,636][aces.constraints][INFO] - Excluding 5 rows as they failed to satisfy 'None <= icu_discharge <= 0'.
[2025-07-31 15:10:24,638][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'target.end'...
[2025-07-31 15:10:24,652][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= death <= 0'.
[2025-07-31 15:10:24,655][aces.query][INFO] - Done. 10 valid rows returned corresponding to 7 subjects.
[2025-07-31 15:10:24,655][aces.query][INFO] - Extracting label 'icu_discharge' from window 'target'...
[2025-07-31 15:10:24,656][aces.query][INFO] - Setting index timestamp as 'end' of window 'input'...
[2025-07-31 15:10:24,659][aces.run][WARNING] - Output contains columns that are not valid MEDS label columns. For now, we are dropping them. If you need these columns, please comment on https://github.com/justin13601/ACES/issues/97
  Columns:
    - trigger
    - input.end_summary
    - input.start_summary
    - gap.end_summary
    - target.end_summary
[2025-07-31 15:10:24,664][aces.run][INFO] - Completed in 0:00:00.196449. Results saved to 'MEDS_data/labels/short_LOS/tuning/0.parquet'.
[2025-07-31 15:10:24,666][HYDRA] #2 : config_path=task_config.yaml cohort_name=short_LOS cohort_dir=MEDS_data/labels data=sharded data.standard=meds data.root=MEDS_data/data data.shard=held_out/0
[2025-07-31 15:10:24,878][aces.run][INFO] - Loading config from 'task_config.yaml'
[2025-07-31 15:10:24,889][aces.config][INFO] - Parsing windows...
[2025-07-31 15:10:24,889][aces.config][INFO] - Parsing trigger event...
[2025-07-31 15:10:24,889][aces.config][INFO] - Parsing predicates...
[2025-07-31 15:10:24,891][aces.run][INFO] - Attempting to get predicates dataframe given:
  standard: meds
  ts_format: '%m/%d/%Y %H:%M'
  root: MEDS_data/data
  shard: held_out/0
  path: ${data.root}/${data.shard}.parquet
  _prefix: /${data.shard}
[2025-07-31 15:10:24,892][aces.predicates][INFO] - Loading MEDS data...
[2025-07-31 15:10:24,922][aces.predicates][INFO] - Generating plain predicate columns...
[2025-07-31 15:10:24,925][aces.predicates][INFO] - Added predicate column 'icu_admission'.
[2025-07-31 15:10:24,927][aces.predicates][INFO] - Added predicate column 'icu_discharge'.
[2025-07-31 15:10:24,931][aces.predicates][INFO] - Added predicate column 'death'.
[2025-07-31 15:10:24,933][aces.predicates][INFO] - Added predicate column 'cmo_1'.
[2025-07-31 15:10:24,934][aces.predicates][INFO] - Added predicate column 'cmo_2'.
[2025-07-31 15:10:24,936][aces.predicates][INFO] - Added predicate column 'dnr_1'.
[2025-07-31 15:10:24,937][aces.predicates][INFO] - Added predicate column 'dnr_2'.
[2025-07-31 15:10:24,939][aces.predicates][INFO] - Added predicate column 'dnr_3'.
[2025-07-31 15:10:24,941][aces.predicates][INFO] - Added predicate column 'dnr_4'.
[2025-07-31 15:10:24,941][aces.predicates][INFO] - Cleaning up predicates dataframe...
[2025-07-31 15:10:24,947][aces.predicates][INFO] - Loaded plain predicates. Generating derived predicate columns...
[2025-07-31 15:10:24,948][aces.predicates][INFO] - Added predicate column 'cmo'.
[2025-07-31 15:10:24,948][aces.predicates][INFO] - Added predicate column 'dnr'.
[2025-07-31 15:10:24,949][aces.predicates][INFO] - Generating special predicate columns...
[2025-07-31 15:10:24,949][aces.query][INFO] - Checking if '(subject_id, timestamp)' columns are unique...
[2025-07-31 15:10:24,950][aces.utils][INFO] - trigger
  ┣━━ input.end
  ┃   ┗━━ input.start
  ┣━━ gap.end
  ┗━━ target.end
[2025-07-31 15:10:24,950][aces.query][INFO] - Beginning query...
[2025-07-31 15:10:24,950][aces.query][INFO] - No static variable criteria specified, removing all rows with null timestamps...
[2025-07-31 15:10:24,951][aces.query][INFO] - Identifying possible trigger nodes based on the specified trigger event...
[2025-07-31 15:10:24,951][aces.constraints][INFO] - Excluding 4,163 rows as they failed to satisfy '1 <= icu_admission <= None'.
[2025-07-31 15:10:24,952][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'input.end'...
[2025-07-31 15:10:24,961][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'input.start'...
[2025-07-31 15:10:24,979][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= cmo <= 0'.
[2025-07-31 15:10:24,980][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= dnr <= 0'.
[2025-07-31 15:10:24,983][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'gap.end'...
[2025-07-31 15:10:24,992][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= cmo <= 0'.
[2025-07-31 15:10:24,993][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= dnr <= 0'.
[2025-07-31 15:10:24,993][aces.constraints][INFO] - Excluding 4 rows as they failed to satisfy 'None <= icu_discharge <= 0'.
[2025-07-31 15:10:24,995][aces.extract_subtree][INFO] - Summarizing subtree rooted at 'target.end'...
[2025-07-31 15:10:25,005][aces.constraints][INFO] - Excluding 1 rows as they failed to satisfy 'None <= death <= 0'.
[2025-07-31 15:10:25,008][aces.query][INFO] - Done. 8 valid rows returned corresponding to 6 subjects.
[2025-07-31 15:10:25,009][aces.query][INFO] - Extracting label 'icu_discharge' from window 'target'...
[2025-07-31 15:10:25,009][aces.query][INFO] - Setting index timestamp as 'end' of window 'input'...
[2025-07-31 15:10:25,012][aces.run][WARNING] - Output contains columns that are not valid MEDS label columns. For now, we are dropping them. If you need these columns, please comment on https://github.com/justin13601/ACES/issues/97
  Columns:
    - trigger
    - input.end_summary
    - input.start_summary
    - gap.end_summary
    - target.end_summary
[2025-07-31 15:10:25,016][aces.run][INFO] - Completed in 0:00:00.137151. Results saved to 'MEDS_data/labels/short_LOS/held_out/0.parquet'.
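These logs double as a cohort attrition record: every `Excluding N rows as they failed to satisfy '...'` line tells you how many candidate windows a given criterion removed, which is exactly what you need for a cohort flow diagram. As a quick illustration (a sketch of our own, not part of ACES), these counts can be tallied with a small regex over the captured log text:

```python
import re
from collections import Counter

# A few exclusion lines copied from the ACES log output above.
log_text = """
[2025-07-31 15:10:24,178][aces.constraints][INFO] - Excluding 0 rows as they failed to satisfy 'None <= cmo <= 0'.
[2025-07-31 15:10:24,179][aces.constraints][INFO] - Excluding 1 rows as they failed to satisfy 'None <= dnr <= 0'.
[2025-07-31 15:10:24,179][aces.constraints][INFO] - Excluding 35 rows as they failed to satisfy 'None <= icu_discharge <= 0'.
[2025-07-31 15:10:24,315][aces.constraints][INFO] - Excluding 3 rows as they failed to satisfy 'None <= death <= 0'.
"""

# Capture the row count and the constraint expression from each line.
pattern = re.compile(r"Excluding ([\d,]+) rows as they failed to satisfy '([^']+)'")

excluded = Counter()
for count, constraint in pattern.findall(log_text):
    excluded[constraint] += int(count.replace(",", ""))

for constraint, n in excluded.most_common():
    print(f"{n:>5}  {constraint}")
```

For the four lines above this prints the `icu_discharge` constraint first (35 rows), followed by `death` (3), `dnr` (1), and `cmo` (0).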
%%bash
tree MEDS_data/labels
MEDS_data/labels
└── short_LOS
    ├── held_out
    │   └── 0.parquet
    ├── train
    │   └── 0.parquet
    └── tuning
        └── 0.parquet

4 directories, 3 files
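Note that the labels mirror the `train`/`tuning`/`held_out` sharding of the MEDS data itself, under a `labels/<task>/<split>/<shard>.parquet` layout. A minimal sketch of how you might discover and group the shards per split with `pathlib` (it recreates the layout in a temporary directory so it is self-contained; in practice you would point `root` at your real `MEDS_data/labels` directory):

```python
import tempfile
from pathlib import Path

# Recreate the label directory layout shown above in a temporary directory.
root = Path(tempfile.mkdtemp()) / "labels"
for split in ("train", "tuning", "held_out"):
    shard = root / "short_LOS" / split / "0.parquet"
    shard.parent.mkdir(parents=True)
    shard.touch()

# Group label shards by split: labels/<task>/<split>/<shard>.parquet
shards_by_split = {}
for shard in sorted(root.glob("short_LOS/*/*.parquet")):
    shards_by_split.setdefault(shard.parent.name, []).append(shard.name)

print(shards_by_split)
# → {'held_out': ['0.parquet'], 'train': ['0.parquet'], 'tuning': ['0.parquet']}
```

Keeping labels sharded the same way as the data means each data shard can later be joined against exactly one label file.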
What do the labels look like? Let's open one up and see:
pd.read_parquet("MEDS_data/labels/short_LOS/train/0.parquet")
|     | subject_id | prediction_time     | boolean_value | integer_value | float_value | categorical_value |
|-----|------------|---------------------|---------------|---------------|-------------|-------------------|
| 0   | 10002428   | 2156-04-13 16:24:18 | False         | NaN           | NaN         | None              |
| 1   | 10002428   | 2156-04-20 18:11:19 | False         | NaN           | NaN         | None              |
| 2   | 10002428   | 2156-05-01 21:53:00 | True          | NaN           | NaN         | None              |
| 3   | 10002428   | 2156-05-12 14:49:34 | False         | NaN           | NaN         | None              |
| 4   | 10002495   | 2141-05-23 20:18:01 | False         | NaN           | NaN         | None              |
| ... | ...        | ...                 | ...           | ...           | ...         | ...               |
| 69  | 10038933   | 2148-09-11 13:19:00 | False         | NaN           | NaN         | None              |
| 70  | 10038999   | 2131-05-23 21:50:33 | False         | NaN           | NaN         | None              |
| 71  | 10039708   | 2140-01-24 18:08:00 | False         | NaN           | NaN         | None              |
| 72  | 10039708   | 2140-06-19 01:41:00 | True          | NaN           | NaN         | None              |
| 73  | 10040025   | 2148-01-25 04:50:17 | False         | NaN           | NaN         | None              |
74 rows × 6 columns
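Each row is one prediction sample: the `subject_id`/`prediction_time` pair says when the prediction is made, and `boolean_value` carries our binary length-of-stay label (the other value columns are unused for a boolean task). Before modeling, it is worth checking class balance and how many samples each subject contributes, since one patient can have several ICU stays. A minimal sketch, assuming pandas is installed and using a small illustrative stand-in for the label DataFrame rather than the real file:

```python
import pandas as pd

# A miniature stand-in for the extracted label file, mimicking the MEDS
# label schema shown above (rows here are illustrative, not real data).
labels = pd.DataFrame(
    {
        "subject_id": [10002428, 10002428, 10002428, 10039708, 10040025],
        "prediction_time": pd.to_datetime(
            [
                "2156-04-13 16:24:18",
                "2156-04-20 18:11:19",
                "2156-05-01 21:53:00",
                "2140-06-19 01:41:00",
                "2148-01-25 04:50:17",
            ]
        ),
        "boolean_value": [False, False, True, True, False],
    }
)

# Class balance: fraction of positive labels.
prevalence = labels["boolean_value"].mean()

# Samples per subject: one patient can contribute multiple ICU stays.
per_subject = labels.groupby("subject_id").size()

print(f"positive rate: {prevalence:.2f}")          # → positive rate: 0.40
print(f"max labels per subject: {per_subject.max()}")  # → 3
```

On the real file you would replace the stand-in with `pd.read_parquet("MEDS_data/labels/short_LOS/train/0.parquet")`. A heavily imbalanced positive rate, or a few subjects dominating the sample count, would be worth flagging back to your clinical collaborators.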
Wrapping Up
And that's it! Through this tutorial, you've:
- Started with a vague task (“predict long ICU stay”)
- Iteratively refined it through clinical + modeling questions
- Created a structured conceptual spec
- Translated that into a formal ACES task config
- Extracted a high-quality prediction cohort, with labels saved per split in MEDS label format
This workflow makes it easy to build clinically relevant, reproducible machine learning tasks directly from EHR data in the MEDS format.