pharmacoml on a new dataset

This tutorial shows the most common workflow: start with subject-level parameters or EBEs plus covariates, run the hybrid screener, inspect the shortlist, and export candidate covariates for downstream PMx confirmation.
Input 1: EBEs or individual parameters
One row per subject with columns such as CL, V, KA, or other subject-level PK/PD quantities derived from a base model.
Input 2: Covariates
One row per subject with columns such as WT, AGE, SEX, CRCL, and ALB.
ebes.csv
ID,CL,V,KA
1,5.21,48.3,1.10
2,6.03,55.1,0.92
3,4.88,44.7,1.25
4,5.67,51.0,1.05
covariates.csv
ID,WT,AGE,SEX,CRCL,ALB
1,72,55,1,88,4.1
2,81,49,0,95,3.8
3,64,63,1,70,4.0
4,77,46,0,91,4.2
analysis_dataset.csv
ID,CL,V,KA,WT,AGE,SEX,CRCL,ALB
1,5.21,48.3,1.10,72,55,1,88,4.1
2,6.03,55.1,0.92,81,49,0,95,3.8
3,4.88,44.7,1.25,64,63,1,70,4.0
4,5.67,51.0,1.05,77,46,0,91,4.2
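Both inputs are expected to have one row per subject, so it can be worth checking that before the tables are joined. The following is a minimal pandas sketch and not part of pharmacoml: the helper name `merge_subject_tables` is hypothetical, and the inline CSV text simply mirrors the first two rows of the toy files above.

```python
import io

import pandas as pd


def merge_subject_tables(ebes: pd.DataFrame, covariates: pd.DataFrame) -> pd.DataFrame:
    """Join EBEs and covariates on ID, failing loudly on duplicate or unmatched subjects."""
    # validate="one_to_one" raises pandas.errors.MergeError if an ID repeats in either table.
    merged = ebes.merge(
        covariates, on="ID", how="outer", validate="one_to_one", indicator=True
    )
    # With an outer join, the indicator column flags subjects present in only one table.
    unmatched = merged.loc[merged["_merge"] != "both", "ID"].tolist()
    if unmatched:
        raise ValueError(f"Subjects missing from one input: {unmatched}")
    return merged.drop(columns="_merge")


# Inline stand-ins for ebes.csv and covariates.csv (first two toy rows only).
ebes = pd.read_csv(io.StringIO("ID,CL,V,KA\n1,5.21,48.3,1.10\n2,6.03,55.1,0.92\n"))
covariates = pd.read_csv(
    io.StringIO("ID,WT,AGE,SEX,CRCL,ALB\n1,72,55,1,88,4.1\n2,81,49,0,95,3.8\n")
)
df = merge_subject_tables(ebes, covariates)
print(df.shape)
```

An outer join is used here so that a subject missing from either file is reported rather than silently dropped, which is what a default inner join would do.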
# Two-file workflow: load EBEs and covariates separately, then join on subject ID
import pandas as pd
from pharmacoml.covselect import HybridScreener

ebes = pd.read_csv("ebes.csv")
covariates = pd.read_csv("covariates.csv")
df = ebes.merge(covariates, on="ID")

# ID is only the join key; it is not passed to the screener
ebe_cols = ["CL", "V", "KA"]
cov_cols = ["WT", "AGE", "SEX", "CRCL", "ALB"]

report = HybridScreener(include_scm=True).fit(
    ebes=df[ebe_cols],
    covariates=df[cov_cols],
)
print(report.confirmed_covariates())
print(report.candidate_covariates())
print(report.proxy_groups())
print(report.to_nonmem_candidates())
# Single-file workflow: EBEs and covariates are already merged in analysis_dataset.csv
import pandas as pd
from pharmacoml.covselect import HybridScreener

df = pd.read_csv("analysis_dataset.csv")

report = HybridScreener(include_scm=True).fit(
    ebes=df[["CL", "V", "KA"]],
    covariates=df[["WT", "AGE", "SEX", "CRCL", "ALB"]],
)
print(report.confirmed_covariates())
print(report.candidate_covariates())
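If you know the ETA shrinkage of each parameter from your base model, pass it in. High shrinkage means the EBEs are pulled toward the population typical value and carry little individual information, so EBE-covariate relationships for those parameters are less reliable. Supplying shrinkage makes the screening more pharmacometrically credible for such low-information parameters.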
parameter_shrinkage = {
    "CL": 0.12,
    "V": 0.28,
    "KA": 0.22,
}
report = HybridScreener(include_scm=True).fit(
    ebes=df[ebe_cols],
    covariates=df[cov_cols],
    parameter_shrinkage=parameter_shrinkage,
)
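If your estimation software did not report shrinkage, the widely used SD-based definition is 1 - SD(eta_hat)/omega, where eta_hat are the individual ETA estimates and omega squared is the estimated between-subject variance. The sketch below computes that quantity with NumPy and sanity-checks a shrinkage dictionary before it is passed on. The function names and the ETA values are made up for illustration, and the fraction (rather than percent) convention is an assumption inferred from the example values above, not confirmed pharmacoml behavior.

```python
import numpy as np


def eta_shrinkage_sd(etas, omega_variance: float) -> float:
    """SD-based ETA shrinkage: 1 - SD(eta_hat) / omega."""
    etas = np.asarray(etas, dtype=float)
    return float(1.0 - etas.std(ddof=1) / np.sqrt(omega_variance))


def check_shrinkage(shrinkage: dict, ebe_cols: list) -> None:
    """Raise if a parameter has no shrinkage value or a value that looks like a percent."""
    missing = [p for p in ebe_cols if p not in shrinkage]
    if missing:
        raise KeyError(f"No shrinkage supplied for: {missing}")
    # Values above 1 usually mean percentages (e.g. 28) were entered instead of fractions (0.28).
    too_large = {p: s for p, s in shrinkage.items() if s > 1.0}
    if too_large:
        raise ValueError(f"Shrinkage should be a fraction, got: {too_large}")


# Hypothetical ETA estimates for CL and a hypothetical OMEGA(CL) variance of 0.09.
eta_cl = [0.1, -0.1, 0.2, -0.2]
shrinkage_cl = eta_shrinkage_sd(eta_cl, omega_variance=0.09)
print(round(shrinkage_cl, 3))

check_shrinkage({"CL": shrinkage_cl, "V": 0.28, "KA": 0.22}, ["CL", "V", "KA"])
```

Note that the formula applies to ETAs, not to back-transformed individual parameters such as CL itself, and that small samples can give slightly negative values, which is why the check only rejects values above 1.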
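The snapshots below show the shape of the output a user should expect after running the tutorial example on the toy dataset shown earlier on this page. Treat the specific values as illustrative: BSA, for instance, appears in the summary and proxy tables but is not a column in the toy covariates.csv.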
Summary table
parameter covariate functional_form combined_score tier support_count scm_selected
0 CL WT power 0.9420 core 3 True
1 CL AGE linear 0.6110 candidate 2 False
2 CL BSA linear 0.4020 proxy 1 False
3 V WT power 0.8870 core 3 True
4 V ALB linear 0.1730 rejected 1 False
Confirmed covariates
parameter covariate functional_form confirmation_status
0 CL WT power scm
1 V WT power scm
Candidate covariates
parameter covariate functional_form combined_score tier
0 CL WT power 0.9420 core
1 CL AGE linear 0.6110 candidate
2 V WT power 0.8870 core
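Judging from the snapshots above, the candidate table holds the summary rows whose tier is core or candidate. Assuming the summary is available as a pandas DataFrame with the columns shown, that shortlist can be reproduced and exported with plain pandas. The DataFrame below is typed in by hand from the summary snapshot; it is not produced by pharmacoml here, and the tier-based filter is an inference from the tables rather than documented behavior.

```python
import pandas as pd

# Hand-copied from the "Summary table" snapshot (support_count and scm_selected omitted).
summary = pd.DataFrame(
    {
        "parameter": ["CL", "CL", "CL", "V", "V"],
        "covariate": ["WT", "AGE", "BSA", "WT", "ALB"],
        "functional_form": ["power", "linear", "linear", "power", "linear"],
        "combined_score": [0.942, 0.611, 0.402, 0.887, 0.173],
        "tier": ["core", "candidate", "proxy", "core", "rejected"],
    }
)

# Keep only the tiers that appear in the candidate snapshot.
shortlist = summary[summary["tier"].isin(["core", "candidate"])].reset_index(drop=True)

# to_csv without a path returns the CSV text; pass a file path instead to write to disk.
csv_text = shortlist.to_csv(index=False)
print(shortlist)
```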
Proxy groups
parameter proxy_group_id representative proxy_member
0 CL G1 WT BSA
NONMEM-style candidate block
# pharmacoml hybrid candidate covariates (nonmem)
# core = strongest evidence, candidate = carry forward to SCM/backward elimination
; [CORE] WT -> CL | form=power | score=0.942 scm=yes
; [CANDIDATE] AGE -> CL | form=linear | score=0.611
; [CORE] WT -> V | form=power | score=0.887 scm=yes
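The form= field names the shape of the covariate effect. The tutorial does not spell out the exact parameterization pharmacoml assumes, so the sketch below uses the conventional pharmacometric forms as an assumption: a power model scaled to a reference value and a linear model centered on a reference value. The typical value, coefficients (0.75, -0.005), reference values (70 kg, 50 years), and function names are illustrative only.

```python
def power_effect(cov, ref, exponent):
    """Power form: multiplicative factor (cov / ref) ** exponent, equal to 1 at cov == ref."""
    return (cov / ref) ** exponent


def linear_effect(cov, ref, slope):
    """Linear form: multiplicative factor 1 + slope * (cov - ref), equal to 1 at cov == ref."""
    return 1.0 + slope * (cov - ref)


# Hypothetical typical clearance and coefficients; subject 1 of the toy data has WT=72, AGE=55.
tv_cl = 5.0
cl_subject_1 = tv_cl * power_effect(72, 70, 0.75) * linear_effect(55, 50, -0.005)
print(round(cl_subject_1, 3))
```

When candidates are carried into a confirmation model, each form would typically correspond to one estimated coefficient (the exponent or the slope) per parameter-covariate pair.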
Interaction screening
report = HybridScreener(
    include_scm=True,
    include_interactions=True,
    interaction_top_n=3,
).fit(df[ebe_cols], df[cov_cols])
Symbolic structure search
report = HybridScreener(
    include_symbolic=True,
    symbolic_backend="basis",
).fit(df[ebe_cols], df[cov_cols])
| Tier | Meaning |
|---|---|
| core | Strongest ML-supported signals |
| candidate | Shortlist to carry into SCM/backward elimination |
| confirmed | Compact answer after SCM-style confirmation |
| proxy | Correlated alternatives to selected covariates |
Which result should I look at first?
Start with report.confirmed_covariates() for the most
compact daily-use answer. This is the easiest view to hand to a
pharmacometrician who wants the short list with the strongest
confirmation behind it.
Then review report.candidate_covariates(). This is the
broader shortlist and is often the most useful table to carry into
SCM, backward elimination, or manual covariate review.
When do the other outputs matter?
Use report.core_covariates() when you want to see the
strongest AI/ML-supported signals before confirmation. Use
report.proxy_groups() when correlated variables are
plausible and you need to understand which covariates are acting as
substitutes. Use report.interaction_covariates() only
when interaction screening has been enabled.