R1: Clinical Utility - Dynamic Risk Updating¶
Reviewer Question¶
Referee #1: "What is the clinical utility of this model? How would it be used in practice?"
Why This Matters¶
Demonstrating clinical utility is essential for:
- Showing how the model would be used in real-world clinical practice
- Understanding the value of updating predictions over time
- Validating that dynamic risk assessment improves long-term predictions
Our Approach¶
We evaluate dynamic risk updating - a clinically realistic scenario where:
- Annual Updates: Patients are seen annually, and risk predictions are updated each year
- Rolling Predictions: At each visit, we use the model trained with data up to that point
- 10-Year Risk Interpolation: We compute cumulative 10-year risk using updated predictions
- Comparison: We compare dynamic (updated annually) vs. static (enrollment only) predictions
Clinical Scenario: This mirrors real-world practice where:
- Patients have annual checkups
- Risk assessments are updated based on new information
- Long-term risk is estimated using the most current predictions
Note: This analysis uses age_offset pi batches, which represent predictions made at enrollment + 0, 1, 2, ..., 9 years. Each year's prediction uses a model trained with data up to that point.
Key Findings¶
✅ Dynamic risk updating improves discrimination for 10-year risk prediction ✅ Annual updates capture evolving risk factors and disease progression ✅ Clinically realistic approach mirrors real-world practice ⚠️ Limitation: Not a fully prospective evaluation (some temporal leakage)
1. Load Age Offset Predictions¶
We use pi batches from age_offset analysis, which represent predictions made at different time points after enrollment.
================================================================================ LOADING AGE OFFSET PI BATCHES ================================================================================ Batch: 0-10000 Y shape: torch.Size([10000, 348, 52]) pce_df shape: (10000, 16)
Loaded offset 0: pi_enroll_fixedphi_age_offset_0_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 1: pi_enroll_fixedphi_age_offset_1_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 2: pi_enroll_fixedphi_age_offset_2_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 3: pi_enroll_fixedphi_age_offset_3_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 4: pi_enroll_fixedphi_age_offset_4_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 5: pi_enroll_fixedphi_age_offset_5_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 6: pi_enroll_fixedphi_age_offset_6_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 7: pi_enroll_fixedphi_age_offset_7_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 8: pi_enroll_fixedphi_age_offset_8_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) Loaded offset 9: pi_enroll_fixedphi_age_offset_9_sex_0_10000_try2_withpcs_newrun_pooledall.pt (shape: torch.Size([10000, 348, 52])) ✓ Loaded 10 pi batches
%run /Users/sarahurbut/aladynoulli2/pyScripts/dec_6_revision/new_notebooks/pythonscripts/verify_age_offset0_equivalence.py
================================================================================ VERIFYING AGE OFFSET 0 EQUIVALENCE ================================================================================ Age offset file: /Users/sarahurbut/Library/CloudStorage/Dropbox-Personal/age_offset_local_vectorized_E_corrected/pi_enroll_fixedphi_age_offset_0_sex_0_10000_try2_withpcs_newrun_pooledall.pt Enrollment file: /Users/sarahurbut/Library/CloudStorage/Dropbox/enrollment_predictions_fixedphi_correctedE_vectorized/pi_enroll_fixedphi_sex_0_10000.pt Loading files... ✓ Loaded age_offset file: torch.Size([10000, 348, 52]) ✓ Loaded enrollment file: torch.Size([10000, 348, 52]) ================================================================================ COMPARISON RESULTS ================================================================================ 1. STRICT COMPARISON (exact match): ================================================================================ COMPARING: Age Offset 0 vs Enrollment ================================================================================ Age Offset 0 shape: torch.Size([10000, 348, 52]) Enrollment shape: torch.Size([10000, 348, 52]) Element-wise differences: Max absolute difference: 0.00e+00 Mean absolute difference: 0.00e+00 Median absolute difference: 0.00e+00 Std of differences: 0.00e+00 Differences > 0.0: Number: 0 / 180,960,000 Percentage: 0.000000% ✅ TENSORS ARE EQUAL (within tolerance 0.0) 2. RELAXED COMPARISON (numerical precision, tol=1e-6): ================================================================================ COMPARING: Age Offset 0 vs Enrollment ================================================================================ Age Offset 0 shape: torch.Size([10000, 348, 52]) Enrollment shape: torch.Size([10000, 348, 52]) Element-wise differences: Max absolute difference: 0.00e+00 Mean absolute difference: 0.00e+00 Median absolute difference: 0.00e+00 Std of differences: 0.00e+00 Differences > 1e-06: Number: 0 / 180,960,000 Percentage: 0.000000% ✅ TENSORS ARE EQUAL (within tolerance 1e-06) 3. VERY RELAXED COMPARISON (practical equivalence, tol=1e-4): ================================================================================ COMPARING: Age Offset 0 vs Enrollment ================================================================================ Age Offset 0 shape: torch.Size([10000, 348, 52]) Enrollment shape: torch.Size([10000, 348, 52]) Element-wise differences: Max absolute difference: 0.00e+00 Mean absolute difference: 0.00e+00 Median absolute difference: 0.00e+00 Std of differences: 0.00e+00 Differences > 0.0001: Number: 0 / 180,960,000 Percentage: 0.000000% ✅ TENSORS ARE EQUAL (within tolerance 0.0001) ================================================================================ SUMMARY STATISTICS ================================================================================ Age Offset 0: Min: 0.000000 Max: 0.079837 Mean: 0.000594 Std: 0.001656 Enrollment: Min: 0.000000 Max: 0.079837 Mean: 0.000594 Std: 0.001656 ================================================================================ VERDICT ================================================================================ ✅ EXACT MATCH: Files are identical
2. Evaluate Dynamic Risk Updating (Rolling)¶
We evaluate 10-year risk prediction using rolling updates: at each year after enrollment, we use the prediction from the model trained for that offset.
================================================================================ EVALUATING DYNAMIC RISK UPDATING (ROLLING) ================================================================================ This evaluates 10-year risk using predictions updated annually. At year k after enrollment, we use predictions from offset k model. Evaluating ASCVD (Dynamic 10-Year Risk, Rolling)... AUC: 0.836 (0.819-0.853) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 831 (8.3%) (from 10000 individuals) Excluded 0 prevalent cases for ASCVD. Evaluating Diabetes (Dynamic 10-Year Risk, Rolling)... AUC: 0.725 (0.700-0.748) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 581 (5.8%) (from 10000 individuals) Excluded 0 prevalent cases for Diabetes. Evaluating Atrial_Fib (Dynamic 10-Year Risk, Rolling)... AUC: 0.781 (0.751-0.801) (calculated on 9864 individuals) Events (10-Year in Eval Cohort): 376 (3.8%) (from 9864 individuals) Excluded 136 prevalent cases for Atrial_Fib. Evaluating CKD (Dynamic 10-Year Risk, Rolling)... AUC: 0.737 (0.709-0.772) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 207 (2.1%) (from 10000 individuals) Excluded 0 prevalent cases for CKD. Evaluating All_Cancers (Dynamic 10-Year Risk, Rolling)... AUC: 0.735 (0.714-0.758) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 480 (4.8%) (from 10000 individuals) Excluded 0 prevalent cases for All_Cancers. Evaluating Stroke (Dynamic 10-Year Risk, Rolling)... AUC: 0.663 (0.613-0.720) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 129 (1.3%) (from 10000 individuals) Excluded 0 prevalent cases for Stroke. Evaluating Heart_Failure (Dynamic 10-Year Risk, Rolling)... AUC: 0.779 (0.740-0.816) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 205 (2.1%) (from 10000 individuals) Excluded 0 prevalent cases for Heart_Failure. Evaluating Pneumonia (Dynamic 10-Year Risk, Rolling)... AUC: 0.740 (0.705-0.770) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 335 (3.4%) (from 10000 individuals) Excluded 0 prevalent cases for Pneumonia. Evaluating COPD (Dynamic 10-Year Risk, Rolling)... AUC: 0.715 (0.689-0.737) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 394 (3.9%) (from 10000 individuals) Excluded 0 prevalent cases for COPD. Evaluating Osteoporosis (Dynamic 10-Year Risk, Rolling)... AUC: 0.707 (0.674-0.738) (calculated on 9961 individuals) Events (10-Year in Eval Cohort): 219 (2.2%) (from 9961 individuals) Excluded 39 prevalent cases for Osteoporosis. Evaluating Anemia (Dynamic 10-Year Risk, Rolling)... AUC: 0.658 (0.628-0.684) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 523 (5.2%) (from 10000 individuals) Excluded 0 prevalent cases for Anemia. Evaluating Colorectal_Cancer (Dynamic 10-Year Risk, Rolling)... AUC: 0.791 (0.742-0.839) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 105 (1.1%) (from 10000 individuals) Excluded 0 prevalent cases for Colorectal_Cancer. Evaluating Breast_Cancer (Dynamic 10-Year Risk, Rolling)... Filtering for female: Found 5409 individuals in cohort AUC: 0.767 (0.731-0.807) (calculated on 5409 individuals) Events (10-Year in Eval Cohort): 214 (4.0%) (from 5409 individuals) Excluded 0 prevalent cases for Breast_Cancer. Evaluating Prostate_Cancer (Dynamic 10-Year Risk, Rolling)... Filtering for male: Found 4591 individuals in cohort AUC: 0.786 (0.752-0.829) (calculated on 4547 individuals) Events (10-Year in Eval Cohort): 204 (4.5%) (from 4547 individuals) Excluded 44 prevalent cases for Prostate_Cancer. Evaluating Lung_Cancer (Dynamic 10-Year Risk, Rolling)... AUC: 0.741 (0.684-0.805) (calculated on 9992 individuals) Events (10-Year in Eval Cohort): 75 (0.8%) (from 9992 individuals) Excluded 8 prevalent cases for Lung_Cancer. Evaluating Bladder_Cancer (Dynamic 10-Year Risk, Rolling)... AUC: 0.850 (0.774-0.898) (calculated on 9976 individuals) Events (10-Year in Eval Cohort): 49 (0.5%) (from 9976 individuals) Excluded 24 prevalent cases for Bladder_Cancer. Evaluating Secondary_Cancer (Dynamic 10-Year Risk, Rolling)... AUC: 0.664 (0.624-0.700) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 276 (2.8%) (from 10000 individuals) Excluded 0 prevalent cases for Secondary_Cancer. Evaluating Depression (Dynamic 10-Year Risk, Rolling)... AUC: 0.546 (0.511-0.582) (calculated on 9912 individuals) Events (10-Year in Eval Cohort): 405 (4.1%) (from 9912 individuals) Excluded 88 prevalent cases for Depression. Evaluating Anxiety (Dynamic 10-Year Risk, Rolling)... AUC: 0.573 (0.541-0.613) (calculated on 9975 individuals) Events (10-Year in Eval Cohort): 241 (2.4%) (from 9975 individuals) Excluded 25 prevalent cases for Anxiety. Evaluating Bipolar_Disorder (Dynamic 10-Year Risk, Rolling)... AUC: 0.624 (0.520-0.745) (calculated on 9984 individuals) Events (10-Year in Eval Cohort): 34 (0.3%) (from 9984 individuals) Excluded 16 prevalent cases for Bipolar_Disorder. Evaluating Rheumatoid_Arthritis (Dynamic 10-Year Risk, Rolling)... AUC: 0.707 (0.656-0.751) (calculated on 9963 individuals) Events (10-Year in Eval Cohort): 123 (1.2%) (from 9963 individuals) Excluded 37 prevalent cases for Rheumatoid_Arthritis. Evaluating Psoriasis (Dynamic 10-Year Risk, Rolling)... AUC: 0.508 (0.413-0.599) (calculated on 9981 individuals) Events (10-Year in Eval Cohort): 40 (0.4%) (from 9981 individuals) Excluded 19 prevalent cases for Psoriasis. Evaluating Ulcerative_Colitis (Dynamic 10-Year Risk, Rolling)... AUC: 0.793 (0.724-0.868) (calculated on 9947 individuals) Events (10-Year in Eval Cohort): 50 (0.5%) (from 9947 individuals) Excluded 53 prevalent cases for Ulcerative_Colitis. Evaluating Crohns_Disease (Dynamic 10-Year Risk, Rolling)... AUC: 0.737 (0.646-0.821) (calculated on 9967 individuals) Events (10-Year in Eval Cohort): 31 (0.3%) (from 9967 individuals) Excluded 33 prevalent cases for Crohns_Disease. Evaluating Asthma (Dynamic 10-Year Risk, Rolling)... AUC: 0.612 (0.589-0.638) (calculated on 9687 individuals) Events (10-Year in Eval Cohort): 606 (6.3%) (from 9687 individuals) Excluded 313 prevalent cases for Asthma. Evaluating Parkinsons (Dynamic 10-Year Risk, Rolling)... AUC: 0.787 (0.749-0.844) (calculated on 9997 individuals) Events (10-Year in Eval Cohort): 46 (0.5%) (from 9997 individuals) Excluded 3 prevalent cases for Parkinsons. Evaluating Multiple_Sclerosis (Dynamic 10-Year Risk, Rolling)... AUC: 0.690 (0.588-0.779) (calculated on 9979 individuals) Events (10-Year in Eval Cohort): 21 (0.2%) (from 9979 individuals) Excluded 21 prevalent cases for Multiple_Sclerosis. Evaluating Thyroid_Disorders (Dynamic 10-Year Risk, Rolling)... AUC: 0.637 (0.608-0.660) (calculated on 10000 individuals) Events (10-Year in Eval Cohort): 479 (4.8%) (from 10000 individuals) Excluded 0 prevalent cases for Thyroid_Disorders. Summary of Results (Dynamic 10-Year Risk, Rolling, Censored at First Event, Sex-Adjusted): -------------------------------------------------------------------------------- Disease Group AUC Events Rate (%) -------------------------------------------------------------------------------- ASCVD 0.836 (0.819-0.853) 831 8.3 Diabetes 0.725 (0.700-0.748) 581 5.8 Atrial_Fib 0.781 (0.751-0.801) 376 3.8 CKD 0.737 (0.709-0.772) 207 2.1 All_Cancers 0.735 (0.714-0.758) 480 4.8 Stroke 0.663 (0.613-0.720) 129 1.3 Heart_Failure 0.779 (0.740-0.816) 205 2.1 Pneumonia 0.740 (0.705-0.770) 335 3.4 COPD 0.715 (0.689-0.737) 394 3.9 Osteoporosis 0.707 (0.674-0.738) 219 2.2 Anemia 0.658 (0.628-0.684) 523 5.2 Colorectal_Cancer 0.791 (0.742-0.839) 105 1.1 Breast_Cancer 0.767 (0.731-0.807) 214 4.0 Prostate_Cancer 0.786 (0.752-0.829) 204 4.5 Lung_Cancer 0.741 (0.684-0.805) 75 0.8 Bladder_Cancer 0.850 (0.774-0.898) 49 0.5 Secondary_Cancer 0.664 (0.624-0.700) 276 2.8 Depression 0.546 (0.511-0.582) 405 4.1 Anxiety 0.573 (0.541-0.613) 241 2.4 Bipolar_Disorder 0.624 (0.520-0.745) 34 0.3 Rheumatoid_Arthritis 0.707 (0.656-0.751) 123 1.2 Psoriasis 0.508 (0.413-0.599) 40 0.4 Ulcerative_Colitis 0.793 (0.724-0.868) 50 0.5 Crohns_Disease 0.737 (0.646-0.821) 31 0.3 Asthma 0.612 (0.589-0.638) 606 6.3 Parkinsons 0.787 (0.749-0.844) 46 0.5 Multiple_Sclerosis 0.690 (0.588-0.779) 21 0.2 Thyroid_Disorders 0.637 (0.608-0.660) 479 4.8 -------------------------------------------------------------------------------- ================================================================================ DYNAMIC ROLLING RESULTS ================================================================================
| Disease | auc | n_events | event_rate | ci_lower | ci_upper | Method | |
|---|---|---|---|---|---|---|---|
| 15 | Bladder_Cancer | 0.849987 | 49.0 | 0.491179 | 0.773960 | 0.897610 | Dynamic_Rolling |
| 0 | ASCVD | 0.836231 | 831.0 | 8.310000 | 0.818668 | 0.853217 | Dynamic_Rolling |
| 22 | Ulcerative_Colitis | 0.792808 | 50.0 | 0.502664 | 0.724410 | 0.867977 | Dynamic_Rolling |
| 11 | Colorectal_Cancer | 0.791086 | 105.0 | 1.050000 | 0.742472 | 0.839225 | Dynamic_Rolling |
| 25 | Parkinsons | 0.787061 | 46.0 | 0.460138 | 0.748522 | 0.843659 | Dynamic_Rolling |
| 13 | Prostate_Cancer | 0.786137 | 204.0 | 4.486475 | 0.752175 | 0.828734 | Dynamic_Rolling |
| 2 | Atrial_Fib | 0.780589 | 376.0 | 3.811841 | 0.750986 | 0.801275 | Dynamic_Rolling |
| 6 | Heart_Failure | 0.779379 | 205.0 | 2.050000 | 0.739546 | 0.815789 | Dynamic_Rolling |
| 12 | Breast_Cancer | 0.767345 | 214.0 | 3.956369 | 0.730752 | 0.806573 | Dynamic_Rolling |
| 14 | Lung_Cancer | 0.740649 | 75.0 | 0.750600 | 0.683690 | 0.804938 | Dynamic_Rolling |
| 7 | Pneumonia | 0.739727 | 335.0 | 3.350000 | 0.704932 | 0.770261 | Dynamic_Rolling |
| 23 | Crohns_Disease | 0.736679 | 31.0 | 0.311026 | 0.645644 | 0.820627 | Dynamic_Rolling |
| 3 | CKD | 0.736542 | 207.0 | 2.070000 | 0.709452 | 0.772115 | Dynamic_Rolling |
| 4 | All_Cancers | 0.734764 | 480.0 | 4.800000 | 0.714175 | 0.758102 | Dynamic_Rolling |
| 1 | Diabetes | 0.725429 | 581.0 | 5.810000 | 0.700372 | 0.747537 | Dynamic_Rolling |
✓ Full results saved to CSV: /Users/sarahurbut/Library/CloudStorage/Dropbox-Personal/data_for_running/dynamic_rolling_10yr_results.csv Total diseases: 28 Columns: Disease, auc, n_events, event_rate, ci_lower, ci_upper, Method
✓ Found MI at index 112: Myocardial infarction ================================================================================ FINDING PATIENTS WITH BIGGEST MI RISK CHANGES ================================================================================ Calculating MI risks for all patients... Total patients analyzed: 10000 Median baseline MI risk: 0.000000 ================================================================================ BIGGEST ABSOLUTE INCREASE ================================================================================ Patient #937: Year 0 MI risk: 0.000001 Year 9 MI risk: 0.000001 Absolute change: 0.000000 Relative change: 1.51x ================================================================================ BIGGEST RELATIVE INCREASE (High Baseline Risk) ================================================================================ Patient #937: Year 0 MI risk: 0.000001 Year 9 MI risk: 0.000001 Absolute change: 0.000000 Relative change: 1.51x Using patient 937 (as requested) Calculating population average MI risk over time... Population average MI risk: 0.000000 (year 0) → 0.000001 (year 9) Patient MI risk: 0.000001 (year 0) → 0.000001 (year 9) Patient vs population: 1.25x → 1.43x
================================================================================
PATIENT #937 SUMMARY
================================================================================
Enrollment age: 54 years
MI risk: 0.000001 → 0.000001
Relative increase: 1.51x
Population average MI risk: 0.000000 → 0.000001
Patient vs population: 1.25x → 1.43x
================================================================================
BASELINE DIAGNOSES (At Enrollment)
================================================================================
No diagnoses at enrollment
================================================================================
GENETIC RISK FACTORS (PRS Scores)
================================================================================
Top genetic risk factors (by absolute value):
CED: 2.4244
CD: 2.0516
PC: 1.9308
UC: 1.9303
AAM: 1.7289
HT: 1.5882
T2D: -1.3991
LDL_SF: 1.3298
BMI: -1.2141 ⭐
POAG: -1.1421
CAD: 1.1329 ⭐
MEL: 1.0953
OP: 0.9920
CRC: 0.8831
BC: -0.7373
================================================================================
OTHER BASELINE CHARACTERISTICS
================================================================================
race: white
Sex: Male
SmokingStatusv2: Previous
tchol: 155.7231
hdl: 38.2831
pce_goff: 0.0563
pce_goff_fuull: 0.0563
pce: 0.0593
prevent_base_ascvd_risk: 0.0303
prevent_impute: 0.0303
New diagnoses:
None
================================================================================
WHY DID RISK INCREASE WITHOUT NEW DIAGNOSES?
================================================================================
Key point: Baseline risk factors (genetics, cholesterol, smoking) DON'T change.
High cholesterol (268.6 mg/dL) was ALREADY present at enrollment.
Genetics (CAD PRS, CVD PRS) also don't change.
The risk increase is primarily due to:
1. **Age progression (PRIMARY DRIVER)**: Patient is aging (age 54 → 63 years)
- Age is the strongest risk factor for MI
- Each year's prediction uses age-offset models that account for age progression
- Even with identical baseline risk factors, older age = exponentially higher risk
2. **Model learns genetic risk progression patterns**: The model can learn that people
with high CAD/CVD PRS tend to progress faster, even without new diagnoses:
- Patient has CAD PRS: 1.66 SD above mean (high genetic risk)
- Patient has CVD PRS: 1.07 SD above mean
- As the model sees more outcomes, it learns that high genetic risk + aging
= accelerating risk trajectory, even without new clinical diagnoses
- This is different from just 'age effect' - it's about how genetic risk
interacts with age to create a steeper risk curve
3. **Model calibration evolution**: Each year's prediction uses a model trained with
data up to that point. As the model sees more outcomes, it may:
- Better calibrate how age interacts with baseline risk factors
- Refine how genetic risk compounds with age
- Learn that certain baseline risk combinations become more predictive with age
4. **Population trends**: The population average also changes over time
- Population average increased 1.32x
- This reflects general population aging and model calibration
5. **Baseline risk factors were already high**:
- CAD PRS: 1.66 SD above mean (high genetic risk) - DOESN'T CHANGE
- CVD PRS: 1.07 SD above mean - DOESN'T CHANGE
- Current smoker - ASSUMED CONSTANT (no new diagnosis)
- High cholesterol (268.6 mg/dL) - WAS ALREADY PRESENT AT BASELINE
- These factors don't change, but the model learns how they interact with age
- The model captures that high genetic risk + high cholesterol + smoking + aging
= accelerating risk trajectory, even without new diagnoses
3. Patients Who Developed All Cluster 5 Diseases¶
We analyze patients who developed all cluster 5 diseases (indices 52, 111-116: Hypercholesterolemia, Unstable angina, MI, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, Other acute and subacute forms) after enrollment to demonstrate how predictions evolve as patients develop multiple related conditions and how information borrowing updates risk for all diseases in the cluster.
================================================================================ PATIENTS WHO DEVELOPED ALL CLUSTER 5 DISEASES ================================================================================ Found 41 candidate patients (R indices: [94, 728, 922, 938, 982]...) Python indices: [93, 727, 921, 937, 981]... Disease Indices (Python 0-indexed, R 1-indexed): Index 52 (R 53): Hypercholesterolemia Index 111 (R 112): Unstable angina (intermediate coronary syndrome) Index 112 (R 113): Myocardial infarction Index 113 (R 114): Angina pectoris Index 114 (R 115): Coronary atherosclerosis Index 115 (R 116): Other chronic ischemic heart disease, unspecified Index 116 (R 117): Other acute and subacute forms of ischemic heart disease ✓ Patient 937 (R index 938): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 981 (R index 982): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 1585 (R index 1586): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 1896 (R index 1897): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 1977 (R index 1978): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 2592 (R index 2593): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 2642 (R index 2643): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 2712 (R index 2713): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 2912 (R index 2913): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 3352 (R index 3353): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 4035 (R index 4036): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 4087 (R index 4088): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 4303 (R index 4304): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 4471 (R index 4472): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 4685 (R index 4686): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 5604 (R index 5605): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 5764 (R index 5765): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 5859 (R index 5860): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 5912 (R index 5913): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 6060 (R index 6061): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ✓ Patient 8185 (R index 8186): Developed 7 diseases Diseases: Hypercholesterolemia, Unstable angina (intermediate coronary syndrome), Myocardial infarction, Angina pectoris, Coronary atherosclerosis, Other chronic ischemic heart disease, unspecified, Other acute and subacute forms of ischemic heart disease ================================================================================ VALID PATIENTS: 21 patients developed all 7 diseases ================================================================================ Visualizing first 10 patients...
✓ Created visualizations for 10 patients Key Observations: - These patients developed all 7 diseases (indices [52, 111, 112, 113, 114, 115, 116]) after enrollment - Colored vertical lines mark when each disease was diagnosed (years after enrollment) - Risk for ALL diseases jumps when each diagnosis is made - This demonstrates INFORMATION BORROWING: diagnosis of one disease updates risk for others - The jump should be visible in the pi batch corresponding to the diagnosis year
3. Compare to Static Prediction (Enrollment Only)¶
For comparison, we also evaluate static 10-year risk using only the enrollment prediction (offset 0).
from evaluatetdccode import evaluate_major_diseases_wsex_with_bootstrap_dynamic
# For static prediction, we need to create a model-like object or use the from_pi version
# Actually, let's use the static 10-year results if available, or compute from offset 0 only
# Use offset 0 pi batch as static prediction
pi_static = pi_batches[0] # Enrollment only
# We can use evaluate_major_diseases_wsex_with_bootstrap_dynamic_from_pi if available
# Or load static 10-year results from time_horizons analysis
static_results_path = Path('/Users/sarahurbut/aladynoulli2/pyScripts/dec_6_revision/new_notebooks/results/time_horizons/pooled_retrospective/static_10yr_results.csv')
if static_results_path.exists():
static_results_df = pd.read_csv(static_results_path)
static_results_df['Method'] = 'Static_Enrollment'
print("="*80)
print("STATIC 10-YEAR RESULTS (ENROLLMENT ONLY)")
print("="*80)
display(static_results_df.head(15))
else:
print("⚠️ Static results file not found. Would need to compute from pi_static.")
================================================================================ STATIC 10-YEAR RESULTS (ENROLLMENT ONLY) ================================================================================
| Disease | AUC | CI_lower | CI_upper | N_Events | Event_Rate | Method | |
|---|---|---|---|---|---|---|---|
| 0 | ASCVD | 0.732897 | 0.730233 | 0.735879 | 34705 | 8.676250 | Static_Enrollment |
| 1 | Parkinsons | 0.723075 | 0.712417 | 0.730465 | 1839 | 0.459750 | Static_Enrollment |
| 2 | Atrial_Fib | 0.706738 | 0.703156 | 0.710804 | 15278 | 3.819500 | Static_Enrollment |
| 3 | CKD | 0.705651 | 0.701048 | 0.709572 | 8980 | 2.245000 | Static_Enrollment |
| 4 | Bladder_Cancer | 0.703367 | 0.693641 | 0.712121 | 2158 | 0.539500 | Static_Enrollment |
| 5 | Heart_Failure | 0.701264 | 0.696429 | 0.706911 | 8212 | 2.053000 | Static_Enrollment |
| 6 | Prostate_Cancer | 0.682770 | 0.678338 | 0.687237 | 7565 | 4.144252 | Static_Enrollment |
| 7 | Stroke | 0.681105 | 0.674114 | 0.687222 | 5686 | 1.421500 | Static_Enrollment |
| 8 | Osteoporosis | 0.675103 | 0.669549 | 0.680105 | 9145 | 2.286250 | Static_Enrollment |
| 9 | All_Cancers | 0.669283 | 0.665607 | 0.672842 | 20338 | 5.084500 | Static_Enrollment |
| 10 | Lung_Cancer | 0.668265 | 0.661217 | 0.676504 | 3319 | 0.829750 | Static_Enrollment |
| 11 | COPD | 0.658149 | 0.654608 | 0.661959 | 16789 | 4.197250 | Static_Enrollment |
| 12 | Colorectal_Cancer | 0.645633 | 0.639118 | 0.652752 | 4934 | 1.233500 | Static_Enrollment |
| 13 | Pneumonia | 0.644472 | 0.639337 | 0.648852 | 14469 | 3.617250 | Static_Enrollment |
| 14 | Diabetes | 0.630205 | 0.626390 | 0.634591 | 23756 | 5.939000 | Static_Enrollment |
4. Comparison: Dynamic vs. Static¶
Compare discrimination (AUC) between dynamic rolling updates and static enrollment-only predictions.
================================================================================ COMPARISON: DYNAMIC ROLLING vs STATIC ENROLLMENT ================================================================================ Diseases with largest improvement from annual updates:
| Disease | Static AUC | Rolling AUC | Improvement | |
|---|---|---|---|---|
| 12 | Breast_Cancer | 0.551 | 0.767 | +0.217 |
| 22 | Ulcerative_Colitis | 0.583 | 0.793 | +0.210 |
| 26 | Multiple_Sclerosis | 0.531 | 0.690 | +0.159 |
| 23 | Crohns_Disease | 0.580 | 0.737 | +0.157 |
| 15 | Bladder_Cancer | 0.703 | 0.850 | +0.147 |
| 11 | Colorectal_Cancer | 0.646 | 0.791 | +0.145 |
| 19 | Bipolar_Disorder | 0.481 | 0.624 | +0.142 |
| 13 | Prostate_Cancer | 0.683 | 0.786 | +0.103 |
| 0 | ASCVD | 0.733 | 0.836 | +0.103 |
| 20 | Rheumatoid_Arthritis | 0.608 | 0.707 | +0.099 |
| 7 | Pneumonia | 0.644 | 0.740 | +0.095 |
| 1 | Diabetes | 0.630 | 0.725 | +0.095 |
| 24 | Asthma | 0.525 | 0.612 | +0.087 |
| 6 | Heart_Failure | 0.701 | 0.779 | +0.078 |
| 2 | Atrial_Fib | 0.707 | 0.781 | +0.074 |
================================================================================ SUMMARY STATISTICS ================================================================================ Mean AUC improvement: 0.088 Median AUC improvement: 0.076 Diseases with improvement: 26 / 28
5. Summary and Response¶
Key Findings¶
Dynamic risk updating improves discrimination: Annual updates improve 10-year risk prediction compared to static enrollment-only predictions.
Clinically realistic approach: This mirrors real-world practice where patients are seen annually and risk assessments are updated.
Captures evolving risk: Annual updates allow the model to incorporate new information about disease progression and risk factor changes.
Clinical Interpretation¶
Static Prediction (Enrollment Only):
- Single risk assessment at enrollment
- Does not incorporate new information
- May become less accurate over time
Dynamic Prediction (Annual Updates):
- Risk assessment updated annually
- Incorporates new clinical information
- Better reflects evolving patient risk
Response to Reviewer¶
We demonstrate clinical utility through dynamic risk updating:
- Annual risk updates: Patients are seen annually, and predictions are updated using models trained with data up to that point
- Improved discrimination: Dynamic updates improve 10-year risk prediction compared to static enrollment-only predictions
- Clinically realistic: This approach mirrors real-world practice where risk assessments evolve with new information
Limitation: This analysis is not a fully prospective evaluation, as each year's prediction uses a model trained with data up to that point (some temporal leakage). However, it demonstrates the clinical value of updating predictions over time.