An Independent Validation of the Kidney Failure Risk Equation (KFRE) on a Seoul University Hospital Population

Background

KFRE Overview:

Developed by Tangri et al. to predict the risk of progression to kidney failure in CKD patients (stages 3 to 5).
Key predictors identified: age, sex, eGFR, and urine albumin-to-creatinine ratio (ACR).

Model Details:

KFRE estimates the risk of kidney failure at 2 and 5 years using Cox proportional hazards regression.
The equation has been validated across various populations and is widely used for clinical decision-making and patient counseling.

Methods

Python Library Implementation

The kfre Python library was created to replicate the original KFRE equations.
It supports 2-year and 5-year risk calculations using:
- 4-variable equation: age, sex, eGFR, uACR.
- 6-variable equation: Adds diabetes mellitus and hypertension to the 4-variable model.
- 8-variable equation: Adds serum albumin, serum phosphorus, serum bicarbonate, and calcium to the 4-variable model.
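
As a minimal sketch of what the 4-variable equation computes, the function below uses the coefficients and baseline survival values of the non-North American calibration published by Tangri et al. These constants were transcribed for illustration and should be checked against the original publication. The function name kfre_4var and its signature are hypothetical and are not the kfre library's API.

```python
import math

# Baseline survival for the non-North American calibration (assumed constants,
# transcribed from the published equation; verify before clinical use).
BASELINE_SURVIVAL = {2: 0.9832, 5: 0.9365}


def kfre_4var(age, is_male, egfr, uacr, years=2):
    """Return the predicted probability of kidney failure within `years`.

    age in years, egfr in mL/min/1.73 m^2, uacr in mg/g (must be positive).
    """
    if uacr <= 0:
        raise ValueError("uACR must be positive to take its logarithm")
    # Each predictor is centered on the development-cohort mean.
    linear_predictor = (
        -0.2201 * (age / 10 - 7.036)
        + 0.2467 * ((1 if is_male else 0) - 0.5642)
        - 0.5567 * (egfr / 5 - 7.222)
        + 0.4510 * (math.log(uacr) - 5.137)
    )
    return 1 - BASELINE_SURVIVAL[years] ** math.exp(linear_predictor)


# 61-year-old male with eGFR 9.15 and uACR 10
risk_2y = kfre_4var(61, True, 9.15, 10, years=2)
risk_5y = kfre_4var(61, True, 9.15, 10, years=5)
print(f"{risk_2y:.4f} {risk_5y:.4f}")
```

For this patient the sketch gives about 0.1218 at 2 years and 0.3953 at 5 years. A uACR of 0 cannot be log-transformed, so the sketch rejects it. How the kfre library handles such values is not covered here.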

Application

The kfre library enables healthcare professionals and researchers to integrate KFRE calculations into their analyses and decision-making processes.

Further Information

A detailed usage guide is available in the official kfre documentation.

Preprocessing

Creating randomized patient IDs for indexing is crucial for several reasons:

Privacy and Anonymity: Random patient IDs help protect patient privacy and maintain anonymity. This is particularly important in healthcare research to ensure compliance with data protection regulations such as HIPAA.

Data Integrity: Randomized IDs prevent potential biases that could arise from using identifiable information. This ensures that the analysis is based solely on clinical data without any influence from patient identity.

Simplified Data Management: Random IDs facilitate easier data management and tracking.

Scalability: Randomized IDs allow for easier scaling of datasets, as new patients can be added without concern for ID conflicts.

By using random patient IDs, we can enhance the robustness, security, and scalability of our data management processes.
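
One way to generate such IDs is sketched below. It assumes 9-digit integer IDs, as in the patient tables in this report. The helper name make_patient_ids and the optional seed are hypothetical choices for illustration, not the method used to produce the study data.

```python
import random


def make_patient_ids(n_patients, seed=None):
    """Return n_patients unique random 9-digit integers."""
    rng = random.Random(seed)
    # Sampling without replacement from the 9-digit range guarantees
    # that no two patients in this batch share an ID.
    return rng.sample(range(100_000_000, 1_000_000_000), n_patients)


patient_ids = make_patient_ids(5, seed=42)
print(patient_ids)
```

Uniqueness is only guaranteed within one call. When patients are appended later, new IDs should be checked against the existing ones before they are assigned.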

Patient Table

            Age  SEX  HTN  DM       GFR  uACR       ACR    Ca    P
Patient_ID
867721094    61    1    1   0  9.148234    10        10     6    4
533512602    30    0    1   1  153.9749    29        29   9.4  4.5
988350865    55    1    0   0  66.81041        25.77426   9.2  1.6
428707535    35    1    0   0  89.34706         28.4513  10.5  2.9
813646552    66    1    1   0  7.316171     0         0   7.7  6.5

Tangri et al. defined outcomes at 2 and 5 years, whereas ESRD_dur is recorded in days, so the durations must first be converted to years.

The class_esrd_outcome() function creates a new column called years by converting ESRD_dur from days to years.

The years column is then used to classify the ESRD column into two new columns inside df, ESRD_in_2_year_outcome and ESRD_in_5_year_outcome, one for each horizon.
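
The conversion and classification can be sketched in plain Python as follows. This is not the library's class_esrd_outcome() implementation. The helper add_esrd_outcomes is hypothetical, the divisor of 365.25 days per year is an assumption, and so is the rule that an event on or before the horizon counts as an outcome.

```python
DAYS_PER_YEAR = 365.25  # assumption: the source does not state the divisor


def add_esrd_outcomes(rows, horizons=(2, 5)):
    """Add `years` and ESRD_in_<h>_year_outcome keys to each patient dict."""
    for row in rows:
        row["years"] = row["ESRD_dur"] / DAYS_PER_YEAR
        for h in horizons:
            # Assumption: ESRD counts if it occurred on or before the horizon.
            reached = row["ESRD"] == 1 and row["years"] <= h
            row[f"ESRD_in_{h}_year_outcome"] = int(reached)
    return rows


patients = [
    {"ESRD": 0, "ESRD_dur": 2086},  # no ESRD during follow-up
    {"ESRD": 1, "ESRD_dur": 3},     # ESRD after 3 days
    {"ESRD": 1, "ESRD_dur": 1312},  # ESRD after about 3.6 years
]
add_esrd_outcomes(patients)
for p in patients:
    print(p["ESRD_in_2_year_outcome"], p["ESRD_in_5_year_outcome"])
```

The third patient shows why two columns are needed: ESRD after roughly 3.6 years is an event for the 5-year outcome but not for the 2-year outcome.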

ESRD_in_2_year_outcome	ESRD_in_5_year_outcome
0	0
0	0
0	0
0	0
1	1

Patient_ID	Age	HTN	DM	GFR	ACR	Ca	P	Alb	TCO2	ESRD	ESRD_dur	Sex	kfre_4var_2year	kfre_4var_5year	kfre_6var_2year	kfre_6var_5year	kfre_8var_2year	kfre_8var_5year
829432911	61	1	0	9.15	10	6	4	2.6	16	0	2086	Male	0.1218	0.3953	0.1319	0.4155	0.5811	0.9800
451074312	66	1	0	7.32	0	7.7	6.5	3.5	14	1	3	Male	0.0001	0.0004	0.0001	0.0004	0.0046	0.0207
472425367	70	1	0	10.12	0	7.5	3.8	3.2	17	1	93	Male	0.0001	0.0003	0.0001	0.0003	0.0015	0.0067
300680837	49	0	0	7.63	0	8.5	5.4	4.3	23	1	138	Female	0.0001	0.0004	0.0001	0.0004	0.0013	0.0059
105959696	54	1	1	11.34	0	8	5.1	2.9	20	1	311	Male	0.0001	0.0003	0.0001	0.0003	0.0020	0.0091
205521453	56	0	1	34.9	0	8.6	3	3.2	13	1	461	Male	0.0000	0.0000	0.0000	0.0000	0.0001	0.0007
964175840	62	1	1	43.62	0	9.2	3.8	4.3	26	1	1312	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0001
366215045	73	1	0	19.63	0	7.7	2.9	2.4	14	1	1566	Male	0.0000	0.0001	0.0000	0.0001	0.0007	0.0031
703995795	28	1	0	47.68	0	9.8	3.7	4.8	28	1	1587	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0001
193572795	25	1	0	7.2	0	8.4	5.3	4.2	19	1	1705	Male	0.0002	0.0010	0.0002	0.0009	0.0036	0.0162
140269431	64	1	1	53.15	0	9.3	4.3	3.7	25	1	1958	Female	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
167087450	69	1	1	58.17	0	9.7	3.4	4.7	25	1	2856	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
548454181	23	1	0	28.81	0	7	6.8	1.7	20	1	3224	Male	0.0000	0.0001	0.0000	0.0001	0.0020	0.0089
780327933	67	0	1	29.83	0	7.9	2.2	2.8	14	0	28	Female	0.0000	0.0000	0.0000	0.0000	0.0002	0.0008
586810087	67	1	0	47.63	0	9.4	3.7	4	30	0	114	Female	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
128191268	71	1	0	47.07	0	9.3	4.1	4.8	22	0	210	Female	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
354037890	77	1	0	44.77	0	8.4	1.6	2.4	18	0	322	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0001
325811054	67	0	0	58.52	0	10.1	2.9	4.2	26	0	418	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
290397253	71	1	0	42.44	0	9.1	3.1	4.2	31	0	439	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
340488662	71	1	1	49.03	0	9.7	2.9	4.5	29	0	756	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
852670690	71	1	1	57.84	0	8.9	1.6	4.2	28	0	1215	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
532777291	83	0	1	41.12	0	8.4	2.7	3.7	22	0	1267	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0001
770017136	73	1	0	48.76	0	9.5	4.9	3.8	21	0	1323	Male	0.0000	0.0000	0.0000	0.0000	0.0000	0.0001
845855347	28	1	0	31.17	0	10.3	4.7	2.5	27	0	1370	Female	0.0000	0.0000	0.0000	0.0000	0.0002	0.0007
259011659	63	0	0	59.52	0	9.6	4.2	4.4	26	0	1620	Female	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000
451920450	71	1	1	27.19	0	8.9	3.5	3.9	23	0	1680	Male	0.0000	0.0000	0.0000	0.0000	0.0001	0.0004
971884645	65	1	0	22.52	0	9.1	4.1	4	25	0	1911	Male	0.0000	0.0001	0.0000	0.0001	0.0002	0.0007
340172975	66	0	1	58.96	0	9.2	3.1	3.7	14	0	2013	Female	0.0000	0.0000	0.0000	0.0000	0.0000	0.0000

Results

Define Truth for 2-Year and 5-Year Outcomes

Filter the dataset down to CKD stages 3-5, the population for which Tangri et al. developed the KFRE.
Extract the true labels for the 2-year and 5-year outcomes from the DataFrame df.
Assign the true labels for the 2-year outcome to y_true_2_yr and for the 5-year outcome to y_true_5_yr.
Combine these true labels into a list y_true.
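
A minimal sketch of these steps on a toy list of patient records is shown below. It assumes that CKD stages 3-5 are selected by an eGFR below 60 mL/min/1.73 m^2 and uses plain dictionaries in place of the DataFrame df. The toy rows are illustrative only.

```python
# Toy rows standing in for df; the real data holds one row per patient.
rows = [
    {"GFR": 9.15, "ESRD_in_2_year_outcome": 0, "ESRD_in_5_year_outcome": 0},
    {"GFR": 7.32, "ESRD_in_2_year_outcome": 1, "ESRD_in_5_year_outcome": 1},
    {"GFR": 43.62, "ESRD_in_2_year_outcome": 0, "ESRD_in_5_year_outcome": 1},
    {"GFR": 153.97, "ESRD_in_2_year_outcome": 0, "ESRD_in_5_year_outcome": 0},
]

# Assumption: CKD stages 3-5 correspond to an eGFR below 60.
ckd_3_to_5 = [row for row in rows if row["GFR"] < 60]

# True labels for each horizon, in the same patient order.
y_true_2_yr = [row["ESRD_in_2_year_outcome"] for row in ckd_3_to_5]
y_true_5_yr = [row["ESRD_in_5_year_outcome"] for row in ckd_3_to_5]
y_true = [y_true_2_yr, y_true_5_yr]
print(y_true)
```

Filtering before extracting labels keeps the label lists aligned with the predictions taken from the same filtered rows.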

Define Predictions for 4-Variable and 6-Variable KFREs for 2-Years and 5-Years

Extract the predicted probabilities for the 4-variable KFRE model for both the 2-year and 5-year outcomes.
Assign these predictions to y_pred_4var_2_yr and y_pred_4var_5_yr, respectively.
Similarly, extract and assign the predicted probabilities for the 6-variable KFRE model for both the 2-year and 5-year outcomes to y_pred_6var_2_yr and y_pred_6var_5_yr.
Combine the 4-variable model predictions into a list preds_4var.

These steps set up the necessary true labels and predictions for subsequent performance evaluation and analysis of the 4-variable and 6-variable KFRE models for both 2-year and 5-year outcomes.
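
The prediction lists can be assembled in the same way. The sketch below uses the kfre_*var_*year column names from the risk table and a few of its rows. The name preds_6var is an assumption that mirrors preds_4var.

```python
# Three rows with the column names from the risk table in this report.
rows = [
    {"kfre_4var_2year": 0.1218, "kfre_4var_5year": 0.3953,
     "kfre_6var_2year": 0.1319, "kfre_6var_5year": 0.4155},
    {"kfre_4var_2year": 0.0001, "kfre_4var_5year": 0.0004,
     "kfre_6var_2year": 0.0001, "kfre_6var_5year": 0.0004},
    {"kfre_4var_2year": 0.0002, "kfre_4var_5year": 0.0010,
     "kfre_6var_2year": 0.0002, "kfre_6var_5year": 0.0009},
]

y_pred_4var_2_yr = [row["kfre_4var_2year"] for row in rows]
y_pred_4var_5_yr = [row["kfre_4var_5year"] for row in rows]
y_pred_6var_2_yr = [row["kfre_6var_2year"] for row in rows]
y_pred_6var_5_yr = [row["kfre_6var_5year"] for row in rows]

# Order the horizons as [2-year, 5-year] so they line up with y_true.
preds_4var = [y_pred_4var_2_yr, y_pred_4var_5_yr]
preds_6var = [y_pred_6var_2_yr, y_pred_6var_5_yr]  # assumed name
print(preds_4var)
```

Keeping every list in the [2-year, 5-year] order lets each model's predictions be paired with the matching true labels by position.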

The table below reports performance metrics for the 4-, 6-, and 8-variable KFRE models at 2 and 5 years.

Metrics	2_year_4_var_kfre	5_year_4_var_kfre	2_year_6_var_kfre	5_year_6_var_kfre	2_year_8_var_kfre	5_year_8_var_kfre
Precision/PPV	0.614641	0.590909	0.616874	0.593812	0.589202	0.558601
Average Precision	0.559144	0.602071	0.559212	0.603624	0.548023	0.587199
Sensitivity	0.445892	0.641297	0.446894	0.635659	0.503006	0.675123
Specificity	0.949919	0.877670	0.950278	0.880194	0.937175	0.853010
AUC ROC	0.875196	0.844687	0.875311	0.845210	0.877136	0.843695
Brier Score	0.091249	0.137478	0.091271	0.136473	0.096593	0.148403
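
For readers who want to see how several of these metrics are defined, the sketch below computes precision, sensitivity, specificity, AUC ROC, and the Brier score on a toy example. The 0.5 classification threshold is an assumption, since the threshold behind the table is not stated here. The helper names are hypothetical, and the toy labels and probabilities are not study data.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.90, 0.60, 0.30, 0.70, 0.20, 0.10, 0.05, 0.02]

tp, fp, tn, fn = confusion_counts(y_true, y_prob)
precision = tp / (tp + fp)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(precision, sensitivity, specificity, auc, brier)
```

Precision, sensitivity, and specificity depend on the chosen threshold, while AUC ROC and the Brier score are computed from the probabilities directly. The pairwise AUC loop is quadratic in the number of patients, so a rank-based implementation is preferable for a cohort of this size.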

Appendix

Descriptive Statistics

	Age	GFR	ACR	Ca	P	Alb	TCO2	ESRD_dur
count	16619	16619	16619	16619	16619	16619	16619	16619
mean	54.43	66.7	727.43	9.14	3.62	4.04	26.13	1385.88
std	17.24	31.13	1760.78	0.67	0.84	0.61	3.99	1444.12
min	18	2.237831	0	3.5	0	0.3	3	1
25%	43	46.94	25.28	8.8	3.1	3.8	24	241
50%	57	67.84	120.86	9.2	3.5	4.2	27	868
75%	68	85.13	642	9.6	4	4.4	29	2091.5
max	97	415.04	60323.44	14.8	15.5	5.7	59	5892

Age-Related Distributions

HTN        No Hypertension  Hypertension  Total  No Hypertension_%  Hypertension_%
Age_Group
18-29                 1711           193   1904              89.86           10.14
30-39                 1311           335   1646              79.65           20.35
40-49                 1682           633   2315              72.66           27.34
50-59                 2166          1247   3413              63.46           36.54
60-69                 2046          1766   3812              53.67           46.33
70-79                 1322          1505   2827              46.76           53.24
80-89                  304           360    664              45.78           54.22
90-99                   16            22     38              42.11           57.89
Total                10558          6061  16619              63.53           36.47

DM         No Diabetes  Diabetes  Total  No Diabetes_%  Diabetes_%
Age_Group
18-29             1785       119   1904          93.75        6.25
30-39             1444       202   1646          87.73       12.27
40-49             1776       539   2315          76.72       23.28
50-59             2235      1178   3413          65.48       34.52
60-69             2216      1596   3812          58.13       41.87
70-79             1695      1132   2827          59.96       40.04
80-89              423       241    664          63.70       36.30
90-99               25        13     38          65.79       34.21
Total            11599      5020  16619          69.79       30.21

SEX        Male  Female  Total  Male_%  Female_%
Age_Group
18-29       328    1576   1904   17.23     82.77
30-39       396    1250   1646   24.06     75.94
40-49       508    1807   2315   21.94     78.06
50-59       764    2649   3413   22.38     77.62
60-69       779    3033   3812   20.44     79.56
70-79       527    2300   2827   18.64     81.36
80-89       143     521    664   21.54     78.46
90-99         6      32     38   15.79     84.21
Total      3451   13168  16619   20.77     79.23

ESRD_in_5_year_outcome  No_ESRD  ESRD  Total  No_ESRD_%  ESRD_%
Age_Group
18-29                      1778   126   1904      93.38    6.62
30-39                      1482   164   1646      90.04    9.96
40-49                      2019   296   2315      87.21   12.79
50-59                      2970   443   3413      87.02   12.98
60-69                      3395   417   3812      89.06   10.94
70-79                      2586   241   2827      91.48    8.52
80-89                       610    54    664      91.87    8.13
90-99                        35     3     38      92.11    7.89
Total                     14875  1744  16619      89.51   10.49

ESRD_in_2_year_outcome  No_ESRD  ESRD  Total  No_ESRD_%  ESRD_%
Age_Group
18-29                      1808    96   1904      94.96    5.04
30-39                      1526   120   1646      92.71    7.29
40-49                      2099   216   2315      90.67    9.33
50-59                      3121   292   3413      91.44    8.56
60-69                      3544   268   3812      92.97    7.03
70-79                      2673   154   2827      94.55    5.45
80-89                       634    30    664      95.48    4.52
90-99                        35     3     38      92.11    7.89
Total                     15440  1179  16619      92.91    7.09

ckd_stage  CKD Stage 1  CKD Stage 2  CKD Stage 3a  CKD Stage 3b  CKD Stage 4  CKD Stage 5  Total
Age_Group
18-29             1165          467            74            68           66           64   1904
30-39              488          703           131            99          100          125   1646
40-49              476         1047           277           172          178          165   2315
50-59              581         1523           457           315          289          248   3413
60-69              421         1529           843           494          322          203   3812
70-79              105         1216           752           442          219           93   2827
80-89               24          286           165           123           48           18    664
90-99                2           17             8             7            4            0     38
Total             3262         6788          2707          1720         1226          916  16619

ckd_stage  CKD Stage 1_%  CKD Stage 2_%  CKD Stage 3a_%  CKD Stage 3b_%  CKD Stage 4_%  CKD Stage 5_%
Age_Group
18-29              61.19          24.53            3.89            3.57           3.47           3.36
30-39              29.65          42.71            7.96            6.01           6.08           7.59
40-49              20.56          45.23           11.97            7.43           7.69           7.13
50-59              17.02          44.62           13.39            9.23           8.47           7.27
60-69              11.04          40.11           22.11           12.96           8.45           5.33
70-79               3.71          43.01           26.60           15.63           7.75           3.29
80-89               3.61          43.07           24.85           18.52           7.23           2.71
90-99               5.26          44.74           21.05           18.42          10.53           0.00
Total              19.63          40.84           16.29           10.35           7.38           5.51
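
Breakdowns like the age-by-stage table above can be rebuilt from per-patient ages and eGFR values. The sketch below assumes the standard KDIGO eGFR cut-offs for staging and decade-wide age bins with 18-29 as the first bin. All three helper names are hypothetical, and the toy cohort is not study data.

```python
from collections import Counter


def ckd_stage(gfr):
    """Map eGFR (mL/min/1.73 m^2) to a CKD stage label (KDIGO cut-offs)."""
    if gfr >= 90:
        return "CKD Stage 1"
    if gfr >= 60:
        return "CKD Stage 2"
    if gfr >= 45:
        return "CKD Stage 3a"
    if gfr >= 30:
        return "CKD Stage 3b"
    if gfr >= 15:
        return "CKD Stage 4"
    return "CKD Stage 5"


def age_group(age):
    """Bin age into decades; ages 18 to 29 share the first bin."""
    if age < 30:
        return "18-29"
    low = age // 10 * 10
    return f"{low}-{low + 9}"


def crosstab_with_row_pct(pairs):
    """Return {(row, col): (count, row_percent)} for (row, col) pairs."""
    counts = Counter(pairs)
    row_totals = Counter()
    for (row, _col), n in counts.items():
        row_totals[row] += n
    return {
        (row, col): (n, round(100 * n / row_totals[row], 2))
        for (row, col), n in counts.items()
    }


# Toy cohort of (age, eGFR) pairs
cohort = [(25, 95.0), (27, 70.0), (28, 100.0), (65, 40.0)]
table = crosstab_with_row_pct(
    (age_group(age), ckd_stage(gfr)) for age, gfr in cohort
)
print(table)
```

Percentages are computed within each age group, so each row of the percentage table sums to about 100.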

Key Observations:

Stage 1 CKD is most common in the 18-29 age group with 1,165 cases and decreases significantly in older age groups.
Stage 2 CKD shows a peak in the 60-69 age group, with 1,529 cases, and remains relatively high in adjacent age groups (50-79 years).
Stage 3 CKD (both 3a and 3b) becomes more prevalent in middle to older age groups, particularly from 50-79 years.
Stages 4 and 5 CKD are less frequent across all age groups. Stage 4 counts peak in the 60-69 age group (322 cases), and Stage 4 accounts for roughly 7% to 11% of patients in every age group from 40 onward.

Clinical Implications:

The data suggest that early-stage CKD is more common in younger adults, while later stages are more prevalent in older populations.
This pattern may reflect the progressive nature of CKD and the increasing risk with age.

Considerations:

Future studies could focus on understanding the factors contributing to the peak in Stage 2 CKD in the 60-69 age group.
Intervention strategies might be tailored according to the age group and CKD stage distribution to improve patient outcomes.

CKD-Related Distributions

HTN           No Hypertension  Hypertension  Total  No Hypertension_%  Hypertension_%
ckd_stage
CKD Stage 1              2715           547   3262              83.23           16.77
CKD Stage 2              4610          2178   6788              67.91           32.09
CKD Stage 3a             1423          1284   2707              52.57           47.43
CKD Stage 3b              906           814   1720              52.67           47.33
CKD Stage 4               557           669   1226              45.43           54.57
CKD Stage 5               347           569    916              37.88           62.12
Total                   10558          6061  16619              63.53           36.47

DM            No Diabetes  Diabetes  Total  No Diabetes_%  Diabetes_%
ckd_stage
CKD Stage 1          2571       691   3262          78.82       21.18
CKD Stage 2          4740      2048   6788          69.83       30.17
CKD Stage 3a         1776       931   2707          65.61       34.39
CKD Stage 3b         1114       606   1720          64.77       35.23
CKD Stage 4           757       469   1226          61.75       38.25
CKD Stage 5           641       275    916          69.98       30.02
Total               11599      5020  16619          69.79       30.21

SEX           Male  Female  Total  Male_%  Female_%
ckd_stage
CKD Stage 1    856    2406   3262   26.24     73.76
CKD Stage 2   1413    5375   6788   20.82     79.18
CKD Stage 3a   412    2295   2707   15.22     84.78
CKD Stage 3b   322    1398   1720   18.72     81.28
CKD Stage 4    275     951   1226   22.43     77.57
CKD Stage 5    173     743    916   18.89     81.11
Total         3451   13168  16619   20.77     79.23

ESRD_in_2_year_outcome  No_ESRD  ESRD  Total  No_ESRD_%  ESRD_%
ckd_stage
CKD Stage 1                3199    63   3262      98.07    1.93
CKD Stage 2                6670   118   6788      98.26    1.74
CKD Stage 3a               2616    91   2707      96.64    3.36
CKD Stage 3b               1636    84   1720      95.12    4.88
CKD Stage 4                 933   293   1226      76.10   23.90
CKD Stage 5                 386   530    916      42.14   57.86
Total                     15440  1179  16619      92.91    7.09

ESRD_in_5_year_outcome  No_ESRD  ESRD  Total  No_ESRD_%  ESRD_%
ckd_stage
CKD Stage 1                3160   102   3262      96.87    3.13
CKD Stage 2                6565   223   6788      96.71    3.29
CKD Stage 3a               2523   184   2707      93.20    6.80
CKD Stage 3b               1516   204   1720      88.14   11.86
CKD Stage 4                 797   429   1226      65.01   34.99
CKD Stage 5                 314   602    916      34.28   65.72
Total                     14875  1744  16619      89.51   10.49

References

Kang, M. W. (2024). [KFRE validation dataset, Asian cohort]. Unpublished dataset provided by personal communication, June 26, 2024. Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Korea.
Kang, M. W., Tangri, N., Kim, Y. C., An, J. N., Lee, J., Li, L., Oh, Y. K., Kim, D. K., Joo, K. W., Kim, Y. S., Lim, C. S., & Lee, J. P. (2020). An independent validation of the kidney failure risk equation in an Asian population. Scientific Reports, 10, 12920. https://doi.org/10.1038/s41598-020-69715-3