Machine Learning Models and HCC Risk Prediction in Patients with HCV cACLD after SVR to DAA
Abstract
Background and aims: In patients with chronic HCV infection and cACLD who achieve SVR after DAA treatment, a single baseline marker is not sufficient to define an HCC risk profile. A combination of tools and evaluation of dynamic changes are needed to optimize predictive accuracy. We aimed to investigate whether ML models based on LSM, laboratory parameters and comorbidities can help predict HCC occurrence.
Methods: We analysed a retrospective-prospective cohort of patients who achieved SVR from 2014 to 2024 at a single tertiary referral hospital. Several classes of ML models (LR, DT, XGB, RF, SVM and NB) were tested and compared.
Results: The cohort included a total of 1082 patients, each of whom underwent at least one and up to three visits (T0, T1 and T2). Patients whose first LSM (T1) was performed later than 1 year and patients diagnosed with HCC before T1 were excluded. In the XGB model, LSM results at T1 were associated with an increased risk of developing HCC. Higher baseline LSM values (T0), increased glucose (T1) and AST (T1) levels, and older age were also moderately associated with an increased risk. Glucose was found to be a good predictor of increased risk, independently of baseline levels. In the LR model, LSM results (at both T1 and T0) were likewise among the most predictive features for increased risk. The main finding is that XGB and LR substantially outperformed traditional HCC risk scores in predictive performance.
Conclusion: LR and XGB were significantly more accurate than traditional HCC risk scores. This work marks an important step toward precision hepatology, demonstrating how dynamic, data-driven models can reshape surveillance.
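As an illustration of how the predictive performance of an ML model might be compared with that of a traditional risk score, the following minimal sketch computes the area under the ROC curve (AUROC) from its rank-based definition. The abstract does not state which metric was used, so the choice of AUROC is an assumption. The function name `auroc` and all values are hypothetical toy data, not study results.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random HCC case is scored higher
    than a random non-case (Mann-Whitney formulation; ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = developed HCC, 0 = did not.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
# Assumed predicted probabilities from an ML model such as XGB or LR.
ml_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]
# Assumed integer points from a traditional risk score.
risk_scores = [1, 2, 2, 3, 2, 4, 3, 5]

print("ML model AUROC:  ", round(auroc(labels, ml_scores), 3))
print("Risk score AUROC:", round(auroc(labels, risk_scores), 3))
```

The pairwise formulation is quadratic in cohort size but has no dependencies and handles tied scores, which are common with integer-valued risk scores.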