Artificial intelligence in the risk prediction models of cardiovascular disease and development of an independent validation screening tool: a systematic review

Cai, Yue; Cai, Yu-Qing; Tang, Li-Ying; Wang, Yi-Han; Gong, Mengchun; Jing, Tian-Ci; Li, Hui-Jun; Li-Ling, Jesse; Hu, Wei; Yin, Zhihua; Gong, Da-Xin; Zhang, Guang-Wei

doi:10.1186/s12916-024-03273-7

Table 1 The specific evaluation criteria of IVS


Score items	Grade	Specific evaluation criteria	References
Transparency of algorithms	I	Post the trained models that can be directly loaded by other researchers for a contiguous independent validation or online/mobile user-friendly calculators that can allow batch processing of participant information (e.g., a prediction software or tool)	∙ APPRAISE-AI [31] ∙ MI-CLAIM [32] ∙ AI-TREE [33]
	II	Apply and report classic algorithms that can be found in common tools/platforms OR report the complete code, hyperparameters, and required descriptions, allowing independent researchers to run the pipeline end to end
	III	Report formulas and/or incomplete hyperparameters without required description, leading to difficulties in replication or incomplete reproducibility
	IV	Incomplete reports that cannot be used for reproduction
Performance of models	I	Report at least the discrimination (preferably c-index) and calibration (preferably calibration plot/table) of the model; the performance index version is clearly reported and the index is excellent (e.g., 0.9 < c-index ≤ 1.0; calibration intercept close to 0 and calibration slope close to 1)	∙ TRIPOD [34] ∙ CHARMS checklist [35] ∙ Official statement [36] ∙ AI-TREE [33] ∙ Expert comment [37]
	II	Report at least the discrimination (preferably c-index) and calibration (preferably calibration plot/table) of the model; the performance index version is clearly reported and the index is good (e.g., 0.7 < c-index ≤ 0.9; calibration intercept deviates moderately from 0, and calibration slope deviates moderately from 1)
	III	Do not report the discrimination or calibration of the models; OR the performance index version is not clearly reported; OR the value of the index is unknown
	IV	The model has low accuracy (e.g., c-index ≤ 0.7; calibration intercept deviates severely from 0 and calibration slope deviates severely from 1)
Feasibility of reproduction	I	The office-based models without requirement for laboratory and inspection data (also known as non-laboratory models)	∙ Validation and evaluation framework [38] ∙ AI standardization [39] ∙ AI-TREE [33] ∙ MI-CLAIM [32] ∙ CONSORT-AI [40] ∙ MAIC-10 [41] ∙ SR of validity and clinical utility [11] ∙ WHO laboratory-based and non-laboratory models [42] ∙ Laboratory-based and non-laboratory models [43]
	II	The laboratory-based models only requiring routine clinical structured data, which are easy to obtain and do not need secondary operation (e.g., image pre-processing or annotation, etc.)
	III	Include data derived from unconventional laboratory and inspection, complex gene-related testing, tissue specimen, and other resource-limiting extensive applications, which are hard to obtain or require secondary operation (e.g., labeling)
	IV	Do not report the variables
Risk of reproduction	I	No domain high risk (evaluated by using PROBAST)	∙ PROBAST [30]
	II	Only one domain is high risk (evaluated by using PROBAST)
	III	Two domains are high risk (evaluated by using PROBAST)
	IV	More than two domains are high risk (evaluated by using PROBAST)
Clinical implication	I	Identified novel risk markers or novel risk standards, which will optimize existing clinical preventive strategies and contribute to patient benefit for the general population and major CVDs, similar to classical T-Ms (e.g., Framingham Score)	∙ SR of T-Ms [29] ∙ Biomedical research AI guideline [44] ∙ BS30440 [45] ∙ APPRAISE-AI [31] ∙ Consolidated AI reporting guideline [46] ∙ AI-TREE [33] ∙ SR of validity and clinical utility [11] ∙ Rare CVD [47, 48]
	II	Do not identify novel risk markers or novel risk standards, but enhance the predictive capacity beyond that of existing methods, which may optimize existing clinical preventive measures or offer additional benefits for the non-rare population and non-rare subset of CVDs (more than 1/2000 of the general population)
	III	Only enhance the predictive capacity beyond that of existing methods, but cannot alter the existing preventive interventions or provide additional benefits for the non-rare population and non-rare subset of CVDs (more than 1/2000 of the general population)
	IV	Do not enhance the predictive performance beyond that of existing methods OR only target a rare population or subset of CVDs (fewer than 1/2000 of the general population, e.g., infiltrative cardiac diseases), leading to inadequate validation and a lack of clinical utility for a broader population
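
The numeric parts of Table 1 can be sketched in code. The Python below is a minimal illustration, not the authors' tool: the function names `performance_grade` and `risk_grade` and their parameters are hypothetical, and the performance grade uses only the example c-index thresholds from the table. It assumes calibration is judged only as reported or not reported, since the table describes the calibration intercept and slope qualitatively ("close to", "moderately", "severely") without cut-offs. The risk grade assumes the four PROBAST domains (participants, predictors, outcome, analysis).

```python
from typing import Optional


def performance_grade(
    c_index: Optional[float],
    calibration_reported: bool,
    index_version_clear: bool = True,
) -> str:
    """Map reported performance to an IVS grade using the table's example c-index bands.

    Assumption: calibration quality (intercept, slope) is not scored numerically here,
    because the table gives no numeric cut-offs for it.
    """
    # Grade III: discrimination or calibration missing, or index version/value unclear.
    if c_index is None or not calibration_reported or not index_version_clear:
        return "III"
    if not 0.0 <= c_index <= 1.0:
        raise ValueError("c-index must lie in [0, 1]")
    if c_index <= 0.7:   # c-index <= 0.7
        return "IV"
    if c_index <= 0.9:   # 0.7 < c-index <= 0.9
        return "II"
    return "I"           # 0.9 < c-index <= 1.0


def risk_grade(high_risk_domains: int) -> str:
    """Map the number of PROBAST domains rated high risk to an IVS grade."""
    # Assumption: PROBAST has four domains, so valid counts are 0 through 4.
    if not 0 <= high_risk_domains <= 4:
        raise ValueError("PROBAST has four domains; expected a count from 0 to 4")
    return {0: "I", 1: "II", 2: "III"}.get(high_risk_domains, "IV")


if __name__ == "__main__":
    print(performance_grade(0.85, calibration_reported=True))  # good discrimination band
    print(risk_grade(1))                                       # one high-risk domain
```

The boundary handling follows the table literally: a c-index of exactly 0.9 falls in Grade II and exactly 0.7 falls in Grade IV. The remaining score items (transparency, feasibility, clinical implication) depend on qualitative judgement and are not encoded here.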


ISSN: 1741-7015


BMC Medicine
