Asymptomatic left ventricular dysfunction (ALVD), a condition affecting 3-6% of the general population, often goes unnoticed yet has significant impacts on quality of life and longevity. Remarkably, it is treatable if detected early. However, the challenge lies in the lack of cost-effective, non-invasive screening tools in clinical settings.
Advances in artificial intelligence (AI) that predate the COVID-19 pandemic and are now approaching commercialization offer a glimmer of hope. Consider a 2019 study in Nature Medicine that applied AI to electrocardiograms (ECGs), a standard method for recording the heart's electrical activity. By training a convolutional neural network on ECG data and echocardiogram results from over 44,000 patients, researchers successfully identified patients with ALVD, achieving impressive accuracy and predictive values. This innovation could significantly change how we approach early detection by offering a low-cost, easily accessible tool.
"Applying AI to standard ECGs for ALVD detection represents a significant stride in preventive healthcare. It empowers us to intervene early, altering the course of a patient's life for the better," says Dr. Andres Jimenez, prevention expert and founder of the HealthPrevent360 program.
According to the 2019 article in Nature Medicine, when the inexpensive, noninvasive screening AI was "tested on an independent set of 52,870 patients, the network model yielded values for the area under the curve, sensitivity, specificity, and accuracy of 0.93, 86.3%, 85.7%, and 85.7%, respectively. In patients without ventricular dysfunction, those with a positive AI screen were at 4 times the risk (hazard ratio, 4.1; 95% confidence interval, 3.3 to 5.0) of developing future ventricular dysfunction compared with those with a negative screen."
To help clarify, let me break down some of these common measures of diagnostic test performance. If you already understand these measures, skip to the next section:
Area Under the Curve (AUC): Think of the AUC as a measure of a test's overall ability to correctly identify those with and without a disease. It's like the test's "accuracy scorecard." An AUC close to 1.0 means the test is excellent at making these distinctions; an AUC of 0.5 means it's no better than flipping a coin.
Sensitivity: This is about how good the test is at correctly identifying people who have the disease. High sensitivity means the test can reliably spot those who are truly ill, reducing the chances of false negatives (telling someone they're disease-free when they're not). Sensitivity is the number of true positives (individuals who test positive and have the disease) divided by the number of tested individuals who have the disease, that is, true positives plus false negatives (individuals who have the disease but whose test result was falsely negative).
Specificity: Specificity is the test's ability to correctly identify those who don't have the disease. A highly specific test is good at avoiding false alarms, that is, telling people they're sick when they're actually healthy. Specificity is the number of true negatives (individuals without the disease who test negative) divided by the number of tested individuals without the disease, that is, true negatives plus false positives (individuals who do not have the disease but whose test result was falsely positive).
Accuracy: This is the overall correctness of the test, considering both those who have the disease and those who don't. High accuracy means the test generally gives the right result, whether positive or negative. Accuracy is the number of correct results (true positives plus true negatives) divided by the number of people tested.
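For readers who think in code, the four measures above can be sketched in a few lines of Python. This is only an illustration: the labels, scores, and the 0.5 cutoff below are made-up values, not data from the study, and the function names are our own. The AUC is computed here as the chance that a randomly chosen person with the disease gets a higher score than a randomly chosen person without it, with ties counted as half.

```python
from itertools import product


def confusion_counts(labels, scores, threshold):
    """Return (TP, FN, TN, FP), calling a score >= threshold a positive test."""
    tp = fn = tn = fp = 0
    for has_disease, score in zip(labels, scores):
        test_positive = score >= threshold
        if has_disease and test_positive:
            tp += 1
        elif has_disease:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """True positives divided by everyone who has the disease."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives divided by everyone who does not have the disease."""
    return tn / (tn + fp)


def accuracy(tp, fn, tn, fp):
    """All correct results divided by everyone tested."""
    return (tp + tn) / (tp + fn + tn + fp)


def auc(labels, scores):
    """Chance that a random diseased person outscores a random healthy one."""
    diseased = [s for has, s in zip(labels, scores) if has]
    healthy = [s for has, s in zip(labels, scores) if not has]
    wins = sum(
        1.0 if d > h else 0.5 if d == h else 0.0
        for d, h in product(diseased, healthy)
    )
    return wins / (len(diseased) * len(healthy))


# Hypothetical example: 4 people with the disease (1) and 6 without (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

tp, fn, tn, fp = confusion_counts(labels, scores, threshold=0.5)
print("sensitivity:", round(sensitivity(tp, fn), 3))
print("specificity:", round(specificity(tn, fp), 3))
print("accuracy:   ", round(accuracy(tp, fn, tn, fp), 3))
print("AUC:        ", round(auc(labels, scores), 3))
```

Note that sensitivity, specificity, and accuracy depend on the chosen cutoff, while the AUC summarizes performance across all possible cutoffs.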
In the study, the AI applied to the ECG had an AUC of 0.93, which is quite high. This means the AI is excellent at distinguishing between those with and without ALVD. The sensitivity was 86.3%, showing that it is very good at identifying patients who actually have ALVD (for every 100 people with ALVD tested, about 86 were identified by the test). The specificity was 85.7%, meaning it is also good at confirming when patients don't have ALVD, avoiding unnecessary anxiety or treatments for those who are healthy (for every 100 people without ALVD tested, about 86 tested negative). An accuracy of 85.7% indicates that the test gives the correct result in most cases (for every 100 people tested, the result was correct, whether positive or negative, for about 86).
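To make these percentages concrete, here is a rough sketch of what the reported sensitivity and specificity would mean for a hypothetical group of 10,000 people screened. The 5% prevalence is our assumption, chosen from within the 3-6% range quoted at the start of this article; it is not the prevalence in the study's test set, and the function name is ours.

```python
def expected_screen_results(n_people, prevalence, sensitivity, specificity):
    """Expected counts of each screening outcome in a hypothetical population."""
    with_disease = n_people * prevalence
    without_disease = n_people - with_disease
    true_positives = with_disease * sensitivity
    false_negatives = with_disease - true_positives
    true_negatives = without_disease * specificity
    false_positives = without_disease - true_negatives
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "true_negatives": true_negatives,
        "false_positives": false_positives,
        "accuracy": (true_positives + true_negatives) / n_people,
    }


# Sensitivity and specificity as reported in the 2019 study;
# population size and prevalence are assumptions for illustration only.
results = expected_screen_results(
    n_people=10_000, prevalence=0.05, sensitivity=0.863, specificity=0.857
)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, roughly 430 of the 500 people with ALVD would be flagged and about 70 would be missed, while roughly 1,360 of the 9,500 people without ALVD would receive a false alarm. The overall accuracy works out to about 85.7%, because accuracy is a blend of sensitivity and specificity weighted by how common the condition is.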
So how does this test performance compare with that of other common diagnostic tests that many of us are already familiar with?
Mammography for Breast Cancer Screening: Mammograms have an AUC ranging from 0.7 to 0.9, depending on factors such as age and breast density. Their sensitivity can vary from 70% to 90%, and their specificity from 75% to 95%. The AI ECG analysis is comparable in effectiveness, and its AUC of 0.93 sits at or above the top of the mammography range.
Rapid Strep Test for Strep Throat: These tests have a sensitivity of about 90-95% and a specificity of around 98-99%. By these figures, the AI ECG analysis (86.3% sensitivity, 85.7% specificity) falls somewhat short on both measures, although it targets a condition for which inexpensive, non-invasive screening tools are otherwise lacking.
PSA Test for Prostate Cancer: At a cutoff of 4 ng/ml, a PSA test has a specificity of about 91%, somewhat higher than the 85.7% reported for the AI ECG analysis. The AI ECG would therefore be expected to produce proportionally more false positives, each of which can lead to further, unnecessary tests.
The HealthPrevent360 program, which emphasizes early detection and personalized health strategies, aligns well with less expensive, more broadly available diagnostic tools like these. We pay close attention to these emerging technologies so that the right patients can get prompt access and so that the information received is carefully evaluated in support of each individual's goal of living longer and healthier. By incorporating cutting-edge tools such as AI in ECG analysis and AI in other low-cost diagnostics like retinal scans, HealthPrevent360 will be able to offer more precise, tailored health plans and early intervention for conditions like ALVD. This approach is in line with our program's ethos: a comprehensive, evidence-based strategy focusing on prevention and overall well-being.
No Science is Perfect
Translating research into the real world involves many important considerations. As board-certified public health and prevention experts, and as board-certified informaticists with a deep understanding of AI and clinical data, we are well suited to incorporate these technologies for our patients. Here are some of those considerations:
Controlled vs. Variable Environments: Research studies, like the one involving AI and ECGs, are typically conducted in controlled environments with specific parameters. These conditions are ideal for testing hypotheses but may not fully replicate the complexities of real-world scenarios. When a test is deployed in diverse healthcare settings, factors such as different patient demographics, varying equipment quality, and differing levels of operator expertise can affect its performance.
Selection Bias in Study Populations: Research often involves carefully selected participants who meet specific criteria. This can lead to results that are not entirely representative of the general population. For instance, a study might inadvertently exclude certain age groups, ethnicities, or people with coexisting conditions, which can skew the results.
Overestimation of Test Performance: In research settings, there's a tendency for test performance (like sensitivity and specificity) to appear more favorable. This phenomenon, known as the "research-translation gap," occurs because controlled conditions are optimal for test performance. In real-world settings, where conditions are more variable and less controlled, the performance of the test may not be as high.
Learning Curve and Technological Adaptation: The introduction of new technologies, especially those involving AI, requires a period of adaptation. Healthcare providers need time to learn and integrate these technologies into their practice. This learning curve can temporarily affect the effectiveness of the test when first deployed. This is an area we plan to advance with our HealthPrevent360 program.
Continuous Monitoring and Validation: It's essential for any new test, especially those using advanced technologies like AI, to undergo continuous monitoring and validation once deployed in real-world settings. This ensures that the test remains reliable, safe, and effective across different populations and healthcare environments.
Ethical and Regulatory Considerations: Finally, deploying new diagnostic tools in a live population must be done ethically and in compliance with regulatory standards. This includes ensuring patient privacy, informed consent, and addressing any potential biases in the AI algorithms.
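As one illustration of what continuous monitoring across populations could look like in practice, the sketch below computes sensitivity and specificity separately for each subgroup and flags any subgroup that falls below a chosen floor. Everything here is hypothetical: the group names, the records, the 0.8 floor, and the function names are ours and do not describe any deployed system.

```python
from collections import defaultdict


def rates_by_group(records):
    """records: iterable of (group, has_disease, screen_positive) tuples."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for group, has_disease, screen_positive in records:
        if has_disease:
            key = "tp" if screen_positive else "fn"
        else:
            key = "fp" if screen_positive else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        diseased = c["tp"] + c["fn"]
        healthy = c["tn"] + c["fp"]
        rates[group] = {
            # None marks a rate that cannot be computed for this group.
            "sensitivity": c["tp"] / diseased if diseased else None,
            "specificity": c["tn"] / healthy if healthy else None,
        }
    return rates


def groups_below(rates, metric, floor):
    """Names of groups whose metric is computable and below the floor."""
    return sorted(
        group
        for group, r in rates.items()
        if r[metric] is not None and r[metric] < floor
    )


# Hypothetical follow-up data from two clinic sites.
records = (
    [("site_a", True, True)] * 9
    + [("site_a", True, False)] * 1
    + [("site_a", False, False)] * 8
    + [("site_a", False, True)] * 2
    + [("site_b", True, True)] * 6
    + [("site_b", True, False)] * 4
    + [("site_b", False, False)] * 9
    + [("site_b", False, True)] * 1
)

rates = rates_by_group(records)
print(rates)
print("Sensitivity below 0.8:", groups_below(rates, "sensitivity", 0.8))
```

In this made-up data the test misses more cases at the second site, which is the kind of signal that would prompt a closer look at equipment, patient mix, or operator training there.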
In summary, while research findings are exciting and offer valuable insights and advancements, caution is needed when applying these results to the general population. Continuous evaluation, adaptation, and adherence to ethical and regulatory standards are key to ensuring the safe and effective use of new diagnostic tools in healthcare.
Attia, Z.I., Kapa, S., Lopez-Jimenez, F. et al. Screening for cardiac contractile dysfunction using an artificial intelligence–enabled electrocardiogram. Nat Med 25, 70–74 (2019). https://doi.org/10.1038/s41591-018-0240-2
David MK, Leslie SW. Prostate Specific Antigen. [Updated 2022 Nov 10]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2023 Jan-. Available from: https://www.ncbi.nlm.nih.gov/books/NBK557495/