(Original text)
Nature Medicine (2018)
Machine learning can be used for computer-aided diagnosis of acute neurological events and retinal disease and can be incorporated into conventional clinical workflows to improve health outcomes.
Machine learning is a branch of data science that trains computers to perform tasks by observing patterns in large datasets and using them to derive rules or algorithms that optimize task performance. Machine-learning algorithms are now ubiquitous in daily life, from flagging spam in an e-mail inbox to selecting the best route for a daily commute. In medicine, machine learning and other forms of artificial intelligence (AI) may one day transform how physicians diagnose and treat their patients, and studies have already underscored the potential of AI for diagnosing cancer, depression, and chronic pain [1,2,3], predicting suicide [4], and optimizing dietary decision-making in the setting of diabetes [5] using phenotypic and genotypic information or medical images.
Computer-aided diagnosis based on medical imaging is one especially promising field, in which AI technologies could potentially be deployed to enhance or accelerate a physician’s diagnostic capabilities or to assist in triaging urgent cases for rapid evaluation [6,7,8]. Still, these methods remain relatively uncommon in clinical practice for at least two reasons. First, machine-learning algorithms may not perform well when applied to new data, so it is critical to replicate the results in new, independent samples. This is especially true for imaging data: algorithms trained on images derived from a particular device at a particular hospital may need to be modified to perform well when applied to new kinds of images acquired on a variety of devices at other hospitals, an objective that is easily achieved by a trained radiologist but can be much more challenging for computers. Second, even successful AI technologies may ultimately have little impact on patient care unless data scientists and physicians collaborate to work out how best to integrate them into clinical practice in real-world settings to improve patient outcomes. In this issue of Nature Medicine, two reports describe new AI technologies for computer-aided diagnosis of acute neurological events [9] and retinal disease [10] that succeed by addressing both of these challenges (Fig. 1).
In the context of stroke, intracranial hemorrhage, and other acute neurological events, “time is brain”, and achieving the best clinical outcomes means diagnosing and intervening as quickly as possible. In a standard radiology workflow, medical images are often interpreted in the order they are acquired, or they can be triaged on the basis of clinical history, but manual triaging can also be time-consuming. Either way, acute cases requiring rapid diagnosis and treatment may not be prioritized. Titano et al. [9] use machine-learning methods to develop automated algorithms for screening medical images at the time of their acquisition and triaging cases requiring urgent review by a radiologist.
The authors used an artificial neural network, a machine-learning algorithm inspired by information processing in biological neural networks, to decide whether a computed tomography (CT) image, typically acquired after a patient presents to an emergency department, contains a critical finding, such as a stroke or hemorrhage. An artificial neural network is structured as an input layer of nodes, one or more hidden layers, and an output layer. The authors trained their model using a large (n = 37,236) dataset of CT images. During training, the weights on the connections between nodes in successive layers were tuned so that the predictions in the output layer matched the training labels as closely as possible. Instead of manually labeling each case in the training dataset, a natural language–processing algorithm was used to parse case reports and predict whether a critical finding was present.
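The layered structure and weight tuning described above can be illustrated with a minimal sketch. It is a toy, not the authors' model: the class name `TinyNet`, the two-feature inputs, the three hidden nodes, the learning rate, and the made-up data in `TOY_DATA` are all assumptions chosen so the example runs with the Python standard library alone.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """Input layer (2 nodes) -> one hidden layer (tanh) -> one sigmoid output node."""

    def __init__(self, n_in=2, n_hidden=3, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return hidden activations and the predicted probability of a critical finding."""
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        prob = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, prob

    def loss(self, data):
        """Mean cross-entropy between predictions and labels."""
        eps = 1e-12
        total = 0.0
        for x, y in data:
            _, p = self.forward(x)
            total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        return total / len(data)

    def train_step(self, data, lr=0.5):
        """One full-batch gradient-descent update of every weight and bias."""
        n_hidden, n_in = len(self.w1), len(self.w1[0])
        g_w1 = [[0.0] * n_in for _ in range(n_hidden)]
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, y in data:
            hidden, p = self.forward(x)
            d_out = p - y  # cross-entropy gradient at the output pre-activation
            g_b2 += d_out
            for j in range(n_hidden):
                g_w2[j] += d_out * hidden[j]
                d_hid = d_out * self.w2[j] * (1.0 - hidden[j] ** 2)  # back through tanh
                g_b1[j] += d_hid
                for i in range(n_in):
                    g_w1[j][i] += d_hid * x[i]
        scale = lr / len(data)
        self.b2 -= scale * g_b2
        for j in range(n_hidden):
            self.w2[j] -= scale * g_w2[j]
            self.b1[j] -= scale * g_b1[j]
            for i in range(n_in):
                self.w1[j][i] -= scale * g_w1[j][i]


# Made-up (features, label) pairs; label 1 stands for "critical finding present".
TOY_DATA = [([0.1, 0.2], 0), ([0.9, 0.8], 1), ([0.2, 0.1], 0),
            ([0.8, 0.9], 1), ([0.3, 0.3], 0), ([0.7, 0.9], 1)]

net = TinyNet()
loss_before = net.loss(TOY_DATA)
for _ in range(500):
    net.train_step(TOY_DATA)
loss_after = net.loss(TOY_DATA)
print("loss before training:", loss_before)
print("loss after training: ", loss_after)
```

A real imaging model operates on full CT volumes with many more layers and parameters; the sketch only shows how repeated weight updates pull the output layer toward the training labels.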
Importantly, the authors went on to replicate their results and show that they could be used in a real-world clinical setting. They tested their trained model on a second dataset of CT images (n = 180), for which labels were obtained through manual review of patient medical records by a physician. The sensitivity of their algorithm was on par with that of three physicians, albeit with lower specificity (0.48 for the algorithm versus 0.85 for the physicians). Building on this finding, the authors performed a double-blinded prospective trial in a simulated clinical environment to evaluate whether their model could function effectively as a triage system. The model improved the simulated radiology workflow in two ways. First, the model flagged critical findings 150 times faster than human physicians. Second, critical cases (i.e., patients requiring immediate attention) appeared significantly earlier in the work queue when compared to random orderings, meaning that they would be evaluated sooner by a radiologist. Collectively, these findings suggest that a machine learning–based triage system can reduce the time to treatment for urgent cases of acute neurological illness, thereby improving patient outcomes.
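The two kinds of evaluation mentioned above, diagnostic accuracy and queue position, can be sketched as follows. The function names, the study records in `ARRIVALS`, and the scores are hypothetical, and for simplicity the sketch compares the triaged queue with arrival order rather than with the random orderings used in the trial.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def triage_order(studies):
    """Re-sort the work queue so the highest model scores are read first."""
    return sorted(studies, key=lambda s: -s["score"])


def mean_critical_position(queue):
    """Average 0-based queue position of the truly critical studies."""
    positions = [i for i, s in enumerate(queue) if s["critical"]]
    return sum(positions) / len(positions)


# Hypothetical studies in order of acquisition; "score" is the model's output.
ARRIVALS = [
    {"id": "A", "score": 0.1, "critical": False},
    {"id": "B", "score": 0.9, "critical": True},
    {"id": "C", "score": 0.3, "critical": False},
    {"id": "D", "score": 0.8, "critical": True},
]

sens, spec = sensitivity_specificity([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 1])
print("sensitivity:", sens, "specificity:", spec)
print("arrival order: ", mean_critical_position(ARRIVALS))
print("triaged order: ", mean_critical_position(triage_order(ARRIVALS)))
```

In this toy queue the critical studies move from an average position of 2.0 to 0.5. A model with high sensitivity but modest specificity can still be useful for triage, because false alarms only reorder the queue; every study is still read by a radiologist.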
In another study featured in this issue, De Fauw et al. [10] show how AI can also be used for computer-aided diagnosis of retinal disease. The authors constructed a two-stage artificial neural network to identify retinal pathologies in optical coherence tomography (OCT) scans. One key to their success was the decision to design a two-stage algorithm that first accounts for technical variations in the images produced by different devices and then diagnoses various retinal diseases. In the first stage, a trained ‘segmentation’ neural network transforms the raw OCT image into a 3D tissue map, assigning each OCT image pixel to one of 15 tissue classes. In the second stage, a trained ‘classification’ neural network generates diagnosis probabilities and one of four referral suggestions (‘urgent’, ‘semi-urgent’, ‘routine’, or ‘observation’) using the segmentation map as an input.
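The two-stage design can be sketched with simple stand-ins for both networks. Nothing here reproduces the authors' models: the intensity cut-offs in `DEVICE_THRESHOLDS`, the three tissue classes (instead of 15), the fluid-fraction rules, and the device names are all assumptions made for illustration. The point is only that the device-specific logic lives in stage one, while stage two sees a device-independent tissue map.

```python
from collections import Counter

# Hypothetical per-device intensity cut-offs, standing in for a trained
# segmentation network that has learned each device's image characteristics.
DEVICE_THRESHOLDS = {"device_a": (50, 150), "device_b": (100, 300)}


def segment(raw_scan, device):
    """Stage 1: map each raw pixel to a tissue class (0, 1, or 2)."""
    low, high = DEVICE_THRESHOLDS[device]

    def tissue(value):
        if value < low:
            return 0  # background
        if value < high:
            return 1  # healthy tissue
        return 2      # fluid, a stand-in for pathology

    return [[tissue(v) for v in row] for row in raw_scan]


def classify(tissue_map):
    """Stage 2: turn a device-independent tissue map into a referral suggestion."""
    counts = Counter(c for row in tissue_map for c in row)
    fluid_fraction = counts[2] / sum(counts.values())
    if fluid_fraction >= 0.25:
        return "urgent"
    if fluid_fraction >= 0.10:
        return "semi-urgent"
    if fluid_fraction > 0:
        return "routine"
    return "observation"


def refer(raw_scan, device):
    """Full pipeline: segmentation followed by classification."""
    return classify(segment(raw_scan, device))


# The same toy "retina" imaged on two devices with different intensity scales.
scan_a = [[10, 100, 200], [10, 100, 100]]
scan_b = [[20, 200, 400], [20, 200, 200]]

print(refer(scan_a, "device_a"))
print(refer(scan_b, "device_b"))
```

Supporting a further device in this sketch means adding one entry to `DEVICE_THRESHOLDS`; `classify` is untouched, which mirrors the idea of adapting only the first stage.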
The authors tested their two-stage framework retrospectively in a dataset of 977 patients with previously established retinal pathologies and showed how it could be integrated into conventional clinical workflows. Remarkably, referral accuracies were on par with or exceeded those from a group of eight retinal specialists and optometrists, even when these human experts also considered clinical notes and other forms of retinal imaging data. The authors improved their model further by minimizing the likelihood of it missing diagnoses with more severe clinical consequences and by enabling it to diagnose the presence of multiple pathologies. Finally, De Fauw et al. [10] replicated their results in a new sample and demonstrated that their algorithms generalized to data acquired from other imaging devices. Here, the two-stage framework was critical: by separating the segmentation and classification steps, the authors could retrain only the segmentation step to perform well on data from new devices, without the need for relearning the classification step, which is much more complicated than the segmentation step.
Moving forward, key challenges must be addressed for AI technologies to be used widely as diagnostic imaging tools. Machine-learning methods have a tendency to ‘overfit’ to idiosyncrasies in the training sample, which may yield overly optimistic performance estimates. The size of the training and replication samples used in both reports and careful efforts to avoid overfitting are noteworthy strengths of these investigations, but independent replications will also be important. Interpreting machine-learning models is another key challenge, especially for artificial neural networks, which often rely on extraordinarily complex methods of extracting and combining features that defy human efforts to understand how they make correct predictions in some contexts and why they fail in others. Defining and understanding failure modes will be critical as AI technologies become more widely used in clinical settings.
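Overfitting and the resulting optimism in performance estimates can be shown with a deliberately simple model. The sketch below assumes a one-nearest-neighbour classifier that memorizes its training sample; the data in `TRAIN` and `HELD_OUT` are invented, with two training labels flipped to play the role of idiosyncrasies in the sample.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x (pure memorization)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def accuracy(train, data):
    """Fraction of points in `data` that the memorizing model labels correctly."""
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)


# True rule: label 1 when the feature exceeds 5. The training points at 3.0 and
# 8.0 carry flipped labels, standing in for noise specific to this sample.
TRAIN = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 0), (9.0, 1)]

# Independent sample that follows the true rule.
HELD_OUT = [(1.4, 0), (2.4, 0), (3.2, 0), (4.4, 0),
            (6.4, 1), (7.4, 1), (8.4, 1), (9.4, 1)]

print("training accuracy:", accuracy(TRAIN, TRAIN))
print("held-out accuracy:", accuracy(TRAIN, HELD_OUT))
```

The model is perfect on the data it memorized (1.0) and noticeably worse on the independent sample (0.75), which is why performance measured on new, independent data is the more trustworthy estimate.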
It is also important to bear in mind that human physicians do not need machine-learning methods to accurately interpret medical images. Instead, improving patient outcomes in the real world will mean identifying specific clinical scenarios in which machine-learning algorithms can effectively aid human physicians, not replace them. These two studies report significant advances that succeed in part by identifying two specific clinical scenarios—computer-aided diagnosis of retinal disease and triaging head CTs—in which AI tools could yield tangible benefits by supporting physicians and accelerating clinical decision-making.