Researchers used machine learning techniques, including natural language processing algorithms, to identify clinical concepts in radiologist reports for CT scans. (Credit: Icahn School of Medicine at Mount Sinai)

Researchers used machine learning techniques, including natural language processing algorithms, to identify clinical concepts in radiologist reports for CT scans, according to a study conducted at the Icahn School of Medicine at Mount Sinai. The technology is an important first step in the development of artificial intelligence that could interpret scans and diagnose conditions.

From an ATM reading handwriting on a check to Facebook suggesting a photo tag for a friend, computer vision powered by artificial intelligence is increasingly common in daily life. Artificial intelligence could one day help radiologists interpret X-rays, computed tomography (CT) scans, and magnetic resonance imaging (MRI) studies. But for the technology to be effective in the medical arena, computer software must be "taught" the difference between a normal study and abnormal findings.

The study aimed to train this technology to understand text reports written by radiologists. Researchers created a series of algorithms to teach the computer clusters of phrases. Examples of the terminology included words such as phospholipid, heartburn, and colonoscopy.

Researchers trained the computer software using 96,303 radiologist reports associated with head CT scans performed at The Mount Sinai Hospital and Mount Sinai Queens between 2010 and 2016. To characterize the "lexical complexity" of radiologist reports, researchers calculated metrics that reflected the variety of language used in these reports and compared these to other large collections of text: thousands of books, Reuters news stories, inpatient physician notes, and Amazon product reviews.
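The article does not say which metrics the researchers calculated. As an illustration of what a measure of language variety can look like, the minimal sketch below computes a type-token ratio: the number of distinct words divided by the total number of words. The choice of metric, the function names, and the two sample texts are assumptions made for this example and are not taken from the study.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def type_token_ratio(text):
    """Return distinct tokens divided by total tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical snippets, invented for illustration only (not study data).
report = "No acute intracranial hemorrhage. No mass effect. No midline shift."
review = "Great product, works well, and my whole family loves using it daily."

print("report:", type_token_ratio(report))
print("review:", type_token_ratio(review))
```

A lower ratio means the same words recur more often. The ratio tends to fall as a text grows longer, so comparing collections of very different sizes would require sampling equal-length excerpts or using a length-corrected measure.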

Deep learning is a subcategory of machine learning that uses neural networks with multiple layers (computer systems that learn progressively) to perform inference, and it requires large amounts of training data to achieve high accuracy. The techniques used in this study achieved an accuracy of 91 percent, demonstrating that it is possible to automatically identify concepts in text from the complex domain of radiology.
