Diagnosing conditions. Developing treatment plans. Optimizing the patient experience.
These are just a few of the many possible applications for natural language processing (NLP) in the healthcare industry. Because of this, a growing number of healthcare providers and practitioners are adopting NLP in order to make sense of the massive quantities of unstructured data contained in electronic health records (EHR) and to offer patients more comprehensive care. According to a recent report, the global market for NLP in healthcare and life sciences is expected to reach $3.7 billion by 2025, at a compound annual growth rate (CAGR) of 20.5%.
In this blog post, we’ll take a closer look at NLP in the healthcare industry — what it is, how it works, and how healthcare providers can benefit from this truly remarkable technology.
What is NLP & How Does It Work?
Natural language processing is a specialized branch of artificial intelligence that enables computers to understand and interpret human language, whether written or spoken.
The way it works is this: NLP systems pre-process data by first “cleaning” the dataset. This essentially involves organizing the data into a more logical format — for example, breaking down text into smaller semantic units, or “tokens,” in a process known as tokenization. Pre-processing simply makes the dataset easier for the NLP system to interpret.
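To make the pre-processing step concrete, here is a minimal sketch of cleaning and tokenization using only Python's standard library. The `clean` and `tokenize` helpers and the sample note are our own illustrations, not part of any particular NLP product, and production systems typically use far more sophisticated tokenizers.

```python
import re

def clean(text):
    """Collapse runs of whitespace and lowercase the text (a simple 'cleaning' step)."""
    return re.sub(r"\s+", " ", text).strip().lower()

def tokenize(text):
    """Split text into word tokens and punctuation tokens."""
    # \w+(?:[-']\w+)* keeps hyphenated words and contractions together;
    # [^\w\s] emits each punctuation mark as its own token.
    return re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", text)

# Hypothetical clinical note with messy spacing and a line break.
note = "Patient  reports a 2-week headache;\n denies chest pain."
tokens = tokenize(clean(note))
print(tokens)
# ['patient', 'reports', 'a', '2-week', 'headache', ';', 'denies', 'chest', 'pain', '.']
```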
From there, the system applies algorithms to the text in order to interpret it. The two primary approaches used in NLP are rule-based systems, which interpret text based on predefined linguistic rules, and machine learning models, which use statistical methods and “learn” from the training data they are fed.
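The difference between the two approaches is easier to see side by side. The sketch below tackles one toy task, deciding whether a sentence describes a medication order, first with a hand-written rule and then with a tiny Naive Bayes model. The task, the training sentences, and the function names are all assumptions made up for illustration; Naive Bayes is simply one common statistical method, not the only one.

```python
import math
import re
from collections import Counter, defaultdict

def rule_based_is_medication(sentence):
    """Rule-based: flag any sentence containing '<number> mg'."""
    return re.search(r"\b\d+\s*mg\b", sentence.lower()) is not None

def train_naive_bayes(examples):
    """Machine learning: count how often each word appears under each label."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of training sentences
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab

def predict(model, sentence):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in sentence.lower().split():
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data.
training = [
    ("take aspirin 81 mg daily", "medication"),
    ("start metformin 500 mg twice daily", "medication"),
    ("continue lisinopril 10 mg daily", "medication"),
    ("patient reports mild headache", "other"),
    ("follow up in two weeks", "other"),
    ("patient denies chest pain", "other"),
]
model = train_naive_bayes(training)

sentence = "start insulin daily"
print(rule_based_is_medication(sentence))  # False: the rule only knows about "mg"
print(predict(model, sentence))            # medication: learned from "start" and "daily"
```

The rule is transparent but brittle, while the learned model generalizes to a phrasing the rule never anticipated, at the cost of needing labeled training data.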
Despite being a major technological advancement — one that stands at the crossroads of computer science and linguistics — NLP is more commonplace than you might realize. Any time you interact with an at-home virtual assistant such as Siri or Alexa, or explain a customer service issue to a chatbot, that’s actually NLP in action. That said, NLP also has more sophisticated applications, especially in the healthcare industry, which we’ll explore in this article.
5 NLP Techniques You Should Know
Before we can talk about the ways in which you can use NLP in healthcare, we must first define a few key NLP techniques:
Optical Character Recognition (OCR):
OCR, or text recognition, is the method by which a computer “reads” a handwritten or printed text and converts it into a digital format — for example, scanning a physical document and turning it into a PDF. OCR is also used to scan unstructured data sets, such as images or text files, extract text and tables from that data, and present it in a digestible format. Once this data has been formatted, it can be fed into an NLP pipeline for further analysis. In the healthcare industry, OCR is commonly used to digitize clinical notes, medical history records, patient intake forms, discharge summaries, medical tests, and so on.
Named Entity Recognition (NER):
NER is an information extraction technique that segments named entities — that is, real-world subjects, such as a person, location, organization, or product — into predefined categories. NER is also known as entity chunking, entity extracting, or entity identification. We’ll explore some healthcare-specific NER applications further down the page.
Sentiment Analysis:
Sentiment analysis applies a combination of NLP, text analysis, computational linguistics, and biometrics to a text in order to ascertain its underlying sentiment. It is also commonly referred to as sentiment detection or opinion mining.
An excellent illustrative example — and, perhaps, its most common use case — is when businesses apply sentiment analysis to social media. In doing so, they’re able to better understand how the public perceives their products, services, or brand as a whole. A healthcare provider could theoretically do the same by analyzing patients’ comments about their facility on social media in order to get an accurate picture of the patient experience.
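As a rough illustration of the idea, the following sketch scores patient comments against small positive and negative word lists and flips the polarity of a word that directly follows a negator. The word lists and sample comments are invented for this example; real sentiment models are usually trained on large labeled datasets rather than built from hand-picked lexicons.

```python
import re

# Hypothetical lexicons for patient feedback.
POSITIVE = {"friendly", "helpful", "clean", "caring", "excellent", "quick"}
NEGATIVE = {"rude", "dirty", "slow", "painful", "confusing", "long"}
NEGATORS = {"not", "never", "no"}

def sentiment_score(comment):
    """Sum +1 for each positive word and -1 for each negative word."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # Flip polarity when the previous token is a negator ("not helpful").
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score

def sentiment_label(comment):
    score = sentiment_score(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment_label("The nurses were friendly and helpful."))             # positive
print(sentiment_label("The wait was long and the staff was not helpful."))  # negative
print(sentiment_label("I had an appointment on Tuesday."))                  # neutral
```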
Text Classification:
Also known as text categorization, this NLP technique is used to analyze text data and assign tags or labels to different semantic units or clauses based on predefined categories. For example, a healthcare provider might use text classification to identify at-risk patients based on certain key words or phrases within their medical records.
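The at-risk example above might look something like the sketch below, which assigns labels to a record whenever it contains a phrase from a predefined category. The category names and phrase lists are hypothetical, and plain substring matching is a deliberate simplification: it would also fire on “denies chest pain,” which is why real systems pair classification with techniques such as assertion detection.

```python
# Hypothetical risk categories and the key phrases that trigger them.
RISK_PHRASES = {
    "fall_risk": ["history of falls", "unsteady gait", "dizziness"],
    "cardiac_risk": ["chest pain", "shortness of breath", "palpitations"],
}

def classify(record):
    """Return every risk label whose phrases appear in the record."""
    text = " ".join(record.lower().split())  # normalize case and whitespace
    labels = [
        label
        for label, phrases in RISK_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    ]
    return labels or ["no_flag"]

print(classify("Patient reports dizziness and an unsteady gait."))  # ['fall_risk']
print(classify("Complains of chest pain and dizziness."))           # ['fall_risk', 'cardiac_risk']
print(classify("Routine check-up, no complaints."))                 # ['no_flag']
```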
Topic Modeling:
Topic modeling is a form of statistical modeling and NLP used to classify collections of documents: that is, to group them together based on common words or phrases in order to identify semantic structures, or “topics.” The most common form of topic modeling, latent Dirichlet allocation (LDA), uses algorithms to identify semantic relationships between different words and phrases and group them accordingly.
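For readers curious about what happens under the hood, here is a compact sketch of latent Dirichlet allocation fitted with collapsed Gibbs sampling, written in plain Python. The toy documents, the hyperparameter values (`alpha`, `beta`), and the function name are assumptions chosen for illustration; in practice you would reach for an established library rather than a hand-rolled sampler, and results on a corpus this small are not meaningful.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]                       # tokens per (doc, topic)
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                                     # tokens per topic
    assignments = []

    # Start by assigning every token to a random topic.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Resample in proportion to (doc likes topic) * (topic likes word).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + len(vocab) * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    return topic_word, doc_topic

# Hypothetical mini-corpus: two diabetes notes and two orthopedic notes.
docs = [
    "insulin glucose diabetes insulin".split(),
    "glucose diabetes metformin".split(),
    "fracture cast xray fracture".split(),
    "xray cast splint".split(),
]
topic_word, doc_topic = lda_gibbs(docs, num_topics=2)
for k, counts in enumerate(topic_word):
    top_words = sorted(counts, key=counts.get, reverse=True)[:3]
    print(f"topic {k}: {top_words}")
```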
Of the five NLP techniques described here, OCR and NER are the most common in the healthcare industry.
How Can NLP Support the Healthcare Industry?
Though there really are no limits to how NLP can support the healthcare industry, let’s look at three primary use cases:
- Improving Clinical Documentation: Rather than requiring physicians to spend valuable time manually reviewing complex EHR, NLP uses speech-to-text dictation and formulated data entry to extract critical data from EHR at the point of care. This not only enables physicians to focus on providing patients with the essential care they need, but also ensures that clinical documentation is accurate and kept up to date.
- Accelerating Clinical Trial Matching: Using NLP, healthcare providers can automatically review massive quantities of unstructured clinical and patient data and identify eligible candidates for clinical trials. Not only does this enable patients to access experimental care that could dramatically improve their condition — and their lives — it also supports innovation in the medical field.
- Supporting Clinical Decisions: NLP makes it fast, easy, and efficient for physicians to access health-related information exactly when they need it, enabling them to make more informed decisions at the point of care.
6 Healthcare-Specific NLP Applications
Now that we’ve covered the basics, let’s discuss NLP applications in a healthcare-specific setting. Before you can use NLP on any text, all paperwork — be it clinical notes, patient records, medical forms, or anything in between — must be converted into a digital format using OCR.
From there, you can apply any of the following:
Clinical Assertion Model:
Clinical assertion modeling enables healthcare providers to analyze clinical notes and identify whether a patient is experiencing a problem, and whether that problem is present, absent, or conditional. For this reason, clinical assertion models are often used to help diagnose and treat patients.
For example, a patient might tell her doctor that she’s experienced a headache for the past two weeks and feels anxious when she walks fast. After examining the patient, the doctor might note that she has no symptoms of alopecia and that she doesn’t appear to be in any pain.
The doctor could later use a combination of NER and text classification to analyze their clinical notes from that appointment and flag “headache,” “anxious,” “alopecia,” and “pain” as PROBLEM entities. From there, the doctor could further categorize those problems by making assertions as to whether they were present, conditional, or absent. In this case, the headache would be present, anxiousness would be conditional, and alopecia and pain would be absent.
As you can see based on this example, this application of NLP in healthcare enables physicians to optimize patient care by identifying which problems are most pressing and administering immediate treatment.
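The headache example can be approximated with a few simple rules. The sketch below splits a note into clauses, looks for the four PROBLEM terms, and checks for a negation cue before the term or a conditional cue after it. The cue lists and the wording of the note are our own assumptions, and this rule-based stand-in is only meant to show the shape of the task, not how a trained clinical assertion model actually works.

```python
import re

# Terms and cue words are hypothetical and cover only this example.
PROBLEMS = {"headache", "anxious", "alopecia", "pain"}
NEGATION_CUES = {"no", "not", "denies", "without", "doesn't"}
CONDITIONAL_CUES = {"when", "if", "upon"}

def assert_status(note):
    """Label each PROBLEM term as present, absent, or conditional."""
    results = {}
    # Treat sentence punctuation and the word "and" as clause boundaries.
    for clause in re.split(r"[.;,]|\band\b", note.lower()):
        tokens = re.findall(r"[a-z']+", clause)
        for i, token in enumerate(tokens):
            if token not in PROBLEMS:
                continue
            before, after = set(tokens[:i]), set(tokens[i + 1:])
            if before & NEGATION_CUES:
                results[token] = "absent"
            elif after & CONDITIONAL_CUES:
                results[token] = "conditional"
            else:
                results[token] = "present"
    return results

note = (
    "She has had a headache for the past two weeks and feels anxious when she walks fast. "
    "She has no symptoms of alopecia and doesn't appear to be in any pain."
)
print(assert_status(note))
# {'headache': 'present', 'anxious': 'conditional', 'alopecia': 'absent', 'pain': 'absent'}
```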
Clinical Deidentification Model:
Under the Health Insurance Portability and Accountability Act (HIPAA), healthcare providers, health plans, and other covered entities are required to “protect sensitive patient health information from being disclosed without the patient’s consent or knowledge.”
The exception to this rule is data that has been deidentified (that is, data from which specified individual identifiers, such as name, address, telephone number, and so on, have been removed). Deidentified data is no longer considered to be Protected Health Information (PHI) because it does not contain information that could be used to identify the patient.
Healthcare providers can actually use NLP to pinpoint potential pieces of content containing PHI and deidentify or obfuscate them by replacing PHI with semantic tags. In doing so, healthcare organizations can avoid HIPAA non-compliance.
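Replacing PHI with semantic tags might look something like the following sketch, which uses regular expressions for dates, phone numbers, and email addresses, plus a caller-supplied list of known names. The patterns, tag names, and sample text are illustrative assumptions; they catch only a handful of formats and are nowhere near sufficient for actual HIPAA compliance, which is precisely why trained deidentification models exist.

```python
import re

# Hypothetical patterns covering a few common PHI formats.
PHI_PATTERNS = [
    ("<DATE>", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("<PHONE>", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("<EMAIL>", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")),
]

def deidentify(text, known_names=()):
    """Replace known names and pattern-matched PHI with semantic tags."""
    for name in known_names:
        text = re.sub(r"\b" + re.escape(name) + r"\b", "<NAME>", text)
    for tag, pattern in PHI_PATTERNS:
        text = pattern.sub(tag, text)
    return text

record = "Jane Doe (DOB 04/12/1985) can be reached at 555-867-5309 or jane.doe@example.com."
print(deidentify(record, known_names=["Jane Doe"]))
# <NAME> (DOB <DATE>) can be reached at <PHONE> or <EMAIL>.
```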
Clinical Entity Resolver:
Using natural language processing, healthcare providers can extract information about different conditions and diagnoses from patient records and assign an ICD-10 Clinical Modification (ICD-10-CM) code to them.
The ICD-10-CM is a valuable resource, one that helps physicians make better decisions by cross-referencing symptoms and diagnoses against ICD-10-CM codes. Therefore, by assigning the appropriate ICD-10-CM code, physicians can monitor healthcare statistics, quality outcomes, mortality statistics, and more for that particular condition. This, in turn, enables them to better understand medical complications, better design treatment, and better determine the outcome of care.
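At its simplest, entity resolution is a lookup from a condition mention to a code, with some tolerance for spelling variation. The sketch below uses a tiny hand-built table and Python's `difflib` for fuzzy matching. The table holds only three ICD-10-CM codes plus two informal synonyms that we mapped ourselves for illustration; a real resolver works against the full code set and uses learned representations rather than string similarity.

```python
import difflib

# Tiny illustrative lookup; the synonym mappings are our own assumptions.
TERM_TO_CODE = {
    "essential hypertension": "I10",
    "high blood pressure": "I10",
    "type 2 diabetes mellitus without complications": "E11.9",
    "type 2 diabetes": "E11.9",
    "unspecified asthma, uncomplicated": "J45.909",
}

def resolve(mention, cutoff=0.8):
    """Return the code for a mention, or None if nothing is close enough."""
    key = " ".join(mention.lower().split())  # normalize case and whitespace
    if key in TERM_TO_CODE:
        return TERM_TO_CODE[key]
    # Fall back to the closest known term to tolerate small typos.
    close = difflib.get_close_matches(key, list(TERM_TO_CODE), n=1, cutoff=cutoff)
    return TERM_TO_CODE[close[0]] if close else None

print(resolve("High blood pressure"))    # I10
print(resolve("esential hypertension"))  # I10 (typo tolerated)
print(resolve("broken arm"))             # None (not in the table)
```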
Clinical Named Entity Recognition General Model:
As with the Clinical Assertion Model, healthcare providers can use this version of NER to analyze clinical notes, extract keywords, and assign them to specific entity types, such as PROBLEM, TEST, or TREATMENT.
For example, if a patient were treated with an insulin drip for euDKA and HTG, with a reduction in the anion gap to 13 and triglycerides to 1400 mg/dL within 24 hours, “insulin drip” and “reduction” would be flagged as TREATMENTs, “euDKA” and “HTG” would be flagged as PROBLEMs, and “the anion gap” and “triglycerides” would be flagged as TESTs.
Clinical Named Entity Recognition Posology is a more specialized version of the Clinical NER General Model. Both versions of this application can help clinical trial teams identify patients by filtering on drug and dosage.
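To show what that output looks like, here is a deliberately simple dictionary-based tagger applied to the same sentence. The term lists are taken straight from the example, so the tagger “knows” the answers in advance; a trained NER model, by contrast, learns to recognize entities it has never seen. The function name and data structure are assumptions for illustration only.

```python
import re

# Term lists copied from the example sentence; a real model learns these.
GAZETTEER = {
    "TREATMENT": ["insulin drip", "reduction"],
    "PROBLEM": ["euDKA", "HTG"],
    "TEST": ["the anion gap", "triglycerides"],
}

def tag_entities(text):
    """Return (entity text, label) pairs in the order they appear."""
    found = []
    for label, terms in GAZETTEER.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                found.append((match.start(), match.group(0), label))
    return [(span, label) for _, span, label in sorted(found)]

sentence = (
    "The patient was treated with an insulin drip for euDKA and HTG, with a reduction "
    "in the anion gap to 13 and triglycerides to 1400 mg/dL within 24 hours."
)
for span, label in tag_entities(sentence):
    print(f"{label:10} {span}")
```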
Clinical Relation Extraction Model:
Healthcare providers can use NLP to identify the strength, frequency, form, and duration associated with a particular drug. This is the job of the Clinical Relation Extraction Model, which draws connections between the different entities detected by NLP algorithms. This application of NLP in healthcare supports clinical documentation by identifying pertinent data based on the relationships that exist between key words and phrases.
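One simple way to picture relation extraction is as a two-step process: find the entities, then link each attribute to the drug it describes. The sketch below does this with regular expressions and a single heuristic, namely that an attribute belongs to the most recent drug mentioned before it. The drug list, patterns, and heuristic are all assumptions for illustration; trained relation extraction models consider far richer context.

```python
import re

# Hypothetical drug list and attribute patterns.
DRUGS = ["metformin", "lisinopril"]
ATTRIBUTE_PATTERNS = {
    "STRENGTH": r"\b\d+(?:\.\d+)?\s?(?:mg|mcg|g|units)\b",
    "FORM": r"\b(?:tablet|capsule|injection)s?\b",
    "FREQUENCY": r"\b(?:once|twice)\s+(?:daily|weekly)\b|\bevery\s+\d+\s+hours\b",
    "DURATION": r"\bfor\s+\d+\s+(?:days|weeks|months)\b",
}

def extract_relations(sentence):
    """Return (drug, relation, value) triples found in the sentence."""
    text = sentence.lower()
    drugs = sorted(
        (match.start(), match.group(0))
        for drug in DRUGS
        for match in re.finditer(r"\b" + re.escape(drug) + r"\b", text)
    )
    relations = []
    for relation, pattern in ATTRIBUTE_PATTERNS.items():
        for match in re.finditer(pattern, text):
            # Heuristic: link the attribute to the closest drug mentioned before it.
            preceding = [drug for drug in drugs if drug[0] < match.start()]
            if preceding:
                relations.append((max(preceding)[1], relation, match.group(0)))
    return relations

order = "Start metformin 500 mg tablet twice daily for 30 days, and lisinopril 10 mg once daily."
for triple in extract_relations(order):
    print(triple)
```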
Financial Contract Named Entity Recognition:
This NLP application works in much the same way as the other examples of NER shown above except, in this case, it’s applied to financial documents in order to identify organizations, individuals, monetary sums, dates, and so on. Financial Contract NER enables health insurance providers to automate the financial contract review process and flag any potential errors or fraudulent information.
Take Patient Care to the Next Level with Hitachi Solutions
When it comes to providing your patients with exceptional and, in some cases, life-saving care, you can’t afford to let anything stand in your way — especially not unstructured data.
Here at Hitachi Solutions, we’re committed to helping organizations within the healthcare and health insurance industries do more with their data using innovative solutions and services, including natural language processing. All of our offerings come backed by decades of proven data science expertise, and we have the resources to help your organization go further, faster, and at scale.
Are you ready to take patient care to the next level using NLP? There’s no time like the present to get started — contact us today to learn more.
Models referenced in this post were created with Spark NLP. Learn more about it here: John Snow Labs | NLP & AI in Healthcare