Using Artificial Intelligence and Machine Learning to Understand Unstructured EHR Data

    How AI and Machine Learning Are Increasing Insights from Unstructured Data

    There is an abundance of information in the healthcare world, especially with new databases emerging from patient claims, electronic health records (EHRs), social media, and new customer engagement channels. Note that electronic medical records (EMRs) generally capture a patient’s specific interactions with specific healthcare activities that result in billing. EHR data is a longitudinal aggregation of a patient's EMR data over time.

    There are substantial opportunities to leverage EHR data for more precise, efficient, and effective interventions at the right moment in the patient care journey. However, as these emerging databases grow and the underlying data becomes increasingly unstructured, finding the critical ‘needles’ of insight in the ‘haystack’ becomes increasingly difficult.

    This is where Artificial Intelligence/Machine Learning (AI/ML) can be a game-changer. AI/ML allows humans to gain unprecedented insights into improving diagnostics and care processes, understanding treatment variability, and enhancing patient outcomes. One such area of application is EHRs.

    EHRs are instrumental in the healthcare industry’s journey towards digitalization. However, alongside their benefits, EHRs also bring a myriad of challenges, including cognitive overload, endless documentation, and user burnout.

    It was reported in the journal Mayo Clinic Proceedings that clinical documentation and review may be the leading cause of lost physician productivity in the United States.1 Physicians spend 34-55 percent of their workday creating notes and reviewing medical records in EHRs, time that is diverted from direct patient interactions.1


    AI & Machine Learning

    EHRs are the single source for the complete history of patient data. Extracting and analyzing that wealth of information in an accurate, timely, and reliable manner is a continual challenge for providers and developers. Data quality and integrity issues, multiple data formats (both structured and unstructured), and incomplete records have made it very difficult to get actionable insights.

    AI/ML is creating more intuitive interfaces that surface more of the information held in EHRs and turn it into more complete insights for healthcare professionals (HCPs). It automates time-consuming routine processes to save valuable time, and it provides a window into patient information that was being collected but was not available or visible to the HCP for decision-making. Additionally, analytics on EHR data are producing successful risk-scoring and stratification tools, especially when researchers employ Machine Learning/Natural Language Processing (ML/NLP) techniques to identify novel connections between seemingly unrelated datasets.

    Axtria worked with a diversified healthcare company, using AI/ML to extract biomarker information from unstructured data. The objective was to provide a comprehensive view of patient reports and deliver critical insights to healthcare providers from lab test results for breast and ovarian cancer.

    The specialty healthcare providers were dealing with an inefficient data collection and information delivery process. The stakeholders (doctors, paramedical staff, nursing staff, technicians, and management) were viewing data in a raw image format. Additionally, biomarker data and other biographic information that was critical to treating breast and ovarian cancer patients was going undiscovered. As a result, comprehensive diagnosis and downstream healthcare delivery were adversely affected by:

    1.  Inefficient patient treatment – leading to high medical treatment costs for patients and the healthcare provider.

    2.  Inadequate patient care – heavy dependence on the doctor’s experience for prescription and prognosis, rather than on available data and information, leading to inconsistency in healthcare delivery.

    To overcome these challenges, the Axtria team extracted and analyzed information from ~1,000,000 EMR/EHR files in PDF, image, and XML formats, totaling ~7 terabytes of data. Using ML/NLP, the team scanned and mined the EHR and EMR files. We focused our effort on using cloud and distributed computing to upgrade data environments, experimenting with different tools to convert images to text, and applying ML/NLP to extract accurate information from the noisy text. The resulting process automated the handling of scanned records and analyzed the unstructured data that was going undiscovered.
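
    To make the last step more concrete, the following is a minimal sketch of pulling biomarker results out of noisy text that has already been converted from images. It is not the pipeline used in the engagement: the biomarker names, the result vocabulary, the OCR confusions tolerated, and the function names are all assumptions chosen for illustration, and a simple rule-based matcher stands in for the ML/NLP models.

```python
import re

# Hypothetical biomarker patterns (not taken from the case study). Each one
# tolerates a few common OCR slips, such as "l" or "I" read in place of "1",
# or a stray space or hyphen inside the name.
BIOMARKER_PATTERNS = {
    "BRCA1": r"BRCA[\s-]?[1lI]",
    "BRCA2": r"BRCA[\s-]?2",
    "HER2": r"HER[\s-]?2",
}

# Assumed vocabulary for how a result is written in a lab report.
RESULT_PATTERN = r"(positive|negative|pos|neg|detected|not\s+detected)"


def normalize_result(raw):
    """Map the matched wording onto 'positive' or 'negative'."""
    raw = raw.lower()
    if raw.startswith("neg") or raw.startswith("not"):
        return "negative"
    return "positive"


def extract_biomarkers(ocr_text):
    """Return {biomarker: result} for each biomarker found in noisy OCR text."""
    # Collapse line breaks and repeated spaces introduced by OCR.
    text = re.sub(r"\s+", " ", ocr_text)
    findings = {}
    for name, marker in BIOMARKER_PATTERNS.items():
        # Biomarker name, then at most 30 characters within the same
        # sentence, then a result word.
        pattern = re.compile(
            r"\b" + marker + r"\b[^.;]{0,30}?\b" + RESULT_PATTERN + r"\b",
            re.IGNORECASE,
        )
        match = pattern.search(text)
        if match:
            findings[name] = normalize_result(match.group(1))
    return findings


sample = "Pt: Jane D0e\nBRCAl : NEGATIVE\nHER-2   status  Positive.\nBRCA 2 not\ndetected"
print(extract_biomarkers(sample))
```

    A rule-based matcher like this is easy to audit but brittle: it only looks a short distance past each biomarker name, so a result that belongs to a neighboring marker can be misattributed. That limitation is one reason to apply trained ML/NLP models to noisy text at scale.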

    Using techniques like image recognition, sentiment analysis, and intent detection, Axtria’s experience in big data and ML helped the client by generating the following insights:

    1.  Aided physician decision-making and improved the quality of care.
    2.  Identified a new cancer patient population.
    3.  Provided a better understanding of the disease area and treatment patterns.
    4.  Drove patient adherence through tracking patients and their disease progression. 

    We delivered a significant 11% improvement in reporting accuracy
    (going from ~85% to over 95%) with < 2% error rates.

    In addition, tailored information dashboards were delivered for each stakeholder, facilitating easy dissemination of patient insights to physicians. The dashboards combined comprehensive information with previously inaccessible lab results, physician notes, and encounter notes, which ultimately led to an increase in decision-making accuracy.

    The solution also helped the hospital network tap into a new source of revenue: insights from anonymized patient data that manufacturers can use to aid R&D and improve delivery of treatments to patients.


    AI/ML is at a stage of continuous evolution, with vast potential to deliver better results across the healthcare ecosystem. Understanding the strengths, limitations, and iterative nature of these approaches is critical to the proper use of AI/ML tools. At the same time, organizations should have a basic understanding and realistic expectations of these methods. It is important to note that there is often an ‘accuracy versus interpretability’ trade-off in ML models. That is, in many cases the most accurate algorithms do not provide insights that are easy to interpret, and intervention programs need to be designed around them to modify the outcome of interest (e.g., low patient support service utilization or low adherence). AI/ML needs human intelligence to command, nurture, and amplify it as a partner in patient care. It takes an efficient combination of people, tools, and resources to leverage advanced technologies that will usher in a new era of clinical quality and exciting breakthroughs in patient care.




    1.  Reimagining Clinical Documentation With Artificial Intelligence by Steven Y. Lin, MD; Tait D. Shanafelt, MD; Steven M. Asch, MD, MPH, published in May 2018. Available at
