What Is Predictive Analytics?
Throughout history, companies have needed to anticipate risks and jump on opportunities. Knowing what's coming ahead can mean the difference between a business that merely survives and one that thrives. Forward-thinking companies are now turning to predictive analytics to guide them. Predictive analytics is an advanced analytics methodology that uses historical data to forecast future events by combining several elements – from standard statistical modeling to more modern approaches like data mining and machine learning.
A study by Allied Market Research found that “the global predictive analytics market size was valued at $7.32 billion in 2019, and is projected to reach $35.45 billion by 2027, growing at a CAGR of 21.9% from 2020 to 2027.” [1] This rapid growth is driven by the enormous volume of data now being generated, by computers with far greater processing power than in the past, and by easier-to-use software. This deluge of information includes structured data as well as items not usually seen in an organized database, such as text, video, and data streams from Internet of Things (IoT) devices.
Before we delve deeper into common examples and uses of predictive analytics, it is crucial to understand the four analytics levels. Each of the four types (descriptive, diagnostic, predictive, and prescriptive) plays an important role in uncovering business solutions.
Together, all four types of analytics create a complete story of what the data is telling us.
- Descriptive analytics is the most basic level of analytics and answers the question, “what happened?” It primarily observes and reports data by collating and segregating it into manageable groupings that reveal insights. This is done either manually or by using business intelligence (BI) tools and graphics such as pie charts, bar charts, line graphs, or tables to make the data easier to understand. Descriptive analytics provides important information on the general functioning of the business and explains what has happened.
- Diagnostic analytics is where we get to the “why did it happen?” It is usually performed using techniques like data discovery, drill-down, data mining, and correlations. It helps find the reason behind the facts revealed during the descriptive analysis. Once we drill down and dive deep into a specific feature of data, we can set up our analysis and draw our conclusion.
- Predictive analytics answers the question, “what might happen in the future?” It uses past results to estimate likely outcomes.
- Prescriptive analytics is the most advanced level of analysis. It goes beyond projections and forecasts to propose the next actions, recommending the best course forward. This type of analytics is instrumental in driving data-informed decision-making.
It is easy to see that all four types of analysis have a crucial role in solving business problems. But for your company or team to zero in on the right data analytics strategy, you must identify some important factors first. What is the current state of your data? How is the data stored? What is the existing data analytics process in your company? Are the solutions to your problems obvious, or do they need a deep dive? You must also understand how far your current data insights are from the desired state. All these must be known when designing the optimal technology stack and developing a detailed roadmap for successful implementation.
Uses of Predictive Analytics Across Different Industries
Predictive analytics is widely used by sales and marketing teams to optimize their operations and efficiency. Here are a few commonly applied examples across industries.
- Healthcare has benefited immensely from predictive modeling in improving patient care outcomes and proactive treatment. Real-time data analysis identifies patients at risk before they show clear signs of disease, allowing physicians to focus on preventive treatment and avoid further deterioration caused by the disease.
- Financial organizations use predictive modeling to build credit risk models that help them make better decisions about which potential borrowers they should lend to and which they should deny. Predictive modeling also helps determine credit limits. Data like the customer’s credit background, payment history, and other related customer data determine how much money a client can borrow. They then use this information to evaluate the customer’s credit score and determine if that person can submit credit payments on time or if they will default. Default, as we know, can happen for many reasons, such as if the borrower loses their job, experiences a financial crisis, or dies.
- Analytics also plays a role in our everyday shopping. For instance, Amazon can tell you what other consumers think about a product. They can identify and recommend similar products by analyzing your historical purchase data. Beyond general products, Amazon also recommends books and movies to users based on their reading and viewing habits.
- Lately, we have seen a rise in the use of analytics in human resources (HR), such as identifying employee churn and determining why employees want to quit. When shared with line managers, this information can help them make plans to retain high-performing employees who are likely to give their notice.
Predictive Modeling vs. Predictive Analytics
Though the two terms are often used interchangeably, predictive modeling and predictive analytics are different. Predictive analytics refers to data analysis, i.e., using data and evidence to drive insights. Predictive modeling, on the other hand, refers to building a structure for that data; in layman’s terms, a blueprint or design that is then used to explore the information.
Some of the most widely used predictive models are described below:
- Forecast model – This is one of the most common types of predictive models. It uses past data to predict future values.
- Classification model – These models are best for making decisions. For example, a classification model might decide which of a set of categories a data point belongs to, such as assigning a benign versus malignant diagnosis based on a patient's characteristics.
- Outlier model – An outlier model is used to identify inconsistent data points that fall outside a data set's expected range.
- Time series model – This model uses past data, ordered in a sequence, to forecast future results.
- Segmentation model – These models are used to identify groups of data points that are very similar based on the customer’s preferences, activities, demographics, or any other variable of interest.
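To make the forecast and time series entries above more concrete, here is a minimal sketch of simple exponential smoothing, one of many possible time series techniques. The function name, the smoothing factor `alpha`, and the monthly figures are all hypothetical choices made for illustration; they do not come from any particular project.

```python
def exponential_smoothing_forecast(history, alpha=0.5):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    history: past observations ordered from oldest to newest.
    alpha:   smoothing factor in (0, 1]; higher values weight recent data more.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")

    level = history[0]  # start from the first observation
    for value in history[1:]:
        # Blend the newest observation with the running smoothed level.
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly unit sales, oldest first (illustrative values only).
monthly_units = [120, 132, 128, 141, 150, 147]
forecast = exponential_smoothing_forecast(monthly_units, alpha=0.5)
print(f"Next-month forecast: {forecast:.1f}")
```

A real forecasting exercise would typically also account for trend and seasonality, which this sketch deliberately leaves out.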
The Role of Artificial Intelligence/Machine Learning
Artificial intelligence/machine learning (AI/ML) is an efficient tool that plays a dominant role in predictive analytics. By combining powerful computing mechanisms, statistical techniques, and complex mathematical models, AI/ML allows businesses and analysts to incorporate predictive analytics into real-time problem-solving. Particularly in the case of supervised learning, modeling techniques like tree-based classifiers, random forest, and gradient boosting allow users to build highly specialized models that serve various needs beyond just predicting future outcomes. These models help us understand key drivers and the impact of controllable variables, thereby extending their purpose beyond predictive analytics.
Before diving deep into healthcare examples, let us examine how a basic statistical model works.
The first step is to collect relevant data and prepare the required master data set. By relevant, we mean all the essential data needed to answer the business questions. This is followed by data pre-processing, which entails treating outliers and missing values and applying data transformations. Treatment of outliers could range from deleting them to replacing them with 0, depending on how negatively they affect our analysis. Data transformations help us modify raw data into a format that is appropriate for model building. Log transformation and data standardization are two commonly used transformation techniques.
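The pre-processing steps described above can be sketched in a few lines using only Python's standard library. This is an illustrative sketch, not a prescribed pipeline: the function names and sample values are hypothetical, missing values are filled with the median as one possible choice, and outliers are capped with the interquartile range rule, which sits between the two extremes mentioned above (deleting them or replacing them with 0).

```python
import math
import statistics


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def cap_outliers_iqr(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def log_transform(values):
    """Apply log(1 + x) to compress right-skewed, non-negative data."""
    return [math.log1p(v) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical raw feature with one missing value and one extreme value.
raw = [10, 12, None, 11, 13, 500, 12]

filled = fill_missing_with_median(raw)
capped = cap_outliers_iqr(filled)
standardized = standardize(log_transform(capped))
print(capped)
print([round(v, 3) for v in standardized])
```

The order of the steps matters: the extreme value is capped before standardization so that it does not distort the mean and standard deviation used for rescaling.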
After preparing the master data set, we decide on the appropriate algorithm based on data type and resultant outputs. Sometimes, a single algorithm may not be enough. We may need to find the right fit among multiple algorithms based on accuracy and model efficacy. The final model is evaluated and validated through several tests before its deployment. Best practices in predictive analytics call for continued monitoring of the model and ongoing improvement for future values.
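As a rough illustration of choosing among multiple algorithms based on accuracy, the sketch below fits two candidate models on a training window and keeps the one with the lower error on a holdout window. Everything here is an assumption made for the example: the two candidates (a mean baseline and a one-variable least-squares line), the error metric (mean absolute error), the single holdout split, and the data.

```python
import statistics


def fit_mean_model(xs, ys):
    """Baseline candidate: always predict the training average."""
    average = statistics.fmean(ys)
    return lambda x: average


def fit_linear_model(xs, ys):
    """Candidate: ordinary least-squares line through the training points."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_absolute_error(model, xs, ys):
    """Average absolute gap between predictions and actual values."""
    return statistics.fmean([abs(model(x) - y) for x, y in zip(xs, ys)])


def select_best_model(candidates, train, holdout):
    """Fit each candidate on train and rank them by holdout error."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(*train)
        scores[name] = mean_absolute_error(model, *holdout)
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: x is a period index, y is an observed outcome.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.2]
train = (xs[:6], ys[:6])      # earlier periods used for fitting
holdout = (xs[6:], ys[6:])    # later periods kept back for validation

candidates = {
    "mean_baseline": fit_mean_model,
    "linear_regression": fit_linear_model,
}
best_name, holdout_scores = select_best_model(candidates, train, holdout)
print(best_name, holdout_scores)
```

In practice, validation usually involves more than one split (for example, cross-validation) and more than one metric; a single holdout window is used here only to keep the sketch short.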
Predictive Analytics in Healthcare
For the healthcare industry, predictive analytics plays a central role across the key pillars of regression, classification, and forecasting. One industry-standard example is leveraging predictive statistical techniques and models to understand projected demand when developing short- and long-term forecasts. This helps drug manufacturers calibrate their supply chains and negotiate better contracts. Short-term projection models also help trigger gap analysis by comparing expected demand against unexpected dips or peaks, helping manufacturers better understand their customers and distribution patterns. Both regression and traditional forecasting techniques, such as time series modeling, are used to develop forecasts. Classification models leveraging logistic regression, random forest, and similar techniques are commonly implemented to predict patient and prescriber outcomes. It is common practice to leverage patient-level data to determine diagnoses, disease stages, and procedures to fill in the patient journey gaps left by data restriction requirements.
Another common practice is to use predictive analytics, particularly classifier-based models, to anticipate prescriber outcomes. It is possible to develop models that build upon granular patient and prescribing healthcare provider data to predict which physicians are likely to write for or adopt particular drugs. These models rely on past prescribing history, available patient pool, response to past promotions, or a combination of these.
One such example, among many others, is a model Axtria developed and deployed to help our client partners identify physicians likely to prescribe their drug in a short-term horizon (within six months). We developed a model trained on past prescribing behavior that can be refreshed periodically to predict future prescribing events. Input data includes patient pool attributes, exposure to promotions, competitor prescribing behavior, and other key aspects relevant to the patient journey. This approach is particularly beneficial as it helps our partners identify potential targets in an ever-dynamic space. The transparent nature of the model also enables additional downstream analytics—understanding drivers for writing and estimating the impact of promotions on writing behavior.
Interpretability techniques like SHAP (SHapley Additive exPlanations) values are implemented to determine what drives prescription writing. SHAP is a technique based on cooperative game theory and is widely used to interpret machine learning models. Layering SHAP values on top of a model opens it up, making it easy to see how each independent variable contributed to the model's final prediction. This approach makes it possible to understand differences in individual prescribers' behaviors and further group them or create segments based on similar behaviors. Often, driver analyses help identify curious nuances of behaviors and lead to a better understanding of customers. One example is the response to promotion and how it can impact writing behavior. In the healthcare space, physicians can be reached by several channels, both personal and non-personal. Traditional methods of personal promotion include interactions with sales reps, whereas non-personal promotion can include advertising via website banners, print, and emails. By looking at the sales contributions of each promotion variable and modulating input data, it is possible to simulate different promotion scenarios and determine their effect. Movement in the predicted scores of physicians and, more importantly, tracking the magnitude of the movement of scores can help inform decisions about which prescribers to promote your drugs to, how to promote them, and at what frequency.
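To illustrate the game-theory idea behind SHAP values, the sketch below computes exact Shapley values for a single prediction of a tiny toy model by averaging each feature's marginal contribution over every possible feature ordering. It is an assumption-laden teaching example, not the model described above and not the SHAP library itself: the scoring function, feature names, and all-zero baseline are hypothetical, and production tools rely on faster approximations and a background data set rather than one baseline row.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    Features are switched from their baseline value to their actual value
    one at a time; each feature's contribution is the average change in the
    prediction across all possible orderings.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_score(features):
    """Hypothetical prescribing score with an interaction between channels."""
    return (
        0.4 * features["past_scripts"]
        + 0.2 * features["rep_calls"]
        + 0.1 * features["rep_calls"] * features["emails"]
    )


# Hypothetical physician and an all-zero reference point.
physician = {"past_scripts": 5, "rep_calls": 3, "emails": 2}
baseline = {"past_scripts": 0, "rep_calls": 0, "emails": 0}

contributions = shapley_values(toy_score, physician, baseline)
for feature, value in contributions.items():
    print(f"{feature}: {value:+.2f}")
```

The contributions add up to the difference between the physician's score and the baseline score, which is the property that makes this kind of attribution useful for driver analysis. Because the number of orderings grows factorially, this exact approach is only practical for a handful of features.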
In summary, a single model can inform:
- which customers are likely to write for the drug,
- the reasons driving them to write, and
- which customers are susceptible to promotion.
Therefore, this model is essential for sales and marketing operations. It offers rich information on likely customers and why they are viable, thus contributing to business intelligence and developing data-driven strategies for enhanced sales and marketing.
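The idea of modulating promotion inputs to simulate scenarios can be sketched as a simple what-if calculation. The example assumes a logistic scoring model with made-up coefficients and feature names; it is only meant to show the mechanics of comparing a physician's predicted score before and after a hypothetical change in promotion, not to represent any deployed model.

```python
import math

# Hypothetical coefficients for an illustrative logistic model.
WEIGHTS = {"past_scripts": 0.30, "rep_calls": 0.25, "emails": 0.05}
INTERCEPT = -3.0


def prescribing_probability(features):
    """Predicted probability of prescribing under the toy logistic model."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def simulate_scenarios(features, scenarios):
    """Return the baseline score and the score movement for each scenario."""
    base = prescribing_probability(features)
    movement = {}
    for name, changes in scenarios.items():
        adjusted = {**features, **changes}  # override only the changed inputs
        movement[name] = prescribing_probability(adjusted) - base
    return base, movement


# Hypothetical physician and promotion scenarios (illustrative values only).
physician = {"past_scripts": 4, "rep_calls": 2, "emails": 6}
scenarios = {
    "two_extra_rep_calls": {"rep_calls": 4},
    "four_extra_emails": {"emails": 10},
}

base_score, score_movement = simulate_scenarios(physician, scenarios)
print(f"Baseline score: {base_score:.3f}")
for scenario, delta in sorted(score_movement.items(), key=lambda kv: -kv[1]):
    print(f"{scenario}: {delta:+.3f}")
```

Ranking scenarios by the size of the score movement is one simple way to compare channels for a given prescriber.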
Conclusion and Takeaways
With the right combination of business acumen and statistical tools, the pharma industry can deploy predictive analytics to solve problems across several verticals, including patient, physician, and commercial analytics. With the advent of user-friendly analytics platforms, some particularly focused on automated predictive solutions, tools have become more robust and efficient, enabling complex solutions across all levels. A good mix of industry experience, business understanding, and predictive analytics deployed over the right data can empower businesses to predict the future effectively. The critical element in all of this is the human ability to look at results through the lens of industry experience and critically review, tune, and then draw insights to ensure that results from a mathematical process make sense from a real-world perspective.
This article was contributed by Gayatri Subramaniam, Associate Director, and Abhishek Adlakha, Manager, at Axtria.
1. Vishwa G, Vineet K. Predictive analytics market by component… global opportunity analysis and industry forecast, 2020-2027 [Internet]. Portland (OR): Allied Market Research; 2020 Jul [cited 2022 Jan 12]. Available from: https://www.alliedmarketresearch.com/predictive-analytics-market