Estimating Target Patient Population Size

Estimating the size of a target patient population is critical for many stakeholders within the healthcare industry. For pharmaceutical companies, a strong understanding of the patient population should feed into research & development, forecasting, physician targeting, and discussions with payers and health technology assessment (HTA) bodies. For payers, the size of the target patient population aids in estimating the budget impact of treatments and the overall burden of disease.

For many diseases, reputable sources publish robust estimates of national prevalence. For example, the Centers for Disease Control and Prevention (CDC) estimates that 30.3 million Americans have diabetes [1] and that approximately 630,000 Americans die from heart disease each year [2]. These estimates, however, do not account for comorbidities in a target population, quantifiable health metrics such as blood glucose levels, or current treatments.

While real-world data can be used to identify target patient populations based on a variety of factors, the datasets are inherently biased. For example, employer-based claims data over-represents people who have insurance and who are not retired. Other biases may involve gender, region, and comorbidities. It is therefore crucial to correct for those biases in order to develop robust estimates for the national target patient population.

A poster presented at the National Lipid Association (NLA) Scientific Sessions [3] aimed to quantify the prevalence of patients in the US with atherosclerotic cardiovascular disease (ASCVD) and diabetes, correcting for biases in the data. An additional analysis filtered the patient population by statin and ezetimibe use and by different thresholds of low-density lipoprotein cholesterol (LDL-C).

The analysis leveraged the MarketScan Research Database, an employer-based claims database. The MarketScan Database is very large and robust, making it a good choice for this analysis. However, certain biases needed to be accounted for, including the underrepresentation of retirees.

To perform this analysis, ASCVD and diabetes patients with a valid LDL-C measurement were identified in the MarketScan Database along with their current statin/ezetimibe use. The ASCVD characteristics of interest were coronary heart disease (CHD), history of ischemic stroke, and peripheral arterial disease (PAD). Patients were stratified into distinct profiles that captured the various combinations of ASCVD/diabetes characteristics, such as CHD only, CHD + PAD, ischemic stroke + diabetes, etc. A national prevalence was then estimated for each profile so that, when the profile estimates are summed by characteristic, the extrapolated totals differ as little as possible from the published national estimates for each characteristic.
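The poster summary does not say which optimization method was used, so the following is only a minimal sketch of one common way to calibrate profile counts to published totals: multiplicative raking (iterative proportional scaling). All counts and national targets in it are made-up placeholders, and the function name `rake_profiles` is hypothetical. Each pass rescales the profiles that contain a characteristic until the extrapolated total for that characteristic matches its target.

```python
# Illustrative sketch only. The sample counts and national targets are
# made-up placeholders, not MarketScan values or published estimates.

def rake_profiles(sample_counts, national_targets, max_iter=1000, tol=1e-9):
    """Scale per-profile counts so characteristic totals match the targets.

    sample_counts: dict mapping a profile (frozenset of characteristics)
        to the number of patients with exactly that profile in the data.
    national_targets: dict mapping each characteristic to a national total.
    Returns a dict mapping each profile to an extrapolated national count.
    """
    estimates = {profile: float(n) for profile, n in sample_counts.items()}
    for _ in range(max_iter):
        worst_gap = 0.0
        for characteristic, target in national_targets.items():
            current = sum(n for p, n in estimates.items() if characteristic in p)
            if current == 0:
                raise ValueError(f"no profile contains {characteristic!r}")
            factor = target / current
            worst_gap = max(worst_gap, abs(factor - 1.0))
            # Rescale every profile that includes this characteristic.
            for profile in estimates:
                if characteristic in profile:
                    estimates[profile] *= factor
        if worst_gap < tol:  # every total already matched its target
            break
    return estimates


# Hypothetical patient counts per profile in the claims data.
sample_counts = {
    frozenset({"CHD"}): 5000,
    frozenset({"stroke"}): 1200,
    frozenset({"PAD"}): 1500,
    frozenset({"diabetes"}): 9000,
    frozenset({"CHD", "diabetes"}): 2000,
    frozenset({"CHD", "PAD"}): 600,
    frozenset({"stroke", "diabetes"}): 500,
    frozenset({"CHD", "stroke", "PAD", "diabetes"}): 100,
}

# Hypothetical national totals per characteristic (persons).
targets = {
    "CHD": 15_000_000,
    "stroke": 5_000_000,
    "PAD": 8_000_000,
    "diabetes": 30_000_000,
}

national = rake_profiles(sample_counts, targets)
for profile, count in national.items():
    print(" + ".join(sorted(profile)), round(count))
print("Total patients across profiles:", round(sum(national.values())))
```

Because each patient belongs to exactly one profile, summing the profile estimates counts every patient once, which is how an approach like this avoids double-counting people who have more than one condition. Many sets of profile counts can match the same characteristic totals; raking picks one that stays close to the mix observed in the data.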

As shown in the table below, the extrapolation analysis estimated there are 35.7 million adults in the US with ASCVD and/or diabetes, accounting for comorbidities. Approximately 19.6 million of those were being treated with a statin and/or ezetimibe as of the index date. Of those patients on treatment, 35% have an LDL-C < 70 mg/dL and 79% have an LDL-C < 100 mg/dL.
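As a rough check on what these figures imply, the short sketch below derives approximate patient counts from the rounded numbers quoted above. The derived values are back-of-the-envelope approximations based only on those rounded figures, not results reported in the poster, and the variable names are illustrative.

```python
# Rounded figures quoted in the text above (millions of US adults).
total_millions = 35.7      # ASCVD and/or diabetes
treated_millions = 19.6    # on a statin and/or ezetimibe at index date

# Shares of treated patients below each LDL-C threshold, as quoted.
share_below_70 = 0.35
share_below_100 = 0.79

share_treated = treated_millions / total_millions
untreated_millions = total_millions - treated_millions
below_70_millions = share_below_70 * treated_millions
at_or_above_70_millions = treated_millions - below_70_millions
at_or_above_100_millions = (1 - share_below_100) * treated_millions

print(f"Share on statin/ezetimibe: {share_treated:.1%}")
print(f"Not on statin/ezetimibe: {untreated_millions:.1f} million")
print(f"Treated, LDL-C < 70 mg/dL: {below_70_millions:.1f} million")
print(f"Treated, LDL-C >= 70 mg/dL: {at_or_above_70_millions:.1f} million")
print(f"Treated, LDL-C >= 100 mg/dL: {at_or_above_100_millions:.1f} million")
```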


This highlights the national burden of ASCVD/diabetes in the US, accounting for the fact that national estimates of cardiovascular disease and diabetes include many overlapping patients. Additionally, this analysis shows that statins and ezetimibe are underutilized in this high-risk cohort and that a significant proportion of treated patients fail to achieve recommended LDL-C levels.

This analysis provides a robust estimate for the extrapolated number of patients with ASCVD/diabetes in the US, on statin/ezetimibe, and under certain LDL-C levels. Additionally, it provides a framework for estimating the size of a target patient population based on a variety of factors while accounting for biases in a data source. This framework can be used to develop robust estimates of national patient population sizes in a variety of therapeutic areas, which has implications for a wide range of stakeholders and decisions.



1. Centers for Disease Control and Prevention. National Diabetes Statistics Report. 2017.

2. Centers for Disease Control and Prevention, National Center for Health Statistics. Multiple Cause of Death 1999-2015 on CDC WONDER Online Database, released December 2016. Data are from the Multiple Cause of Death Files, 1999-2015, as compiled from data provided by the 57 vital statistics jurisdictions through the Vital Statistics Cooperative Program. Accessed at 

3. Gorcyca KM, Khan I, Wadhera R, et al. Prevalence of Atherosclerotic Cardiovascular Disease and Diabetes in the United States. Poster Presented at: National Lipid Association (NLA) Scientific Sessions 2015; June 11-14, 2015; Chicago, IL, USA. 

Tags: Pharma Analytics, HEOR, Health Economics, Real-World Evidence, RWE

Alexa Klimchak

Alexa Klimchak is an Associate Director at Axtria. Alexa’s core expertise is in developing innovative simulations that evaluate the patient journey and the impact of healthcare interventions. Alexa has 10 years of experience in modeling & simulation, health economics & outcomes research (HEOR), and management consulting. She partners closely with her clients and experts to analyze and solve complex HEOR problems. Her analyses are used for publications, global Health Technology Assessment (HTA) submissions, discussions with payers, and internal decisions.
