What Proportion of Drugs Approved in 2000 Have Generics in 2024?
of drugs approved in 2000 have generic equivalents in 2024
Questions like this are our starting points when we exploring new datasets or ideas. In this case, I used our custom Orange Book database and some SQL. This is a brief description of how I approached this question.
How to Find Answers ¶
The first step is identifying the relevant dataset(s). The Orange Book is the FDA’s official guide for Approved Drug Products with Therapeutic Equivalents 1. The Orange Book provides a search tool, which is useful for finding the information FDA curators determined would be the most commonly searched, such as generics by company or brand name drug. Searching here returns approval dates as well as the brand name/generic, so I know the information I need is in the Orange Book, but this search interface won’t let me answer this question. Luckily, FDA also provides monthly database exports!
Getting to Know the Data ¶
After identifying the dataset, the next step is figuring out how to work
with it. Generics (abbreviated new drug applications, or ANDA) and brand
name drugs (new drug applications, or NDA) are identified via the
ApplType
column. However, there is no single column that links a given
brand drug to its generics. There are columns describing reference
listed drugs (RLD), reference standards (RS), product numbers,
ingredients, doses, routes of administration, therapeutic equivalence
codes and more. When working in the curated search tools, all this
information was abstracted away, and I was given a simple window to view
the data. Now, I need to build my own window, so it is important to
figure out:
- How does FDA define a generic (ANDA), and how are they approved?
- What is RLD, RS, and TE, and how do they relate to generics?
- How do you match all the approved generics to a given brand name drug (if they exist)?
I’ve found that the best starting point to start addressing these questions is the is the Orange Book Preface.
Building the Query ¶
After some investigation, I figured out that I could identify generics for a given brand drug by matching on ingredient, route of administration, and strength2. An RLD has a defined RS, which is what the FDA chooses as the standard ANDA’s are compared to for that drug. For this question, I didn’t worry about RLD/RS and TE.
I filtered for the first generic by selecting only product number
001
for the brand name drug and the minimum date for the generic. I
also filtered out brand name drugs that were discontinued as of
September 2024.
Knowing Your Data ¶
Even with curated datasets, it is vital to dive deep and discover some of the context of the data before attempting to answer questions. I’ve found that simple questions like the one I addressed here are very useful for building up the context necessary to answer more interesting questions with this data. The real magic comes from discovering and linking different datasets.
There are plenty of ways to link FDA data to other data sources. We’ll cover more in future posts.
We know pharma data. Sign up for our newsletter for research and data product updates.
Let us know if you have any feedback, ideas, or corrections by emailing [email protected]
Therapeutic equivalents are drugs with the same dosage, routes, and active ingredients, that have demonstrated bioequivalence and can be expected to have the same clinical and safety profile. See our post on the FDA Orange Book for more information. ↩︎
This is an abbreviated version of the sqlite query ``` WITH BrandDrugs AS ( SELECT DISTINCT op.DistType AS BrandDistType, op.TradeName, op.Ingredient, op.DFRoute, op.Strength, op.Applicant AS BrandApplicant, op.ApplNo AS BrandApplNo, op.ProductNo AS BrandProductNo, op.ApprovalDate AS BrandApprovalDate FROM Obproduct op WHERE op.ApplType = ‘N’ AND BrandProductNo = ‘001’ AND strftime(“%Y”, op.ApprovalDate) = ‘2000’ ), FirstGenerics AS ( SELECT op.DistType AS GenericDistType, op.Ingredient, – … and so on GROUP BY op.Ingredient, op.DFRoute, op.Strength ) SELECT bd.BrandDistType, fg.GenericDistType, bd.TradeName, – … other fields fg.FirstGenericApprovalDate, JulianDay(fg.FirstGenericApprovalDate) - JulianDay(bd.BrandApprovalDate) AS DaysToFirstGeneric FROM BrandDrugs bd LEFT JOIN FirstGenerics fg ON bd.Ingredient = fg.Ingredient AND bd.DFRoute = fg.DFRoute AND bd.Strength = fg.Strength ORDER BY bd.BrandApprovalDate ASC;
``` ↩︎