What Proportion of Drugs Approved in 2000 Have Generics in 2024?

Density plot showing the distribution of time to generic approval for drugs approved in 2000 that are still produced
81%

of drugs approved in 2000 have generic equivalents in 2024

Questions like this are our starting points when we exploring new datasets or ideas. In this case, I used our custom Orange Book database and some SQL. This is a brief description of how I approached this question.

How to Find Answers

The first step is identifying the relevant dataset(s). The Orange Book is the FDA’s official guide for Approved Drug Products with Therapeutic Equivalents 1. The Orange Book provides a search tool, which is useful for finding the information FDA curators determined would be the most commonly searched, such as generics by company or brand name drug. Searching here returns approval dates as well as the brand name/generic, so I know the information I need is in the Orange Book, but this search interface won’t let me answer this question. Luckily, FDA also provides monthly database exports!

Getting to Know the Data

After identifying the dataset, the next step is figuring out how to work with it. Generics (abbreviated new drug applications, or ANDA) and brand name drugs (new drug applications, or NDA) are identified via the ApplType column. However, there is no single column that links a given brand drug to its generics. There are columns describing reference listed drugs (RLD), reference standards (RS), product numbers, ingredients, doses, routes of administration, therapeutic equivalence codes and more. When working in the curated search tools, all this information was abstracted away, and I was given a simple window to view the data. Now, I need to build my own window, so it is important to figure out:

I’ve found that the best starting point to start addressing these questions is the is the Orange Book Preface.

Building the Query

After some investigation, I figured out that I could identify generics for a given brand drug by matching on ingredient, route of administration, and strength2. An RLD has a defined RS, which is what the FDA chooses as the standard ANDA’s are compared to for that drug. For this question, I didn’t worry about RLD/RS and TE.

I filtered for the first generic by selecting only product number 001 for the brand name drug and the minimum date for the generic. I also filtered out brand name drugs that were discontinued as of September 2024.

Knowing Your Data

Even with curated datasets, it is vital to dive deep and discover some of the context of the data before attempting to answer questions. I’ve found that simple questions like the one I addressed here are very useful for building up the context necessary to answer more interesting questions with this data. The real magic comes from discovering and linking different datasets.

There are plenty of ways to link FDA data to other data sources. We’ll cover more in future posts.


  1. Therapeutic equivalents are drugs with the same dosage, routes, and active ingredients, that have demonstrated bioequivalence and can be expected to have the same clinical and safety profile. See our post on the FDA Orange Book for more information. ↩︎

  2. This is an abbreviated version of the sqlite query ``` WITH BrandDrugs AS ( SELECT DISTINCT op.DistType AS BrandDistType, op.TradeName, op.Ingredient, op.DFRoute, op.Strength, op.Applicant AS BrandApplicant, op.ApplNo AS BrandApplNo, op.ProductNo AS BrandProductNo, op.ApprovalDate AS BrandApprovalDate FROM Obproduct op WHERE op.ApplType = ‘N’ AND BrandProductNo = ‘001’ AND strftime(“%Y”, op.ApprovalDate) = ‘2000’ ), FirstGenerics AS ( SELECT op.DistType AS GenericDistType, op.Ingredient, – … and so on GROUP BY op.Ingredient, op.DFRoute, op.Strength ) SELECT bd.BrandDistType, fg.GenericDistType, bd.TradeName, – … other fields fg.FirstGenericApprovalDate, JulianDay(fg.FirstGenericApprovalDate) - JulianDay(bd.BrandApprovalDate) AS DaysToFirstGeneric FROM BrandDrugs bd LEFT JOIN FirstGenerics fg ON bd.Ingredient = fg.Ingredient AND bd.DFRoute = fg.DFRoute AND bd.Strength = fg.Strength ORDER BY bd.BrandApprovalDate ASC;

    ``` ↩︎