Key Points

Clinical trials investigating novel pharmacotherapies for alcohol use disorder (AUD) traditionally use abstinence-based drinking outcomes or no heavy drinking days as trial endpoints to determine the efficacy of pharmacotherapies.

Recent developments in novel endpoints for AUD pharmacotherapy trials include the utilization of reductions in World Health Organization (WHO) risk drinking levels as harm-reduction endpoints.

Biological markers of alcohol use may also serve as objective endpoints in AUD pharmacotherapy trials.

The updated definition of recovery for AUD, as outlined by the National Institute on Alcohol Abuse and Alcoholism (NIAAA), offers new opportunities for the use of endpoints beyond measures of alcohol consumption itself.

1 Introduction

Alcohol use disorder (AUD) is a highly pervasive disorder affecting over 28.6 million adults in the USA [1]. Moreover, 93.4 million adults are estimated to suffer from AUD during their lifetime [2]. AUD is characterized by the continued use of alcohol despite significant social, psychological, and medical consequences [3]. These consequences include health conditions such as cardiovascular disease, cancer, liver cirrhosis, and injuries [4]. It is estimated that more than 140,000 people die from alcohol-related causes annually [5], making excessive alcohol use a leading cause of preventable death [6]. Despite these negative consequences, only an estimated 4.6% of adults with AUD received any treatment for alcohol use in 2021 [1], and only 1.6% of adults with AUD received evidence-based medications for AUD in 2019 [7].

To date, three medications have been approved by the US Food and Drug Administration (FDA) for the indication of AUD [8, 9]: disulfiram, acamprosate, and naltrexone (formulated for oral administration or extended-release injection). These medications, along with nalmefene, are recognized by the European Medicines Agency (EMA) as established pharmacotherapies for AUD.

Disulfiram, the first FDA-approved medication for AUD, is an aldehyde dehydrogenase inhibitor that blocks the breakdown of acetaldehyde, an intermediate product of alcohol metabolism, which leads to an accumulation of acetaldehyde in the blood following alcohol intake [10]. The accumulation of acetaldehyde leads to unpleasant physiological reactions including flushing, nausea, vomiting, headache, and tachycardia [10]. Thus, disulfiram is intended to create an aversive reaction to alcohol. Disulfiram is administered daily to support individuals as they work towards self-management, with treatment duration varying from months to years depending on the individual [11]. Due largely to problems with medication adherence, disulfiram has shown mixed clinical efficacy [12]. Moreover, because the psychological expectation of an aversive reaction drives much of its effect, disulfiram is no more efficacious than placebo in blinded randomized controlled trials, but is more efficacious than placebo in open-label trials [13]. Thus, the risk-benefit profile of disulfiram relative to no medication is more favorable in naturalistic settings, where patients know what medication they are taking.

Acamprosate is thought to interact with the glutamatergic system, but its exact mechanism of action remains unclear [3, 14]. Acamprosate may be effective in helping individuals achieve and maintain abstinence, with treatment duration varying from months to years depending on the individual [15, 16]. Compared to placebo, acamprosate has been shown to significantly reduce the risk of returning to any drinking by 86% [17]. Adverse effects from acamprosate are generally minimal, with diarrhea being the most common side effect reported [18]. Considering acamprosate's significant benefits and minimal adverse events, the risk-benefit profile of acamprosate is favorable compared to placebo.

Naltrexone is a mu-opioid antagonist and is approved as a treatment for both opioid use disorder and AUD [19]. Oral naltrexone results in lower rates of relapse [20] and reductions in subjective pleasurable effects of alcohol [21,22,23], craving for alcohol [24, 25], and drinks per drinking day [26]. Oral naltrexone is typically prescribed to be taken once daily for a period of up to 12 weeks [27]. Similarly, injectable naltrexone reduces the number of heavy drinking days [28, 29]. Injectable naltrexone is administered intramuscularly once every 4 weeks and can be used for 6 months or longer [30]. Naltrexone is generally well tolerated with minimal adverse effects [22, 31]. Placebo-controlled randomized controlled trials of naltrexone indicate a 50% reduction in relapse rates with naltrexone versus placebo [32]. This significant benefit, coupled with the good tolerability and low rate of adverse events of naltrexone, indicates a favorable risk-benefit ratio of naltrexone in comparison to lack of treatment [32].

These FDA-approved pharmacotherapies for AUD have a moderate effect on the overall reduction of alcohol use [1]. Given the limited number of current FDA-approved pharmacotherapies for AUD, their moderate treatment effects, limited prescription rates, and poor medication adherence (disulfiram), there is a need to facilitate the development and regulatory approval of novel medications that can effectively treat AUD.

While continuous research aimed at advancing novel treatments is a crucial aspect of this process, it is equally essential to examine the application of endpoint metrics in clinical trials for AUD. Presently, the FDA accepts abstinence and no heavy drinking days as primary outcomes for phase 3 trials of AUD pharmacotherapy [33]. While these outcomes remain crucial benchmarks, broadening the scope of metrics and approved endpoints accepted by regulatory bodies could enhance the medication approval process. In this article, we explore the current methods used to measure alcohol use in clinical trials for pharmacotherapy development. We also examine how these measurements are utilized for medication approval. In addition, we discuss alternative methods of measuring alcohol use and medication efficacy that could enhance insights from clinical trials. Finally, we explore other potential endpoints and analysis methods that regulatory bodies could adopt to improve the development and approval process of pharmacotherapies for AUD.

2 Measurements of Alcohol Use

Alcohol consumption is typically assessed using methods such as the Timeline Follow-Back (TLFB) interview [34], quantity-frequency (QF) questionnaires [35], and Form 90 [36]. The TLFB, a semi-structured interview, is designed to evaluate the daily consumption of drinks within a specified timeframe, commonly the preceding 7 to 30 days. In clinical pharmacotherapy trials, the TLFB may be administered at baseline and then at follow-up time points spanning the treatment period to assess the daily number of drinks consumed. Though the TLFB has high reliability [34], it may be influenced by recall bias and a tendency to under-report daily alcohol use due to social desirability effects [37]. While the TLFB is traditionally administered as a semi-structured interview [24], self-administration of the TLFB is often utilized. In a comparative study, participants reported consuming more total drinks in a repeated self-administered 7-day TLFB compared to an interview-administered 30-day TLFB during overlapping time periods [38]. A daily level analysis revealed that these discrepancies increased with the length of recall, suggesting that the 7-day TLFB might increase accuracy in documenting drinking events [38]. TLFB interviews of recent alcohol use allow for the quantification of different aspects of drinking behavior, including drinks per drinking day, percentage of heavy drinking days, drinks per day, and percentage of abstinent days. These calculations often require manual conversion between participant-reported drink amounts and standard drinks based on beverage volume and alcohol by volume (ABV) [39].
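The drinking outcomes derived from a TLFB calendar can be illustrated with a short sketch. The example below is a minimal illustration rather than a validated scoring procedure: the function names, the 14 g US standard drink, and the ethanol density of 0.789 g/mL are assumptions introduced here, and the heavy drinking cutoff is passed in by the caller.

```python
from dataclasses import dataclass

# Assumptions for illustration: a US standard drink of about 14 g of ethanol
# and an ethanol density of 0.789 g/mL.
GRAMS_PER_STANDARD_DRINK = 14.0
ETHANOL_DENSITY_G_PER_ML = 0.789


def standard_drinks(volume_ml: float, abv_percent: float) -> float:
    """Convert a beverage volume and its ABV into standard drinks."""
    grams = volume_ml * (abv_percent / 100.0) * ETHANOL_DENSITY_G_PER_ML
    return grams / GRAMS_PER_STANDARD_DRINK


@dataclass
class TlfbSummary:
    drinks_per_drinking_day: float
    percent_heavy_drinking_days: float
    percent_days_abstinent: float


def summarize_tlfb(daily_drinks: list[float], heavy_cutoff: float) -> TlfbSummary:
    """Summarize one participant's daily standard-drink counts."""
    if not daily_drinks:
        raise ValueError("daily_drinks must contain at least one day")
    n_days = len(daily_drinks)
    drinking_days = [d for d in daily_drinks if d > 0]
    heavy_days = [d for d in daily_drinks if d >= heavy_cutoff]
    # Drinks per drinking day is undefined with no drinking days; report 0.0.
    dpdd = sum(drinking_days) / len(drinking_days) if drinking_days else 0.0
    return TlfbSummary(
        drinks_per_drinking_day=dpdd,
        percent_heavy_drinking_days=100.0 * len(heavy_days) / n_days,
        percent_days_abstinent=100.0 * (n_days - len(drinking_days)) / n_days,
    )


# Hypothetical 7-day TLFB record, in standard drinks per day.
week = [0, 2, 5, 0, 1, 6, 0]
summary = summarize_tlfb(week, heavy_cutoff=5)
```

For this hypothetical week, the participant drank on four of seven days, so drinks per drinking day is 14 / 4 = 3.5, and two of the seven days meet the heavy drinking cutoff.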

Quantity and frequency (QF) questionnaires similarly measure the quantity and frequency of typical drinking, the frequency of heavy drinking, and the maximum number of drinks consumed on a single occasion [40]. In contrast to the semi-structured interview style of the TLFB, QF questionnaires employ a multiple-choice response format. While this format makes QF questionnaires more efficient to administer, it can result in a loss of fine-grained drinking data [40], and QF questionnaires may underestimate the quantity of alcohol consumed to a greater extent than the TLFB [41]. However, QF questionnaires offer quick and effective measurements of typical alcohol consumption and hazardous drinking. Questions probed by QF questionnaires include quantity of drinks on a typical day, frequency of drinking, frequency of binge drinking, and maximum number of drinks [40].

Form 90 offers a more comprehensive assessment of drinking behaviors compared to the TLFB and QF questionnaires. Form 90 is a family of structured assessment interviews that are widely used to evaluate alcohol consumption over a specified time frame. The assessments are designed to provide a continuous retrospective daily drinking record from a 90-day baseline period through the duration of the assessment period [36]. Form 90 consists of five separate instruments, each serving a specific purpose: Form 90-I for the initial intake interview, Form 90-F for in-person follow-ups, Form 90-T for telephone follow-ups, Form 90-Q for a quick time-limited follow-up, and Form 90-C for collateral information gathering [36]. This breadth of instrumentation enables comprehensive, continuous recording of alcohol use, potentially providing more sensitive measurements of abstinence and no heavy drinking days [42]. Form 90 quantifies drinking behavior by estimating the total standard drinks of alcohol consumed on each day for 90 days, repetitive steady and episodic patterns of drinking, periods of total abstinence, and not-otherwise-accounted-for days. In addition, Form 90 can be utilized to calculate “intoxication peaks,” or measurements of projected blood alcohol concentration based on length of time alcohol was consumed in a given drinking period [36]. Form 90 thus combines the methods of the TLFB and average consumption estimates by using calendars to gather extensive daily drinking level information [43]. Therefore, Form 90 allows for ease of calculations for drinking levels, patterns, and outcomes.

Intensive longitudinal measurements such as daily diaries or ecological momentary assessments can also be utilized to assess daily drinking [44]. Smartphone application daily diary interviews are promising assessments to measure alcohol consumption more accurately over time as compared to the TLFB [45]. More broadly, ecological momentary assessment approaches including mobile electronic diaries, personal digital assistants, and smartphones may be useful in capturing alcohol use and its consequences in real time [46], thereby circumventing the potential biases of retrospective recall as seen with the TLFB, QF, and Form 90. Ecological momentary assessment methods can be used to estimate blood alcohol concentrations and to study subjective responses to alcohol in the natural environment, with and without medications [47, 48]. These self-monitoring approaches, however, could potentially lead to issues with compliance and missing data due to the challenges associated with sustained participant engagement, technological disruptions, and the participant's commitment to regular data input.

The choice of an assessment method for alcohol consumption is contingent upon specific research objectives and the feasibility of measurement tools. Information on alcohol use collected with assessment tools can be leveraged to demonstrate individual alcohol use patterns, including abstinence and heavy drinking days. These data are crucial, as changes in their calculated outcomes in pharmacotherapy trials inform current regulatory approval guidelines for novel AUD treatments.

3 Endpoints

Building on the methods used to measure alcohol consumption in clinical trials, this section covers a variety of endpoints that can be derived from the assessments described above. Specifically, a host of endpoints are discussed, including abstinence-based endpoints, non-abstinence endpoints, the WHO risk drinking levels endpoints, recovery as an endpoint, patient-reported outcomes, neuroimaging endpoints, and biological endpoints. Together, these endpoints provide a comprehensive set of outcomes that can be used alone or in combination to advance treatment development for AUD. Table 1 provides a summary of drinking endpoints used in a selected set of 118 clinical trials for AUD. For details on the selection of this sample of clinical trials, see Ray et al. [49].

Table 1 Frequency of endpoints utilized in 118 pharmacotherapy randomized clinical trials for alcohol use disorder (AUD)

3.1 Abstinence-Based Endpoints

Total abstinence serves as a primary outcome for clinical trials of AUD pharmacotherapy and can indicate a successful response to treatment [33]. Total abstinence from alcohol is characterized by no consumption of any alcohol within a designated timeframe. Within a clinical trial for AUD pharmacotherapy, there are various ways to analyze total abstinence data. For example, researchers can calculate the percentage of participants who achieved complete abstinence during the duration of a clinical trial and compare this proportion between treatment groups [50]. Researchers could also employ abstinence-based measurements to compare the percentage of days abstinent between treatment groups [51].

Abstinence-based outcomes are reasonable predictors of clinical benefit and can serve as an indicator of the efficacy of pharmacotherapy for AUD [22]. The dichotomous categorization of participants into “abstinent” or “non-abstinent” can also be valuable for clinical interpretation and establishing guidelines to define a successful treatment [52]. Variants of the abstinence-based outcome include time to first drinking day and time to first heavy drinking day.
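As a rough illustration of how these abstinence-based outcomes could be derived from daily drinking records, the sketch below computes the percentage of participants with complete abstinence and the time to the first (heavy) drinking day. All function names and data are hypothetical, and the heavy drinking cutoff is supplied by the caller; a returned value of None indicates that no event was observed during the recorded period, which a time-to-event analysis would typically treat as censored.

```python
from typing import Callable, Optional, Sequence


def _first_day(daily_drinks: Sequence[float],
               is_event: Callable[[float], bool]) -> Optional[int]:
    """Return the 1-based day of the first event, or None if none occurred."""
    for day, drinks in enumerate(daily_drinks, start=1):
        if is_event(drinks):
            return day
    return None


def time_to_first_drinking_day(daily_drinks: Sequence[float]) -> Optional[int]:
    """First day with any alcohol consumption."""
    return _first_day(daily_drinks, lambda d: d > 0)


def time_to_first_heavy_drinking_day(daily_drinks: Sequence[float],
                                     heavy_cutoff: float) -> Optional[int]:
    """First day at or above the heavy drinking cutoff."""
    return _first_day(daily_drinks, lambda d: d >= heavy_cutoff)


def percent_completely_abstinent(arm: Sequence[Sequence[float]]) -> float:
    """Percentage of participants in a treatment arm with no drinking days."""
    if not arm:
        raise ValueError("arm must contain at least one participant")
    abstinent = sum(1 for days in arm if all(d == 0 for d in days))
    return 100.0 * abstinent / len(arm)


# Hypothetical daily records (standard drinks per day).
record = [0, 0, 2, 0, 5, 0]
arm = [[0, 0, 0], [0, 1, 0], [0, 0, 0], [2, 0, 0]]
```

In this hypothetical record, the first drinking day is day 3 and, with a cutoff of five drinks, the first heavy drinking day is day 5; two of the four participants in the example arm are completely abstinent.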

3.2 Non-Abstinence-Based Endpoints

3.2.1 Percentage of No Heavy Drinking Days

Non-abstinence goals for treating AUD may be more attractive to patients [53] and could increase motivation to seek and maintain treatment. While achieving total abstinence is a valid and meaningful goal for some, more lenient outcomes, such as a reduction in drinking rather than complete abstinence, may be more appropriate for others [52]. The concept of moderate drinking as a viable treatment goal for AUD was introduced in the 1970s [54, 55]. It was proposed that the insistence on total abstinence as the only goal for treatment may be unrealistic and discouraging for individuals seeking treatment [54]. Furthermore, it was demonstrated that individuals with AUD were capable of acquiring and maintaining patterns of controlled drinking, also described as low-risk drinking [54].

Aligned with a focus on more controlled drinking rather than complete abstinence, the percentage of subjects with no heavy drinking days (PSNHDDs) is also recognized as a primary efficacy endpoint in clinical trials of pharmacotherapy for AUD [33]. PSNHDDs also serve as an indicator of a successful response to treatment [33]. A heavy drinking day is defined as four or more drinks for women or five or more drinks for men consumed in one day [56]. The PSNHDDs outcome is a dichotomous measure that defines a participant as either having no heavy drinking days or as having heavy drinking days, even if they experienced only one heavy drinking day [52]. Researchers can then analyze the percentage of participants with no heavy drinking days throughout the trial period and compare these percentages across treatment groups [52].
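The dichotomous nature of this outcome can be sketched as follows. This is a minimal illustration using the sex-specific cutoffs stated above; the function names, data layout, and participant data are hypothetical and not drawn from any trial.

```python
# Heavy drinking day cutoffs as defined in the text:
# four or more drinks for women, five or more drinks for men.
HEAVY_DAY_CUTOFF = {"female": 4, "male": 5}


def has_no_heavy_drinking_days(daily_drinks, sex):
    """True only if every recorded day is below the heavy drinking cutoff."""
    cutoff = HEAVY_DAY_CUTOFF[sex]
    return all(drinks < cutoff for drinks in daily_drinks)


def psnhdd(participants):
    """Percentage of subjects with no heavy drinking days.

    participants: list of (sex, daily_drinks) tuples for one treatment arm.
    """
    if not participants:
        raise ValueError("participants must not be empty")
    responders = sum(
        has_no_heavy_drinking_days(days, sex) for sex, days in participants
    )
    return 100.0 * responders / len(participants)


# Hypothetical arms; a single heavy drinking day makes a participant a non-responder.
treatment_arm = [("female", [0, 2, 3, 0]), ("male", [4, 4, 0, 1]), ("male", [0, 6, 0, 0])]
placebo_arm = [("female", [4, 0, 0, 0]), ("male", [5, 2, 0, 0]), ("male", [1, 1, 1, 1])]

treatment_psnhdd = psnhdd(treatment_arm)
placebo_psnhdd = psnhdd(placebo_arm)
```

In this example, two of three participants in the treatment arm and one of three in the placebo arm have no heavy drinking days. A real analysis would compare these proportions with an appropriate statistical test.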

3.2.2 World Health Organization Risk Drinking Level Reductions as an Endpoint

Total abstinence and PSNHDDs may not capture the magnitude of drinking reductions, which could be an effective measurement of individualized beneficial treatment outcomes. To address this concern, researchers have explored the potential inclusion of measures that are more finely tuned to capture reductions in drinking. The National Institute on Alcohol Abuse and Alcoholism (NIAAA) and the Alcohol Clinical Trials Initiative (ACTIVE) workgroup have collaborated to discuss reductions in WHO risk drinking levels as a primary efficacy endpoint for AUD pharmacotherapy clinical trials [57]. The WHO risk drinking levels include: very high-risk drinking, high-risk drinking, moderate-risk drinking, and low-risk drinking [58], and are based on grams of ethanol consumed per day. Therefore, this measure considers the magnitude of change in the volume of alcohol consumed with treatment [59]. The success rate of achieving full abstinence or no heavy drinking days is generally lower than the success rate for attaining a 1 or 2 WHO risk level reduction [59]. In a secondary analysis of a randomized clinical trial of varenicline, a promising pharmacotherapy for the treatment of AUD, 7.6% of participants who received varenicline achieved abstinence, while 53% of participants who received varenicline achieved a WHO 2-level reduction [59].
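To make the notion of a 1- or 2-level reduction concrete, the sketch below classifies average daily ethanol intake into risk levels and computes the number of levels reduced between baseline and follow-up. The gram-per-day cutoffs are not stated in this article; the values used here are the sex-specific thresholds commonly cited for the WHO risk drinking levels and should be treated as an assumption to verify against the WHO source. The function names are hypothetical.

```python
# Assumed upper bounds (grams of ethanol per day) for the low-, moderate-,
# and high-risk levels; intake above the last bound is very high risk.
# These cutoffs are an assumption and are not given in the text.
WHO_UPPER_BOUNDS_G_PER_DAY = {
    "male": (40.0, 60.0, 100.0),
    "female": (20.0, 40.0, 60.0),
}
LEVEL_NAMES = ("abstinent", "low", "moderate", "high", "very high")


def who_risk_level(grams_per_day: float, sex: str) -> int:
    """Return 0 (abstinent) through 4 (very high risk)."""
    if grams_per_day < 0:
        raise ValueError("grams_per_day must be non-negative")
    if grams_per_day == 0:
        return 0
    for level, upper in enumerate(WHO_UPPER_BOUNDS_G_PER_DAY[sex], start=1):
        if grams_per_day <= upper:
            return level
    return 4


def who_level_reduction(baseline_g: float, followup_g: float, sex: str) -> int:
    """Number of risk levels reduced; negative values indicate an increase."""
    return who_risk_level(baseline_g, sex) - who_risk_level(followup_g, sex)


# Hypothetical participant: 110 g/day at baseline, 50 g/day at follow-up.
reduction = who_level_reduction(110.0, 50.0, "male")
achieved_two_level_reduction = reduction >= 2
```

Under these assumed cutoffs, a man moving from 110 g/day (very high risk) to 50 g/day (moderate risk) achieves a 2-level reduction despite continued drinking, which is the kind of change that abstinence and PSNHDDs would not register.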

While the NIAAA does not assert that a reduction in WHO risk drinking levels implies "recovery" from AUD, as heavy drinking may still be present, there is evidence suggesting that a reduction in WHO risk drinking levels could be an indication of a positive response to medication [60]. Studies have demonstrated evidence indicating that a 1- or 2-level reduction in WHO risk drinking level is associated with significant improvements in various aspects of mental and physical health [57, 61,62,63,64]. Furthermore, reduction in WHO risk drinking levels has been demonstrated to predict improved long-term outcomes for up to 3 years following the trial [63,64,65]. These outcomes include a lower risk of alcohol dependence, reduced alcohol consumption, decreased risk of a comorbid substance-use disorder, improved physical health, and fewer alcohol-related negative consequences [63,64,65]. Notably, these outcomes may also better align with a patient’s goal of reducing drinking rather than achieving full abstinence [59].

3.2.3 Including Grace Periods in Analyses of Endpoints

Researchers have explored various approaches to conducting endpoint analysis, including the incorporation of grace periods, aiming to enhance the precision of assessments related to drinking outcomes in clinical trials. Grace periods are defined as early time periods in a clinical trial that should be excluded from the analysis to allow the pharmacotherapy to reach an optimal therapeutic level [52]. Additionally, grace periods could allow the novelty of participating in the trial to diminish, especially for participants in the placebo group [52]. In a secondary analysis of data from the COMBINE study of naltrexone for reducing heavy drinking days, four potential grace periods were examined [52]. The naltrexone treatment effect increased as the grace period duration increased. With no grace period, there were no significant differences between the naltrexone and placebo groups in PSNHDDs. However, with a 1-month, 2-month, or 3-month grace period, there was a significant difference in PSNHDDs between the naltrexone and placebo groups, with naltrexone being more efficacious [52]. Similar findings were observed with data from a clinical trial of topiramate for the treatment of AUD, in which there were no significant differences in PSNHDDs between topiramate and placebo when no grace period was implemented [52]. When a 1-month or 2-month grace period was implemented, there was a significant difference in PSNHDDs between the two groups, favoring topiramate [52].

Tailoring the grace period to the unique titration periods and pharmacologic actions of the medications under examination is imperative, ensuring its alignment with both the specific drug being tested and the trial's design. For example, if the titration period of a medication is 6 weeks, it would be reasonable to implement a grace period that extends to the end of the titration period. While recognizing the benefits of utilizing a grace period, it is important to acknowledge that extending the duration of this period may result in a prolonged and more costly trial, potentially leading to increased dropout rates and a higher likelihood that participants will be lost to follow-up. When implemented within reason, however, the incorporation of grace periods stands out as a viable strategy to enhance the evaluation of drinking outcomes in clinical trials.
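The effect of a grace period on a dichotomous endpoint can be sketched by excluding the first days of each participant's record before computing PSNHDDs. The example below is a minimal illustration: the function name, the 84-day trial length, and the participant data are hypothetical, and the grace period is expressed in days so that no particular definition of a "month" is assumed.

```python
# Heavy drinking day cutoffs: four or more drinks for women, five or more for men.
HEAVY_DAY_CUTOFF = {"female": 4, "male": 5}


def psnhdd_with_grace(participants, grace_days=0):
    """Percentage of subjects with no heavy drinking days after a grace period.

    participants: list of (sex, daily_drinks) tuples, with day 1 at index 0.
    grace_days: number of initial trial days excluded from the analysis.
    """
    if grace_days < 0:
        raise ValueError("grace_days must be non-negative")
    if not participants:
        raise ValueError("participants must not be empty")
    responders = 0
    for sex, daily_drinks in participants:
        assessed = daily_drinks[grace_days:]
        if not assessed:
            raise ValueError("grace period covers the entire record")
        if all(d < HEAVY_DAY_CUTOFF[sex] for d in assessed):
            responders += 1
    return 100.0 * responders / len(participants)


# Hypothetical 84-day records: early heavy drinking that later stops.
arm = [
    ("male", [0, 0, 6] + [0] * 81),       # heavy drinking on day 3 only
    ("female", [0] * 39 + [5] + [0] * 44),  # heavy drinking on day 40 only
    ("male", [1] * 84),                   # never reaches the cutoff
]

results = {grace: psnhdd_with_grace(arm, grace) for grace in (0, 28, 56)}
```

With these hypothetical records, the responder percentage rises from one of three participants with no grace period to two of three with a 28-day grace period and all three with a 56-day grace period, illustrating how early heavy drinking days stop counting against a participant as the grace period lengthens.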

3.2.4 Biological Endpoints

Apart from patient reports of drinking behavior, biomarkers of alcohol use can serve as objective measures of treatment response to AUD medications, presenting a promising opportunity to enhance the reliability of alcohol consumption measurements in clinical trials [66, 67]. These biomarkers can be defined as measurable behavioral phenotypes [68], biological substrates [67, 68], or neural activity [69]. Biomarkers that measure biological substrates fall into two main categories: direct and indirect [70]. Direct biomarkers involve assessing alcohol levels or byproducts of ethanol metabolism, while indirect biomarkers measure the effects of alcohol consumption on various physiological processes or organs [71].

Blood alcohol concentration (BAC) and breath alcohol concentration (BrAC) are widely used direct biomarkers that detect alcohol concentration in the blood and breath, respectively [72]. These measurements are generally closely correlated and are utilized to assess the immediate level of intoxication [73]. Although valuable in clinical trials for assessing very recent alcohol consumption, the limited time window of sensitivity restricts their ability to offer insight into a participant's drinking patterns over the course of a clinical trial. An alternative method for monitoring alcohol consumption is through the use of transdermal alcohol sensors (TAS). These devices were developed to monitor alcohol consumption continuously over extended periods of time by measuring alcohol vapors that are excreted through the skin in sweat [74]. There is a moderate to strong correlation between TAS data and both BrAC and self-reported drinking [75, 76]. However, TAS may have decreased accuracy in detecting low-to-moderate alcohol consumption [76]. Further development and validation are critical to ensure reliability and accuracy across various levels of alcohol consumption in real-world contexts [77]. Phosphatidylethanol (PEth), another widely used direct biomarker, is a cellular membrane phospholipid that is formed in the presence of ethanol [78] and has been demonstrated to be more sensitive than other direct biomarkers [79]. It can be detected in blood for up to 12 days following a single drinking event [80], and the concentration of measured PEth is correlated with the amount of alcohol consumed [81]. Other direct biomarkers include fatty acid ethyl ester (FAEE), which is present in blood for at least 24 h following alcohol consumption [82].

Indirect biomarkers include mean corpuscular volume (MCV) and gamma-glutamyl transpeptidase (GGT), which have detection times ranging from days to weeks, providing a more extended timeframe for assessing alcohol consumption compared to immediate measures like BAC and BrAC [71]. Several biomarkers of recent alcohol consumption, such as the ethanol metabolite ethyl glucuronide (EtG), can also be detected in urine [83]. The sensitivity of this assay to detect heavy drinking depends on the EtG cutoff level used; higher cutoffs can generally detect EtG for only about a day, while lower cutoffs can detect heavy drinking for up to 5 days [83]. However, the validity of urinary EtG tests has shown mixed results, and they are not yet considered reliable enough to be used as objective biomarkers [84]. Nevertheless, urinary EtG tests can still be useful in qualitatively assessing recent alcohol consumption, especially when used in conjunction with other measures of alcohol use and when blood sampling is not feasible. Hair analysis for EtG is another method to assess alcohol consumption. While these tests may be capable of indicating alcohol consumption for up to 90 days, they have shown low sensitivity and are highly prone to false positives [85, 86].

Moreover, instead of being employed as a sole endpoint measurement, biomarkers could be utilized at various intervals throughout the trial, offering supplementary evidence regarding participants' drinking patterns and enhancing the reliability of the data. While biomarkers for alcohol consumption offer valuable insights, they come with their own set of limitations. A major limitation is the inter-individual variability of alcohol pharmacokinetics, influenced by factors including age, sex, and comorbid conditions [71, 87, 88]. As such, the reliance on direct biomarkers in clinical trials may lead to challenges in establishing consistent and reliable endpoints, as individual responses to alcohol can significantly vary. Furthermore, these biomarkers would need to be collected at multiple time points throughout the trial to yield temporal information about drinking patterns across the study duration. Biomarker collection would also increase concerns related to cost, logistical complexities of collection, and specialized storage and analysis requirements [89].

When used appropriately, biomarkers can facilitate a more nuanced understanding of alcohol-related behaviors following an intervention. Their integration alongside other clinical outcome measurements can offer a comprehensive and robust evaluation of interventions, mitigating the potential limitations associated with sole reliance on subjective measures of alcohol consumption. While the development of direct biomarkers of alcohol consumption has evolved, there is a strong call for indirect biomarkers that can capture disease processes beyond alcohol use itself [66, 90].

3.2.5 Neuroimaging Endpoints

Neuroimaging paradigms may non-invasively assess the efficacy of treatments for AUD and the functional neurocircuitry underlying these effects [91]. Task-based functional magnetic resonance imaging (fMRI) paradigms, for instance, have enabled researchers to examine the impact of medication on brain activity within tasks specifically designed to recruit neural circuitry implicated in AUD pathology. For example, researchers can employ an alcohol neural cue-reactivity paradigm in the scanner at baseline and following medication administration to understand how the medication influences neural activity in response to alcohol-related cues, designed to induce craving [92]. Furthermore, researchers can utilize resting-state fMRI, a technique that captures functional brain connectivity in the absence of a stimulus or task. This approach can help to elucidate how a medication influences resting-state functional connectivity within neural networks associated with addiction [93]. Incorporating neuroimaging into pharmacotherapy clinical trials allows researchers to better understand how pharmacotherapies modulate specific brain regions implicated in AUD. It is important to note that utilizing fMRI in a clinical trial setting can incur significant costs. Additionally, participants could be excluded due to the strict MRI-safety scanning criteria. However, when utilized, this integrated approach holds the potential to bridge the gap between observed alterations in neural activity and improvements in clinical outcomes, effectively creating a biomarker for treatment efficacy. Several efforts are underway to standardize fMRI drug cue-reactivity in order to advance it as an acceptable endpoint for clinical trials [94, 95].

3.2.6 Recovery Definition as an Endpoint

More recently, researchers in the field have continued to discuss the harm reduction approach in AUD treatment [96]. This interest stems from the recognition that the emphasis on recovery exclusively defined as abstinence from alcohol and the absence of AUD symptoms may inadequately represent the heterogeneous and multifaceted nature of recovery from AUD [97]. Thus, researchers have actively worked towards establishing a clear definition of recovery from AUD. Presently, stakeholder groups such as researchers, clinicians, policymakers, the FDA, and individuals with AUD define “recovery” in various ways. The ongoing discussions regarding the measurement of non-abstinence treatment outcomes include subjective symptom reduction outcomes to better capture symptom relief and functioning throughout treatment [98], improvements in social factors associated with harm reduction [99], and biomarkers to assess biological and brain-based indicators of recovery [98].

The NIAAA recently developed an operational definition of recovery from AUD to further facilitate consistency within the field [60]. This definition includes three primary components of recovery [60]. The first component of this definition is remission from AUD symptoms in accordance with the diagnostic guidelines in the DSM-5 [60]. This entails looking at the severity of the disease based on the number of criteria for AUD endorsed, as well as paying attention to criteria that are associated with clinical improvement in psychosocial functioning and well-being, which can serve as markers of recovery [60]. The second component of the definition is cessation from heavy drinking, defined as consuming no more than three (for females) or no more than four (for males) drinks on a single day [60], aligning with the primary endpoint language utilized by the FDA in pharmacotherapy trials for AUD. The third component of the definition is improvements in biopsychosocial functioning and quality-of-life criteria as markers of recovery [60]. This approach is consistent with the recognition that changes in drinking may be necessary but not sufficient to re-establish healthy functioning across multiple life domains [98].

The NIAAA’s newly developed definition of recovery recognizes that recovery is an ongoing process and that improvements in biopsychosocial functioning and well-being are an important part of recovery [60]. Establishing and adopting a standardized definition of recovery is important for advancing research in the field by allowing professionals more consistent and precise ways to measure recovery, ultimately enhancing our comprehension of the clinical trajectory of AUD. In terms of endpoints for AUD pharmacotherapy trials, researchers can utilize this aspect of the definition to capture clinically meaningful improvements in well-being throughout the recovery process. There is a tremendous opportunity for methods development and standardization in order to reach the overarching goal of a more comprehensive assessment of recovery. To that end, measures of well-being and functioning could potentially be incorporated as accepted endpoints for measuring the efficacy of a medication. Furthermore, the revised definition of recovery could also guide the development of more holistic public health strategies and interventions that extend beyond promoting abstinence.

3.2.7 Patient-Reported Outcomes

Throughout the treatment development process, it is also important to consider outcome efficacy from the patient’s perspective. The Patient-Reported Outcomes Measurement Information System (PROMIS) is a framework launched by the National Institutes of Health to guide the development of measurements assessing patient-reported outcomes on several constructs (pain, fatigue, physical functioning, emotional distress, sleep) across a range of diseases [100]. The PROMIS contains item banks, which led to the creation of specific patient-reported outcome measures (PROMs) for constructs including alcohol use [101]. Specific PROMs assess a patient's perception of their physical and mental health at specific points in time, such as baseline and follow-up visits, to monitor the patient’s subjective progress [102]. These assessments can provide valuable insights into the patient’s personal experiences and well-being, which may not be fully captured by abstinence or drinking reduction-based measurements. Five PROMIS item banks are related to alcohol use: alcohol use, negative and positive consequences of use, and negative and positive expectancies of drinking [103]. These items are highly positively correlated with Alcohol Use Disorders Identification Test (AUDIT) scores (r = 0.79), supporting the validity of using PROMIS alcohol measures [101]. In addition, PROMIS health status profile measures evaluating emotional distress, sleep disturbance, and pain could be effectively utilized as outcome measures in individuals with substance-use disorders. Both patients and providers in addiction medicine settings rated those constructs as highly important in assessing perceived recovery [104]. These patient and provider opinions are especially important to consider when developing treatments for AUD, as they will be the primary beneficiaries of new AUD pharmacotherapies.

Apart from the PROMIS item banks, the Substance User Recovery Evaluator (SURE) is a PROM that was developed to be administered specifically to patients recovering from drug and alcohol use disorders [105]. This assessment comprises five factors: substance use, material resources, outlook on life, self-care, and relationships [105]. It allows researchers to assess treatment outcomes from the perspective of the patient. The positive patient and clinician response to these PROMs supports the potential utilization of patient-reported holistic health outcomes in AUD treatment trials. In brief, the PROMIS framework is intended to be complementary, and not substitutive, to the drinking and recovery endpoints discussed herein.

4 Regulatory Perspectives

Currently, the FDA accepts abstinence and no heavy drinking days as primary endpoints for phase 3 trials of AUD pharmacotherapy [33]. Similarly, the EMA accepts full abstinence as a primary endpoint in AUD pharmacotherapy clinical trials, but it additionally takes into consideration the willingness of patients to achieve full abstinence [106]. If patients are not yet able to achieve full abstinence, the EMA accepts an intermediate harm-reduction primary endpoint with the intention of achieving eventual abstinence maintenance [106]. Intermediate harm reduction refers to significantly reducing alcohol intake, thereby reducing alcohol-related harms. Efficacy for this endpoint is measured by the change in total consumption of pure alcohol in grams per day from baseline and the reduction in the number of heavy drinking days [106]. The EMA additionally evaluates improved patient health as a secondary efficacy endpoint. Improved patient health is measured through changes from baseline in validated liver markers, effects on the participant’s social relationships, adherence to medication, changes in alcohol-dependence severity measures, and two-level reductions in WHO risk levels of drinking [106].
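
As an illustration of how the consumption-based measures described above might be derived from daily drinking records, the following minimal sketch computes the change in mean grams of pure alcohol per day, the change in the number of heavy drinking days, and whether a two-level reduction in WHO risk drinking level occurred. The function names, the input format (one value in grams per day), and the handling of values that fall between category boundaries are assumptions made for this example. The cut-offs are the commonly cited WHO values, and the heavy drinking day threshold is left as a parameter because its definition varies by guideline; both should be checked against the trial protocol.

```python
# Hypothetical helper names; cut-offs are the commonly cited WHO risk
# drinking level boundaries in grams of pure alcohol per day and should
# be verified against the protocol of the trial in question.
WHO_UPPER_BOUNDS = {
    "male": (40, 60, 100),    # upper bounds for low, medium, high
    "female": (20, 40, 60),
}
LEVEL_NAMES = ("abstinent", "low", "medium", "high", "very high")


def who_risk_level(grams_per_day, sex):
    """Return 0 (abstinent) to 4 (very high) for a mean daily intake."""
    if grams_per_day <= 0:
        return 0
    for level, upper in enumerate(WHO_UPPER_BOUNDS[sex], start=1):
        # Assumption: a value exactly on a boundary stays in the lower level.
        if grams_per_day <= upper:
            return level
    return 4


def summarize_endpoints(baseline, followup, sex, heavy_day_grams):
    """Compare two periods of daily intake (lists of grams per day)."""
    base_mean = sum(baseline) / len(baseline)
    follow_mean = sum(followup) / len(followup)
    base_level = who_risk_level(base_mean, sex)
    follow_level = who_risk_level(follow_mean, sex)
    # A heavy drinking day is any day above the caller-supplied threshold.
    base_heavy = sum(1 for g in baseline if g > heavy_day_grams)
    follow_heavy = sum(1 for g in followup if g > heavy_day_grams)
    return {
        "change_g_per_day": follow_mean - base_mean,
        "change_heavy_days": follow_heavy - base_heavy,
        "baseline_level": LEVEL_NAMES[base_level],
        "followup_level": LEVEL_NAMES[follow_level],
        "two_level_reduction": base_level - follow_level >= 2,
    }


# Illustrative (made-up) four-day records for one male participant.
result = summarize_endpoints(
    baseline=[80, 90, 100, 70],
    followup=[30, 20, 40, 10],
    sex="male",
    heavy_day_grams=60,  # assumed threshold for this example only
)
print(result)
```

In practice, the baseline and follow-up windows would span weeks rather than days, and missing days would need an explicit handling rule; neither is addressed in this sketch.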

5 Conclusions

AUD is a highly pervasive and debilitating disorder, yet treatment-seeking rates and utilization of available approved pharmacotherapies are low [1, 7]. Therefore, the current landscape of pharmacotherapies for AUD suggests opportunities for improvement. While continuous research aimed at developing novel pharmacotherapies is crucial, it is equally important to examine the practices and accepted endpoints utilized in clinical trials for AUD. Traditionally, most clinical trials have used abstinence-based or heavy drinking-based endpoints as primary outcomes [33]. These outcomes are derived from patient self-reports about their drinking, which are transformed into quantifiable measurements of days abstinent or days without heavy drinking. Apart from these traditional self-report outcomes, there is movement in new directions. One important direction is the use of WHO risk drinking level reductions, which would allow a harm-reduction approach that harnesses the health and psychological benefits of non-abstinence endpoints and controlled drinking [57]. The use of WHO risk drinking levels as an endpoint has received ample empirical support and has the potential to move forward in terms of FDA review and approval [57, 61, 62, 63, 64]. Additionally, there have been developments in biological measurements of alcohol use, such as PEth, which can be utilized with the primary caveat that the time periods over which they are implemented must be considered [66, 67]. Transdermal alcohol sensors represent a significant advancement in monitoring alcohol consumption continuously over extended periods of time, but require further enhancement and development before they can be used as a primary endpoint [74]. While clinical self-reported abstinence and non-abstinence-based endpoints are not affected by alcohol pharmacokinetics, biological measurements of alcohol use can be influenced by inter-individual variability in alcohol pharmacokinetics, which is shaped by a variety of factors. Other novel endpoints include neuroimaging, for which tasks that assess the incentive salience of alcohol have been developed and employed in clinical trials [92].

Conceptually, the newly expanded recovery definition encompasses more than the consumption of alcohol itself, which presents further opportunities to refine clinical endpoints [60]. However, its operational definition for clinical trials remains to be finalized. Further, patient-oriented outcomes are a necessary step that allows providers to engage patients in the treatment development process by reporting subjective benefits and perceived acceptability of treatments [100]. Apart from the endpoints themselves, trial duration also matters, as placebo effects are more likely to occur during early treatment or in short-term treatment durations. Treatment intervals vary greatly across randomized clinical trials, typically ranging from 4 weeks to 6 months [107, 108]. Placebo-controlled trials that utilize treatment intervals greater than 12 weeks may be more reliable [108]. However, there is an ethical obligation to limit the duration for which participants might receive a placebo when effective treatments exist. Therefore, utilizing active-controlled trials or placebo-controlled trials with a sufficient yet reasonable duration of care, in addition to implementing grace periods, may be necessary to treat such a complex disorder and to detect the benefits of active pharmacotherapies versus placebo [52].

The considerable attention paid to clinical endpoints for AUD trials within the past decade largely reflects the lack of success in the development of novel compounds for this indication. New pharmacotherapies meeting industry standards are necessary to address the complexity and heterogeneity of AUD symptom presentations. Moreover, given the complexity of AUD, a holistic approach to endpoint assessment is ideal. This holistic approach is feasible given the major developments in clinical endpoints of AUD pharmacotherapy trials, including recognizing WHO risk drinking levels, utilizing biomarkers, and advancing the recovery definition. However, in practical terms, it is imperative that these endpoints be measurable in a valid and reliable fashion in order to ultimately improve treatment development. There is a promising opportunity to leverage the attention paid to clinical endpoints to continue their innovation, validation, and implementation, thereby ensuring that treatment development expands to capture the complexities of AUD.