Market segmentation and dynamic price discrimination in the U.S. airline industry

  • Research Article
  • Journal of Revenue and Pricing Management

Abstract

Airfares are affected by a variety of factors, but it is less clear which factors are the key determinants and how they interact. Based on a unique transaction-level data set, this paper introduces a widely used, machine-learning-based pricing tool to investigate airline market segmentation and dynamic price discrimination. The empirical results suggest that purchasing time, city distance, market structure, market size, and seat availability are, in that order, the five most important pricing factors. Airlines first partition their markets into an early market and a late market, and then split the market further by city distance and other factors. While intertemporal price discrimination explains the majority of fare variation, there are strong indications that airlines use their market power in the late market, charging higher fares to late-arriving consumers (but not to early consumers) in response to extra seats sold.



Notes

  1. A literature survey from a managerial standpoint on the yield management tools used in the airline industry can be found in McGill and Van Ryzin (1999).

  2. The expected difference in fares between two random passengers sitting next to each other is 36–44% of the airline’s average ticket fare (Borenstein and Rose 1994; Gerardi and Shapiro 2009).

  3. For example, hospitality industries like airlines and hotels, finance industries, telecommunications, and retailing industries.

  4. For example, in order to identify consumer needs and develop products and prices to meet these needs, United Airlines (2016) segments its international market into nine motivational segment profiles, such as global executives, schedule optimizers, corporate troopers, mile accumulators, reluctant travellers, travel seekers, tour takers, quality vacationers and frugal flyers.

  5. According to Deneckere and Peck (2012), a flat price curve must, on average, be observed over the booking period. However, such a prediction may not hold in a model that allows intertemporal pricing strategies to be used to screen consumers (Alderighi et al. 2015).

  6. This criterion is consistent with the one Gerardi and Shapiro (2009) use to define “big-city routes.”

  7. It also includes connecting passengers on the given segment and thus differs from the DB1B data, which comprise only passengers who originate and end their trips at the origin and destination airports.

  8. Following Borenstein (1989) and Peters (2006), for an itinerary with more than one outbound segment, flight rating and frequency are measured as the geometric mean of the rating and frequency for each outbound segment.

  9. Based on the 1995 American Travel Survey, Borenstein (2010) calculates two MSA level business share indices for each airport: “bizsmsao” and “bizsmsad,” which measure the share of business passengers originating/ending their travel at the specific airport. The route level business share is calculated and supplemented as the average of the origin airport’s originating share and the ending airport’s destination share. The tourist share is one minus the business share.

  10. An itinerary is defined as an ordered sequence of airports (and their corresponding flight schedules) through which a passenger travels.

  11. Another advantage is that it can discover some crucial relationships in subsamples that are not evident in the entire sample.

  12. While the bootstrap aggregating process is often called the “bagging” process, a method that combines predictions from multiple models is called an “ensemble” method.

  13. Based on the Majority Voting theorem (Condorcet’s jury theorem), as formally proved in Breiman (1996) and Freund (1995), the ensemble method (including random forests) could improve the model accuracy should weak learners be (weakly) independent of each other. Another assumption is that each weak learner must have a prediction that is slightly better than the random prediction.

  14. In detail, permutation importance is reported here as the relative importance measurement in the random forest regressor. When the forest is grown, the model is evaluated on an out-of-bag sample and the accuracy (here, MSE) is recorded. Then, the values for the variable j are randomly permuted (i.e., reshuffled) among all the data points and the sample is re-scored. The decrease in accuracy as the outcome of this permuting is averaged over all the trees, and it is used as a measure of the importance of variable j. More detailed discussions can be found in Hastie et al. (2009).

  15. This is also used in Borenstein (1989). As pointed out by Berry (1994), the excluded demand-side parameters are also proper instruments. However, it may be unreasonable to assume the existence of x variables that are valued by consumers but do not affect the marginal cost.

  16. Other ticket characteristics such as carry-on policy dummy, baggage fees, cancellation policy, Wi-Fi dummy, and entertainment facility dummy are also available but are not included, as they are highly correlated with carrier dummies.

  17. These findings are in contrast to Sengupta and Wiggins (2014), who find a negative but insignificant relationship between HHI and fares.

  18. A three-layer decision tree was also tried; advance-purchase days are selected for the next layers in most segments.

  19. The cutoff points selected from the model (i.e., 790.55 and 1250.95 miles) are consistent with the segment boundaries used in Gerardi and Shapiro (2009), who use 818 and 1240 miles as the boundaries to define their segments. For illustration, the distance between New York LaGuardia Airport (LGA) and Chicago O’Hare Airport (ORD) is around 731 miles, and the distance between ORD and Las Vegas McCarran International Airport (LAS) is about 1510 miles.

  20. As described in Strobl et al. (2007), the default feature importance measure for the random forest regressor in many popular statistical packages can be biased during the random selection of features and the bootstrapping process. It tends to inflate the importance of continuous features or high-cardinality categorical variables. As a result, the permutation importance (based on MSE reductions) is reported here as suggested by Strobl et al. (2007). We also experimented with other settings transforming continuous variables into binned or ordinal variables, and our results are robust to these changes.
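
Notes 14 and 20 describe permutation importance scored on out-of-bag samples. The procedure can be sketched in Python under some loudly stated assumptions: depth-1 regression trees stand in for the full trees of a random forest, the data are synthetic, and every function name (`fit_stump`, `predict_stump`, `oob_permutation_importance`, `make_toy_data`) is hypothetical rather than taken from the paper or from any library. It is an illustration of the idea, not the paper's model.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def fit_stump(X, y):
    """Fit a depth-1 regression tree (assumed stand-in for a full tree)."""
    mean_y = sum(y) / len(y)
    # Fallback: a constant prediction if no valid split exists.
    best = (float("inf"), 0, float("inf"), mean_y, mean_y)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]  # (feature index, threshold, left mean, right mean)


def predict_stump(stump, X):
    j, t, ml, mr = stump
    return [ml if row[j] <= t else mr for row in X]


def oob_permutation_importance(X, y, n_trees=20, seed=0):
    """Average increase in out-of-bag MSE after permuting each variable."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals, used = [0.0] * p, 0
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]  # out-of-bag rows
        if not oob:
            continue
        stump = fit_stump([X[i] for i in boot], [y[i] for i in boot])
        X_oob = [list(X[i]) for i in oob]
        y_oob = [y[i] for i in oob]
        base = mse(y_oob, predict_stump(stump, X_oob))
        for j in range(p):
            col = [row[j] for row in X_oob]
            rng.shuffle(col)  # permute variable j among the OOB rows
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_oob, col)]
            totals[j] += mse(y_oob, predict_stump(stump, X_perm)) - base
        used += 1
    return [t / max(used, 1) for t in totals]


def make_toy_data(n=120, seed=1):
    """Synthetic data: feature 0 matters most, feature 2 is pure noise."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random(), rng.random()] for _ in range(n)]
    y = [3.0 * r[0] + 0.5 * r[1] + rng.gauss(0, 0.1) for r in X]
    return X, y


X, y = make_toy_data()
print(oob_permutation_importance(X, y))
```

Because each score is a loss in predictive accuracy rather than a count of how often a variable is chosen for a split, the measure is less exposed to the split-selection bias that note 20 attributes to default importance measures.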

References

  • Alderighi, M., M. Nicolini, and C. Piga. 2015. Combined Effects of Capacity and Time on Fares: Insights from the Yield Management of a Low-Cost Airline. The Review of Economics and Statistics 97 (4): 900–915.

  • Ata, B., and J. Dana. 2015. Price Discrimination on Booking Time. International Journal of Industrial Organization 43: 175–181.

  • Athey, S. 2017. The Impact of Machine Learning on Economics. Stanford University Working Paper.

  • Bajari, P., D. Nekipelov, S. Ryan, and M. Yang. 2015. Machine Learning Methods for Demand Estimation. American Economic Review 105 (5): 481–485.

  • Berry, S. 1994. Estimating Discrete-Choice Models of Product Differentiation. The RAND Journal of Economics 25 (2): 242–262.

  • Berry, S., and P. Jia. 2012. Tracing the Woes: An Empirical Analysis of the Airline Industry. American Economic Journal: Microeconomics 2: 1–43.

  • Berry, S., J. Levinsohn, and A. Pakes. 1995. Automobile Prices in Market Equilibrium. Econometrica 63 (4): 841–890.

  • Borenstein, S. 1989. Hubs and High Fares: Dominance and Market Power in the U.S. Airline Industry. The RAND Journal of Economics 20 (3): 344–365.

  • Borenstein, S. 2010. An Index of Inter-City Business Travel for Use in Domestic Airline Competition Analysis. https://www.nber.org/data/bti.html.

  • Borenstein, S., and N. Rose. 1994. Competition and Price Dispersion in the U.S. Airline Industry. The Journal of Political Economy 102 (4): 653–683.

  • Breiman, L. 1996. Bagging Predictors. Machine Learning 24: 123–140.

  • Breiman, L., J. Friedman, R. Olshen, and C. Stone. 1984. Classification and Regression Trees. Boca Raton, FL: Chapman & Hall/CRC Press.

  • Dai, M., Q. Liu, and K. Serfes. 2014. Is the Effect of Competition on Price Dispersion Non-monotonic? Evidence from the U.S. Airline Industry. The Review of Economics and Statistics 96 (1): 161–170.

  • Dana, J. 1998. Advance-Purchase Discounts and Price Discrimination in Competitive Markets. The Journal of Political Economy 106 (2): 395–422.

  • Dana, J. 1999a. Using Yield Management to Shift Demand When the Peak Time is Unknown. The RAND Journal of Economics 30 (3): 456–474.

  • Dana, J. 1999b. Equilibrium Price Dispersion Under Demand Uncertainty: The Roles of Costly Capacity and Market Structure. The RAND Journal of Economics 30 (4): 632–660.

  • Dana, J., and K. Williams. 2019. Intertemporal Price Discrimination in Sequential Quantity-Price Games. Cowles Foundation Discussion Paper, No. 2136.

  • Deneckere, R., and J. Peck. 2012. Dynamic Competition with Random Demand and Costless Search: A Theory of Price Posting. Econometrica 80: 1185–1247.

  • Escobari, D. 2012. Dynamic Pricing, Advance Sales and Aggregate Demand Learning in Airlines. The Journal of Industrial Economics LX (4): 697–724.

  • Freund, Y. 1995. Boosting a Weak Learning Algorithm by Majority. Information and Computation 121 (2): 256–285.

  • Gaggero, A., and C. Piga. 2011. Airline Market Power and Intertemporal Price Discrimination. The Journal of Industrial Economics LIX (4): 552–577.

  • Gale, I., and T. Holmes. 1993. Advance-Purchase Discounts and Monopoly Allocation of Capacity. American Economic Review 83: 135–146.

  • Gerardi, K., and A. Shapiro. 2009. Does Competition Reduce Price Dispersion? New Evidence from the Airline Industry. The Journal of Political Economy 117 (1): 1–37.

  • Hastie, T., R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning. New York, NY: Springer Press.

  • Huber, M., and D. Imhof. 2019. Machine Learning with Screens for Detecting Bid-rigging Cartels. International Journal of Industrial Organization 65: 277–301.

  • Kwoka, J., K. Hearle, and P. Alepin. 2016. From the Fringe to the Forefront: Low Cost Carriers and Airline Price Determination. Review of Industrial Organization 48 (3): 247–268.

  • McGill, J., and G. Van Ryzin. 1999. Revenue Management: Research Overview and Prospects. Transportation Science 33 (2): 233–256.

  • Morrison, S. 2001. Actual, Adjacent, and Potential Competition: Estimating the Full Effect of Southwest Airlines. Journal of Transport Economics and Policy 35 (2): 239–256.

  • Mullainathan, S., and J. Spiess. 2017. Machine Learning: An Applied Econometric Approach. The Journal of Economic Perspectives 31 (2): 87–106.

  • Peters, C. 2006. Evaluating the Performance of Merger Simulation: Evidence from the U.S. Airline Industry. Journal of Law and Economics XLIX: 627–649.

  • Puller, S., A. Sengupta, and S. Wiggins. 2009. Testing Theories of Scarcity Pricing in the Airline Industry. NBER Working Paper, 15555.

  • Puller, S., and L. Taylor. 2012. Price Discrimination by Day-of-Week of Purchase: Evidence from the U.S. Airline Industry. Journal of Economic Behavior & Organization 84: 801–812.

  • Sengupta, A., and S. Wiggins. 2014. Airline Pricing, Price Dispersion, and Ticket Characteristics On and Off the Internet. American Economic Journal: Economic Policy 6 (1): 272–307.

  • Smith, B.C., and C.W. Penn. 1988. Analysis of Alternate Origin-Destination Control Strategies. In AGIFORS Symposium Proceedings, New Seabury, MA.

  • Smith, B.C., J.F. Leimkuhler, and R.M. Darrow. 1992. Yield Management at American Airlines. Interfaces 22 (1): 8–31.

  • Stavins, J. 2001. Price Discrimination in the Airline Market: The Effect of Market Concentration. The Review of Economics and Statistics 83 (1): 200–202.

  • Strobl, C., A. Boulesteix, A. Zeileis, and T. Hothorn. 2007. Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution. BMC Bioinformatics 8: 25.

  • United Airlines. 2016. Using Global Segmentation to Grow a Business.

  • Varian, H. 2014. Big Data: New Tricks for Econometrics. The Journal of Economic Perspectives 28 (2): 3–27.

  • Williams, K. 2018. Dynamic Airline Pricing and Seat Availability. Cowles Foundation Discussion Paper, No. 3003.

Disclaimer

The views expressed in this paper are those of the author and do not reflect those of the author’s affiliations. All the opinions and errors remain the sole responsibility of the author.

Author information

Corresponding author

Correspondence to Chengyan Gu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Table 6.

Table 6 List of ticketing airlines

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gu, C. Market segmentation and dynamic price discrimination in the U.S. airline industry. J Revenue Pricing Manag 22, 338–361 (2023). https://doi.org/10.1057/s41272-022-00407-5

  • DOI: https://doi.org/10.1057/s41272-022-00407-5
