Abstract
Massive numbers of aviation inspection reports are collected each year in the USA. These reports record findings from aviation surveillance inspections as well as from accident and incident investigations. The goal of this paper is to apply text classification to the mining of these reports and to show that text classification methodology can be a critical element of an aviation safety decision support system. The performance of several text classification models is evaluated in the context of mining aviation inspection reports, with the evaluation given in terms of misclassification rates. Further breakdowns of the misclassification rates, together with related findings from the dataset, suggest ways of improving data quality and of gathering information that is more pertinent to filing inspection reports.
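As a minimal sketch of the evaluation measure named in the abstract, the following computes an overall misclassification rate and a breakdown by true category. The category labels and the helper names `misclassification_rate` and `per_class_breakdown` are hypothetical, chosen only for illustration; they are not taken from the paper or its dataset, and the paper's own breakdowns may be defined differently.

```python
from collections import Counter


def misclassification_rate(y_true, y_pred):
    """Fraction of reports whose predicted category differs from the true one."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one report is required")
    errors = sum(t != p for t, p in zip(y_true, y_pred))
    return errors / len(y_true)


def per_class_breakdown(y_true, y_pred):
    """Misclassification rate within each true category."""
    totals = Counter(y_true)
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {label: errors[label] / totals[label] for label in totals}


# Hypothetical labels, purely for illustration (not from the paper's dataset).
y_true = ["maintenance", "maintenance", "operations", "operations", "operations"]
y_pred = ["maintenance", "operations", "operations", "operations", "maintenance"]

overall = misclassification_rate(y_true, y_pred)
by_class = per_class_breakdown(y_true, y_pred)
```

A per-category breakdown of this kind can show whether errors concentrate in a few categories, which is one way an overall error rate might point toward data-quality problems.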
Copyright information
© 2002 Springer Basel AG
About this paper
Cite this paper
Liu, R.Y., Madigan, D., Eyheramendy, S. (2002). Text Classification for Mining Massive Aviation Inspection Reports. In: Dodge, Y. (eds) Statistical Data Analysis Based on the L1-Norm and Related Methods. Statistics for Industry and Technology. Birkhäuser, Basel. https://doi.org/10.1007/978-3-0348-8201-9_31
DOI: https://doi.org/10.1007/978-3-0348-8201-9_31
Publisher Name: Birkhäuser, Basel
Print ISBN: 978-3-0348-9472-2
Online ISBN: 978-3-0348-8201-9
eBook Packages: Springer Book Archive