Abstract
Data are increasingly used for intelligent modeling and prediction, and comprehensive evaluation of data quality is receiving growing attention as a necessary means of determining whether data are usable. However, most comprehensive evaluation methods depend on the evaluator's subjective judgment, so evaluating data both comprehensively and objectively remains a bottleneck in research on comprehensive evaluation methods. To evaluate data more comprehensively, more objectively, and with better differentiation, this paper presents a novel comprehensive evaluation method based on particle swarm optimization (PSO) and grey correlation analysis (GCA). First, an improved GCA evaluation model based on the technique for order preference by similarity to an ideal solution (TOPSIS) is proposed. Then, an objective function that maximizes the difference among the comprehensive evaluation values is built, and the PSO algorithm is used to optimize the weights of the improved GCA evaluation model against this objective. Finally, the performance of the proposed method is investigated through parameter analysis. A performance comparison on traffic flow data is carried out, and the simulation results show that the maximum average difference between the evaluation results and their mean value (MDR) of the proposed method is 33.24% higher than that of TOPSIS-GCA and 6.86% higher than that of GCA. The proposed method therefore differentiates better than the other methods: it evaluates the data objectively and comprehensively in terms of both relevance and differentiation, and its results reflect differences in data quality more effectively, which provides more effective data support for intelligent modeling, prediction, and other applications.
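To make the two quantities named in the abstract concrete, the following is a minimal sketch of a weighted grey relational grade and of a spread measure over the resulting scores. It is not the paper's improved TOPSIS-based GCA model, and it does not include the PSO weight search. It assumes the classical grey relational coefficient with distinguishing coefficient `rho = 0.5`, larger-is-better indicators, and min-max normalization. The function names `grey_relational_grades` and `mean_abs_deviation` are hypothetical, the sample values are made up for illustration, and reading the spread measure as a mean absolute deviation from the mean is an assumption about how MDR is defined.

```python
import numpy as np


def grey_relational_grades(data, weights, rho=0.5):
    """Weighted grey relational grade of each row against an ideal row.

    data:    (n_samples, n_indicators), larger-is-better (assumption).
    weights: (n_indicators,), non-negative and summing to 1.
    rho:     distinguishing coefficient, conventionally 0.5.
    """
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Min-max normalise each indicator; constant columns map to 0.
    low = data.min(axis=0)
    span = data.max(axis=0) - low
    norm = (data - low) / np.where(span == 0, 1.0, span)
    # Reference sequence: the best normalised value of every indicator.
    reference = norm.max(axis=0)
    delta = np.abs(reference - norm)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0:
        # All samples coincide with the reference.
        return np.ones(data.shape[0])
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff @ weights


def mean_abs_deviation(grades):
    """Average absolute difference between the grades and their mean
    (an assumed reading of the spread that the weights should maximise)."""
    grades = np.asarray(grades, dtype=float)
    return float(np.mean(np.abs(grades - grades.mean())))


# Hypothetical data: three samples scored on three quality indicators.
samples = [[0.90, 0.80, 0.95],
           [0.60, 0.70, 0.50],
           [0.30, 0.90, 0.70]]
equal_weights = np.full(3, 1 / 3)
grades = grey_relational_grades(samples, equal_weights)
print(grades, mean_abs_deviation(grades))
```

Under these assumptions, a weight optimizer such as PSO would repeatedly call `grey_relational_grades` with candidate weight vectors and keep the one that gives the largest spread, subject to whatever weight constraints the full method imposes.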
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgments
The authors would like to thank the anonymous reviewers for their insightful suggestions and remarks, which helped improve the manuscript. This work was supported by the Scientific Research Funding Project of Liaoning Education Department of China under Grant Nos. JDL2020005 and LJKZ0485, and by the National Key Research and Development Program of China under Grant No. 2018YFA0704605.
Ethics declarations
The authors declare no conflict of interest.
Additional information
Wei Ba received the B.S. degree in automation from Dalian University of Technology, Dalian, China, in 2003 and the Ph.D. degree in control theory and control engineering from Dalian University of Technology, Dalian, China, in 2010. She is now an associate professor at Dalian Jiaotong University, Dalian, China. Her research interests include data quality analysis, big data modeling, and artificial intelligence technology.
Baojun Chen received the B.S. degree in automation from Dalian University of Technology, Dalian, China, in 2002 and the Ph.D. degree in circuits and systems from Dalian University of Technology, Dalian, China, in 2012. She is now an associate professor at Dalian Jiaotong University, Dalian, China. Her research interests include data quality analysis, artificial intelligence technology, and microwave device technology.
Qi Li received the B.S. degree in automation from Dalian University of Technology, Dalian, China, in 2002 and the Ph.D. degree in control theory and control engineering from Dalian University of Technology, Dalian, China, in 2008. He is now an associate professor at Dalian University of Technology, Dalian, China. His research interests include advanced process control, soft sensors, and artificial intelligence algorithms.
Cite this article
Ba, W., Chen, B. & Li, Q. Comprehensive Evaluation Method for Traffic Flow Data Quality Based on Grey Correlation Analysis and Particle Swarm Optimization. J. Syst. Sci. Syst. Eng. 33, 106–128 (2024). https://doi.org/10.1007/s11518-023-5585-5
DOI: https://doi.org/10.1007/s11518-023-5585-5