Abstract
The social Web contains an enormous amount of content, and a large share of online users rely on online reviews when deciding whether to purchase a product or use a service. Not all online reviews are genuine; some of them are spam. Data mining classification helps label a review as spam or ham. Many text classification algorithms exist, and it has been shown that these classifiers can be improved when clustering is used together with classification to form features. This research work focuses on identifying the right classifier and improving it through clustering, and on identifying the clustering technique best suited to improving the classifier's performance. Three classification algorithms (Naive Bayes, support vector machine (SVM), and Decision Tree) and three clustering algorithms (K-means, One-Pass, and DBSCAN, density-based spatial clustering of applications with noise) are compared, and K-means clustering used with the SVM classifier is found to outperform the other combinations. Amazon and Yelp datasets are used for implementation. When used with K-means clustering, the accuracy of the SVM classifier improves from 89.02% to 90.02% on the Amazon dataset and from 86.03% to 88.25% on the Yelp dataset, which is a more significant improvement than that of the other combinations compared.
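The idea of using clustering to form features for a classifier can be illustrated with a minimal sketch. The abstract does not say how the cluster-derived features are built, so the choice made here (appending each review's distance to every K-means centroid to its bag-of-words vector) is an assumption, as are the toy reviews and all function names. The classifier itself is omitted; in a full pipeline the augmented vectors would be passed to a classifier such as an SVM.

```python
from collections import Counter
from math import sqrt


def bag_of_words(docs):
    """Turn each document into a term-count vector over a shared vocabulary."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vectors.append([float(counts[word]) for word in vocab])
    return vectors


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, iterations=10):
    """Plain Lloyd's k-means; the first k vectors seed the centroids,
    which keeps this sketch deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: distance(v, centroids[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def add_cluster_features(vectors, centroids):
    """Append the distance to each centroid to every feature vector."""
    return [v + [distance(v, c) for c in centroids] for v in vectors]


# Hypothetical toy reviews, not taken from the Amazon or Yelp datasets.
reviews = [
    "great product works well",
    "buy now best price click here",
    "works well good quality",
    "click here best price buy now",
]
base = bag_of_words(reviews)
centroids = kmeans(base, k=2)
augmented = add_cluster_features(base, centroids)
```

Each augmented vector carries the original term counts plus one extra column per cluster, so a downstream classifier sees both the raw text features and a compact summary of where the review sits relative to the discovered groups.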
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Krishnaveni, N., Radha, V. (2021). Performance Evaluation of Clustering-Based Classification Algorithms for Detection of Online Spam Reviews. In: Jeena Jacob, I., Kolandapalayam Shanmugam, S., Piramuthu, S., Falkowski-Gilski, P. (eds) Data Intelligence and Cognitive Informatics. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-8530-2_20
DOI: https://doi.org/10.1007/978-981-15-8530-2_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8529-6
Online ISBN: 978-981-15-8530-2
eBook Packages: Intelligent Technologies and Robotics (R0)