Abstract
Tensor clustering is a knowledge management technique and a major algorithmic and technological driver behind a broad spectrum of applications, ranging from multimodal social media analysis and geolocation processing to analytics tailored for large omic data. However, exact tensor clustering problems, when reduced to tensor factorization, are provably NP-hard. This is attributed in part to the volume of data contained in a tensor, which is proportional to the product of its dimensions, and in part to the increased interdependency between tensor entries across its dimensions. One well-studied way to circumvent this inherent difficulty is to resort to heuristics. This article presents an enhanced version of a genetic algorithm tailored for discovering community structure in tensors containing spatiosocial data, namely linguistic and geolocation data. The objective function and the chromosome fitness functions take into account, by design, elements of linguistic propagation models. The genetic operators of selection, crossover, and mutation, as well as the newly added double mutation operator, work directly at the community level. Moreover, various policies for maintaining gene variability across generations are studied in an extensive simulation powered by Google TensorFlow. As with its predecessor, the proposed genetic algorithm has been applied to a dataset consisting of a large number of tweets and their associated geolocations from the Grand Duchy of Luxembourg, a historically and de facto trilingual country. The results are compared with those obtained from the original genetic algorithm, and their differences are interpreted.
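To make the operator vocabulary of the abstract concrete, the following is a minimal sketch of a generic genetic algorithm with selection, crossover, mutation, and a double mutation step. It is not the algorithm of this article: the chromosome encoding (one community label per data point), the toy distance-based fitness, the tournament selection, the elitism policy, and every function and parameter name are assumptions chosen for illustration. In particular, the operators here act on individual labels rather than on whole communities, and no linguistic propagation model is involved.

```python
import random


def fitness(chromosome, points):
    """Toy fitness (assumption): negative within-community squared distance."""
    groups = {}
    for label, point in zip(chromosome, points):
        groups.setdefault(label, []).append(point)
    cost = 0.0
    for members in groups.values():
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        cost += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in members)
    return -cost


def tournament_select(population, scores, rng, k=3):
    """Pick the fittest of k randomly drawn chromosomes (hypothetical policy)."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Single-point crossover of two label vectors (needs length >= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(chromosome, n_communities, rng):
    """Reassign one randomly chosen point to a random community."""
    child = list(chromosome)
    child[rng.randrange(len(child))] = rng.randrange(n_communities)
    return child


def double_mutate(chromosome, n_communities, rng):
    """Illustrative double mutation: two successive single mutations."""
    once = mutate(chromosome, n_communities, rng)
    return mutate(once, n_communities, rng)


def evolve(points, n_communities=2, pop_size=30, generations=60,
           p_mut=0.3, p_double=0.1, seed=0):
    """Run the sketch GA and return the best label vector found."""
    rng = random.Random(seed)
    population = [[rng.randrange(n_communities) for _ in points]
                  for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(c, points) for c in population]
        best = max(range(pop_size), key=lambda i: scores[i])
        next_gen = [list(population[best])]  # elitism keeps the current best
        while len(next_gen) < pop_size:
            parent_a = tournament_select(population, scores, rng)
            parent_b = tournament_select(population, scores, rng)
            child = crossover(parent_a, parent_b, rng)
            r = rng.random()
            if r < p_double:
                child = double_mutate(child, n_communities, rng)
            elif r < p_double + p_mut:
                child = mutate(child, n_communities, rng)
            next_gen.append(child)
        population = next_gen
    scores = [fitness(c, points) for c in population]
    return population[max(range(pop_size), key=lambda i: scores[i])]


if __name__ == "__main__":
    # Six made-up 2-D points standing in for geolocated items.
    demo = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
    print(evolve(demo, n_communities=2, seed=1))
```

The mutation probabilities `p_mut` and `p_double` are one simple way to influence how much variability survives from one generation to the next; the variability policies studied in the article may differ.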
Acknowledgements
This article is part of VALIS.IX, an independent research and innovation project promoting the study of European linguistic diversity and heritage. Moreover, this article is part of project Tensor 451, a long-term research initiative whose primary objective is the development of novel, scalable, numerically stable, and interpretable tensor analytics.
Cite this article
Drakopoulos, G., Stathopoulou, F., Kanavos, A. et al. A genetic algorithm for spatiosocial tensor clustering. Evolving Systems 11, 491–501 (2020). https://doi.org/10.1007/s12530-019-09274-9
DOI: https://doi.org/10.1007/s12530-019-09274-9
Keywords
- Multilingual social networks
- Multimodal social networks
- Cross cultural communication
- Language variation models
- Tensor clustering
- Google TensorFlow
- Genetic algorithms
- Gene variability
- Geolocation data
- Spatiosocial data
- Humanistic data
- Higher order data