1 Introduction

The evolution of online social networks (OSN) plays a crucial role in the rapid dissemination of information. Information and communication technologies have increased, almost without limit, the capacity to communicate, to inform and to be informed, as well as to acquire knowledge (Fu et al. 2017; Savic et al. 2019) from interpersonal networks that are always producing information (Piedrahita et al. 2017). It is recognized that digital technology is disrupting business models, while creating new opportunities for companies to engage, not only with their customers, but with all stakeholders.

This paper discusses how social network analysis (SNA) can help to extract useful information from network structures and relationships to support decisions. To develop decision support from social interactions, we analyzed data from the Facebook fan page of a well-known Portuguese airline (TAP), during a ten-day strike and during the following two days. The main goal was to assess the airline’s service level and responsiveness, as well as to develop business aids to evaluate and/or develop strategies to improve customer service during a strike, based on the information retrieved from social media. The case study demonstrated that SNA metrics such as degree, closeness, and betweenness centrality can be used to identify influential customers, improve response times, identify active users in need of additional support, and resolve issues more effectively. Additionally, SNA metrics such as in-degree, out-degree, PageRank, and eigenvector centrality can be used to measure the airline’s performance in terms of speed of answer and customer satisfaction, helping call-center managers to identify the most active users in the network, the need for additional support, and highly influential customers who may impact overall customer satisfaction.

OSNs can be monitored to filter social conversations (Freire et al. 2022), providing businesses tools to better understand their customers and to build customer profiles. Moreover, by combining different techniques it is possible to interpret social networks’ activity. Computerized systems can capture these social conversations, based on customer needs and their interactions. Together, with SNA, data mining (DM) and natural language processing (NLP) algorithms for OSN, content analysis can be used to interpret web discourse. In fact, in business contexts, it is important to understand customers (Isson 2018; Liu et al. 2020; Wang et al. 2020), as quickly as possible, to improve decision making.

SNA incorporates several centrality metrics that aid in studying social networks (further details are provided in Sect. 2). NLP is the computer science field in which machines extract knowledge from natural language texts (Antunes et al. 2016; Isson 2018), and DM concerns the automated retrieval of noteworthy data structures, integrating data analysis to infer models to meet specific objectives (Fu et al. 2017; Yang et al. 2019). SimpliFlying (2019) establishes that attention to social media is now part of airlines’ daily operations, namely to bond with passengers. Despite this, social network analysis is not yet a mainstream approach to supporting decision-making in crisis management, though SNA can provide a deep understanding of online social interactions (Freire et al. 2022). Moreover, we maintain that such information is not only easy to obtain, but also a valuable, cost-efficient resource, because many freely available tools can produce it, as presented in this paper.

In Sect. 2 we introduce concepts on decision-making using SNA and we present an overview of studies regarding OSNs and airline services. Section 3 covers web discourse in the context of OSNs. In Sect. 4 we present a case study, and the obtained results are addressed in the following section. Final remarks are presented in the last section.

2 Social network analysis and decision-making support for airline services

Making decisions often involves uncertainty. This can be due to the fact that decision-makers may plan actions that will take place in the future or because their decisions are not always based on actual knowledge or absolute principles, but on assumptions, perceptions, opinions, and personal values or beliefs. Fedushko et al. (2023) suggest that strategic decision-making, based on OSN, involves understanding the interconnected decisions and their impact on network users to gain an advantage. On the other hand, Antunes and Costa (2011) argue that decision-making support processes should incorporate rational aids instead of mere “gut feelings”, thus enhancing the possibility of making better decisions. Freire et al. (2022) build upon these ideas to establish that decision-making processes dependent on OSN could benefit from the use of SNA metrics. These metrics can help to understand the interaction of users, involved posts, and encompassed concepts, providing a deeper understanding of the discussion and its logical sequence.

Social network data and information are usually associated with computer systems. However, OSNs are considered an essential source of real-time information for decision-making (Arafeh et al. 2021) and it might be a manager’s greatest technological innovation for decision support. As OSNs involve virtual collaboration, including the execution of virtual organizational tasks, understanding the effects of online collaboration and information sharing is crucial for supporting decision-making in the current context, when compared to face-to-face collaboration. In contemporary organizations, decision-makers need to gather relevant information to increase their knowledge of available alternatives for a particular choice, by means of analysis tools, in order to capture and analyze everything a social user publishes (texts, images, etc.), to make faster judgments on the obtained results.

A social network facilitates the exchange of information and knowledge (Wu et al. 2017; Camacho et al. 2020; Aggrawal and Anand 2022) by connecting various entities through social relationships (Wasserman and Faust 1994; St-Onge et al. 2022). To understand the impact of this kind of network on group behavior and individual influence, SNA can be used to describe its structure (Wasserman and Faust 1994). Studies on web discourse primarily focus on computational aspects (Herring 2013), such as NLP (Yuksel and Tan 2018; Giarelis et al. 2021), online content analysis, and speech analysis (Moser et al. 2013; Kok and Rogers 2017; Pennington 2017; Goritz et al. 2019), with a significant emphasis on text analysis and utilizing other knowledge areas. However, SNA and semantic analysis of posts have not received much attention and remain largely theoretical (Zhao et al. 2021). Freire et al. (2022) provide an overview of such studies and conclude that they mostly focus on rhetoric, argumentation, semantics, and pragmatic issues rather than discursive aspects for text analysis on the web. Word embedding algorithms, such as Word2Vec proposed by Mikolov et al. (2013) or GloVe (Pennington et al. 2014), have been developed to represent the semantic information of a given content. Alam et al. (2020) explore the use of NLP and computerized visualization techniques to leverage information created during disaster events and suggest combining different computational techniques in such contexts. Fani et al. (2020) use temporal content similarity to learn neural embeddings for users and neural graph embedding techniques to use social network structures to represent information. Wankhade et al. (2022) discuss various methods to obtain and analyze people’s opinions on different topics, as well as the application of sentiment analysis.

Despite the existence of some theoretical frameworks, most previous studies on airlines’ OSNs have been exploratory and mainly focused on machine learning or lexicon-based methods to categorize airline companies’ OSN content. Many of these studies only regarded the application of sentiment analysis to textual content. For example, Dutta Das et al. (2017) analyzed Twitter data from Emirates and Jet Airways using the Naïve Bayes algorithm in R and RapidMiner tools to map tweets into three categories (positive, neutral, and negative). Rane and Kumar (2018) used preprocessing techniques to clean tweets from six US airline companies and used Doc2vec, a concept of deep learning, to perform phrase-level analysis that takes word order into consideration, leading to an aggregation of sentiments in three categories using different classification strategies. Tiwari et al. (2019) presented positive and negative sentiments and their correlation regarding customer tweets, by means of BIRCH (balanced iterative reducing and clustering using hierarchies), a hierarchical clustering approach, combined with rule mining techniques. Bartos and Badanik (2019) emphasized the importance of OSNs for airlines and presented examples of how airlines can deal with communication crises, contributing to airlines’ revenue growth. Naseem et al. (2019) presented a deep intelligent contextual embedding to solve language complexities and embed word sentiments. The authors’ model addresses the complexities of other languages with a character embedding. Al-deen et al. (2021) proposed a combination of deep neural networks with a multi-head attention mechanism. They also conducted sentiment analysis and text classification tasks using a large Twitter dataset on the major US airline problems to test the performance of their model. Kumari and Prajna (2021) used a Twitter dataset of passengers’ feedback of a US airline to propose a combination of classification methods and bagging of machine learning-based text classification techniques for the development of a voting classifier.

As none of the previous studies have considered the integration of all network elements, web discourse entities, process steps, or SNA techniques, in this study we present such integration by analyzing an airline’s OSN, using SNA metrics, that act as indicators for possible customer service actions. To make this article more self-contained we present, in Table 1, a brief explanation of the used measures to determine the centrality, influence, and similarity of posts (from which we determined the conclusions that will be presented in Sect. 5).

Table 1 SNA metrics

3 Web discourse in OSN’s context

OSNs comprise interacting processes in which emerging networks are built through an established dialogue (discourse). Given the complexity associated with the analysis of social relations, the analysis of social interaction data requires methods and techniques, studied and validated in different areas of knowledge, to explain the discursive phenomena of communication. The language used in OSNs tends to be casual and informal, mixing linguistic and non-linguistic resources (Freire et al. 2015), especially to communicate more quickly and to express negative emotions, and, in the airline context, it also bears a multilingual nature (Antunes et al. 2014). This usually produces tangled, disordered, incomplete, and sometimes garbled data.

Briefly, the discourse produced on Facebook has a minimum of 3 entities (user, post, and concept) that are represented as square matrices, while adjacency and affiliation matrices are used to depict web discourse. The adjacency matrix identifies and formally represents the links within a single entity, and the affiliation matrices characterize the links between different entities. Affiliation matrices (Wasserman and Faust 1994; Borgatti 2009) can be used to analyze interactions among entities at three different levels, namely the interactions among users, between users and posts, and between posts and concepts. Using information from our case study, Fig. 1 exemplifies the matrices of the three web discourse entities.

Fig. 1 Adjacency and affiliation matrices of the web discourse
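To illustrate the matrices described above, the following minimal Python sketch builds a user|user adjacency matrix and a user|post affiliation matrix from a handful of invented interactions. The user and post labels and the helper name `build_matrix` are assumptions made for this example; they are not taken from the case-study data.

```python
# Toy interaction records; all identifiers are hypothetical.
users = ["User-TAP", "User-1", "User-2"]
posts = ["post-01", "post-02"]

# (source user, target user): the source interacted with the target.
user_links = [("User-1", "User-TAP"), ("User-TAP", "User-1"), ("User-2", "User-TAP")]
# (user, post): the user commented on the post.
user_post_links = [("User-1", "post-01"), ("User-2", "post-01"), ("User-TAP", "post-02")]


def build_matrix(rows, cols, links):
    """Return a 0/1 matrix with a 1 wherever a (row, col) link exists."""
    row_index = {name: i for i, name in enumerate(rows)}
    col_index = {name: j for j, name in enumerate(cols)}
    matrix = [[0] * len(cols) for _ in rows]
    for source, target in links:
        matrix[row_index[source]][col_index[target]] = 1
    return matrix


# Adjacency matrix: links within a single entity (users with users).
adjacency = build_matrix(users, users, user_links)
# Affiliation matrix: links between two entities (users with posts).
affiliation = build_matrix(users, posts, user_post_links)
```

A post|concept affiliation matrix could be built in the same way by passing posts as rows and concepts as columns.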

In order to provide a formalization to analyze an airline’s Facebook content and to determine its service level and overall responsiveness, two stages are needed: firstly, the analysis of the semantic contents embedded in social web discourse and, secondly, the social interactions between users, e.g., the links embedded within the social network structure. We distinguish the entities and connections described in Table 2, as main components of the information provided by OSN, as depicted in Fig. 2.

Table 2 Type of entities and connections

Fig. 2 Web discourse abstraction

According to graph theory (Wasserman and Faust 1994; Borgatti 2009; Burguillo 2018; Bechberger and Perryman 2020; Galety et al. 2022), instead of using the three matrices (adjacency and affiliation matrices) it is possible to represent a network composed of several entities as a two-mode network. Therefore, in order to create the semantic network embedded within the web discourse of the airline’s OSN, we defined, in Eq. 1, the web discourse graph (WDG) as a tuple:

$$\text{WDG} = \left( U, P, C, L_{1}, L_{2}, L_{3}, L_{4} \right) \tag{1}$$

where U = (u1,…, uk) is the set of users, P = (p1,…, pw) is the collection of content produced by U, and C = (c1,…, cm) encompasses all the concepts within a post or a comment. In Eq. 2, we represent the concept c ∈ C of the content p ∈ P produced by each user u ∈ U as:

$$X_{\text{upc}} = \left[ x_{\text{upc}\,1}, \ldots, x_{\text{upc}\,n} \right] \tag{2}$$

which we refer to as the user’s produced content, where each xupc n ∈ ℝ[0,1] relates to the concept c of a post p produced by a user u. The equation represents the vector of user u’s produced content for a particular concept c in the WDG. Each element of the vector, xupc n, represents the degree of association or relevance that the content created by user u has with concept c, and is a real number between 0 and 1. The vector Xupc is used to represent the user’s contribution to the concept c in the OSN and to calculate various network measures, such as centrality, influence, or similarity, that help to analyze and understand the structure and dynamics of the WDG. L1, formalized in Eq. 3:

$$L_{1} = \left( u_{1}, u_{k} \right) \in U \times U \tag{3}$$

refers to a set of links that reflect each interaction of user u1 with user uk. The notation U × U indicates that the links are established between two elements of the set U, which represents all the users in the OSN. Specifically, the links in L1 represent the interactions between users (such as direct messages, comments, or reactions to posts), which help to capture their social ties. The links in this equation are bidirectional or undirected, as directionality is not specified. L2, presented in Eq. 4 as:

$$L_{2} = \left( u_{k}, p_{w} \right) \in U \times P \tag{4}$$

represents a set of links reflecting each comment of user uk regarding a post pw. The notation U × P indicates that the links are formed between an element of the set U, which represents all the users in the OSN, and an element of the set P, which encompasses all the content produced by users. Specifically, the links in L2 represent comments posted by a user in response to a particular post, which helps to capture the interaction between users and the content in the OSN. The equation of L3, described in Eq. 5 as:

$$L_{3} = \left( p_{1}, p_{w} \right) \in P \times P \tag{5}$$

integrates a set of links within the WDG, reflecting each comment p1 posted in a post pw (typically a comment reply). The notation P × P indicates that the links are generated between two elements of P, which encompasses all the content produced by users. Specifically, the links in L3 represent comments posted in response to a particular post, forming a structure that is similar to a thread or discussion. Equation 6, which describes L4:

$$L_{4} = \left( u_{k}, p_{w}, c \right) \in U \times P \times C \tag{6}$$

consists of a set of links that reflect each user uk who reacted to post pw using textual content. The notation U × P × C indicates that the links are established between an element of the set U, which represents all the users in the OSN, an element of the set P, which encompasses all the content produced by users, and an element of the set C, which represents the concepts within the content. Explicitly, the links in L4 represent a user’s reaction to a post, using textual content that relates to a specific concept, which makes it possible to capture how users engage with and interpret concepts within the content of the OSN. The global network of all users, posts, and concepts generates a bipartite graph of the WDG (described in Eq. 7):

$$X = \left[ x_{\text{upc}} : u \in U, p \in P, c \in C \right] \tag{7}$$

where each node represents either a user, a post, or a concept, and the links represent the associations between these entities. The graph X is constructed by taking the cartesian product of the sets U, P, and C, and then for each combination of u, p, and c, creating a node with the vector of produced content Xupc that represents the degree of association or relevance between user u, post p, and concept c. Thus, the bipartite graph X captures the relationships between users, posts, and concepts, and is used to analyze the structure and properties of the OSN, such as clustering, community detection, or network flow.
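As a rough illustration of how the tuple WDG = (U, P, C, L1, L2, L3, L4) could be held in memory, the sketch below uses a plain Python dataclass with one set per component. The class name `WebDiscourseGraph`, its methods, and all labels are hypothetical; the source does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field


@dataclass
class WebDiscourseGraph:
    """Container mirroring WDG = (U, P, C, L1, L2, L3, L4)."""
    users: set = field(default_factory=set)     # U
    posts: set = field(default_factory=set)     # P
    concepts: set = field(default_factory=set)  # C
    l1: set = field(default_factory=set)        # (user, user) interactions
    l2: set = field(default_factory=set)        # (user, post) comments
    l3: set = field(default_factory=set)        # (post, post) comment replies
    l4: set = field(default_factory=set)        # (user, post, concept) textual reactions

    def add_reaction(self, user, post, concept):
        """Register one L4 link and make sure its endpoints exist in U, P, and C."""
        self.users.add(user)
        self.posts.add(post)
        self.concepts.add(concept)
        self.l4.add((user, post, concept))

    def concepts_of(self, post):
        """Concepts attached to a post through L4."""
        return {c for (_, p, c) in self.l4 if p == post}


# Hypothetical usage with invented labels.
wdg = WebDiscourseGraph()
wdg.add_reaction("User-1", "post-01", "cancel")
wdg.add_reaction("User-2", "post-01", "refund")
wdg.l1.add(("User-2", "User-1"))  # User-2 interacted with User-1
```

Storing each link type as a set of tuples keeps the four relations separate, so the user|post and post|concept sub-networks can be extracted without touching the others.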

To create an interaction network, we defined, in Eq. 8, the customer user social network structure as a directed graph:

$$G_{\text{in}} = \left( U, L \right) \tag{8}$$

where U represented the set of users, and L the connections between the users. These connections were ordered pairs of users (u1, uk) ∈ L representing a social tie from u1 to uk (e.g., u1 interacting with uk). These connections were also ordered pairs of users and posts elements such as (uk, pw) ∈ L, representing a social reaction tie from uk to pw.

To create the semantic network, we defined, in Eq. 9, such network as a directed graph:

$$G_{\text{sn}} = \left( P, C, L \right) \tag{9}$$

where P was the content corpus (e.g., posts and comments), C encompassed the textual data (concepts) within posts, and L included the connections from concepts to posts. These connections were ordered pairs of concept and post elements, such as (ck, pw) ∈ L, representing a tie from ck to pw. To build the semantic network, we identified the set of valid concepts Z that were extracted, and we linked each content to each concept z ∈ Z, i.e., Xupz. To extract the available concepts from the users’ content collection and to link them to the posts, the cleaning algorithm and the cleaning database developed by Freire et al. (2022) were applied. We aggregated all posts and comments into a single dataset that produced our semantic corpus. We identified the most relevant post in P, by calculating in-degree, PageRank and eigenvector metrics, and the key concept in Z, by calculating the out-degree metric, for each of the 6 two-day datasets, as further explained in Sect. 5.1.
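The selection of a key concept by out-degree can be sketched in a few lines of Python. The concept and post labels below are invented, and counting links with `collections.Counter` is only one simple way to obtain the out-degree of each concept node in a concept-to-post network.

```python
from collections import Counter

# Directed concept -> post links (c_k, p_w); all values are invented.
concept_post_links = [
    ("cancel", "post-01"),
    ("refund", "post-01"),
    ("cancel", "post-02"),
    ("strike", "post-02"),
    ("cancel", "post-03"),
]

# Out-degree of a concept = number of concept -> post links leaving it.
concept_out_degree = Counter(concept for concept, _ in concept_post_links)

# The key concept is the one with the highest out-degree.
key_concept, key_count = concept_out_degree.most_common(1)[0]
```

In this toy dataset the concept "cancel" is linked to three posts and is therefore returned as the key concept.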

4 The case study

Following the framework in Freire et al. (2017), users’ activities and supplementary data were collected, followed by data processing and representation. Data collection was carried out between May 1st and 12th. When comparing the various snapshots, it was observed that the first snapshot, referring to May 1st and 2nd, recorded the highest number of users (394). It was also observed that the data set for May 5th and 6th contained the highest number of posts and concepts (201 posts and 10,730 concepts, respectively). In total, 1328 users were identified, and 877 posts containing 43,969 concepts were collected.

A graph database was used to store structured data, while unstructured data were cleaned for a later standardization process before graph analysis and data visualization and the production of the output files. NodeXL (Hansen et al. 2020) was used for a periodical data extraction of two days, over the period of twelve days (creating six datasets, in which the first five regarded strike days and the last one an afterwards period). When collecting the data, only a set of OSN snapshots was permitted when using raw data from the user|user network, which led to the creation of two additional networks, to identify the relevance of posts, and to determine which concepts were used the most.

Using data mining and relational database techniques, as well as a common spreadsheet, we were able to aggregate the matrices regarding the three entities that support web discourse and represent the main network. Processed data were analyzed in Gephi, not only because its graphical user interface suits graph studies, but also because it is open source and freely available (Bastian et al. 2009), allowing the visualization and manipulation of the networks, if intended, and the later application of SNA techniques for their analysis. By using social network metrics, the most relevant nodes (users, posts, or concepts) were identified and classified according to their importance. For ethical, legal, and privacy reasons, all user data were anonymized, in compliance with national and international laws.

Semantic processing was also an important step of the analysis process, in which the data obtained from the retrieved 769 posts were transformed into semantic networks. In contrast to formal publications, the contents generated on OSNs are not bound by grammatical and spelling rules, and abbreviations are common in web discourse, especially within posts. To solve this problem, the framework proposed by Freire et al. (2017) adopted text mining to discover posts’ embedded concepts and stemming to normalize synonyms within posts, using a synonym table (for instance, every form of the concept “cancel”, such as “cancelling”, “canceled” and “cancelled”, was converted to a single stem concept “cancel”, thus helping to reduce the sparsity of the output dataset).

The cleaning data process that was configured for each lexical language of each post involved: (1) the removal of symbols, punctuation or spaces between concepts; (2) the search for emoticons and their replacement by a standardization term; (3) the search for punctuation characters, and also their replacement by a standardization term; (4) the search for stop words and their removal; (5) the normalization of synonyms to substitute the identical, or almost identical, meaning of concepts with a single one. The output posts encompassed 20,438 concepts in 7 different languages (German, Spanish, French, Italian, Dutch, Portuguese, and English).
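The five cleaning steps listed above can be sketched as a small Python function. This is a simplified, English-only illustration: the lookup tables (`EMOTICONS`, `PUNCTUATION_TERMS`, `STOP_WORDS`, `SYNONYMS`), the standardization terms, and the order in which the steps run are assumptions made for the example, and they do not reproduce the cleaning database of the original framework.

```python
import re

# Hypothetical lookup tables; the real cleaning database is not reproduced here.
EMOTICONS = {":(": "emo_sad", ":)": "emo_happy"}
PUNCTUATION_TERMS = {"!": "mark_exclamation", "?": "mark_question"}
STOP_WORDS = {"the", "my", "was", "is", "a", "to"}
SYNONYMS = {"cancelling": "cancel", "canceled": "cancel", "cancelled": "cancel"}


def clean_post(text):
    """Return the list of normalized concepts found in one post."""
    text = text.lower()
    # Step (2): replace emoticons first, so they are not lost when symbols are stripped.
    for emoticon, term in EMOTICONS.items():
        text = text.replace(emoticon, f" {term} ")
    # Step (3): replace runs of selected punctuation marks by a standard term.
    for mark, term in PUNCTUATION_TERMS.items():
        text = re.sub(re.escape(mark) + "+", f" {term} ", text)
    # Step (1): remove the remaining symbols and punctuation, then split on whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    tokens = text.split()
    # Step (4): drop stop words.
    tokens = [token for token in tokens if token not in STOP_WORDS]
    # Step (5): map synonyms and inflected forms to a single concept.
    return [SYNONYMS.get(token, token) for token in tokens]


concepts = clean_post("My flight was CANCELLED!!! :(")
```

For the sample post, the function returns the concepts flight, cancel, mark_exclamation, and emo_sad. A multilingual version would need one set of tables per language.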

5 Results and data visualization

After using Gephi to create the networks for each dataset (snapshot), we calculated the centrality measures of each dataset, as shown in Table 3.

Table 3 Most influential users and most relevant posts, according to SNA metrics

As expressed in Table 3a, the airline company and fan page owner (TAP) exhibited the highest values for in-degree, out-degree and betweenness centrality. These results indicate that the airline company was quite active (notwithstanding some TV interviews, conducted at the time, that suggested the opposite) and did have some degree of control over the information flow.

Regarding closeness centrality, and as indicated in Table 3, many users (other than TAP) reached the highest possible value (92 users in the first dataset; 83 in the second; 102 in the third; 58 in the fourth; 68 in the fifth; and 28 users in the last one), meaning not only that they communicated much more directly (or through fewer intermediate users) with other users than the airline company did, but also that they were able to disseminate information throughout the network rather quickly, as closeness centrality is related to the time it takes for a piece of information to be shared with all network nodes. It is important to note that the fact that none of those users was TAP does not imply that there was no closeness between TAP and other users of the page. It only indicates that, in each time interval, there were several other users who had a shorter path to other nodes than TAP did. In a strike situation like this, determining the users with higher closeness centrality seems rather important to mitigate users’ flaming, possibly by addressing their complaints more thoroughly, more kindly, or more quickly, as their interventions will propagate faster throughout the network, generating a “fireball” effect (by flaming others). This sort of user management was not pursued by the airline company during the ten days of strike, as only in the last dataset, when the strike was over, did we observe a reduction in the number of users with high closeness centrality.
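One way to see how ordinary users can reach the maximum closeness value while the page owner does not is the toy computation below. It assumes one common convention, in which closeness is the number of nodes reachable from a node divided by the sum of the shortest-path lengths to them on a directed graph; the tool used in the study may define or normalize the metric differently, and the network and labels are invented.

```python
from collections import deque


def closeness(graph, source):
    """Reachable-node count divided by the sum of shortest-path lengths from source."""
    distances = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search over outgoing links
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    total = sum(distances.values())
    reachable = len(distances) - 1
    return reachable / total if total else 0.0


# Directed toy network (hypothetical labels): edges point from actor to target.
graph = {
    "User-1": ["post-01"],             # a single comment, nothing else
    "User-2": ["post-02"],
    "User-TAP": ["User-1", "User-2"],  # replies to both users
}

scores = {node: closeness(graph, node) for node in graph}
```

Under this convention, a user whose only activity is one direct comment reaches everything within one step and scores 1.0, whereas the page owner, who also reaches posts through intermediate users, scores about 0.67.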

According to PageRank and eigenvector metrics, the airline company had the highest values most of the time, which corroborated its control over the information flow and its central position within the network, although two other users gained relevance during the first half of the strike. When those users had their situations resolved, the airline company became central within the network once again.

To analyze the messages, in-degree, PageRank, and eigenvector were the selected metrics (as depicted in Table 3b), because in-degree quantifies incoming connections from users to posts, while PageRank and eigenvector account for the amount and quality of a node’s connections and the centrality of its connected nodes, respectively. In Table 4 we summarize the obtained results, for each metric, when analyzing the 6 snapshots of the airline company.
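To make the ranking of posts more concrete, the following sketch computes PageRank by plain power iteration on a small user-to-post network. The damping factor of 0.85, the fixed number of iterations, the uniform handling of nodes without outgoing links, and all labels are assumptions of this example; they are not parameters reported in the study.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {node: [successors]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for node, targets in graph.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# Hypothetical user -> post comment links.
graph = {
    "User-1": ["post-01"],
    "User-2": ["post-01"],
    "User-3": ["post-01", "post-02"],
}

ranks = pagerank(graph)
most_relevant_post = max(("post-01", "post-02"), key=ranks.get)
```

Here post-01 receives three incoming links and post-02 only one, so post-01 obtains the higher PageRank, in line with its higher in-degree.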

Table 4 SNA metrics for the airline company

Snapshot 3 was the period in which the highest values were recorded for in-degree (for posts or comments published on the fan page), out-degree (for airline replies) and betweenness centrality. If we think of Facebook as a call-center service, it can be assumed that in-degree quantified the flow of incoming calls, and that out-degree quantified the effort in answering the calls. From this perspective, the data evidence the airline company’s effort in responding to user requests. Although the data cannot tell whether such effort turned out to be a mere apology or an actual resolution of a problem, it was, nonetheless, an effort to reach users and to provide some sort of feedback, as expected from customer service (an apology will always be a far better response than mere silence). In-degree, out-degree, and betweenness centrality can, therefore, be used not only as thresholds for customer service level standards, but also as verification metrics of the defined thresholds.
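The call-center reading of in-degree and out-degree can be turned into a simple per-snapshot check, sketched below. The out/in ratio, the threshold `MIN_REPLY_RATIO`, and the interaction data are hypothetical illustrations of how such a threshold might be verified; they are not values or indicators defined in the study.

```python
# Directed (source, target) interactions per snapshot; all values are invented.
snapshots = {
    "snapshot-1": [("User-1", "User-TAP"), ("User-2", "User-TAP"), ("User-TAP", "User-1")],
    "snapshot-2": [("User-3", "User-TAP"), ("User-TAP", "User-3")],
}
PAGE = "User-TAP"
MIN_REPLY_RATIO = 0.8  # hypothetical service-level threshold


def service_level(edges, page):
    """Return (in-degree, out-degree, out/in ratio) for the page owner."""
    in_degree = sum(1 for _, target in edges if target == page)   # "incoming calls"
    out_degree = sum(1 for source, _ in edges if source == page)  # "answered calls"
    ratio = out_degree / in_degree if in_degree else 1.0
    return in_degree, out_degree, ratio


report = {name: service_level(edges, PAGE) for name, edges in snapshots.items()}
below_threshold = [name for name, (_, _, ratio) in report.items() if ratio < MIN_REPLY_RATIO]
```

In this toy data the first snapshot has two incoming interactions and one reply, so it falls below the assumed threshold, while the second snapshot meets it.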

Still in Table 4, the fourth snapshot showed the highest values of closeness centrality and PageRank (0.966 and 0.011, respectively), meaning that those were the days when the airline company communicated directly with more users and that such activity was of actual importance. At this point in time, 6 days of strike had already elapsed, and the airline seemed to have picked up the pace and deployed more resources into getting to know users’ problems and addressing them, while reducing the pressure for information/contact from the airline company. For this reason, we maintain that it is important to get closeness centrality and PageRank as high as possible, as this translates into a higher customer service level.

5.1 Web discourse network

Improved visualization and clearer presentation could greatly enhance numerous discursive exchanges on OSNs. Distinguishing topics, disagreements, and points of view more effectively, as well as navigating through critical questions rather than following a chronological order, are crucial needs that must be addressed. To analyze the OSN data and to display the final information to decision makers, which is presented in tables of a spreadsheet, several intermediary networks are seamlessly built. If intended, each of those networks could also be depicted for analysis. For academic purposes we will present the most illustrative ones. In Fig. 3 we evidence the web discourse network with all entities, while in Figs. 4 and 5 we depict the sub-networks user|post and post|concept, respectively.

Fig. 3 Network: user|post|concept. a, b, c, d and e networks during the ten days of the strike. f network on the following two days. Each network consists of nodes representing users, posts, and concept entities, and edges representing connections between them. The nodes are color-coded based on subgroups, and the size of each node represents its importance in the network (a more detailed visualization is available at https://drive.google.com/drive/folders/1cmrV2ekslItWloBKVTB1hF2NXYAvStRw?usp=share_link)

Fig. 4 Network: user|post. a, b, c, d and e networks during the ten days of the strike. f network on the following two days. Each network comprises nodes that represent users and posts, along with edges that depict connections between them. To distinguish different subgroups, nodes are color-coded accordingly (a more detailed visualization is available at https://drive.google.com/drive/folders/1qI0u3cXu-q0DTO-f1lrpMxGtezEN-E-F?usp=share_link)

Fig. 5 Network: post|concept. a, b, c, d and e networks during the ten days of the strike. f network on the following two days. Each network is composed of nodes that represent both posts and concepts. The size of each node and label reflects the frequency of the most commonly used concept within the network (a more detailed visualization is available at https://drive.google.com/drive/folders/1RaHi1ly1FEHuU7WtJdzfG6SwhAZgQSu9?usp=share_link)

In Fig. 3, from (a) to (f), we represented the obtained results for all the 6 datasets. Identified clusters use modularity class, node size is based on betweenness centrality, and label size uses in-degree. Larger nodes are more central to the network, playing a critical role in information flow and network cohesiveness. The size of the label assigned to each node is proportional to the number of interactions received by the corresponding user or post, providing valuable information about the most important entities in the network. This information can be used to identify the most commented or viewed posts, as well as the most influential users within the network. Colors distinguish the different subgroups, i.e., sets of more densely interconnected nodes. Each depicted network conveys users, posts, and concepts, and each one represents a single dataset of a two-day period, as explained before. All representations in Fig. 3 were obtained by applying the Network Splitter 3D visualization algorithm (Barão 2014) in Gephi, after the application of the ForceAtlas3D layout algorithm.

After determining the above-mentioned centrality metrics, all six networks were disaggregated and only two of the entities of the web discourse were analyzed: users and posts. As seen in Fig. 4, identified clusters use modularity class, nodes’ size is based on in-degree centrality and labels’ size uses out-degree. Moreover, the size of each node is proportional to its importance within the network. Larger nodes indicate users that are more central and receive a higher number of interactions within the network. The label size, as in Fig. 3, reflects the number of user interactions throughout the network, highlighting the most active users.

From the posts it was possible to determine that many users complained that they got no answer from the airline’s call-center, nor any response to their emails, while service offices were closed. Additionally, the airline’s website did not provide any information about the strike. Although this contrasts with the perspective obtained earlier from Table 4 (which showed an effort from the airline company in responding to user requests), this situation can mean that not every user was, in fact, assisted. This is corroborated by the nodes of users (and their posts) that appear isolated from the airline company (TAP) in Fig. 4, meaning that the airline did not address the posts from those people. Nonetheless, not every isolated node meant an unaddressed request and, taking a deeper look into Fig. 4a, we can observe, for instance, three large, isolated posts (a green, a blue, and an orange one) where no such thing happened. The green post (post-01) was a message posted by a user (User-94) informing that, due to the strike, he would not be able to perform that night in a well-known London electronic music club. This post got many replies and comments (thus having the highest metrics, namely an in-degree of 66, a PageRank of 0.062 and an eigenvector of 1); the blue post (post-02) was a political endorsement of the airline as a strategic asset for the country, stressing the need to provide it with better resources to fully develop its potential, and it also got many replies (achieving an in-degree of 56, a PageRank of 0.053 and an eigenvector of 0.848); and the orange post (post-35) was an open criticism of pilots and their high wage level that, again, got many replies (achieving an in-degree of 30, a PageRank of 0.026 and an eigenvector of 0.455).

From an airline company’s point of view, identifying these posts (with high in-degree, PageRank, and eigenvector values) is important in order to allocate resources more effectively. As posts get more and more interactions, airline companies may want to filter the ones that really need answering, so that they can deliver the level of customer service being demanded. In addition, having well-defined and widely disseminated public relations rules about which posts the airline’s staff should not engage with can free up resources for actual customer service. The posts that got answered are represented by the nodes connected to the airline company (User-TAP). All figures show that, during the strike days, the airline company did not address many of the questions posted on Facebook.
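A review queue of this kind could be approximated as follows. This is a hedged sketch using only the Python standard library. It assumes edges run from the author of an interaction to the post, and that a post counts as answered when `User-TAP` has an edge to it. The edge list and the `triage` helper are hypothetical.

```python
from collections import Counter

AIRLINE = "User-TAP"

# Hypothetical (source, target) edges: a user or the airline interacts with a post.
edges = [
    ("User-94", "post-01"), ("User-5", "post-01"), ("User-6", "post-01"),
    ("User-7", "post-02"), ("User-8", "post-02"),
    ("User-9", "post-03"), (AIRLINE, "post-03"),
]

def triage(edge_list, airline=AIRLINE):
    """List posts the airline has not interacted with, busiest first."""
    in_deg = Counter(target for _, target in edge_list)
    answered = {target for source, target in edge_list if source == airline}
    unanswered = [post for post in in_deg if post not in answered]
    # Sort by descending in-degree, breaking ties by post identifier.
    unanswered.sort(key=lambda post: (-in_deg[post], post))
    return [(post, in_deg[post]) for post in unanswered]

queue = triage(edges)
```

A high in-degree alone does not make a post a service request (post-01 in Fig. 4a is a counterexample), so such a queue is best treated as input for human review rather than as an automatic decision.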

5.2 The semantic network

Figure 5 represents the network that encompasses posts and their concepts. By using out-degree, it was possible to determine the most frequent and relevant concepts extracted from posts and to create a semantic sub-network, generated to look for specific information used in posts. This approach allowed us to identify the nodes that received the most direct connections, indicating how many times a concept had been used in the network. For this purpose, we represented the results obtained for all six datasets, one for each two-day period.

Using a semi-automated text analysis, it was possible to extract a list of commonly used concepts and to rank them. All extracted concepts that belonged to the same post were seen as a community, by means of a modularity class algorithm. This indicated that they were more densely connected with each other than with the rest of the network. The out-degree metric made it possible to determine which nodes had the highest number of outbound links, indicating how many times a concept had been used in a post.
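The ranking step can be illustrated with a small Python sketch. It assumes, for illustration only, that each edge points from a concept to a post that uses it, so that a concept’s out-degree counts the posts in which it appears. Grouping concepts by post is a simplification that stands in for the modularity class algorithm. The concepts, post identifiers, and helper names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical directed edges (concept -> post).
edges = [
    ("refund", "post-10"), ("flight", "post-10"), ("strike", "post-10"),
    ("refund", "post-11"), ("strike", "post-11"),
    ("strike", "post-12"),
]

def rank_concepts(edge_list):
    """Rank concepts by out-degree (number of distinct posts using them)."""
    out_deg = Counter(concept for concept, _ in set(edge_list))
    return sorted(out_deg.items(), key=lambda item: (-item[1], item[0]))

def concepts_by_post(edge_list):
    """Group concepts by the post they belong to (a stand-in for communities)."""
    groups = defaultdict(set)
    for concept, post in edge_list:
        groups[post].add(concept)
    return dict(groups)

ranking = rank_concepts(edges)
groups = concepts_by_post(edges)
```

With this toy data, `ranking` places "strike" first, followed by "refund" and "flight".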

This approach (getting to know the network’s concepts) allows airline companies, in general, to grasp the major issues being reported and to find out whether common solutions can be provided, thus optimizing customer response by making it more cost-effective and by better allocating resources to specific problems, when possible. The density and number of connections in Fig. 5 (especially during the first 8 days of the strike, networks a–d) show the same concepts being addressed over and over again. It is therefore reasonable to assume that the airline’s customer response could have been better coordinated. Publishing the information that was repeatedly provided (on the airline’s website, on Facebook’s main page, or on the company’s billboards at the airport, for instance) could have relieved the load of answering users one by one through the airline’s Facebook account, thus reinforcing the customer service level.

Although, as mentioned in Sect. 3, users tend to mix characters and symbols to communicate quicker, especially when expressing negative emotions, the discursive exchanges found on the airline’s fan page did not resemble the characterizations and definitions found in the literature. The collected messages were quite long, formal, and well-formed (like a commercial letter), bearing few abbreviations or emojis and, despite the apparent dissatisfaction, quite polite (interestingly, the concepts “good morning”, “good afternoon” and “thank you” were widely used). Moreover, the posts that registered the highest values for the in-degree metric, i.e., the messages that were most viewed or commented on in each snapshot, had either a high or a reduced number of concepts. For example, post-134, on the 3rd and 4th of May, contained a total of 942 concepts and, at the opposite extreme, post-103, on the 11th and 12th of May, had only 8 concepts. The visualization of these two messages (see Fig. 6) allowed us to conclude that, despite both showing dissatisfaction, the first was a complaint, whereas the latter was intended to express indignation/outrage and to denigrate the airline’s image.

Fig. 6 Example of posts with the highest and lowest number of concepts

Messages containing a higher number of concepts may indicate valid assistance requests, whereas messages with only a few concepts may turn out to be swearing or foul language. In this case, post-134 contained information that could be used to respond to the customer assertively, since its concepts identified the time, origin, and destination of a flight. Regarding post-103, its identification would be important, even if the decision were to censor it. Though not tested in this case study, and thus requiring future analysis, defining thresholds for the number of concepts within each post may become important for airlines to separate valid assistance requests from merely defamatory comments, which could be automatically censored or removed from the network, together with the users who made them.
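Such a threshold rule could look like the sketch below. It only illustrates the idea. The cut-off value is a hypothetical placeholder, since no threshold was tested in this case study, and the `triage_by_concepts` helper is likewise an assumption. The two concept counts are the ones reported above for post-134 and post-103.

```python
# Hypothetical cut-off; the case study did not test or propose a specific value.
LOW_CONCEPT_THRESHOLD = 10

def triage_by_concepts(concept_counts, low=LOW_CONCEPT_THRESHOLD):
    """Split posts into likely requests and posts flagged for moderation review."""
    requests, flagged = [], []
    for post_id, count in sorted(concept_counts.items()):
        # Posts with few concepts are only flagged for review, not removed.
        (flagged if count <= low else requests).append(post_id)
    return requests, flagged

# Concept counts reported in the text for the two example posts.
counts = {"post-134": 942, "post-103": 8}
requests, flagged = triage_by_concepts(counts)
```

Flagging rather than deleting keeps a human in the loop, which seems prudent until the relation between concept counts and message intent has been validated.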

6 Final remarks

Classic decision support systems are usually seen as the major trend in decision support, but the use of OSNs and associated technology is driving the creation of new frameworks to simplify data manipulation and decision-making and support, especially when such tools are available at a very low cost or even for free (which was the case for the software used in this paper). While it is true that the building of interaction networks and semantic networks has been previously explored, our study offers an approach for analyzing airline-related Facebook user data by integrating all network elements, web discourse entities, process steps, and SNA techniques and applying them to an airline’s OSN; to the best of our knowledge, no previous studies have considered such integration. Understanding an OSN context means not only that an organization can follow what its customers do, but also that it can get much closer to what they say and want. With the explosive growth of social data available online, businesses can no longer neglect it when making decisions.

SNA can provide valuable insights into the performance of an airline’s Facebook account when it is considered as a call-center service. By analyzing the interactions between the Facebook account and customers, call-center managers can identify patterns, trends, and areas for improvement. Several SNA metrics, described in Table 5, such as degree, closeness, and betweenness centrality, can be used to improve the performance of an OSN call-center.

Table 5 Summary of metric contributions

Firstly, SNA can be used to identify influential customers by using degree centrality, eigenvector, and PageRank measures. These metrics can help call-center managers to prioritize their interactions with influential customers who have the most connections and may influence the opinions of others. Secondly, closeness centrality can be used to improve response times and enhance the overall customer experience, by identifying customers who are closest to each other and are most likely to need a quicker response. In-degree and out-degree measures can also be used to identify the most active users in the network who need additional support.

Thirdly, betweenness centrality can be used to identify which users act as bridges between other users in an OSN call-center. This metric can help call-center managers to resolve issues more effectively and use customer feedback to drive improvements in their operations. Fourthly, betweenness centrality can be used to identify the users who are most effective at resolving issues and the customers who are most likely to provide valuable feedback. By prioritizing interactions with these users, call-center managers can improve the first-call resolution of issues and drive improvements in their operations.

Lastly, SNA metrics such as in-degree, out-degree, PageRank, and eigenvector centrality can help call-center managers to measure the company's performance in terms of the average speed of answer and customer satisfaction score. By monitoring these metrics, call-center managers can quickly identify the most active users in the network, the need for additional support, and highly influential customers who have a significant impact on customer satisfaction.
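As an illustration of how one of these influence measures can be obtained, the sketch below implements a simplified power-iteration PageRank in plain Python. It is a textbook formulation under assumed defaults (damping factor 0.85, fixed number of iterations), and its values may differ from those produced by the software used in this study. The user-to-user reply edges are hypothetical.

```python
def pagerank(edge_list, damping=0.85, iterations=100):
    """Simplified PageRank by power iteration on a directed edge list."""
    nodes = sorted({n for edge in edge_list for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edge_list:
        out_links[source].append(target)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for target in out_links[v]:
                new_rank[target] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank

# Hypothetical reply edges: (replying user, user being replied to).
replies = [("User-A", "User-C"), ("User-B", "User-C"),
           ("User-D", "User-C"), ("User-C", "User-A")]
scores = pagerank(replies)
most_influential = max(scores, key=scores.get)
```

In this toy network the user who receives replies from everyone else obtains the highest score, which is the kind of signal a call-center manager could use to prioritize interactions.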

The performed analysis can help airline companies, in general, to identify problems in traditional contact channels and to prevent service breakdowns more promptly. Calculating SNA metrics and using network visualization techniques makes it possible to use OSNs as important business aids, thus enhancing their value as business information sources for decision-making, as we confirmed that SNA contributes to a deeper understanding of customers’ behavior and interaction (amongst themselves and with the airline companies).

Categorizing the posts published during the strike days would have given the airline company the means to address complaint cases, namely by creating/focusing task groups for helping with reservation changes and information requests (refund procedures, available hotel options, luggage issues, etc.). The identification of concepts could also have been used by the airline company to reinforce the means for providing permanent information (over service counters or online channels) about flight cancellations and flight rescheduling procedures, or even to filter valid assistance requests from defamatory ones, though further research is needed on this matter. By evidencing relevant conversations, the OSN visualization would have allowed the airline company to address clients’ dissatisfaction faster and to assess whether such feeling was being aggravated or mitigated by customer service actions.