1 Introduction

For all we know, the human capacity for language is second to that of no other species in the animal kingdom. It is regarded as essential for the development of human society and its cultural achievements. In addition, humans regard language as something extremely personal. In expressing thoughts, desires, intentions, or beliefs with words, humans experience themselves as individuals. This is one of the reasons why freedom of expression is considered a human right in many jurisdictions, even where it may not be granted in practice. Hence, it is only natural that efforts to moderate linguistic expression in the digital realm are an important topic in Digital Humanism. Another reason why content moderation deserves a prominent place in Digital Humanism is that speech has traditionally been the medium through which politics happens. From public debate in the ancient Greek polis to modern-day speeches in mass media, leaders lead through language and are challenged in debates. Language is thus the medium that facilitates power, and it can be the medium by which power is taken away, as in the democratic vote of the people.

The intention to regulate (or “moderate”) what is published is neither new nor exclusive to digital media. What should be published has probably been a central societal, political, religious, and ethical concern for as long as writing has existed, and certainly since the invention of the printing press. It has been a subject of censorship and is still regulated by law and ethical norms, including for traditional mass media (newspapers, books, TV, etc.) in modern liberal democracies. For example, many countries have laws prohibiting the publication of terrorist content and have rules for the publication of certain types of material, such as age limits for pornographic content. Often, there are self-governing bodies that regulate what can be published, for example, in mass media or in advertising. Traditionally, such limitations on publication were implemented through reviewers, editors, censors, or courts, that is, by humans. They could limit publication, remove content, or restrict the audiences of certain publications. In principle, these instruments are still applicable in the digital world. However, an important new quality of content moderation emerges when the decision to regulate content (including who can see it) is taken with the help of algorithms.

2 What Is Algorithmic Content Moderation?

Content moderation is a fairly recent field of digital technologies. Even though language technologies relevant for content moderation today have been developed for decades, researchers have paid much closer attention to the area since the rise of large online social media platforms. In many cases, content moderation is a response to the fact that discourse on such platforms has proven problematic. Online social platforms facilitate the distribution of untrue information (fake news), the creation of environments where people are only exposed to opinions reconfirming their interests and beliefs (filter bubbles), verbal abuse, and many other troublesome phenomena. While these phenomena are by no means exclusive to digital media or social networks, they may be exacerbated in large communities of speakers with no personal interaction other than the messages that they exchange online. Up until a few years ago, relatively few scientific papers dealt with the topic. The number of scientific publications has been increasing since 2015 (Fanta, 2017), primarily focusing on technical approaches, ethical challenges, and the perception of automatically generated content by journalists and the public. Communication science has also dealt theoretically and empirically with automated content, and in particular automated journalism, for several years.

Content moderation addresses a topic that not only concerns individuals and their linguistic online expressions; it deals with communication and how humans establish social relations. It addresses how we interact with each other and how we make sense of the world. In the following definition of content moderation, we refer to Roberts (2017):

Content moderation is the organized practice of screening user-generated content (UGC) posted to Internet sites, social media, and other online outlets, in order to determine the appropriateness of the content for a given site, locality, or jurisdiction. The process can result in UGC being removed by a moderator, acting as an agent of the platform or site in question. … The style of moderation can vary from site to site, and from platform to platform, as rules around what UGC is allowed are often set at a site or platform level and reflect that platform’s brand and reputation … The firms who own social media sites and platforms that solicit UGC employ content moderation as a means to protect the firm from liability, negative publicity, and to curate and control user experience. (Roberts, 2017, p. 1)

Content moderation is not only an issue for content providers such as online newspapers but also relevant for social media platforms such as Twitter or Facebook. It is relevant for text-based online systems as well as networks that focus on other types of content such as images, videos, and even music. In this chapter, we focus on text-based systems.

There are two main reasons to screen user-generated content (UGC).

Reason 1: Depending on national legal regulations, media providers are liable for the content published via their sites. This is particularly the case for online newspapers, which are subject to national regulations as to what content is permitted. For Austria, this is regulated in media law.Footnote 1 The situation is less clear for providers of social media platforms such as Facebook, Twitter, or TikTok, which are a comparatively new worldwide phenomenon. Thus, the formulation of international requirements of conduct and legal standards is necessary, and respective initiatives are underway. An example is the European Digital Services Act (DSA), an endeavor of the European Union to regulate online services including social media.Footnote 2 The DSA entered into force in November 2022 and is planned to be fully applicable by February 2024. Due to national legal regulations, it has been vital for providers of online newspapers since their inception to filter out UGC that conflicts with the law. This activity is called pre-moderation and is done before posts go online. Pre-moderation is typically done automatically because of the sheer quantity of incoming posts and the speed at which they need to be processed in order to guarantee real-time communication.

Reason 2: Individual content providers have different editorial concepts, and depending on these, they have differing demands on how people are expected to communicate with each other. This is typically communicated via the terms of use and specific rules of netiquette. Fora of online newspapers with a claim to quality are typically moderated by human moderators. An example of an online newspaper with a strong moderation policy is derStandard,Footnote 3 which has a team of human forum moderators whose main task is to support a positive discussion climate in the newspaper’s online fora.Footnote 4 This kind of moderation is called post-moderation, as the moderation activities relate to posts that are already online. Apart from community activities where users can flag inappropriate posts, natural language processing (NLP) plays an important role in supporting human moderators in finding posts that might be of interest to a larger group of readers of a forum than just the few who participate in a certain thread. Automatic systems can also help identify fora, or phases in a discussion, that become increasingly emotional or discriminatory. Examples are given in the following section.

There are more reasons for algorithmic content moderation, for example, the identification of protected intellectual property (see chapter by Menids in this volume).

3 Technical Approaches to Content Moderation

Technical approaches to content moderation are based on classifiers. Text classification is a core method in NLP, employing different methods of machine learning. Classifiers are used to categorize data into distinct groups or classes. Roughly, they are mathematical models that use statistical analysis and optimization to identify patterns in the data. To train a classifier, a certain amount of labeled data is required, representing in-class and out-of-class examples. A number of classifiers exist, including logistic regression, Naïve Bayes, decision tree, support vector machine (SVM), k-nearest neighbors (KNN), and artificial neural network (ANN) (see, for instance, Kotsiantis et al. (2007) for a review of classification techniques and Li et al. (2022) specifically for text classification).
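
To make this concrete, the following minimal sketch trains a classical text classifier with scikit-learn. It is an illustration only: the toy posts, labels, and the choice of TF-IDF features with logistic regression are assumptions for demonstration purposes, not the setup of any production moderation system.

```python
# Minimal sketch of a text classifier for moderation support (illustrative only).
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Invented toy training data: 1 = should be flagged for review, 0 = unproblematic.
posts = [
    "Thanks for the thoughtful article, I learned a lot.",
    "You are an idiot and should shut up.",
    "Interesting point, but the numbers seem off.",
    "People like you should be banned from this country.",
]
labels = [0, 1, 0, 1]

# TF-IDF features plus logistic regression: one classical baseline among the
# classifier families mentioned above (SVMs, Naive Bayes, decision trees, ...).
model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(posts, labels)

# The trained model assigns a class (and a probability) to new, unseen posts.
print(model.predict(["What a stupid take, typical for you people."]))
print(model.predict_proba(["Great discussion in this thread!"]))
```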

In the case of fora in online newspapers, text-based classifiers are employed, which assign predefined category labels to individual posts. In practice, a number of different classifiers are in use, depending on the moderation tasks at hand. In pre-moderation, UGC is classified into content that can or cannot be posted on the media site, according to whether it adheres to or infringes the requirements of the respective (national) media law or violates the medium’s defined online etiquette or community policy (Reich, 2011; Singer, 2011). In post-moderation, classifiers support the forum moderators in identifying postings of interest. What is of interest is defined by the individual media companies and may differ across editorial sections and even individual articles. All in all, moderation is an important success factor for online discussion culture (Ziegele & Jost, 2016).
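
How such classifier outputs might feed into pre- and post-moderation can be illustrated with a small routing function. The label names, probability thresholds, and actions below are purely hypothetical and do not reflect the policy of any particular media company.

```python
# Hypothetical routing of classifier scores into pre- and post-moderation actions.
from dataclasses import dataclass

@dataclass
class ModerationDecision:
    action: str               # "block", "hold_for_review", or "publish"
    flag_for_moderator: bool  # surface the post to human moderators

def route_post(p_illegal: float, p_netiquette_violation: float,
               p_of_interest: float) -> ModerationDecision:
    # Pre-moderation: keep content off the site that must not go online at all.
    if p_illegal > 0.9:
        return ModerationDecision("block", flag_for_moderator=True)
    if p_illegal > 0.5 or p_netiquette_violation > 0.8:
        return ModerationDecision("hold_for_review", flag_for_moderator=True)
    # Post-moderation support: publish, but point moderators to interesting posts.
    return ModerationDecision("publish", flag_for_moderator=p_of_interest > 0.7)

print(route_post(p_illegal=0.05, p_netiquette_violation=0.1, p_of_interest=0.85))
```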

The classifier technologies in use differ widely depending on when they were developed. Earlier classifiers often use decision trees and support vector machines (SVMs); more recent ones are based on neural networks (deep learning). As technology has advanced, deep learning-based approaches typically lead to better results than classical machine learning-based approaches such as decision trees or SVMs. For illustration, examples from the Austrian online newspaper derStandard.at are given in the following.
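
For the neural variant, a classifier is typically obtained by fine-tuning a pretrained language model. The sketch below uses the Hugging Face transformers and datasets libraries; the base model, the two toy posts, and all hyperparameters are placeholders, and the snippet does not reproduce the systems described next.

```python
# Sketch of fine-tuning a transformer-based post classifier (placeholder data/model).
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

model_name = "distilbert-base-multilingual-cased"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Invented toy examples: 0 = unproblematic, 1 = needs moderator attention.
data = Dataset.from_dict({
    "text": ["Sachliche Frage zum Artikel.", "Halt den Mund, du Troll!"],
    "label": [0, 1],
})
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=64),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="moderation-clf", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()  # in practice: thousands of annotated posts and a held-out test set
```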

In pre-moderation, derStandard.at has been using a decision tree-based system (Foromat, developed by OFAIFootnote 5) since 2005. While before the introduction of the system human editors had to manually inspect the incoming posts and decide which ones could go online, Foromat significantly reduced the number of posts that needed manual inspection. With the volume of postings drastically increasing with the shift from print to online, manual pre-moderation became simply impossible and thus was left to the system. Accordingly, community measures such as the possibility for users to flag content as inappropriate and post-moderation became more important as a means to filter out inappropriate postings after their publication.

Post-moderation is key for encouraging an agreeable discussion climate in fora. Starting in 2016, the De-Escalation Bot was developed by OFAI together with derStandard (Schabus et al., 2017; Schabus & Skowron, 2018).Footnote 6 This is another kind of classification task where the classes were designed to prevent escalation in fora and to identify valuable contributions to discussions. The posts classified in this way are then sifted by the moderators, and those which the moderators consider of general interest are ranked at the top of a forum to be easily accessed by all users of the forum. According to derStandard, this has noticeably improved the quality of the discourse.Footnote 7 A more recent collaboration between OFAI and derStandard led to a classifier that helps moderators identify misogynist posts in order to counteract online discrimination against women (Petrak & Krenn, 2022). This is an important precondition for fostering female contributions to forum discussions. While the proportion of individuals who identify themselves as men or women among the online readers of derStandard is relatively balanced at 55–45%, there is a clear imbalance when it comes to active contributions, i.e., only 20% of posters identify themselves as female (surveyed on the basis of the indication of “salutation” in new registrations).

The limitations in the area of classifier-supported moderation lie primarily in the necessary provision of correspondingly large training data annotated by domain experts (typically moderators). So far, this is usually done once, when the classifier is developed. Over time, however, the wording of posts may change as users counteract moderation strategies; what is considered relevant and desirable content is also likely to change, as are the marginalized user groups in question and the measures required to encourage their contributions. Therefore, mechanisms need to be integrated into moderation interfaces through which moderators can easily collect new training data during their daily work, and classifiers capable of online learning need to be developed. This, however, is still a question of basic research [for some further reading on online learning, see Cano and Krawczyk (2022) and Mundt et al. (2023)]. Likewise, depending on the information to be identified, the available training data, and the machine learning architecture used, the accuracy rates can vary significantly. In all cases, however, the moderation quality achieved in the end strongly depends on the human experts, the forum moderators. The advantage of the classifiers is that they direct moderators to potentially relevant posts, whereas the final moderation decision lies with the moderator.
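
A minimal sketch of what such online learning could look like is given below, assuming scikit-learn's partial_fit interface; the stream of moderator feedback is simulated with invented examples.

```python
# Sketch of incrementally updating a moderation classifier from moderator feedback.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**18)  # stateless, so no refitting needed
clf = SGDClassifier(loss="log_loss")              # logistic regression trained online
classes = [0, 1]                                  # 0 = keep, 1 = needs intervention

def update_from_moderator(texts, labels):
    """Fold freshly labeled posts from the moderation interface into the model."""
    X = vectorizer.transform(texts)
    clf.partial_fit(X, labels, classes=classes)

# Simulated feedback collected during daily moderation work:
update_from_moderator(["New coded insult pattern ...", "Perfectly fine comment."], [1, 0])
update_from_moderator(["Another wave of the same slur variant."], [1])

print(clf.predict(vectorizer.transform(["Yet another variant of that insult."])))
```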

Apart from encouraging an agreeable discussion climate, the detection of tendentious and fake news is another important aspect of content moderation. This is particularly relevant in social media, which significantly differ from fora in online newspapers. Whereas in online newspapers a forum is related to an individual article or blog entry written by a journalist and redacted according to the editorial policy of the respective newspaper, UGC on social media platforms is far less controlled. Accordingly, social media platforms offer a high degree of freedom of expression while being open to all kinds of propaganda and misinformation. This became particularly obvious with the US presidential election in 2016. Since then, the identification of fake and tendentious news has become a very active field of research (see, for instance, the SemEval2019Footnote 8 task on hyperpartisan news detection (Kiesel et al., 2019), where 42 NLP systems from all over the world designed to identify extreme right- or left-wing news competed against each other). The comparison of the systems showed that no single method had a clear advantage over others; successful approaches included both word embeddings and handcrafted features. SemEval2023 subtask 3 addresses the identification of persuasion techniques; here, especially the identification of manipulative wording and attacks on reputation is of interest for tendentious news detection.Footnote 9 (See also Zhou and Zafarani (2020) for a discussion of theoretical concepts and approaches to fake news detection.)

Automated fact checking is another area of NLP that addresses the task of assessing whether claims made in written or spoken language are true or false. This requires NLP technology to detect a claim in a text, to retrieve evidence for or against the claim, to predict a verdict on whether the claim is true or false, and to generate a justification for the verdict (cf. Guo et al., 2022). Apart from NLP-based research on automated fact checking, there is a broad range of journalistic fact-checking initiatives and sites, such as PolitiFact, a fact-checking site for American politics; EUfactcheck, an initiative of the European Journalism Training Association; the European Fact Checking Standards Project, where European organizations involved in fact checking cooperate to develop a code of integrity for independent European fact checking; or Poynter, an international fact-checking network,Footnote 10 to mention only some existing fact-checking initiatives.
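
The fact-checking pipeline outlined by Guo et al. (2022) can be summarized structurally as in the following sketch. All component functions are hypothetical placeholders; a real system would plug in trained models and an evidence index.

```python
# Structural sketch of an automated fact-checking pipeline (placeholder components).
from typing import List, Tuple

def detect_claims(text: str) -> List[str]:
    ...  # claim detection, e.g., a sentence-level "check-worthiness" classifier

def retrieve_evidence(claim: str) -> List[str]:
    ...  # retrieval over a document collection or fact-check database

def predict_verdict(claim: str, evidence: List[str]) -> str:
    ...  # entailment-style classifier: "supported", "refuted", "not enough info"

def generate_justification(claim: str, evidence: List[str], verdict: str) -> str:
    ...  # text generation conditioned on the retrieved evidence

def fact_check(post: str) -> List[Tuple[str, str, str]]:
    results = []
    for claim in detect_claims(post):
        evidence = retrieve_evidence(claim)
        verdict = predict_verdict(claim, evidence)
        results.append((claim, verdict,
                        generate_justification(claim, evidence, verdict)))
    return results
```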

Disinformation detection is a moving target because new topics and concerted propaganda are constantly evolving, and fact checking must adapt accordingly. Moreover, technological progress in deep learning has produced ever more capable tools for automated disinformation production, enabling the creation of deepfakes, i.e., the use of deep learning for the automated generation of fake content. Examples are the recent large generative language models (such as OpenAI’s GPT family, Google’s PaLM, or Meta’s LLaMA) that are able to flexibly generate text based on prompts, the advances in neural visual content generation (e.g., Google’s Imagen or OpenAI’s DALL-E), where pictures are generated on the basis of textual input, as well as the possibilities to generate realistic-looking yet fake audio and video content. (See Zhang et al., 2023, for a survey on generative AI.) Note that the developments in generative AI are fast, and models that are up to date at the time of writing this contribution may soon be outdated.

4 Societal Challenges

Freedom of expression. As already mentioned, the identification of illegal content is an important reason for content moderation in general and algorithmic moderation in particular. Often, such content will be removed after detection, and it may trigger further legal proceedings. For example, it is illegal in Austria to publicly deny the Holocaust. In addition, there is content that conflicts with the terms of use or the community guidelines of a social network. For example, some networks exclude nudity or have strict policies regarding false and misleading information or “fake news.”

Often, however, content moderation is publicly debated in the context of unwanted (or harmful) content. Such content is much more difficult to define. Note that in many countries, it is not in principle illegal to lie or to use abusive language. Still, such content is often considered societally unwanted, for example, because it affects certain parts of society more strongly than others, may lead to the spread of dangerous information, or may be used to exert unwanted (e.g., political) influence. A major problem with the notion of harmful content, however, is that it lacks precision and definition. It is often not very clear who is harmed and in whose interest it is to identify, mark, or remove such content. This is one of the reasons why calls to remove harmful content raise concerns and accusations of censorship. In addition, even productive debates may benefit from some degree of strong language and, according to the European Court of Human Rights, may require information that offends, shocks, or disturbs (ECHR, 2022). This makes it even less clear what precisely should be considered harmful in online debates (Prem, 2022).

The extent to which algorithmic content moderation actually interferes with the principle of freedom of expression is difficult to analyze in practice. This requires a detailed topical analysis of the practices of deletion, for which data are often lacking. It also requires a differentiated analysis of the degree to which banned content (or banned users) can turn to other media to express their thoughts. The core challenge is to strike a balance between protecting users and legal obligations on the one hand and safeguarding freedom of expression on the other (Cowls et al., 2020).

Challenges of meaning and languages. A central challenge in content moderation, and indeed in most aspects of language technology, is the identification of what users actually mean. Human utterances are easily misunderstood in everyday life, but the notion of meaning is also an elusive philosophical concept that has been debated for centuries. Features of natural languages such as humor, irony, mockery, and many more are notoriously hard to detect (Wallace, 2015), not only by machines but to some extent also by humans. Many algorithms for content moderation are context-blind and have great difficulties detecting nuance. However, such nuance is often important in debate. A particular difficulty in algorithmic moderation is to follow discussions over an extended stretch of online debate. In fact, still today, many algorithms operate only on single posts, while the intended meaning of a post may require taking into account longer stretches of dialogue. Estimates suggest an accuracy of only 70% to 80% for commercial tools (Duarte et al., 2017). For example, in 2020, Facebook removed posts with the hashtag #EndSARS in Nigeria. The hashtag was intended to draw attention to police attacks against protesters, but the moderation algorithms mistook it for misinformation about COVID-19 (Tomiwa, 2020).
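
The context problem can be made concrete with a small sketch: the same reply is scored once in isolation and once together with the preceding thread. The score_toxicity function stands in for any post-level classifier and is a hypothetical placeholder.

```python
# Sketch: classifying a reply with and without the preceding thread as context.
from typing import Callable, List

def with_context(thread: List[str], reply: str, max_previous: int = 3) -> str:
    """Concatenate the last few posts of a thread with the reply to be classified."""
    context = " [SEP] ".join(thread[-max_previous:])
    return f"{context} [SEP] {reply}"

def classify_reply(score_toxicity: Callable[[str], float],
                   thread: List[str], reply: str) -> dict:
    return {
        "post_only": score_toxicity(reply),
        "with_context": score_toxicity(with_context(thread, reply)),
    }

# "That's exactly what they deserve" may be harmless banter or a veiled threat,
# depending on what was said before; a post-only classifier cannot tell the difference.
```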

Another huge challenge for the practice of algorithmic moderation is the fact that many technologies only work well for a few common languages, first and foremost English. Rarer languages in particular often lack datasets suitable for training NLP models. This raises another ethical and perhaps legal challenge regarding the fairness of different standards in content moderation for different languages. In some regions, such as the European Union, there are a variety of different languages that should be treated equally. There are 24 official languages of the EU, and some of these have a relatively small community of speakers (e.g., Maltese), and significantly fewer texts are available. There are also fewer datasets available for researchers and for AI development. Moreover, the computational and economic power required to train large language models, as well as the access to large datasets on which current NLP systems are based, lies in the hands of private companies such as OpenAI, Meta, or Google.

Political challenge: power of control, silencing, selection, and redress. Online content moderation raises important issues of power and power relations. Online moderation can have a decisive influence on which topics in a debate disappear from public discourse. It can silence specific groups, and it has the power to control whose online contributions are shown to whom. Hence, the delegation of content moderation to algorithms turns content providers into subjects of algorithmic decision-making.

This leads to other important problems, namely, what happens when content is wrongfully reported, deleted, or in any other way restricted. Firstly, there is the problem of informing authors that their contributions were subject to moderating interventions. Such information can be given, especially where content is considered illegal, but in practice, this is not always the case. In particular, the decision to limit the visibility of a post rather than display it prominently is hardly ever made accessible to the content contributor. Secondly, the question arises of how to contest the decision that content is considered inappropriate or illegal. Today, the focus of public regulation is more on the deletion of illegal content than on redress procedures and re-instantiation. Note that regulators do not always prescribe deletion directly but are introducing strict liability regimes that provide strong incentives for platforms to perform algorithmic content moderation even before content is reported by users or third parties (Cowls et al., 2020). This poses a significant threat to freedom of expression, as it can mean that certain views are systematically suppressed with little or no chance of forcing online platforms to publish content. It also leads to the question of who should be the regulatory body deciding upon complaints and implementing the re-instantiation of wrongfully deleted content. Wrongful deletion is a significant problem. For example, in Q2/2020, more than 1.1 million videos were removed from YouTube (Cowls et al., 2020). Such excessive deletion may be a consequence of regulation. Lawmakers tend to require that illegal content be removed within very short time frames. While regulation usually does not prescribe the use of algorithms directly, in practice algorithms are the only way in which social networks or publishers can fulfill their legal obligations.

Since many newer techniques are based on statistical machine learning, there is an additional danger of bias and discrimination. Algorithmic content moderation systems can replicate and amplify existing biases and discrimination in society. For example, if a system is trained on biased datasets or programmed with biased algorithms, it may disproportionately flag and remove content from marginalized communities (Haimson et al., 2021). In addition, there is a lack of transparency. Content moderation algorithms are often proprietary, meaning that the public does not have access to the underlying code or the criteria used to determine what content is flagged or removed. This lack of transparency can make it difficult for users to understand why their content was removed (Suzor et al., 2019), and it makes it hard for researchers to study the impact of these algorithms on society.
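
One simple way to probe such bias is to compare error rates of a moderation classifier across user groups, as in the following sketch; the data, predictions, and group labels are invented for illustration.

```python
# Sketch of a minimal fairness audit: per-group false-positive rates of a classifier.
import numpy as np

def false_positive_rate(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    negatives = (y_true == 0)  # posts that should not have been flagged
    return float((y_pred[negatives] == 1).mean()) if negatives.any() else float("nan")

def audit_by_group(y_true, y_pred, groups):
    return {g: false_positive_rate(
                [t for t, gg in zip(y_true, groups) if gg == g],
                [p for p, gg in zip(y_pred, groups) if gg == g])
            for g in set(groups)}

# Invented example: benign posts (true label 0) from group "B" are wrongly flagged
# (prediction 1) far more often than those from group "A", a sign of disparate treatment.
print(audit_by_group(y_true=[0, 0, 0, 0, 1, 1],
                     y_pred=[0, 0, 1, 1, 1, 1],
                     groups=["A", "A", "B", "B", "A", "B"]))
```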

There are various ways to address issues of transparency and bias ranging from stricter regulation to increasing accountability or democratic approaches. Vaccaro et al. (2021) propose adding representation (i.e., moving toward participatory and more democratic moderation), improving communication (i.e., better explaining the reasons for moderating interventions), and designing with compassion (i.e., emphasizing empathy and emotional intelligence in moderation decisions). Other measures that have been proposed (Cowls et al., 2020) include ombudspersons, enforceable statutory regimes, and improved legal protection.

5 Conclusions

Content moderation is not a new phenomenon. It has existed at least since written forms of expression developed and was traditionally performed by humans in charge. However, algorithmic content moderation is a new phenomenon tied to the explosion of digital content in online media and social networks. Today’s content moderation is a combination of human moderators and algorithmic support, however with a strong trend toward more automation in reaction to growing amounts of content and increasing regulation. There is a range of reasons for content moderation. These include legal aspects (e.g., duties of platform owners to remove illegal content) and practical aspects such as filtering content for relevance and aiming to show users the most relevant content. Social reasons for content moderation may include the enforcement of user guidelines or the identification of content that is considered inappropriate or harmful.

Moderation takes many different forms. Content may be signaled or highlighted to inform users about issues with the content; it can be practically hidden from users by downranking, thus decreasing the likelihood that the content will be viewed, or it can be entirely deleted. Content can also be delayed in publication and reported to boards or authorities. There are great differences in the extent to which users are informed about their content being flagged, and there is only limited access to the current practices of many of today’s social networks. Algorithmic content moderation is often a response to legal requirements. Lawmakers may not prescribe algorithmic deletion directly but prescribe short time limits or severe fines, so that algorithms are the only practically viable approach given the large volumes of data and the high frequency of user interaction.

Challenges arise from the difficulty of interpreting human language automatically, especially regarding context and nuances of expression. In addition, the language technologies and training data available for rarer languages are less advanced than those for English. Potential pitfalls include limitations of freedom of expression, bias and unfair treatment, and political influence exerted through the silencing of dissent or tendentious moderation. It is possible to perform algorithmic moderation with societally beneficial intent. This includes the abovementioned example of de-escalation, encouraging factual and constructive online debates (Kolhatkar & Taboada, 2017; Park et al., 2016), and pursuing other moderation objectives, such as making the voices of systematically underrepresented groups better heard.

The advent of generative artificial intelligence and large language models is generally considered a game changer in text-based AI and is expected to trigger a plethora of new applications and systems. Most of the challenges described in this chapter also apply to AI text generators, as they pose similar questions of classifying text as harmful or illegal, of the use and abuse of generated text for political propaganda, of facilitating the creation of text for children, and many more. Similarly, issues of discrimination, bias, and transparency also apply to large language models and require further research, social debate, and political agreement and intervention.

Discussion Questions for Students and Their Teachers

  1. What are the differences between illegal and harmful content? Discuss how these issues were dealt with in traditional, non-digital publishing. As an example, consider advertising in traditional mass media.

  2. What constitutes a good online debate? Should all online discussion only be factual and all commentary be friendly and respectful, or is it sometimes necessary to simplify and use strong words?

  3. What can be done to make a debate constructive? Consider techniques used in real-world discussions, and then think about which mechanisms can be transferred to the virtual world.

  4. What are the main democratic threats emerging from algorithmic content moderation? Consider who has the power of running social network infrastructure and who is in charge of shaping online discourse. What can be done to balance the power to foster democratic principles?

Learning Resources for Students

  1. For a survey and a taxonomy of different approaches to text classification, see Li et al. (2022). It also includes benchmarks and a comparison of different approaches.

  2. N. Persily and J.A. Tucker’s (2020) book “Social Media and Democracy” provides a comprehensive account of social media, content moderation, and the challenges for democracy.

  3. There is an online summary video available for the ACM opinion piece referenced above (Prem, 2022): https://youtu.be/SjAH2HYKEhM. It illustrates the problem of illegal and harmful content and questions regarding freedom of expression.

  4. Á. Díaz and L. Hecht-Felella (2021) provide a detailed discussion of many issues listed here and recommendations from a more legal perspective. The paper includes a critical perspective on content moderation regarding the representation of viewpoints from minorities.

  5. The so-called Teachable Machine (https://teachablemachine.withgoogle.com/) is a web-based tool for creating your own machine learning classification models. Its graphical user interface (GUI) allows you to train classifiers without prior programming knowledge and without special expertise in machine learning. A short overview is given in “Teachable Machine: Approachable Web-Based Tool for Exploring Machine Learning Classification” by Carney et al. (2020).