Stefanie Ullmann, Research Associate with the Giving Voice to Digital Democracies research project, reports on the workshop ‘Understanding and automating counterspeech’, held in September 2021.

Hateful speech and misinformation continue to plague online discourse and communication. In a few countries, such as Germany and South Africa, official hate speech laws have been implemented, while others, like the United Kingdom, have released white papers calling for regulation and independent watchdogs. Meanwhile, social media companies still follow the same reactive approach to combating hateful speech and misinformation on their platforms. Once reported, content is moderated and, if it truly violates the company’s guidelines, taken down; the user may also be blocked. Naturally, this method has drawn criticism from libertarian voices who fear that one of our most basic human rights, freedom of speech, is at stake. It is thus not surprising that, in recent years, a Counterspeech approach has attracted the attention of researchers and experts from a wide variety of professional and academic fields.

While most likely having existed for much longer, the concept of Counterspeech can officially be traced back to the 1920s and to Louis D. Brandeis, lawyer and Associate Justice of the Supreme Court of the United States, who famously proclaimed, “If there be time to expose through discussion, the falsehoods and fallacies, to avert the evil by the processes of education, the remedy to be applied is more speech, not enforced silence.” More recently, the Dangerous Speech Project has defined Counterspeech as “any direct response to hateful or harmful speech which seeks to undermine it.” Research into the real-life application of Counterspeech on social media is still in its infancy, but initial results are promising. Studies suggest that Counterspeech can be successful, particularly in one-to-one conversations, and that it has positive effects on bystanders and silent followers of the discussion, decreasing the likelihood of others resorting to harmful language. The increasing number of studies on Counterspeech in recent years demonstrates great interest in the topic across a variety of fields. We therefore decided that this was a perfect time to host an interdisciplinary event on the subject, which was held on Wednesday, 29 September 2021. The workshop took place online and was split into four sessions.

The first session looked at Counterspeech from both sociological and practical perspectives. The expert panel for this session consisted of Amalia Álvarez-Benjumea (Max Planck Institute for Research on Collective Goods), Erin Saltman (Global Internet Forum to Counter Terrorism) and Sina Laubenstein (No Hate Speech Movement Germany). In her research, Álvarez-Benjumea has investigated the production of hateful speech and the role of social norms under controlled conditions (Álvarez-Benjumea & Winter 2018). Her findings show, for instance, that “[o]bserving counterspeech from previous users encourages new participants to speak against hate speech” in an online setting. She concludes that “hate speech propagates because it is regulated by social norms, not despite them.” Based on her experience of working at Facebook and her current role as Director of Programming at the GIFCT, Saltman shared with us her key methodological and best practice strategies for combating online extremism and hatred (Saltman, Kooti & Vockery 2021). She addressed two important distinctions: between behaviour and sentiment, i.e. changing someone else’s behaviour online versus actually changing someone’s mind, and between strategy types, i.e. prevention content versus strategic counter content. Practitioners and activists should first think about who they want to reach and then decide on the best strategy for their audience. Finally, Laubenstein shared with us further valuable insights into her work as a practitioner and activist countering online hate speech. In addition to knowing one’s audience, she pointed out the importance of considering one’s goal first. As a counterspeaker, is the goal to change someone’s mind and to convince them of another opinion? Or is the goal to demonstrate disagreement to the wider audience as well as support for the victim? The work of a counterspeaker is laborious: it requires patience, determination and sustained one-on-one conversations to convey the message successfully.

The focus of session two was on computational and natural language processing (NLP) approaches to Counterspeech, and we were joined by speakers Punyajoy Saha (Indian Institute of Technology Kharagpur) and Bertie Vidgen (Alan Turing Institute). Vidgen first talked us through current approaches to tackling online hate speech, which rely largely on actions performed by both human reviewers and automated decision-making systems. Although this is the most widely used approach, it is far from perfect and social media companies still face a great number of challenges. He also addressed options for increasing friction on platforms, which is, however, undesirable to the companies as it negatively impacts financial gains. Furthermore, Vidgen explained how we can use Artificial Intelligence (AI) to generate Counterspeech, which is what his work currently involves. On the basis of Paul Graham’s hierarchy of disagreement (2008), he and his team have trained an AI to devise desirable counter responses to hateful comments. Finally, Vidgen described their use of what is called an “adversarial approach” to machine learning, in which the model is deliberately deceived in order to check for potential pitfalls and weaknesses in the system (Vidgen et al. 2020). In his talk, Saha went into more detail about his work on automating Counterspeech, its challenges and opportunities. He pointed out that while Counterspeech does not violate freedom of speech and is flexible and responsive, there are still a few risks to consider. The task of countering hate is often left to the ones targeted. Moreover, it can be difficult to formulate a proper counter response, and speaking up may lead to further harassment. Saha further explained that from an NLP perspective, automating Counterspeech is mainly a response generation problem. He talked us through the process of collecting and annotating data, which is still largely done manually. His research also reveals that Counterspeech found online predominantly consists of hostile responses, which is undesirable and ineffective (Mathew et al. 2019). Saha also stressed that for a fully automated approach to Counterspeech generation, we need to be aware of the risk of biases inherent in training data. His research shows that the specific Counterspeech strategy is of great importance and varies among different communities. While in the case of the African-American community, counterspeakers tend to call out racism and warn of consequences, hatred targeted at the Jewish community leads to people expressing affiliation with the community. Finally, Saha spoke about future research and the plan to “[f]ocus on building tools for counter speakers along with performance guarantees of these generation models in a human-in-the-loop setting.”

Session three looked at Counterspeech through the lenses of philosophy and media communication with speakers Rae Langton (University of Cambridge), Lynne Tirrell (University of Connecticut) and Babak Bahador (George Washington University). Langton, who has argued in her work that we can make a speech act fail by blocking a presupposition (Langton 2018), emphasised that Counterspeech will take different forms depending on what kind of speech, and even specifically what kind of hate speech, it is opposed to. She distinguished between three types of hate speech: hate speech as an attack or incitement of violence, as a claim or form of propaganda and as an imperative (e.g., “keep out!”, “get out!”). Furthermore, Langton stressed that the authority of the speaker influences the effectiveness of hate speech. Tirrell described the purpose of Counterspeech as “taking away the license to repeat, reuse and carry forward the harm.” In her presentation, she referred to the sociology of discourse and its relevance to language. She mentioned Wittgenstein and his definition of “language games” as “consisting of language and the actions into which it is woven” (Wittgenstein 1953, §7; Tirrell 2012). Extending the concept of language games, Tirrell recalled David Lewis’ (1979) notion of scorekeeping, according to which a speaker commits themselves to and takes responsibility for a speech act. She emphasised the element of time and its role in counterspeech. The instantaneous nature of speech on the internet represents a challenge to acts of blocking and interfering before the harm is done. Tirrell also introduced an adaptation of the key journalistic questions (who, what, when, where, why, how?) to online resistance and counterspeech. Finally, Bahador focused on more extreme forms of hate speech such as dehumanising and demonising language as well as incitement to harm people. In his research, he looks specifically at the use of hate speech by powerful people and in political contexts. He has previously identified five factors that help disaggregate Counterspeech: audiences, goals, tactics, messaging and effects (Bahador 2021). Bahador emphasised the importance of the “herding effect” in broader online audiences. People who are potentially willing to step in are more likely to do so when observing action taken by others. He stressed how vital sentiment changes are in online discussions when particularly vulnerable and impressionable users are in the audience. The direction an online conversation takes is important as it is likely to influence them to either become perpetrators themselves or oppose online hate.

Last but not least, the fourth session brought together experts with backgrounds in anthropology, law, mathematics and computer science. The participants of this session were Cathy Buerger (Dangerous Speech Project), Nadine Strossen (New York Law School), Kenneth S. Stern (Bard Center for the Study of Hate) and Joshua Garland (Santa Fe Institute). In her talk, Buerger focused on the question of the effectiveness of Counterspeech. As an anthropologist, she emphasised her interest in the counterspeakers themselves and how working and interacting with them has challenged the relational understanding of hate and counterspeech. She shared with us insights into survey results and interviews with participants around the world (see also Buerger 2021). Buerger also argued in favour of potentially extending the current definition of Counterspeech to include more than “speech”, i.e. visual representations such as images and emojis. With her background in and knowledge of US and international law, Strossen associates herself with an even broader definition of not only Counterspeech but also other countermeasures that, according to her, must be taken into account, such as anti-discrimination laws against hate crimes. She mentioned three kinds of counterspeech: teaching habits of resilience as a form of proactive education; establishing an emotional bonding experience between speakers; and challenging our habits of thought, another proactive measure to help us overcome our cognitive biases. As a lawyer, Strossen also stressed one basic principle of international law: “you may only censor speech as a last resort, if there is no less restrictive alternative that could be as effective in countering the potential harm of the speech” (see also Strossen 2018). Stern continued this train of thought and emphasised a key question or goal we should focus on: “How do we cultivate an environment in which hateful speech is less likely to be uttered, less likely to be heard, less likely to be acted upon?” Rather than censor or only respond to harmful speech, we should work towards building a structure that helps everybody feel more welcome using free speech. Stern also shared with us valuable examples of real-world offline Counterspeech he has been doing to stand up against far-right groups in the U.S. (see also Stern 2020). Finally, Garland talked in more detail about his work in attempting to measure the effectiveness of Counterspeech online. He pointed out that achieving this is very challenging for numerous reasons, including that the understanding of effectiveness varies amongst people and that there is a lack of large-scale longitudinal studies of the discourse dynamics between hate and counterspeech. Garland and his colleagues are tackling those challenges, the latter in particular, by providing the first large-scale analysis of tens of millions of instances of hate and counter-hate speech on Twitter (Garland et al. 2021).

Altogether, this workshop turned out to be a truly successful event that provided a much-needed platform for experts and practitioners from diverse fields to come together and discuss Counterspeech. It made clear how important multidisciplinary exchange is to make real progress and change possible. We need philosophers, linguists, anthropologists, lawyers, practitioners and activists as well as mathematicians and computer scientists to work together on this pressing problem and the challenges of countering hateful and toxic speech, be it on- or offline.




