Hate Speech on Social Media: A Human-Machine Policy Regulatory Model

By: Omri Abend, Tehilla Shwartz Altshuler and Yuval Shany

In a previous policy research study, carried out at the Israel Democracy Institute, we created a model that provides scales for assessing online hate speech and guidelines to help online platforms define their preferred policies for combating it. This is a co-regulatory model comprising two main components: (a) six common criteria for identifying and ranking hate speech; and (b) a detailed procedure for applying rules against hate speech.

Our criteria identify factors that map online expressions to definitions of hate speech accepted by most countries and most major social media platforms. For each criterion, our model provides a scale of severity, enabling managers of online service providers (OSPs) to create and implement a range of steps against transgressors, from lenient to strict.
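To make the scale-based approach concrete, the following is a minimal sketch of how per-criterion severity scores could be aggregated into a graduated response. The criterion names, score ranges, and thresholds are illustrative assumptions and are not taken from the study itself.

```python
# Illustrative sketch only: hypothetical criteria, scales, and thresholds,
# not the criteria or procedure defined in the underlying policy study.

from dataclasses import dataclass

# Hypothetical criteria; each is scored on a 0 (absent) to 3 (severe) scale.
CRITERIA = [
    "targets_protected_group",
    "calls_for_violence",
    "dehumanizing_language",
    "speaker_reach",
    "context_of_incitement",
    "repetition_by_user",
]

@dataclass
class Assessment:
    scores: dict  # criterion name -> severity score (0-3)

    def total_severity(self) -> int:
        return sum(self.scores.get(c, 0) for c in CRITERIA)

def recommended_response(assessment: Assessment) -> str:
    """Map aggregate severity to a graduated response (illustrative thresholds)."""
    total = assessment.total_severity()
    if total >= 14:
        return "remove content and suspend account"
    if total >= 9:
        return "remove content and issue warning"
    if total >= 5:
        return "reduce distribution and add warning label"
    if total >= 2:
        return "flag for human moderator review"
    return "no action"

if __name__ == "__main__":
    post = Assessment(scores={
        "targets_protected_group": 3,
        "calls_for_violence": 2,
        "dehumanizing_language": 2,
        "speaker_reach": 1,
        "context_of_incitement": 1,
        "repetition_by_user": 0,
    })
    print(recommended_response(post))  # -> "remove content and issue warning"
```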

The research goals are to validate and improve the scale-based model developed by Altshuler and Medzini (2018) for responding to hate speech on social media, by: (1) conducting a qualitative assessment of the model against content that was removed or marked as hate speech by Facebook's human content moderators; (2) developing a machine learning model for classifying text as hate speech, and evaluating its performance and limitations; and (3) creating a policy model with up-to-date recommendations for combining human and machine moderation to deal with hate speech on social media platforms.
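The proposal leaves the classifier architecture for goal (2) open. As a point of reference only, a common supervised baseline for this kind of text classification, sketched here with TF-IDF features and logistic regression (the `train_baseline` helper and its inputs are hypothetical), might look as follows:

```python
# A generic baseline sketch for hate speech classification, not the model
# proposed in the study; the caller supplies labeled texts (labels: 0/1).

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

def train_baseline(texts, labels):
    """Train a simple hate-speech classifier and report precision/recall."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, random_state=42, stratify=labels
    )
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
        ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
    ])
    pipeline.fit(X_train, y_train)
    print(classification_report(y_test, pipeline.predict(X_test)))
    return pipeline
```

Reporting per-class precision and recall, as above, is one straightforward way to surface the performance and limitations that goal (2) calls for evaluating.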

Our goals will be best achieved if we can get access to a Facebook dataset of content that the company has marked or removed as hate speech, together with the specific guidelines it uses for classifying such content. We believe that this cooperation would provide the precise picture that is sorely needed for understanding current needs and capabilities in dealing with hate speech, and that such a contribution from Facebook is no less important than financial support. Alternatively, we would rely on manual and automated searches for harmful content.

In this way, the study will result in recommendations for a more precise model, relating to: (1) the substantive definitions of hate speech on social media, which will be both practically applicable and consistent with constitutional values; and (2) the balance between human content moderation and machine content moderation for dealing with hate speech on social media platforms.
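One way such a human/machine balance is often operationalized, offered here only as a sketch and not as the model the study will produce, is confidence-threshold triage: the machine handles clear-cut cases and routes uncertain ones to human moderators. The threshold values and decision labels below are illustrative assumptions.

```python
# Illustrative triage sketch: machine decisions at high confidence,
# human review in the uncertain middle band. Thresholds are assumptions.

def route_decision(machine_probability: float,
                   auto_remove_threshold: float = 0.95,
                   auto_keep_threshold: float = 0.05) -> str:
    """Route a post based on the machine classifier's hate-speech probability."""
    if machine_probability >= auto_remove_threshold:
        return "auto-remove (machine)"       # high-confidence hate speech
    if machine_probability <= auto_keep_threshold:
        return "keep (machine)"              # high-confidence benign content
    return "escalate to human moderator"     # uncertain: needs human judgment

if __name__ == "__main__":
    for p in (0.99, 0.50, 0.01):
        print(p, "->", route_decision(p))
```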