Toxic Comment Filter Bot

Oct 23, 2022

This project was carried out as part of the TechLabs “Digital Shaper Program” in Aachen (Summer Term 2022)

Introduction

With increased access to the internet, communication has crossed all barriers. Today, people can share their thoughts with the entire world through social media, whenever and wherever they like. These comments reach not only people who agree with them but also people who disagree, which gives rise to open debates. Because the direct interaction is with a computer or smartphone rather than with another person, people tend to forget proper behavior and post abusive, toxic comments. Some also misuse the technology by creating fake accounts to spread hate and negativity. As the saying goes, a diamond cuts a diamond: we want to use technology to reduce the abuse of technology.

This project is about creating a Toxic Comment Filter Bot using Data Science and Machine Learning methods, specifically Natural Language Processing (NLP). NLP enables computers to process human language in the form of text or voice data and to ‘understand’ its full meaning, complete with the speaker or writer’s intent and sentiment [1]. The purpose of this bot is to identify threats and abusive comments and to assign them to six categories according to the type and severity of toxicity. In the future, we want the bot to be usable as an extension for any social media platform to filter out or hide toxic comments.

Method

Classifying inappropriate comments can therefore contribute to a user-friendly social platform. The data comes from a Kaggle competition and consists of around 160,000 Wikipedia comments that human raters labeled with six classes: toxic, severe toxic, obscene, threat, insult, and identity hate. The picture below shows the Pearson correlation between the classes. Interestingly, “toxic” correlates less with “severe toxic” than with “obscene” or “insult”. The higher correlation between “toxic” and “obscene” can be explained by the fact that many comments carry both labels. Furthermore, “threat” correlates less with the other classes than they do with each other, as shown in the image below.
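As a rough illustration of how such a label correlation can be computed, here is a minimal sketch in plain Python. The label values are made up for the example and are not taken from the Kaggle data, and the helper name pearson is our own.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / sqrt(var_x * var_y)

# Toy label columns (hypothetical values): one entry per comment, 1 = label set.
labels = {
    "toxic":   [1, 1, 1, 0, 0, 1, 0, 0],
    "obscene": [1, 1, 0, 0, 0, 1, 0, 0],
    "threat":  [0, 0, 1, 0, 0, 0, 0, 0],
}

# Pairwise correlation matrix as a nested dict.
corr = {a: {b: pearson(labels[a], labels[b]) for b in labels} for a in labels}
for a in labels:
    print(a, [round(corr[a][b], 2) for b in labels])
```

In this toy data, “toxic” and “obscene” often occur together, so their correlation is higher than that of “toxic” and the rare “threat” label.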

Moreover, the labels are unevenly distributed: “toxic” is the most frequent label with around 15,000 entries, and “threat” is the rarest with only around 1,000 entries. This imbalance has to be taken into consideration before the classification step because it influences how well the classification model performs. Options such as upsampling, downsampling, or data augmentation could be used to reduce it.
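To make the upsampling option concrete, here is a small sketch of random upsampling for one rare label. It is not the method used in the project. The function name upsample, the row layout, and the counts are assumptions chosen for illustration.

```python
import random

def upsample(rows, label, target_count, seed=0):
    """Randomly duplicate rows carrying `label` until target_count of them exist."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positives = [r for r in rows if r[label] == 1]
    if not positives or len(positives) >= target_count:
        return list(rows)
    extra = [rng.choice(positives) for _ in range(target_count - len(positives))]
    return list(rows) + extra

# Toy dataset (hypothetical): 100 comments, only one of them is a threat.
rows = (
    [{"text": f"toxic comment {i}", "toxic": 1, "threat": 0} for i in range(15)]
    + [{"text": "threat comment", "toxic": 1, "threat": 1}]
    + [{"text": f"clean comment {i}", "toxic": 0, "threat": 0} for i in range(84)]
)

balanced = upsample(rows, "threat", target_count=16)
print("threat:", sum(r["threat"] for r in rows), "->", sum(r["threat"] for r in balanced))
print("toxic: ", sum(r["toxic"] for r in rows), "->", sum(r["toxic"] for r in balanced))
```

Because a comment can carry several labels, duplicating “threat” rows also increases the count of every other label those rows carry (here “toxic”), so the ratios should be checked again after resampling.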

Data preprocessing

Data preprocessing and cleaning is divided into several small steps. First, the sentence is lowercased, so “ She eats Äpple and orange!” becomes “ she eats äpple and orange!”. Next, unnecessary symbols and characters such as ! or @ are removed, giving “she eats äpple and orange”. A list of non-ASCII symbols and regular expressions (RegEx) are used to make sure all such symbols are removed. Accented characters, which are common in German for example, are replaced by their unaccented form, so äpfel becomes apfel. In this English dataset, accented words are uncommon and such anomalies are usually just misspellings. Stopwords like “and” or “or” are then removed because they carry little additional information. NLTK is used as the library for stopwords and other text processing, so we do not need to list every stopword ourselves. Unlike many other NLP data cleaning pipelines, we keep negation words like not, hadn’t, or shouldn’t, because they are strongly associated with toxic comments, and “not” is also the most common word in the dataset. Unnecessary white space is deleted as well. Lastly, every word is lemmatized, which means it is returned to its base form: eats becomes eat, and talking becomes talk. At the end of the entire preprocessing, the example sentence reads “she eat apple orange”.
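The steps above can be sketched in a few lines of Python. This is a simplified illustration, not the project code: to stay self-contained it uses tiny hand-written stand-ins (STOPWORDS, NEGATIONS, LEMMAS, all hypothetical) in place of the NLTK stopword list and lemmatizer, and it strips accents before removing symbols so that accented letters are not deleted by the regular expression.

```python
import re
import unicodedata

# Tiny stand-ins for the NLTK resources (assumptions for illustration only).
STOPWORDS = {"and", "or", "the", "a", "an", "is", "not", "hadn't", "shouldn't"}
NEGATIONS = {"not", "hadn't", "shouldn't"}  # negation words are kept on purpose
LEMMAS = {"eats": "eat", "talking": "talk"}

def strip_accents(text):
    """Replace accented characters by their base letter, e.g. 'ä' -> 'a'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def preprocess(text):
    text = text.lower()                        # lowercase
    text = text.replace("’", "'")              # unify apostrophes for negations
    text = strip_accents(text)                 # remove accents
    text = re.sub(r"[^a-z0-9'\s]", " ", text)  # drop symbols such as ! or @
    tokens = text.split()                      # split() also collapses white space
    tokens = [t for t in tokens if t not in STOPWORDS or t in NEGATIONS]
    tokens = [LEMMAS.get(t, t) for t in tokens]  # toy lemmatization
    return " ".join(tokens)

print(preprocess(" She eats Äpple and orange!"))
```

With these stand-ins, the example sentence from the text comes out as “she eat apple orange”. With the full NLTK stopword list the result may differ, since that list contains many more words than this toy set.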

Classification Model

In this project, a long short-term memory (LSTM) network is used to classify the multi-label dataset. LSTM is a type of recurrent neural network (RNN). It retains information over longer sequences better than a traditional RNN because each cell has gates (input, forget, and output) that control what is stored, discarded, and passed on, which mitigates the vanishing gradient problem. For NLP, LSTM also has an advantage over traditional machine learning models such as random forest, which are typically fed word-count features that ignore word order: an LSTM reads the whole sentence as an ordered sequence of words. Therefore, the LSTM layer from the Keras library is chosen for this work, with a sigmoid activation in the last layer and binary cross-entropy as the loss, so that each of the six labels is predicted independently. Before it is passed to the model, each comment is tokenized and limited to a maximum of 200 words. Tokenization [2] here means splitting the text into units called “tokens” (for example, words) and mapping each token to an integer index.
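To illustrate the tokenization step, the following sketch maps words to integer indices and pads or truncates every comment to 200 tokens. It is written in plain Python rather than with the Keras utilities. The function names, the vocabulary size, the reserved indices 0 and 1, and the choice to pad at the end are all assumptions made for this example.

```python
from collections import Counter

MAX_LEN = 200  # maximum number of tokens per comment, as stated in the text

def build_vocab(texts, max_words=20000):
    """Map the most frequent words to integer indices starting at 2."""
    counts = Counter(word for text in texts for word in text.split())
    # Index 0 is reserved for padding, index 1 for out-of-vocabulary words.
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}

def encode(text, vocab, max_len=MAX_LEN):
    """Turn a cleaned comment into a fixed-length list of token indices."""
    ids = [vocab.get(word, 1) for word in text.split()][:max_len]  # truncate
    return ids + [0] * (max_len - len(ids))                        # pad at the end

texts = ["she eat apple orange", "not eat apple"]
vocab = build_vocab(texts)
encoded = encode("she eat banana", vocab)
print(encoded[:5], len(encoded))
```

Every encoded comment has the same length, which is what allows a batch of comments to be fed to the network as one array. The unknown word “banana” is mapped to the out-of-vocabulary index.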

Conclusion

The LSTM model is trained for 25 epochs to see whether it overfits. Training took around 1 hour on an Intel i5 CPU. The highest accuracy, 99%, is reached after 2 epochs; beyond that, the model started to overfit. The output of the model is a score from 0 to 100% for each label, because a comment can carry multiple labels.

However, the self-developed spell-check functions used in the project take an extremely long time to execute. A faster spell-check method could improve both the accuracy and the efficiency of the model.

Moreover, as the class distribution shows, the number of entries per class is still imbalanced, i.e., there are very few ‘interesting’ events (toxic/hate comments in our case). Most ML algorithms do not work well with imbalanced datasets. Techniques (or combinations of them) that can be tried to handle imbalanced datasets include choosing an appropriate evaluation metric, resampling the training set, resampling with different ratios, clustering the abundant class, and various data augmentation techniques. Unlike in Computer Vision, where transformations are applied on the fly using data generators, data augmentation in NLP has to be done carefully because of the grammatical structure of the text. Back translation, EDA (Easy Data Augmentation), NLP Albumentation, and NLP Aug are a few methods that can be used [3].

Future possibilities for the Toxic Comment Filter Bot include enabling it as an extension for social media platforms such as Twitter and Facebook. It could also be configured to work in real time to prevent people from posting toxic comments in the first place.
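The point about choosing an appropriate evaluation metric can be illustrated with a toy example for a single rare label. The numbers below are invented for the illustration and say nothing about the actual model. They only show how accuracy can look excellent on imbalanced data while recall is poor, and how the decision threshold on the per-label score changes the picture.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for one binary label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true),
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical data: 1000 comments, only 10 of them are threats.
y_true = [1] * 10 + [0] * 990
# Hypothetical per-comment scores for the "threat" label (sigmoid outputs).
scores = [0.4] * 10 + [0.01] * 990

for threshold in (0.5, 0.3):
    y_pred = [1 if s >= threshold else 0 for s in scores]
    print(threshold, binary_metrics(y_true, y_pred))

strict = binary_metrics(y_true, [1 if s >= 0.5 else 0 for s in scores])
lenient = binary_metrics(y_true, [1 if s >= 0.3 else 0 for s in scores])
```

At a threshold of 0.5, no threat is detected, yet accuracy is still 99% because 990 of the 1000 comments are correctly left unflagged. Precision, recall, and F1 per label expose this, which is why they are usually more informative than accuracy for rare classes.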

https://www.youtube.com/watch?v=h0jiea1Cot0&list=PLNavB9lBytK5JArvDT-bWuRMRz6yk3nFr&index=4

TechLabs Aachen e.V. reserves the right not to be responsible for the topicality, correctness, completeness or quality of the information provided. All references are made to the best of the authors’ knowledge and belief. If, contrary to expectation, a violation of copyright law should occur, please contact journey.ac@techlabs.org so that the corresponding item can be removed. Any videos or media content are only added after proper consent/request from the authors.