A comparison of classification models to detect cyberbullying in the Peruvian Spanish language on Twitter

File
Cuzcano_Chavez_Ximena_Marianne.pdf
(application/pdf: 312.0Kb)
Date
2020
Author(s)
Cuzcano Chavez, Ximena Marianne
Abstract
Cyberbullying is a social problem in which bullies’
actions are more harmful than in traditional forms of bullying as
they have the power to repeatedly humiliate the victim in front of
an entire community through social media. Nowadays, multiple
works aim at detecting acts of cyberbullying via the analysis of
texts in social media publications written in one or more
languages; however, few investigations target cyberbullying
detection in the Spanish language. In this work, we aim to
compare the performance of four traditional supervised machine
learning methods in detecting cyberbullying by identifying four
cyberbullying-related categories in Twitter posts written in
Peruvian Spanish. Specifically, we trained and tested Naive
Bayes, Multinomial Logistic Regression, Support Vector Machine,
and Random Forest classifiers on a dataset manually annotated
with the help of human participants.
The results indicate that the best performing classifier for the
cyberbullying detection task was the Support Vector Machine
classifier.
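
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the use of scikit-learn, the TF-IDF features, the linear SVM kernel, the macro-averaged F1 metric, the function name compare_classifiers, and the toy texts and category names (cat_a to cat_d) are all assumptions made for illustration. The abstract does not state the features, hyperparameters, metric, or category names actually used.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def compare_classifiers(train_texts, train_labels, test_texts, test_labels):
    """Fit four classifiers on the same text features and score each one.

    Returns a dict mapping classifier name to macro-averaged F1 on the
    test split. Features and metric are assumptions, not the thesis setup.
    """
    models = {
        "naive_bayes": MultinomialNB(),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "svm": SVC(kernel="linear"),  # kernel choice is an assumption
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    scores = {}
    for name, classifier in models.items():
        # Each model gets its own TF-IDF vectorizer fitted on the training split.
        pipeline = make_pipeline(TfidfVectorizer(), classifier)
        pipeline.fit(train_texts, train_labels)
        predicted = pipeline.predict(test_texts)
        scores[name] = f1_score(
            test_labels, predicted, average="macro", zero_division=0
        )
    return scores


# Hypothetical placeholder data: synthetic tokens and made-up category names.
TRAIN_TEXTS = [
    "alfa alfa uno", "alfa dos alfa", "beta beta uno", "beta dos beta",
    "gamma gamma uno", "gamma dos gamma", "delta delta uno", "delta dos delta",
]
TRAIN_LABELS = [
    "cat_a", "cat_a", "cat_b", "cat_b", "cat_c", "cat_c", "cat_d", "cat_d",
]
TEST_TEXTS = ["alfa tres", "beta tres", "gamma tres", "delta tres"]
TEST_LABELS = ["cat_a", "cat_b", "cat_c", "cat_d"]

if __name__ == "__main__":
    results = compare_classifiers(TRAIN_TEXTS, TRAIN_LABELS, TEST_TEXTS, TEST_LABELS)
    for name, score in sorted(results.items(), key=lambda item: -item[1]):
        print(f"{name}: macro F1 = {score:.2f}")
```

Running every model through the same pipeline and the same split keeps the comparison fair, since the only thing that varies is the classifier. With real data, the toy lists would be replaced by the annotated tweets and their category labels.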
How to cite
Cuzcano Chavez, X. M. (2020). A comparison of classification models to detect cyberbullying in the peruvian spanish language on Twitter [Tesis para optar el Título Profesional de Ingeniero de Sistemas, Universidad de Lima]. Repositorio institucional de la Universidad de Lima. https://hdl.handle.net/20.500.12724/12718
Publisher
Universidad de Lima
Category / Subcategory
Systems engineering / Design and methods
Collections
- Theses