Analyzing documents with text representations

August 07, 2024

Representing texts as vectors is a versatile technique with applications beyond text categorization. This project aims to analyze documents, e.g., books taken from Project Gutenberg, using text representations derived from sentiment-analysis tasks such as polarity, emotion, sarcasm, and toxicity detection, among others.

The idea is to develop a tool that uses visualization techniques and clustering algorithms to highlight similarities between documents, even when they deal with different subjects.
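As a rough illustration of the clustering step, the sketch below groups documents whose sentiment-based vectors point in similar directions. It is only a minimal example under stated assumptions: the document names, the four score dimensions, the score values, the 0.9 threshold, and the helper functions `cosine` and `cluster` are all hypothetical, and the project itself may rely on different representations and clustering algorithms.

```python
from math import sqrt

# Hypothetical per-document scores in the order
# (polarity, emotion, sarcasm, toxicity). In practice these values
# would come from trained text-representation models.
docs = {
    "novel_a": [0.8, 0.6, 0.1, 0.0],
    "essay_b": [0.7, 0.5, 0.2, 0.1],
    "satire_c": [0.1, 0.3, 0.9, 0.4],
    "pamphlet_d": [0.2, 0.2, 0.8, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.9):
    """Single-linkage grouping: two documents end up in the same cluster
    when a chain of pairs with similarity >= threshold connects them."""
    names = list(vectors)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


clusters = cluster(docs)
print(clusters)
```

With these made-up scores, the two documents with high polarity and emotion fall into one group and the two with high sarcasm and toxicity into another, regardless of what subject each document covers.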

Mario Graff
