Sentiment Analysis with Transformers

This post presents SentimentAnalysisTransformersApp, an app created for a public outreach activity organized by the Chair for Dynamics, Control, Machine Learning and Numerics – Alexander von Humboldt Professorship (FAU DCN-AvH) during the Lange Nacht der Wissenschaften 2025 (Long Night of Sciences 2025).

Live demo

https://albertalcalde.github.io/SentimentAnalysisTransformersApp/

Overview

This interactive app lets you type a sentence and see:
– Whether your sentence expresses a positive or negative sentiment.
– A 2D visualization of how a Transformer model processes and transforms your words layer by layer.

Why it’s Interesting

Transformer models understand text by gradually updating internal word representations through multiple layers. Each layer refines meaning and relationships, but that process is normally invisible to us.

This app makes those hidden transformations visible: as the layers advance, you can watch how your words become points on a plane, clustering and shifting as the model “understands” your sentence.

How it Works

1. Your input text is sent to a pretrained Transformer model (English or German).
2. The model outputs a series of hidden states, one for each internal layer (sketched in code after this list).
3. These high-dimensional vectors are projected into 2D space for visualization.
4. The app animates how these points move from the input layer to the final output layer, where the sentiment prediction is determined by the position of the yellow [CLS] point.
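
To make step 2 concrete, here is a minimal offline sketch in Python using the Hugging Face transformers library. The example sentence and variable names are our own illustration; the app itself obtains the same tensors from its in-browser JavaScript runtime.

```python
# Minimal sketch of steps 1-2 for the English model (illustrative only;
# the app's own code is JavaScript and runs in the browser).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)  # eval mode

inputs = tokenizer("What a wonderful evening!", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

print(out.logits)                  # final sentiment logits for the sentence
print(len(out.hidden_states))      # 7: embedding output + 6 Transformer blocks
print(out.hidden_states[0].shape)  # (1, number_of_tokens, 768)
```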

Navigating the Interface

When you open the app, you’ll see an interface with the following elements:
Model selector: lets you choose between a model trained for English or German.
Text box: type your sentence or modify the default example.
Analyze: runs the model and displays the initial representation and predicted sentiment.
Play: animates the transition through all Transformer layers.
Clear: resets the text and visualization.
Layer slider: manually explore how the hidden states evolve from the first to the last layer.

Example Visualizations

Below you can see how the model processes a sample sentence in English and German, from the input layer to the output layer.

English Model

[Figure: layer-by-layer visualization of the English model processing a sample sentence, from the input layer to the output layer.]

German Model

[Figure: layer-by-layer visualization of the German model processing a sample sentence, from the input layer to the output layer.]

Try It Yourself

1. Visit: https://albertalcalde.github.io/SentimentAnalysisTransformersApp/
2. Type a few sentences, from movie reviews to random thoughts, and watch how the model predicts the sentiment of each one!

Technical Details

English Model
– The app uses DistilBERT, a smaller version of BERT, fine-tuned for English sentiment analysis (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
– This model predicts two sentiment classes: positive and negative.
– For visualization, we directly use the output of the classification head, which maps each token’s 768-dimensional hidden state to the two sentiment logits (positive and negative).
– These two logits naturally form a 2D coordinate system, so each token appears as a point in this plane.
– As we move through the six Transformer blocks of DistilBERT, the positions of the tokens evolve, revealing how the model progressively organizes information leading to its final sentiment prediction (a code sketch of this projection follows the list).
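
Continuing the Python sketch above, the projection can be written as follows. Here project_layer is our own illustrative helper, not part of the app; it simply reuses the checkpoint's pre_classifier, ReLU, and classifier layers.

```python
# Reuse DistilBERT's own classification head (pre_classifier -> ReLU ->
# classifier) as a 2D projection for every token at every layer.
# Continues the sketch above; project_layer is our own helper.
def project_layer(hidden, model):
    """Map (num_tokens, 768) hidden states to (num_tokens, 2) logits."""
    return model.classifier(torch.relu(model.pre_classifier(hidden)))

with torch.no_grad():
    points_per_layer = [project_layer(h[0], model) for h in out.hidden_states]

# For this checkpoint, column 0 is the NEGATIVE logit and column 1 the
# POSITIVE logit, so each row is one token's (x, y) position in the plane.
# In eval mode the head's dropout is inactive, so the [CLS] token (row 0)
# of the last layer reproduces the model's own prediction:
print(points_per_layer[-1][0])  # matches out.logits[0]
```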

German Model
– The German visualization uses BERT, fine-tuned for German sentiment analysis (https://huggingface.co/oliverguhr/german-sentiment-bert).
– This model predicts three sentiment categories: positive, negative and neutral.
– To maintain a consistent 2D visualization, we omit the neutral logit and project only the positive and negative logits for each token.
– This allows us to visualize the model’s internal reasoning along the same positive–negative axis as in the English case.
– Each token’s position evolves across the twelve Transformer blocks of BERT, illustrating how sentiment information emerges layer by layer (see the sketch after this list).
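
A hedged sketch of the German case, analogous to the English one. The example sentence is ours, and we assume the checkpoint's config exposes the human-readable labels positive, negative, and neutral, which the code looks up rather than hard-codes.

```python
# Sketch for the three-class German model: drop the neutral logit so
# tokens live in the same positive-negative plane as the English case.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "oliverguhr/german-sentiment-bert"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("Was für ein wunderbarer Abend!", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

print(len(out.hidden_states))  # 13: embedding output + 12 Transformer blocks

# Read the label order from the config instead of hard-coding it.
label2id = {label.lower(): i for i, label in model.config.id2label.items()}
pos, neg = label2id["positive"], label2id["negative"]

point_2d = out.logits[0][[pos, neg]]  # keep (positive, negative); omit neutral
print(point_2d)
```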

Implementation Notes

– All computations run directly in the browser using WebGPU, enabling smooth and responsive inference without any server-side processing.
– The visualization and UI are implemented with HTML, CSS, and vanilla JavaScript.

References

[1] Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
[2] Guhr, O., Schumann, A.-K., Bahrmann, F., & Böhme, H. J. (2020). Training a Broad-Coverage German Sentiment Classification Model for Dialog Systems. Proceedings of The 12th Language Resources and Evaluation Conference (LREC 2020), Marseille, France, pp. 1620–1625.

Acknowledgement
A. Alcalde acknowledges the funding by the European Union (Horizon Europe MSCA project DTN ModConFlex, grant No. 101073558).

You might like: Clustering in pure-attention hardmax transformers and its role in sentiment analysis
