Objectives & Highlights

• Taking an existing pre-trained language model and understanding its output - here I use PolBERTa, trained for the Polish language.
• Building a custom classification head on top of the LM (first sketch below).
• Using fast tokenizers to efficiently tokenize and pad input text, as well as to prepare attention masks (second sketch below).
• Preparing reproducible training code with PyTorch Lightning (third sketch below).
• Finding a good starting learning rate for the model.
• Validating the trained model on the PolEmo 2.0 dataset (a 4-class benchmark for Polish sentiment analysis).
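To make the classification-head idea concrete, here is a minimal sketch. The `MODEL_NAME` placeholder, the two-layer head, and the dropout value are my illustrative assumptions, not necessarily the exact architecture used in this project; any PolBERTa checkpoint loadable through `AutoModel` should slot in.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

MODEL_NAME = "your-polberta-checkpoint"  # placeholder - substitute the real checkpoint id

class SentimentClassifier(nn.Module):
    """RoBERTa-style encoder plus a small classification head (illustrative sketch)."""

    def __init__(self, model_name: str = MODEL_NAME, num_classes: int = 4, dropout: float = 0.1):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden_size = self.encoder.config.hidden_size
        # Two-layer head over the first token's hidden state; an assumed design,
        # not necessarily the head used in the project.
        self.head = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Dropout(dropout),
            nn.Linear(hidden_size, num_classes),  # PolEmo 2.0 has 4 sentiment classes
        )

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls_state = outputs[0][:, 0]  # hidden state of the first (<s>/[CLS]) token
        return self.head(cls_state)
```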
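The fast-tokenizer step could look roughly like this. The Polish sentences are made-up examples, and the direct `tokenizer(...)` call assumes a reasonably recent `transformers` release (older versions expose the same options via `batch_encode_plus`).

```python
from transformers import AutoTokenizer

MODEL_NAME = "your-polberta-checkpoint"  # placeholder - same checkpoint as the model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)

batch = tokenizer(
    ["Obsługa była bardzo miła.", "Nie polecam tego hotelu."],  # made-up example sentences
    padding=True,          # pad the batch up to its longest sequence
    truncation=True,
    max_length=128,
    return_tensors="pt",   # PyTorch tensors; attention_mask is included
)
print(batch["input_ids"].shape, batch["attention_mask"].shape)
```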
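Finally, a rough shape of the PyTorch Lightning side, covering seeding for reproducibility and the built-in learning-rate finder. The module and hyperparameters are my assumptions, and the tuner API shown in the comments is the Lightning 1.x style, which varies between versions.

```python
import pytorch_lightning as pl
import torch
import torch.nn.functional as F

class LitSentiment(pl.LightningModule):
    def __init__(self, model: torch.nn.Module, learning_rate: float = 2e-5):
        super().__init__()
        self.model = model
        self.learning_rate = learning_rate  # the attribute the LR finder overwrites

    def training_step(self, batch, batch_idx):
        logits = self.model(batch["input_ids"], batch["attention_mask"])
        loss = F.cross_entropy(logits, batch["labels"])
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        logits = self.model(batch["input_ids"], batch["attention_mask"])
        self.log("val_loss", F.cross_entropy(logits, batch["labels"]))
        self.log("val_acc", (logits.argmax(dim=-1) == batch["labels"]).float().mean())

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.learning_rate)

pl.seed_everything(42)  # fix RNG seeds for reproducible runs

# Learning-rate range test before full training (Lightning 1.x style API):
# trainer = pl.Trainer(auto_lr_find=True, deterministic=True)
# trainer.tune(lit_model, train_dataloaders=train_loader)
```

The LR finder sweeps increasing learning rates over a short run and suggests a starting value, which is the "finding a good starting learning rate" step from the list above.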
