Devulgarization of Polish Texts
DEPOT is a text style transfer framework that replaces vulgar expressions in Polish utterances with non-vulgar equivalents while preserving the main characteristics of the text. The framework comprises three pre-trained language models (GPT-2, GPT-3 and T5), each trained on a newly created parallel corpus of sentences containing vulgar expressions paired with their non-vulgar equivalents. The resulting models are evaluated on style transfer accuracy, content preservation and language quality.
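As a rough illustration of two of the evaluation criteria named above, the sketch below computes toy versions of style transfer accuracy and content preservation. It is not the framework's actual evaluation code: the lexicon `VULGAR_LEXICON`, the helper names and the Jaccard-overlap measure are all assumptions made for this example, and language quality (which would usually need a separate language model) is left out.

```python
import re

# Hypothetical stand-in lexicon with mild expletives, used only for illustration;
# how DEPOT actually detects vulgar expressions is not described here.
VULGAR_LEXICON = {"cholera", "cholerny"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def is_vulgar(text, lexicon=VULGAR_LEXICON):
    """Return True if any token of the text appears in the lexicon."""
    return any(token in lexicon for token in tokenize(text))


def style_transfer_accuracy(outputs, lexicon=VULGAR_LEXICON):
    """Share of model outputs that contain no lexicon entry (toy measure)."""
    if not outputs:
        return 0.0
    clean = sum(not is_vulgar(output, lexicon) for output in outputs)
    return clean / len(outputs)


def content_preservation(source, output, lexicon=VULGAR_LEXICON):
    """Jaccard overlap of the non-vulgar tokens of source and output (toy measure)."""
    src = set(tokenize(source)) - lexicon
    out = set(tokenize(output)) - lexicon
    if not src and not out:
        return 1.0
    return len(src & out) / len(src | out)


if __name__ == "__main__":
    # Made-up (source, output) pairs; the second output was left unchanged on purpose.
    pairs = [
        ("Ten cholerny autobus znowu się spóźnił",
         "Ten okropny autobus znowu się spóźnił"),
        ("Cholera, zapomniałem kluczy",
         "Cholera, zapomniałem kluczy"),
    ]
    outputs = [output for _, output in pairs]
    print("style transfer accuracy:", style_transfer_accuracy(outputs))
    for source, output in pairs:
        print("content preservation:", round(content_preservation(source, output), 3))
```

A lexicon lookup misses inflected and obfuscated forms, which matters for a highly inflected language such as Polish, so a trained classifier would likely be a better fit in practice; the lookup is used here only to keep the example self-contained.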
Download
- ...
Licence
CC-BY 4.0
Publications
Klamra C., Wojdyga G., Żurowski S., Rosalska P., Kozłowska M., Ogrodniczuk M. Devulgarization of Polish Texts Using Pre-trained Language Models (in preparation).