Differences between revisions 1 and 2

Devulgarization of Polish Texts

DEPOT is a text style transfer framework for replacing vulgar expressions in Polish utterances with their non-vulgar equivalents while preserving the main characteristics of the text. The framework contains three pre-trained language models (GPT-2, GPT-3 and T5) trained on a newly created parallel corpus of sentences containing vulgar expressions and their equivalents. The resulting models are evaluated on style transfer accuracy, content preservation and language quality.

Download

Licence

CC-BY 4.0

Publications

Klamra C., Wojdyga G., Żurowski S., Rosalska P., Kozłowska M., Ogrodniczuk M. Devulgarization of Polish Texts Using Pre-trained Language Models (in preparation).

- Revision 1 as of 2022-01-19 13:27:51
  Size: 294
  Editor: MaciejOgrodniczuk
  Comment:
+ Revision 2 as of 2022-01-19 13:29:49
  Size: 778
  Editor: MaciejOgrodniczuk
  Comment:
 Line 3:
-...
+DEPOT is a text style transfer framework for replacing vulgar expressions in Polish utterances with their non-vulgar equivalents while preserving the main characteristics of the text. The framework contains three pre-trained language models (GPT-2, GPT-3 and T5) trained on a newly created parallel corpus of sentences containing vulgar expressions and their equivalents. The resulting models are evaluated on style transfer accuracy, content preservation and language quality.
