A Quantitative Analysis of Relations between Semantic Fields in the Slovenian Narrative Prose of the Long Nineteenth Century

Authors

  • Lucija Mandić

DOI:

https://doi.org/10.3986/pkn.v47.i2.08

Keywords:

digital literary studies, Slovenian narrative prose, Prešernian structure, Slovenian cultural syndrome, semantic analysis, word embeddings

Abstract

The article analyzes the relationships between semantic fields in Slovenian narrative prose of the long nineteenth century using word embeddings. The Corpus of Longer Narrative Slovenian Prose (KDSP 1.0) was analyzed with the Word2Vec method in the Python programming language. For the purposes of the analysis, semantic fields were constructed for four social institutions: economy, politics, culture, and the household. The word set for each semantic field was obtained by identifying the 50 words whose vectors have the highest cosine similarity to the vector representation of the respective institution. The resulting sets of vectors served as the quantitative basis for investigating the relations between these social institutions as they are narrated in the literary texts of the corpus. The findings reveal a significant overlap between the semantic fields of politics and culture, thus offering a quantitative approach to a phenomenon that traditional literary scholarship tends to conceptualize as the Prešernian structure or the Slovenian cultural syndrome.
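The procedure described in the abstract (nearest neighbours by cosine similarity, then a comparison of the resulting word sets) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional toy vectors, the English seed words, the field size `k=2` (the article uses 50), and the Jaccard index as the overlap measure, which the abstract does not specify. It is not the author's code. With a trained model, a library such as gensim would supply the vectors and the neighbour lookup (for example `model.wv.most_similar(word, topn=50)`).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_field(seed, vectors, k):
    """Return the k words most similar to `seed` (the seed itself is excluded)."""
    seed_vec = vectors[seed]
    ranked = sorted(
        (word for word in vectors if word != seed),
        key=lambda word: cosine_similarity(seed_vec, vectors[word]),
        reverse=True,
    )
    return set(ranked[:k])


def jaccard_overlap(field_a, field_b):
    """Share of words common to both fields (an assumed overlap measure)."""
    union = field_a | field_b
    return len(field_a & field_b) / len(union) if union else 0.0


# Hypothetical toy embeddings; real Word2Vec vectors have hundreds of dimensions.
TOY_VECTORS = {
    "politics": (1.0, 0.1),
    "culture": (0.9, 0.3),
    "nation": (0.95, 0.2),
    "language": (0.9, 0.25),
    "economy": (0.1, 1.0),
    "money": (0.15, 0.95),
    "debt": (0.05, 0.9),
}

fields = {
    seed: semantic_field(seed, TOY_VECTORS, k=2)
    for seed in ("politics", "culture", "economy")
}

for first, second in (("politics", "culture"), ("politics", "economy")):
    overlap = jaccard_overlap(fields[first], fields[second])
    print(f"{first} / {second}: {overlap:.2f}")
```

In this toy setup the fields of politics and culture share all their neighbours, while politics and economy share none; on a real corpus the overlap would be a matter of degree and would depend on the model's training parameters.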

References

Brottrager, Judith, et al. "Modeling and Predicting Literary Reception: A Data-Rich Approach to Literary Historical Reception." Journal of Computational Literary Studies, vol. 1, no. 1, 2022, https://jcls.io/article/id/95/. Accessed 12 Apr. 2024.

Dović, Marijan. Prešeren po Prešernu: kanonizacija nacionalnega pesnika in kulturnega svetnika. LUD Literatura, 2017.

Eder, Maciej, and Artjoms Šeļa. "One Word to Rule Them All: Understanding Word Embeddings for Authorship Attribution." Digital Humanities 2022 Combined Abstracts, University of Tokyo, 2022, pp. 199–202, https://dh2022.adho.org/. Accessed 12 Apr. 2024.

Erjavec, Tomaž, et al. Slovenian Novel Corpus (ELTeC-slv): April 2021 Release (v2.0.0). Zenodo, 2021, https://doi.org/10.5281/zenodo.4662600. Accessed 12 Apr. 2024.

Hatzel, Hans Ole, et al. "Machine Learning in Computational Literary Studies." it – Information Technology, vol. 65, no. 4–5, 2023, pp. 200–217.

Herrmann, J. Berenike, Joanna Byszuk, and Giulia Grisot. "Using Word Embeddings for Validation and Enhancement of Spatial Entity Lists." Digital Humanities 2022 Combined Abstracts, University of Tokyo, 2022, pp. 239–241.

Juvan, Marko. Prešernovska struktura in slovenski kulturni sindrom. LUD Literatura, 2012.

Kljun, Maša, Matija Teršek, and Slavko Žitnik. "Pomenska analiza kategorij sovražnega govora v obstoječih označenih korpusih." Uporabna informatika, vol. 30, no. 1, 2021, pp. 3–18.

Ljubešić, Nikola, and Kaja Dobrovoljc. "What Does Neural Bring? Analysing Improvements in Morphosyntactic Annotation and Lemmatisation of Slovenian, Croatian and Serbian." Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing, edited by Tomaž Erjavec et al., Association for Computational Linguistics, 2019, pp. 29–34, https://aclanthology.org/W19-3704. Accessed 12 Apr. 2024.

Mandić, Lucija, and Tomaž Erjavec. Corpus of Longer Narrative Slovenian Prose KDSP 1.0. ZRC SAZU, 2023, http://hdl.handle.net/11356/1823. Accessed 12 Apr. 2024.

Mikolov, Tomáš, et al. "Efficient Estimation of Word Representations in Vector Space." International Conference on Learning Representations, 2013, https://arxiv.org/abs/1301.3781. Accessed 12 Apr. 2024.

Nelson, Laura K. "Leveraging the Alignment between Machine Learning and Intersectionality: Using Word Embeddings to Measure Intersectional Experiences of the Nineteenth Century U.S. South." Poetics, vol. 88, 2021, https://doi.org/10.1016/j.poetic.2021.101539. Accessed 12 Apr. 2024.

Pirjevec, Dušan. Vprašanje o poeziji. Vprašanje naroda. Obzorja, 1978.

Pollak, Senja, Matej Martinc, and Katja Mihurko. "Natural Language Processing for Literary Text Analysis: Word-Embeddings-Based Analysis of Zofka Kveder’s Work." Proceedings of the Workshop on Digital Humanities and Natural Language Processing (DHandNLP 2020), edited by Maria José Finatto et al., CEUR-WS, Aachen, 2020, http://ceur-ws.org/Vol-2607/paper4.pdf. Accessed 12 Apr. 2024.

Rupel, Dimitrij. Svobodne besede od Prešerna do Cankarja. Lipa, 1976.

Schneider, Felix, et al. "Data-Driven Detection of General Chiasmi Using Lexical and Semantic Features." Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, edited by Stefania Degaetano-Ortlieb et al., Association for Computational Linguistics, 2021, pp. 96–100, https://doi.org/10.18653/v1/2021.latechclfl-1.11. Accessed 12 Apr. 2024.

Published

2024-06-21

Section

Thematic section