Comparing Josip Jurčič and Ivan Cankar Using Computational Semantic Change Detection Methods


  • Andrejka Žejn
  • Marko Pranjić
  • Senja Pollak



Slovenian literature; Jurčič, Josip; Cankar, Ivan; natural language processing; semantic analysis; digital literary studies; digital humanities


The article opens by presenting heterogeneity and interdisciplinarity as two central, interrelated concepts of the digital humanities. The main part introduces a method for detecting semantic change, based on contextual word embeddings, and applies it to the analysis of literary works. The method’s potential is demonstrated through a comparative analysis of the narrative works of two canonical Slovenian authors from two distinct literary periods, Josip Jurčič and Ivan Cankar, in particular through the automatic recognition of words whose meanings differ between the two authors. The automatically obtained results are then analyzed qualitatively to interpret the differences in literary style, and the words identified as informative of stylistic differences between Jurčič and Cankar are manually categorized into semantic fields. The article shows that the approach based on contextual word embeddings can be used for literary analysis with satisfactory results. It also offers narratology new insight into the oeuvres of Cankar and Jurčič: the difference between Jurčič’s (romantic) realism and Cankar’s variety of modernism (the moderna) also rests on the semantics of discourses related to movement and to social and psychological actions and processes.
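The core idea of ranking words by how differently two authors use them can be illustrated with a minimal sketch. It assumes that contextual embeddings for each occurrence of a word have already been extracted from each author's texts, and it uses one simple scoring strategy (averaging the occurrence vectors per author and taking the cosine distance), which is not necessarily the pipeline used in the article. The function names, the toy three-dimensional vectors, and the example words are hypothetical and serve only as illustration; they are not results from the study.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_usage_difference(usages_a, usages_b):
    """Rank words shared by two corpora by the distance between their
    averaged occurrence embeddings (largest difference first).

    usages_a, usages_b: dict mapping a word to a list of embedding
    vectors, one vector per occurrence of the word in that corpus.
    """
    shared = set(usages_a) & set(usages_b)
    scores = {
        word: cosine_distance(mean_vector(usages_a[word]),
                              mean_vector(usages_b[word]))
        for word in shared
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up vectors standing in for real contextual embeddings.
jurcic = {"pot": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
          "hiša": [[0.0, 1.0, 0.0]]}
cankar = {"pot": [[0.0, 0.0, 1.0], [0.1, 0.0, 0.9]],
          "hiša": [[0.0, 1.0, 0.1]]}

ranking = rank_by_usage_difference(jurcic, cankar)
for word, score in ranking:
    print(f"{word}\t{score:.3f}")
```

In this toy input the word whose occurrence vectors point in different directions for the two authors ranks first. A ranked list of this kind would still require the qualitative reading and manual categorization described above.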


Thematic section