Subject indexing and creation of the paremiological thesaurus of contemporary Croatian proverbs


The aim of this paper was to subject-index a corpus of contemporary Croatian proverbs using denotative (literal) and connotative (figurative) terms, and to develop a dual thesaurus to support searching on the Croatian Proverbs Portal. The research sample consisted of 108 proverbs collected as part of a project defining the Croatian paremiological minimum and optimum. The methodology comprised two main steps: subject processing of the proverbs and creation of the dual thesaurus. Subject processing followed Lancaster’s methodology and consisted of seven phases: denotative analysis, identification of synonyms and related terms, connotative analysis by determining aboutness through corpus analysis, addition of connotative indexing terms, further synonym searches in a dictionary and language portals, and final verification of connotative meanings using artificial intelligence and manual review. Both thesauri were constructed according to Shearer’s methodology. In total, 1,074 terms were assigned to the proverbs: 586 denotative and 488 connotative. The denotative thesaurus contains 29 main facets and 380 subfacets, while the connotative thesaurus contains 23 facets and 244 subfacets. The originality of this research lies in presenting the first Croatian paremiological thesaurus, which is also the first thesaurus to process denotative and connotative meanings separately. The practical outcome will be the implementation of the thesaurus on the Croatian Proverbs Portal, enabling more precise and comprehensive searches and contributing to the digitalization and popularization of Croatian paremiological heritage in educational and scientific contexts.
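To make the idea of dual indexing concrete, the following is a minimal Python sketch of how a proverb might carry separate denotative and connotative index terms, with a synonym lookup for each layer. It is an illustration only and does not describe the portal's implementation: the class names (`Proverb`, `DualIndex`), the two example proverbs, and every index term and synonym are assumptions made here, not entries from the paper's corpus or thesauri. Terms are given in English for readability.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Proverb:
    """A proverb with two independent sets of index terms (hypothetical model)."""
    text: str
    denotative: set[str] = field(default_factory=set)   # literal subjects
    connotative: set[str] = field(default_factory=set)  # figurative subjects


class DualIndex:
    """Two inverted indexes, one per layer, each with its own synonym table."""

    LAYERS = ("denotative", "connotative")

    def __init__(self):
        self._index = {layer: defaultdict(set) for layer in self.LAYERS}
        self._synonyms = {layer: {} for layer in self.LAYERS}

    def add_synonym(self, layer, variant, preferred):
        # Map a non-preferred term to the preferred term within one layer.
        self._synonyms[layer][variant.lower()] = preferred.lower()

    def add(self, proverb):
        # File the proverb under every term of each layer.
        for layer in self.LAYERS:
            for term in getattr(proverb, layer):
                self._index[layer][term.lower()].add(proverb.text)

    def search(self, term, layer):
        # Resolve synonyms first, then look the term up in the chosen layer only.
        key = term.lower()
        key = self._synonyms[layer].get(key, key)
        return sorted(self._index[layer].get(key, set()))


# Illustrative data: the index terms below are invented for this sketch.
index = DualIndex()
index.add(Proverb("Tko rano rani, dvije sreće grabi",
                  denotative={"morning", "luck"},
                  connotative={"diligence"}))
index.add(Proverb("Bez muke nema nauke",
                  denotative={"effort", "learning"},
                  connotative={"diligence", "perseverance"}))
index.add_synonym("connotative", "hard work", "diligence")

print(index.search("morning", "denotative"))     # literal match
print(index.search("hard work", "connotative"))  # figurative match via synonym
```

Keeping the two layers in separate indexes means that a literal term such as "morning" does not match a search for figurative meaning, and the reverse. That is the kind of precision the separate treatment of denotative and connotative terms is meant to support.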
