Calculation of the degree of morphological proximity of the Eastern Khanty dialects (Surgut, Vakh and Vasyugan) on the online platform LingvoDoc
V. V. Vorobyova, Ivannikov Institute for System Programming of the Russian Academy of Sciences, Moscow, Russian Federation; National Research Tomsk Polytechnic University, Tomsk, Russian Federation, [email protected]
I. V. Novitskaya, National Research Tomsk State University, Tomsk, Russian Federation, [email protected]
Introduction: the article addresses issues raised in the scientific literature, namely whether Vakh, Vasyugan and Surgut Khanty belong to one branch or to different branches of the Eastern cluster of idioms, and how far they differ morphologically from one another. The individual features described so far for each dialect cannot be considered comprehensive; hence it is advisable to use modern digital tools to process large text corpora mathematically in order to identify the entire spectrum of differences in their morphological systems.
Objective: to present preliminary results of calculating the degree of morphological proximity of three Eastern dialects of the Khanty language, based on data from the LingvoDoc research platform.
Research materials: the corpus of the Vakh, Vasyugan and Surgut dialects of the Khanty language, posted on the LingvoDoc research platform.
Results and novelty of the research: the starting point for the analysis was the observations of specialists in the Khanty language who had previously described some features of the Eastern dialects at different linguistic levels. The LingvoDoc tools allowed us to identify 74 groups of cognate affixes, as well as 14 affixes unique to one of the dialects of the Khanty language. Of the identified cognate groups, 77.02% contain affixes with common etymological links in all three dialects, which makes it possible to attribute these idioms to one branch of the Eastern dialects. A further 21.62% of the groups contain affixes that are etymologically related only in the Vakh and Vasyugan dialects, which indicates the close morphological proximity of these two dialects. Of the 14 unique affixes, 12 are attested only in the Surgut dialect. The data obtained support the conclusion that the Surgut dialect shows a marked degree of divergence from the Vakh-Vasyugan dialect unity. The morphological proximity of the Eastern dialects had not previously been calculated with online tools.
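The arithmetic behind the reported shares can be checked with a minimal sketch. The abstract gives only the total of 74 cognate groups and the two percentages; the group counts 57 and 16 below are assumptions, back-computed as the integers that appear consistent with those percentages when truncated to two decimals. The function name `truncated_percent` is hypothetical and is not part of LingvoDoc.

```python
TOTAL_GROUPS = 74        # stated in the abstract
UNIQUE_AFFIXES = 14      # stated in the abstract
SURGUT_ONLY_UNIQUE = 12  # stated in the abstract

# ASSUMPTION: these counts are not given in the abstract. They are
# back-computed from the reported 77.02% and 21.62%.
SHARED_BY_ALL_THREE = 57
SHARED_BY_VAKH_VASYUGAN_ONLY = 16


def truncated_percent(count: int, total: int) -> float:
    """Return count/total as a percentage truncated (not rounded) to 2 decimals."""
    # Integer arithmetic avoids floating-point drift before truncation.
    return (count * 10000 // total) / 100


all_three = truncated_percent(SHARED_BY_ALL_THREE, TOTAL_GROUPS)
vakh_vasyugan = truncated_percent(SHARED_BY_VAKH_VASYUGAN_ONLY, TOTAL_GROUPS)
surgut_unique = truncated_percent(SURGUT_ONLY_UNIQUE, UNIQUE_AFFIXES)

# Groups not covered by either reported share (under the assumed counts).
unaccounted_groups = (
    TOTAL_GROUPS - SHARED_BY_ALL_THREE - SHARED_BY_VAKH_VASYUGAN_ONLY
)

print(f"Shared by all three dialects: {all_three}%")
print(f"Shared by Vakh and Vasyugan only: {vakh_vasyugan}%")
print(f"Unique affixes attested only in Surgut: {surgut_unique}%")
print(f"Groups outside both shares: {unaccounted_groups}")
```

Under these assumed counts the two reported shares do not add up to 100%: one cognate group falls outside both categories, and the abstract does not say how it is distributed.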
Key words: Khanty language, Vakh dialect, Vasyugan dialect, Surgut dialect, morphology, text corpora, LingvoDoc, data analysis, language documentation
Acknowledgments: the study was funded by the Russian Science Foundation, project № 20-18-00403 “Digital Description of Uralic Dialects Based on Big Data Analysis”.
For citation: Vorobyova V. V., Novitskaya I. V. Calculation of the degree of morphological proximity of the Eastern Khanty dialects (Surgut, Vakh and Vasyugan) on the online platform LingvoDoc // Vestnik ugrovedenia = Bulletin of Ugric Studies. 2024; 14 (4/59): 616–629.