The system of ascending diphthongs in dialects of the Karelian language: comparison of clustering methods
Journal’s Subject Headings:
Philology
About the authors:
I. P. Novak Institute of Linguistics, Literature and History, Karelian Research Centre of the Russian Academy of Sciences, Petrozavodsk, Russian Federation, [email protected]
N. B. Krizhanovskaya Institute of Applied Mathematical Research, Karelian Research Centre of the Russian Academy of Sciences, Petrozavodsk, Russian Federation, [email protected]
ABSTRACT
Introduction: over the last decade, statistical methods of dialectology have been increasingly used in Finno-Ugric studies. The first stage of applying clustering to the materials of the Dialectological Atlas of the Karelian Language (1997) revealed the main problems of Karelian dialectology (failure of the traditional classification, unclear definition of the status and boundaries of certain groups of dialects, etc.). To solve these problems, a dialect base of the Karelian language was developed. This base includes encoded language data, which made it possible to apply various hierarchical and iterative clustering methods.
Objective: to choose a metric and a clustering method for verifying and refining the existing scheme of dialect division of the Karelian language, using the system of ascending diphthongs as an example.
Research materials: digitized and coded data of the “Programs for collecting material for the dialectological atlas of the Karelian language”, completed in 1937–1972.
Results and novelty of the research: the scientific novelty lies in the application of statistical methods of dialectometry to large volumes of Karelian dialect material. During the study, five clusterings were carried out, showing the distribution of variants of ascending diphthongs in the Karelian dialects of Karelia: the complete-linkage method (three clusterings), the centroid-linkage method, and the k-means method. The clustering results do not differ significantly, but the complete-linkage and k-means methods performed best. The final cluster map coincided with the picture described in studies on Karelian phonetics and dialectology, while yielding clearer boundaries of the analyzed dialect phenomenon and its transitional zones. This confirms the legitimacy of applying the methodology both to solving the problems of Karelian dialectology and to reworking the dialect classification of the language.
Key words: dialectology, linguistic geography, dialectometry, cluster analysis, clustering method, Karelian language, ascending diphthongs
Acknowledgements: the study was carried out under the state order of the Karelian Research Centre of the Russian Academy of Sciences (№ 121070700122-5) and through Russian Science Foundation grant 22-28-20215 “Creation of the speech corpus of the Baltic-Finnic languages of Karelia”, implemented in collaboration with the Republic of Karelia authorities with funding from the Republic of Karelia Venture Capital Fund.
For citation: Novak I. P., Krizhanovskaya N. B. The system of ascending diphthongs in dialects of the Karelian language: comparison of clustering methods // Vestnik ugrovedenia = Bulletin of Ugric Studies. 2022; 12 (3): 486–496.