Incorporating Phonological Knowledge into a Computational Model for Language Family Homeland Identification
Shen, Ziting
Several computational models have been proposed which hypothesize the
geographical homeland of a language family in a quantitative manner. Aiming
at identifying the homeland accurately for the world's language families,
especially those that have not been well-studied, we examine and propose modifications
to one of these models, the ASJP model. Specifically, we incorporate
more phonological information into the linguistic distance measurement. In the
original ASJP model, the linguistic distance is calculated as the Levenshtein
distance. In the modified model, we apply a technique similar to the ALINE
algorithm to assign weights to the feature changes in the Levenshtein distance
calculation. The weights are chosen based on a priori knowledge about frequencies
of types of phonological change. The model will be tested on the
Indo-European family in the future, and the results will be compared to current
major Indo-European homeland theories, i.e., the Steppe Theory and the
Anatolian Hypothesis.
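
To make the idea of feature-weighted edit costs concrete, the following is a minimal sketch of a Levenshtein distance in which the cost of substituting one segment for another depends on which phonological features differ. It is an illustration under stated assumptions, not the model proposed here: the feature table (FEATURES), the weight values (FEATURE_WEIGHTS), the insertion/deletion cost (INDEL_COST), and the function names are all hypothetical, and the actual weights would be chosen from a priori knowledge about the frequencies of types of phonological change.

```python
# Hypothetical feature table (an assumption for illustration only):
# each segment maps to (place, manner, voiced).
FEATURES = {
    "p": ("labial", "stop", False),
    "b": ("labial", "stop", True),
    "t": ("alveolar", "stop", False),
    "d": ("alveolar", "stop", True),
    "k": ("velar", "stop", False),
    "g": ("velar", "stop", True),
    "s": ("alveolar", "fricative", False),
    "z": ("alveolar", "fricative", True),
    "m": ("labial", "nasal", True),
    "n": ("alveolar", "nasal", True),
}

# Illustrative weights for (place, manner, voicing); not taken from the source.
# A lower weight marks a feature change treated as cheaper.
FEATURE_WEIGHTS = (0.4, 0.4, 0.2)
INDEL_COST = 1.0  # assumed cost of one insertion or deletion


def substitution_cost(a, b):
    """Return the cost of replacing segment a with segment b."""
    if a == b:
        return 0.0
    if a not in FEATURES or b not in FEATURES:
        # Fall back to the unweighted cost for segments without features.
        return 1.0
    return sum(
        weight
        for weight, x, y in zip(FEATURE_WEIGHTS, FEATURES[a], FEATURES[b])
        if x != y
    )


def weighted_levenshtein(s, t):
    """Levenshtein distance with feature-weighted substitution costs."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [j * INDEL_COST for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        curr = [i * INDEL_COST]
        for j, b in enumerate(t, 1):
            curr.append(
                min(
                    prev[j] + INDEL_COST,                  # delete a
                    curr[j - 1] + INDEL_COST,              # insert b
                    prev[j - 1] + substitution_cost(a, b),  # substitute
                )
            )
        prev = curr
    return prev[-1]
```

With these assumed weights, a pair of words differing only in voicing (such as "pat" and "bat") comes out closer than a pair differing in place of articulation (such as "pat" and "kat"), whereas the unweighted Levenshtein distance would assign both pairs the same distance of 1.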