The Archaeologist

New Study Traces Linguistic Paths: How Language Dispersal Echoes Ancient DNA and Archaeological Findings


By Dimosthenis Vasiloudis


Tracing the Paths of Languages: The Novel Language Velocity Field Approach

In the intricate tapestry of human history, language serves as a crucial thread, weaving through the fabric of cultural and demographic changes. The study of language evolution, particularly its spatial trajectory, offers profound insights into our collective past. Traditional approaches, while useful, often miss the complexity of language evolution, which depends on both vertical divergence (inheritance from a common ancestor) and horizontal contact (borrowing between neighbouring languages). A new study published in Nature Communications introduces Language Velocity Field (LVF) estimation, a computational approach designed to overcome these limitations and provide a more comprehensive picture of language dispersal.

Beyond the Tree: Understanding Language Evolution Holistically

The LVF methodology stands out for its innovative approach to studying language evolution. Unlike the phylogeographic approach, which primarily focuses on vertical divergence and relies heavily on phylogenetic trees, LVF incorporates both vertical and horizontal dimensions of language change. This distinction is crucial in multilingual regions, where languages often borrow from and influence each other, leading to a more complex evolutionary pattern.

The Two-Fold Strategy of LVF

The effectiveness of the LVF lies in its two-pronged strategy. First, it estimates a velocity field that describes how linguistic traits change over time, capturing both horizontal contact and vertical divergence. This field plays a role similar to that of a phylogenetic tree, but with an expanded scope. Second, LVF projects this field into geographic space, outlining language dispersal trajectories based on linguistic relatedness and geography. The method was validated on 1,000 simulated datasets and then applied to reconstruct the dispersal of four major agricultural language families: Indo-European, Sino-Tibetan, Bantu, and Arawak.
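To make the two steps more concrete, here is a deliberately simplified Python sketch. It is not the study's algorithm: this article does not describe how LVF estimates past states or performs the projection, so everything below is an illustrative assumption. The function names (`trait_velocity`, `project_to_geography`), the cosine-weighted projection, and the three made-up languages are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def trait_velocity(present, past):
    """Step 1 (toy): velocity = present trait state minus an ASSUMED past state."""
    return [[p - q for p, q in zip(now, before)]
            for now, before in zip(present, past)]


def project_to_geography(present, velocity, geo):
    """Step 2 (toy): turn trait-space velocities into (lon, lat) vectors.

    Every other language j "votes" for the geographic direction from i to j,
    weighted by how well the trait-space displacement (j - i) agrees with
    the trait-space velocity of i.
    """
    result = []
    for i in range(len(present)):
        vx = vy = 0.0
        for j in range(len(present)):
            if i == j:
                continue
            d_trait = [b - a for a, b in zip(present[i], present[j])]
            weight = cosine(velocity[i], d_trait)
            dx = geo[j][0] - geo[i][0]
            dy = geo[j][1] - geo[i][1]
            norm = math.hypot(dx, dy)
            if norm == 0:
                continue  # co-located samples carry no direction
            vx += weight * dx / norm
            vy += weight * dy / norm
        result.append((vx, vy))
    return result


# Made-up data: three languages on a west-to-east line, all "moving" along
# the first trait axis.
present = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
past = [[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
geo = [(10.0, 50.0), (20.0, 50.0), (30.0, 50.0)]  # (lon, lat)

velocity = trait_velocity(present, past)
geo_velocity = project_to_geography(present, velocity, geo)
print(geo_velocity)  # each vector has a positive longitude component (eastward)
```

In this toy setting, trait change and geography are perfectly aligned, so every projected arrow points east. Real data would be far noisier, which is presumably why the figure caption below mentions grid-smoothing and normalising the vectors before plotting.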

Fig. 2. (a) The homelands of ancient agriculture and the dispersal routes of Neolithic/Formative cultures and Holocene populations proposed by previous studies (refs. 2–4 in the original paper) based on archaeological and ancient DNA evidence. The pale red polygon denotes the known ancient agricultural homeland. The black arrow signifies the dispersal trajectory of the Neolithic/Formative culture. The coloured arrow represents the dispersal trajectory of the major Holocene population. (b) The velocity fields of four language families and groups. The coloured dot denotes the geographical position of each observed language sample. The coloured small arrow represents the velocity vector, which has been grid-smoothed and normalised for better visualisation. The larger coloured schematic arrow, summarised from the velocity vectors, renders the general language dispersal trajectory. The pale grey polygon signifies the known geographic range of the Neolithic culture. The coloured concentric circle represents the language dispersal centre inferred by the LVF. The grey base world map is generated using the map function of the maps package in R (4.3.1). The Source Data and Codes for generating Fig. 2 are available.

Empirical Insights: Unveiling the Journey of Agricultural Languages

The empirical application of LVF to these language families yielded fascinating insights. The dispersal trajectories were found to align with population movements inferred from ancient DNA and archaeological data. Moreover, the identified dispersal centers were geographically close to the ancient homelands of agricultural or Neolithic cultures. These findings underscore the significant role of agricultural languages in mirroring demographic and cultural spreads over the past 10,000 years.

LVF vs. Traditional Approaches: A Comparative Perspective

A critical aspect of this study is the comparison between LVF and the traditional phylogeographic approach. Although the two share a similar theoretical foundation, they perform differently when horizontal contact affects linguistic relatedness. LVF proves more reliable in such scenarios, demonstrating its utility in complex linguistic landscapes. The study also compares LVF with other phylogeny-free approaches, a comparison that highlights its methodological strengths and versatility.
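The figure caption later in this article names a centroid approach and a minimal distance approach among the phylogeny-free alternatives. The article does not define them, so the Python sketch below rests on an assumed reading of the names: the centroid is the plain average of the sample coordinates, and the minimal-distance centre is the sample location with the smallest summed great-circle distance to all other samples. The coordinates are invented for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def centroid_centre(coords):
    """Assumed 'centroid approach': arithmetic mean of longitude and latitude.

    Naive averaging ignores the curvature of the Earth and breaks down
    across the antimeridian; it is used here only for simplicity.
    """
    n = len(coords)
    return (sum(c[0] for c in coords) / n, sum(c[1] for c in coords) / n)


def minimal_distance_centre(coords):
    """Assumed 'minimal distance approach': the sample location whose summed
    great-circle distance to all samples is smallest."""
    return min(coords, key=lambda c: sum(haversine_km(c, o) for o in coords))


# Invented (lon, lat) sample locations: four clustered, one far to the east.
samples = [(30, 40), (32, 40), (33, 40), (35, 40), (60, 40)]

print(centroid_centre(samples))          # (38.0, 40.0)
print(minimal_distance_centre(samples))  # (33, 40)
```

The single eastern outlier drags the centroid five degrees away from the cluster, while the minimal-distance centre stays inside it. Neither baseline uses any linguistic information, only sample locations.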

Closer to Colin Renfrew's Anatolian Hypothesis?

The new study appears to align more closely with Colin Renfrew's Anatolian hypothesis for the distribution of Indo-European languages. Renfrew's theory, proposed in the 1980s, posits that the spread of Indo-European languages was closely tied to the diffusion of agriculture from Anatolia (modern-day Turkey) around 8,000–9,500 years ago. This theory contrasts with the traditional Kurgan hypothesis, which links the spread of these languages to the migrations of pastoral nomads from the Pontic Steppe around 6,000 years ago.

The LVF study's results show that the dispersal paths of agricultural language families resemble the population movements inferred from ancient DNA and archaeological data, which supports the idea that the spread of languages and the expansion of agriculture are linked. Specifically, the LVF approach placed the dispersal center of Indo-European languages in the Fertile Crescent, an area closely associated with the earliest known agricultural developments and close to Anatolia. This geographical correlation is a key aspect of Renfrew's hypothesis.

Because the LVF method considers both vertical divergence and horizontal contact in language evolution, its analysis is richer than a purely tree-based one, and it may offer a more nuanced picture of how these languages spread alongside farming practices.

These results lend support to a link between the spread of Indo-European languages and the spread of agriculture from Anatolia. However, it's important to note that the debate over the origins and spread of Indo-European languages is ongoing, and each new piece of evidence, such as this study, contributes to a continuously evolving academic discussion.

Fig. 3. (a) The geographic coordinates (Lon, Lat) of dispersal centres for each case inferred by five approaches: language velocity field estimation (LVF), phylogeographic approach (PhyloG), diversity approach (DIV), centroid approach (Centr), and minimal distance approach (MD). (b1) Density plot displaying differences in longitude and latitude between the dispersal centres inferred by LVF and PhyloG using 1000 simulated datasets. The p value is calculated by the two-sided Wilcoxon rank-sum test. (b2) Density plot showing the delta score distribution of simulated language samples (one-sided 95% CI = [0.1553, 0.1727]), estimated from 200 bootstrap resamplings. (b3) Density plot illustrating absolute differences in longitude and latitude between dispersal centres inferred by LVF and PhyloG using 1000 simulated datasets (Lat: mean = 0.94, one-sided 95% CI = [4 × 10⁻⁴, 2.82]; Lon: mean = 1.55, one-sided 95% CI = [5 × 10⁻⁵, 3.55]). (b4) Linear relation between the delta score and the absolute difference between dispersal centres in longitude estimated from LVF and PhyloG. The orange ribbon denotes the 95% CI. (b5) Linear relation between the delta score and the absolute difference between dispersal centres in latitude estimated from LVF and PhyloG. The blue ribbon denotes the 95% CI. (b6) Table displaying statistical test results for three indexes: delta score, absolute estimated difference between LVF and PhyloG, and linguistic relatedness explanatory power of PCA-based distance and phylogenetic tree. For the delta score, the p value is calculated using the one-sided bootstrap test. For the absolute estimated difference, the p value is calculated using the one-sided Monte Carlo simulation test. For linguistic relatedness explanatory power of PCA-based distance or phylogenetic tree, the p value is calculated using the Mantel test. For all tests, statistical significance is indicated by p value < 0.05.
The grey base world map used in subfigure (a) is generated using the map function of the maps package in R (4.3.1). The Source Data and Codes for generating Fig. 3 are available.
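The caption above mentions the Mantel test, which asks whether two distance matrices over the same set of items are correlated more strongly than chance would allow. For readers unfamiliar with it, here is a minimal pure-Python sketch of a one-sided permutation version. The matrices, the permutation count, and the seed are illustrative assumptions, not the study's data or settings.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=999, seed=0):
    """One-sided Mantel test for a positive association between two
    symmetric distance matrices. Returns (correlation, p value)."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    flat_a = upper_triangle(a, identity)
    r_obs = pearson(flat_a, upper_triangle(b, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel the items of b at random
        if pearson(flat_a, upper_triangle(b, perm)) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Invented example: five items on a line; the second matrix is simply the
# first one rescaled, so the two are perfectly correlated.
positions = [0, 1, 2, 4, 7]
dist_a = [[abs(p - q) for q in positions] for p in positions]
dist_b = [[2 * d for d in row] for row in dist_a]

r, p = mantel(dist_a, dist_b)
print(round(r, 3), p < 0.05)  # 1.0 True
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, so an ordinary correlation test would overstate significance.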

A New Horizon in Language Evolution Studies

The introduction of LVF marks a significant milestone in the field of language evolution. Its ability to account for both vertical and horizontal language dynamics opens new avenues for understanding the intricate relationship between language, culture, and human migration patterns.

This approach not only enriches our comprehension of linguistic development but also ties into broader historical narratives, offering a more holistic view of human history. As research continues to evolve, the LVF is poised to become an invaluable tool in unraveling the complex story of human languages and their journey across time and space.