Dialect Classification

How changes spread across geographical zones

Romani dialects have been grouped mainly on the basis of their geographical location: The conventional classificatory grid recognises a Northwestern, Northeastern, Central, Vlax (centred around Romania and neighbouring regions) and Balkan group, of which the latter three are each further sub-divided into a northern and a southern sub-group. This suggests that the divisions between the dialects can be plotted in the form of lines on the map, each line or ‘isogloss’ representing a difference in the realisation of a particular structural feature. How do such geographical divisions in the form of isoglosses come about?

The early phase of dialect differentiation

It is likely that the speech forms of different Romani families and clans differed only slightly before they migrated into Europe between the late fourteenth and fifteenth century. Although they were often known as “Travellers” due to their specialisation in itinerant trades, most Roma did not habitually travel long distances but remained in familiar regions, interacting with a familiar population of settled clients. They acquired local languages, took on local religions, and adopted a role in local economies. Hybrid identities developed as each Romani population accommodated to its new environment while maintaining its own language, beliefs and customs.

The period that followed settlement in the individual regions in the sixteenth and seventeenth centuries was a period of rapid change during which distinct regional Romani identities emerged. This period left its mark on the speech forms of Roma in various locations. Each community developed its own structural preferences and adopted influences from the new contact languages. Documentation of Romani proliferated in the early eighteenth century, with scholars taking a keen interest in the language. By this time, Romani dialects were already as diverse as we know them today.


The divisions between the dialects are largely the result of changes that accumulated since the dispersion of Romani populations throughout the European continent. Some of the changes were local, limited to the speech form of several households or a group of closely related clans in a small region. Roma continued to maintain contact networks with other Roma after settlement, and many changes were passed on to other communities. The passing of structural innovations from one community to another is known as ‘diffusion’. When plotting the spread of structural features on the map we are therefore reconstructing the path of their historical diffusion among population groups and so across geographical space.

North-south division

The differentiating features that capture our attention and are most relevant to a general classification of dialects are those that separate the entire Romani-speaking landscape into identifiable zones. In relation to several prominent features in phonology, morphology and lexicon, there is a tendency toward a north-south split, with innovations occurring on both sides of the divide. This division line tracks the older (sixteenth-seventeenth century) frontier zone between the Habsburg Monarchy and the Ottoman Empire. The political boundary prevented contacts between Rom on either side and blocked the diffusion of innovations, creating a dense and conspicuous cluster of isoglosses (Map 1).

Map 1: North-south division

In the north, syllable truncation is triggered in all likelihood by a shift to word-initial stress as a result of Romani-German bilingualism. We find mal ‘friend’ for amal, khar- ‘to call’ for akhar-, sa- ‘to laugh’ for asa-, and more. There is also a preference for initial jotation in selected words, among them jaro ‘egg’ and the 3rd person pronouns jov ‘he’ etc., and the simplification of the historical cluster ṇḍ to r in words like jaro ‘egg’ and maro ‘bread’. The south, by contrast, maintains non-jotated forms and consonant clusters, as in (v)ov ‘he’, an(d)ro ‘egg’, man(d)ro ‘bread’.

The remarkable coherence of the northern area, from Britain to Finland, the Baltics and northern Russia, might lead us to believe that the individual dialects split away from an earlier group that had settled around the German-Polish contact area. Note that the Romani dialects of the Iberian peninsula tend to remain conservative with respect to these features, indicating that they were not part of the network of contacts that enabled their diffusion in the north. A number of developments fail to reach Finland and appear to have been adopted after the breakaway of the Scandinavian sub-group. They include the loss of the preposition katar ‘from’, which is retained in Finnish Romani, and the assimilation of verbs of motion and change of state into the dominant verb inflection and disappearance of gender-inflected past-tense forms of the type gelo ‘he went’ geli ‘she went’ (equally retained in Finnish Romani).

A series of lexical preferences spread throughout the north, while inherited variation often continues in the south. The north has xač- ‘to burn’ (in the south phabar-) and stariben ‘prison’ (phanglipe in the south, but also in Finnish Romani), as well as angušt ‘finger’ (naj in the south), derivations of gi for ‘heart’ (ilo in the south), and men ‘neck’ (kor in the south).

In the south, the epicentre of innovation appears to be Romania and adjoining regions. Prominent southern innovations include the loss of the nasal segment at the end of the nominalising suffix -iben/-ipen. Affrication in tikno ‘small’ > cikno also predominates in the south, though the southern Balkans form a mixed region. Verbs belonging to the perfective inflection classes that had retained a perfective augment -t- are re-assigned to the class of verbs with an augment -l- (originally representing verb roots ending in vowels): beš-t-jom ‘I sat’ > beš-l-jom. Conservative forms occur occasionally in isolation in the south, especially along the Black Sea coast.
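The north-south diagnostics listed in this section can be read as a small lookup table. The following Python sketch is only an illustration of that idea: it tallies how many forms in a word list match the northern or the southern variants cited above. The function name score_sample, the table layout, and the simple counting scheme are assumptions made for this example and are not an established classification procedure. Untruncated amal is treated here as the southern form, and the bracketed notation an(d)ro, man(d)ro and (v)ov is expanded into both spellings.

```python
# Diagnostic forms taken from the text above. The scoring scheme is an
# illustrative assumption, not an established classification method.
NORTH_SOUTH_FEATURES = {
    "egg": {"north": {"jaro"}, "south": {"anro", "andro"}},
    "bread": {"north": {"maro"}, "south": {"manro", "mandro"}},
    "he": {"north": {"jov"}, "south": {"ov", "vov"}},
    "friend": {"north": {"mal"}, "south": {"amal"}},
    "to burn": {"north": {"xač-"}, "south": {"phabar-"}},
    "finger": {"north": {"angušt"}, "south": {"naj"}},
    "neck": {"north": {"men"}, "south": {"kor"}},
}


def score_sample(sample):
    """Count how many glosses in `sample` show a northern or a southern form.

    `sample` maps an English gloss to the form recorded for it. Glosses or
    forms that are not in the table are counted as "unknown".
    """
    counts = {"north": 0, "south": 0, "unknown": 0}
    for gloss, form in sample.items():
        variants = NORTH_SOUTH_FEATURES.get(gloss)
        if variants is None:
            counts["unknown"] += 1
        elif form in variants["north"]:
            counts["north"] += 1
        elif form in variants["south"]:
            counts["south"] += 1
        else:
            counts["unknown"] += 1
    return counts


# Hypothetical word list, built only from forms cited in the text.
sample = {"egg": "jaro", "he": "jov", "friend": "mal", "neck": "kor"}
print(score_sample(sample))  # {'north': 3, 'south': 1, 'unknown': 0}
```

A mixed result such as this one is to be expected for varieties near the dividing line. The tally says nothing about how individual features should be weighted against one another.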

Northwest-Northeast divide

The north-south divide is complemented by a further divide between a (north)western zone with its centre in Germany and a northeastern zone comprising the Baltics and North Russia (Maps 2-3). The 2SG past-tense and present copula conjugation marker -al was probably the older historical form (going back to the 2SG oblique enclitic pronoun *te). In Early Romani it appears to have competed with -an, an analogy to the 2PL marker. The form in -al is generalised in the western innovation zone in Germany and spreads eastwards into central Europe to include the Romani dialects of the historical Habsburg Monarchy and on to some of the dialects of Trans-Carpathian Ukraine, but leaves out the entire western periphery (Britain and Spain) as well as northern Poland and the Baltic areas. A very similar diffusion pattern is found for the predominance of -h- over -s- in grammatical paradigms, in particular in intervocalic position, as in the singular instrumental/sociative case endings (leha ‘with him’ vs. lesa). Here too, the variation appears to go back to Early Romani. Note that s/h alternation is found in a wide transition zone encompassing the continental side of the Adriatic and stretching all the way to Transylvania. Finnish Romani matches this western-central diffusion zone for both items, indicating that the development preceded the separation from the continental dialects.

Map 2: Northwest-Northeast divide I

The shortening of anglal/angil ‘in front’ to glan/gil, of ame ‘we’ to me, and of the verbs ačh- ‘to stay’ and av- ‘to come’ to čh- and v- (as examples for numerous other items affected by the process) remains limited to Romani varieties spoken within the German-speaking area and neighbouring regions. The areas south of the Great Divide remain unaffected by these developments, while in the northeastern zone jotation appears consistently, so that ame ‘we’ becomes jame, and the verbs ačh- ‘to stay’ and av- ‘to come’ become jačh- and jav-. A partition similar in shape emerges around analogies in the past-tense marker of the 2PL. The original -an prevails in the northwest as well as in a central belt connecting Germany all the way to the Romanian Black Sea coast. The innovation centres are once again the northeastern zone, where the predominant form is -e (by analogy to the 3PL), and the southern periphery, from southern Romania through to the Mediterranean coast of France, where a partial analogy renders the form -en.

Map 3: Northwest-Northeast divide II

Core and periphery

Many developments spread following a pattern of core vs. periphery. In the case of the word for ‘flour’ (Map 4), Early Romani appears to have had at least two variants, with and without initial v- (ařo and vařo). In the northern regions the pressure toward initial jotation affected the word, which became jařo. The general absence of initial segments in the south shifted the balance in favour of a generalisation of the more conservative form ařo. But in the geographical periphery, in the absence of pressure in any particular direction, the more innovative of the two Early Romani variants, vařo, was selected.

Map 4: Distribution of variants for 'flour'

Often the periphery remains conservative. The original Early Romani demonstrative opposition set in adava : akava (with corresponding forms in -o-) is retained in the geographical periphery comprising Britain, Spain, Italy, and the southern Balkans (Map 5). The core, by contrast, shows various innovation zones where the original forms are simplified or reinforced to create opposition pairs such as adava : dava, kada : kaka, kava : kavka and so on. Though zones partly overlap due to the many forms that can become part of the paradigm, a rough geographical split can be identified between a zone in northern Bulgaria and Romania (kaka), a central zone around Hungary and Slovakia (kada), a northeastern zone comprising Poland and Russia (dava : adava) with a unique retention sub-zone in the Baltics (kada), a major zone stretching from the Black Sea coast to the North Sea (kava), and a Finnish zone (tava).

Map 5: Demonstratives

A conservative periphery is also encountered in the retention of Greek-derived verb inflection markers, used in Romani as a means of adapting loan verbs from Greek and subsequent contact languages (Map 6). Romani dialects of present-day Greece show a proliferation of forms. Several forms are retained in Welsh Romani too. Crimean and Zargari Romani keep -isker-, while -isar- appears in Romania-Moldavia and in Spain. The distribution of other forms shows a German-Scandinavian dialect group with -er-/-ev-, a Black Sea coast group with -iz-, and a central-eastern zone from the Baltics all the way down to western Bulgaria and southern Italy with -in- (primarily, with additional vocalic variation in the Balkans).

Map 6: Loan verb adaptation markers


Note that each isogloss has its own unique pattern of diffusion. The fact that we are able to review a countable set of such patterns mirrors the historical fact that networks of social contacts between Romani communities remained stable for considerable periods of time, allowing the diffusion of several distinct innovations to follow similar pathways, while divisions between groups – through political borders, migrations, or simply through a collapse of social contacts – set demarcation boundaries that contained the diffusion. The result is an accumulation of a complex matrix of different diffusion patterns, yet not without overlap of a number of prominent isoglosses.

When consideration is given to the various bundles of isoglosses representing prominent structural features – such as essential vocabulary items, salient lexico-phonological developments, and especially the organisation of recurrent morphological paradigms – then we obtain a picture that is quite similar to the prevailing reference grid of dialect classification. The classification is thus inspired by the reality of clusters of isoglosses, which in turn are the accumulated result of the diffusion of structural innovation among populations and across geographical space.