The Romance languages, a major branch of the Indo-European language family, comprise all languages that descended from Latin, the language of the Roman Empire. The Romance languages have more than 600 million native speakers worldwide, mainly in the Americas, Europe, and Africa, as well as in many smaller regions scattered through the world.

All Romance languages descend from Vulgar Latin, the language of soldiers, settlers, and slaves of the Roman Empire, which was substantially different from the Classical Latin of the Roman literati. Between 200 BC and 100 AD, the expansion of the Empire, coupled with administrative and educational policies of Rome, made Vulgar Latin the dominant native language over a wide area spanning from the Iberian Peninsula to the western coast of the Black Sea. During the Empire's decadence and after its collapse and fragmentation in the 5th century, Vulgar Latin began to evolve independently within each local area, and eventually diverged into dozens of distinct languages. The overseas empires established by Spain, Portugal and France after the 15th century then spread Romance to the other continents, to such an extent that about 2/3 of all Romance speakers are now outside Europe.

In spite of multiple influences from pre-Roman languages and from later invasions, the phonology, morphology, lexicon, and syntax of all Romance languages are predominantly derived from Vulgar Latin. As a result, the group shares a number of linguistic features that set it apart from other Indo-European branches. In particular, with only one or two exceptions, Romance languages have lost the declension system of Classical Latin, and as a result have a relatively rigid SVO sentence structure and make extensive use of prepositions.

Vulgar Latin

There is very little documentary evidence about the nature of Vulgar Latin, and that little is often hard to interpret or generalize. In any case, many of its speakers were soldiers, slaves, displaced peoples, and forced resettlers — that is, more likely to be natives of the conquered lands than natives of Rome. It is believed that Vulgar Latin already had most of the features that are shared by all Romance languages and distinguish them from Classical Latin — such as the almost complete loss of the declension system and its replacement by prepositions, the loss of the neuter gender, of comparative inflections, and of many verbal tenses, the use of articles, and the change in pronunciation of /k/ and /g/.

Fall of the Empire

The political decadence of the Roman Empire in the 5th century and the large-scale migrations of the period, notably the Germanic incursions, led to a fragmentation of the Latin-speaking world into several independent states. Central Europe and the Balkans were occupied by Germanic and Slavic tribes, Huns, and Turks, isolating Romania from the rest of Latin Europe. Latin also disappeared from England, which had been for a time part of the Empire. On the other hand, the Germanic tribes that had entered Italy, France, and the Iberian Peninsula eventually adopted Latin and the remains of Roman culture, and so Latin continued to be the dominant language in those areas.

Latent incubation

Between the 5th and 10th century, spoken Vulgar Latin underwent divergent evolution in various parts of its domain, leading to dozens of distinct languages. This evolution is poorly documented, since the written language for all purposes continued to be a Latin close to the Classical variant.

Recognition of the vernaculars

Between the 10th and 13th centuries, some local vernaculars came to be written, and began to supplant Latin in many of its roles. In some countries, such as Portugal, this transition was speeded up by force of law, whereas in other countries, such as Italy, the rise of the vernacular was the result of many prominent poets and writers adopting it as their medium.

Uniformization and standardization

The invention of the printing press apparently slowed down the evolution of the Romance languages from the 16th century on, and brought instead a tendency towards greater uniformity of language within political boundaries. In France, for instance, the "Francien" spoken in the region of Paris gradually spread over the whole country, while the Langue d'Oc and Franco-Provençal of the south lost much ground.

History of the name

The term "Romance" comes from the Vulgar Latin adverb romanice, derived from romanicus, as used in the expression romanice loqui. This expression designated the vulgar languages of Latin origin, in contrast to barbarice loqui, used for the non-Latin "barbarian" languages of the invaders, and latine loqui, used for the Latin taught in schools. From this adverb originated the noun romance, which applied initially to anything written romanice.


The most spoken Romance language is Spanish, followed by Portuguese, French, Italian, Romanian and Catalan. The first five are each a main and official national language in more than one country. A few other languages have official status on a regional or otherwise limited level, for instance Sardinian and Valdôtain in Italy, Romansh in Switzerland, and Galician, Catalan and Aranese in Spain.

The remaining Romance languages survive mostly as spoken languages for informal contact. National governments have historically viewed linguistic diversity as an economic, administrative, or military liability, and a potential source of separatist movements; therefore they have generally fought to eliminate it — by massively promoting the use of the official language, by restricting the use of the "other" languages in the media, by characterizing them as mere "dialects" — or worse.

In the last decades of the 20th century, however, increased sensitivity to the rights of minorities has allowed those languages to recover some of their prestige and of their lost rights. It is not clear, though, whether those political changes will be enough to reverse the decline of the non-official languages.

Linguistic features
Features inherited from Indo-European

As members of the Indo-European (IE) family, Romance languages have a number of features that are shared with other IE subfamilies (such as the Celtic, Germanic, Slavic, and Indo-Iranian languages, Albanian, Armenian, Greek, Lithuanian, etc.), and in particular with English, but which set them apart from non-IE languages like Arabic, Basque, Hungarian, Tamil, and many more. These features include:

Almost all their words are classified into four major classes — nouns, verbs, adjectives, and adverbs — each with a specific set of possible syntactic roles.
They have a complex system of word inflections to indicate syntactic relationships between words and to create derivative words in the same or in other classes.
Inflection almost always consists in replacing a suffix of the word, and each word has a relatively small set of "suffix slots".
They are verb-centered; meaning that the basic clause structure consists of a verb, expressing an action involving one or more nouns — the arguments of the verb — that play specific semantic roles in the action and specific syntactic roles in the clause.
The verb is inflected to indicate various aspects of the action, such as time, completeness or continuation, and also according to the grammatical person and grammatical number of one of the arguments, the subject.
The verb can be further modified by adverbs, or by additional nouns preceded by prepositions that indicate their semantic roles.
Nouns are classified into several grammatical genders and grammatical numbers.
Adjectives are noun modifiers; each adjective is normally inflected so as to echo the gender and number of the noun it is attached to.
Verbs are not inflected according to the gender of the subject (unlike Arabic and Hebrew, for example).
Tone (voice pitch) is used only at the sentence level, e.g. to indicate surprise or interrogation (unlike Chinese and Yorùbá, for example, where pitch changes the meaning of words).

Features inherited from Latin

The Romance languages share a number of features that were inherited from Classical Latin, and collectively set them apart from most other Indo-European languages.

They have lost the dual number, retaining only singular and plural.
They all have retained at least three of Latin's verbal tenses: present, e.g. DĪCIT "he says", past perfect DĪXIT "he said", past imperfect DĪCEBAT "he was saying".
For each tense, there are usually six distinct verbal inflections, encoding each of the three persons (I, you, he/she/it) and two numbers (singular and plural) of the subject.
They all originally had two copula verbs, derived from the Latin STARE (mostly used for "temporary state") and ESSE (mostly used for "essential attributes"). However, the distinction was eventually lost in some languages, notably French, which now have only one copula.
All those languages are written with the "core" Latin alphabet of 22 letters — A, B, C, D, E, F, G, H, I, L, M, N, O, P, Q, R, S, T, V, X, Y, Z — subsequently modified and augmented in various ways.
In particular, the letters K and W are rarely used in Romance languages — mostly for unassimilated foreign names and words, as they were in late Latin.
In the case of standard Italian, the Latin stressed pronunciation of double consonants is preserved.

Features inherited from Vulgar Latin

Romance languages also have a number of features that are not shared with Classical Latin. Most of these features are thought to be inherited from Vulgar Latin.

There are no declensions, that is, nouns are no longer altered to indicate their grammatical roles. (An exception is Romanian, which retains a combined genitive/dative case. Also, Old French initially had an oblique case.)
There are only two grammatical genders, the neuter gender of Classical Latin having been lost. (An exception is Romanian, which retains the neuter gender.)
The normal clause structure is SVO, rather than SOV, and is much less flexible than in Latin.
Adjectives generally follow the noun they modify.
Many Latin constructions involving nominalized verbal forms (e.g. the use of accusative plus infinitive in indirect discourse and the use of the ablative absolute) were dropped in favor of constructions with subordinate clauses.
There are definite and indefinite grammatical articles, derived from Latin demonstratives and the numeral UNUS ("one").
The Latin future tense was replaced by new synthetic future and conditional tenses, based on infinitive + present or imperfect tense of HABERE ("to have"), fused to form new inflections.
Most Latin synthetic perfect tenses were lost, generally replaced by new compound forms with "to be" or "to have" + past participle.
There is an elaborate system of pronouns which partially retain the distinction between Latin cases, some of them being clitic.
The distinction between long and short vowels, believed to have been present in Classical Latin, was lost and replaced by a system of lexical stress, where one vowel of each word is pronounced slightly louder than the rest.
Many Latin combining prefixes were incorporated in the lexicon as new roots and verb stems, e.g. Italian estrarre ("to extract") from Latin EX- ("out") and TRAHERE ("to drag").
The Latin letters C and G — which usually sound like [k] and [g] — have other sounds when they come before E and I. (See below.)
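
The fused future described in the list above (infinitive + present tense of HABERE) can be illustrated with a minimal Python sketch for Spanish. The function name and the ending table are illustrative assumptions, not part of the original text; the sketch covers only regular verbs and ignores irregular future stems.

```python
# Present-tense forms of Spanish "haber" (from Latin HABERE), in the order
# I, you (sg.), he/she/it, we, you (pl.), they.
HABER_PRESENT = ["he", "has", "ha", "hemos", "habéis", "han"]

# Endings that survive once the auxiliary fuses onto the infinitive:
# the silent "h" (and the "hab-" of "habéis") disappears, and the written
# accent marks the stress on the ending.
FUTURE_ENDINGS = ["é", "ás", "á", "emos", "éis", "án"]


def spanish_future(infinitive):
    """Return the six future forms of a regular Spanish verb."""
    return [infinitive + ending for ending in FUTURE_ENDINGS]


# Show each auxiliary next to the fused form it gave rise to.
for aux, form in zip(HABER_PRESENT, spanish_future("cantar")):
    print(f"cantar + {aux:7} -> {form}")
```

The same pattern holds in other Romance languages with their own reflexes of HABERE, for example French chanter + ai giving chanterai.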

Other shared features

The Romance languages also share a number of features that were not the result of common inheritance, but rather of various cultural diffusion processes in the Middle Ages — such as literary diffusion, commercial and military interactions, political domination, influence of the Catholic Church, and (especially in later times) conscious attempts to "purify" the languages by reference to Classical Latin. Some of those features have in fact spread to other non-Romance (and even non-Indo-European) languages, chiefly in Europe. Here are some of these "late origin" shared features:

Most Romance languages have polite forms of address that change the person and/or number of 2nd person subjects, such as the tu/vous contrast in French or the tu/Lei contrast in Italian.
They all have a large collection of prefixes, stems, and suffixes retained or reintroduced from Greek and Latin, used to coin new words. Most of those have cognates in English, e.g. "tele-", "poly-", "meta-", "pseudo-", "dis-", "ex-", "post-", "-scope", "-logy", "-tion".
They all replaced the Latin letter V by a new letter U when it had a vowel sound.
Many of them introduced the new letter J (originally a variant of I, which in time acquired various sounds in different languages).
They are all presently written in a mixture of two distinct but phonetically identical variants or "cases" of the alphabet, "uppercase" and "lowercase", with similar rules for their usage.
They also use very similar sets of punctuation characters.

Divergent features

In spite of their common origin, the descendants of Vulgar Latin have many differences. These occur at all levels, including the sound systems, the orthography, the nominal, verbal, and adjectival inflections, the auxiliary verbs and the semantics of verbal tenses, the function words, the rules for subordinate clauses, and, especially, in their vocabularies. While most of those differences are clearly due to independent development after the breakup of the Roman Empire (including invasions and cultural exchanges), one must also consider the influence of prior languages in territories of Latin Europe that fell under Roman rule, and possible inhomogeneities in Vulgar Latin itself.

It is often said that Portuguese and French are the most innovative of the Romance languages, each in different ways, that Sardinian and Romanian are the most isolated and conservative variants, and that the languages of Italy other than Sardinian (including Italian) occupy a middle ground. Some even claim that Languedocian Occitan is the "most average" western Romance language. However, these evaluations are largely subjective, as they depend on how much weight one assigns to specific features. In fact, all Romance languages, including Sardinian and Romanian, are vastly different from their common ancestor.

Romanian (together with other related minor languages, like Aromanian) in fact has a number of grammatical features which are unique within Romance, but are shared with non-Romance languages of the Balkans, such as Albanian, Bulgarian, Greek, and Serbian. These features include, for example, the structure of the vestigial case system and the placement of articles as suffixes of the nouns (cer = "sky", cerul = "the sky"). This phenomenon, called the Balkan linguistic union, may be due to contacts between those languages in post-Roman times.

Sound changes

The vocabularies of Romance languages have undergone massive change since their birth, by various phonological processes that were characteristic of each language. Those changes applied more or less systematically to all words, but were often conditioned by the sound context or morphological structure.

Some languages have dropped letters from the original Latin words. French, in particular, has dropped all final vowels, and sometimes also the preceding consonant: thus Latin LUPUS and LUNA became Italian lupo and luna but French loup [lu] and lune [lyn]. Catalan, Occitan, and Romanian (Daco-Romanian) lost the final vowels in most masculine nouns and adjectives, but retained them in the feminine. Other languages, including Portuguese, Spanish, Italian, Franco-Provençal, and the Southern dialects of Romanian have retained those vowels.

Some languages, like Portuguese, Spanish, and Venetian, have lost the final vowel -E from verbal infinitives, e.g. DĪCERE → Portuguese dizer ("to say"). Other common cases of final truncation are the verbal endings, e.g. Latin AMAT → Italian ama ("he loves"), AMĀBAM → amavo ("I loved"), AMĀBAT → amava ("he loved"), AMĀBATIS → amavate ("you (pl.) loved"), etc.

Sounds have often been dropped in the middle of the word, too; e.g. Latin LUNA → Galician and Portuguese lua, CRĒDERE → Spanish creer ("to believe").

On the other hand, some languages have inserted many epenthetic vowels in certain contexts. For instance, Spanish and Portuguese have generally inserted an e in front of Latin words that began with S + consonant, such as SPERŌ → espero ("I hope"). French has gone the same way, but then dropped the s: SPATULA → épaule ("shoulder"). In the case of Italian, a unique article, lo for the definite and uno for the indefinite, is used for masculine S + consonant words (sbaglio, "mistake"), as well as all masculine words beginning with Z (zaino, "backpack").
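
The two S + consonant rules in the preceding paragraph can be sketched in Python as follows. This is only an illustration under simplifying assumptions: both function names are hypothetical, words are plain lowercase spellings without macrons, and the Italian check covers only the two cases named in the text (S + consonant and Z), not the other contexts in which lo is used.

```python
VOWELS = set("aeiou")


def starts_with_s_plus_consonant(word):
    """True if the word begins with 's' followed by a consonant letter."""
    w = word.lower()
    return len(w) >= 2 and w[0] == "s" and w[1].isalpha() and w[1] not in VOWELS


def add_prothetic_e(word):
    """Spanish/Portuguese-style rule: prefix 'e' before initial s + consonant."""
    return "e" + word if starts_with_s_plus_consonant(word) else word


def takes_lo(italian_masculine_noun):
    """Italian rule from the text: 'lo' before s + consonant or initial z."""
    w = italian_masculine_noun.lower()
    return w.startswith("z") or starts_with_s_plus_consonant(w)


print(add_prothetic_e("spero"))   # the initial cluster "sp" triggers the rule
print(add_prothetic_e("sol"))     # "s" + vowel: left unchanged
print(takes_lo("sbaglio"), takes_lo("zaino"), takes_lo("libro"))
```

Real sound change involves more than this single step (Latin SPATULA, for instance, underwent several further changes before becoming French épaule), so the functions model only the insertion itself.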

Lexical stress

The position of the stressed syllable in a word generally varies from word to word in each Romance language, and often moves as the word is inflected. Sometimes the stress is lexically significant, e.g. Italian Papa [ˈpapa] ("Pope") and papà [paˈpa] ("daddy"), or Spanish imperfect subjunctive cantara ("he would sing") and future cantará ("he will sing"). However, the main function of Romance stress appears to be as a clue for speech segmentation, namely to help the listener identify the word boundaries in normal speech, where inter-word spaces are usually absent.

In Romance languages, the stress is usually confined to one of the last three syllables of the word. That limit may be occasionally exceeded by some verbs with attached clitics, e.g. Italian mettiamocene [met.ˈtja.mo.tʃe.ne] ("let's put some of it in there") or Spanish entregándomelo [en.tɾe.ˈɣan.do.me.lo] ("delivering it to me"). Originally the stress was predominantly on the next-to-last syllable, but that pattern has changed considerably in some languages. In French, for instance, the loss of final vowels has left the stress almost exclusively on the last syllable.
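
The "last three syllables" limit can be checked mechanically once a word has been split into syllables. The following Python sketch assumes the caller supplies the syllable list and the zero-based index of the stressed syllable; both function names and the orthographic syllable splits are illustrative assumptions rather than anything given in the text.

```python
def stress_from_end(syllables, stressed_index):
    """Return 1 for final stress, 2 for next-to-last, 3 for third-from-last, etc."""
    return len(syllables) - stressed_index


def within_three_syllable_window(syllables, stressed_index):
    """True if the stress falls on one of the last three syllables."""
    return stress_from_end(syllables, stressed_index) <= 3


# Ordinary words stay inside the window.
print(stress_from_end(["can", "ta", "ra"], 1))   # Spanish cantara
print(stress_from_end(["can", "ta", "rá"], 2))   # Spanish cantará

# Verbs with attached clitics can fall outside it.
mettiamocene = ["met", "tia", "mo", "ce", "ne"]
entregandomelo = ["en", "tre", "gán", "do", "me", "lo"]
print(within_three_syllable_window(mettiamocene, 1))
print(within_three_syllable_window(entregandomelo, 2))
```

In both clitic examples the stressed syllable is the fourth from the end, which is why they count as exceptions to the usual limit.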

Formation of plurals

Some Romance languages form plurals by adding /s/ (derived from the plural of the Latin accusative case), while others form the plural by changing the final vowel (by influence of the Latin nominative ending /i/). See La Spezia-Rimini Line for more information.

Plural in /s/: Portuguese, Galician, Spanish, Catalan, Occitan, Sardinian, Friulian.
Vowel change: Italian, Romanian.
No marking: French (formerly marked with /s/, but this has been lost in the spoken language; plural marking is now indicated on the associated determiner rather than the noun itself).
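
The first two strategies in the list above can be contrasted in a short Python sketch. It is a simplification under stated assumptions: the function names are hypothetical, only regular vowel-final nouns are handled, and the Italian mapping (o → i, a → e, e → i) ignores irregular and invariable nouns.

```python
# Final-vowel changes for regular Italian nouns.
ITALIAN_VOWEL_CHANGE = {"o": "i", "a": "e", "e": "i"}


def plural_in_s(noun):
    """Spanish/Portuguese-style plural for a vowel-final noun: add -s."""
    return noun + "s"


def plural_by_vowel_change(noun):
    """Italian-style plural: swap the final vowel, leave other endings alone."""
    final = noun[-1]
    return noun[:-1] + ITALIAN_VOWEL_CHANGE.get(final, final)


# Spanish lobo and luna versus Italian lupo and luna.
print(plural_in_s("lobo"), plural_in_s("luna"))
print(plural_by_vowel_change("lupo"), plural_by_vowel_change("luna"))
```

Spoken French would need neither function for the noun itself, since the plural is signalled on the determiner instead.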

List of languages

The following is a listing of the Romance languages and some of their dialects. Since the classification of Romance languages is still controversial, the listing records only the groupings that are accepted by most linguists. Top level groups are listed roughly West to East. Within each group, the sub-groups and languages are listed in alphabetical order. Nesting below a single language is used only for true dialects — meaning languages that were definitely derived from that parent language only, well after the parent came into existence. So, for example, the American variants of Spanish are listed under Spanish; whereas Spanish, Portuguese, and Galician are listed at the same level. Ditto for the so-called "Italian dialects," which were derived directly from Vulgar Latin and not from standard Italian.

West Iberian languages

Asturo-Leonese: 100,000 Spain.
Fala: 10,000 Spain.
Galician: 4 million Galicia.

Dialects in Galicia:

Central Galician
Eastern Galician
Western Galician

Judaeo-Portuguese: extinct.
Ladino (Judaeo-Spanish)
Mirandese: 5,000 Portugal.
Portuguese: 230 million Portugal, Brazil; a few thousand Asia; 26 million Africa.
Riverense Portuñol: about 100,000 in Uruguay and Southern Brazil.
Spanish (Castilian): 360 million Spain, Americas.

Dialects in Spain:

Andalusian Spanish
Canarian Spanish
Churro Spanish
Murcian Spanish
Northern Spanish

Dialects in Americas:

Amazonian Spanish
Andean Spanish
Antioqueño Spanish
Camba Spanish
Caribbean Spanish

Cuban Spanish
Dominican Spanish
Panamanian Spanish
Puerto Rican Spanish
Venezuelan Spanish
Maracucho Spanish

Central American Spanish
Chilean Spanish

Chilote Spanish

Cundiboyacense Spanish
Ecuatorial Spanish
Mexican Spanish (Central Mexico)
North Mexican Spanish
South Mexican Spanish
Paraguayan Spanish
Peruvian Coast Spanish
Rioplatense Spanish
Santandereano-Tachirense Spanish
Yucateco Spanish

Mozarabic: (extinct)

Catalan: 6.5 million Spain, Andorra.

Central Catalan (incl. Barcelonese)
Northern Catalan (Roussillonese)
Ribagorçan: (~Aragonese)

Northern French languages (langues d'oïl):

French: 70 million France; 12 million Americas.

Dialects in the Americas:

Acadian French
Canadian French
Québec French


Anglo-Norman language: extinct.


Southern French languages (Occitan, langue d'oc): 2 million France:




Sardinian: 300,000 Sardinia.


Northern Italian (Gallo-Romance) languages:

Venetian: 2 million Veneto.

Rhaetian languages

Friulian: Friuli.
Ladin: Dolomites.
Romansh: 66,000 Switzerland.

Italo-Dalmatian languages:

Dalmatian: extinct.
Italian: 60 million Italy.
Judeo-Italian: 4,000 Italy.
Neapolitan: 8 million Italy.
Sicilian: 10 million Sicily, Italy.

Eastern Romance languages:

Aromanian: 300,000 Greece, Macedonia, Albania, and Bulgaria.
Romanian: 22.5 million Romania.
Moldovan (identical to Romanian): 4.5 million Moldova.

Mixed languages

There are some languages that developed from a mixture of two established Romance languages. It is not always clear whether they should be classified as pidgins, creole languages, or mixed languages.

Proposed subfamilies

Here are some of the subfamilies that have been proposed within the various classification schemes for Romance languages:

Iberian Romance languages

Pidgins and creoles

The global spread of colonial Romance languages has given rise to numerous creole languages and pidgins. Some of the lesser-spoken languages have also had influences on varieties spoken far from their traditional regions. The following is a partial list of creole languages and pidgins, grouped by their main source language.

Lingua Franca, influenced by the Romance languages of the Western Mediterranean and Arabic.
French-based creole languages:

Haitian Creole: a national language of Haiti.
Antillean Creole: spoken primarily in Dominica and St. Lucia.
Kreyol Lwiziyen: Louisiana Creole.
Mauritian Creole: the lingua franca of Mauritius.
Seychellois Creole: also known as Seselwa; an official language of the Seychelles, along with English and French, as well as its lingua franca.
Lanc-Patuá: spoken in Brazil, mostly in Amapá state; influenced by Portuguese and developed by immigrants from neighbouring French Guiana and the French territories of the Caribbean.

Portuguese-based creole languages

Angolar: spoken in coastal areas of São Tomé Island, São Tomé and Príncipe.
Annobonese: spoken on the island of Annobón, Equatorial Guinea.
Capeverdean Crioulo (Criol, Kriolu): a dialect continuum spoken in Cape Verde.
Daman Indo-Portuguese: spoken in Daman, India; has undergone decreolization.
Diu Indo-Portuguese: spoken in Diu, India; almost extinct.
Forro: spoken on São Tomé Island, São Tomé and Príncipe.
Kristang: spoken in Malaysia.
Kristi: spoken in the village of Korlay, India.
Principense (Lunguyê): spoken on Príncipe Island, São Tomé and Príncipe; almost extinct.
Macanese: spoken in Macau and Hong Kong; has undergone decreolization.
Papiamento: spoken in the Netherlands Antilles and Aruba; Spanish-influenced.
Riverense Portuñol: spoken in Rivera (northern Uruguay) and the surrounding region; Spanish-influenced.
Saramaccan: a Portuguese/English creole spoken in Suriname.
Sri Lanka Indo-Portuguese: spoken in coastal cities of Sri Lanka.
Upper Guinea Creole (Kriol): lingua franca and "national language" of Guinea-Bissau, also spoken in Casamance, Senegal.

Spanish-based creole languages

Chavacano: spoken in Zamboanga and Cavite, Philippines.
Papiamento: it is often hard to tell Portuguese influences from Spanish ones.
Spanglish: spoken in northern Mexico and the southern United States.

Although they are neither pidgins nor creoles, English (see Middle English creole hypothesis), Basque and Albanian have a substantial Romance influence in their vocabularies.

Constructed languages

Latin and the Romance languages have also given rise to numerous constructed languages, both international auxiliary languages (such as Interlingua, Latino sine flexione, Occidental, and Lingua Franca Nova) and languages created for artistic purposes only (such as Brithenig and Wenedyk).

From Wikipedia, the free encyclopedia