The history of languages and the oldest language

While replying to other posters on this subject, I thought about writing a more detailed response to the question of whether a specific language can be considered the oldest, and what that would mean. It's not directly related to language learning, but I believe some learners here may find this information useful or interesting. It's an educational site after all, so let's get informed! I'll start with the basics.

The world has thousands of languages. They are all very different: they are spoken in different parts of the world and have different writing systems, histories and cultures. If we want to compare them, we have to establish some common traits to help us. Roughly speaking, modern linguistics has the following things to say about natural spoken languages:

  1. Languages change.

Some change faster, some slower, but they all do, given enough time. We can easily read English from a century ago. English from the 16th century is considerably harder, and English from the 10th century is very hard to figure out. But for an Icelandic speaker, an Icelandic text from the 11th century will be relatively easy to read. That's because, for a number of reasons (isolation, compact population, literary tradition), Icelandic remained very stable for a millennium.

Here's a different example: Lithuanian. We don't have very old literary material for Lithuanian; the earliest comes from around the 16th century. But Lithuanian is considered the most conservative language in the Indo-European family. The Indo-European language family includes most languages of Europe and many languages of Asia (including every language that is currently available on Duolingo). Linguists believe that roughly six thousand years ago it was a single language that later branched, changed, and so gave life to all modern languages in this family. This very ancient language was never written down, but using a number of scientific methods linguists can roughly reconstruct it, and, getting back to Lithuanian, they can see that Lithuanian has changed the least over thousands of years.

If we compare Icelandic and Lithuanian, we can say that Icelandic was very stable during one thousand years, but not before; while Lithuanian changed slowly and gradually over many thousands of years.

  2. Nobody witnessed or discovered how a natural spoken language appears.

We know that languages change, and scientists can reconstruct some ancient languages back to a stage before they were first written down. But when we go back in time, at some point it becomes impossible to know what a language looked like. It existed before that point; it simply was different. For example, I mentioned the Indo-European language family above and how scientists can reconstruct its ancestral language. But science can't tell what that language was like centuries or millennia earlier. That ancestral Indo-European language had an ancestral language of its own, but we'll probably never be able to reconstruct it.

I said "natural spoken language" in this sub-title. Why "natural"? Because artificial languages, or constructed languages, also exist. They are deliberately created by linguists and enthusiasts. Examples include Esperanto, the Elvish languages, Dothraki, and Klingon.

Why "spoken"? Because as a matter of fact we have witnessed the birth of a sign language! It is called Nicaraguan Sign Language, and it was created spontaneously by deaf Nicaraguan children who didn't have many ways to communicate in schools.

  3. Languages move.

We know that sometimes speakers of a language come to a territory where it wasn't spoken before. For example, speakers of English, Spanish, Portuguese etc. came to America, where previously only Native American languages were spoken.

Sometimes languages move faster than people. We know that far more Irish was spoken in Ireland centuries ago, but the people weren't simply displaced. Rather, in that environment society deemed it preferable to switch to English (social pressure and economic benefits).

Languages change, but people sometimes stay in the same region for many thousands of years, while changing languages between generations. For example, geneticists could establish that one of the bodies discovered in English bogs, many thousands of years old, was directly related to a modern English-speaking person living in the same area. But English came to the islands much later.

  4. Languages can split and give life to multiple new languages.

Over time, a language spoken over a large area can slowly develop internal differences under many internal and external influences. When the Roman Empire conquered vast areas of land in Europe, it brought the Latin language along. People were switching to Latin, but because they had many different neighbors and previous languages, their newly acquired Latin started accumulating differences, and what was called Latin before medieval times transformed into the many Romance languages (named after the city of Rome), including French, Spanish, Portuguese, Catalan, Romanian, Italian and many smaller languages.

  5. Some languages disappear before we can learn about them sufficiently.

For example, we know that Etruscan was spoken in northern Italy. The Etruscans were a people with a rich culture, but their written records are too limited to restore a full picture of the language. Or take the Iberians of the Iberian peninsula. We have fragmentary records of their language; they may or may not have been related to the Basques, but we can't know for sure. In Ireland, we can see a beautiful Neolithic building (Newgrange), but we know that the people who built it weren't speaking English, and they weren't speaking Irish either! In fact the building is so old that we can only say with some certainty that the languages we know of weren't in the area at that time. So we have a sign of a beautiful culture without any known language to go along with it.

So what do we mean when we call a language the oldest one? I personally think it's not a good term, because most of the time it is impossible to understand the exact criteria, and given everything written above we sometimes can't be certain. But we can consider different definitions that can help us clarify our words.

  • The earliest written record.

We can discover a very ancient written record of a language and use that information to claim the language as the oldest one. For example, the earliest attested languages are Egyptian and Sumerian. It is important to understand that these languages were spoken even before they were written down. So we can say, for example, that Sumerian is the oldest language. Unfortunately, Sumerian speakers gradually switched to a different language (Akkadian, which in turn was replaced by Aramaic, which was again replaced by Arabic). Egyptian lived longer, surviving in a form known today as Coptic, but Egyptians also switched to Arabic centuries ago.

Now we can consider the earliest written record of a living language. That is, a language that has changed from its ancient form but wasn't directly replaced. In Europe, the earliest attested language that's still spoken is Greek. Here's an interesting bit: the first records of Greek weren't written in Greek characters, but in much more elaborate symbols borrowed from the Minoans, the inhabitants of ancient Crete.

In Asia it would be Chinese. Unlike Greek, Chinese developed great diversity: ancient Chinese is ancient Mandarin just as much as it is ancient Cantonese (see item 4 above).

  • The earliest appearance in the area.

Some languages don't leave very old written records, but scientists can say with certainty that their speakers were in the area. Let's consider the modern Celtic languages. Today, Celtic languages are spoken in Britain, Ireland, and Brittany, a region of France on a peninsula close to Britain. Thousands of years ago, the Celtic languages, being part of the same Indo-European family, branched off from the ancestral language and started spreading from a territory roughly corresponding to modern southern Germany, Switzerland and Austria. They occupied large areas of Europe, but were later displaced by Latin, Germanic and Slavic languages, as well as some others. They stayed in Britain for much longer, though, and only later did the Anglo-Saxons, Germanic speakers from the north of Germany, come to Britain.

Interesting bit: Breton, a language of France, didn't stay in its region all that time, but in fact was brought by Celts from Britain back to mainland Europe after the other Celtic languages there had mostly disappeared. This is clear from its similarity to Welsh and the linguistic developments the two underwent together.

So if we limit our selection to an area that languages moved into, for example Britain, we can observe that no languages predating Celtic are left, and that English came later. So we can safely say that Welsh is the oldest living language in Britain, because a Celtic language of Britain changed and split into multiple languages, and the results of that split are Welsh, still alive, and Cornish, a language people are trying to revitalize. Breton doesn't count because it developed as a language outside of Britain. In Ireland, the Irish language has a similar timespan and can be claimed to be the oldest language of Ireland, but that's obvious.

Another good example is Basque. The most important information we can start with is that it's not Indo-European, unlike all other languages of Western Europe. That means we don't know how long it has been in the area, but there is direct evidence from the Roman Empire that people who spoke a similar language were in the region roughly two thousand years ago at the very least. It is most probable, although not proven, that Basque was in the region long before Indo-European languages (including Celtic) came into the area, so we can probably say with some certainty that Basque is the oldest remaining living language in Western Europe.

But if we don't explain exactly what we mean when saying "the oldest", we're withholding important information that the claim depends on. Language history is too fluid, uncertain and limitless to be brief.

I'm tired of writing for now, I hope it's not too chaotic. I'm not a linguist, but the history of languages is a subject that matters to me, and I'll gladly take my time to explain things I've written or speak about other languages you don't want to read about in Wikipedia. And corrections are of course welcome, I love to learn.

August 23, 2014


This was so interesting! I'm Basque and I thought my language was the oldest in Europe, but turns out it's the oldest in Western Europe only, darn it! Anyway, it's still a wonderful feeling to belong to a community that protected their language from Latin and whose origin is unknown. Have a lingot for the long text and the effort :)

August 23, 2014

Thanks. Yep, I meant that it's not so certain if you consider the entire continent: Indo-Europeans were in Eastern Europe for a very long time, and in Northern Europe there are the Sami people, who also have a very long history of staying in one place.

August 23, 2014

Basque is a really cool language. I hope some day it will be on Duolingo.

August 24, 2014

I'd love people to speak it, but it's quite a difficult language and has no grammatical relation to Spanish or French... so I don't suppose many people will want to learn it, sadly.

August 25, 2014

To be honest, I thought the same about Irish, and still find it pleasantly surprising that so many people want to learn it. It makes me proud.

So don't give up, and if you can encourage and help even a few people to learn your language in your lifetime I think that's something to be extremely proud of. Good luck!

August 25, 2014

Thanks a lot, that was very informative.

August 24, 2014

Technically, from a linguistic perspective, all languages are of equal age, and therefore no language is the oldest. You can name an oldest one if you use a particular criterion, such as the oldest written language, but that's being specific.

All in all though a very interesting post and a great read.

August 23, 2014

Very informative and interesting.

August 23, 2014

Very interesting, thank you. I had begun to wonder about the history of languages but didn't really know where to start.

August 23, 2014

Interesting, but there's a bit wrong in the opening. Icelandic once had a lot more loan words, but they were pruned out of the language to make it more "pure;" it's not that way because it didn't change, but because it was reformed to be. Also, it's unlikely that they would be able to understand Old Norse, and they do need annotations at some points when reading it.

Now to Lithuanian... It's not the overall most conservative language. It's the most conservative language phonetically, but it isn't the most conservative language overall.

August 24, 2014

Thanks. I figured that I'm oversimplifying, but I don't know many details about either. Basically, they are quite archaic in some way and it seemed appropriate to compare them to English with its galloping changes throughout history.

August 24, 2014

Very interesting! Thank you for the effort! I usually tend not to read very long posts since most of the time they're just copied from somewhere else, but I loved how straightforwardly you put it. As a side note, we Persians can easily read written Farsi from at least 10 centuries ago without a problem; maybe a few words need some clarification from time to time.

I recently found the name of a subject within languages that interests me, and that is etymology. I have always been fascinated by how and why word meanings change. You may want to consider that as the topic of your next post!

August 24, 2014

Here, have a lingot! Very interesting and clear text, I loved it! :D

August 24, 2014


August 23, 2014

I don't see any mention of Tamil (scripts from 300 BC) and Sanskrit (2000 BC).

Though I don't speak those languages, I think they're worth mentioning, as the influence of Sanskrit on several western languages is known to all. The basic language of computers was also constructed along the principles of Sanskrit.

The Tamil language is believed to have originated around 2500 BC or so. Today, around 78 million people in the world speak Tamil. It is this fact of contemporary use that makes Tamil the longest surviving language in the world, along with Chinese.

November 26, 2017

Thanks a lot, one lingot for interest. Another factoid for this list is that the oldest Indo-European records date from the 17th century BC and are written in Hittite, a member of the Anatolian sub-family of languages.

August 24, 2014

Thank you so much for this post! I simply adore historical linguistics!!! (:

January 14, 2015