The Origin of Cambodian

Linguistics 540- Hallen

February 24, 1998

Ratry Badell

I would like to begin by introducing my background and what inspired me to choose Khmer as a language to research. I am from Cambodia (see Appendix 9), which was taken over by Pol Pot, the Communist leader (from 1970 until 1995). Although Cambodia is no longer in Pol Pot’s hands, it is still not a stable country. There are still rallies in some areas of Cambodia, as well as coups d'état for control of the government. Estimates of the number of people killed during Pol Pot’s reign run into the millions.

The Communists took over Cambodia when I was six years old; they destroyed schools and killed teachers and scholars of every kind. They did not favor educated people: knowing that it is easier to rule people who are ignorant, they tried their best to keep us all illiterate. Pol Pot’s regime created many tragedies, including the destruction of the people’s hope of leaving any type of legacy for future generations. Many souls, myself included, grew up not knowing anything about their native language. Becoming familiar with the legends of my native language and the prospect of getting closer to my ancestors inspired me to research the Khmer language.

This paper will deal mostly with the origin of Cambodian (Khmer). Although the information on this topic is sparse, I will attempt to bring to light and organize the information that I have found during my research. In order to do so, I will talk about the history of the Khmer language and in so doing, try to explain its origin and how it came to be what it is today.

According to the International Encyclopedia of Linguistics, the language of the nation of Cambodia, formerly Kampuchea, is often called "Cambodian"; but the autonym, as well as the proper linguistic and ethnographic term, is "Khmer" (Diffloth 271). In Old Khmer, kmer means ‘slave’ and was adopted as an ethnonym by the Thai and Vietnamese conquerors. The modern pronunciation is /khmae/ (the /r/ is not pronounced); this is in great measure due to European influence, mostly that of the French. The name ‘Cambodia’ comes from Kambu, the eponymous ancestor of the Khmer, plus -ja ‘son of’; it was replaced with ‘Kampuchea’ in the 1970s, but in 1989 the Heng Samrin government readopted the name ‘Cambodia’.

The Khmer language can be divided into these four periods:

Pre-Angkorian Old Khmer (seventh to eighth century)
Angkor Old Khmer (ninth to thirteenth century)
Middle Khmer (fourteenth to eighteenth century)
Modern Khmer (eighteenth century to the present)

(Parkin 64)

It was during the Pre-Angkorian Old Khmer period that the Cambodian people first received a phonologically based writing system, the Pallava script. This was also the era in which there was a great influx of Indian ideas. It was during the Angkor period, also known as the Classic Period, that the influence of Buddhism contributed to the borrowing of vocabulary from Pali into Khmer. It was also during this period that Khmer greatly influenced the surrounding languages of Thai, Lao, Vietnamese, and Cham (Diffloth 275). Middle Khmer was a period of literary writing, evidenced by the texts found on palm leaves and in other manuscripts. The Middle Khmer period was one of gradual evolution, so there is no clear breaking point separating it from the Modern Khmer period, which has stretched into the present day.

The earliest Old Khmer inscription found is dated about 533 of the Saka Era (611 CE) and uses a form of the Pallava script (Diffloth 271). At first, it was unclear to me whether or not this is the language we call Pali, which was derived from Sinhalese, a dialect of Sanskrit. However, my research led me to the realization that Pallava is actually distinct: it belongs to the large family of Indian writing systems derived from the Brahmi script, the script of the Ashokan inscriptions of ancient India. It is the same Pallava script, used for writing Old Khmer, that has evolved through time into the present-day Cambodian script. At the same time, it is very clear that Cambodian, or Khmer, has adopted many words from Pali. Furthermore, Pali appears to be the closest relative to Cambodian that presently exists. This being the case, I think it necessary to explain a little of the origin of Pali.

The Pali language is one of the Prakrits, or Aryan vernaculars of ancient India. It was spoken in the sixth century before Christ, and is considered to have been a dead language for over two thousand years. Pali is sometimes called the language of the monks, and it is still widely used among the monks and the learned throughout Asia. Buddhist tradition holds that Pali, the dialect of Magadha (a sub-category of Sanskrit), was the language in which Gautama Buddha preached. Just as a Jew of the present day looks upon the language of the Pentateuch, so the Buddhists look upon Pali, or Magadhi, as a language raised by the genius of a great reformer to the dignity of a classic language. They believe that had Gautama Buddha never preached, the Pali language, or dialect of Magadha, would never have been distinguished from any of the other ancient tongues (Childers VI).

Childers further explained:
"The true or geographical name of the Pali language is Magadhi, Magadhese language, or Magadhabhasa, language of the Magadha people. The word pali in Sanskrit means line, row, or series; and by the Buddhists of south India is extended to mean the series of books which form the text of the Buddhist Scriptures. Thence it comes to mean the text of the scriptures as opposed to the commentaries and at last any text, or even portion of a text, of either scriptures or commentaries. Palibhasa therefore means language of the texts, which of course is equivalent to saying Magadhi language. The term pali in the sense of sacred text is ancient enough, but the expression Palibhasa is of modern introduction, and Magadhi is the only name used in the old South Buddhist texts for the sacred language of Buddhism" (ix).

Present-day Khmer has 33 consonants, 32 sub-consonants, 26 vowels, and 14 independent vowels (the independent vowels are rarely used). Khmer, unlike the neighboring languages of Thai and Lao, is not a tonal language. Khmer words are either monosyllabic or sesquisyllabic. The only exceptions to this are the Sanskrit and Pali loanwords, which can have many syllables. The writing system is alphasyllabic and is written from left to right (Bright 467). The consonant is written first, and the vowel symbol is then written in front of, above, below, or behind that consonant (see Appendices 2-5).
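
The placement rule can be illustrated with the way Khmer text is stored on a computer. The short sketch below is only an illustration and rests on assumptions that none of my sources discuss: it uses the Unicode encoding of Khmer and Python's standard unicodedata module, and the names KA, SIGNS, describe, and build_syllables are hypothetical helpers chosen for this example. It shows that the consonant is always stored first, even when the vowel sign is drawn in front of it.

```python
import unicodedata

# Code points from the Unicode Khmer block (U+1780 to U+17FF).
# Assumption: Unicode is used only as a convenient modern illustration.
KA = "\u1780"  # KHMER LETTER KA

# One dependent vowel sign for each drawing position around the consonant.
SIGNS = {
    "in front (left)": "\u17C1",  # KHMER VOWEL SIGN E
    "above": "\u17B7",            # KHMER VOWEL SIGN I
    "below": "\u17BB",            # KHMER VOWEL SIGN U
    "behind (right)": "\u17B6",   # KHMER VOWEL SIGN AA
}


def describe(syllable):
    """Return the Unicode name of each code point, in stored order."""
    return [unicodedata.name(ch) for ch in syllable]


def build_syllables(consonant=KA):
    """Pair the consonant with each vowel sign; the consonant is stored first."""
    return {position: consonant + sign for position, sign in SIGNS.items()}


if __name__ == "__main__":
    for position, syllable in build_syllables().items():
        print(position, syllable, describe(syllable))
```

Whether the vowel sign actually appears on the left, above, below, or on the right depends on the font and text renderer; the stored order stays consonant first in every case.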

Khmer belongs to the Mon-Khmer family, which comprises more than one hundred distinct languages including "Lower Burma, Palaung, Wa, Riang in Burma, the Khmu, Lamet, Laos, the Kuy, Chong, Sre in Vietnam, and possibly Semang and Sakai in the Malay peninsula" (Huffman IX). Mon-Khmer languages are located in mainland Southeast Asia and on some of the islands. This family and the Munda languages of India form a larger phylum called Austro-Asiatic (see Appendix 8). According to Parkin, the Austro-Asiatic population is probably over 80 million, of whom 65 million or more are Vietnamese. The next largest group are the Khmer, currently about six million, a drop of some two million because of the killing fields of Pol Pot’s regime, which began in the 1970s and lasted until 1989. The third largest is the Santal, one of the largest tribes in India, with nearly four million people (Parkin 1).

Khmer is spoken by some 5,000,000 people in Cambodia, about 400,000 inhabitants of the provinces of Buriram, Surin, and Sisaket in northeastern Thailand, and about 450,000 people in South Vietnam along the Mekong Delta (Huffman IX). It is unclear when the split between the Mon-Khmer languages occurred; some linguists believe that it happened a thousand years ago. However, the International Encyclopedia of Linguistics states that Khmer was one of the first languages of Southeast Asia, along with Cham and Mon, to receive a phonologically based writing system (Diffloth 271). Receiving a writing system is one of the most important events in the life of a language if it is to survive through time.

As with other relatively new languages, Khmer was greatly influenced by the language of the visiting educated society, in this case the Sanskrit and Pali languages of the Indian people. Like Rome in ancient times, the Indians took their language, religion, and culture to every country they inhabited. It is commonly held that Angkor (whose name is derived from the Sanskrit word ‘nagara’, meaning ‘city’) would never have been built without the influence and architectural technology of the Indians (see Appendices 6 and 7).

George Coedes stated that the Indians came to Southeast Asia for two reasons: commerce and the quest for gold (Mazzeo and Antonini 18). Hippalus recorded that the Indians came to the region about 50 A.D. during the monsoon season. Only in that season did the periodic winds of the region blow, enabling ships to sail more swiftly and making the voyage quicker and safer. Whenever these Indian traders, prospectors, and sailors wished to return, they were obliged to wait in one port or another for a monsoon to blow from the east and take them back home to India. So they settled down for months at a time and established close ties with the local people; such contacts facilitated the transmission of Indian ideas, language, and religious faith (Mazzeo and Antonini 19).

During the Angkor period, the Indians brought Buddhism to the region of Cambodia; with them came Pali, the language of the Theravada Buddhist canon (scriptures). It is one of the dialects of Middle Indo-Aryan, sometimes called Prakrits, which are derived from Sanskrit. This canon of scripture, like the Bible, is a compilation of the writings of many different monks from varying regions. Because of the rise of Buddhism, Pali has continued to be the main source of Khmer vocabulary in the modern period (Diffloth 271).

Khmer borrowed vocabulary from Pali and from the Indian language Sanskrit, which served as the religious and scholarly language during the Angkor period. For several hundred years Sanskrit was used only in speaking directly to the gods (prayer). To this day, it is used when addressing or speaking about Deity, kings, and monks. Because of the adoption of Sanskrit words into the Khmer language, the Cambodian people were divided into two groups. The first group, usually farmers, could speak only Khmer and was considered low class. The second group, consisting mostly of learned scholars and monks, spoke Sanskrit and was considered high class (Chandler 12).

Many linguists have attempted to give an accurate account of where the Khmer language is derived from, but they have been unsuccessful because very little information is available in books or other sources. In addition, the many wars and much strife on the soil of that little nation have contributed greatly to the lack of good historical information. Most of these linguists, however, have come to the same conclusion: that Khmer was influenced by the Pali language, a dialect of Sanskrit. Sadly, not many have been able to get beyond that point. Huffman, Parkin, Schmidt, and Coedes state that Khmer has borrowed words in the fields of administration, the military, and literature, as well as religious faith (Buddhism). However, Chandler and Diffloth offered the remarkable insight that the earliest Old Khmer used a form of the Pallava script and that this script has evolved locally over the centuries into the Cambodian writing system of today.

After much reading and pondering, I strongly agree with these last two linguists. Spoken Khmer, in my opinion, has drawn its vocabulary from Pali, a dialect of Sanskrit. However, it appears that Pallava may have been the predominant source of the actual script that is used today. As is evidenced in Appendix 1, written Khmer is the direct descendant of Pallava. This is true, however, only as far as the script is concerned. The Pallava script is, for all intents and purposes, the mother of written Khmer. So far as I can find out, it is impossible to pin down the exact lineage of the spoken language. In my research I came across two schools of thought with regard to this subject. The first stated that spoken Khmer came from Chenla, a Chinese dialect. Funan, also a Chinese dialect, is the other possibility (Mazzeo and Antonini 35). Pursuing this question, however, would have been too lengthy for this short research paper. Perhaps in the future this will be an eye-opening topic of research for me to pursue. It is interesting, though, that there seems to be a common ancestry pointing in the direction of Chinese.

This lends support to the belief that all languages are connected and probably come from the same source. Quite possibly this original source was the language spoken by Adam at the beginning of time. I will not be too surprised if one day this is indeed verified as truth. I recall reading the story of the people of the Tower of Babel, whose language the Lord confounded because they were wicked. They were separated by the uniqueness of their new languages, and this in turn also separated them geographically.

I have attempted to validate my point that Khmer has adopted vocabulary from Pali, and in so doing I have studied select words from the two languages. Khmer and Pali share similar words with the same meanings. These similarities, however, are relatively few when compared with the likenesses between the written forms of Khmer and Pallava (see Appendix 1). Below are some Pali words taken from the Buddhist Dictionary. (I was unable to locate a Pallava dictionary, inasmuch as Pallava is ancient and survives only through its descendants.) The left-hand column gives the Pali word, the middle column the corresponding Khmer word, and the right-hand column their meaning.

Pali        Khmer       Meaning
Lakkhana    Lakkhana    Characteristics
Loka        Loka        World
Panna       Panna       Wisdom, knowledge
Kamma       Kamma       Action, deed (wholesome or unwholesome)
Sati        Sati        Mind, mindfulness (conscious)
Metta       Metta       Loving kindness (mercy)
Vaca        Vaja        Speech
Sasana      Sasana      Message; Buddhasasana (message of Buddha)
Vijja       Vicha       Discipline (subject of study)
Saddha      Sadta       Faith
Jivita      Jivit       Life, vitality
Jara        Chara       Old age
Sukha       Suk         Pleasant bodily feeling
Dukkha      Tuk         Painful bodily feeling
Samadhi     Samati      Concentration
Lobha       Lob         Greed
Nimitta     Nimit       Mental image that arises during a concentration exercise
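
One rough way to put a number on how close the romanized pairs are is a string-similarity ratio. The sketch below is only an illustration under stated assumptions: it uses Python's standard difflib module, a handful of pairs copied from the list above, and hypothetical helper names (similarity, rank_pairs). A score computed on romanized spellings is not a linguistic proof of borrowing; it simply orders the pairs from most to least alike.

```python
from difflib import SequenceMatcher

# A few (Pali, Khmer) romanized pairs copied from the list above.
PAIRS = [
    ("loka", "loka"),
    ("vaca", "vaja"),
    ("vijja", "vicha"),
    ("jivita", "jivit"),
    ("dukkha", "tuk"),
    ("lobha", "lob"),
    ("samadhi", "samati"),
]


def similarity(pali, khmer):
    """Return a ratio between 0.0 and 1.0; 1.0 means identical spellings."""
    return SequenceMatcher(None, pali.lower(), khmer.lower()).ratio()


def rank_pairs(pairs):
    """Sort pairs from most to least similar spelling."""
    scored = [(similarity(pali, khmer), pali, khmer) for pali, khmer in pairs]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, pali, khmer in rank_pairs(PAIRS):
        print(f"{pali:10} {khmer:10} {score:.2f}")
```

Identical spellings such as Loka score 1.0, while shortened forms such as Tuk for Dukkha score lower, which matches the impression given by reading down the columns.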

I have enclosed the Cambodian writing, the Pali script, and the Pallava script so that one may see and compare them. The words listed above are given in romanization, but the actual scripts may be observed in Appendices 1 through 4. The scripts in those tables are accompanied by romanization as well.

As far as can be discovered from the sparse research on this topic, the Cambodian language seems to be an amalgam of Pali vocabulary and Pallava script. As for how the adoption happened, I could not find any concrete evidence. I wish good luck to anyone who tries to locate materials on this topic. Nonetheless, this research has been an intriguing experience for me. I thank those linguists who have gone before me. Without their efforts I would surely have met with complete and utter failure as far as this topic is concerned. I hope that the research I have compiled here will make it possible for a later generation, and perhaps a more skilled linguist, to learn and understand more about the Khmer language.


Huffman, Franklin E. Modern Spoken Cambodian. New Haven: Yale University Press, 1970. Reprinted, Ithaca: Southeast Asia Program, Cornell University, 1884.

Diffloth, Gerard. International Encyclopedia of Linguistics. New York: Oxford University Press, 1992.

Childers, Robert Caesar. A Dictionary of the Pali Language. London, 1875.

Nyanatiloka. Buddhist Dictionary: Manual of Buddhist Terms and Doctrines. Colombo: Frewin & Co., Ltd., 1956.

Parkin, Robert. A Guide to Austroasiatic Speakers and Their Languages. Oceanic Linguistics Special Publication No. 23. University of Hawaii Press, 1991.

Chandler, David P. A History of Cambodia. Westview Press, Inc., 1992.

Mazzeo, Donatella, and Chiara Silvi Antonini. Monuments of Civilization: Ancient Cambodia. New York: Grosset & Dunlap, Inc., 1978.

Daniels, Peter T., and William Bright, eds. The World’s Writing Systems. New York: Oxford University Press, 1996.

1998-1999 © Dr. Cynthia L. Hallen
Department of Linguistics
Brigham Young University
Last Updated: Monday, September 6, 1999