A History of the Arabic Language

Brian Bishop
Linguistics 450
April 24, 1998

The Arabic language is not well known in the Western world. Having studied the language for almost three years now, I could be considered something of an expert on it. That's not to say, however, that I always knew a lot about Arabic. I certainly wasn't an expert when I initially decided to fulfill the non-Indo-European language requirement for my Linguistics major by studying Arabic instead of Chinese, as I had previously planned. In fact, my knowledge of Arabic up to that point could probably have been summed up in one succinct phrase: I think Arabs speak Arabic!

The fact that Arabic is not well known in the Western world is perhaps regrettable, considering that the Arabic language is spoken natively by over 150 million people (Kaye 664). Moreover, it functions as a liturgical language for the hundreds of millions of Muslims throughout the world. It is truly one of the great modern languages of the world. At the same time, as I have learned through my study, Arabic has deep historical roots. In fact, the history of the Arabic language spans the centuries from well before the advent of the Christian era to modern times. In this paper, I will trace the history of the Arabic language from its roots in Proto-Semitic to the modern linguistic situation in the Arabic-speaking world. In particular, I will focus on the various phonological, morphological, and syntactic changes which together have created Arabic's unique dialectal situation.

Roots of the Arabic Language

As I mentioned above, Arabic is descended from a language known in the literature as Proto-Semitic. This relationship places Arabic firmly in the Afro-Asiatic group of world languages. Merritt Ruhlen's taxonomy in his Guide to the World's Languages helps to further elucidate Arabic's ancestry within this large group of languages. Specifically, Arabic is part of the Semitic subgroup of Afro-Asiatic languages (293). Going further into the relationship between Arabic and the other Semitic languages, Modern Arabic is considered to be part of the Arabo-Canaanite sub-branch of the central group of the Western Semitic languages (323). Thus, to review, while Arabic is not the oldest of the Semitic languages, its roots are clearly founded in a Semitic predecessor.

Arabic as a Proto-Semitic language

As mentioned above, Arabic is a member of the Semitic subgroup of the Afro-Asiatic group of languages. The common ancestor of all Semitic languages (e.g., Hebrew or Amharic) in the Afro-Asiatic group of languages is called Proto-Semitic. Based upon reconstruction efforts, linguists have determined many of the phonological, morphological, and syntactic features of Proto-Semitic. As might be expected, not all Semitic languages have equally preserved the features of their common ancestor language. In this respect, Arabic is unique; it has preserved a large majority of the original Proto-Semitic features. In fact, many linguists consider Arabic the most 'Semitic' of the modern Semitic languages in terms of how completely it preserves features of Proto-Semitic (Mukhopadhyaya 3-4).

Proto-Semitic Phonology

In order to examine the Arabic language's earliest roots, in the next three sections I will compare Modern Standard Arabic to Proto-Semitic, showing the various changes and similarities between the two in terms of phonology, morphology, and syntax. In terms of phonology, Proto-Semitic was characterized in part by the following features: (1) a six-vowel system composed of three short vowels and their three long counterparts (a, i, u, ā, ī, ū); (2) pharyngeal fricative consonants; (3) utilization of the glottal stop as a phoneme; (4) inclusion of the semivowels (w) and (y) as consonants; and (5) the existence of three classes of consonants: voiced, voiceless, and "emphatic" consonants (Britannica 722; Hetzron 657). Modern Arabic matches each of these Proto-Semitic features point by point including, among other items, the "classical triangular [vocalic] system," ā, ī, and ū, and the three types of consonants: voiced, voiceless, and emphatic (Kaye 669).

Proto-Semitic Morphology

Arabic also contains many of the fundamental morphological features of Proto-Semitic. These features included at least the following seven points: (1) words were composed of a consonantal root upon which a scheme made up of vowels was imposed. The root ktb is one such root from which words having to do with writing are derived. For example, maktaba means 'library' or 'place to keep writings' while kātib means 'writer.' The same root occurs in both words, but the vowels and supplementary consonants change to form the various words; (2) the majority of roots incorporated three consonants rather than two consonants; (3) infixation was used frequently and suffixes and prefixes less frequently to accomplish category changes and create related words (Britannica 722); (4) a declension system marked at least three cases, i.e. nominative, accusative, and genitive; (5) three numbers, the singular, dual, and plural, were used with nouns, verbs, and adjectives (Britannica 722, 723); (6) two grammatical genders, masculine and feminine, were distinguished in nouns and adjectives (Hetzron 658); and (7) reverse polarity in gender agreement was exhibited with the numbers from three to ten (Hetzron 659). Once again, Modern Standard Arabic contains all of the classical Proto-Semitic features.
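The root-and-scheme principle in point (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the digit-slot notation (1, 2, 3 standing for the first, second, and third root consonant) and the function name apply_pattern are hypothetical conveniences introduced here, not a notation taken from the sources cited above. The only data used are the two ktb words already mentioned.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Interleave root consonants into a vowel scheme.

    Hypothetical notation: the digits 1, 2, 3 in `pattern` mark where the
    first, second, and third consonants of `root` go; every other character
    (vowels, supplementary consonants) is copied through unchanged.
    """
    result = []
    for ch in pattern:
        if ch in "123":
            # Slot digit: substitute the corresponding root consonant.
            result.append(root[int(ch) - 1])
        else:
            # Scheme material: keep as written.
            result.append(ch)
    return "".join(result)


# The two words from the text, both built on the root k-t-b.
library = apply_pattern("ktb", "ma12a3a")  # 'library', 'place to keep writings'
writer = apply_pattern("ktb", "1ā2i3")     # 'writer'
print(library, writer)
```

The same three consonants appear in both outputs; only the scheme differs, which is the sense in which the root carries the core meaning while the vowels and supplementary consonants form the individual words.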

Proto-Semitic Syntax

Linguists know less about the syntactic features of Proto-Semitic. The assumption is that Proto-Semitic was a VSO language as Arabic is today. Other features are, however, less clear. Presumably, demonstratives followed the noun in Proto-Semitic while they precede the noun in Arabic. On the other hand, subordinate clauses generally followed the head, as they do in Arabic (Hetzron 662).

The resemblance between Arabic and Proto-Semitic is certainly remarkable. Very few changes have taken place between the two. And, of those changes that have taken place, many are simple phonological changes. For example, Proto-Semitic *š has become s and *th has become z with corresponding changes in similar phonemes (Britannica 725).

Unfortunately, there is a caveat in all of this. Up to this point, the word Arabic, as it has been used, has referred to Modern Standard Arabic. This usage has completely disregarded the fact that there are several thousand colloquial or spoken dialects of modern Arabic which do not preserve Proto-Semitic features in such abundance. In truth, of the Proto-Semitic features mentioned, fewer than half can be said to be preserved by the modern colloquial dialects of Arabic (Britannica 723). Thus, to speak of Arabic as if all Arabic dialects were the same is a gross overgeneralization. There is a wide divergence between Modern Standard Arabic and modern colloquial Arabic, and this divergence leads naturally to the subject of the next section: modern Arabic diglossia.

Modern Arabic Diglossia

Modern Arabic is an uncommon language because it is characterized by what is termed diglossia (Blau 1; Diglossia 340). Essentially what this means is that modern Arabic is really almost two languages: Modern Standard Arabic and colloquial Arabic. Modern Standard Arabic is used in reading, writing, and high register speech. It is descended from the Classical language of the Quran and, in the view of almost all Arabs, is the "correct" Arabic (Myths 253). However, Modern Standard Arabic is a learned language. It is no one's mother tongue. In fact, all Arabs grow up learning the second or colloquial language.

Arab colloquial dialects are generally only spoken languages. Arabs use the colloquial language in all their daily interactions, but when they encounter a language situation calling for greater formality, Modern Standard Arabic is the medium of choice. In every area of the world where Arabic is spoken, this language situation prevails: there is a colloquial language, meaning the language which is spoken regularly and which Arabic speakers learn as their L1, and then there is Modern Standard Arabic, based on Classical or Quranic Arabic. Standard Arabic is more or less the same throughout the Arab World, while there are wide differences between the various colloquial dialects. In fact, some of the differences are so large that many dialects are mutually unintelligible. My Palestinian roommate, for example, has told me several times that he can't understand the Moroccan dialect of colloquial Arabic.

Diglossia, while infrequent among the languages of the world, has played a huge role in the development of modern Arabic. Up to this point in the history, when I have spoken of Arabic, I have been referring to Modern Standard Arabic, the language derived from the Classical language of the Quran. From this point onward, I will always differentiate between Modern Standard Arabic and colloquial Arabic. Whenever I speak of colloquial Arabic, I am referring to any of thousands of dialects of Arabic which are spoken natively by Arabic speaking peoples.

Origins of Arabic Diglossia

The primary question in historical Arabic linguistics is this: How did Arabic diglossia originate and develop? As one might expect with such an important question, researchers have advanced a number of theories to answer it. However, no one view is uniformly held by researchers. In order to classify the various theories which have been advanced, a three-part classification can be established: those theories which posit the existence of a koine; those theories which advocate an explanation of language drift; and those which utilize a creolization/pidginization hypothesis to answer the question.


Perhaps the most well known theory regarding the origins of Arabic diglossia is the koine hypothesis. Koine is a term derived from Greek denoting a lingua franca that develops out of a mixture of languages or dialects. This idea of a "common" language was expressed early on by the linguist Fück when he made the claim that a "common Bedouin language" came into existence through the Islamic conquests. This common Bedouin language, then, formed the basis for the later development of the colloquial dialects of Arabic, while Modern Standard Arabic continued to develop from the classical language of the Quran (Belnap 20).

Fück's hypothesis matches in the essential points the koine hypothesis of the American linguist Charles Ferguson. Ferguson posited that the majority of the modern dialects of Arabic are descended from a koine which was not based on any one particular regional area and which existed side by side with the Standard, Classical Arabic (Ferguson 51). Ferguson's argument rested on a list he developed of fourteen features which differ between colloquial Arabic and Standard Arabic (See Appendix 2) (Koine 53). While Ferguson acknowledged that one or several of the features he pointed out could have been due to normal drift and language change, he felt the strength of his argument was the fact that there were fourteen such changes. Taken as a group, he argued, their existence was strong evidence for the existence of a koine (Belnap 30-31). According to Ferguson, then, it was this koine that started diglossia and served as the basis for modern colloquial Arabic.

Language Drift and Normal Tendencies

A second theory advanced by several scholars is one which attributes the difference between Modern Standard Arabic and colloquial Arabic to language drift, natural Semitic language change tendencies, and substratum effects, among others. Those who advocate these theories have often taken vehement exception to the koine hypothesis because they feel it is largely unnecessary and unwarranted by the evidence available. This is despite the fact, however, that there is substantial agreement between them on several points.

For example, both sides agree that changes likely centered in towns and sedentary populations rather than in the dialects of the Bedouin tribes of the Arabian deserts. The Bedouin dialects, both sides feel, likely remained untouched by language change for several centuries after the advent of Islam in the early seventh century (Koine 52; Blau 23). They also agree that there was no one language center in the Arab World which exercised enough influence by itself to cause the changes seen (Koine 53-54; Blau 24, 26). Finally, both sides agree that the most important factor in precipitating the rise of the colloquial Arabic dialects was the Islamic conquests of the seventh and eighth centuries (Blau 21; Koine 52).

This is where the agreement stops. To illustrate, I will examine the views of Joshua Blau, an Israeli scholar who found Ferguson's koine argument entirely unconvincing. He argued that the reverse of Ferguson's hypothesis was true: instead of a koine being the origin of the modern Arabic dialects, it was the koine itself that resulted from the changes in the Arabic dialects (27). In Blau's estimation, the various Arabic dialects developed similarly because of at least two things: unifying factors such as the tendency for Semitic languages to undergo certain changes, and mutual contact between the dialects (Blau 25, 26). This explanation, he felt, was more in line with conventional linguistic theory such as the wave theory of language change diffusion, in which language changes spread wave-like from speech population to speech population (Blau 27).


The third and latest theory in the development of Arabic diglossia is the Pidginization/Creolization theory. Kees Versteegh is one researcher who has advocated this theory. Versteegh argued that both the existing theories of diglossia development focused exclusively on either an explanation of the differences or an explanation of the similarities of the dialects without treating the other side (19). In his estimation, an effective theory needed to treat both the similarities and the differences between the dialects.

By hypothesizing a process of pidginization/creolization, Versteegh accomplished what the other scholars of Arabic weren't able to do, that is, address both the similarities and the differences between the modern dialects of Arabic. For example, he described how mixed marriages between Muslim Arab men and non-Arab women of the conquered peoples would likely have led to communication using a pidginized form of Arabic. At the same time, any children resulting from such a marriage would probably have spoken a creolized Arabic (74). This creolized Arabic could then have served as the starting point for the colloquial Arabic dialects. Of course, Versteegh acknowledged the influence of other factors, but on the whole, he felt his hypothesis succeeded in explaining both the differences and the similarities between modern Arabic dialects.

Diglossia Concluded

Though scholars differ in opinions over the exact cause for the rise of the Arabic dialects, there is some ground for general agreement. This agreement is perhaps best summed up in a statement by Fischer and Jastrow:

One will hardly go wrong if one imagines that the development of New [colloquial] Arabic was connected with dialect mixing in the camps of the conquerors, the influence of the languages and dialects of the conquered, and the formation of regional vernaculars. Later population displacements and constant leveling tendencies through cross-regional contacts between the cities, likewise tendencies toward peculiar developments among the most isolated rural populations, may have been equally important developmental factors (Belnap 32).

Results of Arabic Diglossia

While linguists disagree sharply regarding how diglossia developed, there is consensus regarding the changes that have taken place in the switch from Standard Arabic to colloquial Arabic. Phonologically, for example, a number of phonemes have shifted systematically in the change from Standard Arabic to colloquial Arabic. For example, Egyptian colloquial Arabic has shifted all interdental fricatives to their corresponding alveolar articulation. Other colloquial dialects have made similar changes.

There have also been a number of morphological changes, most importantly the loss of case endings, or 'iraab, as it is known in Arabic. Standard Arabic has a system of three cases (nominative, accusative, and genitive), while colloquial Arabic dialects have generally lost any case system. Other morphological changes include the collapsing of multiple particles into a single form and the loss of feminine plural forms in pronouns, adjectives, and verbs (Blau 3).

Syntactic changes are also abundant. Blau mentions specifically how most dialects have dropped the syndetic/asyndetic alternation which was common in Standard Arabic (3). Versteegh emphasizes the fact that most dialects have become analytical whereas Standard Arabic is more synthetic. One place where this is easily seen is in showing possession; Standard Arabic uses a synthetic method to show possession, but almost all dialects have now developed an analytical method of showing possession using a word which shows the possession relationship (Versteegh 18).

Modern Linguistic Situation in the Arabic Language

Modern Arabic, both Standard and colloquial, is not static. The colloquials have undergone and will likely continue to undergo great change. Unfortunately, until recently they have not been closely studied, and therefore it is difficult to document any changes they may have undergone. It is easier, however, to document changes in Modern Standard Arabic.

One ongoing trend in Modern Standard Arabic is modernization. Modernization involves the creation of new terms for concepts which didn't exist in earlier times. Like many other speakers around the world, Arabic speakers are sensitive to the wholesale borrowing of words. In fact, they are perhaps more sensitive to language change because most Arabs recognize Arabic as the language of God. Such a concept doesn't accommodate language change well. As a result, normative language academies have been established in several areas throughout the Arab world including Cairo, Damascus, Baghdad, and Amman (Bakalla 11).

The language academies try to control borrowing by creating terms for new technological entities. Their typical means for doing this include extension, calques, and a process known as Arabization. A common example of extension involves the Standard Arabic word for car, sayyāra. This word originally meant 'caravan of camels' but has been redefined to mean 'car.' Calques are more obvious in such phrases as kurat al-qadam, which is literally 'ball of the foot,' or football (soccer) (Bakalla 12). Arabization, on the other hand, involves the adoption of a foreign word, but with changes which make it acceptable to Arabic morphological and phonological patterns (Bakalla 13).

Another trend I have noticed, both in personal experience and in my research, is the expectation among Arabs that the Arab world is slowly turning toward Modern Standard Arabic as its mother tongue. This trend has two parts. First, in my experience, Arabs uniformly disparage the colloquial dialects they speak natively. For example, a teaching assistant in my current Arabic language class emphasizes every time she tells us a colloquial Arabic word that it is "slang." The other part of this phenomenon is that Arabs expect that Modern Standard Arabic is eventually going to prevail as the L1 in the Arab world. Ferguson noted this tendency when he stated there is an expectation among Arabs that Modern Standard Arabic will take over the Arab world (Myths 255). I was introduced to this idea personally in May of 1997 when, during a conversation with a taxi driver in Amman, Jordan, I was told that I needed to speak Standard Arabic. This was despite the fact, as I told him, that no one actually speaks Standard Arabic natively.

History of Arabic Writing System

Before concluding, I wish to examine briefly the historical development of the Arabic writing system. Descended from the North Arabic script, the modern Arabic writing system runs from right to left and is a cursive script. There are 28 letters in the alphabet, but because the script is cursive, 22 of the letters take different shapes when they are in initial, medial, final, or isolated positions (See Appendix 1). The remaining six letters have only two possible forms because they can be joined only to a preceding letter; a following letter cannot be connected to them. The three long vowels are represented within the alphabet. However, the three short vowels are not. Short vowels can be indicated by optional diacritical markings, but these are most often not written. Those texts in which they are written are usually of a religious nature, and they are included to ensure that all the words are pronounced properly.

Historically, the North Arabic script, the earliest extant copies of which date to the 4th century A.D., is descended from the Nabatean Aramaic script. However, because the Aramaic script represented fewer consonants than Arabic required, the use of some shapes was extended by means of dots placed on the letters. Thus there are several letters in Arabic whose only distinguishing feature from another Arabic letter is the placement of a dot above or below the letter (Daniels 559).

Because the short vowel diacritics are usually left unwritten, written Arabic is highly lexicalized: you have to know the words in order to be able to read the language correctly. Many Arab intellectuals criticize this situation and have proposed changes in order to give the Arabic writing system a stricter one-to-one correspondence between letter and sound (Daniels 563). However, resistance to the change is so high that it is very unlikely such a change will ever take place. Many explain that Arabic is the language of God (Allah), and as such has no need to be changed.


In many ways the idea stated in the previous paragraph, that Arabic is the language of Allah, has defined how the Arabic language has behaved over the centuries. Of course in the early years, before the advent of Muhammed, Arabic developed and grew, though it was largely localized among the tribes of Arabia. As the Islamic conquests took place, however, Arabic became the language of the conquered peoples both because it was the language of their conquerors and because it was the language of their newly adopted religion.

In subsequent years, the desire to preserve the proper pronunciation and reading of the Holy Quran has been the driving force behind the maintenance of Classical Arabic as the standard par excellence for the Arabic language. Even today, when you ask an Arab about the colloquial dialect they speak, they are most likely to respond that what they speak is a "slang." For them, correct Arabic is Classical Arabic, a language which no one speaks natively, but which has been preserved from the Quran. Linguistically, the Arab world is a complex struggle between the progressiveness of colloquial Arabic and the conservative action of Standard Arabic which is fostered by religion. The interaction of the religious and the linguistic is part of what has made Arabic the interesting and vital language it is today.

Appendix 1
The Arabic Alphabet

Derived in part from Alan Kaye, "Arabic," pg. 674.

Appendix 2
Ferguson's Fourteen Points in Support of the Existence of an Arabic Koine

  1. Loss of the dual.
  2. Taltalah.
  3. Loss of final-wāw verbs.
  4. Re-formation of geminate verbs.
  5. The verb suffix -l- "to, for".
  6. Cardinal numbers 3-10.
  7. /t/ in the numbers 13-19.
  8. Loss of the feminine comparative.
  9. Adjective plural fuʿāl.
  10. Nisbah suffix -iyy > *-ī.
  11. The verb "to bring."
  12. The verb "to see."
  13. The relative *'illi.
  14. The merger of ḍād and ðā'.

Summarized from Charles A. Ferguson, "The Arabic Koine."

Works Cited

"Afro-Asiatic Languages." Encyclopedia Britannica. 1992 ed.

Bakalla, Muhammad Hasan. Arabic Culture Through its Language and Literature. London: Kegan Paul International, 1984.

Belnap, R. Kirk and Niloofar Haeri, eds. Structuralist Studies in Arabic Linguistics: Charles A. Ferguson's Papers, 1954-1994. Leiden: Brill, 1997.

Blau, Joshua. Studies in Middle Arabic and its Judaeo-Arabic Variety. Jerusalem: The Magnes Press and the Hebrew University, 1988.

Daniels, Peter T. and William Bright, eds. The World's Writing Systems. New York: Oxford University Press, 1996.

Ferguson, Charles A. "The Arabic Koine." 1959. Structuralist Studies in Arabic Linguistics: Charles A. Ferguson's Papers, 1954-1994. Ed. R. Kirk Belnap and Niloofar Haeri. Leiden: Brill, 1997. 50-68.

---. "Diglossia." Word 15 (1959): 325-40.

---. "Myths About Arabic." 1959. Structuralist Studies in Arabic Linguistics: Charles A. Ferguson's Papers, 1954-1994. Ed. R. Kirk Belnap and Niloofar Haeri. Leiden: Brill, 1997. 250-256.

Hetzron, Robert. "Semitic Languages." The World's Major Languages. Ed. Bernard Comrie. New York: Oxford University Press, 1987. 654-663.

Kaye, Alan S. "Arabic." The World's Major Languages. Ed. Bernard Comrie. New York: Oxford University Press, 1987. 664-685.

Mukhopadhyaya, Satakari. Preface. A Grammar of the Classical Arabic Language. By Mortimer Sloper Howell, trans. 4 Vols. Delhi, India: Gian Publishing House, 1986.

Ruhlen, Merritt. A Guide to the World's Languages. Stanford, California: Stanford University Press, 1987.

Versteegh, Kees. Pidginization and Creolization: The Case of Arabic. Amsterdam: John Benjamins Publishing Company, 1984.


1998-1999 © Dr. Cynthia L. Hallen
Department of Linguistics
Brigham Young University
Last Updated: Monday, September 6, 1999