An Overview of the History of the Japanese Language
Daniel J. Vogler
20 March 1998
Theories have sprung up to explain the origin of the Japanese language until they have become as varied as the seasons. In fact, Roy Miller, a profuse writer and well-respected authority on this language, says with respect to unraveling its ancestry, "Only one [predominant] language of one major nation remains today without clarification of its origins — Japanese" (Miller 1980, 26). In this paper I will explore the major theories attempting to connect Japanese to other known languages, after first presenting some of the changes from Old Japanese to Modern Japanese, including both the written and spoken forms.

The Point of Reference: Japanese Today

In order to track this journey through the history of the Japanese language, I'll start with the end result: Modern Japanese. Although the spoken language and the written language have obviously influenced one another, they each have their own unrelated histories. Japanese writing is clearly taken from Chinese, but the language itself (i.e. speech) is a mystery.

The feature of spoken Japanese that applies most directly to my arguments is its vowel system, with open syllables. There are five vowel phonemes in Modern Japanese, namely /a/, /i/, /u/, /e/, /o/. Unlike in English, vowel length is important in distinguishing words. Japanese consists of evenly stressed syllables, each of which ends with a vowel. Most also begin with a consonant. And so, we can form words like Na-ga-no and u-tsu-ku-shi-i (beautiful). I will discuss other details of the spoken language later.
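The open-syllable pattern described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it works on romanized spellings, treats every run of consonant letters plus one vowel as a syllable, and ignores complications such as doubled consonants and syllable-final n. The function name split_open_syllables is invented for this sketch.

```python
import re

VOWELS = "aiueo"  # the five vowel letters of romanized Modern Japanese

def split_open_syllables(word):
    """Split a romanized word into chunks that each end in a vowel.

    Simplifying assumption: any trailing consonant letters (for example
    a final n) are silently dropped, so this is not a full syllabifier.
    """
    # Zero or more non-vowel characters followed by exactly one vowel.
    return re.findall(rf"[^{VOWELS}]*[{VOWELS}]", word.lower())

print(split_open_syllables("Nagano"))      # ['na', 'ga', 'no']
print(split_open_syllables("utsukushii"))  # ['u', 'tsu', 'ku', 'shi', 'i']
```

Both example words from the paragraph above divide cleanly into vowel-final chunks, which is the property the rest of the paper relies on.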

The writing system of Japanese is probably the most famous aspect of the language because it is so complex. In fact, a regular sample of written Japanese contains a liberal mixture of three separate systems! One system is the kanji, which are the ideographs borrowed from Chinese. Each kanji is a character that represents a meaning. For example, the concepts sun, moon, fire, and water are each expressed in writing with a single kanji. Since each unrelated idea requires a separate character, thousands of ideographs are necessary for a sufficient writing system. That means that each character must be identifiably different from all the rest, so each individual character can be complex as well. Today there are about two thousand kanji in regular use in Japan.

The other two systems, which are generically called kana, are much simpler because they are both syllabic; this perfectly suits the phonotactic structure of the spoken language. Like the capital and lowercase sets of letters in the Roman alphabet, the two kana systems cover the same phonetic territory but have different orthographic functions. Katakana, the first syllabary, is more angular and is used mostly for transcribing words of foreign origin, such as terebi (television). Hiragana is more cursive, and can be used for grammatical inflections or for writing native Japanese words where kanji are not used. Using the inflected verb kakimasu as an example, the root ka- would be represented by the kanji carrying its meaning (write), and the inflection -kimasu would be written with three hiragana.
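The mixture of three systems can be made visible with a short Python sketch that labels each character of a written word by its Unicode block. The native-script spellings used here (書きます for kakimasu, テレビ for terebi) are the standard ones but do not appear elsewhere in this paper, and the function name script_of is invented for the illustration. The kanji range covers only the basic CJK Unified Ideographs block, which is an assumption that suffices for everyday characters.

```python
def script_of(ch):
    """Return which Japanese writing system a single character belongs to."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:   # Unicode Hiragana block
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:   # Unicode Katakana block
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:   # basic CJK Unified Ideographs block
        return "kanji"
    return "other"

# kakimasu: one kanji for the root, three hiragana for the inflection
print([script_of(c) for c in "書きます"])
# terebi: a loanword, written entirely in katakana
print([script_of(c) for c in "テレビ"])
```

The first line prints one "kanji" followed by three "hiragana", matching the division of kakimasu into root and inflection described above.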

Early Written Language

The Japanese had no writing system prior to the introduction of the Chinese one, which was originally used by Chinese people who lived in Japan during the early Christian era. Later, the educated Japanese used it to write the Chinese language. The earliest known examples of Japanese writing, dating back to the 5th and 6th Centuries A.D., are proper names inscribed with Chinese characters on a mirror and a sword. But by the 8th and 9th Centuries A.D., Chinese characters began to be used to represent the Japanese language. Since the two languages are so different in their syntax and phonology, Chinese loanwords and characters began to be "Japanified" for more convenient use (Encyclopædia Britannica 1997).

The earliest known Japanese records of any length are the Kojiki (A.D. 712) and the Man'yōshū (after 771) (Komatsu 1970). These works are valuable in revealing the evolution of the Japanese writing system from Chinese to a specialized system for recording spoken Japanese. The Kojiki largely maintains Chinese syntax, while using character combinations specific to Japanese for their semantic content. The Man'yōshū, on the other hand, begins to use Chinese characters for their pronunciations to indicate Japanese words (Encyclopædia Britannica 1997).

Because of the complex nature of kanji, using them for phonetic purposes is not very convenient. So the two kana systems developed independently during the 9th Century, as two different methods to simplify writing. Hiragana arose as a cursive abbreviation for the kanji, and was used mostly by women, who were excluded from the study of Chinese characters. They used it mostly for poetry, diaries and novels. Katakana was the product of priests in Buddhist temples. As the priests read Chinese works, they translated them into Japanese and inserted these kana beside the kanji as a mnemonic device to help them with Japanese inflections that were not in the Chinese (Encyclopædia Britannica 1997).

As a result of this Chinese influence and domestic adaptation, Japanese writing developed into the threefold system it is today, with incredible complexity. Part of the reason for its complexity is the incongruity of the Chinese and Japanese spoken languages. Where every word in Chinese is a single syllable, Japanese is a polysyllabic language and requires open syllables. Each kanji has at least two pronunciations: one, an imitation of the equivalent Chinese word (the on reading), forced into the CV phonotactics of Japanese; and the other, a native Japanese word (the kun reading).

The Spoken Language: Internal Diachronic Changes

The ancient texts of Japan have lent themselves to the study of diachronic sound changes in the spoken language. The most amazing discovery about Old Japanese lies in its vowel phoneme system. I for one had accepted it as an article of faith that Japanese has always been phonetically simple, with five "pure" vowels each falling neatly into one of the five Roman letters that we foreigners use today to represent them. However, the Man'yōshū provides a key that led to the discovery that Old Japanese had eight vowel phonemes!

Dr. Shinkichi Hashimoto discovered that characters thought to represent the same sound were in fact never interchanged: each was confined to its own set of words, which means that the sounds they stood for were contrastive (Ōno 1970, 99). The Man'yōshū used kanji not for their meaning, but for the sounds they represented. For example, one character (house) pronounced ke was used in certain words to represent the syllable /ke/. However, another character (spirit or steam), also pronounced ke, was used in entirely different words. Dr. Hashimoto found that these characters did not overlap in their phonetic usages. In fact, he found the same phenomenon across all instances of syllables ending with /e/, /i/ and /o/, in the Man'yōshū, in the Kojiki, and in other documents of the 8th Century. This clear distinction between two types of vowels shows that Old Japanese also had the phonemes /ï/, /ë/, and /ö/, in addition to the five vowels of Modern Japanese (Ōno 1970).
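The logic of this kind of finding can be sketched in Python: collect, for each phonetic character, the set of words it is used to spell, and then test whether two characters ever spell the same word. Everything in the sketch is a placeholder. The labels KE_1 and KE_2 stand for the two characters read ke, the word labels are invented, and no real Man'yōshū data is reproduced; the function names are likewise assumptions made for the illustration, not a description of Dr. Hashimoto's actual procedure.

```python
from collections import defaultdict

def words_by_character(attestations):
    """Map each character to the set of words it is attested spelling."""
    usage = defaultdict(set)
    for word, character in attestations:
        usage[character].add(word)
    return dict(usage)

def never_interchanged(usage, char_a, char_b):
    """True if no word is ever spelled with both characters."""
    return usage.get(char_a, set()).isdisjoint(usage.get(char_b, set()))

# Placeholder attestations: (word label, character label). Not real data.
attestations = [
    ("word_1", "KE_1"), ("word_2", "KE_1"), ("word_1", "KE_1"),
    ("word_3", "KE_2"), ("word_4", "KE_2"),
]

usage = words_by_character(attestations)
print(never_interchanged(usage, "KE_1", "KE_2"))
```

If the two characters had really stood for one and the same sound, scribes would be expected to swap them freely, and the check would fail as soon as a single word turned up with both spellings.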

Before proceeding, I want to share a personal observation. In my comparison of the histories of Japanese and English, I have come to the conclusion that there is a linguistic homeostatic relationship between (a) a change in the number of vowel phonemes, and (b) a new distinction between other attributes of vowels. In other words, when (a) occurs, (b) will result to compensate. Modern Japanese (a) has lost three vowels since Old Japanese, but (b) has gained a distinction between long and short vowels that did not exist before. The development of English (which used to differentiate between short and long vowels) shows a similar change in the opposite direction. In order to compensate for (a) the loss of vowel duration as a phonemic factor, English (b) has developed a new distinction between tense and lax vowels, which has given rise to new phonemes. It could be said that English and Japanese have traded places with respect to vowel length and number of vowel phonemes.

After Dr. Hashimoto had shown that Old Japanese had eight vowels, Dr. Hideyo Arisaka and Professor Teizō Ikegami proved that the result was vowel harmony (Ōno 1970, 107). This is a phonological principle that permits combinations of "harmonious" vowels in a given word, but excludes other combinations. Below is Ōno’s (1970) chart illustrating vowel categories.

Group A: /a o u/

Group B: /ë ö ï/

Group C: /e i/

A word may contain more than one vowel from Group A, or more than one from Group B: kuro "black"; isago "sand"; kökörö "heart." Vowels of Group C can appear with those of either group. However, vowels from the first two groups rarely appear in the same word. In fact, /ö/ and /o/ never coexist in the same word (Ōno 1970).
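The three groups translate directly into a small Python check. This is a minimal sketch, assuming the transcription used in this paper, and it treats the tendency as a strict rule, which is stronger than the source's "rarely." The function names and the mixed form kökoro are invented for the illustration; kökoro is a made-up counterexample, not an attested word.

```python
import unicodedata

GROUP_A = set("aou")
GROUP_B = set("ëöï")
GROUP_C = set("ei")   # neutral: may appear with either group

def harmony_groups(word):
    """Return the set of group labels (A, B, C) of the vowels in a word."""
    # Normalize so that accented vowels are single precomposed characters.
    word = unicodedata.normalize("NFC", word.lower())
    groups = set()
    for ch in word:
        if ch in GROUP_A:
            groups.add("A")
        elif ch in GROUP_B:
            groups.add("B")
        elif ch in GROUP_C:
            groups.add("C")
    return groups

def is_harmonious(word):
    """A word is harmonious unless it mixes Group A and Group B vowels."""
    groups = harmony_groups(word)
    return not ("A" in groups and "B" in groups)

for w in ["kuro", "isago", "kökörö", "kökoro"]:
    print(w, is_harmonious(w))
```

The three attested examples pass, while the invented kökoro fails because it combines /ö/ from Group B with /o/ from Group A.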

Vowel harmony is common in Altaic and Uralic languages, such as Turkish and Finnish, and later I will show how it has been used to support theories relating Japanese to these groups.

Spoken Language: Attempts to Classify

Japanese is not conclusively linked to any other language or family of languages. It has remained a mystery despite all these centuries of research, and continues to prod the people who speak it to seek out their identity. (Since their anthropological identity is also vague, Miller (1986) recommends keeping the people's anthropological roots out of the issue of the roots of the Japanese language. Therefore, I have not included any evidence from that field.)

Despite the ambiguity of its ancestry, theories about Japanese have been whittled down over the last few centuries to two of the most prominent and promising prospects. Today, Western linguists believe it is related either to Korean, which is a geographic neighbor, or to the Ural-Altaic family, or to both. Like Japanese, Korean is an orphan, and most advocates of Japanese-Altaic also propose that Korean belongs to its Altaic friends.

Before I proceed to discuss Korean and Altaic ties, I will touch lightly on one member of the wide assortment of other theories that have tried to explain the origin of Japanese. Some have suggested that Japanese is related to the Austronesian or Malayo-Polynesian languages because of their phonotactic similarities. For example, they share all five common vowels, as well as the attributes of open syllables and no diphthongs (Komatsu 1962, 52). These traits and others show a remarkable correlation between Japanese and languages of the isles of the sea, and linguistic contact probably would have been geographically possible. However, I feel this overlooks the historical 8-vowel system of Old Japanese. None of these theories seems quite as feasible as Korean or Ural-Altaic.

Based on my own limited experience, Korean is my personal choice for the closest relative to Japanese. They just sound so similar! And beyond that, scientific evidence supports their relationship: both the grammatical morphology and the phonology of the two languages coincide. Ōno (1970) specifies several grammatical similarities. For example, word order is so similar that translation requires little rearrangement. Also, neither Japanese nor Korean has an article, but both have postpositions (as opposed to prepositions or inflections) to indicate grammatical function. Phonological correlations include the fact that neither distinguishes the liquids /l/ and /r/ (Ōno 1970). Also, neither has initial /r/ except in more recent words, especially loanwords. There is even a historical parallel between the phonology of the two languages: where Japanese had vowel harmony until the 9th Century, Korean maintained it all the way up to the 17th Century (Komatsu 1962, 58). Because of all these similarities, nearly all theories incorporate Korean into the equation, even if their main thrust is another language group.

The majority of scholarly opinions point toward the Altaic family as the home of Japanese. Ōno (1970) indicates a number of reasons to support this theory. For example, Altaic languages include many cases of vowel harmony, like that found in Turkish. Also, like Japanese, Altaic languages have no grammatical distinction of number, nor of gender. Neither has relative pronouns or passive voice, but both have postpositions or particles instead of word order or declension to indicate function (Ōno 1970). On and on trails the list of similarities between Altaic and Japanese. I find the sheer volume of evidence to be convincing by itself. But even more than that, the Altaic family shows the most promise because of the quality of the evidence. Many of these characteristics are not very typical in other language groups, especially Indo-European. The probability of Japanese and Altaic sharing any one unusual trait by chance is not very high, and the probability of their sharing so many of them by chance is far lower still. So I see the abundance of improbable evidence as significant support for the relationship between Japanese and the Altaic languages.
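The arithmetic behind this argument can be sketched in Python. The sketch assumes that the traits are statistically independent, so that the chance of sharing all of them is the product of the individual chances. Both that assumption and the figure of 0.3 per trait are hypothetical, chosen only to show the shape of the curve; grammatical traits may well be correlated with one another, in which case the true combined probability would be higher than this product suggests.

```python
import math

def chance_of_sharing_all(trait_probabilities):
    """Probability of sharing every trait by chance, assuming independence."""
    return math.prod(trait_probabilities)

# Hypothetical per-trait chances of an accidental match. Not measured values.
hypothetical = [0.3, 0.3, 0.3, 0.3, 0.3]

# Show how the combined probability shrinks as traits are added.
for n in range(1, len(hypothetical) + 1):
    print(n, "traits:", chance_of_sharing_all(hypothetical[:n]))
```

Each added trait multiplies the combined probability by a number below one, which is why a long list of individually unremarkable matches can still be striking in the aggregate.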

The connection between Japanese and Altaic has been refined somewhat since it was first suggested almost 150 years ago. In 1857, the Viennese scholar Anton Boller proposed that Japanese was descended from the Ural-Altaic group of languages (Miller 1986, 34). Since that time, linguistic research has split that group into the Uralic and the Altaic families. Most Western scholars have dropped the theories that maintain Uralic in the family tree of Japanese, so that Altaic remains in the forefront. However, a few such as Kazár (1980) still fight for Uralic. As support, he cites the vowel harmony shared by Japanese, Turkish, and Old Korean, as well as other languages. However, Korean is not valid for his argument, since no one knows whether it is a Uralic language or not; Kazár here relies on faulty support. Moreover, vowel harmony fuels the Altaic theory just as well. Although Kazár uses exhaustive specific examples, his arguments are nearly identical to the ones that support the Altaic theory. And adding further doubt to the Uralic connection, he quotes an objection to his own cause: the phonetic correspondences are not established, and if we accept Uralic, then Japanese can be compared to any language (Kazár 1980). The Uralic half of Boller's theory has given way to the Altaic.

But how is it possible that the language of Finland in Northern Europe can be related to the language of an East Asian island chain? The distance is a daunting obstacle. But we can easily apply to Altaic what Kazár said of Uralic: the geographical distance between Japan and the sources of the Altaic languages is no greater than between Britain and India, both of which have Indo-European languages (Kazár 1980).

Miller believes so strongly in the Altaic connection that he has written a book entitled Japanese and the Other Altaic Languages (1971), as if there were no question regarding its classification. He reconciles most of the languages that can be tied to Japanese in one way or another by drawing a comparison with some Indo-European languages that we are more familiar with. The key, he says, is to understand that Japanese is not a "hybrid" language any more than English is (1986, 166-7). English is a Germanic language, despite heavy influences from two Italic languages, French and Latin, and from other languages more closely related to it (Norse). Likewise, argues Miller, Japanese is an Altaic language, although its history includes heavy influence from a variety of sources. He says that speakers of the Altaic and Uralic groups had some "close association," and therefore "early linguistic contact" (1986, 53), but whether that implies actual linguistic ancestry or merely mutual influence is not critical.

Although there is something of a general consensus in the West that Japanese is an Altaic language, we cannot be absolutely sure where Japanese comes from. Numerous conflicting theories are still advocated, both here and in Japan. Japanese and Korean are still each usually classified as independent of any other language.

Recent Influences

Today, the standard variety of Japanese is the Tōkyō dialect. Because of both the government's efforts and modern communication, other dialects are becoming homogenized so that nearly everyone can understand and speak it. The Tōkyō dialect is taught in school, spoken on television, heard on the radio, and read in the newspapers.

In 1946, the government implemented a simplification of the writing system. It put forward a list of 1,850 Tōyō Kanji (Current Characters), requiring that publishers limit themselves to these characters wherever possible. This action reduced the number of kanji necessary for literacy, and simplified existing kanji. The list includes 881 characters for use in the grade-school curriculum. Despite the lingering complexity of Japanese writing, Japan maintains one of the highest literacy rates in the world.

As Modern Japan has entered the cosmopolitan scene, its language has been enriched by a recent influx of Western loanwords, transliterated into katakana. These include pan (bread), from Portuguese, and arubaito (part-time job), from German Arbeit (work). But the majority of recent loanwords come from English, especially in the domains of technology and entertainment. A Japanese person today can go to Makkudonarudo to grab a hambaagaa to eat. He can watch a bideo on his terebi, or sit down at the kompyuutaa to type a letter on his waapuro (word processor) and save it on a disuku. Because of the volume and variety of English words appearing in Japanese print today, it seems to me that a Japanese person would not be able to understand a popular magazine in his own language unless he also had a good command of English vocabulary.


As linguistic research progresses throughout the continuum of the world’s languages, perhaps a definitive ancestor of Japanese will be pinpointed, and the people of Japan will rest from their identity crisis. But more likely, researchers will continue to refine the current theories to make the relationships between Altaic, Korean, and Japanese clearer and more precise. Who knows? Maybe the future could bring another discovery as revolutionary as the 8-vowel system of ancient Japanese texts. At any rate, I am content with Japanese as an independent language, an integral part of a beautiful culture, and a self-sufficient creature with its own personality.


