Everything Language Learners Should Know About Orthography

What does writing mean to linguists and philosophers of language?

In mainstream linguistics, writing is generally thought of as a secondary feature of language. Spontaneously using sounds and gestures to convey complex information or emotions is primordially human. Writing, by contrast, is a form of technology.

Socrates considers speech to be morally superior to writing, and for Jean-Jacques Rousseau, the development of writing represents a movement away from true freedom and the state of nature. Structuralists, and indeed most contemporary linguists, assume that the written word is simply “not language” at all. According to Ferdinand de Saussure:

A language and its written form constitute two separate systems of signs. The sole reason for the existence of the latter is to represent the former. The object of study in linguistics is not a combination of the written word and the spoken word. The spoken word alone constitutes that object.

But the goal of a language learner is almost certainly to be able to understand a language in whichever forms one encounters it, regardless of whether it is “real” or mere “representation.” Language learning is a fundamentally social activity driven by mostly social motivations, and of course speech is a more social activity than reading or writing; but an illiterate person would struggle to truly integrate into an overwhelmingly literate society.

At the very least, writing, and more importantly, reading, are powerful tools that accelerate the process of learning a language.

A brief history of writing

Human beings had been communicating using visible marks for thousands of years before they became competent enough to reliably encode and decode all potential acts of speech. Some scholars call these stages of development “proto-writing” and “full writing.”

But perhaps such a distinction is muddled. A writing system that can capture the entire range of expression possible with speech has yet to emerge. What is meant by “full” is rather a reader’s ability to reproduce words in a hypothetical speech act expected to reflect that envisioned by the writer.

In the evolution from “proto-” to “full,” the level of similarity between intention and reproduction certainly reaches new heights, but it does not constitute any objective completion of a system. Instead, it privileges certain aspects of communication and causes spoken language itself to undergo a transformation. Words, which were once just organizational blocks of sound, become more important than the sounds themselves.

If a writing system becomes fuller as it gains the power to record potential speech acts, then audio recording is the fullest writing system developed to date. If, rather, the goal of writing is not to capture speech but to communicate on its own terms, with a distinct set of potentials that sometimes fall short of, but sometimes exceed, those of spoken language, then writing may have begun 20,000 years ago with cave paintings and engravings.

Unfortunately, in using the latter definition one is forced to include the creation of all objets d’art – even those entirely unrelated to language – as writing. This is clearly even more absurd than the inclusion of audio recordings, and so we come full circle to writing as either language or the representation of language, and to the earliest “full writing” systems on the understanding that they are not “full” so much as “at the peak of their association with spoken language.”

The prevailing view is that four different human cultures achieved “full writing” independently of one another. These were the Sumerian civilization, based in what is today called Iraq; Ancient Egypt; Shang-Dynasty China about a millennium later; and the civilizations of present-day Mexico and Guatemala, sometime between 900 and 600 BC. Each of these developments spread rapidly and prompted neighboring cultures to develop their own writing systems.

The general historical trend is that writing systems evolve through stages from more ideographic (symbols represent concepts or objects) to more phonographic (symbols represent sounds). However, there is some variation even among contemporary writing systems. For example, Chinese writing is a hybrid system that retains many of the conceptual associations of each character from its earlier forms. Many Asian and East African languages use pseudo-alphabetic writing systems, in which vowels are either absent, optional, or written as diacritics (accents).

Classifications of the world’s major writing systems

The type of writing system familiar to all literate speakers of English is the alphabet. Almost all alphabets are thought to trace their origins to Phoenician, which evolved from Egyptian hieroglyphs and spread widely during the Iron Age, as Phoenicia became the dominant commercial power throughout the Mediterranean.

Variants of the Latin alphabet are the most widely used in the world today. Besides being ubiquitous in Europe and the Americas, they are used to write numerous African, Southeast Asian, and Turkic languages.

The Cyrillic (used to write Russian, Bulgarian, and related languages) and Greek (today only used to write Greek) alphabets share many characters with Latin and others with each other. The Armenian and Georgian alphabets each have a distinct set of characters, but both are thought to have been significantly influenced by Greek as well.

Languages whose most common writing system is based on Latin:

| Language | Distinctive Characteristics |
| --- | --- |
| English | Standard 26-letter system with minimal diacritics. |
| Spanish | Highly phonemic spelling, 27 letters including ñ, plus the acute accent to denote stressed syllables. |
| French | Standard 26-letter system with five diacritics and two ligatures. K and W are recent additions and uncommon. |
| Portuguese | Standard 26-letter system with five diacritics. K, W and Y are recent additions and uncommon. |
| Indonesian | Highly phonemic, standard 26-letter system with no diacritics. A modified Arabic script is also used to write the same language. |
| German | Pseudo-phonemic system of 27 letters including ẞ, or 30 officially including Ä, Ö, and Ü. |
| Turkish | Q, W and X are absent, Ç, Ğ, Ö, Ş and Ü are added, and always-dotless I is distinguished from always-dotted İ to form a total of 29 letters with mostly phonemic spelling. |
| Vietnamese | F, J, W and Z are officially not included, but are used for loanwords and names. Ă, Â, Ê, Ô, Ơ, Ư and Đ are considered additional letters. Five more diacritics are used constantly to indicate tone (Á, À, Ã, Ả, Ạ), and can be stacked over the breves and circumflexes found in the standard letters. This allows the system to be an extremely phonemic representation of a tonal language. |
| Tagalog | Officially, Ñ and Ng are included as additional letters to the standard system, forming a total of 28. C, F, J, K, Q, V, X and Z have only been included since 1987, and another spelling reform was carried out in 2013 to make the system more phonemic. |
| Swahili | Q and X are excluded, while Ch is often considered to replace C, which is only found in loanwords and names. Ng’ and Sh are often included as additional letters, and sometimes Dh, Gh, Ng, Sh, and Th are as well. A variant of Arabic script is still commonly used alongside the Latin alphabet. |
| Javanese | Dh, É, È, Ng, Ny, and Th are added to the standard system to make 32 letters, but F, Q, V, X and Z are generally only found in loanwords. The language is also often written with traditional Javanese script and Arabic script. |
| Italian | Highly phonemic spelling based on 21 official letters, with J, K, W, X and Y used only for loanwords. Grave, acute and circumflex diacritics are also used. |
| Polish | The inclusion of Ą, Ć, Ę, Ł, Ń, Ó, Ś, Ź, and Ż makes up a somewhat phonemic 35-letter system. However, Q, V and X are only found in loanwords. |
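
The stacking of diacritics described for Vietnamese can be made visible on a computer. The following is a minimal sketch, assuming text encoded in Unicode and using Python's standard `unicodedata` module; the function name `describe_marks` is a hypothetical helper introduced only for this illustration. It splits each letter into a base character and the combining marks layered on top of it.

```python
import unicodedata


def describe_marks(text):
    """Pair each letter with the names of the combining marks it carries."""
    described = []
    # Compose first (NFC) so each letter, with its marks, is a single
    # character wherever Unicode provides a precomposed form.
    for ch in unicodedata.normalize("NFC", text):
        # Canonical decomposition (NFD) then yields the base character
        # followed by zero or more combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        base, marks = decomposed[0], decomposed[1:]
        described.append((base, [unicodedata.name(m) for m in marks]))
    return described
```

Under these assumptions, `describe_marks("\u1ea5")` (the letter ấ) reports the base `a` with a circumflex, which belongs to the letter Â, and an acute accent, which marks the tone. By contrast, `describe_marks("\u0111")` (the letter đ) reports no marks at all, because Unicode treats the stroke as an inseparable part of the letter.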

Languages whose most common writing system is based on Cyrillic:

| Language | Distinctive Characteristics |
| --- | --- |
| Russian | 33 letters including Ё and Э, which are not found in Old Slavonic or Bulgarian. |
| Ukrainian | 33 letters including Ґ and Є, which are not found in Russian, Bulgarian, or Old Slavonic, and Ї which is almost unique, today shared only with some Rusyn dialects. Ukrainian uses an apostrophe where Russian would use a Ъ. It also does not include the Old Slavonic Ы or the Russian Ё or Э. |
| Serbian | A 30-letter Cyrillic system is used equally alongside a Latin system shared with Croatian. The Serbian Cyrillic alphabet includes Ђ, not found in Old Slavonic or many others, but comparable to the Bulgarian Дж. It also includes the original letters Џ, Љ, Њ and Ћ, which have since influenced other alphabets; and the letter J from Latin. |
| Uzbek | The Uzbek system features the characters Ғ, Ў and Ҳ, which are somewhat uncommon and not part of Bulgarian, Russian or Old Slavonic. It shares Ё and Э with Russian, but excludes Ц, Щ and Ь which are present in the majority of Cyrillic alphabets. |

Abjads, abugidas, and syllabaries are slightly different from alphabets. Although they are phonographic (symbols represent sounds), not all sounds are represented, and therefore the reader must make some inferences in order to decode these types of writing into speech.

However, it should be noted that many full alphabetic systems are inconsistent in how they represent sound, English being a notable example. Syllable or consonant-based systems are therefore capable of being more phonemic than many alphabets.

Japanese uses two syllabic writing systems, hiragana and katakana (collectively called the kana), alongside its logographic kanji system.

In abugida writing systems, each basic character represents a consonant along with an “implicit” vowel, which can be altered with diacritics. The Brahmic scripts, the Ethiopic script, and Canadian Aboriginal syllabics are all systems of this kind, and the languages that use them are overwhelmingly written this way.

Abjads still in use today include the Arabic and Hebrew scripts. In these systems, basic characters generally represent consonants only. Vowels can be marked with optional diacritics or are implied by the grouping together of certain consonants.
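
The optional status of vowel marks in abjads is reflected in how these scripts are stored digitally: in Unicode, the vowel diacritics of Arabic and Hebrew are separate combining characters placed after the consonant they modify. The sketch below assumes Unicode text and uses Python's standard `unicodedata` module; `strip_marks` and the two sample constants are hypothetical names introduced for this illustration. Note that the filter is deliberately crude: it drops every nonspacing mark, not only vowel signs.

```python
import unicodedata


def strip_marks(text):
    """Drop every nonspacing combining mark (Unicode category Mn).

    This removes vowel diacritics, but also any other nonspacing mark,
    such as the Hebrew shin dot.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Arabic "kataba" (he wrote): three consonants, each followed by a fatha.
VOWELLED_ARABIC = "\u0643\u064e\u062a\u064e\u0628\u064e"

# Hebrew "shalom": shin (with shin dot and qamats), lamed, vav (with holam), final mem.
VOWELLED_HEBREW = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
```

Calling `strip_marks` on either sample leaves only the consonant letters, which is how the same words usually appear in everyday text. Going in the other direction, from bare consonants back to the intended vowels, cannot be done by a simple filter; that is the inference the reader has to make.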

The modern Chinese languages are written using hybrid “logosyllabic” systems. Some characters denote phonological content while others are purely logographic. It’s likely that all ancient logographic systems needed to become logosyllabic in order to accommodate the communication needs of a developing society, but the vast majority disappeared or lost their logographic content entirely by the modern era.

Finally, the modern Korean writing system is considered by some scholars to be distinct from an ordinary alphabet, because it is more phonetically specific than even the shallowest orthographies. Each symbol represents a basic phonological feature, like mouth shape or tongue position, which combine to form letters and then cluster into syllables.
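
The way Korean letters cluster into syllables is regular enough to be computed. The sketch below follows the arithmetic that the Unicode Standard defines for precomposed Hangul syllable blocks; the function name `decompose_hangul` is a hypothetical helper chosen for this illustration. It only demonstrates the letter-to-syllable level of the system, not the finer phonological features encoded in the shapes of the letters themselves.

```python
# Constants from the Unicode Standard's Hangul syllable decomposition.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_COUNT = L_COUNT * V_COUNT * T_COUNT  # 11,172 precomposed syllable blocks


def decompose_hangul(syllable):
    """Split one precomposed Hangul syllable into (initial, vowel, final) letters.

    The final element is an empty string when the syllable has no final consonant.
    """
    index = ord(syllable) - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    initial = chr(L_BASE + index // (V_COUNT * T_COUNT))
    vowel = chr(V_BASE + (index % (V_COUNT * T_COUNT)) // T_COUNT)
    final_index = index % T_COUNT
    final = chr(T_BASE + final_index) if final_index else ""
    return initial, vowel, final
```

For example, `decompose_hangul("\ud55c")` splits the syllable 한 into its initial consonant, vowel, and final consonant, while `decompose_hangul("\uac00")` (가) returns an empty string in the third position because that syllable has no final consonant.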

Can I learn to speak a language without learning to read or write it?

The way we acquire language as children is fundamentally different from the way we learn additional languages as adults. In the former case, reading and writing play almost no role in the formation of basic language skills.

But as adults, we no longer have access to most of the neurological mechanisms for rapidly and organically developing language that we used as children. Instead, we must supplement these processes with more rational and intentional methods, seeking input from as many sources as possible.

Many strategies for teaching and learning second languages are based on the idea of recreating the conditions of childhood language acquisition to the greatest extent possible. But as the patterns of everyday adult life differ so much from those of childhood, a significant amount of language input is likely to be from reading, not listening.

As adults, we instinctively filter out linguistic information that we are unable to understand. As a result, deep social immersion without a strong foundation in vocabulary is likely to be very inefficient. And when it comes to memorization, the visual and conceptual components of written words are extremely helpful.

Of course, learning a language whose writing system is dramatically different from that of your native language can present additional obstacles. In some cases it could make sense to read or practice vocabulary using transliterated text. But this approach has disadvantages of its own, as it restricts the number of opportunities to encounter the language in natural contexts.

Learning an additional language using no text whatsoever is absolutely possible, but it’s very likely to take longer and be more difficult in many ways.

With Lingvist, however, you can supercharge your vocabulary in your target language in just 30 minutes per day, practicing the most relevant words in example sentences similar to those you’ll encounter in everyday situations.
