Diaeresis facts for kids

◌̈ ◌̤
Two dots

Diacritical marks of two dots ¨, placed side-by-side over or under a letter, are used in a number of languages for several different purposes. The most familiar to English-language speakers are the diaeresis and the umlaut, though there are numerous others. For example, in Albanian, ë represents a schwa. Such diacritics are also sometimes used for stylistic reasons (as in the family name Brontë or the band name Mötley Crüe).

In modern computer systems using Unicode, the two-dot diacritics are almost always encoded identically, having the same code point.

Uses

Diaeresis

The "diaeresis" diacritic is used to mark the separation of two distinct vowels in adjacent syllables when an instance of diaeresis (or hiatus) occurs, so as to distinguish from a digraph or diphthong. For example, in the obsolete spelling "coöperate", the diaeresis reminded the reader that the word has four syllables co-op-er-ate, not three. It is used in several languages of western and southern Europe, though rarely now in English.

Umlaut

The "umlaut" diacritic indicates a sound shift  – also known as umlaut – in which a back vowel becomes a front vowel. It is a specific feature of German and other Germanic languages, affecting the graphemes ⟨a⟩, ⟨o⟩, ⟨u⟩ and ⟨au⟩, which are modified to ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ and ⟨äu⟩.

The two dots derive from a small letter e written above the affected vowel. In older German handwriting such as the Sütterlin script, formerly in wide use, the letter e is formed as two short parallel vertical lines very close together.

Stylistic use

The two-dot diacritic is also sometimes used for purely stylistic reasons. One example is the Brontë family, whose surname was derived from Gaelic and had been anglicised as "Prunty" or "Brunty". At some point, the father of the sisters, Patrick Brontë (born Brunty), decided on the alternative spelling with a diaeresis over the terminal ⟨e⟩ to indicate that the name had two syllables.

Similarly, the "metal umlaut" is a diacritic that is sometimes used gratuitously or decoratively over letters in the names of hard rock or heavy metal bands – for example, those of Motörhead and Mötley Crüe, and of parody bands, such as Spın̈al Tap.

Other uses by language

A double dot is also used as a diacritic in cases where it functions as neither a diaeresis nor an umlaut. In the International Phonetic Alphabet (IPA), a double dot above a letter is used for a centralized vowel, a situation more similar to umlaut than to diaeresis. In other languages it is used for vowel length, nasalization, tone, and various other uses where diaeresis or umlaut was available typographically. The IPA uses a double dot below a letter to indicate breathy (murmured) voice.

Vowels

  • In Albanian, Tagalog, and Kashubian, ⟨ë⟩ represents a schwa [ə].
  • In Aymara, a double dot is used on ⟨ä⟩ ⟨ï⟩ ⟨ü⟩ for vowel length.
  • In the Basque dialect of Soule, ⟨ü⟩ represents the front rounded vowel [y].
  • In the DMG romanization of Tunisian Arabic, ⟨ä⟩, ⟨ö⟩, ⟨ṏ⟩, ⟨ü⟩, and ⟨ṻ⟩ each represent a separate vowel sound.
  • In Ligurian official orthography, ⟨ö⟩ represents a distinct vowel sound.
  • In Māori, a diaeresis (e.g. Mäori) was often used on computers in the past instead of the macron to indicate long vowels, as the diaeresis was relatively easy to produce on many systems, and the macron difficult or impossible.
  • In Seneca, ⟨ë⟩ ⟨ö⟩ are nasal vowels, though ⟨ä⟩ is as in German umlaut.
  • In Vurës (Vanuatu), ⟨ë⟩ and ⟨ö⟩ encode two different vowel sounds.
  • In the Pahawh Hmong script, a double dot is used as one of several tone marks.
  • The double dot was used in the early Cyrillic alphabet, which was used to write Old Church Slavonic. The modern Cyrillic Belarusian and Russian alphabets include the letter ⟨ё⟩ (yo), although replacing it with the letter ⟨е⟩ without the diacritic is allowed in Russian.
  • Since the 1870s, ⟨Ї⟩, ⟨ї⟩ (Cyrillic letter yi) has been used in the Ukrainian alphabet for iotated [ji]; plain ⟨і⟩ [i] is not iotated. In Udmurt, ⟨ӥ⟩ is used for uniotated [i], with ⟨и⟩ for iotated [ji].
  • The form ⟨ÿ⟩ is common in Dutch handwriting and also occasionally used in printed text – but is a form of the digraph "ij" rather than a modification of the letter ⟨y⟩.
  • Komi and Udmurt use ⟨Ӧ⟩ (a Cyrillic O with two dots) for a mid central vowel.
  • The Swedish, Finnish and Estonian languages use ⟨Ä⟩ and ⟨Ö⟩ to represent a near-open front unrounded vowel and a mid front rounded vowel, respectively.
  • In the languages of J.R.R. Tolkien's Middle-Earth novels, a diaeresis is used to separate vowels belonging to different syllables (e.g. in Eärendil) and on final e to mark it as not a schwa (e.g. in Manwë, Aulë, Oromë, etc.). (There is no schwa in these languages but Tolkien wanted to make sure that readers wouldn't mistakenly pronounce one when speaking the names aloud.)
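
Several of the letters in the list above, Latin and Cyrillic alike, can be split into a base letter plus combining marks. The following is a minimal sketch using Python's standard unicodedata module; the helper name base_and_marks is made up for this illustration and is not part of any library.

```python
import unicodedata


def base_and_marks(ch: str):
    """Split one character into its base letter and the names of its combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed[0], [unicodedata.name(mark) for mark in decomposed[1:]]


# Latin and Cyrillic letters mentioned in the list above.
for ch in "ëäöüÿёїӧṏṻ":
    base, marks = base_and_marks(ch)
    print(ch, "=", base, "+", marks)
```

Letters that carry a second accent, such as ⟨ṏ⟩, report more than one mark, which shows that the two dots are stored as a separate layer on top of the base letter.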

Consonants

Jacaltec (a Mayan language) and Malagasy are among the very few languages with a double dot on the letter "n"; in both, n̈ is the velar nasal.

In Udmurt, a double dot is also used with the consonant letters ӝ (from ж), ӟ (from з) and ӵ (from ч).

When distinction is important, Ḧ and ẍ are used for representing [ħ] and [ɣ] in the Kurdish Kurmanji alphabet (which are otherwise represented by "h" and "x"). These sounds are borrowed from Arabic.

Ẅ and ÿ: Ÿ is generally a vowel, but it is used as the (semi-vowel) consonant [ɰ] (a [w] without the use of the lips) in Tlingit. This sound is also found in Coast Tsimshian, where it is written ẅ.

A number of languages in Vanuatu use double dots on consonants to represent linguolabial (or "apicolabial") phonemes in their orthography. Thus Araki contrasts bilabial p with linguolabial p̈, bilabial m with linguolabial m̈, and bilabial v with linguolabial v̈.

Seneca uses ⟨s̈⟩ for [ʃ].

In Arabic the letter ẗ is used in the ISO 233 transliteration for the tāʾ marbūṭah [ة], used to mark feminine gender in nouns and adjectives.

Syriac uses two dots above a letter, called Siyame, to indicate that the word should be understood as plural. For instance, ܒܝܬܐ (bayta) means "house", while ܒܝ̈ܬܐ (bayte) means "houses". The sign is used especially when no vowel marks are present that could differentiate between the two forms. Although the origin of the Siyame is different from that of the diaeresis sign, in modern computer systems both are represented by the same Unicode character. This, however, often leads to wrong rendering of the Syriac text.
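
The shared encoding of Siyame and diaeresis can be checked with a short Python sketch. The code points spelled out below are an assumption about how the two example words are typed (Beth, Yudh, Taw, Alaph, with the mark placed after the Yudh); the variable names are chosen only for this illustration.

```python
import unicodedata

# Assumed spelling of the two example words, written as explicit code points.
singular = "\u0712\u071D\u072C\u0710"        # bayta, "house"
plural = "\u0712\u071D\u0308\u072C\u0710"    # bayte, "houses", with Siyame

# The only character in the plural that the singular lacks is the combining mark.
extra = [ch for ch in plural if ch not in singular]
print([f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in extra])

# Removing all combining marks from the plural gives back the singular spelling.
stripped = "".join(ch for ch in plural if not unicodedata.combining(ch))
print(stripped == singular)
```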

The N'Ko script, used to write the Mandé languages of West Africa, uses a two-dot diacritic (among others) to represent non-native sounds. The dots are slightly larger than those used for diaeresis or umlaut.

Diacritic underneath

The IPA specifies a "subscript umlaut" (two dots below a letter) for breathy voice, as in the transcription of the Hindi word for "potter". The ALA-LC romanization system, one of the main schemes used to romanize Persian, also provides for it (for example, rendering ⟨ض⟩ as ⟨z̤⟩). The notation has also been used to write some Asian languages in Latin script, for example Red Karen.
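
In Unicode the mark underneath is a separate combining character, U+0324. The sketch below, written in Python with the standard unicodedata module, shows how it attaches to a base letter; the variable names are illustrative only.

```python
import unicodedata

DIAERESIS_BELOW = "\u0324"
print(unicodedata.name(DIAERESIS_BELOW))

# z has no precomposed form with dots below, so NFC keeps two code points.
z_below = unicodedata.normalize("NFC", "z" + DIAERESIS_BELOW)
print(z_below, len(z_below))

# u does have a precomposed form, so NFC merges the pair into one code point.
u_below = unicodedata.normalize("NFC", "u" + DIAERESIS_BELOW)
print(u_below, f"U+{ord(u_below):04X}")
```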

Computer encodings

In Unicode

Character encoding generally treats the umlaut and the diaeresis as the same diacritic mark. Unicode refers to both as diaereses without making any distinction.

Unicode encodes a number of cases of "letter with a two-dot diacritic" as precomposed characters. (Unicode uses the term "Diaeresis" for all two-dot diacritics, irrespective of the actual term used for the language in question.)

Both the combining character U+0308 and the pre-composed codepoints may be regarded as an umlaut or a diaeresis according to context.
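
As a rough illustration of the relationship between the combining character and the precomposed code points, here is a minimal Python sketch using the standard unicodedata module. The helper names compose and decompose are invented for this example.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"  # one code point, whether read as umlaut or diaeresis


def compose(base: str) -> str:
    """Attach U+0308 to a base letter; NFC picks the precomposed form if one exists."""
    return unicodedata.normalize("NFC", base + COMBINING_DIAERESIS)


def decompose(text: str) -> str:
    """Split precomposed letters back into base letter plus combining marks."""
    return unicodedata.normalize("NFD", text)


u_dots = compose("u")   # becomes the single precomposed character ü
n_dots = compose("n")   # no precomposed n with two dots exists, so it stays a pair

print(u_dots, len(u_dots), unicodedata.name(u_dots))
print(n_dots, len(n_dots))
print([f"U+{ord(ch):04X}" for ch in decompose("ö")])
```

The Unicode character name contains "DIAERESIS" even for a letter that a German reader would call an umlaut, which matches the naming convention described above.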

In ASCII, ISO/IEC 646 and ISO 8859

ASCII, a seven-bit code with just 95 "printable" characters, has no provision for any kind of dot diacritic. Subsequent standardisation treated ASCII as the US national variant of ISO/IEC 646: the French, German and other national variants reassigned a few code points to specific vowels with diacritics, as precomposed characters.

The subsequent (eight bit) ISO 8859-1 character encoding includes the letters ä, ë, ï, ö, ü, and their respective capital forms, as well as ÿ in lower case only, with Ÿ added in the revised edition ISO 8859-15 and Windows-1252.
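
The coverage of these legacy encodings can be explored with Python's built-in codecs. This is only a sketch; the helper try_encode is a made-up name, and the codec labels are the aliases Python uses for these standards.

```python
def try_encode(ch: str, encoding: str):
    """Return the encoded bytes, or None if the encoding has no code for ch."""
    try:
        return ch.encode(encoding)
    except UnicodeEncodeError:
        return None


# ASCII has none of these letters; ISO 8859-1 lacks only the capital Ÿ.
for ch in ("ä", "ÿ", "Ÿ"):
    for encoding in ("ascii", "iso8859-1", "iso8859-15", "cp1252"):
        print(repr(ch), encoding, try_encode(ch, encoding))
```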

These standards are technically obsolete, having been replaced by Unicode.

Computer usage

Keyboard input

In countries where the local language(s) routinely include letters with double dots, local keyboards are typically engraved with those symbols. If letters with double dots are not present on the keyboard, there are a number of ways to input them into a computer system.

Apple MacOS, iOS

iOS provides accented letters through press-and-hold on most European Latin-script keyboards, including English. Some keyboard layouts feature combining-accent keys that can add accents to any appropriate letter. On macOS, a letter with double dots can be produced by pressing Option+U, then the letter. This works on English and other keyboards and is documented further in the supplied manuals.

Google ChromeOS

For ChromeOS with the US-International keyboard setting, the combination is ", release, then the letter. For ChromeOS with the UK extended setting, use AltGr+2, release, then the letter. Alternatively, the Unicode code point may be entered directly, using Ctrl+Shift+U, release, then the four-digit code, then Enter or Space.

Microsoft Windows

AZERTY and QWERTZ keyboards (as used in much of Europe) include precomposed characters (accented letters) as standard and these are fully supported by Microsoft Windows, typically accessed using the AltGr key.

For users with a US keyboard layout, Windows includes a setting "US International", which supports creation of accented letters by changing the function of some keys into dead keys. If the user enters ", nothing will appear on screen, until the user types another character, after which the characters will be merged if possible, or added independently at once if not. Alternatively, the desired character may be generated using Alt codes.

For users in the United Kingdom and Ireland with QWERTY keyboards, Windows has an "Extended" setting such that an accented letter can be created using AltGr+2, then the base letter.

When using Microsoft Word for Windows or Outlook, a letter with double dots can be produced by pressing Ctrl+Shift+: and then the letter.

Linux / X Window System

On X-based systems with a Compose key, pressing Compose, then ", then a produces ä, and similarly for many other letters, including capital letters.

In addition, any Unicode code point can be entered: for instance, Ctrl+Shift+U followed by the hexadecimal digits f6 and then Space produces U+00F6, which is ö.
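
Entering a code point directly amounts to reading the typed digits as a hexadecimal number and looking up that character. A minimal Python sketch of the same lookup follows; char_from_hex is a hypothetical helper name used only here.

```python
def char_from_hex(hex_digits: str) -> str:
    """Interpret typed hexadecimal digits as a Unicode code point."""
    return chr(int(hex_digits, 16))


typed = "f6"
result = char_from_hex(typed)
print(result, f"U+{ord(result):04X}")
```

Leading zeros make no difference, so "f6" and "00F6" select the same character.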

Dedicated keys

The German keyboard has dedicated keys for ü ö ä. Scandinavian and Turkish keyboards have dedicated keys for their respective language-specific letters, including ö for Swedish, Finnish, and Icelandic, and both ö and ü for Turkish. French and Belgian AZERTY keyboards have a dead key which adds a circumflex (if without Shift) or a diaeresis/umlaut (if with Shift) to the letter key immediately following (for instance Shift-^ followed by e gives ë).

Other scripts

For non-Latin scripts, Greek and Russian use press-and-hold for double-dot diacritics on only a few characters. The Greek keyboard has dialytika and dialytika–tonos variants for upsilon and iota (ϋ ΰ ϊ ΐ), but not for ε ο α η ω, following modern monotonic usage. Russian keyboards feature separate keys for е and ё.

On-screen keyboards

The early 21st century has seen noticeable growth in stylus- and touch-operated interfaces, making the use of on-screen keyboards operated by pointing devices (mouse, stylus, or finger) more important. These "soft" keyboards may replicate the modifier keys found on hardware keyboards, but they may also employ other means of selecting options from a base key, such as right-click or press-and-hold. Soft keyboards may also have multiple contexts, such as letter, numeric, and symbol.

HTML

In HTML, vowels with double dots can be entered with an entity reference of the form &?uml;, where ? can be any of a, e, i, o, u, y or their majuscule counterparts. For instance, &Auml; produces Ä.
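
The entity references described above can be resolved with Python's standard html module, as in the sketch below. The dictionary name resolved is chosen for this illustration.

```python
import html

# Build every entity of the form &?uml; for the six vowels in both cases.
letters = "aeiouyAEIOUY"
resolved = {f"&{letter}uml;": html.unescape(f"&{letter}uml;") for letter in letters}

for entity, character in resolved.items():
    print(entity, "->", character)
```

Note that the entity names all end in "uml" regardless of whether the mark is functioning as an umlaut or a diaeresis in the language concerned.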

TeX and LaTeX

TeX (and its derivatives, most notably LaTeX) also allows double dots to be placed over letters. The standard way is to use the control sequence \" followed by the relevant letter, e.g. \"u. It is good practice to set the sequence off with curly braces: {\"u} or \"{u}.

TeX's "German" package can be used: it adds the " control sequence (without the backslash) to produce the Umlaut. However, this can cause conflicts if the main language of the document is not German. Since the integration of Unicode through the development of XeTeX and XeLaTeX, it is also possible to input the Unicode character directly into the document, using one of the recognized methods such as Compose key or direct Unicode input.

TeX's traditional control sequences can still be used and will produce the same output (in very early versions of TeX these sequences would produce double dots that were too far above the letter's body).

All these methods can be used with all available font variations (italic, bold etc.).
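
To illustrate how the \" control sequence maps onto Unicode, here is a rough Python sketch that converts the two common forms, \"u and \"{u}, into precomposed characters. It is not a TeX parser: the function name and the regular expression are assumptions made for this example, and the outer-brace form {\"u} simply keeps its braces.

```python
import re
import unicodedata

# Matches a backslash, a double quote, and then either {letter} or a bare letter.
TEX_UMLAUT = re.compile(r'\\"(?:\{([A-Za-z])\}|([A-Za-z]))')


def tex_umlauts_to_unicode(source: str) -> str:
    """Replace \\"x and \\"{x} with the letter x carrying two dots."""
    def replace(match):
        letter = match.group(1) or match.group(2)
        return unicodedata.normalize("NFC", letter + "\u0308")
    return TEX_UMLAUT.sub(replace, source)


print(tex_umlauts_to_unicode(r'M\"unchen and co\"{o}perate'))
```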
