Mojibake Facts for Kids

This is what a website can look like if the wrong character encoding is used.

The Japanese Wikipedia article for Mojibake uses UTF-8 encoding. This screenshot shows what the page looks like when it is decoded using the Windows CP1252 encoding instead.

Mojibake (文字化け, pronounced /modʑibake/) is a Japanese word for when your computer shows strange, unreadable characters instead of normal text. It happens when a computer program or website doesn't understand the special code used to write the text.

Think of it like this: computers use secret codes, called character encoding, to understand and show letters and symbols. Each letter has a number. When you send text, the computer sends these numbers. If the computer receiving the text uses a different secret code to read the numbers, it might show the wrong letters. This mix-up is Mojibake!
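Here is a small Python sketch of that mix-up. The word "café" is only an example chosen for illustration. The program turns the word into numbers using the UTF-8 code, then reads the same numbers back twice: once with the right code and once with a different one (CP1252).

```python
# Turn a word into numbers (bytes) using the UTF-8 code.
text = "café"
numbers = list(text.encode("utf-8"))
print(numbers)  # [99, 97, 102, 195, 169]

# Read the numbers back with the SAME code: the word is fine.
right = bytes(numbers).decode("utf-8")
print(right)  # café

# Read the numbers back with a DIFFERENT code: Mojibake!
wrong = bytes(numbers).decode("cp1252")
print(wrong)  # cafÃ©
```

Notice that the letters c, a and f survive, because both codes use the same numbers for plain English letters. Only the "é" breaks, because UTF-8 writes it as two numbers and CP1252 reads each number as a separate symbol.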

Long ago, there were many different secret codes for text. This caused a lot of Mojibake problems. Then, a new, bigger code called Unicode was created. The most popular way of storing Unicode text, called UTF-8, can write every character in Unicode, which covers almost all of the world's writing systems. This helped solve many Mojibake issues.

What Causes Mojibake?

Mojibake happens when the computer or program tries to read text using the wrong "language" or code. Imagine you're trying to read a message written in a secret code, but you're using the wrong key to unlock it. The message would look like gibberish!

Wrong Encoding: The most common reason is that the text was saved with one character encoding (like UTF-8), but the computer tries to open it with another (like an older code called ISO-8859-1).
Missing Information: Sometimes, the text doesn't tell the computer which encoding it used. So, the computer just guesses, and often guesses wrong.
Old Programs: Some older computer programs or websites might not be able to handle newer, more complex character encodings like Unicode very well.
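
The first two causes can be tried out in Python. This is only a sketch: the helper names read_with_guess and undo_mojibake are made up for this example, and so is the sample word "Grüße" (German for "greetings"). The first helper reads unlabeled bytes with a guessed code. The second tries to undo a wrong guess by reversing the mistake.

```python
def read_with_guess(data: bytes, guess: str) -> str:
    """Decode bytes using a guessed encoding (hypothetical helper).

    Bytes that the guessed code cannot read become the U+FFFD
    replacement character instead of raising an error.
    """
    return data.decode(guess, errors="replace")


def undo_mojibake(garbled: str, wrong: str = "cp1252", right: str = "utf-8") -> str:
    """Reverse a wrong decode (hypothetical helper).

    Turn the garbled text back into bytes with the wrong code,
    then read those bytes with the right code.
    """
    return garbled.encode(wrong).decode(right)


# The text was saved as UTF-8, but nothing says so.
saved = "Grüße".encode("utf-8")

# The computer guesses CP1252 and shows Mojibake.
garbled = read_with_guess(saved, "cp1252")
print(garbled)

# Reversing the wrong guess brings the real text back.
print(undo_mojibake(garbled))  # Grüße
```

The repair only works when you know which wrong code was used and no bytes were lost along the way. If some bytes were already replaced with U+FFFD, the original text cannot be rebuilt this way.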

How Unicode Helps

Before Unicode, different parts of the world used their own character encodings. For example, one code might work for English, another for Japanese, and another for Arabic. This meant that if you sent a Japanese text to someone using an English-only code, it would turn into Mojibake.

Unicode was created to be a single, universal code for all characters from all languages. It has room for more than a million different symbols. UTF-8 is a very popular way to store Unicode text. It's smart because it uses just one byte for the most common characters (like English letters) and up to four bytes for others (like Japanese kanji or emoji). This makes it efficient, and because one code covers every language, it helps prevent Mojibake.
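
You can check how much space UTF-8 uses with a few lines of Python. This is a small sketch: the helper name utf8_size and the four sample characters are picked just for this example.

```python
def utf8_size(character: str) -> int:
    """Return how many bytes UTF-8 needs for one character (hypothetical helper)."""
    return len(character.encode("utf-8"))


# An English letter, an accented letter, a kanji, and an emoji.
for character in ["A", "é", "文", "😀"]:
    code_point = f"U+{ord(character):04X}"  # the character's Unicode number
    print(character, code_point, utf8_size(character), "byte(s)")

# A  U+0041   1 byte(s)
# é  U+00E9   2 byte(s)
# 文 U+6587   3 byte(s)
# 😀 U+1F600  4 byte(s)
```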

The Meaning of Mojibake

The word Mojibake comes from the Japanese language. It is made of two parts:

Moji (文字) means letter or character.
Bake (化け) comes from the verb bakeru (化ける), which means to change into something else, to appear in disguise, or to transform for the worse.

So, literally, Mojibake means "character transformation" or "character gone wrong." It perfectly describes what happens when text turns into unreadable symbols on your screen!
