Byte order mark Facts for Kids

A Byte Order Mark (BOM) is like a secret code at the very beginning of a text file. It tells computers how the text inside the file is organized, especially when it uses a special way of writing characters called Unicode. Think of it as a tiny label that helps computers understand how the file was written.

The BOM is the Unicode character `U+FEFF`. The bytes used to store this character are different depending on the specific Unicode encoding used.

How BOMs Look for Different Encodings
Bytes (Computer Code)	Encoding Type
EF BB BF	UTF-8
FE FF	UTF-16, big-endian
FF FE	UTF-16, little-endian
00 00 FE FF	UTF-32, big-endian
FF FE 00 00	UTF-32, little-endian
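
The bytes in this table can be checked by asking a programming language to encode the character `U+FEFF` in each encoding. Here is a small sketch in Python (version 3.8 or newer). The function name `bom_bytes` is made up for this example.

```python
def bom_bytes(encoding: str) -> str:
    """Encode U+FEFF with the given encoding and return its bytes as hex."""
    return "\ufeff".encode(encoding).hex(" ").upper()


# Print one row per encoding, in the same order as the table above.
for encoding in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    print(f"{encoding:<10} {bom_bytes(encoding)}")
```

Each printed line should match one row of the table.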

Using a BOM is optional, meaning it's not always needed. But if a file has one, it must be right at the very start. The BOM helps the computer know whether the text is in UTF-8, UTF-16, or UTF-32. For UTF-16 and UTF-32, it also tells the computer about something called endianness. This is important when files move between different computer systems that store bytes in different orders.
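
A program could look for a BOM by comparing the first few bytes of a file with the table above. The sketch below is one possible way to do that in Python. The names `BOMS` and `detect_bom` are made up for this example. The UTF-32 BOMs are checked first because the little-endian UTF-32 BOM starts with the same two bytes (`FF FE`) as the little-endian UTF-16 BOM.

```python
from typing import Optional

# Longer BOMs come first so that FF FE 00 00 is not mistaken for FF FE.
BOMS = [
    (b"\x00\x00\xfe\xff", "UTF-32, big-endian"),
    (b"\xff\xfe\x00\x00", "UTF-32, little-endian"),
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UTF-16, big-endian"),
    (b"\xff\xfe", "UTF-16, little-endian"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return the encoding named by a BOM at the start of data, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None  # no BOM found; remember that a BOM is optional


print(detect_bom(b"\xef\xbb\xbfHello"))  # UTF-8
print(detect_bom(b"Hello"))              # None
```

Because a BOM is optional, a result of `None` only means the file has no BOM. It does not tell you which encoding the file uses.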

What is Endianness?

Endianness describes the order in which a computer stores the bytes (pieces of data) that make up a larger value.

Big-endian means the most important byte (the "biggest" part) comes first.
Little-endian means the least important byte (the "smallest" part) comes first.

It's like writing a date: some countries write day-month-year (little-endian, with the smallest part first), while others write year-month-day (big-endian, with the biggest part first). The BOM helps computers know which order to expect.
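
To see the two byte orders side by side, here is a small sketch in Python (version 3.8 or newer). It stores the number `0xFEFF`, the same value as the BOM character, in two bytes using each order. The variable names are made up for this example.

```python
number = 0xFEFF  # the same value as the BOM character U+FEFF

# Store the number in 2 bytes, once in each byte order.
big = number.to_bytes(2, byteorder="big")
little = number.to_bytes(2, byteorder="little")

print(big.hex(" ").upper())     # FE FF
print(little.hex(" ").upper())  # FF FE

# Reading the bytes back with the matching order gives the same number.
print(int.from_bytes(little, byteorder="little") == number)  # True
```

The two results are the same bytes in opposite order, which is why the UTF-16 BOM is `FE FF` in big-endian files and `FF FE` in little-endian files.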

UTF-8 and the BOM

Today, UTF-8 is the most common way to encode text. Because of this, the `EF BB BF` BOM (often called the UTF-8 signature) is the one you are most likely to see. Web browsers are designed to recognize this UTF-8 BOM. When they see it, they know to read the page as UTF-8 and can display the text correctly.

The official Unicode rules neither require nor recommend using a BOM for UTF-8. However, they do warn that you might find one at the beginning of a file.
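
Since a UTF-8 file may or may not start with a BOM, a program that reads text can be written to cope with both cases. As one possible sketch, Python has a codec called `utf-8-sig` that removes a UTF-8 BOM if it is there and otherwise reads the text normally. The variable names below are made up for this example.

```python
data = b"\xef\xbb\xbfHello"  # the UTF-8 BOM followed by the word "Hello"

plain = data.decode("utf-8")      # keeps U+FEFF as an invisible first character
clean = data.decode("utf-8-sig")  # drops the BOM if there is one

print(len(plain))  # 6
print(len(clean))  # 5
print(clean)       # Hello

# Text without a BOM is read the same way by both codecs.
print(b"Hello".decode("utf-8-sig"))  # Hello
```

Writing with `utf-8-sig` does the opposite and adds the three BOM bytes at the start.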

When BOMs Can Cause Trouble

Most modern computer programs can easily understand and use a BOM. They might even add one automatically when you save a text file. However, sometimes the UTF-8 BOM can cause problems. This usually happens with older software that wasn't made to handle UTF-8. Such software reads the three BOM bytes as three separate characters, so the BOM might show up as strange characters like "ï»¿" at the start of your text.
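
The strange characters can be reproduced on purpose. The sketch below, in Python, assumes the older software reads the file using the Windows-1252 encoding (`cp1252`), which is one common case. The variable names are made up for this example.

```python
# Save "Hello" as UTF-8 with a BOM in front.
data = "Hello".encode("utf-8-sig")

# Read the same bytes the way older software might,
# treating every byte as one Windows-1252 character.
garbled = data.decode("cp1252")

print(garbled == "ï»¿Hello")  # True
```

The bytes `EF`, `BB`, and `BF` turn into "ï", "»", and "¿", which is exactly the junk seen at the start of the text.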
