Unicode block facts for kids
A Unicode block is like a special box that holds a group of similar characters or symbols. Think of it as a collection of letters, numbers, or pictures that belong together. These blocks are created by the Unicode Consortium, which is a group that makes sure computers can understand and show text from all languages around the world.
Many blocks contain characters for a specific language, like the letters you use every day. Other blocks hold special symbols, such as those used in mathematics or for emojis!
What are Unicode Blocks?
Imagine you have a giant digital library with every single letter, number, and symbol from every language and writing system ever used. That's what Unicode tries to be! To keep this huge library organized, Unicode divides all these characters into smaller groups called "blocks." Each block has its own special range of codes.
Why do we need Unicode Blocks?
Unicode blocks help keep this huge set of characters organized. Because related characters sit next to each other, people and programs can find them more easily. For example, the letters you are reading right now are part of the "Basic Latin" block. If you wanted to write in Japanese, you would use characters from blocks like "Hiragana" or "Katakana." Since every character has its own code in Unicode, computers everywhere can agree on which character is meant, no matter what language is being used.
How are Unicode Blocks Organized?
Each character in Unicode has a unique number, called a "code point." Unicode blocks are defined by a starting code point and an ending code point. All the characters within that range belong to that specific block. This helps keep everything neat and tidy.
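For example, the "Basic Latin" block starts at U+0000 and ends at U+007F. The "U+" tells you that the number is a Unicode code point. The number after it is written in hexadecimal (base 16), a way of counting that uses the digits 0 to 9 and the letters A to F. So, all the characters from U+0000 to U+007F, 128 in total, are in that block.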
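To see how a block's range works in practice, here is a small Python sketch. The table `BLOCKS` and the helpers `block_of` and `u_plus` are made up for this example (they are not part of Python or Unicode), and only three blocks from this article are listed.

```python
# A tiny, hand-made table of blocks: (first code point, last code point, name).
# Only a few blocks are listed; the real Unicode standard defines many more.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
]

def block_of(char):
    """Return the name of the block that contains char, or None if it is not in our small table."""
    code_point = ord(char)  # ord() gives the character's code point as a number
    for first, last, name in BLOCKS:
        if first <= code_point <= last:
            return name
    return None

def u_plus(char):
    """Write a character's code point in the usual U+XXXX style."""
    return f"U+{ord(char):04X}"

print(u_plus("A"), block_of("A"))    # U+0041 Basic Latin
print(u_plus("あ"), block_of("あ"))  # U+3042 Hiragana
print(u_plus("カ"), block_of("カ"))  # U+30AB Katakana
```

A character from a block that is missing from the small table, such as "é", makes `block_of` return `None`.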
Examples of Unicode Blocks
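Here are a few examples of Unicode blocks to show you how diverse they are. This table shows the first and last code points for each block, its name, how many code points it covers, and how many actual characters are in it. Some blocks have fewer characters than code points because a few spots in the block are still empty, with no character assigned to them.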
First code point | Last code point | Name | Code points | Number of characters |
---|---|---|---|---|
U+0000 | U+007F | Basic Latin | 128 | 128 |
U+0370 | U+03FF | Greek and Coptic | 144 | 135 |
U+0400 | U+04FF | Cyrillic | 256 | 256 |
U+0900 | U+097F | Devanagari | 128 | 128 |
U+0E00 | U+0E7F | Thai | 128 | 87 |
U+1100 | U+11FF | Hangul Jamo | 256 | 256 |
U+2000 | U+206F | General Punctuation | 112 | 111 |
U+2190 | U+21FF | Arrows | 112 | 112 |
U+2200 | U+22FF | Mathematical Operators | 256 | 256 |
U+2600 | U+26FF | Miscellaneous Symbols | 256 | 256 |
U+3040 | U+309F | Hiragana | 96 | 93 |
U+30A0 | U+30FF | Katakana | 96 | 96 |
U+4E00 | U+9FFF | CJK Unified Ideographs | 20,992 | 20,989 |
U+1F600 | U+1F64F | Emoticons | 80 | 80 |
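The number of code points a block covers can be worked out from its first and last code points: count every number from the first to the last, including both ends. Here is a minimal Python sketch, where `block_size` is a name made up for this example.

```python
def block_size(first, last):
    """Count the code points from first to last, including both ends."""
    return last - first + 1

print(block_size(0x0000, 0x007F))  # 128   (Basic Latin)
print(block_size(0x0370, 0x03FF))  # 144   (Greek and Coptic)
print(block_size(0x4E00, 0x9FFF))  # 20992 (CJK Unified Ideographs)
```

The number of characters cannot be worked out this way, because it depends on which code points in the block have actually been given a character.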