URL encoding facts for kids
URL encoding, also called percent-encoding, is a special way to change certain characters in web addresses (like `http://www.example.com/page?name=John Doe`) so that computers can understand them correctly. Think of it like translating a secret code!
Web addresses, or URIs, are made up of a limited set of characters. Some characters have special jobs, like the `/` that separates parts of an address. When you need to use one of these special characters, but you want it to be part of the name or data, you have to "encode" it. This means turning it into a special code that starts with a percent sign (`%`).
Why Do We Need URL Encoding?
Imagine you're writing a message, and some words have a secret meaning. If you want to use those words just as regular words, you might put them in quotes or change them slightly so people know you're not using their secret meaning.
Web addresses work similarly. Some characters are "reserved" because they have a special job in a web address. For example:
- The / separates different parts of a web address, like folders on your computer.
- The ? shows where the main address ends and where extra information (like a search query) begins.
- The & separates different pieces of information in that extra part.
If you want to use one of these characters, like a `/` in a file name, you can't just type it directly. The computer would get confused and think it's a separator. That's where URL encoding comes in!
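Here is a small sketch of that confusion, assuming Python and its standard `urllib.parse` module. The name `Tom&Jerry` is a made-up example, not something from a real website.

```python
from urllib.parse import parse_qs

# The "&" inside the name is read as a separator, so the name gets cut short.
confused = parse_qs("name=Tom&Jerry")

# With the "&" written as %26, the whole name survives.
correct = parse_qs("name=Tom%26Jerry")

print(confused["name"])  # ['Tom']
print(correct["name"])   # ['Tom&Jerry']
```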
How Does Percent-Encoding Work?
When a special character needs to be used as regular text, it gets "percent-encoded." This process involves a few steps:
- First, the character is turned into a number that computers understand (its ASCII value).
- Then, that number is changed into two hexadecimal digits (a special way of counting using 0-9 and A-F).
- Finally, a percent character (`%`) is put in front of these two digits.
So, if you want to include a space in a web address, you can't just type a space. Instead, it becomes `%20`. The space character's ASCII value is 32, which is `20` in hexadecimal.
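The three steps above can be sketched in a few lines of Python. The function name `percent_encode_char` is made up for this example, and the sketch only handles plain ASCII characters.

```python
def percent_encode_char(ch: str) -> str:
    """Percent-encode one ASCII character (hypothetical helper for illustration)."""
    code = ord(ch)                     # Step 1: the character's ASCII value
    if len(ch) != 1 or code > 127:
        raise ValueError("this sketch only handles single ASCII characters")
    hex_digits = format(code, "02X")   # Step 2: two hexadecimal digits
    return "%" + hex_digits            # Step 3: put a percent sign in front

print(percent_encode_char(" "))  # %20
print(percent_encode_char("/"))  # %2F
print(percent_encode_char("?"))  # %3F
```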
Reserved Characters That Get Encoded
Here are the reserved characters, plus a few others that can't appear as-is in a web address (the space, `"`, and `%` itself), and how they look when percent-encoded:
| Character | Encoded Form |
|---|---|
| Space (␣) | %20 |
| ! | %21 |
| " | %22 |
| # | %23 |
| $ | %24 |
| % | %25 |
| & | %26 |
| ' | %27 |
| ( | %28 |
| ) | %29 |
| * | %2A |
| + | %2B |
| , | %2C |
| / | %2F |
| : | %3A |
| ; | %3B |
| = | %3D |
| ? | %3F |
| @ | %40 |
| [ | %5B |
| ] | %5D |
If a reserved character doesn't have its special meaning in a certain part of the address, it doesn't have to be encoded there. Encoding it anyway in that part is allowed and doesn't change what the address means.
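One way to check these codes yourself is with Python's standard `urllib.parse.quote` function. This is only a sketch: the file name `reports/2024.txt` is made up, and passing `safe=""` is a choice made here so that `/` gets encoded too (by default `quote` leaves `/` alone).

```python
from urllib.parse import quote

reserved = "!#$&'()*+,/:;=?@[]"

# safe="" asks quote() to encode everything except letters, digits and - _ . ~
encoded = {ch: quote(ch, safe="") for ch in reserved}

print(encoded["/"])  # %2F
print(encoded["@"])  # %40

# A made-up file name that contains a "/" as ordinary text, not as a separator.
file_name = quote("reports/2024.txt", safe="")
print(file_name)     # reports%2F2024.txt
```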
Characters That Don't Need Encoding
Some characters are "unreserved" and never need to be percent-encoded. These include:
- All uppercase letters (A-Z)
- All lowercase letters (a-z)
- All numbers (0-9)
- A few symbols: -, _, . and ~
Web addresses that only differ by whether an unreserved character is encoded or not are usually considered the same. For example, `example.com/Hello` and `example.com/%48ello` (where `%48` is `H`) should lead to the same page. However, it's best not to encode these characters to keep web addresses shorter and easier to read.
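A quick sketch of this idea, assuming Python 3.7 or newer and its standard `urllib.parse` module: decoding `%48` gives back a plain `H`, and encoding leaves unreserved characters untouched. The text `Hello-World_1.0~` is just a made-up sample.

```python
from urllib.parse import quote, unquote

# Decoding turns %48 back into the letter H.
decoded = unquote("example.com/%48ello")
print(decoded)    # example.com/Hello

# Letters, digits and - _ . ~ are unreserved, so quote() leaves them as they are.
unchanged = quote("Hello-World_1.0~")
print(unchanged)  # Hello-World_1.0~
```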
What About Other Characters?
What if you want to use characters that aren't in the basic English alphabet, like `é` or `€`? For these, the computer usually converts the character into a short list of bytes (small numbers) using a system called UTF-8. Then, each of those bytes is percent-encoded. For example, the Euro sign `€` is three bytes in UTF-8, so it becomes `%E2%82%AC`.
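The Euro example can be worked through in Python, first by hand and then with the standard `urllib.parse.quote` function. This is a sketch for illustration; the variable names are made up.

```python
from urllib.parse import quote

# Step 1: turn the character into its UTF-8 bytes.
euro_bytes = "€".encode("utf-8")
print(list(euro_bytes))  # [226, 130, 172]

# Step 2: percent-encode each byte as "%" plus two hexadecimal digits.
by_hand = "".join(f"%{byte:02X}" for byte in euro_bytes)
print(by_hand)           # %E2%82%AC

# quote() does both steps at once (it uses UTF-8 unless told otherwise).
print(quote("€"))        # %E2%82%AC
e_acute = quote("\u00e9")  # "\u00e9" is the letter é
print(e_acute)           # %C3%A9
```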
Sending Form Data on the Web
When you fill out a form on a website (like a search bar or a login form) and click "submit," the information you typed needs to be sent to the server. This data is also encoded, but with a few small differences from regular URL encoding.
This special way of encoding form data is called `application/x-www-form-urlencoded`. One key difference is that spaces are replaced with a `+` sign instead of `%20`. Because `+` now stands for a space, a real `+` in the data has to be sent as `%2B`.
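Here is a small sketch of form encoding using Python's standard `urllib.parse` module. The form fields `name` and `sum` are made up for this example.

```python
from urllib.parse import urlencode, parse_qs, quote

# A made-up form with two fields.
form = {"name": "John Doe", "sum": "1+1"}

# urlencode() builds the form-style text: spaces become "+", a real "+" becomes %2B.
body = urlencode(form)
print(body)              # name=John+Doe&sum=1%2B1

# parse_qs() reverses the process, as a server might.
received = parse_qs(body)
print(received["name"])  # ['John Doe']
print(received["sum"])   # ['1+1']

# Regular percent-encoding writes the space as %20 instead.
plain = quote("John Doe")
print(plain)             # John%20Doe
```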
Whether you're typing a web address directly or submitting a form, URL encoding helps make sure that all the information gets sent and understood correctly by computers around the world!
See also
- Internationalized Resource Identifier
- Punycode
- Binary-to-text encoding
- Base64