Data compression facts for kids

Data compression is like packing a big suitcase into a smaller one! It's a way to make digital information, like photos, videos, or documents, take up less space. This is done by encoding the information using fewer bits (the tiny pieces of data computers use) than the original.

There are two main types of compression:

Lossless compression: This is like perfectly folding clothes to fit in a suitcase. No information is lost, so you can always get back the exact original file.
Lossy compression: This is like leaving some clothes behind to make the suitcase lighter. Some less important information is removed. This makes the file much smaller, but you can't get back the exact original.

Compression is super useful! It helps us store more data on our devices and send files faster over the internet. When you compress something, it's called "encoding." When you make it big again, it's called "decoding" or "decompression."
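
Here is a small sketch of encoding and decoding, using Python's built-in zlib module (a lossless compressor). The sample text is made up for this example, and repeated text like this shrinks especially well.

```python
import zlib

# Made-up sample data with lots of repetition.
original = b"blue sky, " * 100

encoded = zlib.compress(original)    # "encoding": make the data smaller
decoded = zlib.decompress(encoded)   # "decoding": get the data back

print(len(original), "bytes before compression")
print(len(encoded), "bytes after compression")
print(decoded == original)           # True, because zlib is lossless
```

Data without repeated patterns, such as random noise, would hardly shrink at all.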

Lossless Compression

Lossless compression works by finding and removing repeated information in data. Imagine you have a picture with a big blue sky. Instead of saving "blue pixel, blue pixel, blue pixel" hundreds of times, lossless compression might just say "500 blue pixels." This saves a lot of space!
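
The "500 blue pixels" idea is often called run-length encoding. Below is a minimal sketch of it in Python. The function names rle_encode and rle_decode and the sample row of pixels are made up for this example, and real image formats are more complicated.

```python
def rle_encode(pixels):
    """Turn a list of pixels into [count, value] pairs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][1] == pixel:
            runs[-1][0] += 1          # same as the last pixel: extend the run
        else:
            runs.append([1, pixel])   # different pixel: start a new run
    return runs


def rle_decode(runs):
    """Rebuild the exact original list from the [count, value] pairs."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


row = ["blue"] * 500 + ["white"] * 3 + ["blue"] * 20
runs = rle_encode(row)
print(runs)                     # [[500, 'blue'], [3, 'white'], [20, 'blue']]
print(rle_decode(runs) == row)  # True
```

The 523 pixels are stored as just three pairs, and decoding gives back exactly the same row.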

One popular family of methods is called Lempel–Ziv (LZ) compression. It's used in many everyday files. For example, GIF images use a type of LZ compression called LZW. Programs like PKZIP also use LZ-based methods. These methods create a kind of "dictionary" of repeated patterns in the data.
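
To show how the "dictionary" grows, here is a teaching sketch of LZW in Python. The function names are made up, it only handles characters with codes below 256, and it outputs a plain list of numbers. Real GIF files pack these numbers into bits in a more involved way.

```python
def lzw_encode(text):
    # Start with one dictionary entry per single character.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                  # keep growing the match
        else:
            codes.append(dictionary[current])    # output the longest known match
            dictionary[candidate] = next_code    # remember the new pattern
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes):
    # The decoder rebuilds the same dictionary while it reads the codes.
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    if not codes:
        return ""
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            entry = previous + previous[0]       # pattern defined by this very step
        else:
            raise ValueError("invalid LZW code")
        pieces.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)


message = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_encode(message)
print(len(message), "characters became", len(codes), "codes")
print(lzw_decode(codes) == message)   # True
```

Codes of 256 and above stand for patterns such as "TO" or "TOB" that the encoder has already seen, so one number replaces several characters.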

Newer lossless methods use smart math to predict what data comes next. This helps them compress even better. Arithmetic coding is one such advanced technique. It's used in an optional part of the JPEG image standard and in video formats like H.264/MPEG-4 AVC.

Lossy Compression

Lossy compression is different because it accepts losing some information. This is okay when the lost details are things people usually won't notice. For example, when a camera saves a photo as a smaller file, you might not see every tiny detail, but the picture still looks good.

This type of compression is designed based on how our senses work. For example, our eyes are better at seeing changes in brightness than changes in color. So, JPEG image compression might reduce some color details that our eyes wouldn't miss much. For sound, it uses psychoacoustics, which means it removes sounds that our ears can't easily hear.
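
Here is a simplified sketch of the color idea in Python: brightness is kept for every pixel, but each pair of neighbouring pixels shares one averaged color value. The function names and the numbers are made up for this example, the row must have an even number of pixels, and this is not the full JPEG process.

```python
def subsample_color(color_row):
    """Replace each pair of neighbouring color values with their average."""
    return [(color_row[i] + color_row[i + 1]) / 2
            for i in range(0, len(color_row) - 1, 2)]


def upsample_color(half_row):
    """Stretch the stored color values back to full width by repeating them."""
    full = []
    for value in half_row:
        full.extend([value, value])
    return full


brightness = [200, 198, 60, 62]   # kept exactly, one value per pixel
color = [110, 112, 90, 94]        # stored at half resolution

stored = subsample_color(color)
print(stored)                  # [111.0, 92.0]
print(upsample_color(stored))  # [111.0, 111.0, 92.0, 92.0]
```

Only half as many color values are stored, and the restored colors are close to the originals but not identical. That small difference is the "loss" in lossy compression.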

Most lossy compression uses a math trick called the discrete cosine transform (DCT). This was invented in the 1970s. DCT is used in many popular formats:

Images: JPEG
Videos: MPEG, AVC
Audio: MP3, AAC
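
To give a feel for what the DCT does, here is a rough sketch in plain Python. It turns a row of 8 brightness values into 8 "wave strengths", throws away the finer ones, and rebuilds the row. The function names, the sample values, and the choice to keep 3 numbers are all assumptions for this example. Real formats work on 2D blocks and round the numbers instead of simply dropping them.

```python
import math


def dct(signal):
    """Unscaled DCT-II: measure how much of each cosine wave is in the signal."""
    n = len(signal)
    return [sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                for i, x in enumerate(signal))
            for k in range(n)]


def idct(coeffs):
    """Inverse of dct() above: add the cosine waves back together."""
    n = len(coeffs)
    return [(coeffs[0] / 2
             + sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
                   for k in range(1, n))) * 2 / n
            for i in range(n)]


def keep_first(coeffs, keep):
    """The lossy step: zero out everything except the first `keep` values."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]


row = [10, 12, 14, 16, 18, 20, 22, 24]   # made-up smooth brightness values
coeffs = dct(row)
approx = idct(keep_first(coeffs, 3))

print([round(c, 1) for c in coeffs])
print([round(v, 1) for v in approx])
```

For smooth data like this, most of the information sits in the first few numbers, so the rebuilt row stays close to the original even though fewer numbers were kept.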

Lossy compression is used everywhere. Digital cameras use it to fit more photos. DVDs, Blu-ray discs, and streaming services like Netflix use it for videos. It makes files much smaller, but if you compress a file over and over, you might notice the quality getting worse. This is called "generation loss."

Uses of Data Compression

Data compression is used in many parts of our digital world.

Image Compression

Image compression makes picture files smaller. This is super important for digital cameras and for sharing photos online.

The discrete cosine transform (DCT) led to JPEG in 1992. JPEG is now the most common image file format. It makes images much smaller with only a small loss in quality.
For lossless image compression, Lempel–Ziv–Welch (LZW) is used in GIF images. DEFLATE is used in Portable Network Graphics (PNG) images. These keep all the original details.
Newer formats like JPEG 2000 use a different method called wavelet compression.

Audio Compression

Audio compression makes sound files smaller. This helps us store more songs on our phones and stream music online.

Lossy audio compression is very common. Formats like MP3 and Vorbis use it. They make files much smaller by removing sounds that are hard for humans to hear. For example, a 640 MB CD can hold about 1 hour of uncompressed music. But it can hold about 7 hours of music compressed as MP3 at a medium bit rate!
Lossless audio compression keeps all the original sound quality. It's used for archiving music or for professional audio work. Examples include FLAC and ALAC. These files are still smaller than uncompressed files, but not as small as lossy ones.
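
The CD numbers above can be checked with a little arithmetic. This sketch assumes standard CD audio (44,100 samples per second, 16 bits per sample, 2 channels) and that 1 MB means 1,000,000 bytes. The variable names are made up for this example.

```python
SAMPLE_RATE = 44_100     # samples per second for CD audio
BITS_PER_SAMPLE = 16
CHANNELS = 2             # stereo

cd_bits_per_second = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS   # 1,411,200
disc_bits = 640 * 1_000_000 * 8   # assumption: 1 MB = 1,000,000 bytes

# How long does uncompressed music last on the disc?
uncompressed_minutes = disc_bits / cd_bits_per_second / 60

# What bit rate would make 7 hours fit on the same disc?
mp3_bits_per_second = disc_bits / (7 * 3600)

print(round(uncompressed_minutes, 1), "minutes uncompressed")
print(round(mp3_bits_per_second / 1000), "kbit/s needed for 7 hours")
print(round(cd_bits_per_second / mp3_bits_per_second, 1), "times smaller")
```

So the uncompressed music lasts roughly an hour, and fitting 7 hours means squeezing each second of sound into about one seventh of the space.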

How Lossy Audio Compression Works

Lossy audio compression uses special models of how our ears hear. It finds sounds that are too quiet to notice or are hidden by louder sounds. These "irrelevant" sounds are then removed or made less accurate. This is how MP3s get so small!
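
Here is a toy sketch in Python of that idea. It looks at a list of tones, each with a pitch and a loudness, and drops the ones that are very quiet or that sit next to a much louder tone. The function name, the thresholds, and the sample tones are all made up. Real audio coders use far more detailed hearing models.

```python
def drop_hidden_sounds(tones, quiet_floor=10, mask_gap=30, mask_width=200):
    """tones is a list of (frequency_hz, loudness_db) pairs.

    A tone is dropped if it is quieter than quiet_floor, or if another tone
    within mask_width Hz is at least mask_gap dB louder (a toy masking rule).
    """
    kept = []
    for freq, loud in tones:
        if loud < quiet_floor:
            continue   # too quiet to notice
        hidden = any(
            abs(freq - other_freq) <= mask_width and other_loud - loud >= mask_gap
            for other_freq, other_loud in tones
        )
        if not hidden:
            kept.append((freq, loud))
    return kept


tones = [(440, 70), (500, 30), (3000, 5), (8000, 40)]
print(drop_hidden_sounds(tones))   # [(440, 70), (8000, 40)]
```

The 500 Hz tone is hidden by its loud neighbour at 440 Hz, and the 3000 Hz tone is simply too quiet, so only two tones need to be stored.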

Some audio compression is specifically for human speech. This is called speech encoding. Since human voices have a smaller range of sounds than music, speech can be compressed even more. This is used in phone calls and internet calls.

History of Audio Compression

Early ideas for audio compression came from Bell Labs in the 1950s. The discrete cosine transform (DCT) in 1974 helped create modern audio formats. In the 1980s, engineers like Oscar Bonello developed systems that used these ideas for radio stations. This allowed radio stations to store and play music more easily. The MP3 format, which became very popular, uses these older ideas.

Video Compression

Video files are huge! So, video compression is super important for streaming movies, watching TV, and making videos on your phone.

Uncompressed video needs a lot of data. Lossy video compression can make files 20 to 200 times smaller!
Most video compression uses two main tricks:

DCT: Like with images, it processes blocks of pixels.
Motion compensation: This is very clever! Instead of saving every single frame of a video, it looks for things that move. If a character walks across the screen, the system doesn't save the whole new picture. It just saves how much the character moved from the last frame. This saves tons of space.
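
The motion compensation idea can be sketched in Python as a simple "block matching" search: take a small block from the new frame and look for the spot in the previous frame that matches it best. The function names, the tiny frames, and the search range are made up for this example. Real encoders use much faster and smarter searches.

```python
def block_difference(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a shifted block in prev."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def find_motion(prev, cur, bx, by, size=2, search=2):
    """Return (dx, dy, difference) for the best match found in prev."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip shifts that would fall outside the previous frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            diff = block_difference(prev, cur, bx, by, dx, dy, size)
            if best is None or diff < best[2]:
                best = (dx, dy, diff)
    return best


# A bright 2x2 square moves two pixels to the right between frames.
prev_frame = [[0, 0, 0, 0, 0, 0],
              [0, 9, 9, 0, 0, 0],
              [0, 9, 9, 0, 0, 0],
              [0, 0, 0, 0, 0, 0]]
cur_frame = [[0, 0, 0, 0, 0, 0],
             [0, 0, 0, 9, 9, 0],
             [0, 0, 0, 9, 9, 0],
             [0, 0, 0, 0, 0, 0]]

print(find_motion(prev_frame, cur_frame, bx=3, by=1))   # (-2, 0, 0)
```

The result says "this block is the one found 2 pixels to the left in the previous frame, with no difference", so the encoder can store that small movement instead of the pixels themselves.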

Video compression formats like H.26x and MPEG use these methods. These are the standards for DVDs, Blu-rays, and streaming services like YouTube and Netflix.

How Video Encoding Works

Video is like a series of still pictures. Video compression tries to remove repeated information in these pictures.

Inter-frame coding looks at differences between frames. If nothing moves, it just says "copy from the last frame." If something moves, it saves the movement instead of the whole new image.
Intra-frame coding compresses each frame like a still image. This is simpler but doesn't compress as much.
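
A very simple form of inter-frame coding can be sketched in Python by storing only the pixels that changed since the last frame. The function names and the tiny one-row "frames" are made up for this example. Real video formats work on blocks and also track movement.

```python
def frame_delta(prev, cur):
    """List only the pixels that changed, as (position, new_value) pairs."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, cur)) if p != c]


def apply_delta(prev, delta):
    """Rebuild the new frame from the previous frame plus the changes."""
    frame = list(prev)
    for position, value in delta:
        frame[position] = value
    return frame


frame1 = [0, 0, 5, 5, 0, 0, 0, 0]
frame2 = [0, 0, 0, 5, 5, 0, 0, 0]   # the bright part moved one step right

delta = frame_delta(frame1, frame2)
print(delta)                                  # [(2, 0), (4, 5)]
print(apply_delta(frame1, delta) == frame2)   # True
print(frame_delta(frame2, frame2))            # [] means "copy from the last frame"
```

Only two changes are stored instead of all eight pixels, and an empty list means the frame is an exact copy of the one before it.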

Video compression is also usually lossy: it removes details that humans don't easily see. For example, small color changes are less noticeable than brightness changes. So, the compression might simplify colors in some areas.

History of Video Compression

The discrete cosine transform (DCT) in 1974 was key for video compression.

H.261 (1988) was the first video coding standard based on DCT.
The MPEG standards became very popular. MPEG-1 (1991) was for VHS quality. MPEG-2 (1994) became the standard for DVDs and digital TV. MPEG-4 (1999) followed.
H.264/MPEG-4 AVC (2003) is a very important standard today. It's used for Blu-ray Discs, YouTube, Netflix, and HDTV.

Genetics Compression

Believe it or not, data compression is also used in genetics! Scientists are finding ways to compress huge amounts of genetic data, like DNA sequences. This helps them store and study genetic information more easily. Some methods can compress human genome data to a tiny size, making it much easier to share and analyze.
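
One basic trick can be sketched in Python. DNA is written with only four letters (A, C, G and T), so each letter fits in 2 bits instead of the 8 bits a normal text character uses. The function names below are made up, and real genome compressors go much further than this, but it shows the idea of using fewer bits.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
LETTERS = "ACGT"


def pack_dna(sequence):
    """Pack four DNA letters into each byte (2 bits per letter)."""
    packed = bytearray()
    for i in range(0, len(sequence), 4):
        chunk = sequence[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))   # pad the last byte with zero bits
        packed.append(byte)
    return bytes(packed)


def unpack_dna(packed, length):
    """Turn the packed bytes back into letters; length removes the padding."""
    bases = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            bases.append(LETTERS[(byte >> shift) & 3])
    return "".join(bases[:length])


dna = "GATTACA"
packed = pack_dna(dna)
print(len(dna), "letters stored in", len(packed), "bytes")   # 7 letters stored in 2 bytes
print(unpack_dna(packed, len(dna)) == dna)                   # True
```

The original length has to be stored too, so the decoder knows how many padding letters to ignore at the end.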

Images for kids

MP3, an example of a lossy file format compared to WAV.
Comparison of spectrograms of audio in an uncompressed format and several lossy formats. The lossy spectrograms show bandlimiting of higher frequencies, a common technique associated with lossy audio compression.
Solidyne 922: The world's first commercial audio bit compression sound card for PC, 1990
Processing stages of a typical video encoder
Longest common subsequence of two files