Chapter 1: Information Representation - painlessprogramming.com

1. Binary, Denary & Hexadecimal

Key Concepts:

Binary: Base-2 system (0, 1).
Denary: Base-10 system (0–9).
Hexadecimal: Base-16 system (0–9, A–F).

Conversions:

Binary ↔ Denary:

Example: 1010₂ = (1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 0 \times 2^0 = 10_{10}).

Hexadecimal ↔ Denary:

Example: A5₁₆ = (10 \times 16^1 + 5 \times 16^0 = 165_{10}).
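The positional-weight method above can be sketched in a few lines of Python. The helper names `to_denary` and `from_denary` are made up for illustration; this is one possible way to check answers, not an exam-board method.

```python
def to_denary(digits: str, base: int) -> int:
    """Convert a digit string in the given base (2..16) to a denary integer."""
    symbols = "0123456789ABCDEF"
    total = 0
    for ch in digits.upper():
        value = symbols.index(ch)       # e.g. 'A' -> 10, 'F' -> 15
        if value >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        total = total * base + value    # shift one place left, then add the digit
    return total


def from_denary(number: int, base: int) -> str:
    """Convert a non-negative denary integer to a digit string by repeated division."""
    symbols = "0123456789ABCDEF"
    if number == 0:
        return "0"
    out = []
    while number > 0:
        number, remainder = divmod(number, base)
        out.append(symbols[remainder])  # remainders come out least significant first
    return "".join(reversed(out))


print(to_denary("1010", 2))    # 10
print(to_denary("A5", 16))     # 165
print(from_denary(240, 16))    # F0
```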

Past Paper Example (2023 Q1c):
Question: Convert F0₁₆ to denary.
Marking Scheme Answer:

(F = 15), (0 = 0) → (15 \times 16^1 + 0 \times 16^0 = 240_{10}).

Common Errors:

Misinterpreting hex digits (e.g., E = 14, not 15).

2. Binary Arithmetic

Key Concepts:

Addition: Column-wise with carry-over (e.g., 1 + 1 = 10).
Subtraction: Add the two’s complement of the number being subtracted (A − B = A + (−B)).

Past Paper Example (2024 Q1d):
Question: Subtract 01110101₂ from 10110011₂.
Marking Scheme Answer:

Find two’s complement of subtrahend: 10001011.
Add to minuend: 10110011 + 10001011 = 1 00111110; discard the carry out of the eighth bit to get 00111110 (62₁₀, i.e. 179 − 117).
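The two marking-scheme steps can be checked with a short Python sketch. It assumes 8-bit values treated as unsigned bit patterns; the function names are illustrative only.

```python
BITS = 8
MASK = (1 << BITS) - 1           # 0b11111111, keeps results to 8 bits


def twos_complement(value: int) -> int:
    """Invert all bits and add 1, keeping the result within 8 bits."""
    return ((~value) + 1) & MASK


def subtract(minuend: int, subtrahend: int) -> int:
    """Compute minuend - subtrahend by adding the two's complement."""
    total = minuend + twos_complement(subtrahend)
    return total & MASK          # drop the carry out of the eighth bit


minuend = 0b10110011
subtrahend = 0b01110101
print(format(twos_complement(subtrahend), "08b"))     # 10001011
print(format(subtract(minuend, subtrahend), "08b"))   # 00111110
```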

Overflow:

Occurs when the result does not fit in the available bits (e.g., 8-bit two’s complement addition of 127 + 1 gives 10000000, which reads as −128).
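A minimal sketch of how overflow shows up, assuming 8-bit two's complement values in the range -128 to 127. The function name `add_signed_8bit` is hypothetical.

```python
def add_signed_8bit(a: int, b: int):
    """Add two 8-bit two's complement values; return (wrapped result, overflow flag)."""
    true_sum = a + b
    wrapped = (true_sum + 128) % 256 - 128   # map the sum back into -128..127
    overflow = wrapped != true_sum           # the true sum did not fit in 8 bits
    return wrapped, overflow


print(add_signed_8bit(127, 1))    # (-128, True)
print(add_signed_8bit(100, 20))   # (120, False)
```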

3. Two’s Complement

Key Concepts:

Represents negative numbers:

Invert bits.
Add 1.

Range: (-128) to (127) (8-bit).

Past Paper Example (2022 Q4b):
Question: Convert 11100111₂ (two’s complement) to denary.
Marking Scheme Answer:

Invert: 00011000.
Add 1: 00011001 = (25_{10}) → (-25).
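The invert-and-add-1 method can be sketched in Python as below. The function names and the default width of 8 bits are assumptions made for illustration.

```python
def decode_twos_complement(bits: str) -> int:
    """Interpret a bit string as a two's complement number."""
    value = int(bits, 2)
    if bits[0] == "0":                        # sign bit clear: value is already positive
        return value
    mask = (1 << len(bits)) - 1
    magnitude = ((value ^ mask) + 1) & mask   # invert the bits, then add 1
    return -magnitude


def encode_twos_complement(number: int, width: int = 8) -> str:
    """Encode a denary integer as a two's complement bit string of the given width."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= number <= high:
        raise ValueError(f"{number} is outside {low}..{high}")
    return format(number & ((1 << width) - 1), f"0{width}b")


print(decode_twos_complement("11100111"))   # -25
print(encode_twos_complement(-25))          # 11100111
print(encode_twos_complement(-128))         # 10000000
```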

Common Errors:

Forgetting to add 1 after inversion.

4. Binary Coded Decimal (BCD)

Key Concepts:

Each denary digit represented by 4-bit binary.
Example: 87₁₀ = 1000 0111 (BCD).

Past Paper Example (2023 Q15b):
Question: Convert 00100111 (BCD) to denary.
Marking Scheme Answer:

0010 = 2, 0111 = 7 → 27₁₀.
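As a rough illustration, BCD encoding and decoding can be written digit by digit. The names `to_bcd` and `from_bcd` are hypothetical, and the sketch handles non-negative whole numbers only.

```python
def to_bcd(number: int) -> str:
    """Encode each denary digit of a non-negative integer as a 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(bits: str) -> int:
    """Decode a BCD bit string (spaces optional) back to a denary integer."""
    bits = bits.replace(" ", "")
    if len(bits) % 4 != 0:
        raise ValueError("BCD needs a whole number of 4-bit groups")
    digits = []
    for i in range(0, len(bits), 4):
        value = int(bits[i:i + 4], 2)
        if value > 9:                         # 1010..1111 are not valid BCD digits
            raise ValueError(f"invalid BCD group {bits[i:i + 4]}")
        digits.append(str(value))
    return int("".join(digits))


print(to_bcd(87))             # 1000 0111
print(from_bcd("00100111"))   # 27
```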

Limitation:

Each 4-bit group may only hold 0000–1001 (0–9); the patterns 1010–1111 (hex A–F) are invalid in BCD, so some bit patterns are wasted.

5. Decimal (SI) Units (Base 10)

Used by storage manufacturers (e.g., HDDs, SSDs).

Unit	Abbr.	Value (Bytes)	Equivalent	Example Usage
1 Bit	b	1/8 byte	Binary digit (0 or 1)	Network speeds (Mbps)
1 Byte	B	8 bits	ASCII character (e.g., ‘A’)	File sizes (e.g., 100B text file)
1 Kilobyte	kB	10^3 B = 1,000 B	~1/2 page of text	Small text files
1 Megabyte	MB	10^6 B = 1,000 kB	~1 minute of MP3 audio	High-res images
1 Gigabyte	GB	10^9 B = 1,000 MB	~30 mins of HD video	Smartphone storage
1 Terabyte	TB	10^{12} B = 1,000 GB	~250,000 photos	Laptop/cloud storage

Key Conversions:

(1 \text{ kB} = 1,000 \text{ B})
(1 \text{ MB} = 1,000 \text{ kB} = 1,000,000 \text{ B})
(1 \text{ GB} = 1,000 \text{ MB} = 1,000,000,000 \text{ B})

Binary (IEC) Units (Base 2)

Used for RAM sizes and by some operating systems (e.g., Windows reports sizes in binary units, although it labels them KB/MB/GB).

Unit	Abbr.	Value (Bytes)	Equivalent	Example Usage
1 Kibibyte	KiB	2^{10} B = 1,024 B	~2 paragraphs of text	RAM allocation
1 Mebibyte	MiB	2^{20} B = 1,024 KiB	~1 floppy disk (1,440 KiB ≈ 1.41 MiB)	Game textures
1 Gibibyte	GiB	2^{30} B = 1,024 MiB	~1 hour of SD video	GPU memory
1 Tebibyte	TiB	2^{40} B = 1,024 GiB	~500 hours of HD video	Enterprise storage

Key Conversions:

(1 \text{ KiB} = 1,024 \text{ B})
(1 \text{ MiB} = 1,024 \text{ KiB} = 1,048,576 \text{ B})
(1 \text{ GiB} = 1,024 \text{ MiB} = 1,073,741,824 \text{ B})

Practical Examples (Not Important For Exam)

File Size Calculation:

A 5 MB (decimal) file = (5 \times 1,000^2 = 5,000,000) B.
In binary: (5,000,000 \div 1,024^2 ≈ 4.77 \text{ MiB}).

Download Speed:

100 Mbps (decimal) = (100 \times 1,000^2 = 100,000,000) bits/sec.
In binary: (100,000,000 \div 1,048,576 ≈ 95.37 \text{ Mibps}).
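Conversions between the two systems all reduce to "convert to bytes, then divide". A small sketch, assuming a lookup table called `UNITS` and a helper called `convert` (both invented for this example):

```python
DECIMAL = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
BINARY = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
UNITS = {"B": 1, **DECIMAL, **BINARY}   # bytes per unit


def convert(amount: float, from_unit: str, to_unit: str) -> float:
    """Convert via bytes: multiply by the source unit, divide by the target unit."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


print(round(convert(5, "MB", "MiB"), 2))   # 4.77
print(convert(2, "GiB", "MiB"))            # 2048.0
print(convert(1, "GiB", "B"))              # 1073741824.0
```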

Exam Tips

Confusion Alert: Mixing kB (decimal) vs. KiB (binary) loses marks.
Conversions: Always specify the system (e.g., “Convert 2 GiB to MiB”).
Real-World Context:
Drive capacities (HDDs, SSDs) are advertised in decimal units.
RAM capacities use binary units.

Worked Example (2023 Q1a)

Question: State the difference between a kibibyte and a kilobyte.
Marking Scheme Answer:

Kibibyte (KiB): (2^{10} = 1,024) bytes (binary).
Kilobyte (kB): (10^3 = 1,000) bytes (decimal).

Past Paper Example (2021 Q1a):
Question: State one difference between kibibyte and kilobyte.
Marking Scheme Answer:

Kibibyte uses base-2 ((1024) bytes); kilobyte uses base-10 ((1000) bytes).

6. Bitmap vs. Vector Graphics

Key Differences:

Bitmap	Vector
Pixel grid	Mathematical objects
Fixed resolution	Scalable without quality loss
Larger file size	Smaller file size

Past Paper Example (2024 Q5b):
Question: Calculate bitmap file size ((1500 \times 3000) pixels, 8-bit depth).
Marking Scheme Answer:

(1500 \times 3000 \times 8 = 36,000,000) bits; (36,000,000 \div 8 = 4,500,000) bytes = (4.5) MB.
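The same calculation as a Python sketch, showing the result in both decimal and binary units. The function name `bitmap_size_bits` is illustrative, and file headers/metadata are ignored.

```python
def bitmap_size_bits(width: int, height: int, bit_depth: int) -> int:
    """Raw bitmap size in bits: one bit_depth-sized value per pixel."""
    return width * height * bit_depth


bits = bitmap_size_bits(1500, 3000, 8)
size_bytes = bits // 8                   # 8 bits per byte

print(bits)                              # 36000000
print(size_bytes / 10**6)                # 4.5   (MB, decimal)
print(round(size_bytes / 2**20, 2))      # 4.29  (MiB, binary)
```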

Compression:

Lossless (RLE): A run such as BBBB is stored as a count and a value (4B).
Lossy (JPEG): Discards imperceptible data.

1. Lossless Compression

Definition:

Preserves all original data; perfect reconstruction is possible.
Used for text, databases, executables, and medical/scientific imaging.

Methods & Examples:

A. Run-Length Encoding (RLE)

How it works: Replaces sequences of identical data (runs) with a single value and its count.
- Example: AAAABBBCCDAA → 4A3B2C1D2A (12 characters → 10 characters, about 17% smaller).
Best for: Images with large uniform areas (e.g., icons, cartoons) or simple text (e.g., XXXXXYYYY → 5X4Y).
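A minimal RLE encoder, assuming the count-then-symbol text format used in these notes. The function name `rle_encode` is made up; `itertools.groupby` groups consecutive identical characters into runs.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of identical characters with <count><character>."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))


print(rle_encode("AAAABBBCCDAA"))   # 4A3B2C1D2A
print(rle_encode("XXXXXYYYY"))      # 5X4Y
print(rle_encode("ABABAB"))         # 1A1B1A1B1A1B
```

Note that this simple text format cannot be decoded unambiguously if the data itself contains digits; real formats store counts in fixed-size fields.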

Why Use RLE?

Simplicity: Easy to implement (minimal CPU overhead).
Speed: Fast compression/decompression (real-time applications).
No Quality Loss: Ideal for exact data recovery (e.g., medical scans).

Limitations of RLE:

Inefficient for Complex Data:
- Fails with varied patterns (e.g., ABABAB → 1A1B1A1B1A1B → larger than original!).
- Example: Photographs (pixel values rarely repeat consecutively).
No Entropy Reduction: Unlike Huffman coding, RLE doesn’t exploit how often each symbol occurs overall, only consecutive repeats.

B. Other Lossless Methods (Not Part Of Syllabus)

Method	How It Works	Use Case
Huffman Coding	Assigns shorter codes to frequent data	Final coding stage in JPEG and MP3; part of Deflate
LZW (GIF)	Builds dictionary of repeated strings	GIF images, Unix `compress`
Deflate (ZIP/PNG)	Combines LZ77 + Huffman	ZIP archives, PNG images, HTTP data (gzip)

2. Lossy Compression

Definition:

Discards “less important” data to achieve higher compression.
Used for media (images, audio, video) where perfect fidelity isn’t critical.

Methods & Examples:

A. JPEG (Images)

How it works:
1. Chroma Subsampling: Reduces chrominance (color) resolution, because human eyes are more sensitive to luminance (brightness) than to color.
2. DCT (Discrete Cosine Transform): Converts pixels into frequency components.
3. Quantization: Drops high-frequency details (controlled by quality setting).
Artifacts: Blockiness in low-quality JPEGs.
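To give a feel for steps 2 and 3, here is a toy one-dimensional sketch. Real JPEG applies a two-dimensional DCT to 8×8 pixel blocks and uses a table of quantization steps; the single step size of 10, the sample rows, and the function names below are all assumptions for illustration.

```python
import math


def dct_1d(samples):
    """Unnormalised DCT-II: turns sample values into frequency coefficients."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(samples))
        for k in range(n)
    ]


def quantize(coefficients, step):
    """Divide by a step size and round; small coefficients collapse to 0."""
    return [round(c / step) for c in coefficients]


flat_row = [100] * 8                                     # eight identical pixel values
gentle_row = [100, 101, 102, 103, 104, 105, 106, 107]    # hypothetical smooth gradient

print(quantize(dct_1d(flat_row), 10))      # [80, 0, 0, 0, 0, 0, 0, 0]
print(quantize(dct_1d(gentle_row), 10))    # mostly zeros after the first few entries
```

The long runs of zeros are what make the later lossless stage effective; a larger step size zeroes more coefficients and loses more detail.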

B. MP3 (Audio)

Components Of An Audio File:

Sampling Rate: Number of samples taken per second.
Sampling Resolution: Number of bits used to represent each sample.

How it works:
1. Perceptual Coding: Removes sounds masked by louder frequencies.
2. Bitrate Reduction: Lower bitrates discard more data (e.g., 128 kbps vs. 320 kbps).

C. MPEG (Video)

Key Techniques:
- Inter-frame Compression: Stores only changes between frames (e.g., P/B-frames in H.264).
- Motion Estimation: Predicts movement to reduce redundant data.
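The idea of storing only changes between frames can be sketched with a per-pixel difference. This is a deliberate simplification: real codecs work on blocks with motion vectors, and the frame data and function names here are hypothetical.

```python
def frame_delta(previous, current):
    """Return {pixel index: new value} for pixels that differ from the previous frame."""
    return {i: new for i, (old, new) in enumerate(zip(previous, current)) if old != new}


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(previous)
    for index, value in delta.items():
        frame[index] = value
    return frame


frame1 = [10, 10, 10, 200, 10, 10]   # a bright pixel at position 3
frame2 = [10, 10, 10, 10, 200, 10]   # the bright pixel has moved to position 4

delta = frame_delta(frame1, frame2)
print(delta)                                  # {3: 10, 4: 200}
print(apply_delta(frame1, delta) == frame2)   # True
```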

Why Use Lossy Compression?

File Size Reduction: JPEG can often shrink images to around 10% of the original size with little visible loss.
Bandwidth Efficiency: Critical for streaming (YouTube, Spotify).

Drawbacks:

Irreversible Data Loss: Repeated compression degrades quality (“generation loss”).
Artifacts: Pixelation (images), “tinny” sound (low-bitrate MP3s).

Why RLE Doesn’t Always Reduce File Size

Cases Where RLE Fails:

Non-Repeating Data:
- Input: ABCDEFG → RLE: 1A1B1C1D1E1F1G (2× larger!).
Random Noise:
- Photographs or encrypted data rarely have long runs.
Overhead of Counters:
- If runs are short (e.g., ABAB), counters dominate the output.

When RLE Works Best:

Binary Images: Fax machines (long runs of black/white).
Simple Graphics: Logos with flat colors.
Text: Repeated characters (e.g., spaces in documents).

Real-World Applications

Compression Type	Formats/Applications	Reason for Choice
Lossless	PNG, FLAC, ZIP, SQL databases	Exact data recovery required
Lossy	JPEG, MP3, H.264, WebP	Bandwidth/storage optimization
RLE-Specific	TIFF, BMP, fax transmissions	Simple, fast encoding

Exam-Style Questions

Q1: Explain why RLE is unsuitable for compressing a photograph.

Answer: Photographs have varied pixel values with few consecutive repeats, causing RLE to output larger files (e.g., 1R1G1B1R1G1B...).

Q2: Compare lossy and lossless compression for sound files.

Lossless (FLAC): Preserves quality; larger files (good for archiving).
Lossy (MP3): Smaller files; discards inaudible frequencies (good for streaming).

Key Takeaways

RLE is fast/simple but only effective for repetitive data.
Lossy sacrifices quality for size; lossless preserves data.
Always consider the data type when choosing compression (e.g., RLE for simple graphics or text with long runs, JPEG for photos).

7. Sound Representation

Key Concepts:

Sampling Rate: Samples/sec (Hz). Higher = better accuracy.
Resolution: Bits/sample. Higher = finer volume levels.

Past Paper Example (2023 Q7a):
Question: Calculate sound file size (50 kHz, 16-bit, 20 mins).
Marking Scheme Answer:

(50,000 \times 16 \times 1200 = 960,000,000) bits = (120,000,000) bytes = (120) MB (≈ (114.44) MiB).
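A short sketch of the sound file size calculation, reporting the answer in both decimal megabytes and mebibytes. The function name and the optional `channels` parameter are assumptions; the question above is treated as single-channel (mono).

```python
def sound_size_bits(sample_rate_hz: int, bits_per_sample: int, seconds: int, channels: int = 1) -> int:
    """Uncompressed size in bits: samples per second x bits per sample x duration."""
    return sample_rate_hz * bits_per_sample * seconds * channels


bits = sound_size_bits(50_000, 16, 20 * 60)   # 20 minutes = 1200 seconds
size_bytes = bits // 8

print(bits)                              # 960000000
print(size_bytes / 10**6)                # 120.0   (MB, decimal)
print(round(size_bytes / 2**20, 2))      # 114.44  (MiB, binary)
```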

Buffering:

Prevents playback lag by storing preloaded data (e.g., streaming).

8. Character Sets

Key Definitions:

ASCII	Unicode
7-bit (128 chars)	Encoded in 8-, 16- or 32-bit units (UTF-8/16/32; global chars)
English only	Supports emojis, scripts

Past Paper Example (2022 Q6a):
Question: Convert Unicode ‘1’ (denary 49) to hex.
Marking Scheme Answer:

(49 \div 16 = 3) remainder (1) → 31₁₆.
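The same conversion can be checked with Python's built-ins: `ord` gives a character's code point, `divmod` performs the divide-by-16 step, and `format(..., "X")` gives uppercase hex. The variable names are illustrative.

```python
code_point = ord("1")                       # character -> denary code
print(code_point)                           # 49

quotient, remainder = divmod(code_point, 16)
print(quotient, remainder)                  # 3 1

hex_digits = format(code_point, "X")        # denary -> hexadecimal string
print(hex_digits)                           # 31
print(chr(0x31))                            # 1  (hex code back to the character)
```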

9. File Size Calculation:

a) Image

File size (bits) = Height * Width * Color Depth (divide by 8 for bytes)

b) Sound

File size (bits) = Sampling Rate * Sampling Resolution * Time (in seconds) (divide by 8 for bytes)

Exam Pitfalls & Tips

Binary Arithmetic: Always show working for partial marks.
Two’s Complement: Double-check sign bit (MSB).
Units: Confusing KiB (1024) vs. kB (1000) loses marks.

Worked Example (2024 Q20c)

Question: Compress 32 32 80 81 81 using RLE.
Marking Scheme Answer:

2 32 1 80 2 81.
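For numeric data such as pixel values, the answer is a list of count/value pairs. A sketch with a matching decoder to confirm nothing is lost; the function names are hypothetical.

```python
def rle_encode_values(values):
    """Return a flat list: count, value, count, value, ..."""
    encoded = []
    i = 0
    while i < len(values):
        run_length = 1
        while i + run_length < len(values) and values[i + run_length] == values[i]:
            run_length += 1                 # extend the run while the value repeats
        encoded += [run_length, values[i]]
        i += run_length
    return encoded


def rle_decode_values(encoded):
    """Expand count/value pairs back into the original list."""
    decoded = []
    for count, value in zip(encoded[0::2], encoded[1::2]):
        decoded += [value] * count
    return decoded


pixels = [32, 32, 80, 81, 81]
packed = rle_encode_values(pixels)
print(packed)                                # [2, 32, 1, 80, 2, 81]
print(rle_decode_values(packed) == pixels)   # True
```

Here five values become six, which illustrates the earlier point that short runs can make RLE output larger than the input.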

Why Lossless for Text?

Lossy discards data (corrupts text); lossless preserves exact content.