How QR Codes Work: A Complete Technical Explanation
Ever wondered what all those black squares actually mean? Learn how QR codes encode data, the role of error correction, and why they can be scanned from any angle.
QR codes (Quick Response codes) were invented in 1994 by Masahiro Hara at Denso Wave (then a division of Denso, a Toyota Group company) to track automotive parts. Today they are scanned billions of times a day. But how does a grid of black and white squares reliably encode a 300-character URL? This deep-dive explains the technical architecture of QR codes, from data encoding to error correction to the scanning process, in accessible terms.
QR Code Structure: The Building Blocks
A QR code is composed of several distinct functional zones, each serving a specific role in the encoding and decoding process.
Finder Patterns
Three identical square patterns in the top-left, top-right, and bottom-left corners. Each is a 7×7 module square within a square within a square. The scanner uses these three reference points to locate the code and calculate its orientation and size regardless of the viewing angle.
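As a rough illustration, the nested squares can be generated from each module's distance to the center of the pattern. The function name `finder_pattern` is invented for this sketch and is not part of any library.

```python
def finder_pattern():
    """Return the 7x7 finder pattern as rows of 1 (dark) and 0 (light)."""
    rows = []
    for r in range(7):
        row = []
        for c in range(7):
            # Chebyshev distance from the center module (3, 3):
            # 0 or 1 -> dark 3x3 core, 2 -> light ring, 3 -> dark outer border.
            ring = max(abs(r - 3), abs(c - 3))
            row.append(0 if ring == 2 else 1)
        rows.append(row)
    return rows


for row in finder_pattern():
    print("".join("#" if module else "." for module in row))
```

Any straight line through the center crosses dark, light, three dark, light, dark modules, which is the signature scanners look for.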
Alignment Patterns
Smaller 5×5 square-within-square patterns in the interior of larger QR codes (Version 2 and above). They help the decoder correct for perspective distortion when the code is photographed at an angle or printed on a curved surface.
Timing Patterns
Alternating black and white modules in a line connecting the top finder patterns (horizontal) and the left finder patterns (vertical). They help the scanner determine the module size and establish a coordinate grid for reading data modules.
Format Information
Two copies of a 15-bit pattern, placed next to the finder patterns, encode the error correction level and the mask pattern used in this specific code. The 15 bits consist of 5 data bits plus 10 BCH error correction bits. This information is critical: the scanner must decode it before it can interpret anything else.
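The sketch below shows one way the 15-bit format pattern can be built and read back. It assumes the constants from the QR standard (BCH generator 0x537, XOR mask 0x5412, and the two-bit level codes); the function names are made up for this example, and the decoder simply picks the closest of the 32 valid patterns rather than running a real BCH decoder.

```python
FORMAT_GENERATOR = 0x537   # BCH(15,5) generator polynomial
FORMAT_MASK = 0x5412       # fixed XOR mask, 101010000010010 in binary
EC_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}


def encode_format(ec_level, mask_id):
    """Return the 15-bit format pattern for a level and a mask id (0-7)."""
    data = (EC_BITS[ec_level] << 3) | mask_id   # 5 data bits
    rem = data << 10
    for shift in range(4, -1, -1):              # polynomial long division
        if rem & (1 << (shift + 10)):
            rem ^= FORMAT_GENERATOR << shift
    return ((data << 10) | rem) ^ FORMAT_MASK


def decode_format(bits):
    """Return (ec_level, mask_id) of the nearest valid pattern, or None."""
    distance, ec_level, mask_id = min(
        (bin(bits ^ encode_format(ec, m)).count("1"), ec, m)
        for ec in EC_BITS
        for m in range(8)
    )
    # Valid patterns differ in at least 7 bits, so up to 3 errors are safe.
    return (ec_level, mask_id) if distance <= 3 else None


print(format(encode_format("L", 0), "015b"))
```

Brute-force matching is practical here because only 32 combinations of level and mask exist.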
Data and Error Correction Modules
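The remaining modules in the code body hold the data codewords followed by the Reed-Solomon error correction codewords. Bits are placed in two-module-wide columns, starting at the bottom-right corner and zigzagging upward, then downward in the next column pair, skipping any module reserved for a function pattern.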
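The placement order can be sketched as a generator that walks the grid in two-module-wide column pairs from the bottom-right corner, alternating direction and skipping the vertical timing column. This is a simplification: a real encoder also skips every module that belongs to a function pattern, which is omitted here. The name `zigzag_positions` is hypothetical.

```python
def zigzag_positions(size):
    """Yield (row, col) in bit placement order, ignoring function patterns."""
    col = size - 1
    upward = True
    while col > 0:
        if col == 6:          # column 6 holds the vertical timing pattern
            col -= 1
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            for c in (col, col - 1):   # right module first, then left
                yield row, c
        upward = not upward
        col -= 2


order = list(zigzag_positions(21))   # Version 1 grid
print(order[:6])
```

For a Version 1 grid the walk starts at (20, 20), (20, 19), (19, 20), (19, 19) and continues up the right-hand column pair before turning around.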
Data Encoding: From Text to Modules
QR codes support four encoding modes depending on the content type, each with different efficiency characteristics.
Numeric Mode
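Encodes only digits 0-9. Extremely compact: three digits are encoded in 10 bits, a leftover pair takes 7 bits, and a single leftover digit takes 4 bits. A 41-digit string therefore needs 137 data bits (13 groups of three plus one pair), before the mode indicator and character count are added. Used for telephone numbers, postal codes, and ISBNs where maximum data density is needed.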
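A minimal sketch of the digit grouping follows. It produces only the data bits and leaves out the mode indicator and character count that precede them in a real symbol; `encode_numeric` is an illustrative name.

```python
def encode_numeric(digits):
    """Return the numeric-mode data bits for a string of ASCII digits."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode accepts only the digits 0-9")
    bits = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        width = {3: 10, 2: 7, 1: 4}[len(group)]   # bits per group size
        bits.append(format(int(group), "0{}b".format(width)))
    return "".join(bits)


print(encode_numeric("01234567"))
print(len(encode_numeric("1" * 41)))
```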
Alphanumeric Mode
Encodes uppercase letters A-Z, digits 0-9, and 9 special characters (space, $, %, *, +, -, ., /, :). Two characters are encoded in 11 bits. Less efficient than numeric mode but handles common URL characters in uppercase form.
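The pairing rule can be sketched as follows: each character maps to a value from 0 to 44, a pair is stored as 45 times the first value plus the second in 11 bits, and a trailing single character takes 6 bits. Only the data bits are produced, and the function name is invented for this example.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text):
    """Return the alphanumeric-mode data bits for text."""
    # str.index raises ValueError for characters outside the 45-symbol set.
    values = [ALPHANUMERIC.index(ch) for ch in text]
    bits = []
    for i in range(0, len(values) - 1, 2):
        bits.append(format(values[i] * 45 + values[i + 1], "011b"))
    if len(values) % 2:
        bits.append(format(values[-1], "06b"))
    return "".join(bits)


print(encode_alphanumeric("AC-42"))
```

Lowercase letters are not in the set, so a URL has to be uppercased before this mode can be used.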
Byte Mode
Encodes arbitrary 8-bit bytes, interpreted as ISO-8859-1 by default (or as UTF-8 when an ECI header declares it). Most URLs and text use this mode. Each byte requires 8 bits, so a character costs 8 bits in ISO-8859-1 and between 8 and 32 bits in UTF-8. It is the most versatile mode and handles mixed content including lowercase letters, special characters, and international text.
Kanji Mode
Encodes double-byte Kanji and Kana characters efficiently—two bytes per character encoded in 13 bits. Specifically designed for Japanese text, enabling the same QR code size to hold more Japanese characters than would be possible with byte mode.
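The 13-bit compaction can be sketched for a single character, given its two-byte Shift JIS code as an integer. The ranges and offsets follow the QR standard; the function name is hypothetical, and obtaining the Shift JIS code from text is left to the caller.

```python
def encode_kanji_char(sjis):
    """Return the 13 data bits for one Shift JIS double-byte character."""
    if 0x8140 <= sjis <= 0x9FFC:
        offset = sjis - 0x8140
    elif 0xE040 <= sjis <= 0xEBBF:
        offset = sjis - 0xC140
    else:
        raise ValueError("outside the Kanji mode ranges")
    # Combine the two bytes of the offset into one 13-bit value.
    value = (offset >> 8) * 0xC0 + (offset & 0xFF)
    return format(value, "013b")


print(encode_kanji_char(0x935F))
```

Byte mode would spend 16 bits on the same character, so Kanji mode saves 3 bits per character.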
Error Correction: The Reed-Solomon Magic
QR codes can be partially damaged, obscured, or dirty and still scan correctly. This is due to Reed-Solomon error correction codes—the same algorithm used in CDs, DVDs, and space probes.
How Reed-Solomon Works
Reed-Solomon codes add mathematically derived redundant codewords to the data. If some codewords are lost or corrupted (damaged modules), the decoder uses the redundant codewords to reconstruct the missing information through polynomial interpolation.
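To make the idea concrete, here is a compact sketch of the encoding half: it computes the redundant codewords as the remainder of a polynomial division over GF(256), using the field polynomial 0x11D that QR codes use. The helper names are made up for this example, and the much more involved decoding step is not shown.

```python
# Log/antilog tables for GF(256) with x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def rs_generator(n_ec):
    """Generator polynomial with roots alpha^0 .. alpha^(n_ec - 1)."""
    gen = [1]                                  # highest degree first
    for i in range(n_ec):
        nxt = [0] * (len(gen) + 1)
        for j, coef in enumerate(gen):
            nxt[j] ^= coef                     # coef * x
            nxt[j + 1] ^= gf_mul(coef, EXP[i])  # coef * alpha^i
        gen = nxt
    return gen


def rs_encode(data, n_ec):
    """Return the n_ec error correction codewords for data."""
    gen = rs_generator(n_ec)
    rem = list(data) + [0] * n_ec
    for i in range(len(data)):                 # synthetic division
        factor = rem[i]
        if factor:
            for j, coef in enumerate(gen):
                rem[i + j] ^= gf_mul(coef, factor)
    return rem[len(data):]


def poly_eval(poly, point):
    """Evaluate a polynomial (highest degree first) at a field element."""
    y = 0
    for coef in poly:
        y = gf_mul(y, point) ^ coef
    return y


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
codeword = data + rs_encode(data, 10)
print([poly_eval(codeword, EXP[i]) for i in range(10)])
```

An undamaged codeword evaluates to zero at every root of the generator. A decoder computes the same evaluations, and any non-zero result tells it that damage occurred and feeds the reconstruction.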
Error Correction Levels
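Level L (Low) can recover roughly 7% of the codewords. Level M (Medium) recovers about 15%, Level Q (Quartile) about 25%, and Level H (High) about 30%. Because Reed-Solomon needs about two error correction codewords for every damaged codeword it repairs, the redundancy actually added is roughly double those figures. Higher levels recover from more damage but leave less room for data, so the same content needs a larger, denser code.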
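The percentages are easiest to see on a Version 1 symbol, which has 26 codewords in total. The figures below are the Version 1 values as given in the QR standard's error correction table, quoted here from memory and worth checking against the specification; the helper name is invented for this sketch.

```python
TOTAL_CODEWORDS = 26   # Version 1

# level -> (error correction codewords, codeword errors that can be repaired)
VERSION_1 = {"L": (7, 2), "M": (10, 4), "Q": (13, 6), "H": (17, 8)}


def summary(level):
    """Return (percent of symbol spent on redundancy, percent recoverable)."""
    ec, correctable = VERSION_1[level]
    return (round(100 * ec / TOTAL_CODEWORDS),
            round(100 * correctable / TOTAL_CODEWORDS))


for level in "LMQH":
    print(level, summary(level))
```

The second number in each pair lands close to the familiar 7, 15, 25 and 30 percent, while the first shows how much of the symbol the redundancy occupies.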
Why Logos Work
When you place a logo over a QR code, you are deliberately destroying some data modules. With high error correction (H), roughly 30% of the codewords can be destroyed and the data is still recoverable. The logo must stay well below this threshold, because print defects, dirt, and glare draw on the same error budget.
Practical Damage Tolerance
A QR code with H error correction can be scratched, partially torn, printed over a texture, or have a logo overlay and still scan. This physical resilience is why QR codes work reliably in real-world conditions.
Mask Patterns: Avoiding Visual Artifacts
QR codes apply one of eight mask patterns to the data modules before finalizing the code. This step is crucial but invisible to the end user.
Why Masking is Necessary
Without masking, certain data patterns could create visual artifacts that confuse scanners—like large blocks of solid color resembling finder patterns, or long alternating lines interfering with timing patterns. The encoding algorithm evaluates all 8 mask options and selects the one that minimizes these artifacts.
How Masking Works
Each mask pattern is a formula over module coordinates (row and column). For every module in the data area, dark or light, the module's color is flipped if the formula evaluates to true for that position; function patterns such as the finder and timing patterns are never masked. Because flipping is an XOR, the decoder recovers the original data by applying the same mask a second time.
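The eight formulas from the QR standard can be written down directly, with `i` as the row and `j` as the column. The sketch below treats every module as a data module unless an `is_data` callback says otherwise; that callback and the function name are assumptions made for this example.

```python
MASKS = [
    lambda i, j: (i + j) % 2 == 0,
    lambda i, j: i % 2 == 0,
    lambda i, j: j % 3 == 0,
    lambda i, j: (i + j) % 3 == 0,
    lambda i, j: (i // 2 + j // 3) % 2 == 0,
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    lambda i, j: ((i + j) % 2 + (i * j) % 3) % 2 == 0,
]


def apply_mask(grid, mask_id, is_data=lambda i, j: True):
    """Flip every data module where the mask formula is true."""
    mask = MASKS[mask_id]
    return [
        [bit ^ 1 if is_data(i, j) and mask(i, j) else bit
         for j, bit in enumerate(row)]
        for i, row in enumerate(grid)
    ]


blank = [[0] * 6 for _ in range(6)]
masked = apply_mask(blank, 0)
print(masked[0])
print(apply_mask(masked, 0) == blank)
```

Applying the same mask twice returns the original grid, which is all a decoder needs to undo it.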
The Four Penalty Scores
The encoder calculates four penalty scores for each candidate mask: runs of same-color modules, 2×2 blocks, patterns resembling finder patterns, and proportion of dark vs. light modules. The mask with the lowest total penalty score is used, maximizing scan reliability.
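The four rules can be sketched as below, using the commonly cited weights of 3, 3, 40 and 10 points. The grid is a list of rows of 0 and 1, the function names are invented, and edge cases in the standard (for example how the light margin is treated in the finder-like rule) are simplified.

```python
FINDER_LIKE = (
    [1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
)


def run_penalty(line):
    """Rule 1: 3 points for a run of five, plus 1 per additional module."""
    score, run = 0, 1
    for prev, cur in zip(line, line[1:]):
        if cur == prev:
            run += 1
        else:
            if run >= 5:
                score += run - 2
            run = 1
    if run >= 5:
        score += run - 2
    return score


def finder_like_penalty(line):
    """Rule 3: 40 points per 1:1:3:1:1 look-alike with four light modules."""
    return 40 * sum(
        list(line[k:k + 11]) in FINDER_LIKE for k in range(len(line) - 10)
    )


def penalty(grid):
    """Total penalty of a square grid of 0/1 modules."""
    size = len(grid)
    lines = grid + [list(col) for col in zip(*grid)]   # rows, then columns
    rule1 = sum(run_penalty(line) for line in lines)
    rule2 = 3 * sum(                                    # same-color 2x2 blocks
        grid[r][c] == grid[r][c + 1] == grid[r + 1][c] == grid[r + 1][c + 1]
        for r in range(size - 1) for c in range(size - 1)
    )
    rule3 = sum(finder_like_penalty(line) for line in lines)
    dark, total = sum(map(sum, grid)), size * size
    rule4 = 10 * (abs(100 * dark - 50 * total) // (5 * total))
    return rule1 + rule2 + rule3 + rule4


solid = [[1] * 6 for _ in range(6)]
checker = [[(r + c) % 2 for c in range(6)] for r in range(6)]
print(penalty(solid), penalty(checker))
```

An encoder would call `penalty` once per candidate mask and keep the mask with the smallest result.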
How Scanners Read QR Codes
The scanning process is a multi-step pipeline that transforms a camera image into decoded data in milliseconds.
1. Image Capture and Preprocessing
The camera captures a frame. The app converts it to grayscale and applies binarization (thresholding) to create a pure black-and-white image. Edge detection identifies the boundaries of the QR code in the image.
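A very small sketch of the grayscale and thresholding steps is shown below, using plain nested lists instead of an image library. It applies a single global threshold at the mean brightness, which is an assumption made for brevity; production scanners generally use local adaptive thresholds to cope with shadows and uneven lighting.

```python
def to_gray(rgb):
    """Convert an (r, g, b) pixel to 0-255 luminance (ITU-R BT.601 weights)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(gray_rows):
    """Return rows of 1 (dark) and 0 (light) using the mean as threshold."""
    pixels = [p for row in gray_rows for p in row]
    threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray_rows]


frame = [[(20, 20, 20), (240, 240, 240)],
         [(250, 250, 250), (10, 10, 10)]]
gray = [[to_gray(px) for px in row] for row in frame]
print(binarize(gray))
```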
2. Finder Pattern Detection
The algorithm searches for the characteristic 1:1:3:1:1 module ratio in horizontal and vertical scan lines—the signature of finder patterns. When all three are found, the code location and orientation are established.
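The ratio test can be sketched on a single binarized scan line: collapse the pixels into runs, then slide a window of five runs and check whether their lengths are close to 1:1:3:1:1. The 50% tolerance and all function names are assumptions chosen for this illustration, not values taken from a particular scanner.

```python
def run_lengths(line):
    """Collapse a line of 0/1 pixels into [[value, length], ...]."""
    runs = []
    for px in line:
        if runs and runs[-1][0] == px:
            runs[-1][1] += 1
        else:
            runs.append([px, 1])
    return runs


def looks_like_finder(lengths, tolerance=0.5):
    """True if five run lengths approximate the 1:1:3:1:1 ratio."""
    if len(lengths) != 5:
        return False
    module = sum(lengths) / 7          # estimated width of one module
    return all(
        abs(n - k * module) <= k * module * tolerance
        for n, k in zip(lengths, (1, 1, 3, 1, 1))
    )


def find_candidates(line):
    """Return the x positions of finder-like centers on a scan line."""
    runs = run_lengths(line)
    hits, start = [], 0
    for idx in range(len(runs)):
        window = runs[idx:idx + 5]
        lengths = [n for _, n in window]
        if window[0][0] == 1 and looks_like_finder(lengths):
            hits.append(start + sum(lengths) / 2)
        start += runs[idx][1]
    return hits


line = [0] * 4 + [1] * 2 + [0] * 2 + [1] * 6 + [0] * 2 + [1] * 2 + [0] * 4
print(find_candidates(line))
```

A full detector repeats this on many rows and columns and keeps only the centers confirmed in both directions.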
3. Perspective Correction
Using the three finder patterns plus a fourth reference point (an alignment pattern in Version 2 and above, or an estimated bottom-right corner), the decoder applies a perspective transformation to create a normalized, square view of the code regardless of the camera angle, distance, or tilt.
4. Module Sampling
The timing patterns establish the module grid. The decoder samples the center of each module position in the normalized image to determine if it is dark or light, producing a binary grid of 1s and 0s.
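The mapping and sampling steps can be sketched with a simplified affine model built from the three finder pattern centers. This handles rotation, scale, and skew but not true perspective, which needs a fourth reference point; the function names and the nested-list image format are assumptions for this example.

```python
def module_to_pixel(row, col, size, top_left, top_right, bottom_left):
    """Map the center of module (row, col) to image coordinates (x, y).

    Finder centers sit at module index 3 from their corner, so two
    neighboring centers are size - 7 modules apart.
    """
    span = size - 7
    u = (col - 3) / span    # fraction of the way from top-left to top-right
    v = (row - 3) / span    # fraction of the way from top-left to bottom-left
    x = (top_left[0] + u * (top_right[0] - top_left[0])
         + v * (bottom_left[0] - top_left[0]))
    y = (top_left[1] + u * (top_right[1] - top_left[1])
         + v * (bottom_left[1] - top_left[1]))
    return x, y


def sample_grid(image, size, top_left, top_right, bottom_left):
    """Read one pixel at the center of every module; image[y][x] is 0 or 1."""
    grid = []
    for row in range(size):
        out = []
        for col in range(size):
            x, y = module_to_pixel(row, col, size,
                                   top_left, top_right, bottom_left)
            out.append(image[round(y)][round(x)])
        grid.append(out)
    return grid


# Synthetic 21x21 checkerboard drawn at 10 pixels per module.
image = [[((y // 10) + (x // 10)) % 2 for x in range(210)] for y in range(210)]
grid = sample_grid(image, 21, (35, 35), (175, 35), (35, 175))
print(grid[0][:5])
```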
5. Format Decoding
The format information modules are read first to determine the error correction level and mask pattern. The mask is reversed on the data modules to reveal the encoded codewords.
6. Error Correction and Data Decoding
Reed-Solomon error correction is applied to repair any corrupted codewords. The corrected data codewords are then decoded segment by segment, each according to its mode indicator (numeric, alphanumeric, byte, or Kanji), to produce the final text, URL, or other data.
QR Code Versions and Capacity
QR codes come in 40 versions, each with different sizes and data capacities.
Version Sizing
Version 1 is 21×21 modules and, at error correction level L, can store up to 41 numeric characters, 25 alphanumeric characters, or 17 bytes. Each subsequent version adds 4 modules to each side. Version 40 is 177×177 modules and, again at level L, can store up to 7,089 numeric characters or 4,296 alphanumeric characters.
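The sizing rule reduces to a one-line formula, side = 17 + 4 × version. A small sketch, with illustrative function names:

```python
def modules_per_side(version):
    """Return the side length in modules for a QR version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    return 17 + 4 * version


def version_from_size(size):
    """Inverse of modules_per_side; useful once a scanner knows the grid."""
    version, remainder = divmod(size - 17, 4)
    if remainder or not 1 <= version <= 40:
        raise ValueError("not a valid QR code size")
    return version


print(modules_per_side(1), modules_per_side(40))
```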
Auto-Version Selection
QR code generators like QR Creator automatically select the smallest version that can hold your data at the chosen error correction level. Shorter URLs produce simpler, smaller codes that are easier and faster to scan.
Why URL Length Matters
A short URL like "qrcreatorapp.com" generates a small, simple QR code. A long UTM-tagged URL like "qrcreatorapp.com/promo?utm_source=flyer&utm_medium=print&utm_campaign=black_friday_2025" needs a higher version with more, smaller modules at the same print size. This is why shortening the URL makes the printed code easier to scan.
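Version selection can be sketched as a table lookup. The table below lists byte-mode capacities for Versions 1 to 5 only, and the function is a hypothetical illustration rather than how any particular generator is implemented; a real generator covers all 40 versions and also considers the more compact modes.

```python
# Byte-mode capacity in characters, per version and error correction level.
BYTE_CAPACITY = {
    1: {"L": 17, "M": 14, "Q": 11, "H": 7},
    2: {"L": 32, "M": 26, "Q": 20, "H": 14},
    3: {"L": 53, "M": 42, "Q": 32, "H": 24},
    4: {"L": 78, "M": 62, "Q": 46, "H": 34},
    5: {"L": 106, "M": 84, "Q": 60, "H": 44},
}


def smallest_version(text, ec_level="L"):
    """Return the smallest listed version that fits text, or None."""
    needed = len(text.encode("iso-8859-1"))
    for version in sorted(BYTE_CAPACITY):
        if BYTE_CAPACITY[version][ec_level] >= needed:
            return version
    return None   # needs a version beyond this partial table


short_url = "qrcreatorapp.com"
long_url = ("qrcreatorapp.com/promo?utm_source=flyer"
            "&utm_medium=print&utm_campaign=black_friday_2025")
print(smallest_version(short_url), smallest_version(long_url))
```

At level L the short URL fits in Version 1 (21×21 modules), while the tagged URL needs Version 5 (37×37 modules).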
Conclusion
QR codes are a remarkably elegant engineering solution—encoding arbitrary data in a visually scannable format with built-in redundancy, orientation detection, and damage recovery. The technical constraints that govern their design (contrast, quiet zone, error correction, module density) are not arbitrary limitations but direct consequences of the underlying encoding mathematics. Understanding this architecture helps you make better design decisions, troubleshoot scanning problems, and appreciate why certain rules—like maintaining high contrast and respecting the quiet zone—are truly non-negotiable.
