Compact Size
Compact Size (also known as CompactSize or var_int) is a variable-length integer encoding used in Bitcoin to efficiently represent numbers while minimizing space usage. It's primarily used in transaction data and network messages to indicate:
- Number of inputs/outputs in a transaction
- Script sizes
- Number of witness elements
- Length of upcoming data fields
Structure
The encoding uses a prefix byte to determine how to read the subsequent bytes:
| Prefix | Format | Range | Total Size |
|---|---|---|---|
| ≤ 0xFC | Direct value | 0 to 252 | 1 byte |
| 0xFD | Next 2 bytes (LE) | 253 to 65,535 | 3 bytes |
| 0xFE | Next 4 bytes (LE) | 65,536 to 4,294,967,295 | 5 bytes |
| 0xFF | Next 8 bytes (LE) | 4,294,967,296 to 2^64 - 1 | 9 bytes |
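The table can be read as a simple lookup from value to encoded length. The following is a minimal sketch of that lookup; the function name `compact_size_length` is an assumption made for illustration and does not come from any Bitcoin library.

```python
def compact_size_length(value: int) -> int:
    """Return how many bytes the compact size encoding of `value` occupies.

    `compact_size_length` is a hypothetical helper that mirrors the table above.
    """
    if value < 0 or value > 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    if value <= 0xFC:
        return 1  # value is stored directly in the prefix byte
    if value <= 0xFFFF:
        return 3  # 0xFD prefix + 2 little-endian bytes
    if value <= 0xFFFFFFFF:
        return 5  # 0xFE prefix + 4 little-endian bytes
    return 9      # 0xFF prefix + 8 little-endian bytes


# Print the encoded length at each boundary listed in the table
for v in (0, 252, 253, 65_535, 65_536, 4_294_967_295, 4_294_967_296):
    print(v, compact_size_length(v))
```

The boundaries are the interesting cases: 252 still fits in one byte, while 253 already needs the 0xFD prefix because the byte values 0xFD, 0xFE, and 0xFF are reserved as markers.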
How to Read It
Let's analyze real Bitcoin transactions to understand compact size encoding:
Transaction with 1 Input -> 5e6e1a9b4ce3f9f...fedc9a758fe614
- Looking at input count: 01
- Since 1 ≤ 252, it's a direct value
- Meaning: Transaction has 1 input
- Total bytes used: 1
ScriptSig of 107 bytes -> c27c4d2236fce2a...99c3102ccb45ef
- Looking at scriptsig size: 6b
- Since 107 (0x6b) ≤ 252, it's a direct value
- Meaning: ScriptSig is 107 bytes long
- Total bytes used: 1
ScriptPubKey of 4,026 bytes -> e411dbebd2f7d64...347abee1e1a455
- Looking at scriptpubkey size: fd ba 0f
- First byte is 0xFD, so read next 2 bytes
- Next bytes: ba 0f (in little-endian)
- Convert from little-endian: 0fba = 4,026 in decimal
- Meaning: ScriptPubKey is 4,026 bytes long
- Total bytes used: 3
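The three walkthroughs can be reproduced in a few lines of Python. This is a sketch only: `decode_compact_size` is a hypothetical helper that works on a byte string instead of a stream, and it returns the number of bytes consumed so the "Total bytes used" figures can be checked as well.

```python
def decode_compact_size(data: bytes) -> tuple:
    """Return (value, bytes_consumed) for a compact size at the start of `data`.

    Hypothetical helper for checking the worked examples by hand.
    """
    if not data:
        raise ValueError("no data to decode")
    prefix = data[0]
    if prefix <= 0xFC:
        return prefix, 1  # direct value, the prefix byte is the number
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[prefix]
    payload = data[1:1 + width]
    if len(payload) != width:
        raise ValueError("truncated compact size")
    return int.from_bytes(payload, "little"), 1 + width


print(decode_compact_size(bytes.fromhex("01")))      # (1, 1)
print(decode_compact_size(bytes.fromhex("6b")))      # (107, 1)
print(decode_compact_size(bytes.fromhex("fdba0f")))  # (4026, 3)
```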
Understanding Little-Endian
In the third example, the bytes appear as ba 0f instead of 0f ba because of little-endian encoding: the least significant byte is stored first. Think of it as reading the bytes from right to left: 0fba is the actual number in hexadecimal.
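Python's built-in `int.from_bytes` and `int.to_bytes` make the byte order explicit, so the conversion can be checked directly. The snippet below is a small illustration using the two payload bytes from the third example.

```python
raw = bytes.fromhex("ba0f")  # payload bytes as they appear in the transaction

# Little-endian: the first byte (0xba) is the least significant one
value = int.from_bytes(raw, "little")
print(hex(value), value)  # 0xfba 4026

# The same arithmetic written out by hand
assert value == 0xBA + 0x0F * 256

# Encoding 4026 back to 2 little-endian bytes restores the original order
print(value.to_bytes(2, "little").hex())  # ba0f

# Reading the same bytes as big-endian gives a different, wrong number
print(int.from_bytes(raw, "big"))  # 47631
```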
Tip
You can verify these values yourself by looking at the raw transaction data on the block explorer. Click the transaction IDs above and look for the "Raw Transaction" section.
Implementation
Here's a Python implementation for reading and writing compact size integers:
def read_varint(s):
    '''Reads a variable integer from a stream

    The first byte determines the format:
    - If < 0xfd: directly contains the number
    - If 0xfd: next 2 bytes contain number
    - If 0xfe: next 4 bytes contain number
    - If 0xff: next 8 bytes contain number
    '''
    i = s.read(1)[0]
    if i == 0xfd:
        return int.from_bytes(s.read(2), 'little')
    elif i == 0xfe:
        return int.from_bytes(s.read(4), 'little')
    elif i == 0xff:
        return int.from_bytes(s.read(8), 'little')
    else:
        return i


def encode_varint(i):
    '''Encodes an integer as a compact size

    Returns bytes object with the encoded number
    '''
    if i < 0xfd:
        return bytes([i])
    elif i < 0x10000:
        return b'\xfd' + i.to_bytes(2, 'little')
    elif i < 0x100000000:
        return b'\xfe' + i.to_bytes(4, 'little')
    else:
        return b'\xff' + i.to_bytes(8, 'little')
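The reader above trusts its input: it accepts truncated data silently and does not notice when a small number has been written with a longer prefix than necessary (for example fd 01 00 for the value 1). Bitcoin Core's deserializer rejects such non-canonical encodings. The following is a hedged sketch of a stricter reader along those lines; the name `read_compact_size_strict` and the choice of exception types are assumptions for illustration, not part of any library.

```python
import io


def read_compact_size_strict(stream) -> int:
    """Read a compact size, rejecting truncated and non-minimal encodings.

    Hypothetical stricter variant of the reader shown above.
    """
    prefix = stream.read(1)
    if len(prefix) != 1:
        raise EOFError("stream ended before the prefix byte")
    first = prefix[0]
    if first < 0xFD:
        return first  # direct value

    # For each marker: payload width and the smallest value that needs it
    width, minimum = {
        0xFD: (2, 0xFD),
        0xFE: (4, 0x10000),
        0xFF: (8, 0x100000000),
    }[first]
    payload = stream.read(width)
    if len(payload) != width:
        raise EOFError("stream ended inside the compact size payload")
    value = int.from_bytes(payload, "little")
    if value < minimum:
        raise ValueError("non-canonical compact size")
    return value


print(read_compact_size_strict(io.BytesIO(bytes.fromhex("fdba0f"))))  # 4026
```

Feeding it `fd 01 00` raises `ValueError`, because 1 fits in a single byte and should have been encoded as `01`.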
Common Uses in Bitcoin
Transaction Structure
- Input count
- Output count
- ScriptSig size
- ScriptPubKey size
- Witness element count
Example field (input count): size variable, format Compact Size, example value 01.
Block Data
- Number of transactions in a block
Network Messages
- Length of upcoming message data
- Number of inventory items
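Most of these uses follow the same two patterns: a compact size giving the byte length of the field that follows, or a compact size giving the number of items that follow. A minimal sketch of both patterns on the writing side is shown below; `compact_size`, `with_length_prefix`, and `with_count_prefix` are hypothetical names chosen for this example.

```python
def compact_size(n: int) -> bytes:
    """Encode `n` as a compact size (hypothetical helper)."""
    if n < 0 or n > 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


def with_length_prefix(data: bytes) -> bytes:
    """Prefix a byte field (for example a script) with its length."""
    return compact_size(len(data)) + data


def with_count_prefix(items: list) -> bytes:
    """Prefix a list of already-serialized items with the item count."""
    return compact_size(len(items)) + b"".join(items)


# Placeholder fields of the sizes used in the examples earlier on this page
print(with_length_prefix(bytes(107))[:1].hex())   # 6b
print(with_length_prefix(bytes(4026))[:3].hex())  # fdba0f

# One placeholder item, so the count prefix is 01
print(with_count_prefix([b"\xaa\xbb"]).hex())     # 01aabb
```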
Warning
The compact size format is part of Bitcoin's consensus rules. Incorrect encoding or decoding can lead to invalid transactions or network messages.
Important Notes
- Most counts and sizes in typical transactions fit the single-byte encoding (values ≤ 252)
- The 0xFF prefix (8-byte numbers) is rarely used in practice
- All multi-byte numbers must be encoded in little-endian format
- This encoding has been part of Bitcoin since its first release
This variable-length integer encoding plays a crucial role in keeping transaction and block data compact while maintaining flexibility for larger values when needed.