Endianness
Endianness refers to the order in which bytes are stored and read in a computer's memory.
To understand it, imagine reading directions in different languages: while English Hex flows from left to right, Arabic Hex flows from right to left.
Similarly, computers have two ways to store data:
- Big-endian (BE): Most significant byte first
- Little-endian (LE): Least significant byte first
1- Big-Endian
Big-endian stores the most significant byte first. This is similar to how humans read numbers and Hex in most cases: starting with the most important information.
Suppose we want to store the number 12345678 (hexadecimal: 0x00BC614E
) in memory. In big-endian, the bytes are stored in this order:
00 BC 61 4E
- Most significant byte (
00
) is stored at the lowest memory address (00). - Least significant byte (
4E
) is stored at the highest address (03).
Big-endian is considered more "human-readable" because the data is stored in the order we naturally read it.
2. Little-Endian
Little-endian stores the least significant byte first. This might feel counterintuitive to humans but is more efficient for modern processors.
Using the same number 12345678 (0x00BC614E
), here's how it looks in little-endian:
4E 61 BC 00
- Least significant byte (
4E
) is stored at the lowest memory address (00). - Most significant byte (
00
) is stored at the highest address (03).
This "reversal" is common in Bitcoin's internal data representation.
3. Endianness in Bitcoin
In Bitcoin, little-endian is the standard for storing most data, like transaction IDs, block headers, and amounts. However, when this data is displayed to humans (e.g., in block explorers), it is often converted to big-endian for readability.
Size
4 bytes
Format
Little-Endian
Description
The version number for the transaction. Used to enable new features.
Example
02000000
Byte Visualization
Bitcoin Transaction Example
Let's say a transaction output amount is 12345678 satoshis. In Bitcoin:
- This value is stored as a 64-bit integer (8 bytes) in little-endian format.
- To humans, the hexadecimal representation would look like this in big-endian:
00 00 00 00 00 BC 61 4E
- In little-endian, this is reversed:
4E 61 BC 00 00 00 00 00
If you were decoding raw Bitcoin transaction data, you'd need to reverse the byte order to understand the values correctly.
Why Does Bitcoin Use Little-Endian?
Bitcoin uses little-endian because Satoshi developed it on a little-endian computer.
Most modern CPUs are little-endian and the network protocols typically use big-endian, which can create a mismatch:
- Big-endian is used for network communication (called network byte order).
- Little-endian is used for internal storage in Bitcoin.
This duality requires developers to frequently convert between the two formats when working with Bitcoin data.
4. Working with Endianness in Code
When working with Bitcoin data, you'll frequently need to convert between little-endian and big-endian formats. Here are some common Python functions for handling endianness:
Little-Endian to Integer
def little_endian_to_int(b):
'''Convert little-endian bytes to integer'''
return int.from_bytes(b, 'little')
# Example usage
bytes_data = bytes.fromhex('4E61BC00') # 12345678 in little-endian
number = little_endian_to_int(bytes_data)
print(number) # Output: 12345678
Integer to Little-Endian
def int_to_little_endian(n, length):
'''Convert integer to little-endian bytes of specified length'''
return n.to_bytes(length, 'little')
# Example usage
number = 12345678
bytes_data = int_to_little_endian(number, 4)
print(bytes_data.hex()) # Output: 4e61bc00
Common Gotchas
- Byte Order Confusion: When reading transaction IDs or hashes:
# INCORRECT - reading txid directly from hex
txid = "4e61bc0000000000000000000000000000000000000000000000000000000000"
# CORRECT - reverse the bytes for proper txid display
txid_bytes = bytes.fromhex(txid)
txid_display = txid_bytes[::-1].hex()
- Length Specification: When converting to little-endian, always specify the correct byte length:
# For Bitcoin amounts (8 bytes)
amount = int_to_little_endian(12345678, 8) # Correct
amount = int_to_little_endian(12345678, 4) # Wrong - too short!