Bitmessage Protocol Version 3
Common standards
Hashes
SHA-512 hashes are used in most places; RIPEMD-160 is additionally used when creating an address.
A double-round of SHA-512 is used for the Proof Of Work. Example of double-SHA-512 encoding of string "hello":
hello
9b71d224bd62f3785d96d46ad3ea3d73319bfbc2890caadae2dff72519673ca72323c3d99ba5c11d7c7acc6e14b8c5da0c4663475c2e5c3adef46f73bcdec043 (first round of SHA-512)
0592a10584ffabf96539f3d780d776828c67da1ab5b169e9e8aed838aaecc9ed36d49ff1423c55f019e050c66c6324f53588be88894fef4dcffdb74b98e2b200 (second round of SHA-512)
For Bitmessage addresses (RIPEMD-160) this would give:
hello
9b71d224bd62f3785d96d46ad3ea3d73319bfbc2890caadae2dff72519673ca72323c3d99ba5c11d7c7acc6e14b8c5da0c4663475c2e5c3adef46f73bcdec043 (first round is SHA-512)
79a324faeebcbf9849f310545ed531556882487e (with RIPEMD-160)
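The two hashing patterns above can be sketched with Python's standard `hashlib`. The function names are illustrative and not taken from any Bitmessage implementation. RIPEMD-160 is only available when the underlying OpenSSL build provides it, so the sketch returns `None` in that case rather than failing.

```python
import hashlib


def double_sha512(data: bytes) -> bytes:
    """Return sha512(sha512(data)) as 64 raw bytes."""
    return hashlib.sha512(hashlib.sha512(data).digest()).digest()


def sha512_then_ripemd160(data: bytes):
    """Return ripemd160(sha512(data)) as 20 raw bytes.

    Returns None if this Python build's OpenSSL lacks RIPEMD-160.
    """
    first_round = hashlib.sha512(data).digest()
    try:
        ripe = hashlib.new("ripemd160")
    except ValueError:
        return None
    ripe.update(first_round)
    return ripe.digest()


first_round_hex = hashlib.sha512(b"hello").hexdigest()
second_round_hex = double_sha512(b"hello").hex()
print(first_round_hex)
print(second_round_hex)
```

Note that the second round hashes the raw 64-byte digest of the first round, not its hexadecimal text.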
Common structures
All integers are encoded in big-endian byte order.
Message structure
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | magic | uint32_t | Magic value indicating message origin network, and used to seek to next message when stream state is unknown |
12 | command | char[12] | ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected) |
4 | length | uint32_t | Length of payload in number of bytes. The maximum allowed value is 1,600,003 bytes |
4 | checksum | uint32_t | First 4 bytes of sha512(payload) |
? | message_payload | uchar[] | The actual data, a message. Not to be confused with objectPayload. |
Known magic values:
Magic value | Sent over wire as |
---|---|
0xE9BEB4D9 | E9 BE B4 D9 |
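As a minimal sketch of the message structure above, the following packs and unpacks the 24-byte header with Python's `struct` module. The helper names `pack_message` and `unpack_message` are hypothetical; only the field layout, the magic value, and the length limit come from the tables above.

```python
import hashlib
import struct

MAGIC = 0xE9BEB4D9
MAX_PAYLOAD_LENGTH = 1_600_003
# magic (uint32), command (char[12]), length (uint32), checksum (4 bytes)
HEADER = struct.Struct(">L12sL4s")


def pack_message(command: bytes, payload: bytes = b"") -> bytes:
    """Prefix payload with a header; struct NUL-pads the command field."""
    if len(command) > 12:
        raise ValueError("command longer than 12 bytes")
    if len(payload) > MAX_PAYLOAD_LENGTH:
        raise ValueError("payload exceeds the maximum allowed length")
    checksum = hashlib.sha512(payload).digest()[:4]
    return HEADER.pack(MAGIC, command, len(payload), checksum) + payload


def unpack_message(data: bytes):
    """Return (command, payload) or raise ValueError on a malformed message."""
    magic, command, length, checksum = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("unknown magic value")
    name = command.rstrip(b"\x00")
    if b"\x00" in name:
        raise ValueError("command is not NUL padded")
    if length > MAX_PAYLOAD_LENGTH:
        raise ValueError("payload exceeds the maximum allowed length")
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    if hashlib.sha512(payload).digest()[:4] != checksum:
        raise ValueError("checksum mismatch")
    return name, payload


verack = pack_message(b"verack")
print(verack.hex())
```

A message with an empty payload, such as `verack`, is exactly 24 bytes long.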
Variable length integer
An integer can be encoded in a variable number of bytes, depending on the represented value, to save space. Variable length integers always precede an array/vector of a type of data that may vary in length. Varints MUST use the minimum possible number of bytes to encode a value. For example, the value 6 can be encoded with one byte; therefore a varint that uses three bytes to encode the value 6 is malformed and the decoding task must be aborted.
Value | Storage length | Format |
---|---|---|
< 0xfd | 1 | uint8_t |
<= 0xffff | 3 | 0xfd followed by the integer as uint16_t |
<= 0xffffffff | 5 | 0xfe followed by the integer as uint32_t |
- | 9 | 0xff followed by the integer as uint64_t |
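The table above translates fairly directly into an encoder and a decoder. This is a sketch with illustrative function names; the decoder rejects non-minimal encodings, as the text requires, by raising `ValueError`. Truncated input surfaces as `IndexError` or `struct.error` in this sketch.

```python
import struct


def encode_varint(value: int) -> bytes:
    """Encode value using the shortest form allowed by the table."""
    if value < 0:
        raise ValueError("varints are unsigned")
    if value < 0xFD:
        return struct.pack(">B", value)
    if value <= 0xFFFF:
        return b"\xfd" + struct.pack(">H", value)
    if value <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack(">I", value)
    if value <= 0xFFFFFFFFFFFFFFFF:
        return b"\xff" + struct.pack(">Q", value)
    raise ValueError("value does not fit in 64 bits")


def decode_varint(data: bytes, offset: int = 0):
    """Return (value, bytes_consumed) for the varint starting at offset."""
    first = data[offset]
    if first < 0xFD:
        return first, 1
    # For each prefix: the struct format and the smallest value that
    # actually needs this many bytes.
    fmt, minimum = {
        0xFD: (">H", 0xFD),
        0xFE: (">I", 0x10000),
        0xFF: (">Q", 0x100000000),
    }[first]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    if value < minimum:
        raise ValueError("varint is not minimally encoded")
    return value, 1 + struct.calcsize(fmt)


print(encode_varint(6).hex())     # 06
print(encode_varint(1000).hex())  # fd03e8
```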
Variable length string
A variable length string can be stored using a variable length integer to encode the length followed by the string itself.
Field Size | Description | Data type | Comments |
---|---|---|---|
1 - 9 | length | var_int | Length of the string |
length | string | char[] | The string itself (can be empty) |
Variable length list of integers
n integers can be stored using n+1 variable length integers where the first var_int equals n.
Field Size | Description | Data type | Comments |
---|---|---|---|
1 - 9 | count | var_int | Number of var_ints below |
1 - 9 | | var_int | The first value stored |
1 - 9 | | var_int | The second value stored... |
1 - 9 | | var_int | etc... |
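Both composite structures are a varint followed by the data it counts. The sketch below shows the encoding direction only; the function names are illustrative, and the compact `encode_varint` helper simply repeats the varint rules so that the example is self-contained.

```python
import struct


def encode_varint(value: int) -> bytes:
    """Minimal-length varint, following the varint table."""
    if value < 0xFD:
        return struct.pack(">B", value)
    if value <= 0xFFFF:
        return b"\xfd" + struct.pack(">H", value)
    if value <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack(">I", value)
    return b"\xff" + struct.pack(">Q", value)


def encode_var_str(text: bytes) -> bytes:
    """Length prefix followed by the string bytes (may be empty)."""
    return encode_varint(len(text)) + text


def encode_var_int_list(values) -> bytes:
    """Count prefix followed by each value as its own varint."""
    values = list(values)
    return encode_varint(len(values)) + b"".join(encode_varint(v) for v in values)


print(encode_var_str(b"hello").hex())         # 0568656c6c6f
print(encode_var_int_list([1, 1000]).hex())   # 0201fd03e8
```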
Network address
When a network address is needed somewhere, this structure is used. Network addresses are not prefixed with a timestamp or stream in the version message.
Field Size | Description | Data type | Comments |
---|---|---|---|
8 | time | uint64_t | the Time. |
4 | stream | uint32_t | Stream number for this node |
8 | services | uint64_t | same service(s) listed in version |
16 | IPv6/4 | char[16] | IPv6 address. IPv4 addresses are written into the message as a 16 byte IPv4-mapped IPv6 address (12 bytes 00 00 00 00 00 00 00 00 00 00 FF FF, followed by the 4 bytes of the IPv4 address). Hidden Service addresses can be represented as an IPv6 address with a 48-bit routing prefix of fd87:d87e:eb43::/48 under the Unique Local Address block (fc00::/7), with the remaining 10 bytes being the Base256 encoding of the Hidden Service address |
2 | port | uint16_t | port number |
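A sketch of the network address encoding, using the standard `ipaddress` and `struct` modules. The function names, the `in_version` flag, and the example port are assumptions made for illustration. The short form omits the time and stream fields, as used inside the version message.

```python
import ipaddress
import struct
import time


def encode_ip(host: str) -> bytes:
    """Return 16 bytes: native IPv6, or IPv4 mapped into IPv6."""
    addr = ipaddress.ip_address(host)
    if addr.version == 4:
        return b"\x00" * 10 + b"\xff\xff" + addr.packed
    return addr.packed


def encode_net_addr(host, port, services=1, stream=1,
                    timestamp=None, in_version=False) -> bytes:
    """Encode a net_addr: 38 bytes, or 26 bytes inside a version message."""
    body = struct.pack(">Q16sH", services, encode_ip(host), port)
    if in_version:
        return body
    if timestamp is None:
        timestamp = int(time.time())
    return struct.pack(">QI", timestamp, stream) + body


full = encode_net_addr("127.0.0.1", 8444, timestamp=0)
short = encode_net_addr("::1", 8444, in_version=True)
print(len(full), len(short))  # 38 26
```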
Inventory Vectors
Inventory vectors are used for notifying other nodes about objects they have or data which is being requested. Two rounds of SHA-512 are used, resulting in a 64 byte hash. Only the first 32 bytes are used; the remaining 32 bytes are ignored.
Inventory vectors consist of the following data format:
Field Size | Description | Data type | Comments |
---|---|---|---|
32 | hash | char[32] | Hash of the object |
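Computing an inventory vector, as described above, is a truncated double SHA-512. The sketch assumes that `object_data` is the serialized object being advertised; the function name is illustrative.

```python
import hashlib


def inventory_vector(object_data: bytes) -> bytes:
    """Return the first 32 bytes of sha512(sha512(object_data))."""
    digest = hashlib.sha512(hashlib.sha512(object_data).digest()).digest()
    return digest[:32]


# A node can keep known vectors in a set and request only unknown ones.
known = {inventory_vector(b"object one")}
advertised = [inventory_vector(b"object one"), inventory_vector(b"object two")]
to_request = [vec for vec in advertised if vec not in known]
print(len(to_request))  # 1
```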
Identity
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | behavior bitfield | uint32_t | A bitfield of optional behaviors and features of the identity. |
64 | public signing key | uchar[64] | The ECC public key used for signing in uncompressed format without the point compression prefix |
64 | public encryption key | uchar[64] | The ECC public key used for encryption in uncompressed format without the point compression prefix |
3 - 9 | nonce_trials_per_byte | var_int | Used to calculate the difficulty target of messages accepted by this node. The higher this value, the more difficult the Proof of Work must be before this individual will accept the message. This number is the average number of nonce trials a node will have to perform to meet the Proof of Work requirement. 1000 is the network minimum so any lower values will be automatically raised to 1000. |
3 - 9 | extra_bytes | var_int | Used to calculate the difficulty target of messages accepted by this node. The higher this value, the more difficult the Proof of Work must be before this individual will accept the message. This number is added to the data length to make sending small messages more difficult. 1000 is the network minimum so any lower values will be automatically raised to 1000. |
Envelope
Bitmessage uses ECIES to encrypt its messages. For more information, see Encryption.
Plain Envelope
Field Size | Description | Data type | Comments |
---|---|---|---|
16 | IV | uchar[16] | Initialization Vector used for AES-256-CBC |
2 | elliptic curve | uint16_t | Elliptic Curve secp256k1. This is the NID (numerical identifier) 714 (0x02CA) assigned by OpenSSL to represent secp256k1 |
2 | X length | uint16_t | Length of X component of public key R |
X length | X | uchar[X length] | X component of public key R |
2 | Y length | uint16_t | Length of Y component of public key R |
Y length | Y | uchar[Y length] | Y component of public key R |
? | encrypted | uchar[] | Cipher text |
32 | mac | uchar[32] | HMACSHA256 Message Authentication Code |
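The plain envelope layout can be split into its fields as sketched below. This only parses the structure; it performs no decryption or MAC verification. The `PlainEnvelope` type and the function name are hypothetical. Input that ends inside the X component surfaces as `struct.error` in this sketch.

```python
import struct
from typing import NamedTuple

CURVE_SECP256K1 = 714  # 0x02CA, the OpenSSL NID for secp256k1


class PlainEnvelope(NamedTuple):
    iv: bytes
    curve: int
    x: bytes
    y: bytes
    ciphertext: bytes
    mac: bytes


def parse_plain_envelope(data: bytes) -> PlainEnvelope:
    """Split a plain envelope into its fields without decrypting it."""
    iv = data[:16]
    curve, x_len = struct.unpack_from(">HH", data, 16)
    pos = 20
    x = data[pos:pos + x_len]
    pos += x_len
    (y_len,) = struct.unpack_from(">H", data, pos)
    pos += 2
    y = data[pos:pos + y_len]
    pos += y_len
    # The last 32 bytes are the MAC; everything in between is cipher text.
    if len(data) - pos < 32:
        raise ValueError("truncated envelope")
    return PlainEnvelope(iv, curve, x, y, data[pos:-32], data[-32:])


blob = (bytes(16) + struct.pack(">HH", CURVE_SECP256K1, 32) + b"\x01" * 32
        + struct.pack(">H", 32) + b"\x02" * 32 + b"cipher" + b"\x03" * 32)
envelope = parse_plain_envelope(blob)
print(envelope.curve, len(envelope.x), len(envelope.y))  # 714 32 32
```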
Tagged Envelope
A tagged envelope is identical to a plain envelope but prepended with a tag. Tagged envelopes are only used by v4 pubkeys and v5 broadcasts.
Field Size | Description | Data type | Comments |
---|---|---|---|
32 | tag | uchar[32] | The recipient's tag |
1+ | envelope | plain_envelope |
Enumerateds and Flags
Message Encodings
Value | Name | Description |
---|---|---|
0 | IGNORE | Any data with this number may be ignored. The sending node might simply be sharing its public key with you. |
1 | TRIVIAL | UTF-8. No 'Subject' or 'Body' sections. Useful for simple strings of data, like URIs or magnet links. |
2 | SIMPLE | UTF-8. Uses 'Subject' and 'Body' sections. No MIME is used. |
Further values for the message encodings can be decided upon by the community. Any MIME or MIME-like encoding format, should they be used, should make use of Bitmessage's 8-bit bytes.
Identity bitfield features
Bit | Name | Description |
---|---|---|
0 | undefined | The most significant bit at the beginning of the structure. Undefined |
1 | undefined | The next most significant bit. Undefined |
... | ... | ... |
30 | include_destination | Receiving node expects that the RIPE hash encoded in their address precedes the encrypted message data of msg messages bound for them. |
31 | does_ack | If true, the receiving node does send acknowledgements (rather than dropping them). |
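Because bit 0 is the most significant bit of the 32-bit field, bit 31 (does_ack) is the least significant bit on the wire. A small sketch of that numbering follows; the constant and helper names are illustrative.

```python
import struct

INCLUDE_DESTINATION_BIT = 30
DOES_ACK_BIT = 31


def bit_mask(bit: int) -> int:
    """Mask for a bit number where bit 0 is the most significant of 32."""
    if not 0 <= bit <= 31:
        raise ValueError("bit number out of range")
    return 1 << (31 - bit)


def has_feature(bitfield: int, bit: int) -> bool:
    """True if the numbered bit is set in the behavior bitfield."""
    return bool(bitfield & bit_mask(bit))


bitfield = bit_mask(DOES_ACK_BIT)
print(struct.pack(">I", bitfield).hex())               # 00000001
print(has_feature(bitfield, DOES_ACK_BIT))              # True
print(has_feature(bitfield, INCLUDE_DESTINATION_BIT))   # False
```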
Node services
The following services are currently assigned:
Value | Name | Description |
---|---|---|
1 | NODE_NETWORK | This is a normal network node. |
2 | NODE_SSL | This node supports SSL/TLS on the current connection. |
Object Types
Value | Name | Description |
---|---|---|
0 | getpubkey | |
1 | pubkey | |
2 | msg | A msg or msg ack |
3 | broadcast |
Error Levels
Value | Name | Description |
---|---|---|
0 | WARNING | |
1 | ERROR | |
2 | FATAL | A fatal or fatal-like error has occurred. The connection is usually terminated following this error. |
Message types
Undefined messages received on the wire must be ignored.
error
error is used to convey the reason for a following disconnection.
Field Size | Description | Data type | Comments |
---|---|---|---|
1+ | error level | var_int | The error level of this error |
1+ | ban time | var_int | The length of time the emitting node will refuse connections from the receiving node |
1+ | inv_vector_length | var_int | The length of the inventory vector (max: 100). Inventory vectors have a fixed length of 32 bytes, so it is unclear why the size needs to be specified; this field may have been intended as a count of inventory vectors. |
inv_vector_length | inv_vector | inv_vect | The inventory vector of the offending object this error relates to |
1+ | errorText | var_str | The error text (max length: 1000) |
The only error PyBitmessage sends is a FATAL error when it receives a version message whose timestamp differs from its own clock by more than an hour.
version
When a node creates an outgoing connection, it will immediately advertise its version. The remote node will respond with its version. No further communication is possible until both peers have exchanged their versions. A PyBitmessage server responds with verack then version. A bmd server responds with version then verack.
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | version | int32_t | Identifies protocol version being used by the node. The current protocol version is 3. Old nodes will always assume that future protocol version nodes are compatible; it is up to the new client to judge whether the previous version is incompatible and disconnect if it thinks that it is a good idea. |
8 | services | uint64_t | bitfield of features to be enabled for this connection |
8 | timestamp | int64_t | standard UNIX timestamp in seconds |
26 | addr_recv | net_addr | The network address of the node receiving this message (not including the time or stream number) |
26 | addr_from | net_addr | The network address of the node emitting this message (not including the time or stream number and the ip itself is ignored by the receiver) |
8 | nonce | uint64_t | Random nonce used to detect connections to self. |
1+ | user_agent | var_str | User Agent generally in the form of /Application:Version/ (max length: 5000) |
1+ | stream_numbers | var_int_list | The stream numbers that the emitting node is interested in. Sending nodes must not include more than 160,000 stream numbers. |
A "verack" packet shall be sent if the version packet was accepted. Once you have both sent and received a verack message with the remote node, send an addr message advertising up to 1,000 peers of which you are aware, and one or more inv messages advertising all of the valid objects of which you are aware.
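The version payload can be assembled from the fields in the table above, as in the following sketch. The function names, the example user agent, and the use of the same `services` value in both address fields are assumptions made for illustration. The IP written into addr_from is a placeholder, since the receiver ignores it.

```python
import ipaddress
import os
import struct
import time

PROTOCOL_VERSION = 3


def encode_varint(value: int) -> bytes:
    """Minimal-length varint, following the varint table."""
    if value < 0xFD:
        return struct.pack(">B", value)
    if value <= 0xFFFF:
        return b"\xfd" + struct.pack(">H", value)
    if value <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack(">I", value)
    return b"\xff" + struct.pack(">Q", value)


def short_net_addr(services: int, host: str, port: int) -> bytes:
    """26-byte net_addr without the time and stream fields."""
    addr = ipaddress.ip_address(host)
    packed = addr.packed if addr.version == 6 else (
        b"\x00" * 10 + b"\xff\xff" + addr.packed)
    return struct.pack(">Q16sH", services, packed, port)


def build_version_payload(remote_host, remote_port, local_port,
                          user_agent=b"/ExampleClient:0.0.1/", streams=(1,),
                          services=1, nonce=None, now=None) -> bytes:
    streams = tuple(streams)
    if len(user_agent) > 5000:
        raise ValueError("user agent longer than 5000 bytes")
    if len(streams) > 160000:
        raise ValueError("more than 160,000 stream numbers")
    nonce = os.urandom(8) if nonce is None else nonce
    now = int(time.time()) if now is None else now
    return b"".join([
        struct.pack(">iQq", PROTOCOL_VERSION, services, now),
        short_net_addr(services, remote_host, remote_port),  # addr_recv
        short_net_addr(services, "127.0.0.1", local_port),   # addr_from
        nonce,
        encode_varint(len(user_agent)) + user_agent,
        encode_varint(len(streams)) + b"".join(encode_varint(s) for s in streams),
    ])


payload = build_version_payload("192.0.2.1", 8444, 8444, nonce=bytes(8), now=0)
print(len(payload))
```

A node would compare the nonce of an incoming version message with the nonces it has sent itself, in order to detect a connection to self.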
verack
This message is sent in reply to version and has no payload. The TCP timeout starts out at 20 seconds; after verack messages are exchanged, the timeout is raised to 10 minutes.
If both sides announce that they support SSL, they MUST perform an SSL handshake immediately after they both send and receive verack. During this SSL handshake, the TCP client acts as an SSL client, and the TCP server acts as an SSL server. PyBitmessage v0.5.4 or later requires the AECDH-AES256-SHA cipher using the secp256k1 curve over TLSv1.
addr
Provides information on known nodes of the network. Only nodes that have been known to be on the network in the last 3 hours should be advertised. This command is regularly abused, so entries should not be relied upon as being genuine nodes.
Field Size | Description | Data type | Comments |
---|---|---|---|
1 - 3 | count | var_int | Number of address entries (max: 1,000) |
38 x count | list of net_addr | net_addr[] | Address of other nodes on the network. |
inv
Allows a node to advertise its knowledge of one or more objects.
Field Size | Description | Data type | Comments |
---|---|---|---|
1 - 3 | count | var_int | Number of inventory entries (max: 50,000) |
32 x count | list of inv_vect | inv_vect[] | Inventory vectors |
getdata
getdata is used in response to an inv message to retrieve the content of a specific object after filtering known elements.
Field Size | Description | Data type | Comments |
---|---|---|---|
1 - 3 | count | var_int | Number of inventory entries (max: 50,000) |
32 x count | list of inv_vect | inv_vect[] | Inventory vectors |
Current usage shows that getdata only ever contains 1 entry.
ping
ping is used to check whether a peer is still responsive and has no payload. A ping should be sent if a peer has not sent anything for more than 5 minutes. A peer receiving a ping may either close the connection or keep it open. To keep a connection open a peer should respond with either a pong (if it has nothing to send) or any contextually valid message. If a peer is silent for a full 10 minutes the connection should be closed.
pong
pong may be sent in response to a ping. pong may also be sent pre-emptively when a node recognises it hasn't sent anything to a peer for up to 5 minutes and wishes to keep the connection open. pong has no payload.
object
An object is a message which is shared throughout a stream. It is the only message which propagates; all others are only between two nodes. Objects have a type, like 'msg', or 'broadcast'. To be a valid object, the Proof Of Work must be done. The maximum allowable length of an object (not to be confused with the objectPayload) is 2^18 bytes (256 KiB).
Field Size | Description | Data type | Comments |
---|---|---|---|
8 | nonce | uint64_t | A nonce that satisfies the Proof Of Work |
8 | expiresTime | uint64_t | The "end of life" time of this object. An object shall be shared with peers until its end-of-life time has been reached. The node should store the inventory vector of that object for some extra period of time to avoid reloading it from another node with a small time delay. The time may be no further than 28 days + 3 hours in the future. |
4 | objectType | uint32_t | The object type. Nodes should relay objects even if they use an undefined object type. |
1 - 9 | version | var_int | The object's version. |
1 - 9 | stream number | var_int | The stream number in which this object may propagate |
? | objectPayload | uchar[] | This field varies depending on the object type; see below. |
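The object header can be parsed as sketched below, applying the two limits stated in this section: the 2^18 byte maximum length and the 28 days + 3 hours bound on expiresTime. The function names and the returned dictionary are illustrative. Proof Of Work verification is deliberately left out.

```python
import struct
import time

MAX_OBJECT_LENGTH = 2 ** 18                    # 256 KiB
MAX_TIME_TO_LIVE = 28 * 24 * 3600 + 3 * 3600   # 28 days + 3 hours, in seconds


def decode_varint(data: bytes, offset: int = 0):
    """Return (value, bytes_consumed); reject non-minimal encodings."""
    first = data[offset]
    if first < 0xFD:
        return first, 1
    fmt, minimum = {
        0xFD: (">H", 0xFD),
        0xFE: (">I", 0x10000),
        0xFF: (">Q", 0x100000000),
    }[first]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    if value < minimum:
        raise ValueError("varint is not minimally encoded")
    return value, 1 + struct.calcsize(fmt)


def parse_object(data: bytes, now=None) -> dict:
    """Split an object into header fields and objectPayload."""
    if len(data) > MAX_OBJECT_LENGTH:
        raise ValueError("object exceeds 2^18 bytes")
    nonce, expires_time, object_type = struct.unpack_from(">QQI", data)
    now = int(time.time()) if now is None else now
    if expires_time > now + MAX_TIME_TO_LIVE:
        raise ValueError("expiresTime is too far in the future")
    pos = 20
    version, size = decode_varint(data, pos)
    pos += size
    stream, size = decode_varint(data, pos)
    pos += size
    return {
        "nonce": nonce,
        "expires_time": expires_time,
        "object_type": object_type,  # undefined types are still relayed
        "version": version,
        "stream": stream,
        "payload": data[pos:],
    }


example = struct.pack(">QQI", 7, 1000, 2) + b"\x01\x01" + b"payload"
parsed = parse_object(example, now=900)
print(parsed["object_type"], parsed["version"], parsed["stream"])  # 2 1 1
```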
Object types
Here are the payloads for various object types.
getpubkey
When a node has the hash of a public key (from an address) but not the public key itself, it must send out a request for the public key.
v2 and v3 getpubkey
Field Size | Description | Data type | Comments |
---|---|---|---|
20 | ripe | uchar[20] | The ripemd hash of the public key |
v4 getpubkey
Field Size | Description | Data type | Comments |
---|---|---|---|
32 | tag | uchar[32] | The tag derived from the address version, stream number, and ripe |
pubkey
v2 pubkey
This is the first 3 fields of an identity (behavior bitfield, public signing key, and public encryption key). The proof of work difficulty fields (nonce_trials_per_byte and extra_bytes) are omitted, implying the network default values. This version is still in use and supported by PyBitmessage, but PyBitmessage no longer generates new v2 addresses.
Field Size | Description | Data type |
---|---|---|
4 | behavior bitfield | uint32_t |
64 | public signing key | uchar[64] |
64 | public encryption key | uchar[64] |
v3 pubkey
This is the full identity structure, protected by a signature.
Field Size | Description | Data type | Comments |
---|---|---|---|
138 - 150 | identity | Identity | This is the full identity structure |
7 - 73 | signature | var_str | The ECDSA signature covering this structure prepended with the object header (excluding the nonce). The signature is actually two signed positive integers r and s encoded in ASN.1 according to DER |
v4 pubkey
This is essentially an encrypted v3 pubkey, except that the version is 4.
Field Size | Description | Data type | Comments |
---|---|---|---|
? | envelope | tagged_envelope | Encrypted pubkey data |
When version 4 pubkeys are created, most of the data in the pubkey is encrypted. This is done in such a way that only someone who has the Bitmessage address which corresponds to a pubkey can decrypt and use that pubkey. This prevents people from gathering pubkeys sent around the network and using the data from them to create messages to be used in spam or in flooding attacks.
In order to encrypt the pubkey data, a double SHA-512 hash is calculated from the address version number, stream number, and ripe hash of the Bitmessage address that the pubkey corresponds to. The first 32 bytes of this hash are used to create a public and private key pair with which to encrypt and decrypt the pubkey data, using the same algorithm as message encryption (see Encryption). The remaining 32 bytes of this hash are added to the unencrypted part of the pubkey and used as a tag, as above. This allows nodes to determine which pubkey to decrypt when they wish to send a message.
In PyBitmessage, the double hash of the address data is calculated using the python code below:
doubleHashOfAddressData = hashlib.sha512(hashlib.sha512(encodeVarint(addressVersionNumber) + encodeVarint(streamNumber) + hash).digest()).digest()
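The one-liner above depends on PyBitmessage's `encodeVarint` helper. A self-contained sketch of the same derivation follows, with a local varint encoder standing in for that helper and an illustrative function name. It assumes `ripe` is passed as the raw hash bytes, and it stops at splitting the hash: turning the first half into an actual key pair is not shown.

```python
import hashlib
import struct


def encode_varint(value: int) -> bytes:
    """Minimal-length varint, following the varint table."""
    if value < 0xFD:
        return struct.pack(">B", value)
    if value <= 0xFFFF:
        return b"\xfd" + struct.pack(">H", value)
    if value <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack(">I", value)
    return b"\xff" + struct.pack(">Q", value)


def derive_key_and_tag(address_version: int, stream_number: int, ripe: bytes):
    """Return (private_key_seed, tag), each 32 bytes long.

    The first half of the double hash is the basis of the encryption key
    pair; the second half is the tag sent unencrypted.
    """
    address_data = (encode_varint(address_version)
                    + encode_varint(stream_number) + ripe)
    double_hash = hashlib.sha512(hashlib.sha512(address_data).digest()).digest()
    return double_hash[:32], double_hash[32:]


# Placeholder ripe value, for illustration only.
key_seed, tag = derive_key_and_tag(4, 1, bytes(20))
print(tag.hex())
```

A node that wants to send to an address computes the tag in the same way and uses it to pick out the matching pubkey object.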
msg
Used for person-to-person messages.
v1 msg
Field Size | Description | Data type | Comments |
---|---|---|---|
? | envelope | plain_envelope | Encrypted msg data |
Decrypted msg
Field Size | Description | Data type | Comments |
---|---|---|---|
139 - 151 | sender | Identity | Sender's identity prepended with the version. The proof of work difficulty (nonce_trials_per_byte and extra_bytes fields) may be personalised for the recipient |
20 | destination ripe | uchar[20] | The ripe hash of the public key of the receiver of the message |
1+ | encoding | var_int | Message encoding |
1+ | message | var_str | The message encoded as per encoding |
1+ | ack_data | var_str | The acknowledgement data to be transmitted. This is a fully qualified object with Proof Of Work completed |
7 - 73 | signature | var_str | The ECDSA signature covering this structure prepended with the object header (excluding the nonce). The signature is actually two signed positive integers r and s encoded in ASN.1 according to DER |
msg ack
A special form of msg used as an acknowledgement receipt. The objectType and version fields in the object header are set exactly the same as for a v1 msg.
Field Size | Description | Data type | Comments |
---|---|---|---|
32 | ack data | uchar[32] | A random sequence of bytes that the sender waits for as an indication that the recipient has received their msg |
broadcast
Users who are subscribed to the sending address will see the message appear in their inbox.
Pubkey objects and v5 broadcast objects are encrypted the same way: the data encoded in the sender's Bitmessage address is hashed twice. The first 32 bytes of the resulting hash constitute the "private" encryption key and the last 32 bytes constitute a tag, so that anyone listening can easily decide whether this particular message is interesting. The sender calculates the public key from the private key and then encrypts the object with this public key. Thus anyone who knows the Bitmessage address of the sender of a broadcast or pubkey object can decrypt it.
Having a broadcast version of 5 indicates that a tag is used which, in turn, is used when the sender's address version is >=4.
v4 broadcast
Field Size | Description | Data type | Comments |
---|---|---|---|
? | envelope | plain_envelope |
v5 broadcast
Field Size | Description | Data type | Comments |
---|---|---|---|
? | envelope | tagged_envelope |
Decrypted broadcast
A decrypted broadcast is nearly identical to a decrypted msg. The decrypted broadcast has neither a destination ripe field nor an acknowledgement field.
Field Size | Description | Data type | Comments |
---|---|---|---|
139 - 151 | sender | Identity | Sender's identity |
1+ | encoding | var_int | Message encoding |
1+ | message | var_str | The message encoded as per encoding |
7 - 73 | signature | var_str | The ECDSA signature covering this structure prepended with the object header (excluding the nonce). The signature is actually two signed positive integers r and s encoded in ASN.1 according to DER |