Protocol specification

From Bitmessage Wiki
Jump to navigation Jump to search

Common standards

Hashes

Usually, when a hash is computed within Bitmessage, it is computed twice. Most of the time SHA-512 hashes are used, however RIPEMD-160 is also used when a shorter hash is desirable (for example when creating an address).

Example of double-SHA-256 encoding of string "hello":

hello
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 (first round of sha-256)
9595c9df90075148eb06860365df33584b75bff782a510c6cd4883a419833d50 (second round of sha-256)

For bitcoin addresses (RIPEMD-160) this would give:

hello
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 (first round is sha-256)
b6a9c8c230722b7c748331a8b450f05566dc7d0f (with ripemd-160)

Common structures

All integers are encoded in big endian. (This is different from Bitcoin).

Message structure

Field Size Description Data type Comments
4 magic uint32_t Magic value indicating message origin network, and used to seek to next message when stream state is unknown
12 command char[12] ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected)
4 length uint32_t Length of payload in number of bytes
4 checksum uint32_t First 4 bytes of sha512(sha512(payload))
? payload uchar[] The actual data

Known magic values:

Magic value Sent over wire as
0xE9BEB4D9 E9 BE B4 D9

Variable length integer

Integer can be encoded depending on the represented value to save space. Variable length integers always precede an array/vector of a type of data that may vary in length.

Value Storage length Format
< 0xfd 1 uint8_t
<= 0xffff 3 0xfd followed by the length as uint16_t
<= 0xffffffff 5 0xfe followed by the length as uint32_t
- 9 0xff followed by the length as uint64_t

Variable length string

Variable length string can be stored using a variable length integer followed by the string itself.

Field Size Description Data type Comments
1+ length var_int Length of the string
? string char[] The string itself (can be empty)

Variable length list of integers

n integers can be stored using n+1 variable length integers where the first var_int equals n.

Field Size Description Data type Comments
1+ length var_int Number of var_ints below
1+ var_int The first value stored
1+ var_int The second value stored...
1+ var_int etc...

Network address

When a network address is needed somewhere, this structure is used. This protocol and structure supports IPv6, but note that the original client currently only supports IPv4 networking. Network addresses are not prefixed with a timestamp in the version message.

Field Size Description Data type Comments
4 time uint32 the Time
8 services uint64_t same service(s) listed in version
16 IPv6/4 char[16] IPv6 address. The original client only supports IPv4 and only reads the last 4 bytes to get the IPv4 address. However, the IPv4 address is written into the message as a 16 byte IPv4-mapped IPv6 address

(12 bytes 00 00 00 00 00 00 00 00 00 00 FF FF, followed by the 4 bytes of the IPv4 address).

2 port uint16_t port number

Hexdump example of Network address structure

0000   00 00 00 00 00 00 00 01  00 00 00 00 00 00 00 00  ................
0010   00 00 FF FF 0A 00 00 01  20 8D                    ........ .

Network address:
 00 00 00 00 00 00 00 01                         - 1 (NODE_NETWORK: see services listed under version command)
 00 00 00 00 00 00 00 00 00 00 FF FF 0A 00 00 01 - IPv6: ::ffff:10.0.0.1 or IPv4: 10.0.0.1
 20 8D                                           - Port 8333

Inventory Vectors

Inventory vectors are used for notifying other nodes about objects they have or data which is being requested.

Inventory vectors consist of the following data format:

Field Size Description Data type Comments
32 hash char[32] Hash of the object

Unencrypted Message Data

After using your private key to decrypt a message, the data format is as follows:

Field Size Description Data type Comments
20 nulls char[20] Run of 20 null bytes to verify that the message is bound for self.
1+ msg_version var_int Message version
1+ address_version var_int Sender's address-version number. This is needed in order to calculate the sender's address to show in the UI, and also to allow for forwards compatible changes to the public-key data included below.
1+ stream var_int Sender's stream number
1+ nLength var_int Length of the sender's n
? n uchar[] Sender's n
1+ eLength var_int Length of the sender's e
? e uchar[] Sender's e
1+ encoding var_int Message Encoding type
1+ message_length var_int Message Length
? message uchar[] The message.
1+ ack_length var_int Length of the acknowledgement data
8 POW_nonce uint64_t Random nonce used for the Proof Of Work
? ack_data uchar[] The ack data to be transmitted.
? signiture uchar[] The cryptographic signature of the encoding, message_length, message, ack_length, POW_nonce, and ack_data all appended. The length should match the length of n

Message Encodings

Value Name Description
0 IGNORE Any data with this number may be ignored. The broadcasting node might simply be sharing its public key with you.
1 TRIVIAL UTF-8. No 'Subject' or 'Body' sections. Useful for simple strings of data, like URIs or magnet links.
2 SIMPLE UTF-8. Uses 'Subject' and 'Body' sections. No MIME is used.

messageToTransmit = 'Subject:' + subject + '\n' + 'Body:' + message

Further values for the message encodings can be decided upon by the community. Any MIME or MIME-like encoding format, should they be used, should make use of Bitmessage's 8-bit bytes.

Message types

version

When a node creates an outgoing connection, it will immediately advertise its version. The remote node will respond with its version. No futher communication is possible until both peers have exchanged their version.

Payload:

Field Size Description Data type Comments
4 version int32_t Identifies protocol version being used by the node
8 services uint64_t bitfield of features to be enabled for this connection
8 timestamp int64_t standard UNIX timestamp in seconds
26 addr_recv net_addr The network address of the node receiving this message
26 addr_from net_addr The network address of the node emitting this message
8 nonce uint64_t Node random nonce, randomly generated every time a version packet is sent. This nonce is used to detect connections to self.
1+ user_agent var_str User Agent (0x00 if string is 0 bytes long)
1+ stream numbers var_int_list The stream numbers that the emitting node is interested in.

A "verack" packet shall be sent if the version packet was accepted.

The following services are currently assigned:

Value Name Description
1 NODE_NETWORK This is a normal network node.

verack

The verack message is sent in reply to version. This message consists of only a message header with the command string "verack".

addr

Provide information on known nodes of the network. Non-advertised nodes should be forgotten after typically 3 hours

Payload:

Field Size Description Data type Comments
1+ count var_int Number of address entries (max: 1000)
30x? addr_list net_addr Address of other nodes on the network.

inv

Allows a node to advertise its knowledge of one or more objects. It can be received unsolicited, or in reply to getmessages.

Payload (maximum payload length: 50000 items):

Field Size Description Data type Comments
? count var_int Number of inventory entries
32x? inventory inv_vect[] Inventory vectors

getdata

getdata is used in response to inv, to retrieve the content of a specific object, and is usually sent after receiving an inv packet, after filtering known elements.

Payload (maximum payload length: 50000 entries):

Field Size Description Data type Comments
? count var_int Number of inventory entries
32x? inventory inv_vect[] Inventory vectors

getpubkey

When a node has the hash of a public key (from an address) but not the public key itself, it must send out a request for the public key.

Field Size Description Data type Comments
8 POW nonce uint64_t Random nonce used for the Proof Of Work
4 time uint32_t The time that this message was generated and broadcast
1+ address version var_int The address' version
1+ stream number var_int The address' stream number
20 pub key hash uchar[] The ripemd hash of the public key

pubkey

A public key

Field Size Description Data type Comments
8 POW nonce uint64_t Random nonce used for the Proof Of Work
1+ address version var_int The address' version
1+ stream number var_int The address' stream number
1+ n length var_int Length of the sender's n
? n uchar[] Sender's n
1+ e length var_int Length of the sender's e
? e uchar[] Sender's e

msg

Field Size Description Data type Comments
8 POW nonce uint64_t Random nonce used for the Proof Of Work
4 time uint32_t The time that this message was generated and broadcast
? encrypted uchar[] Encrypted data. (Todo: Add details describing the encryption format) See also Unencrypted Message Data Format