Message Tagging Proposal
This is a draft proposal to allow Bitmessage messages to be 'tagged' so they can be received by lite clients, such as mobile phones. This page describes the changes proposed and the reasoning behind them.
Summary of the proposal
- Bitmessage messages can be 'tagged' with a hash derived from the message's destination address and a time value.
- This tag allows lite clients, such as mobile phones, to receive messages sent to them without downloading and processing large amounts of data.
- The time component of the tag is noon (12:00 UTC) on the day the message is sent, expressed in Unix time.
- Message tagging is only necessary when sending messages to lite clients. Otherwise the tag field is filled with random data.
Reasoning behind the proposal
For 'lite' clients such as mobile phones to receive messages, they need some way to identify messages that have been sent to their addresses. Regardless of how Bitmessage's peer-to-peer network is scaled through streams and other changes, Bitmessage should offer a way for lite clients to retrieve messages destined for them without downloading and processing a whole stream's worth of data, just as Electrum does for Bitcoin. This can be accomplished by adding a 'tag' to Bitmessage messages that is derived from the destination address.
Tagging messages in this way is a security trade-off, as it reveals some extra, albeit obfuscated, information about the message's destination. However, this trade-off delivers a substantial benefit: it allows mobile phones and other lite clients to become a viable part of Bitmessage. Adding a tag to messages when the destination is a lite client can be a user-configurable option (there is already an option for this in the PyBitmessage settings page), so it will not be forced on anyone.
Message tags are calculated by hashing two sets of data. The first set is data taken from the message's destination address. The second is a time value based on the day the message is sent.
Adding a time component to the data used to calculate the tag reduces the information leakage created by tagging messages. It is proposed that the time value should be based on the day on which the message is sent. Thus, if a lite client is offline for 5 days, it can calculate 5 different tag values for each of its addresses and request any messages marked with those tags from servers. The balance here is between the level of obfuscation introduced into the tag and the burden created for a lite client requesting messages with those tags. Unless an adversary discovers the address that the messages are being sent to, there should be no way to link messages sent on different days to each other.
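As a rough illustration of the catch-up step described above, the following Python sketch lists the noon timestamps a lite client would need after several days offline. It assumes that days are delimited in UTC; the function names `noon_timestamp` and `recent_noon_timestamps` are hypothetical and are not part of any existing Bitmessage client.

```python
from typing import List

SECONDS_PER_DAY = 86400


def noon_timestamp(unix_time: int) -> int:
    """Return the Unix time of 12:00 UTC on the UTC day containing unix_time."""
    day_start = unix_time - (unix_time % SECONDS_PER_DAY)
    return day_start + SECONDS_PER_DAY // 2


def recent_noon_timestamps(now: int, days: int) -> List[int]:
    """Noon timestamps for the UTC day containing `now` and the
    `days - 1` days before it, oldest first."""
    today_noon = noon_timestamp(now)
    return [today_noon - i * SECONDS_PER_DAY for i in range(days - 1, -1, -1)]
```

A client covering 5 days would call `recent_noon_timestamps(now, 5)` and derive one tag per returned value for each of its addresses.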
Where a message is not being sent to a lite client's address and therefore does not need to be tagged, the 'tag' field of the message can be filled with random data. If the hashing function used to calculate the tag is secure, then message tags and random data of the same length should be indistinguishable.
Proposed changes in detail
The message tagging proposed is very similar to the tagging of version 4 pubkeys that is already done in Bitmessage (see https://bitmessage.org/wiki/Protocol_specification#pubkey). The 'address data' component of the data used to calculate the message tag is the same as that in an encrypted pubkey, namely addressVersion + streamNumber + ripeHash of the address in question.
A Bitmessage client can determine whether an address is a 'lite client' address that requires messages sent to it to be tagged by examining the address's behaviour bitfield (see https://bitmessage.org/wiki/Protocol_specification#Pubkey_bitfield_features).
The 'time' component of the tag is noon (12:00 UTC) on the day the message is sent, expressed in Unix time. For example, the Unix timestamp for 12:00 UTC on 7 July 2014 is 1404734400. Unix time is already used throughout Bitmessage, so Bitmessage clients should have no difficulty calculating this value.
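The example value can be checked with a few lines of Python using only the standard library. The helper name `noon_utc_timestamp` is hypothetical, and the sketch assumes that 'noon' means 12:00 UTC, which is consistent with the example timestamp.

```python
from datetime import datetime, timezone


def noon_utc_timestamp(year: int, month: int, day: int) -> int:
    """Unix time of 12:00 UTC on the given calendar date."""
    noon = datetime(year, month, day, 12, 0, 0, tzinfo=timezone.utc)
    return int(noon.timestamp())


print(noon_utc_timestamp(2014, 7, 7))  # 1404734400
```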
A pseudocode expression of the proposed procedure to calculate a message tag follows:
fullTag = sha512(sha512(addressVersionNumber + streamNumber + ripeHash + unixTimestamp))
messageTag = first 32 bytes of fullTag (bytes 0-31)
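The pseudocode above could be realised in Python roughly as follows. This is a sketch under several assumptions that the proposal does not fix: the address version and stream number are encoded as Bitmessage var_ints (big-endian), the ripe hash is passed as raw bytes, and the timestamp is serialised as an 8-byte big-endian integer. The names `encode_varint` and `message_tag` are illustrative only.

```python
import hashlib
import struct


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a var_int (assumed big-endian layout)."""
    if n < 0:
        raise ValueError("var_int cannot be negative")
    if n < 0xFD:
        return struct.pack(">B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack(">H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack(">I", n)
    return b"\xff" + struct.pack(">Q", n)


def message_tag(address_version: int, stream_number: int,
                ripe_hash: bytes, noon_unix_time: int) -> bytes:
    """Return bytes 0-31 of the double SHA-512 of the address data plus time."""
    data = (
        encode_varint(address_version)
        + encode_varint(stream_number)
        + ripe_hash
        # Assumption: the timestamp is an 8-byte big-endian integer.
        + struct.pack(">Q", noon_unix_time)
    )
    full_tag = hashlib.sha512(hashlib.sha512(data).digest()).digest()
    return full_tag[:32]
```

For instance, `message_tag(4, 1, ripe, 1404734400)` would give the tag for a version 4, stream 1 address on 7 July 2014, where `ripe` is that address's ripe hash.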
Under this proposal, a msg object would be composed as follows. This can be compared to the current specification found at https://bitmessage.org/wiki/Protocol_specification#msg.
|Field Size||Description||Data type||Comments|
|8||POW nonce||uint64_t||Random nonce used for the Proof Of Work|
|4 (or 8)||time||uint32_t||The time that this message was generated and broadcast. We are transitioning to 8 byte time.|
|1+||streamNumber||var_int||The stream number of the destination address.|
|32||tag||uchar||The message tag, made up of bytes 0-31 of sha512(sha512(addressVersionNumber + streamNumber + ripeHash + unixTimestamp)). If the message has no tag, this field is filled with 32 random bytes.|
|?||encrypted||uchar||Encrypted data. See Encrypted payload. See also Unencrypted Message Data Format|
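To make the field layout in the table concrete, here is a minimal Python sketch that serialises a msg object in that order. It is not PyBitmessage code: the function names are hypothetical, the 8-byte time variant is assumed, integers are assumed to be big-endian, and the proof-of-work nonce and encrypted payload are taken as already computed. When no tag is supplied, the field is filled with 32 random bytes as the proposal describes.

```python
import os
import struct
from typing import Optional


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a var_int (assumed big-endian layout)."""
    if n < 0:
        raise ValueError("var_int cannot be negative")
    if n < 0xFD:
        return struct.pack(">B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack(">H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack(">I", n)
    return b"\xff" + struct.pack(">Q", n)


def build_msg_object(pow_nonce: int, msg_time: int, stream_number: int,
                     encrypted: bytes, tag: Optional[bytes] = None) -> bytes:
    """Concatenate the msg fields in the order given in the table."""
    if tag is None:
        # Untagged message: fill the field with 32 random bytes.
        tag = os.urandom(32)
    if len(tag) != 32:
        raise ValueError("tag must be exactly 32 bytes")
    return (
        struct.pack(">Q", pow_nonce)      # 8 bytes: POW nonce
        + struct.pack(">Q", msg_time)     # 8 bytes: time (8-byte variant assumed)
        + encode_varint(stream_number)    # 1+ bytes: streamNumber
        + tag                             # 32 bytes: tag or random filler
        + encrypted                       # remaining bytes: encrypted payload
    )
```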
All feedback, ideas, and criticism about this proposal are welcome. A discussion thread can be found on the Bitmessage forum here: XXX.