Everyone who knows and uses XMPP is probably aware of a new player in the game. Matrix.org is often recommended as a young, arising alternative to the aging protocol behind the Jabber ecosystem. However the founders do not see their product as a direct competitor to XMPP as their approach to the problem of message exchanging is quite different.
An open network for secure, decentralized communication.matrix.org
During his talk at the FOSDEM in Brussels, matrix.org founder Matthew Hodgson roughly compared the concept of matrix to how git works. Instead of passing single messages between devices and servers, matrix is all about synchronization of a shared state. A chat room can be seen as a repository, which is shared between all servers of the participants. As a consequence communication in a chat room can go on, even when the server on which the room was created goes down, as the room simultaneously exists on all the other servers. Once the failed server comes back online, it synchronizes its state with the others and retrieves missed messages.
Olm, Megolm – What’s the deal?
Matrix introduced two different crypto protocols for end-to-end encryption. One is named Olm, which is used in one-to-one chats between two chat partners (this is not quite correct, see Updates for clarifying remarks). It can very well be compared to OMEMO, as it too is an adoption of the Signal Protocol by OpenWhisperSystems. However, due to some differences in the implementation Olm is not compatible with OMEMO although it shares the same cryptographic properties.
The other protocol goes by the name of Megolm and is used in group chats. Conceptually it deviates quite a bit from Olm and OMEMO, as it contains some modifications that make it more suitable for the multi-device use-case. However, those modifications alter its cryptographic properties.
Comparing Cryptographic Building Blocks
|Ratcheting Algoritm||Double Ratchet||Double Ratchet|
- Signal uses a Curve X25519 IdentityKey, which is capable of both encrypting, as well as creating signatures using the XEdDSA signature scheme. Therefore no separate FingerprintKey is needed. Instead the fingerprint is derived from the IdentityKey. This is mostly a cosmetic difference, as one less key pair is required.
- Olm does not distinguish between the concepts of signed and unsigned PreKeys like the Signal protocol does. Instead it only uses one type of PreKey. However, those may be signed with the FingerprintKey upon upload to the server.
- OMEMO includes the SignedPreKey, as well as an unsigned PreKey in the handshake, while Olm only uses one PreKey. As a consequence, if the senders Olm IdentityKey gets compromised at some point, the very first few messages that are sent could possibly be decrypted.
In the end Olm and OMEMO are pretty comparable, apart from some simplifications made in the Olm protocol. Those do only marginally affect its security though (as far as I can tell as a layman).
The similarities between OMEMO and Matrix’ encryption solution end when it comes to group chat encryption.
OMEMO does not treat chats with more than two parties any other than one-to-one chats. The sender simply has to manage a lot more keys and the amount of required trust decisions grows by a factor roughly equal to the number of chat participants.
Yep, this is a mess but luckily XMPP isn’t a very popular chat protocol so there are no large encrypted group chats ;P
So how does Matrix solve the issue?
When a user joins a group chat, they generate a session for that chat. This session consists of an Ed25519 SigningKey and a single ratchet which gets initialized randomly.
The public part of the signing key and the state of the ratchet are then shared with each participant of the group chat. This is done via an encrypted channel (using Olm encryption). Note, that this session is also shared between the devices of the user. Contrary to Olm, where every device has its own Olm session, there is only one Megolm session per user per group chat.
Whenever the user sends a message, the encryption key is generated by forwarding the ratchet and deriving a symmetric encryption key for the message from the ratchets output. Signing is done using the SigningKey.
Recipients of the message can decrypt it by forwarding their copy of the senders ratchet the same way the sender did, in order to retrieve the same encryption key. The signature is verified using the public SigningKey of the sender.
There are some pros and cons to this approach, which I briefly want to address.
First of all, you may find that this protocol is way less elegant compared to Olm/Omemo/Signal. It poses some obvious limitations and security issues. Most importantly, if an attacker gets access to the ratchet state of a user, they could decrypt any message that is sent from that point in time on. As there is no new randomness introduced, as is the case in the other protocols, the attacker can gain access by simply forwarding the ratchet thereby generating any decryption keys they need. The protocol defends against this by requiring the user to generate a new random session whenever a new user joins/leaves the room and/or a certain number of messages has been sent, whereby the window of possibly compromised messages gets limited to a smaller number. Still, this is equivalent to having a single key that decrypts multiple messages at once.
The Megolm specification lists a number of other caveats.
On the pro side of things, trust management has been simplified as the user basically just has to decide whether or not to trust each group member instead of each participating device – reducing the complexity from a multiple of n down to just n. Also, since there is no new randomness being introduced during ratchet forwarding, messages can be decrypted multiple times. As an effect devices do not need to store the decrypted messages. Knowledge of the session state(s) is sufficient to retrieve the message contents over and over again.
By sharing older session states with own devices it is also possible to read older messages on new devices. This is a feature that many users are missing badly from OMEMO.
On the other hand, if you really need true future secrecy on a message-by-message base and you cannot risk that an attacker may get access to more than one message at a time, you are probably better off taking the bitter pill going through the fingerprint mess and stick to normal Olm/OMEMO (see Updates for remarks on this statement).
Note: End-to-end encryption does not really make sense in big, especially public chat rooms, since an attacker could just simply join the room in order to get access to ongoing communication. Thanks to Florian Schmaus for pointing that out.
I hope I could give a good overview of the different encryption mechanisms in XMPP and Matrix. Hopefully I did not make any errors, but if you find mistakes, please let me know, so I can correct them asap 🙂
- Olm – A Cryptographic Ratchet
- Megolm Group Ratchet
- Implementing End-to-End Encryption in Matrix clients
- OMEMO Encryption
- The Double Ratchet Algorithm
- Extended Triple Diffie-Hellman Key Exchange
Thanks for Matthew Hodgson for pointing out, that Olm/OMEMO is also effectively using a symmetric ratchet when multiple consecutive messages are sent without the receiving device sending an answer. This can lead to loss of future secrecy as discussed in the OMEMO protocol audit.
Also thanks to Hubert Chathi for noting, that Megolm is also used in one-to-one chats, as matrix doesn’t have the same distinction between group and single chats. He also pointed out, that the security level of Megolm (the criteria for regenerating the session) can be configured on a per-chat basis.