Future of OMEMO

OMEMO is an XMPP extension protocol, which specifies end-to-end encryption for XMPP clients using the double ratchet algorithm of the Signal protocol. Introduced back in 2015 by GSoC student Andreas Straub in the Conversations client, OMEMO had a lot of press coverage and many privacy and security oriented websites praise XMPP clients that do support it. Its beyond debate, that OMEMO brought many new faces to XMPP. For many users, having end-to-end encryption built into their chat client is a must. Today OMEMO is implemented in a range of clients on different platforms. While Conversations, ChatSecure and Dino support it out of the box, there is a series of plugins that teach OMEMO to other clients such as Gajim, Pidgin and Miranda NG.

However, there is quite a lot of controversy around OMEMO. Part of it are technical discussions, others are more or less of a political nature. Let me list some of them for you.

Some users and client developers see no value in OMEMOs forward secrecy (the fact, that messages can only be decrypted once per device, so new devices do not have access to the chat history of the user). That is a fair point. Especially webclients have a hard time implementing OMEMO in a sensible way. Also the average user is probably having a hard time understanding what exactly forward secrecy is and what the consequences are. Communicating to the user, that not having access to past messages is actually a feature might be a hard task for a client developer.

OMEMOs trust management (still) sucks. One architectural key feature of OMEMO is, that every device does have its own identity key. If a user wants to send a message to one of their contacts, they’re presented with a list of all of their identity keys. Now they have to decide, which keys to trust and which not by comparing fingerprints (seamingly arbitrary strings of 64 characters). This is not a very comfortable thing to do. Some clients encourage the user to verify your contacts devices by scanning QR-Codes, which is way more comfortable, users do however have to meet up in person or share the QR code on another channel.
But what if you get a new device or just reinstall your chat application? Suddenly all your contacts have to decide whether to trust your new fingerprint or not. In the long run this will lead to the user just being annoyed and blindly accepting new fingerprints, ruining the concept of end-to-end encryption.

Daniel Gultsch introduced the concept of BTBV (Blind Trust Before Verification) which can be summed up as “do not bother the user with encryption and hope everything goes well until the user explicitly states that they are interested in having good security”. The principle is, that clients blindly trust any OMEMO identity keys until the user commits to verifying them manually. This makes OMEMO easy to use for those who don’t really care about it, while offering serious users who depend on it the needed security.

But what do you do, if suddenly a rogue fingerprint appears? Do you panic and message all your contacts not to trust the stranger key? In theory any device which has access to the users account (the users server too) can just add another OMEMO identity key to the users list of devices and there is not really anything the user can do about it. OMEMO does not have a “blacklist”-feature or a signed list of trusted keys. It would however be possible to implement such thing in the future by combining OMEMO with OpenPGP for example. Of course, if some stranger has access to your account, it is time to change the password/server anyways.

Another weakness of OMEMO is, that it is currently only usable for encrypting the messages body. Since XMPP is an extensible protocol with other use cases than messaging, it would be nice to have support for arbitrary extension element encryption. There is however the extension protocol “OX” (XEP-0373: OpenPGP for XMPP), which has such capabilities. This feature can be extracted from OX and reused in OMEMO relatively easy.

Lets now focus on the “political” controversies around OMEMO.

In 2016/2017 there has been a lot of discussions, whether or not OMEMO should become a standard in the first place. The problem is, that in it’s current form (which has not really changes since its introduction), OMEMO depends on the wire format used by libsignal (the Signal protocol library used by conversations). That library however is licensed under the GPLv3 license, preventing permissively licensed and closed source applications from implementing OMEMO. While the Signal protocol itself is openly documented, the wire format used by libsignal is not, so any implementations which want to be compatible to current OMEMO clients must implement the same wire format by looking into the libsignal source code, which in turn makes the implementation a derivative of libsignal, which must be licensed under the GPL as well. There has been a pull request against the OMEMO XEP which addressed this issue by specifying an independent wire format for OMEMO, however that pull request was more or less rejected due to inactivity of the author.

During the phases of hot debates around OMEMO, it was discussed to base the protocol on Olm instead of the Signal protocol. Olm is the encryption protocol used by matrix.org. However, up to this point there is no Olm based OMEMO implementation, that I know of, neither have there been any experiments in that direction from people that proposed the idea (again – not that I know of).

Another idea was to completely redesign OMEMOs double ratchet specification as OMEMO-NEXT from ground up without even mentioning a specific library as the foundation. This would obviously be a way more complex XEP, as all the cryptographic operations and primitives which are currently abstracted and hidden away behind the Signal protocol, would have to be written down in that XEP. However, recently the IETF announced that work is done to create MLS (Message Layer Security), which does exactly that. It specifies a completely open version of the double ratchet algorithm along with infrastructure to share key material and so on. I’m not sure whether this is a coincidence, or if some of those who proposed OMEMO-NEXT are directly involved with the development of MLS. We’ll see, when MLS is ready and whether it will compete against OMEMO. I’d really love to see a cross-protocol encryption algorithm btw 😉 #bridges #federateEverything

Now lets talk about the biggest problem of OMEMO. Currently the XEP does not have an active “legal guardian”. The author has been inactive for an extended period of time, ignoring requests for comments, causing a total halt in the development of the standard (making changes to the XEP needs the authors approval). Things like specifying a new wire protocol are possible and reasonably easy to do. However not having changes written down in the XEP makes it nearly impossible to make coordinated changes. I’m sure there is a ton of potential for OMEMO and it is sad to see its (protocol-) development having come to a halt.

I’m sure that many of its current issues can be addressed by changes to the XEP. This is what I think needs to be done to OMEMO to make it more usable:

  • Specify a new wire protocol: This could make OMEMO accesible for commercial applications and allow independent implementations -> broader deployment
  • Specify a general payload encryption scheme: This could benefit other encryption protocols as well and would make it possible to apply end-to-end encryption to a wider variety of use cases.
  • Reuse the payload encryption scheme in OMEMO: Utilize OMEMO for things besides body encryption.
  • Specify a way to sign device lists with “persistent key” algorithms like OpenPGP: This could simplify trust management.
  • Specify a way to backup the identity key: This could reduce the identity key chaos, since the key could be reused on new devices/installations. However clients would have to make it clear to the user, not to use the same key on multiple devices.
This have been my thoughts about OMEMOs current state. What do you think about it?
Happy Hacking!

Summer of Code: Smack has OpenPGP Support!

I am very proud to announce, that Smack got support for OpenPGP for XMPP!

Today the Pull Request I worked on during my GSoC project was merged into Smacks master branch. Admittedly it will take a few months until smack-openpgp will be included in a Smack release, but that gives me time to further finalize the code and iron out any bugs that may be in there. If you want to try smack-openpgp for yourself, let me know of any issues you encounter 🙂

(Edit: There are snapshot releases of Smack available for testing)

Now Smack does support two end-to-end encryption methods, which complement each other perfectly. OMEMO is best for people that want to be safe from future attacks, while OpenPGP is more suited for users who appreciate being able to access their chat history at any given time. OpenPGP is therefore the better choice for web based applications, although it is perfectly possible to implement web based clients that do OMEMO (see for example the Wire web app, which does ratcheting similar to OMEMO).

What’s left to do now is updating smack-openpgp due to updates made to XEP-0373 and extensive testing against other implementations.

Happy Hacking!

Summer of Code: Finalizing the PR

Quick update:

Only a few days are left until the last and final Evaluation Phase.

I spent the week opening my pull request against Smacks master branch and adding a basic trust management implementation. Now the user is required to make decisions whether to trust a contacts key or not. However, the storage implementation is kept very modular, so an implementor can easily create a trust store implementation that realizes custom behaviour.

Smack-openpgp now allows users which did not subscribe to one another to exchange encryption keys quite easily. If a user receives an encrypted message, the implementation automatically fetches the senders keys to allow signature verification.

Furthermore there are more JUnit tests now, so that Smacks total test coverage actually increases when my PR gets merged 😀

Happy Hacking!

Summer of Code: First PGPainless Release!

I’m very happy and proud to announce the first alpha release of PGPainless!

PGPainless 0.0.1-alpha1 is the first non-snapshot release and is available from maven central. It was an interesting experience to go through the process of creating a release and I’m looking forward to have many more releases in the future 🙂

The current release contains a workaround for the bug I described in an earlier blog post. The issue was, that bouncycastle wouldn’t mark the public sub keys of a secret key ring as sub keys, which results in loss of keys if the user tries to create a public key ring from the exported public keys. My workaround fixes the issue by iterating through all sub keys of an existing key ring and converting the key packages of subkeys to subkey packages. The code is also available as a gist.

Ironically I had some issues related to OpenPGP during the release process. Releases to maven central have to be signed with an OpenPGP key, so I created a dedicated signing key pair using GnuPG, which I wanted to put into a separate GPG key box. After creating the key, I exported it using

gpg --armor --export-secret-keys [key-id] > pgpainless-singing-key.asc

imported it into a dedicated key box using

gpg --no-default-keyring --keyring pgpainless.gpg --import pgpainless-signing-key.asc

and deleted the key from my old key box, as well as the .asc-file. But when I tried to sign my release, I got the error, that a secret key would be needed. After checking the key box, I noticed, that only a public key was present.

Unfortunately that key had already been published to the key servers and I have no way to revoke it, due to lack of a secret key. I have no idea, what exactly happened or how it could happen, but its too late to recover the key.

Edit: I found my error: Importing a secret key is only possible with the flag `–allow-secret-key-import`, eg

gpg --import --allow-secret-key-import key.asc

So in the end I had to create a new OpenPGP key, which I now carefully backed up on a freshly bought USB stick which will be locked away for the event that I lose the copy on my work station. Better safe than sorry.

Happy Hacking!

Summer of Code: Plan for the grand finale

I passed the second evaluation phase 🙂 Now begins the final spurt, as the last month of GSoC has begun. My main goal can be summarized as follows: Get everything merged!

To get that done, I have to polish up my smack-openpgp branch which has grown to a size of 7000 loc. There are still some minor quirks, but Florian recommended to focus on the big picture instead of spending too much time on small details and edge cases.

I also have to release pgpainless to maven central and establish some kind of release cycle. It will be a future challenge for me personally to synchronize the releases of smack-openpgp and pgpainless.

But now enough talking, I have to get to work 🙂

Happy Hacking!

Summer of Code: Second evaluation phase

Quite some time has passed since I bothered you with my last post 🙂 A lot has happened since, I have been making steady process in both smack-openpgp, as well as pgpainless.

One big step that I took was to get rid of smack-openpgp-bouncycastle, which now has been merged into smack-openpgp. Having modular code may be worthwhile, however it poses some big challenges. The biggest problem with having smack-openpgp not depend on smack-openpgp-bouncycastle was, that I could not use classes that represent encryption keys directly in smack-openpgp. Instead I had to create interfaces that encapsule functionality and call those in order to get stuff done from inside smack-openpgp. Last week me and flow decided that it would make my job a lot easier if we just got rid of smack-openpgp-bouncycastle by merging the two modules. In case there will be another implementation at some point, the code would still be modular enough to allow extension by overriding classes and methods.

Now smack-openpgp depends on pgpainless directly, which means that I don’t have to create duplicate code to get bundled information from pgpainless to smack-openpgp for instance. This change gave me a huge performance boost in the development process, as it makes the next steps much more clear for me due to less abstraction.

I rewrote the whole storage backend of smack-openpgp, keeping everything as modular as possible. Now there are 3 different store types. One store is responsible for keys, another one for metadata and a third one for trust decisions. For all of those I created a file-based implementation which just writes information to files. An implementor can for example chose to write information to a database instead. For all those store classes I wrote a parametrized junit test, meaning new implementations can easily be tested by simply inserting an instance of the new store into an array.

Unfortunately I stumbled across yet another bug in bouncycastle, which makes it necessary to implement a workaround in my project until a patched version of bouncycastle is released.
The issue was, that a key ring which consists of a master key and some subkeys was not exported correctly. The subkeys would be exported as normal keys, which caused the constructor of the key ring to skip those, as it expected sub keys, not normal keys. That lead to the subkeys getting lost, which caused smack-openpgp to be unable to encrypt messages for contacts which use a master key and subkeys for OpenPGP.

This bug has been fixed pretty fast by the bouncycastle team and the minimal test I created to illustrate my problem has been incorporated into bouncycastle. Thank you 🙂

Currently I’m working on a workaround for the bug in smack-openpgp, but that work is already working. Next I will polish up my test client and do some more field testing to iron out all the edge cases I probably overlooked 🙂

Happy Hacking!

Summer of Code: Checkstyle to the rescue!

Today I added some checkstyle rules to PGPainless.Checkstyle is a gradle plugin, which checks the source code for style violations.

Some say, strict checkstyle rules are unnecessary and that it is annoying to be held back from pushing a commit to the master branch only to fix “style issues” for half an hour. I must say, in the beginning I thought the same way. I was annoyed thinking “why does it matter, if a line comment ends with a period or not?” But after being forced to give it a try when I first became a contributor to the Smack project, I became a fan of it. In the beginning I had to often recommit my changes because they broke the checkstyle rules. For example I often forgot to leave spaces between mathematical operators. I would write “i = 5+5;” instead of “i = 5 + 5;”. But after some amount of time, I got less and less warnings.

I adopted most of the (honestly *very* strict) rules in Smacks rule set to my own coding style. I like how it automatically leads to cleaner, more uniform code (not that it is impossible to write garbage with it of course). For that reason, I decided to put those rules into place in PGPainless today (I only left one rule out, because who the hell cares about the alphabetical sorting of imports???).

At some point, PGPainless will be released as a maven artifact. In preparation for this historical event, I bought the domain pgpainless.org. For now it is just a forwarding to the PGPainless git repository, but I will likely setup a small website with documentation etc. at some point.

During my testing of Smacks OX implementation, I came across an interesting problem. When a user queries a PubSub node in Smack, Smack first does a disco#info query on that node to determine, whether it is a LeafNode or a CollectionNode. This normally works fine. However, it becomes more and more popular to make use of the PubSub access model ‘open’. The open access model makes a PubSub node accessible to entities (like other users) which are not in the contact list of the user. This enables the use of OMEMO in group chats, where not every participant is in your contact list for example.

The problem is that a server which allows access to open PubSub nodes, does not necessarily allow the disco#info query. The question is: Should disco#info queries on open PubSub nodes be allowed or not? An argument against it is, that it might allow “jid-harvesting”. An attacker might use disco#info queries on open PubSub nodes in order to determine, whether the user exists or not. This is a bad thing, because it allows spammers to collect the Jabber IDs of potential victims. On the other hand however, the attacker could simply do a direct PubSub query on the open node and the result would be the same. The benefit of allowing disco#info queries would be, that you can in fact determine the node type.
For now my mail to the standards mailing list remained unanswered, but I think that there should be a well defined expected behavior for this edge case.

For now I worked around the issue by using Javas reflections to access the LeafNode constructor directly, avoiding the disco#info query.

Other than that, I didn’t get a whole lot done this week. Unlike the demotivating week though, this time the reason was primarily exciting new hardware 😀

Happy Hacking!

Summer of Code: An (almost) three line fix to a three days problem

diff --git a/src/main/java/de/vanitasvitae/crypto/pgpainless/decryption_verification/DecryptionStreamFactory.java b/src/main/java/de/vanitasvitae/crypto/pgpainless/decryption_verification/DecryptionStreamFactory.java
index d651b1b..bca7ba4 100644
--- a/src/main/java/de/vanitasvitae/crypto/pgpainless/decryption_verification/DecryptionStreamFactory.java
+++ b/src/main/java/de/vanitasvitae/crypto/pgpainless/decryption_verification/DecryptionStreamFactory.java
@@ -157,15 +157,13 @@ public class DecryptionStreamFactory {
         PGPPrivateKey decryptionKey = null;
         PGPPublicKeyEncryptedData encryptedSessionKey = null;
         while (iterator.hasNext()) {
-            encryptedSessionKey = (PGPPublicKeyEncryptedData) iterator.next();
-            long keyId = encryptedSessionKey.getKeyID();
+            PGPPublicKeyEncryptedData encryptedData = (PGPPublicKeyEncryptedData) iterator.next();
+            long keyId = encryptedData.getKeyID();
 
             resultBuilder.addRecipientKeyId(keyId);
             LOGGER.log(LEVEL, "PGPEncryptedData is encrypted for key " + Long.toHexString(keyId));
-            if (decryptionKey != null) {
-                continue;
-            }
 
             PGPSecretKey secretKey = decryptionKeys.getSecretKey(keyId);
             if (secretKey != null) {
                 LOGGER.log(LEVEL, "Found respective secret key " + Long.toHexString(keyId));
+                encryptedSessionKey = encryptedData;
                 decryptionKey = secretKey.extractPrivateKey(decryptionKeyDecryptor.getDecryptor(keyId));
                 resultBuilder.setDecryptionKeyId(keyId);
             }
         }
         PublicKeyDataDecryptorFactory keyDecryptor = new BcPublicKeyDataDecryptorFactory(decryptionKey);

The above 3 deletions and 1 addition are the fix for a bug in my decryption routines, which took me 3 days to find. Lets examine the bug in more detail in order to understand what the code does, what went wrong, and why it took me so long to find it.

As I described in an earlier blog post, an encrypted OpenPGP message basically consists of the symmetrically encrypted body and a bunch of public key encrypted session keys. In Bouncycastle, the encrypted session keys are called PGPPublicKeyEncryptedData objects. The are encrypted using the asymmetric public key of a recipient and they contain the symmetric session key which was used to encrypt the actual message. So if Alice writes a message to Bob and Carla, the message will contain at least two PGPPublicKeyEncryptedData elements. One containing the session key encrypted for Bob, the other one for Carla.

In PGPainless, I iterate through all PGPPublicKeyEncryptedData objects of an incoming message in order to get a list of all recipients of the message. I do that because I can imagine it would be cool if Bob can see, that the message was also encrypted for Carla. Also during every iteration I see, if the current PGPPublicKeyEncryptedData object is encrypted for a key of which I do have the secret key available. If so, I extract the private key for later use.

Lets have an example:

Alice encrypts a message to both herself and Bob. Then she sends the message to Bob. Bob tries to decrypt the message.
PGPainless saves the first PGPPublicKeyEncryptedData object in the variable encryptedSessionKey.
This first encrypted session key has Alice’s key ID, so PGPainless adds the id to the list of recipient key ids and goes to the next iteration. Again it stores the encrypted data object in encryptedSessionKey.
This second encrypted session key has Bob’s key ID. PGPainless finds the respective private key, extracts is and saves it to the variable decryptionKey.

Now the iteration ends, as all PGPPublicKeyEncryptedData objects have been processed. Now the decryption can begin. Using the extracted private key, Bob decrypts the encryptedSessionKey to retrieve his session key with which he decrypts the message. Easy Peasy Lemon Squeezy – at least that is what I thought.

I wrote some JUnit tests that encrypt a message for one recipient and those tests work just fine. I used them for example to determine the average message length using different algorithms.

For some strange reason however, during integration testing I noticed that every now and then decryption would fail. I thought it had to do with some race conditions at first. I blamed JUnits @Before and @After annotations which I used to delete the key rings from disk after the test was done to fire too early. Coincidentally after I removed the annotations the test would work flawlessly, which encouraged me and seemed to confirm my hypothesis.

However, the more tests I did, the more confused I became, as I could not find the cause of the failing. Bouncycastle has debugging disabled for their library, so I could not follow the calculations step by step. This is done for security reasons I guess. Fortunately there is a debug version which I discovered today. Using this library, I can see, which lines are responsible for throwing exceptions and step through the execution in great detail.

The final hint however came from a forum post. So lets see what exactly went wrong. For that purpose lets assume the same scenario as above, but with a slight twist.

Again, Alice encrypts a message to herself and Bob, but this time she prepends Bobs session key first.
Bob, when decrypting the message, stores the first PGPPublicKeyEncryptedData object in encryptedSessionKey, notices that this is his own session key and extracts his respective private key into the variable decryptionKey.
In the next iteration (remember, Bob wants to know all recipients), he stores the next PGPPublicKeyEncryptedData object in encryptedSessionKey. This is Alices session key.

Now he proceeds with decrypting the encryptedSessionKey with his decryptionKey – and hell breaks lose. Why? Because at this point in time encryptedSessionKey actually contains Alices session key, not Bobs.

The tricky part about this bug is, that it only happens by chance. I’m not sure in which order session keys are prepended to the message by Bouncycastle, but in certain cases this caused my code to fail – while in certain cases it did not. One benefit that OpenPGP has over OMEMO is, that you can write better tests. In OMEMO keys always change when the ratchet advances, so you cannot really test if certain messages decrypt. At least it is much easier using OpenPGP. But this bug told me a lesson, that you have to be very very careful with your JUnit tests. At some point I replaced randomly generated key pairs with some fixed keys to get more reliable test results and to be able to confirm the result using GnuPG. Ironically I was very lucky that as a result my test would reproduce the second scenario above. If instead it would have produced the first scenario, it would have taken me much much longer to discover the cause of the issue.

Fighting this bug took me 3 days. Luckily I didn’t spend 100% of my time bug hunting. I also wrote another integration test, which covers one very cool feature of OpenPGP for XMPP.

Secret Key Synchronization

In OX it is possible to store the secret OpenPGP key in a private PubSub node. That way the key can be easily transported to other devices, so you can read your encrypted messages in a web browser for example. This is one huge benefit over OMEMO, where you can only read messages received *after* you began using the new device.

During my testing, I also found out, that ejabberd despite announcing support for alternative PubSub access models, does not properly handle some requests.
For normal messaging, OX recommends to publish the public keys of users in a PubSub node configured with the ‘open’ access model. That allows users to access your public key, even if they are not in your roster.

Smack however does a disco info query on a PubSub node prior to fetching it in order to get some metadata about it. It turns out that ejabberd does return an error for such a request, stating that you have to be subscribed to the owner of the node in order to fetch information about it. I’m pretty sure though, that this bug will be fixed soon 🙂

Happy Hacking!

Summer of Code: The demotivating week

I guess in anybodies project, there is one week that stands out from the others by being way less productive than the rest. I just had that week.

I had to take one day off on Friday due to circulation problems after a visit at the doctor (syringes suck!), so I had the joy of an extended weekend. On top that, I was not at home that time, so I didn’t write any code during these days.

At least I got some coding done last week. Yesterday I spent the whole day scratching my head about an error that I got when decrypting a message in Smack. Strangely that error did not happen in my pgpainless tests. Today I finally found the cause of the issue and a way to work around it. Turns out, somewhere between key generation and key loading from persistent storage, something goes wrong. If I run my test with fresh keys, everything works fine while if I run it after loading the keys from disk, I get an error. It will be fun working out what exactly is going wrong. My breakpoint-debugging skills are getting better, although I still often seem to skip over important code points during debugging.

My ongoing efforts of porting the Smack OX code over from using bouncy-gpg to pgpainless are still progressing slowly, but steady. Today I sent and received a message successfully, although the bug I mentioned earlier is still present. As I said, its just a matter of time until I find it.

Apart from that, I created another very small pull request against the Bouncycastle repository. The patch just fixes a log message which irritated me. The message stated, that some data could not be encrypted, while in fact date is being decrypted. Another patch I created earlier has been merged \o/.

There is some really good news:
Smack 4.4.0-alpha1 has been released! This version contains my updated OMEMO API, which I have been working on since at least half a year.

This week I will continue to integrate pgpainless into Smack. There is also still a significant lack of JUnit tests in both projects. One issue I have is, that during my project I often have to deal with objects, that bundle information together. Those data structures are needed in smack-openpgp, smack-openpgp-bouncycastle, as well as in pgpainless. Since smack-openpgp and pgpainless do not depend on one another, I need to write duplicate code to provide all modules with classes that offer the needed functionality. This is a real bummer and creates a lot of ugly boilerplate code.

I could theoretically create another module which bundles those structures together, but that is probably overkill.

On the bright side of things, I passed the first evaluation phase, so I got a ton of motivation for the coming days 🙂

Happy Hacking!

Summer of Code: Evaluation and Key Lengths

The week of the first evaluation phase is here. This is the fourth week of GSoC – wow, time flew by quite fast this year 🙂

This week I plan to switch my OX implementation over to PGPainless in order to have a working prototype which can differentiate between sign, crypt and signcrypt elements. This should be pretty straight forward. In case everything goes wrong, I’ll keep the current implementation as a working backup solution, so we should be good to go 🙂

OpenPGP Key Type Considerations

I spent some time testing my OpenPGP library PGPainless and during testing I noticed, that messages encrypted and signed using keys from the family of elliptic curve cryptography were substantially smaller than messages encrypted with common RSA keys. I knew already, that one benefit of elliptic curve cryptography is, that the keys can be much smaller while providing the same security as RSA keys. But what was new to me is, that this also applies to the length of the resulting message. I did some testing and came to interesting results:

In order to measure the lengths of produced cipher text, I create some code that generates two sets of keys and then encrypts messages of varying lengths. Because OpenPGP for XMPP: Instant Messaging only uses messages that are encrypted and signed, all messages created for my tests are encrypted to, and signed with one key. The size of the plaintext messages ranges from 20 bytes all the way up to 2000 bytes (1000 chars).

Diagram comparing the lengths of ciphertext of different crypto systems
Comparison of Cipher Text Length

The resulting diagram shows, how quickly the size of OpenPGP encrypted messages explodes. Lets assume we want to send the smallest possible OX message to a contact. That message would have a body of less than 20 bytes (less than 10 chars). The body would be encapsulated in a signcrypt-element as specified in XEP-0373. I calculated that the length of that element would be around 250 chars, which make 500 bytes. 500 bytes encrypted and signed using 4096 bit RSA keys makes 1652 bytes ciphertext. That ciphertext is then base64 encoded for transport (a rule of thumb for calculating base64 size is ceil(bytes/3) * 4 which results in 2204 bytes. Those bytes are then encapsulated in an openpgp-element (adds another 94 bytes) which can be appended to a message. All in all the openpgp-element takes up 2298 bytes, compared to a normal body, which would only take up around 46 bytes.

So how do elliptic curves come to the rescue? Lets assume we send the same message again using 256 bit ECC keys on the curve P-256. Again, the length of the signcrypt-element would be 250 chars or 500 bytes in the beginning. OpenPGP encrypting those bytes leads to 804 bytes of ciphertext. Applying base64 encoding results in 1072 bytes, which finally make 1166 bytes of openpgp-element. Around half the size of an RSA encrypted message.

For comparison: I estimated a typical XMPP chat message body to be around 70 characters or 140 bytes based on a database dump of my chat client.

We must not forget however, that the stanza size follows a linear function of the form y = m*x+b, so if the plaintext size grows, the difference between RSA and ECC will become less and less significant.
Looking at the data, I noticed, that applying OpenPGP encryption always added a constant number to the size of the plaintext. Using 256 bit ECC keys only adds around 300 bytes, encrypting a message using 2048 bit RSA keys adds ~500 bytes, while RSA with 4096 bits adds 1140 bytes. The formula for my setup would therefore be y = x + b, where x and y are the size of the message before and after applying encryption and b is the overhead added. This formula doesn’t take base64 encoding into consideration. Also, if multiple participants -> multiple keys are involved, the formula is suspected to underestimate, as the overhead will grow further.

One could argue, that using smaller RSA keys would reduce the stanza size as well, although not as much, but remember, that RSA keys have to be big to be secure. An 3072 bit RSA key provides the same security as an 256 bit ECC key. Quoting Wikipedia:

The NIST recommends 2048-bit keys for RSA. An RSA key length of 3072 bits should be used if security is required beyond 2030.

As a conclusion, I propose to add a paragraph to XEP-0373 suggesting the use of ECC keys to keep the stanza size low.