How to Implement a XEP for Smack.

Photo of the Earth taken from Space

Smack is a FLOSS XMPP client library for Java and Android app development. It takes away much of the burden a developer of a chat application would normally have to carry, so the developer can spend more time working on nice stuff like features instead of having to deal with the protocol stack.

Many (80+ and counting) XMPP Extension Protocols (XEPs) are already implemented in Smack. Today I want to bring you along with me and add support for one more.

What Smack does very well is to follow the Open-Closed-Principle of software architecture. That means while Smacks classes are closed for modification by the developer, it is pretty easy to extend Smack to add support for custom features. If Smack doesn’t fit your needs, don’t change it, extend it!

The most important class in Smack is probably the XMPPConnection, as this is where messages coming from and going to. However, even more important for the developer is what is being sent.

XMPP’s strength comes from the fact that arbitrary XML elements can be exchanged by clients and servers. Heck, the server doesn’t even have to understand what two clients are sending each other. That means that if you need to send some form of data from one device to another, you can simply use XMPP as the transport protocol, serialize your data as XML elements with a namespace that you control and send if off! It doesn’t matter, which XMPP server software you choose, as the server more or less just forwards the data from the sender to the receiver. Awesome!

So lets see how we can extend Smack to add support for a new feature without changing (and therefore potentially breaking) any existing code!

For this article, I chose XEP-0428: Fallback Indication as an example protocol extension. The goal of Fallback Indication is to explicitly mark <body/> elements in messages as fallback. For example some end-to-end encryption mechanisms might still add a body with an explanation that the message is encrypted, so that older clients that cannot decrypt the message due to lack of support still display the explanation text instead. This enables the user to switch to a better client ๐Ÿ˜› Another example would be an emoji in the body as fallback for a reaction.

XEP-0428 does this by adding a fallback element to the message:

<message from="alice@example.org" to="bob@example.net" type="chat">
  <fallback xmlns="urn:xmpp:fallback:0"/>  <-- THIS HERE
  <encrypted xmlns="urn:example:crypto">Rgreavgl vf abg n irel ybat
gvzr nccneragyl.</encrypted>
  <body>This message is encrypted.</body>
</message>

If a client or server encounter such an element, they can be certain that the body of the message is intended to be a fallback for legacy clients and act accordingly. So how to get this feature into Smack?

After the XMPPConnection, the most important types of classes in Smack are the ExtensionElement interface and the ExtensionElementProvider class. The later defines a class responsible for deserializing or parsing incoming XML into the an object of the former class.

The ExtensionElement is itself an empty interface in that it does not provide anything new, but it is composed from a hierarchy of other interfaces from which it inherits some methods. One notable super class is NamedElement, more on that in just a second. If we start our XEP-0428 implementation by creating a class that implements ExtensionElement, our IDE would create this class body for us:

package tk.jabberhead.blog.wow.nice;

import org.jivesoftware.smack.packet.ExtensionElement;
import org.jivesoftware.smack.packet.XmlEnvironment;

public class FallbackIndicationElement implements ExtensionElement {
    
    @Override
    public String getNamespace() {
        return null;
    }

    @Override
    public String getElementName() {
        return null;
    }

    @Override
    public CharSequence toXML(XmlEnvironment xmlEnvironment) {
        return null;
    }
}

The first thing we should do is to change the return type of the toXML() method to XmlStringBuilder, as that is more performant and gains us a nice API to work with. We could also leave it as is, but it is generally recommended to return an XmlStringBuilder instead of a boring old CharSequence.

Secondly we should take a look at the XEP to identify what to return in getNamespace() and getElementName().

<fallback xmlns="urn:xmpp:fallback:0"/>
[   ^    ]      [        ^          ]
element name          namespace

In XML, the part right after the opening bracket is the element name. The namespace follows as the value of the xmlns attribute. An element that has both an element name and a namespace is called fully qualified. That’s why ExtensionElement is inheriting from FullyQualifiedElement. In contrast, a NamedElement does only have an element name, but no explicit namespace. In good object oriented manner, Smacks ExtensionElement inherits from FullyQualifiedElement which in term is inheriting from NamedElement but also introduces the getNamespace() method.

So lets turn our new knowledge into code!

package tk.jabberhead.blog.wow.nice;

import org.jivesoftware.smack.packet.ExtensionElement;
import org.jivesoftware.smack.packet.XmlEnvironment;

public class FallbackIndicationElement implements ExtensionElement {
    
    @Override
    public String getNamespace() {
        return "urn:xmpp:fallback:0";
    }

    @Override
    public String getElementName() {
        return "fallback";
    }

    @Override
    public XmlStringBuilder toXML(XmlEnvironment xmlEnvironment) {
        return null;
    }
}

Hm, now what about this toXML() method? At this point it makes sense to follow good old test driven development practices and create a JUnit test case that verifies the correct serialization of our element.

package tk.jabberhead.blog.wow.nice;

import static org.jivesoftware.smack.test.util.XmlUnitUtils.assertXmlSimilar;
import org.jivesoftware.smackx.pubsub.FallbackIndicationElement;
import org.junit.jupiter.api.Test;

public class FallbackIndicationElementTest {

    @Test
    public void serializationTest() {
        FallbackIndicationElement element = new FallbackIndicationElement();

        assertXmlSimilar("<fallback xmlns=\"urn:xmpp:fallback:0\"/>",
element.toXML());
    }
}

Now we can tweak our code until the output of toXml() is just right and we can be sure that if at some point someone starts messing with the code the test will inform us of any breakage. So what now?

Well, we said it is better to use XmlStringBuilder instead of CharSequence, so lets create an instance. Oh! XmlStringBuilder can take an ExtensionElement as constructor argument! Lets do it! What happens if we return new XmlStringBuilder(this); and run the test case?

<fallback xmlns="urn:xmpp:fallback:0"

Almost! The test fails, but the builder already constructed most of the element for us. It prints an opening bracket, followed by the element name and adds an xmlns attribute with our namespace as value. This is typically the “head” of any XML element. What it forgot is to close the element. Lets see… Oh, there’s a closeElement() method that again takes our element as its argument. Lets try it out!

<fallback xmlns="urn:xmpp:fallback:0"</fallback>

Hm, this doesn’t look right either. Its not even valid XML! (ใƒŽเฒ ็›Šเฒ )ใƒŽๅฝกโ”ปโ”โ”ป Normally you’d use such a sequence to close an element which contained some child elements, but this one is an empty element. Oh, there it is! closeEmptyElement(). Perfect!

<fallback xmlns="urn:xmpp:fallback:0"/>
package tk.jabberhead.blog.wow.nice;

import org.jivesoftware.smack.packet.ExtensionElement;
import org.jivesoftware.smack.packet.XmlEnvironment;

public class FallbackIndicationElement implements ExtensionElement {
    
    @Override
    public String getNamespace() {
        return "urn:xmpp:fallback:0";
    }

    @Override
    public String getElementName() {
        return "fallback";
    }

    @Override
    public XmlStringBuilder toXML(XmlEnvironment xmlEnvironment) {
        return new XmlStringBuilder(this).closeEmptyElement();
    }
}

We can now serialize our ExtensionElement into valid XML! At this point we could start sending around FallbackIndications to all our friends and family by adding it to a message object and sending that off using the XMPPConnection. But what is sending without receiving? For this we need to create an implementation of the ExtensionElementProvider custom to our FallbackIndicationElement. So lets start.

package tk.jabberhead.blog.wow.nice;

import org.jivesoftware.smack.packet.XmlEnvironment;
import org.jivesoftware.smack.provider.ExtensionElementProvider;
import org.jivesoftware.smack.xml.XmlPullParser;

public class FallbackIndicationElementProvider
extends ExtensionElementProvider<FallbackIndicationElement> {
    
    @Override
    public FallbackIndicationElement parse(XmlPullParser parser,
int initialDepth, XmlEnvironment xmlEnvironment) {
        return null;
    }
}

Normally implementing the deserialization part in form of a ExtensionElementProvider is tiring enough for me to always do that last, but luckily this is not the case with Fallback Indications. Every FallbackIndicationElement always looks the same. There are no special attributes or – shudder – nested named child elements that need special treating.

Our implementation of the FallbackIndicationElementProvider looks simply like this:

package tk.jabberhead.blog.wow.nice;

import org.jivesoftware.smack.packet.XmlEnvironment;
import org.jivesoftware.smack.provider.ExtensionElementProvider;
import org.jivesoftware.smack.xml.XmlPullParser;

public class FallbackIndicationElementProvider
extends ExtensionElementProvider<FallbackIndicationElement> {
    
    @Override
    public FallbackIndicationElement parse(XmlPullParser parser,
int initialDepth, XmlEnvironment xmlEnvironment) {
        return new FallbackIndicationElement();
    }
}

Very nice! Lets finish the element part by creating a test that makes sure that our provider does as it should by creating another JUnit test. Obviously we have done that before writing any code, right? We can simply put this test method into the same test class as the serialization test.

    @Test
    public void deserializationTest()
throws XmlPullParserException, IOException, SmackParsingException {
        String xml = "<fallback xmlns=\"urn:xmpp:fallback:0\"/>";
        FallbackIndicationElementProvider provider =
new FallbackIndicationElementProvider();
        XmlPullParser parser = TestUtils.getParser(xml);

        FallbackIndicationElement element = provider.parse(parser);

        assertEquals(new FallbackIndicationElement(), element);
    }

Boom! Working, tested code!

But how does Smack learn about our shiny new FallbackIndicationElementProvider? Internally Smack uses a Manager class to keep track of registered ExtensionElementProviders to choose from when processing incoming XML. Spoiler alert: Smack uses Manager classes for everything!

If we have no way of modifying Smacks code base, we have to manually register our provider by calling

ProviderManager.addExtensionProvider("fallback", "urn:xmpp:fallback:0",
new FallbackIndicationElementProvider());

Element providers that are part of Smacks codebase however are registered using an providers.xml file instead, but the concept stays the same.

Now when receiving a stanza containing a fallback indication, Smack will parse said element into an object that we can acquire from the message object by calling

FallbackIndicationElement element = message.getExtension("fallback",
"urn:xmpp:fallback:0");

You should have noticed by now, that the element name and namespace are used and referred to in a number some places, so it makes sense to replace all the occurrences with references to a constant. We will put these into the FallbackIndicationElement where it is easy to find. Additionally we should provide a handy method to extract fallback indication elements from messages.

...

public class FallbackIndicationElement implements ExtensionElement {
    
    public static final String NAMESPACE = "urn:xmpp:fallback:0";
    public static final String ELEMENT = "fallback";

    @Override
    public String getNamespace() {
        return NAMESPACE;
    }

    @Override
    public String getElementName() {
        return ELEMENT;
    }

    ...

    public static FallbackIndicationElement fromMessage(Message message) {
        return message.getExtension(ELEMENT, NAMESPACE);
    }
}

Did I say Smack uses Managers for everything? Where is the FallbackIndicationManager then? Well, lets create it!

package tk.jabberhead.blog.wow.nice;

import java.util.Map;
import java.util.WeakHashMap;

import org.jivesoftware.smack.Manager;
import org.jivesoftware.smack.XMPPConnection;

public class FallbackIndicationManager extends Manager {

    private static final Map<XMPPConnection, FallbackIndicationManager>
INSTANCES = new WeakHashMap<>();

    public static synchronized FallbackIndicationManager
getInstanceFor(XMPPConnection connection) {
        FallbackIndicationManager manager = INSTANCES.get(connection);
        if (manager == null) {
            manager = new FallbackIndicationManager(connection);
            INSTANCES.put(connection, manager);
        }
        return manager;
    }

    private FallbackIndicationManager(XMPPConnection connection) {
        super(connection);
    }
}

Woah, what happened here? Let me explain.

Smack uses Managers to provide the user (the developer of an application) with an easy access to functionality that the user expects. In order to use some feature, the first thing the user does it to acquire an instance of the respective Manager class for their XMPPConnection. The returned instance is unique for the provided connection, meaning a different connection would get a different instance of the manager class, but the same connection will get the same instance anytime getInstanceFor(connection) is called.

Now what does the user expect from the API we are designing? Probably being able to send fallback indications and being notified whenever we receive one. Lets do sending first!

    ...

    private FallbackIndicationManager(XMPPConnection connection) {
        super(connection);
    }

    public MessageBuilder addFallbackIndicationToMessage(
MessageBuilder message, String fallbackBody) {
        return message.setBody(fallbackBody)
                .addExtension(new FallbackIndicationElement());
}

Easy!

Now, in order to listen for incoming fallback indications, we have to somehow tell Smack to notify us whenever a FallbackIndicationElement comes in. Luckily there is a rather nice way of doing this.

    ...

    private FallbackIndicationManager(XMPPConnection connection) {
        super(connection);
        registerStanzaListener();
    }

    private void registerStanzaListener() {
        StanzaFilter filter = new AndFilter(StanzaTypeFilter.MESSAGE, 
                new StanzaExtensionFilter(FallbackIndicationElement.ELEMENT, 
                        FallbackIndicationElement.NAMESPACE));
        connection().addAsyncStanzaListener(stanzaListener, filter);
    }

    private final StanzaListener stanzaListener = new StanzaListener() {
        @Override
        public void processStanza(Stanza packet) 
throws SmackException.NotConnectedException, InterruptedException,
SmackException.NotLoggedInException {
            Message message = (Message) packet;
            FallbackIndicationElement fallbackIndicator =
FallbackIndicationElement.fromMessage(message);
            String fallbackBody = message.getBody();
            onFallbackIndicationReceived(message, fallbackIndicator,
fallbackBody);
        }
    };

    private void onFallbackIndicationReceived(Message message,
FallbackIndicationElement fallbackIndicator, String fallbackBody) {
        // do something, eg. notify registered listeners etc.
    }

Now that’s nearly it. One last, very important thing is left to do. XMPP is known for its extensibility (for the better or the worst). If your client supports some feature, it is a good idea to announce this somehow, so that the other end knows about it. That way features can be negotiated so that the sender doesn’t try to use some feature that the other client doesn’t support.

Features are announced by using XEP-0115: Entity Capabilities, which is based on XEP-0030: Service Discovery. Smack supports this using the ServiceDiscoveryManager. We can announce support for Fallback Indications by letting our manager call

ServiceDiscoveryManager.getInstanceFor(connection)
        .addFeature(FallbackIndicationElement.NAMESPACE);

somewhere, for example in its constructor. Now the world knows that we know what Fallback Indications are. We should however also provide our users with the possibility to check if their contacts support that feature as well! So lets add a method for that to our manager!

    public boolean userSupportsFallbackIndications(EntityBareJid jid) 
            throws XMPPException.XMPPErrorException,
SmackException.NotConnectedException, InterruptedException, 
            SmackException.NoResponseException {
        return ServiceDiscoveryManager.getInstanceFor(connection())
                .supportsFeature(jid, FallbackIndicationElement.NAMESPACE);
    }

Done!

I hope this little article brought you some insights into the XMPP protocol and especially into the development process of protocol libraries such as Smack, even though the demonstrated feature was not very spectacular.

Quick reminder that the next Google Summer of Code is coming soon and the XMPP Standards Foundation got accepted ๐Ÿ˜‰
Check out the project ideas page!

Happy Hacking!

Smack: Some more busy nights and 12 bytes of IV

Decorative image of an architecture blueprint

In the last months I stayed up late some nights, so I decided to add some additional features to Smack.

Among the additions is support for some new XEPs, namely:

I also started working on an implementation of XEP-0245: Message Moderation, but that one is not yet finished and needs more work.

Direct MUC invitations are a method to invite users to a group chat. Smack already had support for another similar mechanism, but this one is recommended by the XMPP Compliance Suites 2020.

Message Fastening is a generalized mechanism to add information to messages. That might be a reaction, eg. a thumbs up which is added to a previous message.

Message Retraction is used to retract previously sent messages. Internally it is based on Message Fastening.

The Stanza Content Encryption pull request only teaches Smack what SCE elements are, but it doesn’t yet teach it how to use them. That is partly due to no E2EE specification actually using them yet. That will hopefully change soon ๐Ÿ˜‰

Anu brought up the fact that the OMEMO XEP is not totally clear on the length of initialization vectors used for message encryption. Historically most clients use 16 bytes length, while normally you would want to use 12. Apparently some AES-GCM libraries on iOS only support 12 bytes length, so using 12 bytes is definitely desirable. Most OMEMO implementations already support receiving 12 bytes as well as 16 bytes IV.

That’s why Smack will soon also start sending OMEMO messages with 12 bytes IV.

Pitfalls for OMEMO Implementations – Part 1: Inactive Devices

Smack’s OMEMO implementation received a security audit a while ago (huge thanks to the Guardian Project for providing the funding!). Radically Open Security, a non-profit pentesting group from the Netherlands focused on free software and ethical hacking went through the code in great detail to check its correctness and to search for any vulnerabilities. In the end they made some findings, although I wouldn’t consider them catastrophically bad (full disclosure – its my code, so I might be biased :D). In this post I want to go over two of the finding and discuss, what went wrong and how the issue was fixed.

Finding 001: Inactive Devices break Forward Secrecy

Inactive devices which no longer come online retain old chaining keys, which, if compromised, break
confidentiality of all messages from that key onwards.

Penetration Test Report – Radically Open Security
  • Vulnerability Type: Loss of Forward Secrecy.
  • Threat Level: High

Finding 003: Read-Only Devices Compromise Forward Secrecy

Read-only devices never update their keys and thus forfeit forward security.

Penetration Test Report – Radically Open Security
  • Vulnerability Type: Loss of Forward Secrecy.
  • Threat Level: Elevated

These vulnerabilities have the same cause and can be fixed by the same patch. They can happen, if a contact has a device which doesn’t come online (Finding 001), or just doesn’t respond to messages (Finding 003) for an extended period of time. To understand the issue, we have to look deeper into how the cryptography behind the double ratchet works.

Background

In OMEMO, your device has a separate session with every device it sends messages to. In case of one-to-one chat, your phone may have a session with your computer (you can read messages you sent from your phone on your computer too), with your contacts phone, your contacts computer and so on.

In each session, the cryptography happens in a kind of a ping pong way. Lets say you send two messages, then your contact responds with 4 messages, then its your turn again to send some messages and so on and so forth.

To better illustrate whats happening behind the scenes in the OMEMO protocol, lets compare an OMEMO chat with a discussion which is moderated using a “talking stick” (or “speachers staff”). Every time you want to send a message, you acquire the talking stick. Now you can send as many messages as you want. If another person wants to send you a message, you’ve got to hand them the stick first.

Now, cryptographically OMEMO is based on the double ratchet. As the name suggests, the double ratchet consists of two ratchets; the Diffie-Hellman-Ratchet and the Symmetric-Key-Ratchet.

The Symmetric-Key-Ratchet is responsible to encrypt every message you send with a new key. Roughly speaking, this is done by deriving a fresh encryption key from the key which was used to encrypt the previous message you sent. This procedure is called Key Derivation Function Chain (KDF-Chain). The benefit of this technique is, that if an attacker manages to break the key of a message, they cannot use that key to decrypt any previous messages (this is called “forward secrecy“). They can however use that key to derive the message keys of following messages, which is bad.

Functional Principle of a Key Derivation Function Chain

This is where the Diffie-Hellman-Ratchet comes into play. It replaces the symmetric ratchet key every time the “talking stick” gets handed over. Whenever you pass it to your contact (by them sending a message to you), your device executes a Diffie-Hellman key exchange with your contact and the result is used to generate a new key for the symmetric ratchet. The consequence is, that whenever the talking stick gets handed over, the attacker can no longer decrypt any following messages. This is why the protocol is described as “self healing”. The cryptographic term is called “future secrecy“.

Functional Principle of the Diffie-Hellman Ratchet Algorithm

Vulnerability

So, what’s wrong with this technique?

Well, lets say you have a device that you used only once or twice with OMEMO and which is now laying in the drawer, collecting dust. That device might still have active sessions with other devices. The problem is, that it will never acquire the talking stick (it will never send a message). Therefore the Diffie-Hellman-Ratchet will never be forwarded, which will lead to the base key of the Symmetric-Key-Ratchet never being replaced. The consequence is, that once an attacker manages to break one key of the chain, they can use that key to derive all following keys, which will get them access to all messages that device receives in that session. The property of “forward secrecy” is lost.

Smack-omemo already kind of dealt with that issue by removing inactive devices from the device list after a certain amount of time (two weeks). This does however only include own devices, not inactive devices of a contact. In the past I tried to fix this by ignoring inactive devices of a contact, by refusing to encrypt message for them. However, this lead to deadlocks, where two devices were mutually ignoring each other, resulting in a situation where no device could send a message to the other one to recover. As a consequence, Smack omitted deactivating stale contacts devices, which resulted in the vulnerability.

The Patch

This problem could not really be solved without changing the XEP (I’d argue). In my opinion the XEP must specify an explicit way of dealing with inactive devices. Therefore I proposed to use message counters, which gets rid of the deadlock problem. A device is considered as read-only, if the other party sent n messages without getting a reply. That way we can guarantee, that in a worst case scenario, an attacker can get access to n messages. That magical number n would need to be determined though.
I opened a pull request against XEP-0384 which introduces these changes.

Mitigation for the issue has been merged into Smack’s source code already ๐Ÿ™‚

As another prevention against this vulnerability, clients could send empty ping messages to forward the Diffie-Hellman-Ratchet. This does however require devices to take action and does not work with clients that are permanently offline though. Both changes would however go hand in hand to improve the OMEMO user experience and security. Smack-omemo already offers the client developer a method to send ping messages ๐Ÿ™‚

I hope I could help any fellow developers to avoid this pitfall and spark some discussion around how to address this issue in the XEP.

Happy (ethical) Hacking!

Summer of Code: Smack has OpenPGP Support!

I am very proud to announce, that Smack got support for OpenPGP for XMPP!

Today the Pull Request I worked on during my GSoC project was merged into Smacks master branch. Admittedly it will take a few months until smack-openpgp will be included in a Smack release, but that gives me time to further finalize the code and iron out any bugs that may be in there. If you want to try smack-openpgp for yourself, let me know of any issues you encounter ๐Ÿ™‚

(Edit: There are snapshot releases of Smack available for testing)

Now Smack does support two end-to-end encryption methods, which complement each other perfectly. OMEMO is best for people that want to be safe from future attacks, while OpenPGP is more suited for users who appreciate being able to access their chat history at any given time. OpenPGP is therefore the better choice for web based applications, although it is perfectly possible to implement web based clients that do OMEMO (see for example the Wire web app, which does ratcheting similar to OMEMO).

What’s left to do now is updating smack-openpgp due to updates made to XEP-0373 and extensive testing against other implementations.

Happy Hacking!

Smack: Some busy nights

Hello everyone!

This weekend I stayed up late almost every evening. Thus I decided that I wanted to code something, but I wasn’t sure what, so I took a look at the list of published XEPs to maybe find something that is easy to implement, but missing from Smack.

I found that XEP-0394: Message Markup was missing from Smacks list of supported extensions, so I began to code. The next day I finished my work and created Smack#194. One or two nights later I again stayed up late and decided to take another look for an unimplemented XEP. I settled on XEP-0382: Spoiler Messages  this time, which was really easy to implement (apart from the one little attribute, which for whatever reason I struggled to parse until I found a solution). The result of that night is Smack#195.

So if you find yourself laying awake one night with no chance to sleep, just look out for an easy to do task on your favourite free software project. I’m sure this will help you sleep better once the task is done.

Happy Hacking!