Implementing Packet Sequence Validation using Pushdown Automata


This is part 2 of a small series on verifying the validity of packet sequences using tools from theoretical computer science. Read part 1 here.

In the previous blog post I discussed how a formal grammar can be transformed into a pushdown automaton in order to check if a sequence of packets or tokens is part of the language described by the grammar. In this post I will discuss how I implemented said automaton in Java in order to validate OpenPGP messages in PGPainless.

In the meantime, I made some slight changes to the automaton and removed some superfluous states. My current design of the automaton looks as follows:

If you compare this diagram to the previous iteration, you can see that I got rid of the states “Signed Message”, “One-Pass-Signed Message” and “Corresponding Signature”. Those were states which had ε-transitions to another state, so they were not really useful.

For example, the state “One-Pass-Signed Message” would only be entered when the input “OPS” was read and ‘m’ could be popped from the stack. After that, there would only be a single applicable rule which would read no input, pop nothing from the stack and instead push back ‘m’. Therefore, these two rule could be combined into a single rule which reads input “OPS”, pops ‘m’ from the stack and immediately pushes it back onto it. This rule would leave the automaton in state “OpenPGP Message”. Both automata are equivalent.

One more minor detail: Since I am using Bouncy Castle, I have to deal with some of its quirks. One of those being that BC bundles together encrypted session keys (PKESKs/SKESKs) with the actual encrypted data packets (SEIPD/SED). Therefore when implementing, we can further simplify the diagram by removing the SKESK|PKESK parts:

Now, in order to implement this automaton in Java, I decided to define enums for the input and stack alphabets, as well as the states:

public enum InputAlphabet {
    LiteralData,
    Signature,            // Sig
    OnePassSignature,     // OPS
    CompressedData,
    EncryptedData,        // SEIPD|SED
    EndOfSequence         // End of message/nested data
}
public enum StackAlphabet {
    msg,                 // m
    ops,                 // o
    terminus             // #
}
public enum State {
    OpenPgpMessage,
    LiteralMessage,
    CompressedMessage,
    EncryptedMessage,
    Valid
}

Note, that there is no “Start” state, since we will simply initialize the automaton in state OpenPgpMessage, with ‘m#’ already put on the stack.

We also need an exception class that we can throw when OpenPGP packet is read when its not allowed. Therefore I created a MalformedOpenPgpMessageException class.

Now the first design of our automaton itself is pretty straight forward:

public class PDA {
    private State state;
    private final Stack<StackAlphabet> stack = new Stack<>();
    
    public PDA() {
        state = State.OpenPgpMessage;    // initial state
        stack.push(terminus);            // push '#'
        stack.push(msg);                 // push 'm'
    }

    public void next(InputAlphabet input)
            throws MalformedOpenPgpMessageException {
        // TODO: handle the next input packet
    }

    StackAlphabet popStack() {
        if (stack.isEmpty()) {
            return null;
        }
        return stack.pop();
    }

    void pushStack(StackAlphabet item) {
        stack.push(item);
    }

    boolean isEmptyStack() {
        return stack.isEmpty();
    }

    public boolean isValid() {
        return state == State.Valid && isEmptyStack();
    }
}

As you can see, we initialize the automaton with a pre-populated stack and an initial state. The automatons isValid() method only returns true, if the automaton ended up in state “Valid” and the stack is empty.

Whats missing is an implementation of the transition rules. I found it most straight forward to implement those inside the State enum itself by defining a transition() method:

public enum State {

    OpenPgpMessage {
        @Overrides
        public State transition(InputAlphabet input, PDA automaton)
                throws MalformedOpenPgpMessageException {
            StackAlphabet stackItem = automaton.popStack();
            if (stackItem != OpenPgpMessage) {
                throw new MalformedOpenPgpMessageException();
            }
            swith(input) {
                case LiteralData:
                    // Literal Packet,m/ε
                    return LiteralMessage;
                case Signature:
                    // Sig,m/m
                    automaton.pushStack(msg);
                    return OpenPgpMessage;
                case OnePassSignature:
                    // OPS,m/mo
                    automaton.push(ops);
                    automaton.push(msg);
                    return OpenPgpMessage;
                case CompressedData:
                    // Compressed Data,m/ε
                    return CompressedMessage;
                case EncryptedData:
                    // SEIPD|SED,m/ε
                    return EncryptedMessage;
                case EndOfSequence:
                default:
                    // No transition
                    throw new MalformedOpenPgpMessageException();
            }
        }
    },

    LiteralMessage {
        @Overrides
        public State transition(InputAlphabet input, PDA automaton)
                throws MalformedOpenPgpMessageException {
            StackAlphabet stackItem = automaton.popStack();
            switch(input) {
                case Signature:
                    if (stackItem == ops) {
                        // Sig,o/ε
                        return LiteralMessage;
                    } else {
                        throw new MalformedOpenPgpMessageException();
                    }
                case EndOfSequence:
                    if (stackItem == terminus && automaton.isEmptyStack()) {
                        // ε,#/ε
                        return valid;
                    } else {
                        throw new MalformedOpenPgpMessageException();
                    }
                default:
                    throw new MalformedOpenPgpMessageException();
            }
        }
    },

    CompressedMessage {
        @Overrides
        public State transition(InputAlphabet input, PDA automaton)
                throws MalformedOpenPgpMessageException {
            StackAlphabet stackItem = automaton.popStack();
            switch(input) {
                case Signature:
                    if (stackItem == ops) {
                        // Sig,o/ε
                        return CompressedMessage;
                    } else {
                        throw new MalformedOpenPgpMessageException();
                    }
                case EndOfSequence:
                    if (stackItem == terminus && automaton.isEmptyStack()) {
                        // ε,#/ε
                        return valid;
                    } else {
                        throw new MalformedOpenPgpMessageException();
                    }
                default:
                    throw new MalformedOpenPgpMessageException();
            }
        }
    },

    EncryptedMessage {
        @Overrides
        public State transition(InputAlphabet input, PDA automaton)
                throws MalformedOpenPgpMessageException {
            StackAlphabet stackItem = automaton.popStack();
            switch(input) {
                case Signature:
                    if (stackItem == ops) {
                        // Sig,o/ε
                        return EncryptedMessage;
                    } else {
                        throw new MalformedOpenPgpMessageException();
                    }
                case EndOfSequence:
                    if (stackItem == terminus && automaton.isEmptyStack()) {
                        // ε,#/ε
                        return valid;
                    } else {
                        throw new MalformedOpenPgpMessageException();
                    }
                default:
                    throw new MalformedOpenPgpMessageException();
            }
        }
    },

    Valid {
        @Overrides
        public State transition(InputAlphabet input, PDA automaton)
                throws MalformedOpenPgpMessageException {
            // Cannot transition out of Valid state
            throw new MalformedOpenPgpMessageException();
        }
    }
    ;

    abstract State transition(InputAlphabet input, PDA automaton)
            throws MalformedOpenPgpMessageException;
}

It might make sense to define the transitions in an external class to allow for different grammars and to remove the dependency on the PDA class, but I do not care about this for now, so I’m fine with it.

Now every State has a transition() method, which takes an input symbol and the automaton itself (for access to the stack) and either returns the new state, or throws an exception in case of an illegal token.

Next, we need to modify our PDA class, so that the new state is saved:

public class PDA {
    [...]

    public void next(InputAlphabet input)
            throws MalformedOpenPgpMessageException {
        state = state.transition(input, this);
    }
}

Now we are able to verify simple packet sequences by feeding them one-by-one to the automaton:

// LIT EOS
PDA pda = new PDA();
pda.next(LiteralData);
pda.next(EndOfSequence);
assertTrue(pda.isValid());

// OPS LIT SIG EOS
pda = new PDA();
pda.next(OnePassSignature);
pda.next(LiteralData);
pda.next(Signature);
pda.next(EndOfSequence);
assertTrue(pda.isValid());

// COMP EOS
PDA pda = new PDA();
pda.next(CompressedData);
pda.next(EndOfSequence);
assertTrue(pda.isValid());

You might say “Hold up! The last example is a clear violation of the syntax! A compressed data packet alone does not make a valid OpenPGP message!”.

And you are right. A compressed data packet is only a valid OpenPGP message, if its decompressed contents also represent a valid OpenPGP message. Therefore, when using our PDA class, we need to take care of packets with nested streams separately. In my implementation, I created an OpenPgpMessageInputStream, which among consuming the packet stream, handling the actual decryption, decompression etc. also takes care for handling nested PDAs. I will not go into too much details, but the following code should give a good idea of the architecture:

public class OpenPgpMessageInputStream {
    private final PDA pda = new PDA();
    private BCPGInputStream pgpIn = ...; // stream of OpenPGP packets
    private OpenPgpMessageInputStream nestedStream;

    public OpenPgpMessageInputStream(BCPGInputStream pgpIn) {
        this.pgpIn = pgpIn;
        switch(pgpIn.nextPacketTag()) {
            case LIT:
                pda.next(LiteralData);
                ...
                break;
            case COMP:
                pda.next(CompressedData);
                nestedStream = new OpenPgpMessageInputStream(decompress());
                ...
                break;
            case OPS:
                pda.next(OnePassSignature);
                ...
                break;
            case SIG:
                pda.next(Signature);
                ...
                break;
            case SEIPD:
            case SED:
                pda.next(EncryptedData);
                nestedStream = new OpenPgpMessageInputStream(decrypt());
                ...
                break;
            default:
                // Unknown / irrelevant packet
                throw new MalformedOpenPgpMessageException();
    }

    boolean isValid() {
        return pda.isValid() &&
               (nestedStream == null || nestedStream.isValid());

    @Override
    close() {
        if (!isValid()) {
            throw new MalformedOpenPgpMessageException();
        }
        ...
    }
}

The key thing to take away here is, that when we encounter a nesting packet (EncryptedData, CompressedData), we create a nested OpenPgpMessageInputStream on the decrypted / decompressed contents of this packet. Once we are ready to close the stream (because we reached the end), we not only check if our own PDA is in a valid state, but also whether the nestedStream (if there is one) is valid too.

This code is of course only a rough sketch and the actual implementation is far more complex to cover many possible edge cases. Yet, it still should give a good idea of how to use pushdown automata to verify packet sequences 🙂 Feel free to check out my real-world implementation here and here.

Happy Hacking!

, , ,

Leave a Reply

Your email address will not be published. Required fields are marked *