The signal-protocol for node and browsers
Signal is a protocol for encrypted chat. Developed by Moxie Marlinspike and Trevor Perrin, the Signal protocol backs the excellent Signal Messenger, and has since been incorporated into WhatsApp, Facebook Messenger and Google Allo.
I have forked Open Whisper Systems' own libsignal-protocol-javascript, adding support for Node and the browser. You can find my fork, signal-protocol, on:
- github
- npm as signal-protocol
In this article, I describe roughly what I did to Open Whisper Systems' original JS package. I conclude with some lingering questions and concerns for asynchronous messaging in the browser, and propose a few next steps.
WARNING: My fork has NOT been reviewed by an experienced cryptographer. It is for research only! Also, I am not affiliated with Open Whisper Systems. See Disclaimer.
1 What is Signal? Why Javascript?
The Signal protocol [1] is a ratcheting forward secrecy protocol that works in both synchronous and asynchronous messaging. Signal provides forward secrecy (i.e., a leaked key will not compromise past messages) and can "heal" a session after a compromise so that later messages are protected again, even when messages arrive out-of-order. Needless to say, Signal is a great protocol for encrypted chat. But, on the web, a lot of things look like messaging [2]. Web developers are familiar with receiving out-of-order messages.
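To make the ratcheting idea concrete, here is a toy symmetric-key ratchet in Node. It is only a sketch of the general technique, not code from libsignal: the function names (`ratchetStep`, `createReceiver`, `senderKeys`), the HMAC constants, and the shared secret are all assumptions made for illustration. Each step derives a one-time message key and a new chain key, and the receiver caches keys for messages it has skipped so that out-of-order arrivals can still be decrypted.

```javascript
const crypto = require('crypto');

// One ratchet step: derive a message key and the next chain key.
// The 0x01 / 0x02 constants are illustrative inputs to the HMAC.
function ratchetStep(chainKey) {
  const messageKey = crypto.createHmac('sha256', chainKey)
    .update(Buffer.from([0x01])).digest();
  const nextChainKey = crypto.createHmac('sha256', chainKey)
    .update(Buffer.from([0x02])).digest();
  return { messageKey, nextChainKey };
}

// Hypothetical sender: produces the message keys for messages 0..count-1.
function senderKeys(initialChainKey, count) {
  const keys = [];
  let chainKey = initialChainKey;
  for (let i = 0; i < count; i++) {
    const step = ratchetStep(chainKey);
    keys.push(step.messageKey);
    chainKey = step.nextChainKey;
  }
  return keys;
}

// Hypothetical receiver: ratchets forward on demand and caches the keys
// of skipped messages so they can be used once when those messages arrive.
function createReceiver(initialChainKey) {
  let chainKey = initialChainKey;
  let nextIndex = 0;
  const skipped = new Map();
  return {
    keyFor(index) {
      if (skipped.has(index)) {
        const cached = skipped.get(index);
        skipped.delete(index); // each message key is used only once
        return cached;
      }
      if (index < nextIndex) throw new Error('message key already used');
      let key;
      while (nextIndex <= index) {
        const step = ratchetStep(chainKey);
        chainKey = step.nextChainKey; // the old chain key is discarded
        if (nextIndex === index) key = step.messageKey;
        else skipped.set(nextIndex, step.messageKey);
        nextIndex += 1;
      }
      return key;
    },
  };
}

// Demo: message 2 arrives before messages 0 and 1.
const shared = crypto.createHash('sha256').update('demo shared secret').digest();
const sent = senderKeys(shared, 3);
const receiver = createReceiver(shared);
const k2 = receiver.keyFor(2);
const k0 = receiver.keyFor(0);
const k1 = receiver.keyFor(1);
```

Because old chain keys are thrown away after each step, someone who steals the current chain key cannot walk the chain backwards to recover earlier message keys. The real protocol layers a Diffie-Hellman ratchet on top of this, which the sketch leaves out.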
Through all sorts of network errors, Signal allows us to keep the same session alive, without ever tearing it down. This is a property I want when I'm collaborating on a Google doc with someone, or talking to a customer as a server. In other words, Signal can supplement TLS for providing encrypted interactions with applications on the web. It allows me to have authenticated and encrypted communications with users. In the Google Docs case, Signal prevents the server from seeing cleartext user data, shifting the liability for data breaches away from service providers. It can also provide authenticated communication from the user to the server. Where TLS [3] can provide an encrypted channel for convincing the user they are talking to the right server, Signal can convince the server it is talking to the right user - and that it is the same user it spoke to last time [4].
2 Making Signal work in node and browsers
So, there are some reasons why we might want Signal to work well in our web browser. And, the official libsignal-protocol-javascript offered most of the functionality. My work was gruntwork - getting Signal to compile into a browser package, and providing polyfills for native use in Node. If that doesn't sound like your bag of chips, skip to "Lingering questions."
2.1 Compatibility with JS modules
Javascript's module system makes for testable, reusable components. Previously, Open Whisper Systems' JS library attached stuff to window throughout the program. This has a few immediate negative effects - it will crash node, for one thing, and it will also make it difficult to bundle stuff. More damning, however, is that attaching to window makes it impossible to figure out what scope our function cares about, and what scope it shouldn't be able to see.
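As a rough illustration of the difference (this is not the library's actual code; `createHelpers` and `randomHex` are hypothetical names), compare a helper hung off the global window with a module-style factory that is handed the one dependency it needs:

```javascript
// Before (sketch): attached to the global object, visible to every script,
// and a crash in Node where `window` does not exist.
//   window.helpers = { randomHex: function (n) { /* uses window.crypto */ } };

// After (sketch): a factory that receives its dependency explicitly.
// Its scope is now obvious: it can see `cryptoImpl` and nothing else.
function createHelpers(cryptoImpl) {
  return {
    randomHex(length) {
      const bytes = cryptoImpl.getRandomValues(new Uint8Array(length));
      return Array.from(bytes, function (b) {
        return b.toString(16).padStart(2, '0');
      }).join('');
    },
  };
}

// In a CommonJS module this would end with:
//   module.exports = createHelpers;
// and callers would pass in window.crypto in the browser, or a polyfill in Node.
```

A side benefit is testability: a test can pass in a stub `cryptoImpl` with predictable output instead of patching globals.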
Moving all mentions of window into modules, and re-configuring the way scope is passed around, took more or less my entire Sunday. The complexity was due in part to…
2.1.1 Emscripten + bundling
Emscripten is a crazy amazing tool - it compiles LLVM bitcode to performant Javascript. It's no surprise that Open Whisper Systems would use it for a key crypto primitive, the ed25519 curves.
However, how to require emscripten code with bundlers like browserify was not immediately obvious. Eventually, I found this blogpost, which led me to the hack that saved the day.
module.exports = Module;
// Do not recurse into module
Module.inspect = function() { return '[Module]'; };
Appending this code to a compiled Emscripten file will allow you to import it with browserify or webpack.
2.1.2 WebWorkers
Finally, it was time to deal with the WebWorker, which can work with ed25519 curves outside of the main execution thread. This certainly helps performance, and I assume it helps defend against side-channel attacks, as well.
So, how do you use WebWorkers with bundled browser builds? Of course, substack has written something for this.
webworkify can spawn a webworker from code using the same, single bundle of JS with a regular old require().
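For readers who have not used webworkify, the pattern looks roughly like the sketch below. The worker body is a placeholder (it reverses bytes instead of doing curve math), and `curveWorker` and the message shape are assumptions made for illustration. The `work(require('./worker.js'))` call in the comments follows webworkify's documented usage.

```javascript
// worker.js (sketch): webworkify expects the module to export a function
// that receives the worker's global scope as `self`.
function curveWorker(self) {
  self.addEventListener('message', function (ev) {
    // Placeholder for the expensive curve operation: reverse the input bytes.
    const result = Array.from(ev.data.bytes).reverse();
    self.postMessage({ id: ev.data.id, result: result });
  });
}

// In the bundled browser build (only works inside a browserify bundle):
//   module.exports = curveWorker;                 // at the end of worker.js
//   var work = require('webworkify');             // in the main script
//   var worker = work(require('./worker.js'));
//   worker.addEventListener('message', function (ev) { console.log(ev.data); });
//   worker.postMessage({ id: 1, bytes: [1, 2, 3] });
```

Tagging each request with an `id` lets the main thread match replies to requests, since worker messages are asynchronous.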
2.2 Polyfills for node
With the browser bundle built, it was time to make stuff work in Node.
I used node-webcrypto-ossl as a drop-in native replacement for the WebCrypto API that gets its cryptographic primitives from OpenSSL. The tests pass, but we will need a cryptographer to know how reliable this library is (it has not been independently reviewed). Please get in touch if you would like to review - see Disclaimer, below.
For webworkers, I use tiny-worker as a (more-or-less) drop-in replacement for webworkify. Tiny-worker runs the worker code in a separate child process in Node instead of a WebWorker. Again, we will probably need a cryptographer to verify this decision.
2.3 Testing across platforms
This project was my first time testing with Travis (for Node v5 and v6 builds), and with Sauce Labs (for browsers). Travis took a bit of fiddling to get right (hint: use dist: trusty for modern build tools). But, once everything was up and running, I was glad to have set up my CI. (It's amazing what's available for free on the net these days. I didn't pay a dime for either of those tools.)
3 Lingering questions
After this work, I have a few questions remaining.
3.1 Can we trust WebCrypto API on the wild, wild web?
Currently, Signal in the browser relies on the WebCrypto API.
In principle, the WebCrypto API is fantastic - well-built cryptographic primitives, written in native code, executable from the JS VM.
In practice, using the API involves making calls to window.crypto - which could be anything.
Consider the following line of attack, which could be exploited by a plugin, or script, that has access to the browser window in which libsignal is running:
- Come up with a bogus myFakeGetRandomValues(typedArray) method that generates non-random stuff.
- Set window.crypto.getRandomValues = myFakeGetRandomValues
- Import signal as normal.
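The steps above can be sketched in a few lines. To keep the example runnable in Node, `fakeWindow` stands in for the browser's window object and `generateKeyMaterial` stands in for library code that trusts window.crypto; both are hypothetical. The bogus generator is called `myFakeGetRandomValues` here because the real WebCrypto method is named `getRandomValues`. Whether a given browser lets a script overwrite that method is an assumption of this sketch.

```javascript
const nodeCrypto = require('crypto');

// Stand-in for the browser's `window` (hypothetical, so the sketch runs in Node).
const fakeWindow = {
  crypto: {
    getRandomValues: function (typedArray) {
      return nodeCrypto.randomFillSync(typedArray);
    },
  },
};

// Step 1: a bogus generator that fills the array with a constant.
function myFakeGetRandomValues(typedArray) {
  typedArray.fill(7);
  return typedArray;
}

// Step 2: a hostile plugin or script swaps out the real method.
fakeWindow.crypto.getRandomValues = myFakeGetRandomValues;

// Step 3: library code runs as normal and trusts whatever it finds.
// (Hypothetical helper standing in for key generation inside the library.)
function generateKeyMaterial(win, length) {
  return win.crypto.getRandomValues(new Uint8Array(length));
}

const keyA = generateKeyMaterial(fakeWindow, 32);
const keyB = generateKeyMaterial(fakeWindow, 32);
// keyA and keyB are now identical and fully predictable to the attacker.
```

Nothing in step 3 fails or throws, which is the worrying part: the library has no way to tell that its randomness source was replaced.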
I believe this exploit speaks to the absolute necessity of keeping the environment in which Signal runs isolated from the normal userland browser, which is probably a madhouse of unsafe code running, reading the page, etc. If you must run the Signal protocol in-browser, run it in Electron, or as a Chrome app (Signal Desktop currently does the latter).
In the future, we may think about moving toward emscripten-compiled dependencies for cryptographic primitives. At the end of the day, window.crypto can be absolutely anything. If we can bundle all primitives with the rest of the application code, we can verify the integrity of that one JS bundle, e.g. with subresource integrity.
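For reference, a subresource integrity value is just a base64-encoded digest of the file, prefixed with the hash name. A minimal sketch of computing one in Node follows; `sriHash` and the placeholder bundle contents are assumptions for illustration, and in practice you would read the built bundle from disk.

```javascript
const crypto = require('crypto');

// Compute a Subresource Integrity value ("sha384-<base64 digest>")
// for the bytes of a bundle.
function sriHash(bundleSource) {
  const digest = crypto.createHash('sha384')
    .update(bundleSource)
    .digest('base64');
  return 'sha384-' + digest;
}

// Hypothetical bundle contents standing in for the real built file.
const bundle = 'console.log("signal bundle placeholder");';
const integrity = sriHash(bundle);

// The value goes into the script tag's `integrity` attribute, e.g.
//   <script src="bundle.js" integrity="sha384-..." crossorigin="anonymous"></script>
console.log(integrity);
```

The browser refuses to run the script if the fetched bytes do not hash to the declared value. Note that this only covers what is inside the bundle, which is the argument for bundling the primitives too.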
3.2 What are the WebWorkers doing?
The tests feature WebWorkers, but they are not called anywhere else as far as I can see. I would like to learn more about how WebWorkers are to be used in Signal.
3.3 What's the deal with session state at rest?
There is a bunch of state in the protocol, stored in sessions, which have a special API. Again, I am not a cryptographer, and do not know how best to store sessions. This will be future work for me to research, and I hope to hear more from the community as well about how best to encrypt session (and identity) data at rest, both in node and in the browser.
4 Future work: Shared, forking datastructures
Imagine Google Docs, but encrypted. Edits from friends could be Signal messages, containing a diff and an insertion/deletion point.
Of course, ordering the messages is always an issue. That's where a Merkle DAG like hyperlog could come in handy. Even if messages are not strictly ordered, we can still construct a (forking) datastructure that represents all possible conflict-resolution states of the text.
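To give a feel for the shape of such a structure, here is a small hash-linked log in plain Node. It is not hyperlog's API, just a sketch of the idea: `makeNode`, `createLog`, and the edit payloads are hypothetical, and the payloads stand in for already-decrypted Signal messages. Each node's key is a hash of its payload and its parents' keys, so two edits made against the same parent show up as a fork with two heads until someone merges them.

```javascript
const crypto = require('crypto');

// A node's key is the hash of its payload plus the (sorted) keys of its parents.
function makeNode(payload, parentKeys) {
  const links = parentKeys.slice().sort();
  const key = crypto.createHash('sha256')
    .update(JSON.stringify({ payload: payload, links: links }))
    .digest('hex');
  return { key: key, payload: payload, links: links };
}

function createLog() {
  const nodes = new Map();
  const headSet = new Set(); // nodes that nothing links to yet
  return {
    add: function (payload, parentKeys) {
      const node = makeNode(payload, parentKeys);
      if (nodes.has(node.key)) return nodes.get(node.key); // already known
      node.links.forEach(function (link) {
        if (!nodes.has(link)) throw new Error('unknown parent ' + link);
      });
      nodes.set(node.key, node);
      node.links.forEach(function (link) { headSet.delete(link); });
      headSet.add(node.key);
      return node;
    },
    heads: function () { return Array.from(headSet).sort(); },
  };
}

// Two friends edit the same document state concurrently.
const log = createLog();
const root = log.add({ op: 'insert', at: 0, text: 'Hello' }, []);
const alice = log.add({ op: 'insert', at: 5, text: ', world' }, [root.key]);
const bob = log.add({ op: 'insert', at: 5, text: '!' }, [root.key]);
const forkHeads = log.heads(); // two heads: the log has forked

// A later message that links to both heads resolves the fork.
const merged = log.add({ op: 'merge' }, [alice.key, bob.key]);
```

Arrival order does not matter here as long as parents arrive before children, since the links, not the timestamps, define the history.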
A project like this will require a lot of infrastructural work - encrypting the data at rest, setting up keyservers, coordinating session-start messages. But, it could lay the groundwork for entire application architectures based on Signal. I love peer-to-peer, but there is no reason to forsake servers completely - so long as they do not have a privileged position on the network relative to other peers. Signal can help here.
5 Thanks
Thanks to Adam Rothman and Nick Doty for notes and conversations.
6 Disclaimer
I am not a cryptographer. Do NOT use this library for encrypting secrets. It is for research ONLY! Also, I am not affiliated with Open Whisper Systems. I am just a hobbyist. This library has NOT been reviewed by the folks at Open Whisper Systems (though I have issued a PR on the bundling (non-node) aspects of this work).
Relatedly, signal-protocol needs a look over by an experienced cryptographer, preferably someone with experience doing crypto in node and the browser. If you know such a person, or are one, please get in touch (my email is on my github profile). We want not only to harden our system, but to guard against foolhardy ventures, where possible.
Footnotes:
[1] The Signal protocol was formerly named axolotl, for its ability to "heal" itself when an attacker compromises a session key. Here is a formal analysis of the Signal protocol.
[2] Regarding apps that use chat, see Jon Libov's Futures of Text.
[3] TLS has many shortcomings, and I am not suggesting Signal as a replacement, or a panacea for authenticity on the web. I recommend SSL and the Future of Authenticity for a fun overview. Trevor Perrin's Noise protocol has been proposed as an alternative to TLS.
[4] EDIT: In fact, TLS has a feature called session resumption. However, it can compromise forward secrecy. (Thanks to @baby for the tip).