Saturday, 8 October 2011

the websocket api spec doesn't like you

I like the look of the W3C WebSockets client specification but it's a bit short-sighted given its inevitable ubiquity.

Unlike the IETF WebSocket protocol, it doesn't have on the order of 100 prior revisions, each with varying levels of implementation across different browsers and browser versions. The client specification is simple and hasn't changed much between revisions, but it suffers from one flaw: it can't be implemented in most languages without deviating from it.

Let's look at an example:

var socket = new WebSocket('ws://websocket.site.local/');

socket.onopen = function () {
    socket.send('Hello World!');
};

This example is pretty straightforward to understand. The browser connects out, handshakes with the server and then sends a message if there are no problems. In an event-driven language like JavaScript the process looks something like this:

  1. Enter the WebSocket constructor
  2. Call an asynchronous IO function for connecting to the server
  3. Exit the WebSocket constructor
  4. Assign the "Hello World" callback to onopen
  5. Exit the above code block
  6. At some undetermined point in the future, the event loop will run the callback for the IO operation in stage 2
  7. If the callback thinks the IO succeeded, it will call the onopen function
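To make that ordering concrete, here is a minimal sketch that fakes the event loop with a plain array of pending callbacks. FakeSocket, pending and log are names made up for illustration, not part of any real WebSocket implementation, and no network IO happens at all.

```javascript
// Simulated event loop: a queue of callbacks that is drained after the main code ends.
const pending = [];
const log = [];

// Hypothetical stand-in for a browser WebSocket; it never touches the network.
class FakeSocket {
  constructor(url) {
    this.url = url;
    this.onopen = null;
    log.push('constructor: start connect');
    // Stage 2: the "asynchronous IO" is just a callback queued for later.
    pending.push(() => {
      log.push('loop: connect finished');
      // Stage 7: onopen is looked up only now, long after the constructor returned.
      if (typeof this.onopen === 'function') this.onopen();
    });
    log.push('constructor: exit');
  }

  send(message) {
    log.push('send: ' + message);
  }
}

const socket = new FakeSocket('ws://websocket.site.local/');
socket.onopen = function () {
  socket.send('Hello World!');
};
log.push('script: handler assigned');

// Stage 6: the loop only gets to run once the code above has finished.
while (pending.length > 0) pending.shift()();

console.log(log.join('\n'));
```

The log shows the handler being assigned before the connect callback runs, which is the only reason the message gets sent.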

If you don't have an event loop processing asynchronous IO callbacks then the above code does this:

  1. Enter the WebSocket constructor
  2. Connect to the server using blocking IO
  3. Call the onopen function if it's set (it isn't)
  4. Exit the WebSocket constructor
  5. Assign the "Hello World" callback to onopen
  6. Exit the above code block

So onopen will never actually be called.
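The same failure can be reproduced in JavaScript itself by faking a constructor that does everything up front. This is only a sketch: BlockingSocket and calls are invented names, and the "blocking connect" is assumed to have succeeded instantly.

```javascript
const calls = [];

// Hypothetical socket whose constructor connects synchronously, as a blocking client would.
class BlockingSocket {
  constructor(url) {
    this.url = url;
    this.onopen = null;
    // Stand-in for a blocking connect and handshake that has already succeeded.
    this.connected = true;
    // The handler is checked here, before the caller has had any chance to set it.
    if (typeof this.onopen === 'function') this.onopen();
  }

  send(message) {
    calls.push(message);
  }
}

const socket = new BlockingSocket('ws://websocket.site.local/');
socket.onopen = function () {
  socket.send('Hello World!'); // assigned too late: nothing will ever call it
};

console.log(calls.length); // 0
```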

JavaScript works fine because under its cooperative multitasking scheme only one function executes at a time, and that function is responsible for giving up the CPU to the next one. Background threads might perform tasks in parallel, but from the perspective of the programmer the current function reigns supreme. The connect callback can't run until the code that created the socket has finished, and by then onopen has been assigned.

So how do you have the same sequence of events when there isn't an event loop? You could try to implement the blocking IO part on a separate thread but then whether onopen is called or not is determined by who wins the race: the original thread making the assignment or the IO thread completing its handshake. Clearly another solution is needed.
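JavaScript has no easy way to show a real two-thread race, so the sketch below simply replays the two possible interleavings by hand. The run function and its step names are hypothetical; each step stands for something one of the two threads would do.

```javascript
// Replays one possible scheduling order of the two threads' steps.
function run(order) {
  const socket = { onopen: null, opened: false };
  const steps = {
    // Original thread: assigns the handler after constructing the socket.
    assign: () => { socket.onopen = () => { socket.opened = true; }; },
    // IO thread: finishes the handshake and calls whatever handler is set right now.
    handshake: () => { if (socket.onopen) socket.onopen(); },
  };
  for (const name of order) steps[name]();
  return socket.opened;
}

const mainThreadWins = run(['assign', 'handshake']); // true: handler was in place
const ioThreadWins = run(['handshake', 'assign']);   // false: handshake found no handler

console.log({ mainThreadWins, ioThreadWins });
```

Nothing in the calling code controls which order actually happens, which is why this approach can't be relied on.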

The solution

When I was writing a PHP WebSocket client I didn't have time to reprogram the entire language to use an event loop (I'll do that next weekend) so I had to deviate from the specification that I was becoming quite fond of. There was a simple solution: remove the responsibility of connecting to the server from the constructor and put it into a separate method that has to be called explicitly after the socket has been fully prepared.

$socket = new WebSocket('ws://websocket.site.local');

$socket->onopen = function () use ($socket) {
    $socket->send('Hello World!');
};

$socket->connect();

This rebuilds the chain of causation the JavaScript implementation gives us but leaves me wishing the W3C had come up with something a bit more universal. Event-driven programming might be the current hot topic and might prove itself to be superior to multi-threading, but for the foreseeable future most programmers won't be using an event loop.
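For anyone curious what sits behind a connect() method like that, here is a rough sketch of the same shape in JavaScript. It is not the PHP client's actual code: DeferredSocket and sent are hypothetical names, and the blocking connect is replaced by a flag.

```javascript
const sent = [];

// Hypothetical socket that does nothing in its constructor and connects on demand.
class DeferredSocket {
  constructor(url) {
    this.url = url;
    this.onopen = null;
    this.connected = false; // nothing is connected yet
  }

  connect() {
    // Stand-in for a blocking connect and handshake.
    this.connected = true;
    // By now the caller has had the chance to assign onopen.
    if (typeof this.onopen === 'function') this.onopen();
  }

  send(message) {
    if (!this.connected) throw new Error('not connected');
    sent.push(message);
  }
}

const socket = new DeferredSocket('ws://websocket.site.local');
socket.onopen = function () {
  socket.send('Hello World!');
};
socket.connect();

console.log(sent); // sent now holds one message: 'Hello World!'
```

The cost is one extra call that the specification doesn't have, so code written against the browser API can't be ported across unchanged.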

For the same reasons that HTTP became the lingua franca of APIs and SOAs, WebSockets will likely find themselves playing a large role in the future. Too bad we won't be able to stick to the spec!

Is there a better way? Should the W3C look beyond the browser?
