Using HTML5 WebSockets

One of the interesting features of HTML5 is the inclusion of WebSockets. This allows you to open up a persistent connection (layered on top of a TCP/IP socket), and do full duplex messaging between the browser and the server.

There are a variety of frameworks out there that make it easy to do socket programming (e.g. SignalR, socket.io) but I was curious to understand how raw WebSockets work.

In the browser, it’s fairly straightforward to use the WebSockets API. The code looks a bit like this:

if ("WebSocket" in window)
{
  var ws = new WebSocket("ws://your.domain.com:1337/Path");
  ws.onopen = function() {
    // the socket is open
    ws.send("Hello");
  };
  ws.onmessage = function (evt){ 
    // message received
    console.log(evt.data);
  };
  ws.onclose = function() { 
    // websocket is closed.
  };
}

Sending a message is a single call to send(), and the onmessage event is fired whenever a message is received.

On the server side, however, things are a little more complicated.

When the browser wants to start using WebSockets, it makes an HTTP GET request asking to upgrade the connection. The request looks like this:

GET /Path HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: your.domain.com:1337
Origin: http://another.domain.com
Sec-WebSocket-Key: yrLV2RDDPo10K4jwARFt6Q==
Sec-WebSocket-Version: 13
Cookie: xxx

The server should then respond like this:

HTTP/1.1 101 Switching Protocols
Upgrade: WebSocket
Connection: Upgrade
Sec-WebSocket-Accept: 3EAoJRawXLXN/IeksBFhfwlhGec=

The important parts are the ‘Sec-WebSocket-Key’ and ‘Sec-WebSocket-Accept’ headers. You must take the ‘Sec-WebSocket-Key’ value, append the fixed string ‘258EAFA5-E914-47DA-95CA-C5AB0DC85B11’ to it, calculate the SHA-1 hash of the result, and base64 encode that hash. The encoded value is what you set as the ‘Sec-WebSocket-Accept’ header in the response (don’t ask me why!).

This bit of JavaScript (node.js) will do this for you:

if (data.toString().substring(0,3) == "GET") {
  // pull the key out of the request headers
  var key = getRequestVariable(data.toString().split("\r\n"), 'Sec-WebSocket-Key');
  // hash the key plus the fixed string, base64 encoded
  var sha = sha1(key + '258EAFA5-E914-47DA-95CA-C5AB0DC85B11');
  socket.write("HTTP/1.1 101 Switching Protocols\r\nUpgrade: WebSocket\r\nConnection: Upgrade\r\nSec-WebSocket-Accept: " + sha + "\r\n\r\n");
}

var crypto = require('crypto');

function sha1(value) {
  var sha = crypto.createHash('sha1');
  sha.update(value);
  return sha.digest('base64');
}

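function getRequestVariable(items, name){
  for (var i = 0; i < items.length; i++) {
    // split on the first colon only, as header values may contain colons
    var colon = items[i].indexOf(':');
    // header names are case-insensitive
    if (colon > 0 && items[i].substring(0, colon).toLowerCase() == name.toLowerCase()) {
      return items[i].substring(colon + 1).trim();
    }
  }
}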
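
As a quick sanity check on the hashing step, here is a minimal sketch that runs the same calculation on the sample key given in RFC 6455 (section 1.3), which also states the expected result. The function name computeAccept is my own, purely for illustration.

```javascript
var crypto = require('crypto');

// Hypothetical helper: compute the Sec-WebSocket-Accept value for a key.
function computeAccept(key) {
  return crypto.createHash('sha1')
    .update(key + '258EAFA5-E914-47DA-95CA-C5AB0DC85B11')
    .digest('base64');
}

// Sample key from RFC 6455, section 1.3
console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
// prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If your own handshake code produces a different value for this key, the browser will reject the connection.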

Now that you’ve shaken hands, you’re ready to start talking across your socket. However, the socket data you receive on the server (from the browser) is framed and masked, and can’t simply be read as text. The full format is complicated, but essentially a frame looks like this:

  • The first byte holds the FIN flag and the opcode. For a complete text message it is 129 (and can be ignored if you only handle text).
  • The next byte holds the mask flag and the length of the data. If that length value is 126 or 127, the real length is held in the following 2 or 8 bytes respectively.
  • The next 4 bytes hold values for masks, used to decode the data.
  • The remaining bytes contain the data XORed with the masks.

To decode the data, you must extract the masks, and then apply an XOR with each byte against a mask value (cycling through the 4 masks in turn).

You can use this function to decode the data for you:

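function decodeWebSocket(data){
  // the low 7 bits of the second byte give the length, or a marker (126/127)
  // meaning the length is held in the following 2 or 8 bytes
  var datalength = data[1] & 127;
  var indexFirstMask = 2;
  if (datalength == 126) {
    indexFirstMask = 4;
  } else if (datalength == 127) {
    indexFirstMask = 10;
  }
  var masks = data.slice(indexFirstMask,indexFirstMask + 4);
  var i = indexFirstMask + 4;
  var index = 0;
  var output = "";
  while (i < data.length) {
    // XOR each byte with a mask, cycling through the 4 masks in turn
    // (each byte becomes one character, so this only suits ASCII text)
    output += String.fromCharCode(data[i++] ^ masks[index++ % 4]);
  }
  return output;
}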
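
The decoder above reads to the end of the buffer and skips over the header fields. If you want to see those fields, a slightly fuller sketch might look like the following. It assumes the buffer holds exactly one complete frame, and that any 64-bit length fits in its low 32 bits. The name parseFrame and the shape of the returned object are my own inventions, not part of any API.

```javascript
// Hypothetical helper: parse one complete frame held in a Buffer.
function parseFrame(data) {
  var fin = (data[0] & 128) !== 0;     // top bit of byte 0: final fragment
  var opcode = data[0] & 15;           // low 4 bits: 1 = text, 8 = close
  var masked = (data[1] & 128) !== 0;  // top bit of byte 1: mask flag
  var length = data[1] & 127;          // low 7 bits: length or marker
  var offset = 2;
  if (length === 126) {
    length = data.readUInt16BE(2);     // 16-bit length follows
    offset = 4;
  } else if (length === 127) {
    // 64-bit length follows; assumption: the high 32 bits are zero
    length = data.readUInt32BE(6);
    offset = 10;
  }
  var payload = Buffer.alloc(length);
  if (masked) {
    var masks = data.slice(offset, offset + 4);
    offset += 4;
    for (var i = 0; i < length; i++) {
      payload[i] = data[offset + i] ^ masks[i % 4];
    }
  } else {
    data.copy(payload, 0, offset, offset + length);
  }
  // Decode the bytes as UTF-8 so multi-byte characters survive.
  return {
    fin: fin,
    opcode: opcode,
    masked: masked,
    length: length,
    text: payload.toString('utf8')
  };
}

// Masked "Hello" frame taken from RFC 6455, section 5.7
var sample = Buffer.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]);
console.log(parseFrame(sample).text); // prints Hello
```

Only the declared number of payload bytes is read, and the opcode is available so that close frames are not mistaken for text.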

Sending data to the browser is slightly simpler, as you don’t need to use the masks. However, you still need to set the byte (or bytes) indicating the length.

function encodeWebSocket(bytesRaw){
  // convert the string to UTF-8 bytes, so the length counts bytes, not characters
  var payload = Buffer.from(bytesRaw, 'utf8');
  var bytesFormatted = [];
  // 129 = final fragment, text frame
  bytesFormatted[0] = 129;
  if (payload.length <= 125) {
    bytesFormatted[1] = payload.length;
  } else if (payload.length <= 65535) {
    bytesFormatted[1] = 126;
    bytesFormatted[2] = ( payload.length >> 8 ) & 255;
    bytesFormatted[3] = ( payload.length ) & 255;
  } else {
    bytesFormatted[1] = 127;
    // JavaScript bitwise shifts work on 32-bit values, so the 64-bit length
    // is split into two 32-bit halves before shifting
    var high = Math.floor(payload.length / 4294967296);
    var low = payload.length % 4294967296;
    bytesFormatted[2] = ( high >>> 24 ) & 255;
    bytesFormatted[3] = ( high >>> 16 ) & 255;
    bytesFormatted[4] = ( high >>> 8 ) & 255;
    bytesFormatted[5] = ( high ) & 255;
    bytesFormatted[6] = ( low >>> 24 ) & 255;
    bytesFormatted[7] = ( low >>> 16 ) & 255;
    bytesFormatted[8] = ( low >>> 8 ) & 255;
    bytesFormatted[9] = ( low ) & 255;
  }
  // header bytes followed by the unmasked payload
  return Buffer.concat([Buffer.from(bytesFormatted), payload]);
}
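
The snippets above use socket and data without showing where they come from. One way to wire things up, as a rough sketch only, is a plain TCP server from node’s net module with an "upgraded" flag per connection. The names handshakeResponse, HELLO_FRAME and createGreeterServer are hypothetical, and the sketch assumes the whole HTTP request arrives in the first chunk and that each later chunk starts with a frame header.

```javascript
var net = require('net');
var crypto = require('crypto');

// Build the 101 reply for a raw upgrade request, or null if it has no key.
function handshakeResponse(request) {
  var match = /^Sec-WebSocket-Key:[ \t]*(\S+)/im.exec(request);
  if (!match) {
    return null;
  }
  var accept = crypto.createHash('sha1')
    .update(match[1] + '258EAFA5-E914-47DA-95CA-C5AB0DC85B11')
    .digest('base64');
  return 'HTTP/1.1 101 Switching Protocols\r\n' +
    'Upgrade: websocket\r\n' +
    'Connection: Upgrade\r\n' +
    'Sec-WebSocket-Accept: ' + accept + '\r\n\r\n';
}

// An unmasked text frame carrying "Hello": 129, length 5, then the bytes.
var HELLO_FRAME = Buffer.from([129, 5, 72, 101, 108, 108, 111]);

// Hypothetical helper: one TCP server, one "upgraded" flag per connection.
function createGreeterServer() {
  return net.createServer(function (socket) {
    var upgraded = false;
    socket.on('data', function (data) {
      if (!upgraded) {
        // Assumption: the whole HTTP request arrives in the first chunk.
        var response = handshakeResponse(data.toString());
        if (response === null) {
          socket.end();
          return;
        }
        socket.write(response);
        upgraded = true;
      } else if ((data[0] & 15) === 1) {
        // Opcode 1 is a text frame: answer each one with a greeting.
        socket.write(HELLO_FRAME);
      } else {
        // Anything else (close, ping, binary) just ends this sketch.
        socket.end();
      }
    });
    socket.on('error', function () { /* ignore resets in this sketch */ });
  });
}

// To try it: createGreeterServer().listen(1337);
```

With the server listening on port 1337, the browser code from the start of the post should log "Hello" in reply to each message it sends.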

Simple!
