Network Working Group                                        S. Ferguson
Internet-Draft                                                    Caucho
Intended status: Standards Track                            May 16, 2010
Expires: November 17, 2010


The WebSockets Protocol
draft-ferg-hybi-websockets-latest

Abstract

The WebSocket protocol enables a bidirectional stream of messages between a client and a server. Messages consist of a sequence of binary frames over TCP. The protocol uses HTTP for its handshake, upgrading to the bidirectional binary frames defined in this document.

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

This Internet-Draft will expire on November 17, 2010.



Table of Contents

1.  Introduction
    1.1.  Requirements
    1.2.  Protocol Overview
2.  General Requirements
    2.1.  Requirements
    2.2.  Syntax Notation
    2.3.  Terminology
    2.4.  Basic Rules
3.  Client Handshake
    3.1.  Client handshake variables
    3.2.  Client Handshake
        3.2.1.  Client Grammar
4.  Server Handshake
    4.1.  Server handshake variables
    4.2.  Server Handshake
        4.2.1.  Server Grammar
5.  Stream Protocol
    5.1.  Stream syntax
    5.2.  Frame syntax
    5.3.  Stream Close
    5.4.  Connection Keepalive
6.  Control Frames
    6.1.  NOP (op=0)
    6.2.  CLOSE (op=1)
    6.3.  HELLO (op=2)
    6.4.  ERROR (op=3)
    6.5.  HEADERS (op=4)
    6.6.  PING-REQUEST (op=5)
    6.7.  PING-RESPONSE (op=6)
7.  HELLO Headers
    7.1.  url
    7.2.  origin
    7.3.  protocol
    7.4.  status
8.  Security Considerations
    8.1.  HTTP
    8.2.  Browser Scripting attacks
9.  Acknowledgements
10.  Normative References
§  Author's Address
§  Intellectual Property and Copyright Statements




1.  Introduction

NOTE: this section is copied verbatim from the requirements document [I.D.loreto-hybi-requirements].

HTTP [RFC2616] is a client/server protocol, where the HTTP servers store the data and provide it when it is requested by clients. When used to retrieve data from an HTTP server, the client sends HTTP requests to the server, and the server returns the requested data in HTTP responses. So the client has to poll the server continuously in order to receive new data.

Recently, techniques that enable bidirectional communication over HTTP have become more pervasive. Those techniques reduce the need to poll the server continuously by using HTTP hanging requests and multiple connections between the client and the server [I-D.loreto-http-bidirectional].

The goal of HyBi is to provide an efficient and clean two-way communication channel between client and server.

The communication channel will:



1.1.  Requirements

NOTE: requirements are from the requirements document [I.D.loreto-hybi-requirements].

  1. It MUST be possible to send a message when the total size is either unknown or exceeds a fixed buffer size.
  2. The WebSocket server MUST have the ability to send arbitrary binary content to the client on the established communication channel, in the form of ordered discrete blocks.
  3. The WebSocket server MUST have the ability to send arbitrary text content to the client on the established communication channel, in the form of ordered discrete blocks.
  4. Textual data MUST be encoded as UTF-8.
  5. The WebSocket protocol MUST allow HTTP and WebSocket connections to be served from the same port.
  6. The WebSocket protocol MUST provide for graceful close of an active WebSocket connection on request from the user Application.
  7. The WebSocket client MUST be able to request the server, during the handshake, to use a specific WebSocket sub-protocol.
  8. WebSocket should be designed to be robust against cross-protocol attacks.



1.2.  Protocol Overview

This section is non-normative.

A WebSockets client connects either to a WebSockets port with a plain TCP connection [Ed: port 880 looks like it's available] or using TLS to a possibly-shared HTTP port 443.

When using TLS, the client will use [I.D.agl-tls-nextprotoneg] to select WebSockets as the NextProtocol, letting WebSockets share the same TCP port with existing HTTP servers.

The WebSocket protocol begins with a handshake using a WebSocket HELLO control frame from the client and a HELLO control frame from the server. After a peer has sent its hello, it may send messages and control frames until the final CLOSE control frame.

The client handshake is a HELLO control frame that looks like the following:

  %x80.02.00.86 WebSocket/1.0
  url: ws://example.com:880/sample/resource
  origin: http://example.com/launchpage.php
  protocol: tictactoe.example.com/1.0
  <Stream data follows>

The server handshake is a HELLO control frame that looks like the following:

  %x80.02.00.32 WebSocket/1.0
  protocol: tictactoe.example.com/1.0
  <Stream data follows>

The bidirectional stream consists of a sequence of data frames combined into application messages, and control frames used to manage the connection itself.

  Stream  = *( Message / control-frame )
  Message = *( non-final-frame ) final-frame

A typical message might consist of a single data frame encoding a text message using UTF-8.

  %x00.00.00.0C Hello, world

A long message can be broken into multiple frames, where the first frame signals more data is available.

  %x40.00.00.06 Hello,
  %x00.00.00.06  world

Control frames are short WebSocket-specific frames with an 8-bit opcode used to control the connection. The following is a PING-REQUEST to test the liveness of the connection.

  %x80.05.00.00

The 32-bit data frame header has the high bit clear, a "more" flag, 2 reserved bits, and a 28-bit payload length.

  +---+------+------+------------+
  | 0 | M(1) | X(2) | length(28) |
  +---+------+------+------------+

The 32-bit control frame header has the high bit set, 7 reserved bits, an 8-bit opcode, and a 16-bit payload length.

  +---+------+-------+------------+
  | 1 | X(7) | op(8) | length(16) |
  +---+------+-------+------------+

The client and server will send application messages asynchronously until the end of the stream. Each will close the stream with a CLOSE control frame.

Clients and servers may use the PING-REQUEST control frame to verify that a connection is still valid, which may be needed when network routers drop connections silently.



2.  General Requirements



2.1.  Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

An implementation is not compliant if it fails to satisfy one or more of the MUST or REQUIRED level requirements for the protocols it implements. An implementation that satisfies all the MUST or REQUIRED level and all the SHOULD level requirements for its protocols is said to be "unconditionally compliant"; one that satisfies all the MUST level requirements but not all the SHOULD level requirements for its protocols is said to be "conditionally compliant."



2.2.  Syntax Notation

This specification uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234].

The following core rules are included by reference, as defined in [RFC5234], Appendix B.1: LF (line feed), SP (space),



2.3.  Terminology

This specification uses the following HyBi-related terms:

connection: A transport layer virtual circuit established between a client and a server for the purpose of communication.

control-frame: A frame used to control connection behavior outside of the application data stream.

frame: The basic unit of WebSocket communication, consisting of a structured sequence of octets matching the syntax defined in the actual protocol and transmitted on the established communication channel.

message: A user message; a block of related data with identified boundaries.

origin server: The server on which a given resource resides or is to be created.



2.4.  Basic Rules

The following URI definitions come from [RFC3986].

  absolute-URI  = <absolute-URI, defined in [RFC3986], Section 4.3>
  relative-part = <relative-part, defined in [RFC3986], Section 4.2>
  authority     = <authority, defined in [RFC3986], Section 3.2>
  port          = <port, defined in [RFC3986], Section 3.2.3>
  query         = <query, defined in [RFC3986], Section 3.4>
  uri-host      = <host, defined in [RFC3986], Section 3.2.2>

The following is the format of a WebSocket URI.

     ws-URI = "ws:" "//" uri-host ":" port relative-part [ "?" query ]

The following is the format of a WebSocket TLS URI.

     wss-URI = "wss:" "//" uri-host ":" port relative-part [ "?" query ]
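
The following non-normative sketch shows one way a client might split a WebSocket URI into the values it needs for the handshake, using Python's standard urllib.parse module. The function name parse_ws_uri and the shape of its return value are assumptions made for this example; they are not defined by this specification. Because the grammar above requires an explicit port, the sketch rejects URIs without one.

```python
from urllib.parse import urlsplit


def parse_ws_uri(uri):
    """Split a ws:/wss: URI into (secure, host, port, resource).

    Hypothetical helper: the name and return shape are illustrative only.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError("not a WebSocket URI: %r" % uri)
    if parts.hostname is None or parts.port is None:
        # The URI grammar above requires both a host and a port.
        raise ValueError("host and port are required: %r" % uri)
    resource = parts.path or "/"
    if parts.query:
        resource += "?" + parts.query
    # "wss:" selects TLS, "ws:" selects a plain TCP connection.
    return parts.scheme == "wss", parts.hostname, parts.port, resource
```

For the URI used in the examples of this document, parse_ws_uri("ws://example.com:880/sample/resource") yields a false secure flag, the host "example.com", the port 880, and the resource "/sample/resource".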


3.  Client Handshake



3.1.  Client handshake variables

Before establishing a connection, the client will gather the following data, typically from a WebSocket URL or client API call: /secure/, /host/, /port/, /resource/ and an optional /protocol/.

/secure/ is a flag indicating whether TLS will be used or not. The WebSocket URL uses the "wss:" scheme to indicate TLS, and "ws:" to indicate non-TLS.

/url/ is the absolute URL of the server resource. For WebSockets, the URL is constructed from the virtual host, the port, and the resource name.

/protocol/ is the application protocol name. If the application does not set a protocol, use the string "text".

/origin/ is calculated from the server's HTTP resource that initiated the WebSockets connection, for browser clients. [Ed: Non-browser clients will use ??]



3.2.  Client Handshake

After gathering the handshake information above, the client initiates the handshake. If the handshake fails at any step, the client MUST close the connection.

The client send portion of the handshake proceeds as follows:

  1. The client MUST wait for any other handshake to the same server identified by /host/ and /port/ to complete or fail, i.e. the client MUST serialize handshakes to a server.
  2. If /secure/ is true, the client initiates a TLS handshake. If [I.D.agl-tls-nextprotoneg] is available, the client MUST set "NextProtocol" to "WebSocket/1.0".
  3. The client sends the HELLO control frame with "WebSocket/1.0" as the version. The client MUST send the following HELLO headers:
    1. A "url" header with the resource URL. The URL must be an absolute URL.
    2. An "origin" header with /origin/ as its value.
    3. A "protocol" header with /protocol/ as its value.
    4. [Ed: we could add a "nonce" to handle certain replay attacks.]
  4. The client may send additional HELLO headers.
  5. After sending the HELLO, the client may send data and control frames even before receiving the server's response.
  6. The first data from the server MUST be a HELLO control frame from the server. The client MUST close the connection if it detects any errors in the HELLO control frame. The server's HELLO frame must satisfy the following:
    1. The version MUST be "WebSocket/1.0".
    2. The "status" header MUST exist with value "200".
    3. The "origin" header MUST exist with value /origin/.
    4. The "protocol" header MUST exist with value /protocol/.
    5. [Ed: we could add a "digest" here with a digest of a nonce.]
  7. The client sends and receives messages and control frames following the stream protocol.
  8. The client may send a CLOSE control frame at any time. After sending the CLOSE, it MUST NOT send any more data.
  9. When the client receives a CLOSE control frame, it MUST stop reading from the stream.
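
The checks in step 6 can be sketched as follows. This example is non-normative: the function names parse_hello_payload and check_server_hello are assumptions of this sketch, and it takes the HELLO payload as bytes with the 4-byte control frame header already removed. It assumes each header line has the form "name: value" terminated by LF, as in the HELLO examples in this document.

```python
def parse_hello_payload(payload):
    """Return (version, headers) from a HELLO payload given as bytes."""
    lines = payload.decode("utf-8").split("\n")
    if lines[-1] == "":
        lines.pop()  # drop the empty string after the final LF
    if not lines:
        raise ValueError("empty HELLO payload")
    version, headers = lines[0], {}
    for line in lines[1:]:
        name, sep, value = line.partition(": ")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        headers[name] = value
    return version, headers


def check_server_hello(payload, origin, protocol):
    """Raise ValueError unless the server HELLO passes the checks in step 6."""
    version, headers = parse_hello_payload(payload)
    if version != "WebSocket/1.0":
        raise ValueError("unexpected version: %r" % version)
    expected = {"status": "200", "origin": origin, "protocol": protocol}
    for name, value in expected.items():
        if headers.get(name) != value:
            raise ValueError("bad or missing %r header" % name)
    return headers
```

A client that catches ValueError from check_server_hello would then close the connection, as the steps above require.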



3.2.1.  Client Grammar

The full syntax for the data sent by the WebSocket client is as follows:

   Request        = HELLO
                    Stream
                    CLOSE

   HELLO          = control-frame
                  ; where OP = 2 (HELLO)

   CLOSE          = control-frame
                  ; where OP = 1 (CLOSE)

   HELLO-payload  = "WebSocket/1.0" LF
                    "url:" SP /url/ LF
                    "origin:" SP /origin/ LF
                    "protocol:" SP /protocol/ LF
                    * ( header ":" SP value LF )
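
As a non-normative illustration, the HELLO-payload grammar can be serialized as shown below. The helper names control_frame and client_hello are assumptions of this sketch. It uses opcode 2 for HELLO, matching the HELLO (op=2) section and the %x80.02 bytes of the example frames, and builds the 32-bit control header from 0x80, the opcode, and a 16-bit big endian length.

```python
import struct

OP_HELLO = 2  # HELLO opcode, per the Control Frames section


def control_frame(op, payload=b""):
    """Prefix a payload with the control header: 0x80, op, 16-bit length."""
    if len(payload) > 0xFFFF:
        raise ValueError("control payload exceeds the 16-bit length field")
    return struct.pack("!BBH", 0x80, op, len(payload)) + payload


def client_hello(url, origin, protocol="text", extra=()):
    """Build the client's HELLO frame; `extra` is an iterable of (name, value)."""
    lines = [
        "WebSocket/1.0",
        "url: " + url,
        "origin: " + origin,
        "protocol: " + protocol,
    ]
    lines += ["%s: %s" % (name, value) for name, value in extra]
    # Every line, including the last, is terminated by a single LF.
    payload = "".join(line + "\n" for line in lines).encode("utf-8")
    return control_frame(OP_HELLO, payload)
```

Called with the url, origin, and protocol values from the example in the Protocol Overview, client_hello produces a 134-byte payload, so the frame begins with the header %x80.02.00.86 shown there.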


4.  Server Handshake



4.1.  Server handshake variables

Before establishing a connection, the server will gather the following data for a WebSocket resource: /secure/, /url/, /protocol/, and an optional /origin/.

  1. /secure/ is a flag indicating whether TLS [RFC2246] will be used or not.
  2. /url/ is the absolute URL of the WebSocket resource, constructed from the WebSocket scheme and the uri-host, port, and relative-part of [RFC3986].
  3. /origin/ is the URL of the source page that initiates the WebSockets connection for browser clients, for example an HTML page that launches the client JavaScript. If the server does not launch the client in this fashion, /origin/ is unset.
  4. /protocol/ is the application protocol name. If not specified by the application, use "text".



4.2.  Server Handshake

After gathering the handshake information above, the server initiates the handshake. If the handshake fails at any step, the server MUST reject the request with an ERROR control frame.

  1. If the connection is a TLS connection, and the "NextProtocol" message from [I.D.agl-tls-nextprotoneg] is set, its value MUST be "WebSocket/1.0".
  2. The server receives and parses the client's HELLO control frame. If the HELLO does not satisfy the following requirements, the server MUST reject the request.
    1. The "url" header MUST match /url/.
    2. The "origin" MUST match /origin/ if set.
    3. The "protocol" MUST match /protocol/ if set.
  3. If the client request is valid, the server sends its HELLO control frame with the following values:
    1. The version is "WebSocket/1.0".
    2. The "status" header is "200".
    3. The "origin" header is the origin header copied from the client's HELLO frame.
    4. "protocol" is set to /protocol/.
    5. The server may send additional HELLO headers.
    6. [Ed: here the server could calculate a "digest".]
  4. The handshake is now complete.
  5. The server now sends and receives messages and control frames as defined by the Stream syntax.
  6. The server may send a CLOSE control frame at any time. It MUST NOT send any frames after the CLOSE frame.
  7. When the server receives a CLOSE control frame from the client, it MUST stop reading, and close the connection.
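
Steps 2 and 3 can be sketched as a single decision function. This example is non-normative. The name server_reply is an assumption, the client's HELLO headers are assumed to have been parsed into a dict already, and the "status: 404" value in the ERROR frame is borrowed from the ERROR example in this document rather than mandated for every failure.

```python
import struct

OP_HELLO, OP_ERROR = 2, 3  # opcodes from the Control Frames section


def control_frame(op, lines):
    """Build a control frame whose payload is LF-terminated text lines."""
    payload = "".join(line + "\n" for line in lines).encode("utf-8")
    return struct.pack("!BBH", 0x80, op, len(payload)) + payload


def server_reply(client_headers, url, protocol="text", origin=None):
    """Return the server's HELLO frame, or an ERROR frame if a check fails.

    `origin` is None when the server's /origin/ is unset.
    """
    ok = (
        client_headers.get("url") == url
        and client_headers.get("protocol") == protocol
        and (origin is None or client_headers.get("origin") == origin)
    )
    if not ok:
        # Assumption: reuse the 404 status shown in the ERROR example.
        return control_frame(OP_ERROR, ["WebSocket/1.0", "status: 404"])
    return control_frame(OP_HELLO, [
        "WebSocket/1.0",
        "status: 200",
        # The origin is copied from the client's HELLO frame.
        "origin: " + client_headers.get("origin", ""),
        "protocol: " + protocol,
    ])
```

The second byte of the returned frame is the opcode, so a caller can tell whether the handshake was accepted (2) or rejected (3) before writing the bytes to the connection.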



4.2.1.  Server Grammar

   Response       = HELLO
                    Stream
                    CLOSE

   HELLO          = control-frame
                  ; where OP = 2 (HELLO)

   CLOSE          = control-frame
                  ; where OP = 1 (CLOSE)

   HELLO-payload  = "WebSocket/1.0" LF
                    "status: 200" LF
                    "origin:" SP /origin/ LF
                    "protocol:" SP /protocol/ LF
                    * ( header ":" SP value LF )


5.  Stream Protocol



5.1.  Stream syntax

Once the handshake has been established, the message stream is symmetrical. Each side sends and reads a sequence of data messages split into frames, interleaved with any control frames.

A stream is a sequence of binary data messages, where each data message is a sequence of partial data frames. Control frames may appear between data messages to control the connection.

If the application message data is unicode text, as in a JavaScript browser, the sender MUST encode the text as UTF-8.

Because the sending side may use a fixed-sized buffer, it may split a message into any number of non-final data frames followed by the final data frame. Short messages will fit into a single final-frame.

The syntax of each frame is defined in the next section.

When closing a connection, the client and server MUST send a CLOSE control frame. No bytes may be sent or read after the CLOSE control frame.

A receiver MUST close the connection if it detects any errors while reading, including any illegal frame syntax, too-long frame lengths, or any unknown control frame. The client and server MUST NOT attempt to recover from frame errors.

The stream syntax is defined by the following grammar. The frame section below defines the grammar of the frames themselves.

   Stream               = *( Message / control-frame) Close
   Message              = *( non-final-data-frame ) final-data-frame

   Close                = control-frame
                        ; where control-op=CLOSE
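
The Message rule can be illustrated with a sender that splits a message to fit a fixed buffer and a receiver that joins the frames again. This sketch is non-normative; the names split_message and join_message and the max_payload parameter are assumptions of the example. It relies on the data frame header layout given in the frame syntax section, where the 0x40 bit of the first byte marks a non-final frame.

```python
import struct

MORE_FLAG = 0x40000000  # M bit: more frames of this message follow


def split_message(data, max_payload):
    """Yield the frames for one message, at most `max_payload` bytes each."""
    if max_payload < 1:
        raise ValueError("max_payload must be positive")
    chunks = [data[i:i + max_payload]
              for i in range(0, len(data), max_payload)] or [b""]
    for index, chunk in enumerate(chunks):
        more = MORE_FLAG if index < len(chunks) - 1 else 0
        yield struct.pack("!I", more | len(chunk)) + chunk


def join_message(frames):
    """Concatenate the payloads of one message's frames, checking the M bits."""
    frames = list(frames)
    parts = []
    for index, frame in enumerate(frames):
        (header,) = struct.unpack("!I", frame[:4])
        is_last = index == len(frames) - 1
        if bool(header & MORE_FLAG) == is_last:
            raise ValueError("M bit does not match frame position")
        parts.append(frame[4:4 + (header & 0x0FFFFFFF)])
    return b"".join(parts)
```

Splitting the UTF-8 bytes of "Hello, world" with a max_payload of 6 gives the two frames shown in the Protocol Overview, %x40.00.00.06 "Hello," followed by %x00.00.00.06 " world".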


5.2.  Frame syntax

A frame consists of a fixed 32-bit header, which encodes the frame type and the payload length, followed by the frame data.

   final-data-frame     = final-header N*OCTET
                        ; where N = the length encoded in the header

   non-final-data-frame = non-final-header N*OCTET
                        ; where N = the length encoded in the header

   control-frame        = control-header N*OCTET
                        ; where N = the length encoded in the header

   final-header         = %x00-0F 3OCTET
                        ; where the last 28 bits encode the
                        ; length N as a big endian integer

   non-final-header     = %x40-4F 3OCTET
                        ; where the last 28 bits encode the
                        ; length N as a big endian integer

   control-header       = %x80 OP 2OCTET
                        ; where the last 16 bits encode the
                        ; length N as a big endian integer

   OP                   = OCTET

The 32-bit data frame header has the high bit clear, a "more" flag, 2 reserved bits, and a 28-bit payload length.

  +---+------+------+------------+
  | 0 | M(1) | X(2) | length(28) |
  +---+------+------+------------+

The 32-bit control frame header has the high bit set, 7 reserved bits, an 8-bit opcode, and a 16-bit payload length.

  +---+------+---------+------------+
  | 1 | X(7) | code(8) | length(16) |
  +---+------+---------+------------+
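
The two header layouts can be packed with Python's struct module, as in the following non-normative sketch. The function names data_header and control_header and the constant names are assumptions of this example; the reserved bits are always written as zero.

```python
import struct

MORE_FLAG = 0x40000000     # M bit of a data frame header
MAX_DATA_LEN = 0x0FFFFFFF  # largest 28-bit payload length
MAX_CTRL_LEN = 0xFFFF      # largest 16-bit payload length


def data_header(length, more=False):
    """Pack a data frame header: high bit clear, M flag, 28-bit length."""
    if not 0 <= length <= MAX_DATA_LEN:
        raise ValueError("data length out of range")
    return struct.pack("!I", (MORE_FLAG if more else 0) | length)


def control_header(op, length=0):
    """Pack a control frame header: high bit set, 8-bit op, 16-bit length."""
    if not 0 <= op <= 0xFF or not 0 <= length <= MAX_CTRL_LEN:
        raise ValueError("opcode or length out of range")
    return struct.pack("!I", 0x80000000 | (op << 16) | length)
```

For example, data_header(12) gives %x00.00.00.0C, data_header(6, more=True) gives %x40.00.00.06, and control_header(5) gives the PING-REQUEST bytes %x80.05.00.00.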

The high bit of the initial byte distinguishes a control frame (1) from a data frame (0).

The following non-normative pseudo-code shows parsing of the frame.

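  header = read32(); // read the 32-bit header in network byte order

  is_control = (header & 0x80000000) != 0; // high bit set: control frame

  if (is_control)  {
    // control frame: 7 reserved bits, 8-bit opcode, 16-bit length

    control_op = (header >> 16) & 0xff;
    length = (header & 0x0000ffff);
  }
  else {
    // data frame: "more" flag, 2 reserved bits, 28-bit length

    is_final = (header & 0x40000000) == 0; // M clear: last frame of message
    length = (header & 0x0fffffff);
  }

  read(buffer, 0, length); // read the payload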
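
A runnable counterpart to the pseudo-code might look like the following. It is a non-normative sketch that parses from an in-memory byte string instead of a socket; the name parse_frame and the shape of its return value are assumptions of this example. The opcode is taken from bits 16 to 23 of the header, and a data frame is final when its M bit is clear.

```python
import struct


def parse_frame(buf):
    """Parse one frame from the start of `buf`.

    Returns (kind, info, payload, rest). `kind` is "control" or "data".
    `info` is the opcode for a control frame, or True for a final data
    frame and False for a non-final one. `rest` is the unparsed tail.
    """
    if len(buf) < 4:
        raise ValueError("incomplete header")
    (header,) = struct.unpack("!I", buf[:4])  # network byte order
    if header & 0x80000000:
        kind, info = "control", (header >> 16) & 0xFF
        length = header & 0xFFFF
    else:
        kind, info = "data", (header & 0x40000000) == 0
        length = header & 0x0FFFFFFF
    if len(buf) < 4 + length:
        raise ValueError("incomplete payload")
    return kind, info, buf[4:4 + length], buf[4 + length:]
```

Calling parse_frame repeatedly on the returned rest value walks through a stream one frame at a time.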

Control frame OP codes are defined by WebSockets, and may not be used by applications. If a client or server receives a control OP not defined by WebSockets, it MUST close the connection.

Control messages allow the client and server to manage the stream behavior, like graceful close, keepalive messages, and even allowing for multiplexing extensions.



5.3.  Stream Close

Because WebSockets needs to distinguish an intentional close from a dropped connection, the client or server MUST send a CLOSE control frame at the end of the stream. Either side may choose to close the connection gracefully at any time.

When the client or server wishes to close the stream gracefully, it MUST send a CLOSE control frame. After sending the CLOSE, no other data may be sent and the TCP socket must also be closed.
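
The distinction between an intentional close and a dropped connection can be sketched with a small amount of per-connection state. This example is non-normative; the class name CloseTracker and its method names are assumptions, and the return values "graceful" and "dropped" are labels chosen for this sketch.

```python
CLOSE_FRAME = b"\x80\x01\x00\x00"  # CLOSE (op=1), no payload


class CloseTracker:
    """Track CLOSE frames in both directions for one connection."""

    def __init__(self):
        self.sent_close = False
        self.received_close = False

    def before_send(self, frame):
        """Call before writing a frame; rejects anything after our CLOSE."""
        if self.sent_close:
            raise RuntimeError("no data may be sent after CLOSE")
        if frame == CLOSE_FRAME:
            self.sent_close = True

    def on_control(self, op):
        """Record a received control opcode."""
        if op == 1:
            self.received_close = True

    def on_eof(self):
        """Classify the end of the TCP stream."""
        return "graceful" if self.received_close else "dropped"
```

An application could use the "dropped" result to decide whether to reconnect, since recovery of the original stream is left to the application or sub-protocol.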



5.4.  Connection Keepalive

Because a TCP connection may drop without notification to either client or server, either by network failure or by TCP router timeouts, the WebSocket protocol defines a pair of keepalive control frames. By defining a pair of control frames, WebSockets avoids circular ping cascades.

Either the client or server may send a PING-REQUEST control frame to determine if the connection is still alive. The peer MUST respond with a PING-RESPONSE.

If the PING-REQUEST sending peer does not receive a response within a reasonable time, it may close the connection. The client may establish a new connection, but recovery of the original stream is not defined by WebSockets, and must be defined by the application or sub-protocol.
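
The keepalive exchange can be sketched as follows. This example is non-normative: the class name Keepalive, its method names, and the timeout parameter (in seconds) are assumptions, and "a reasonable time" is left for the application to choose. Because a PING-RESPONSE never triggers a reply, the two peers cannot fall into a ping cascade.

```python
import time

PING_REQUEST = b"\x80\x05\x00\x00"   # op=5, no payload
PING_RESPONSE = b"\x80\x06\x00\x00"  # op=6, no payload


class Keepalive:
    """Track one outstanding PING-REQUEST; `timeout` is in seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.sent_at = None

    def ping(self):
        """Return the bytes to send and start the response timer."""
        self.sent_at = self.clock()
        return PING_REQUEST

    def on_control(self, op):
        """Return the bytes to send in reply to a received opcode, if any."""
        if op == 5:
            return PING_RESPONSE  # a PING-REQUEST MUST be answered
        if op == 6:
            self.sent_at = None   # the peer answered; stop the timer
        return b""

    def expired(self):
        """True if a PING-REQUEST has gone unanswered for too long."""
        return (self.sent_at is not None
                and self.clock() - self.sent_at > self.timeout)
```

The clock argument exists only so that the timer can be driven by a test; by default the sketch uses time.monotonic.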



6.  Control Frames

Each control frame has an opcode in the range %x00-7F, followed by any control data for the opcode. [Ed: the code %x7F is reserved to allow for opcodes beyond 127, where the full opcode is encoded as the first bytes of the payload. In practice, this will never be defined.]

Except for the ops defined here (0-6), the codes are reserved by the specification. Applications MUST NOT define their own control frames.
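
The opcodes defined in the following subsections can be collected into a table, as in this non-normative sketch. The names Op and classify_op are assumptions of the example. A receiver that gets None back has seen an opcode not defined by WebSockets and would close the connection.

```python
from enum import IntEnum


class Op(IntEnum):
    """Control frame opcodes defined by this document."""
    NOP = 0
    CLOSE = 1
    HELLO = 2
    ERROR = 3
    HEADERS = 4
    PING_REQUEST = 5
    PING_RESPONSE = 6


def classify_op(code):
    """Return the Op for a received opcode, or None if it is undefined."""
    try:
        return Op(code)
    except ValueError:
        return None
```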



6.1.  NOP (op=0)

control-op = 0 is the no-operation control frame.

The NOP control frame has no payload.

The NOP control frame is the following 4 bytes:

    %x80.00.00.00


6.2.  CLOSE (op=1)

control-op = 1 is the stream close control frame.

Either the server or client may send a CLOSE at any time. After sending the CLOSE, the sender MUST NOT send any data and the receiver MUST NOT read any further data; the stream MUST be closed.

The CLOSE control frame has no payload.

The CLOSE control frame is the following four bytes:

    %x80.01.00.00


6.3.  HELLO (op=2)

control-op = 2 is the required initial control frame. The HELLO control frame MUST be the first data for both the client and server streams.

The HELLO payload consists of the WebSockets version followed by header: value pairs encoded in UTF-8. The header names must be lower case us-ascii.

The HELLO payload grammar, encoded as UTF-8:

  HELLO-payload = "WebSocket/1.0" LF
                  * ( header ":" SP value LF )

  header        = * ( %x61-7A / %x30-39 / "-" )
                ; "a"-"z", "0"-"9", and "-"

  value         = * ( value-char )

  value-char    = %x0020-10FFFF
  SP            = %x0020
  LF            = %x000A

A client HELLO control frame looks like the following:

  %x80.02.00.86 WebSocket/1.0
  url: ws://example.com:880/sample/resource
  origin: http://example.com/launchpage.php
  protocol: tictactoe.example.com/1.0


6.4.  ERROR (op=3)

control-op = 3 is an error frame informing a peer that the connection is being closed due to an error. In particular, a server will return an error frame in response to a failed client handshake.

The ERROR payload consists of the WebSockets version, followed by header: value pairs.

The ERROR payload grammar, encoded as UTF-8:

  ERROR-payload = "WebSocket/1.0" LF
                  * ( header ":" SP value LF )

  header        = * ( %x61-7A / %x30-39 / "-" )
                ; "a"-"z", "0"-"9", and "-"

  value         = * ( value-char )

  value-char    = %x0020-10FFFF
  SP            = %x0020
  LF            = %x000A

A server ERROR control frame looks like the following:

  %x80.03.00.42 WebSocket/1.0
                status: 404
                message: the resource is not available.
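
The length field of the example can be checked by building the frame, as in this non-normative sketch. The function name error_frame is an assumption, and the sketch assumes each payload line is terminated by a single LF. The payload is then 14 bytes for the version line, 12 for the status line, and 40 for the message line, 66 bytes (%x42) in total.

```python
import struct

OP_ERROR = 3  # ERROR opcode


def error_frame(status, message):
    """Build an ERROR control frame with "status" and "message" headers."""
    payload = (
        "WebSocket/1.0\n"
        "status: %s\n"
        "message: %s\n" % (status, message)
    ).encode("utf-8")
    # Control header: 0x80, opcode, 16-bit big endian payload length.
    return struct.pack("!BBH", 0x80, OP_ERROR, len(payload)) + payload
```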


6.5.  HEADERS (op=4)

The HEADERS control frame allows for dynamic renegotiation of connection values like heartbeat timeouts, flow-control windows, etc.

HEADERS consists of a list of "header: value" pairs, like the HELLO frame.

The header payload is encoded as UTF-8.

The header names are restricted to US-ASCII lower case alphanumeric characters, plus the "-" character.

The HEADERS payload grammar, encoded as UTF-8:

  HEADERS-payload = * ( header ":" SP value LF )

  header          = * ( %x61-7A / %x30-39 / "-" )
                  ; "a"-"z", "0"-"9", and "-"

  value           = * ( value-char )

  value-char      = %x0020-10FFFF
  SP              = %x0020
  LF              = %x000A

The HEADERS control frame might look like:

    %x80.04.00.22 heartbeat: 120s
                  buffer-max: 65536
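
A receiver might parse a HEADERS payload as in the following non-normative sketch. The names parse_headers and HEADER_NAME are assumptions of the example. It enforces the restriction stated above that header names are lower case US-ASCII alphanumerics plus "-", and it assumes each "header: value" pair is terminated by LF.

```python
import re

# Lower case US-ASCII letters, digits, and "-", as the text above requires.
HEADER_NAME = re.compile(r"[a-z0-9-]+\Z")


def parse_headers(payload):
    """Parse a HEADERS payload (bytes) into a dict of name -> value."""
    text = payload.decode("utf-8")
    if text and not text.endswith("\n"):
        raise ValueError("payload must end with LF")
    headers = {}
    for line in text.split("\n")[:-1]:
        name, sep, value = line.partition(": ")
        if not sep or not HEADER_NAME.match(name):
            raise ValueError("malformed header line: %r" % line)
        headers[name] = value
    return headers
```

With lower case names, a 34-byte (%x22) payload carrying a heartbeat of "120s" and a buffer-max of "65536" parses into a two-entry dict; a name containing an upper case letter is rejected.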


6.6.  PING-REQUEST (op=5)

The PING-REQUEST may be sent by either the client or the server to check if the connection is still valid. The receiving end MUST respond with a PING-RESPONSE control frame.

Because the WebSocket connection is long-lived, intermediaries like home routers might close idle connections without notifying either end. Clients and servers may use the PING-REQUEST to check the status of the connection.

It is recommended that clients and servers do not send PING-REQUEST unless specifically configured to do so by the application.

PING-REQUEST does not have a payload.

The PING-REQUEST control frame is the following 4 bytes:

    %x80.05.00.00

[Ed: the working group has also discussed asymmetrical heartbeats as an alternative to the ping-style. For the heartbeat to work, the timeouts would need to be negotiated in HELLO or HEADERS.]



6.7.  PING-RESPONSE (op=6)

The PING-RESPONSE is a response control frame to the PING-REQUEST. When a peer receives a PING-REQUEST control frame, it MUST send a PING-RESPONSE, to let the other end know the connection is still available.

PING-RESPONSE does not have a payload.

The PING-RESPONSE control frame is 4 bytes as follows:

    %x80.06.00.00

[Ed: the working group has also discussed asymmetrical heartbeats as an alternative to the ping-style.]



7.  HELLO Headers

The following describes the HELLO headers used during the initial handshake. HELLO values are UTF-8 strings and the header names are lower-case US-ASCII alphanumeric characters plus the "-" character.

The client required headers are "url", "origin", and "protocol". Other HELLO headers may be used, but are not defined or mandated by the WebSockets specification.

The server required headers are "status", "origin", and "protocol". Other HELLO headers may be used, but are not defined or mandated by the WebSockets specification.



7.1.  url

The "url" must be a valid [RFC3986] absolute-URI. In particular, the host must be defined.

The URL value is encoded as UTF-8.



7.2.  origin

The client MUST send an "origin" header during the handshake to inform the server of the source HTTP page.

The server may use the "origin" header to reject connections from unknown origins, preventing certain kinds of browser hijacking scenarios.

[Ed: must non-browser clients send a dummy "origin" even though the concept is meaningless?]



7.3.  protocol

"protocol" is a required header used to validate the application protocol built on top of WebSockets.

If the server does not understand the protocol, it MUST reject the connection.

If the server does understand the protocol, it must return the same protocol value that the client sent in its HELLO.

Although the protocol value is an arbitrary header-value, it is recommended to use unique names with a version to avoid conflicts, such as "tictactoe.example.com/1.0".

The "text" protocol is reserved by WebSockets. The payload for the "text" protocol MUST be unicode characters encoded in UTF-8.



7.4.  status

"status" is a required header in the server's HELLO frame reporting the result of the handshake.

For WebSockets, the value must be "200" for a successful connection.



8.  Security Considerations

This section is meant to inform application developers and users of security issues related to WebSockets. This list is unlikely to be complete.



8.1.  HTTP

Many, if not most, of the security issues related to HTTP are also present in WebSockets, because WebSockets uses HTTP for its handshake, and because many WebSockets clients and servers will also be HTTP clients and servers.



8.2.  Browser Scripting attacks

Compromised HTTP sites or improperly designed HTTP applications can allow arbitrary JavaScript code to execute on a browser. The hijacked script might attempt to use an HTTP request for a WebSocket server, or might attempt to use a WebSocket request for an HTTP server.

The script may also use a WebSocket request for an entirely different server than the requesting page. The risk can be minimized by servers checking the "origin" header, but this may not be sufficient.

Hijacked clients may also attempt to open a WebSocket connection using an HTTP/XML connection from the browser, attempting to spoof a valid WebSocket connection. WebSocket servers should be written to minimize these risks.

Hijacked clients may open a WebSocket connection to a non-WebSocket HTTP service.



9.  Acknowledgements

This specification draft is substantially derived from Ian Hickson's "The WebSockets Protocol" at http://www.whatwg.org/specs/web-socket-protocol/.

This draft also incorporates discussions from the HyBi mailing list.



10. Normative References

[HTML] Hickson, I., “HTML,” May 2010.
[I.D.abarth-origin] Barth, A., Jackson, C., and I. Hickson, “The HTTP Origin Header,” September 2009.
[I.D.agl-tls-nextprotoneg] Langley, A., “Transport Layer Security (TLS) Next Protocol Negotiation Extension,” January 2010.
[I.D.loreto-hybi-requirements] Loreto, S., “HyBi Requirements and Features,” March 2010.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” RFC 2119, BCP 14, March 1997.
[RFC2246] Dierks, T. and C. Allen, “The TLS Protocol Version 1.0,” RFC 2246, January 1999.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, June 1999.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, “Internationalizing Domain Names in Applications (IDNA),” RFC 3490, March 2003.
[RFC3629] Yergeau, F., “UTF-8, a transformation format of ISO 10646,” STD 63, RFC 3629, November 2003.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” STD 66, RFC 3986, January 2005.
[RFC3987] Duerst, M. and M. Suignard, “Internationalized Resource Identifiers (IRIs),” RFC 3987, January 2005.
[RFC4366] Blake-Wilson, S., Nystrom, M., Hopwood, D., Mikkelsen, J., and T. Wright, “Transport Layer Security (TLS) Extensions,” RFC 4366, April 2006.
[RFC5234] Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” STD 68, RFC 5234, January 2008.
[WEBADDRESSES] Connolly, D. and C. Sperberg-McQueen, “Web addresses in HTML 5,” May 2009.
[WSAPI] Hickson, I., “The Web Sockets API,” May 2010.


Author's Address

  Scott Ferguson
  Caucho Technology


Full Copyright Statement

Intellectual Property