*DRAFT DRAFT DRAFT* Chopper: a pre-processing layer for steganograpic communication spec written by Nick Mathewson based on designs by Zachary Weinberg, Jeffrey Wang, Vinod Yegneswaran, Linda Briesemeister, Steven Cheung, Frank Wang, and Dan Boneh. Incorporates some text from code comments by Zachary Weinberg 0. Status This document attempts to coalesce the chopper design from the StegoTorus paper into a single specification, incorporating changes and clarifications from the stegotorus implementation. It is still a draft, and needs lots more review from Zack and others before promulgating it widely. 1. Introduction and preliminaries. This document describes a concrete instance of the protocol "Chopper" as given in the StegoTorus design [StegoTorus]. The purpose of this protocol is to transmit a data stream as a series of encrypted blocks over a reliable but not necessarily order-preserving transport, such that an observer can't distinguish the blocks from a series of random byte-strings of equivalent length. This final nondistinguishability property is crucial for layering a steganographic transport over Chopper. 1.1. Prerequisites This protocol requires the following operations: AE_Enc(Key, Nonce, Data) AE_Dec(Key, Nonce, AEData) An authenticating encryption mode such that AE_Enc encrypts 'Data' using 'Key' and gives a bytestring of length Len(Data)+TAG_LEN, and AE_Dec either recovers the or original Data from the encrypted bytestring, or declares that authentication field on the bytestring was incorrect. Keys for this cipher are of length AE_KLEN bytes. Nonces are of length NONCE_LEN. DH_GEN() -> (x,X) DH_FINISH(x,Y) -> Z / nil A key generation and a key agreement operation instantiated with some Diffie-Hellman operation. DH_GEN() uses our local entropy source and produces a private key x and a public key X. DH_FINISH() takes a private key x and a public key Y and produces either shared key material Z or a nil element to indicate that Y was not well-formed. These operations must provide the usual Diffie-Hellman security properties. The public keys of this algorithm are encoded as bytestrings of length DH_LEN bytes. KEM_GEN() -> (x,X) KEM_ENC(X) -> k, K KEM_DEC(x, K) -> k / nil A public-key key-encapsulation mechanism such that PK_GEN() generates a private key x and a public key X; KEM_ENC generates a secret 'k' and its value encrypted to a public key (K); and KEM_DEC, given an encrypted secret and a private key, either recovers the original secret k or gives a nil value to indicate that M was not a valid encryption. We require that this public-key cipher be chosen so that messages are not distinguishable from random bytestrings. Messages in this cryptosystem are of length KEM_LEN_BITS _bits_. BEnc(Key, Data) -> Encrypted_Data BDec(Key, Data) -> Decrypted_Data A block cipher, of block size BLOCK_LEN. BLOCK_LEN must be at least 16 bytes. Keys for this cipher are of length B_KLEN bytes. KDF(Sec, Salt, N) -> Key_Material A KDF producing N bytes of secure key material from a secret 'Sec' and salt 'Salt'. Additionally, NONCE_LEN = BLOCK_LEN. We also use these notations: x | y The byte string x concatenated with the byte string y. Int(N,B) The nonnegative integer N encoded as a B-byte big-endian number. IVal(S): The integer value of a bytestring S considered as a big-endian nonnegative number. Z(N) N bytes of numeric value 0. This is equivalent to Int(0, N) Len(S) The number of bytes in bytestring S. S[a:b] A subsequence of the bytestring S beginning with the a'th byte (zero-indexed) and ending immediately before the b'th byte (zero-indexed). Note that Len(S[a:b]) == b - a, and that S[0:Len(S)] = S. S[a] The a'th byte (zero indexed) of a bytestring S. a // b Truncating division of nonnegative integer a by positive integer b. The result of a // b is the largest integer less than or equal to a / b. 1.2. Choice of primitives We define the profile "AES128-GCM / AES128 / ECDH P-256 / HKDF-SHA256 / Möller-2004 " as follows: Let AE_Enc and AE_DEC be AES128-GCM. Let TAG_LEN = 16, and AE_KLEN=16. [AES-GCM] Let BEnc and BDec be encryption and decryption respectively with AES128. Let BLOCK_LEN = 16, and B_KLEN=16. [AES] Let DH_GEN and DH_FINISH be ECDH on the NIST standard curve P-256, with public values represented by their x coordinates, transmitted as a 32-byte big-endian number. [P256] Let KDF be HKDF-SHA256. [HKDF] Let KEM_* be taken from the cryptosystem of [Möller], instantiated with the curves given in that paper. So KEM_LEN_BITS is 164. 1.3. Protocol overview The chopper protocol sends an authenticated, bidirectional encrypted data stream between a client and a server sharing a reliable data transport that might deliver blocks out-of-order. The data stream is considered a single "link", though it may be sent over one or more simultaneous "connections" associated with the link. The "client" is an anonymous party who initiates a link; the "server" is the party who responds to the client's request. The client must know a public key for the server. To reconstruct the data, each block of data on a link is given a sequence number, starting from zero. Sequence numbers must never repeat on the same link without rekeying. Each link is associated with a temporally unique 32-bit link ID. These must be unique among all links active on a server at a time. 2. The block format A plaintext block header consists of: HEADER (BLOCK_LEN bytes) SeqNo [4 bytes] dataLen [2 bytes] padLen [2 bytes] opcode [1 byte] check [BLOCK_LEN - 9 bytes] A plaintext body consists of: BODY data [dataLen bytes] padding [padLen bytes] An encrypted block consists of: the header encrypted with the block cipher, and the data and the padding encrypted together with the authenticated-encryption cipher. Most values for a header are self-explanatory. The sequence number begins with 0, increments by one with each block sent, and must never wrap around. The dataLen and padLen fields are the number of bytes in the block's data and padding respectively. The opCode is used by other parts of the protocol. The check field is used to provide authenticity by ensuring that comparatively few ciphertext blocks will decrypt to a valid plaintext block -- it is initialized to a known value, usually Z(BLOCK_LEN-9). The encrypted value of the header block is used as a nonce for authenticated encryption to encrypt the data and padding together. When processing a block, the receiver maintains a 256-element sliding window of acceptable sequence numbers, which begins one after the highest sequence number so far _processed_ (not received). Because blocks may arrive in a way that doesn't let us determine where one ends and other begins, our decryption operation may yield any of three results: "We need more input (N bytes total) to decrypt this block completely"; "We used the first N bytes of the input stream to decrypt a block whose content is D"; or "This block is corrupt." A little more formally: We define HEADER(seqNo, dataLen, padLen, opCode, check) such that seqNo is an unsigned integer less than 2^32 dataLen is an unsigned integer less than 2^16 padLen is an unsigned integer less than 2^16 opCode is an unsigned integer less than 2^8 check is a bytestring, with len(check) = BLOCK_LEN - 9 as: return Int(seqNo, 4) | Int(dataLen, 2) | Int(padLen, 2) | Int(opCode, 1) | check We define UNHEADER(H) -> (seqNo, dataLen, padLen, opCode, check) such that Len(H) = BLOCK_LEN as: Let seqNo = IVal(H[0:4]) Let dataLen = IVal(H[4:6]) Let padLen = IVal(H[6:8]) Let opCode = IVal(H[8:9]) Let check = H[9:Len(H)] return (seqNo, dataLen, padLen, opCode, check) We define EncBlock(key_bs, key_ae, seqNo, check, data, padLen, opCode) as: Let H = HEADER(seqNo, len(data), padLen, opCode, check) Let EH = BEnc(key_bs, H) Let EB = AE_Enc(key_ae, EH, data | Z(padLen)) return EH | EB We define DecBlock(key_bs, key_ae, seqNoTop, needCheck, EncData) -> ("DONE", seqNo, data, opCode, N)/("nil")/("NOT_DONE", N) where EncData is a sequence of at least BLOCK_LEN + TAG_LEN bytes, as: Let EH = EncData[0:BLOCK_LEN] Let H = BDec(key_bs, EH) Let seqNo, dataLen, padLen, opCode, check = UNHEADER(H) If check != needCheck or seqNo >= seqNoTop + 256: return ("nil") # XXXX Too early? Let TotalLen = dataLen + padLen + BLOCK_LEN + TAG_LEN if Len(EncData) < TotalLen: return ("NOT_DONE", TotalLen) Let EB = EncData[BLOCK_LEN:TotalLen] Let B = AE_Dec(key_ae, EH, EB) if B = nil: return nil return ("DONE", seqNo, B[0:dataLen], opCode, N) 2.1. Opcodes The following OpCode values are defined: 0 DATA -- Application data; see 4 below. 1 FIN -- Last block of application data for clean shutdown; see 4. 2 RST -- Protocol error; unclean shutdown 3 RC -- Reconnect 4 NC -- New link, client side; see 3 below. 5 NS -- New link, server side; see 3 below. 6 RKI -- Initiate rekeying; see 5 below. 7 RKR -- Respond to rekeying; see 5 below. 3. Negotiating or resuming a link. To start a new link, the client sends a block prefixed with a KEM output to set up a temporary key, containing the first half of handshake: NewLinkReq(X_KEM, AccessToken, padLen) -> block, temp_key, dh_key where X_KEM is a public key for the KEM algorithm: -- Generate the temporary key material and temporary key let temp_key, temp_key_enc, temp_key_padded, check = PaddedKEM(X_KEM) -- Generate the message Let dh_key, dh_pubkey = DH_GEN() Let body = dh_pubkey | AccessToken -- Derive temporary key Let bs_c2s_tempkey, bs_s2c_tempkey, ae_c2s_tempkey, and ae_s2c_tempkey = KDF(temp_key, X_KEM, AE_KLEN*2 + B_KLEN*2) Return temp_key_padded | EncBlock(bs_c2s_tempkey, ae_c2s_tempkey, 0, check, body, padLen, NC) Where PaddedKEM(X_KEM) -> temp_key, temp_key_enc, temp_key_padded, check is defined as: Let n_pad_bits' = 8 - ( KEM_LEN_BITS - 8 * (KEM_LEN_BITS // 8) ) Let n_pad_bits = n_pad_bits if n_pad_bits < 8 = 0 if n_pad_bits = 0. Let temp_key_pad = n_pad_bits random bits. Let temp_key, temp_key_enc = KEM_ENC(X_KEM) Let temp_key_padded = temp_key_pad | temp_key_enc Let check = Z(BLOCK_LEN-9-1) | 8 - n_pad_bits zero bits | temp_key_pad return (temp_key, temp_key_enc, temp_key_padded, check) To resume a link, the client sends a special block containing a temporary key encapsulated with the KEM, and a description of the link to be resumed: ResumeReq(X_KEM, linkID, AccessToken, padLen) -> block, temp_key, dh_key where X_KEM is a public key for the KEM algorithm: Let bs_c2s_tempkey, bs_s2c_tempkey, ae_c2s_tempkey, and ae_s2c_tempkey = KDF(temp_key, X_KEM, AE_KLEN*2 + B_KLEN*2) let H = Header(linkID, 0, 0, RC, check) let EH = BEnc(bs_c2s_tempkey, H) [[XXXX what keeps the client from guessing a link? Is 32 bits really enough?]] return temp_key_padded | EH To process a request on a connection not currently associated with a link, the server proceeds as follows: XXXX define constants as used below to avoid ceil stuff DecInitialBlock(x_KEM, X_KEM, encData) where encData is at least CEIL(KEM_LEN_BITS / 8) + BLOCK_LEN let tmp_key_padded = encData[0:TMP_KEY_PADDED_LEN] let temp_key_pad_bits = the first PAD_BITS bits of tmp_key_padded let EH = encData[TMP_KEY_PADDED_LEN:TMP_KEY_PADDED_LEN+BLOCK_LEN] let tmp_key_enc = tmp_key_padded minus its initial PAD_BITS bits let tmp_key = KEM_DEC(x, tmp_key_padded) if tmp_key = nil return nil [[XXXX check for replay]] Let bs_c2s_tempkey, bs_s2c_tempkey, ae_c2s_tempkey, and ae_s2c_tempkey = KDF(temp_key, X_KEM, AE_KLEN*2 + B_KLEN*2) let H = BDec(EH, bs_s2s_tempkey) Let seqNo, dataLen, padLen, opCode, check = UNHEADER(H) if check[0:Len(check)-1] != Z(Len(check) - 1): return nil if opCode != NC and opcode != RC: return nil if opcode == NC: bodyLen = TMP_KEY_PADDED_LEN + BLOCK_LEN + dataLen + padLen + tagLen if Len(encryptedData) < bodyLen): return "NOT_DONE", TMP_KEY_PADDED_LEN + BLOCK_LEN + dataLen + padLen + tagLen: let EB = encData[TMP_KEY_PADDED_LEN + BLOCK_LEN:bodyLen] let B = AE_DEC(ae_c2s_tempkey, EH, EB) if B = nil: return nil if Len(B) < DH_LEN: return nil Let x,X = DH_GEN() Let Z = DH_Finish(x, B[0:DH_LEN]) if Z = nil: return nil [[XXXX handle AuthTag]] Let linkID = NewLinkID() Let B_reply = X | Int(linkID, 4) | authTag return "Reply", EncBlock(bs_s2c_tempkey, ae_s2c_tempkey, 0, Z(BLOCK_LEN-9), B_reply, padLen, NS). else: -- opcode == RC: if seqNo == 0 or padLen != 0 or dataLen != 0: return nil return "Resume", linkID. 4. Transmitting and reassembling data Blocks may arrive in any order, with any amount of data (including 0). When sending blocks, each party starts with sequence number 0, and increments the sequence number by 1 for each block sent. A block can be sent on any connection currently associated with a link. Parties must not accept any duplicate sequence number, or any sequence number over 256 larger than the largest sequence number currently processed (not received). Unreliable transports will require some means for retransmission of missing data; this is out of scope for the current document. To close a link, either party may send a FIN block to indicate that it has no more data to send. When the other party sends a FIN block, the link is closed, though both parties must 5. Re-keying a link To re-key a connection, a peer sends an RKI block containing as its body the public-key output of a DH_Gen() operation. Upon receiving an RKI block, the other peer sends an RKR block containing the other public-key output of a DH_Gen() operation. Once the RKR block has been sent or received both parties re-derive new key sets as in the original handshake from section 4 above, and send subsequent blocks using sequence number 0. After sending an RKI block, a peer may not send any more data until it has received an RKR block. [[XXXX There is a race in the system described in the paper where both parties decide to send an RKI block at around the same time. Neither may in that case send an RKR block, and the link can not re-key itself. One option is, if you receive an RKI block while waiting for an RKR block, treat it as an RKR block, and re-key. But that needs analysis to see if it will work.]] [Moeller] Möller, B. A Public-Key Encryption Scheme with Pseudo-random Ciphertexts. In Computer Security — ESORICS (2004), vol. 3193 of Lecture Notes in Computer Science, pp. 335–351. http:// www.bmoeller.de/pdf/pke-pseudo-esorics2004.pdf. [StegoTorus] Zachary Weinberg, Jeffrey Wang, Vinod Yegneswaran, Linda Briesemeister, Steven Cheung, Frank Wang, and Dan Boneh. "StegoTorus: A Camouflage Proxy for the Tor Anonymity System." ACM CCS 2012.