6.824 Lecture 18: Anonymous e-mail with mix-nets

- The last two lectures discussed how to prove identity. Today we talk about
  how to keep the identity of the sender confidential. This problem appears
  difficult, since on today's Internet we can in principle trace
  transmissions to their origin.

- Why hide your identity?
  - privacy
  - to do something bad

Let's design a naive e-mail anonymizing proxy.

  Input:
    To: input@anon.net
    From: rtm@mit.edu
    Really-To: 6.824-staff@mit.edu
  Output:
    From: xxx
    To: 6.824-staff@mit.edu

  It anonymizes the source IP address and some mail headers.
  But in what ways does it not provide anonymity?
    The intended recipient may see identifying info in other mail headers
      and in the body of the msg.
      Note we probably don't want even the intended recipient to know us.
      Similarly for snoopers on the output side.
    A snooper on the input side also sees your IP address.
    A snooper on both sides can link in and out messages at the proxy,
      by timing or by length.
    A snooper on the output side can see the sequence of e-mails and perhaps
      link them together as all from you, so revealing your id in one
      breaks all of them.
    Legal action against the proxy, or theft of backup tapes, may reveal
      logs linking input and output.
    Equivalently, perhaps the anonymizer is malicious!

- One approach: mix-nets [Chaum 1981]

- A message with data is forwarded through a series of independently
  operated nodes (called mixes), 1 through n. Let's say s is the source and
  d is the destination. The sender picks mixes 1 through n from a large
  collection of nodes.

- Each message contains a set of instructions and data. The instructions
  tell the mix which mix to forward the message to and which key to use to
  encrypt the remaining instructions and data.

- Each mix has a public/private key pair (Kipub, Kipriv).

- The instructions for each mix are encrypted with Kipub, so only mix i can
  read them.

- Assumption: Mixes don't collude. In practice, pick mixes in different
  places in the world, run by different administrators under different laws.
- The idea is to recursively encrypt instructions with Kipub:
    s encrypts with Knpub "Please forward message to d" = C.
    s encrypts with Kn-1pub "Please forward message to mix n and encrypt
      with Knpub; C"
    etc.
  Thus, the message to mix 1 is
    E(K1pub, "Please forward message to mix 2 encrypted with K2pub" +
      E(K2pub, "Please forward message to mix 3" + ....))
  Mix 1 decrypts with K1priv and finds "Please forward message to mix 2
  encrypted with K2pub + ...". Mix 1 cannot read "...", since it is
  encrypted with K2pub. It encrypts the "..." together with a new header
  that contains the address of mix 2, and outputs the resulting message.

- To send message M from A -> M1 -> M2 -> B, A prepares this message:
    {M2,{B,{M}PubB}PubM2}PubM1

- Why does this work?
  1. Messages are encrypted, so simple snooping doesn't work.
  2. The message looks different at each hop, so a snooper can't easily
     match up input with output. (Well, timing and size...)
  3. Some mixes can be bad; only a problem if most/all are bad.
  4. What if the first mix is bad?
     It knows the sender, but not the recipient or content.
  5. What if the last mix is bad?
     It knows the recipient, but not the sender or content.
  It's a problem if most/all of the mixes collude.

- The main purpose of a mix is to hide the correspondence between its
  sealed input and its unsealed output. To make the above scheme better we
  want:
  - lots of messages from/to different people.
  - process a batch of input messages at the same time and reorder them.
  - make each message really different. The mix could add a random number
    and encrypt it with the rest of the instructions and data.
  - messages should have a fixed length. The sender should fragment data if
    it doesn't fit in a single message.
  - intermix cover traffic among regular traffic
  - remove duplicate messages
  - protect against abuse (e.g., hashcash)
  - encrypt data with a symmetric key each time it passes through a mix.
  - etc.

- You can also build "reply" blocks, to allow replies to be sent without
  the replier knowing the identity of the final target.
  For A to allow B to reply, A would prepare:
    {M2,K3,{M1,K2,{A,K1}PubM1}PubM2}PubB
  K1, K2, K3 can be symmetric keys, just used this once, made up randomly
  by A.
  B encrypts message with K3, sends to M2.
  M2 encrypts with K2, sends to M1.
  M1 encrypts with K1, sends to A.
  A receives {{{Reply}K3}K2}K1, knows all keys.
  [Is this what nym.alias.net does?]
  Why the nested symmetric key encryption?
    Why not have B use public key encryption with A's public key?
    Want msg to look different at each hop.

- Implementation
  - special fields in email headers. for example, type-1 remailer:
      Anon-To: ...
      Latent-Time:
      Encrypt-key:
    followed by a marker line

- Other approaches:
  - Onion routing (for IP routing)
  - Crowds (for web browsing)
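
- A sketch of the reply-block scheme described above (A prepares
  {M2,K3,{M1,K2,{A,K1}PubM1}PubM2}PubB, then B, M2, and M1 each add one
  symmetric layer). This is a toy simulation, not how any real remailer is
  implemented. Assumptions, none of which come from the lecture: the names
  sym, seal, unseal, make_reply_block, send_reply, and open_reply are made
  up for illustration; the "cipher" is an insecure XOR keystream built from
  SHA-256; and public-key sealing is stood in for by a per-party secret,
  because the Python standard library has no public-key encryption.

```python
import hashlib
import json
import os


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. NOT secure; illustration only."""
    blocks = (hashlib.sha256(key + i.to_bytes(4, "big")).digest()
              for i in range(n // 32 + 1))
    return b"".join(blocks)[:n]


def sym(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# ASSUMPTION: stand-in for public-key sealing. Each party has one secret and
# seal()/unseal() both use it. A real mix-net would seal with the party's
# public key and unseal with its private key.
SECRETS = {name: os.urandom(16) for name in ("A", "B", "M1", "M2")}


def seal(owner: str, fields: dict) -> str:
    return sym(SECRETS[owner], json.dumps(fields).encode()).hex()


def unseal(owner: str, blob: str) -> dict:
    return json.loads(sym(SECRETS[owner], bytes.fromhex(blob)))


def make_reply_block():
    """A builds {M2,K3,{M1,K2,{A,K1}PubM1}PubM2}PubB and remembers K1..K3."""
    k1, k2, k3 = (os.urandom(16).hex() for _ in range(3))
    inner = seal("M1", {"next": "A", "key": k1})
    middle = seal("M2", {"next": "M1", "key": k2, "rest": inner})
    block = seal("B", {"next": "M2", "key": k3, "rest": middle})
    return block, (k1, k2, k3)


def send_reply(block: str, reply: bytes):
    """Simulate B -> M2 -> M1 -> A. Returns (bytes delivered to A, per-hop data)."""
    hop = unseal("B", block)                      # B learns only M2 and K3
    data = sym(bytes.fromhex(hop["key"]), reply)  # {Reply}K3
    seen = [data]
    node, rest = hop["next"], hop["rest"]
    while node != "A":
        hop = unseal(node, rest)                  # each mix opens only its own layer
        data = sym(bytes.fromhex(hop["key"]), data)  # add this mix's layer
        seen.append(data)
        node, rest = hop["next"], hop.get("rest")
    return data, seen


def open_reply(data: bytes, keys) -> bytes:
    """A strips the layers using K1, K2, K3, which only A knows together."""
    for k in keys:
        data = sym(bytes.fromhex(k), data)
    return data


block, keys = make_reply_block()                  # A hands `block` to B
delivered, seen = send_reply(block, b"hello A")
print(open_reply(delivered, keys))                # b'hello A'
```

  The list `seen` holds the reply as it appears on each hop; the three
  values differ, which is the point of nesting the symmetric layers.
  Because the toy cipher is XOR, the layers happen to commute; a real
  cipher would require A to strip them in order, outermost (K1) first.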