6.824 Lecture 19: Censor-resistant publishing

We discussed mix-nets -- anonymous communication.

What about anonymous publication of information?
  Assume you have something to say.
    Perhaps DVD decoding algorithm.
  You know powerful opponents don't want you to say it.
  You're worried about censoring, legal attacks on servers,
    malicious modification.
  Let's assume that mix-nets can anonymize act of publishing.
    So we're worried about attacks on servers.

Storing documents on individual servers is weak.
  Even if it's not your server.
  Easy to target individual servers.
  Can we somehow store information, but not on specific servers?

Shamir Secret Sharing.
  Given a message M, produce n shares, any k of which reconstruct M.
  But < k don't give any information about M.
  For n = 3:
    Choose random a and b.
    f(x) = a x^2 + b x + M
    Describes a polynomial, f(0) = M.
    Any 3 points specify the polynomial.
    So evaluate f(x) at n random x values.
  Reconstruct with interpolation:
    Say we have (2,f(2)), (3,f(3)), (4,f(4))
    a*2^2 + b*2 + c = f(2)
    a*3^2 + b*3 + c = f(3)
    a*4^2 + b*4 + c = f(4)
    3 equations, 3 unknowns, can solve for a, b, and c.
    M = c
  Note < k equations tells us nothing; can't even verify a guess about M.
    For *any* M, we can produce a and b that make equations work.

Publius: store just shares, on different servers.
  name_i = MD5(contents + i)
  Shares could be stored under these names.
  One per server, or on i mod #servers.
  Note actual Publius encrypts, stores key shares.

Properties?
  Can lose some shares, due to attacks or failures.
  Servers don't know what they store.
    Hard to accuse them individually of publishing illegal data.
  Servers cannot easily even find out what they store.
    MD5 in name_i makes it hard to find other shares.
  Can detect modification.
    name derived from hash of content.

But it's still possible to attack.
  Attacker finds document key (name_i) -- easy since it is published.
  Attacker persuades each server to delete corresponding document share.

How can we make it unattractive to delete document shares?
  How can we motivate people to store other people's shares?

Describe Tangler as if each message were just one block.
  We have a document (or block) M.
  Take two random existing blocks: (x1,y1) and (x2,y2)
  With (0,M), reconstruct degree 3 polynomial, with interpolation.
    We get f(x) = a*x^2 + b*x + M
  f(0) = M, f(x1) = y1, f(x2) = y2
    So this is a polynomial that reconstructs M.
    And we have two shares already: x1,y1 and x2,y2
  Create two new shares, x3,y3 and x4,y4
    Again, with randomly chosen x values.
  Store the new shares on whatever servers.

Share name: sname = hash(x,y)
Document name: sname1,sname2,sname3,sname4

Property:
  My data cannot be restored w/o other people's shares.
  If my file is many blocks, than need shares of many other documents.

The point:
  Attacker, again, knows names of shares of target document.
  Can try to persuade servers to delete them.
  But that will make some subsequently-published documents unreadable.
  Or some previously published documents unreadable.
  
Problems with both Publius and Tangler:
  Small, fixed number of servers.
  No provision for servers coming and going, replica maintenance.
  Still have to worry about anonymous insertion, retrieval.
  Both have stories for this; we'll look at Freenet.