Wednesday, December 29, 2010

Testing some markup features

This post has no real content of interest.

Ideally, this part should show up normally, though I've put it in its own <div> block.

If this is the second paragraph, then it didn't work.

Monday, December 27, 2010

Creating a Private Blog on a Free Blogging Service

There are a few problems with Facebook (and similar services):

  1. It's a walled garden
  2. You have no control over your data (even though you think you do)
  3. You're not the customer, you're the product

Free blogging sites get around problem 1, but not the other two. Generally, you have even less control over your data, since the whole point is to publish so that everyone in the world can potentially read it. To get around all of these problems, you'd need to host your own site, which can be a pain and costs more than it's worth to most people.

So, what can we do with a free blogging site? We can post encrypted articles, and only distribute the keys to the people we want reading them. Not only have we removed problem 1, since anyone with the appropriate crypto can read the articles, but we've also partially removed problem 2. Why partially? I'll get to that in a bit. We still have problem 3, but that's an economic reality for any free service. You can, however, shop around for a service that treats you as a product with dignity, at least, and you can potentially find a paid blogging service that doesn't support encryption (or whose encryption you don't want to use), at which point you become the customer, and just a little more human in the eyes of the service.

Why might you not want to use a perfectly functional encryption service provided by a blog host? It's a question of who has the keys. It's almost certain that the host would have your encryption keys, and would provide the encryption and decryption on the fly. While convenient, it's still a loss of control, and they can hand your decrypted data to anyone they choose (though you may have some contract protections in this regard). It's also likely that they'll use password-based authentication. We're going to use public-key authentication, and we're going to do it in a way that's fairly easy and robust againt forgotten passwords.

Let's consider the following scheme. You write a new article for your semi-private blog. The bulk of this article (or maybe just a small part of it) is a well-delimited block of ciphertext. Maybe it looks like the following:

BEGIN CRYPTOBLOG
Key: http://some.location/key_identifier
66fa53d9b7b04210a54853e406d7b119...
END CRYPTOBLOG
We use special tags to denote the beginning and end of the special contents. This is easy for a person to pick out visually, and is also easy for a program to parse. The first line points to a URL with keying information for this article. We'd expect many articles to use the same key, since there's no reason not to. Keys should be changed occasionally, to prevent certain attacks that come from large amounts of available ciphertext, and when you want to deny someone who previously had access to your articles access to any new ones. We'll use a nice strong symmetric key encryption algorithm, such as AES-256.

We now have our encrypted article, how do we distribute the keys? The simplest way to do this is through another blog post. We have one key, but we want to make it available to a potentially large number of people. Let's say each of them has an RSA public key. A simple way to propagate the key is with a list of the following form:

Alice   E(Alice,key)
Bob     E(Bob,key)
Charlie E(Charlie,key)
Here the first column is the person's name, and the second is the key encrypted with that person's public key. This isn't great, from a privacy standpoint, because you've just transmitted the names of all your friends. Slightly better is
Pubkey(Alice)   E(Alice,key)
Pubkey(Bob)     E(Bob,key)
Pubkey(Charlie) E(Charlie,key)
Now we haven't revealed anyone's name, but we've revealed their public keys. This allows someone to correlate public keys between subsequent AES keys, revealing the degree of churn in your list of friends. Also, by publishing pairs of public keys and ciphertexts, you're potentially giving an adversary a leg up in cracking the corresponding private keys. Since just a smidge more paranoia costs us very little, let's instead go with the following:
H(Pubkey(Alice)|E(Alice,key))     E(Alice,key)
H(Pubkey(Bob)|E(Bob,key))         E(Bob,key)
H(Pubkey(Charlie)|E(Charlie,key)) E(Charlie,key)
The first column is now a hash of the person's public key and the ciphertext in the second column. Note that previously, your friend could immediately recognize the appropriate line of keying material to decrypt in order to retrieve the AES key. Now he or she has to perform a simple hash based on each line until one of them matches. The hash function doesn't have to be particularly great for this, so we can use something simple like MD5 without worrying about security or privacy being appreciably compromised.

What are our security and privacy properties now? Well, your semi-private articles should be well protected by encryption, and your friends should be able to recover the symmetric key. The identities of your friends are protected, for the most part. What data does this system leak, though?

  1. The hosting service knows who's retrieving your posts, though not who's successfully decrypting them.
  2. The world in general knows how often you are posting.
  3. The world in general knows how long your posts are.
  4. The world in general knows how many people are able to read your posts.
We could do better if we were self-hosted, but this is about the limits of using a free service like Blogger. If you think you have a way to reduce the amount of data leaked, please let me know.

That's the scheme, but how to implement it is another matter. We'd like to have some way for someone to navigate to an article, and be presented with a decrypted page. The easiest way to do this is probably to create a Firefox extension. Note that this must be written in javascript and CSS. The state of cryptography in javascript isn't great, from what I've found poking around online. If a person's public and private keys are loaded into the browser, then the extension should be able to use them to decrypt first the symmetric key and then the article. The extension should probably cache the ciphertext (not the plaintext!) of the symmetric key, since it'll likely be used multiple times. The URL identifies the keys sufficiently at that point.

For most people, the public key is likely to be the most intimidating part. Someone running Linux can easily create an RSA key using OpenSSL. There's no need for a signed certificate. I don't know what would need to be done on Windows. If the blogger is reasonably crypto-savvy, then a BER- or DER-formatted RSA public key, an X.509 certificate, or a PGP/GPG certificate should be equally effective mechanisms for relaying public keys. Generating the list of ciphertexts for a new symmetric key will probably be done on the command line. We'll worry about friendlier interfaces later.

A really nice feature of a scheme like this is that if one of your friends forgets his private key password, he can just send you a new public key and you can either email him the key ciphertexts or edit the old postings to add the new public key's cipher.