Recently I’ve run into a couple of problems, related to things like online voting and payment escrow, where I wanted to provide a hosted service that would be transparent and verifiable. To minimize the amount of trust users would have to put in me, I wanted not only to make the code I was running publicly available, but also to let people check that my hosted service was actually running the code I said it was, and that I hadn’t deployed something different or done something to my server that would make it behave differently.
This kind of transparency didn’t seem like a particularly exotic thing to want, so I googled around for a platform that would let me do it. I couldn’t find one, so I thought I’d write the idea up and see if anyone has done this, or has thoughts on how it should be done.
Why do I want to make hosted stuff transparent? Here’s an example.
Adam and Bob make a bet, and Chris agrees to referee in case one of them tries to cheat. To do this, they arrange things so that any two of the three must cooperate to access the money. If Adam and Bob settle the bet as planned, Chris doesn’t need to do anything. If Adam or Bob loses the bet and disappears, Chris will make sure the money goes to the winner. And if Chris wants to run off with Adam and Bob’s money, he’ll have to persuade one or the other to conspire with him.
To help with transactions like these, someone – let’s call them Dave – runs a website that lets them do the following:
- Adam goes to Dave’s website and types in his own, Bob’s and Chris’s e-mail addresses.
- Dave’s server creates a private Bitcoin key and a public Bitcoin address. The private key can be used to access money sent to the public address.
- Dave’s server splits the private key into three parts, and sends a different pair of parts to each of Adam, Bob and Chris, along with the public Bitcoin address. Any two of them together hold all three parts; nobody can rebuild the key alone.
- Dave’s server deletes the key, so it won’t be able to access the money that Adam and Bob are about to pay in.
- Adam and Bob pay their stakes to the public address.
- When the bet is settled, the loser sends the winner the part of the private key that the winner is missing, and the winner can now access the money.
- If the loser fails to send that part, the referee, who also holds it, will send it to the winner instead.
(*) The Bitcoin people have a plan to solve this escrow problem properly by building multiple-signature features right into the transactions, so hopefully when they’re done it’ll take care of itself.
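As a rough illustration of the key-splitting step in the list above, here is a minimal Python sketch. It assumes a simple XOR split, which is only one of several ways to do this and not necessarily what a real service would use. The function names and the part numbering are made up for the example.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(private_key: bytes) -> dict:
    """Split the key into three parts and hand each person a different pair.

    The key equals part 1 XOR part 2 XOR part 3, so all three parts are
    needed to rebuild it, and any two people together hold all three.
    """
    part1 = secrets.token_bytes(len(private_key))
    part2 = secrets.token_bytes(len(private_key))
    part3 = xor_bytes(xor_bytes(private_key, part1), part2)
    return {
        "adam": {1: part1, 2: part2},
        "bob": {2: part2, 3: part3},
        "chris": {1: part1, 3: part3},
    }


def recover_key(parts_a: dict, parts_b: dict) -> bytes:
    """Rebuild the key from the parts held by two different people."""
    combined = {**parts_a, **parts_b}
    if set(combined) != {1, 2, 3}:
        raise ValueError("need all three parts to rebuild the key")
    return xor_bytes(xor_bytes(combined[1], combined[2]), combined[3])
```

With this layout, `recover_key(shares["adam"], shares["bob"])` returns the original key, while one person's pair on its own raises an error because the third part is missing.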
How can Adam, Bob and Chris trust that Dave isn’t going to secretly copy the private key, then use it to steal the money?
The software and hardware Dave is running should be publicly verifiable wherever possible or, failing that, verifiably under the control of a large, trusted organization with little incentive to cheat.
Thinking about the way cloud hosting works right now, we might do something like this:
- The server hardware the service runs on is controlled by Rackspace/Amazon.
- The server OS and core software are based on a publicly available image, if possible created using a transparent process by a trusted party (ideally Rackspace/Amazon, as we have to trust them anyway).
- The setup steps for the publicly available image are automated based on a public source code repository whose history cannot be modified, and nobody is able to log into the server and change them.
- A public record is available showing which image is being used for the IP address of the service.
- A public record is available showing which source code repository was used.
The best I think I could do using existing services would be something like:
- On EC2 someone (let’s call them Ed) would create a publicly available AMI based on an official Linux AMI, and publicize the steps he used to make it. I’ll call it Ed’s Transparent AMI.
- Ed’s Transparent AMI would have SSH logins disabled.
- Ed’s Transparent AMI would run a script on boot specified by a parameter.
- Dave would create his setup script and check it into a public Subversion repo on Google Code.
- Dave would create read-only credentials for his EC2 account and publish them.
- Dave would launch an instance using Ed’s Transparent AMI, specifying his setup script.
- Dave would (probably) map a DNS name to the IP address of his instance.
- If the system allowed Dave to update things after the instance was set up, his changes would have to go through public version control, probably using something like Puppet.
- If Adam, Bob or Chris wanted to check up on Dave, they would:
  - Look up the IP address of the site.
  - Use the public EC2 credentials to find out which instance was attached to the IP address.
  - Use the public EC2 credentials to check which script the instance was using.
  - Check the history of the script on Google Code to make sure Dave hadn’t done anything suspicious.
- If anyone wanted to check Ed’s Transparent AMI, I guess they’d follow the steps he said he’d used to create it and compare what they got with what he was providing, which is the best we can do for third-party AMIs right now.
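The checks Adam, Bob or Chris would run could be sketched roughly as follows. This is only an illustration under some assumptions: the boot parameter is taken to be the setup script itself, passed as EC2 user data (which the EC2 API returns base64-encoded), and the `instance` dictionary with its field names is a hypothetical summary of whatever the read-only credentials return, not a real API response format. Fetching that record and the published script is left out.

```python
import base64


def check_instance(instance: dict, site_ip: str,
                   trusted_ami_id: str, published_script: bytes) -> list:
    """Return a list of problems found; an empty list means all checks passed.

    `instance` is a hypothetical record with the keys "public_ip",
    "image_id" and "user_data_b64", filled in from the EC2 account
    using the published read-only credentials.
    """
    problems = []

    # Is this really the instance behind the site's IP address?
    if instance.get("public_ip") != site_ip:
        problems.append("instance is not attached to the site's IP address")

    # Was it launched from the agreed public image?
    if instance.get("image_id") != trusted_ami_id:
        problems.append("instance was not launched from the expected AMI")

    # Does the boot script match the one in the public repository?
    try:
        boot_script = base64.b64decode(
            instance.get("user_data_b64") or "", validate=True)
    except ValueError:
        boot_script = None
    if boot_script != published_script:
        problems.append("boot script differs from the published script")

    return problems
```

The `site_ip` argument could come from something like `socket.gethostbyname` on Dave’s DNS name. Checking the script’s history for anything suspicious would still be a manual job.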
Anyone have any thoughts? Friendly person from Rackspace on Twitter?