Got Bots? Want to include proof-of-work checks in your API Proxies? If so, I’ve got something for you.
Proof-of-work mechanisms are a way a receiver of a message can require the sender of a message to perform some work before constructing a valid message. Hashcash is one mechanism for encoding the proof-of-work. Today I’m going to show you how to include Hashcash checks in your API Proxies.
First, some background. With Hashcash, a server can require that a client send it a string, that hashes to a value that has a specified number of leading zeros. The hash function used in Hashcash is SHA1, but it need not be - you could create a different proof-of-work with a different hash if you like.
The SHA1 function is well defined and there are tools included in the operating system for computing them. For example in my OSX laptop, I can do this to compute a sha1 checksum on a string:
$ echo -n abc | shasum -a 1
a9993e364706816aba3e25717850c26c9cd0d89d -
You can look here to find more test vectors for various one-way hash functions.
Because hash functions are one-way, there’s no way to predict the value of a hash function based on the input. The only way to know the hash value is to compute it. And this takes time.
Hashcash lets the server say to the client: “you need to send me a string that starts with a specific set of characters, and ends with anything you like. But the SHA1 hash of the string must have N leading zeros.” Because the client will have to try lots of strings before finding one, it means the client has to do “work”. If the client sends a string that produces a SHA1 with leading zeros, it amounts to proof-of-work. If the client sends a string that does not produce a SHA1 with the requisite number of leading zeros, then the server can reject the request based on a lack of proof-of-work.
There are public-domain libraries that compute Hashcash values. It’s pretty simple if you think about it… In psuedo code:
do {
tstring = compose(fixed_part,timestamp,random_part);
} while (hasLeadingZeros( sha1(tstring), N ));
Doing “the work” on the client side might take 200ms or even 1200ms or more. If you require a hashcash with 32 bits of zeros, it could take weeks of runtime. But on the server, it takes less than a millisecond to verify the work.
This is the secret to bot interdiction. By requiring the client to do work, which is really easy for the server to verify, an API server discourages bots. A bot attack plan relies on a computationally cheap and easy way to communicate with the server. If the protocol requires that the bot takes 5 seconds of compute time before making a request, it makes a bot much less economically feasible.
OK, so proof-of-work discourages bots, and Hashcash is a well-known way to enforce proof-of-work. I’ve just put together an example that shows how you can embed Hashcash verification into any API Proxy running in Apigee Edge.
The code is all available on github here.
And I’ve recorded a 12-minute screencast showing how it works, too.
I demonstrate it using a Java command-line program as the client generating the Hashcash, but in a production case, it will be the dedicated client running on a mobile device, or on a server, etc. that produces the Hashcash. Check it out.
If you have any questions or feedback, on this, let me know! I’d love to hear it.
