Little Bobby Tables
Back in the day – and even today – one of the most common security flaws in websites is “SQL injection”. An attacker puts specially crafted data into innocent-looking fields within a form, and that data is then used to construct a database query. Or, as XKCD readers will know it, Little Bobby Tables:
The usual way to fix this problem is not actually by “sanitizing your database inputs”, as the cartoon suggests, but by constructing the database query using parametric interpolation (bound parameters, often called parameterized queries) rather than string concatenation. Even so, it’s still messy: directly allowing a user to set a parameter value is undesirable – it doesn’t take a hacker genius to think up ways that might be exploited for fun, or even profit. As such, most programmers will do both sanitizing and parametric interpolation.
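To make the difference between concatenation and parametric interpolation concrete, here is a minimal sketch using Python's built-in sqlite3 module. The students table, the function names and the crafted input are all made up for illustration; real applications will use their own schema and database driver.

```python
import sqlite3

def make_db():
    # In-memory database with a single hypothetical table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT)")
    conn.execute("INSERT INTO students (name) VALUES ('Alice')")
    conn.commit()
    return conn

def find_unsafe(conn, name):
    # Concatenation: the user's text becomes part of the SQL itself.
    query = "SELECT name FROM students WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_safe(conn, name):
    # Placeholder: the value is bound separately from the SQL text,
    # so it can only ever be compared against the name column.
    query = "SELECT name FROM students WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()

conn = make_db()
crafted = "nobody' OR '1'='1"
print(find_unsafe(conn, crafted))  # the crafted quote rewrites the WHERE clause
print(find_safe(conn, crafted))    # treated as an (unmatched) literal name
```

With the concatenated query the crafted value matches every row, while the bound version simply finds no student with that odd name.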
Having the webserver build the query itself, from parameters the browser sends it as a JSON object, introduces a simple protocol break. Users still query a database, but they do so indirectly. Choosing odd parameter values won’t affect the query used on the other side, and these can be carefully checked in any case.
But if the webserver is somehow compromised, the attacker can still access the database directly. That exposes a risk – when we hear in the news of breaches and database dumps being posted online, this is what has happened. The solution – like everything in computing – is to add another layer.
The webserver has to accept all legal web traffic – anything HTTP has to handle – but a more specialised service need not. If the webserver validates the syntax of the JSON object and then passes it on to such a specialised server – let’s call it a validation server – which has access to the database, then a compromise of the webserver doesn’t help the attacker much. All they can then do is pass increasingly odd-shaped JSON objects through to the validation server. This is no more than they can do already, in effect – but we can make this harder for them.
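As a rough sketch of what the webserver's syntax check might look like, the function below accepts only one narrow shape of JSON object and rejects everything else. The request shape (a customer_id plus a list of fields), the allowed field names and the size limits are assumptions invented for this example, not part of any real API.

```python
import json

# Hypothetical request shape: {"customer_id": <int>, "fields": [<str>, ...]}
ALLOWED_FIELDS = {"name", "email", "postcode"}
MAX_BODY_BYTES = 1024

def check_request(body: bytes) -> dict:
    """Return a cleaned copy of the request, or raise ValueError."""
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("body too large")
    try:
        data = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError, RecursionError) as exc:
        raise ValueError("not valid JSON") from exc
    if not isinstance(data, dict) or set(data) != {"customer_id", "fields"}:
        raise ValueError("unexpected keys")
    customer_id = data["customer_id"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(customer_id, bool) or not isinstance(customer_id, int):
        raise ValueError("bad customer_id")
    if not 0 < customer_id < 2**31:
        raise ValueError("bad customer_id")
    fields = data["fields"]
    if not isinstance(fields, list) or not fields:
        raise ValueError("bad fields")
    if not all(isinstance(f, str) and f in ALLOWED_FIELDS for f in fields):
        raise ValueError("bad fields")
    # Build a fresh dict so nothing unchecked is passed along.
    return {"customer_id": customer_id, "fields": list(fields)}
```

The point of the design is that the function describes what is allowed rather than what is forbidden, so anything the author did not anticipate is rejected by default.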
Rewrite, Reformat, Recheck
If instead of JSON between the webserver and the validation server, we pass another format – perhaps XML – then an attack which is based around specially formatted JSON will likely fail. We now go from JSON between the browser and the webserver, to XML between the webserver and the validation server, and finally to SQL for the database server itself. And at each stage, we can perform detailed syntax checks – and ever-increasing business logic checks – on the data.
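The rewrite-and-recheck step might be sketched as follows, using Python's standard xml.etree.ElementTree module. The XML layout (a query element holding one customer and one or more field elements) and the allowed field names are assumptions made up for this example. Note too that the standard-library parser is not hardened against every malicious XML construct, so a real validation server would probably want a stricter parser.

```python
import xml.etree.ElementTree as ET

ALLOWED_FIELDS = {"name", "email", "postcode"}  # hypothetical field names

def to_xml(request: dict) -> bytes:
    """Webserver side: re-encode an already-checked request as XML."""
    root = ET.Element("query")
    ET.SubElement(root, "customer").text = str(request["customer_id"])
    for name in request["fields"]:
        ET.SubElement(root, "field").text = name
    return ET.tostring(root, encoding="utf-8")

def from_xml(payload: bytes) -> dict:
    """Validation-server side: parse and recheck, trusting nothing."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError as exc:
        raise ValueError("not well-formed XML") from exc
    if root.tag != "query" or root.attrib:
        raise ValueError("unexpected root element")
    children = list(root)
    if len(children) < 2 or children[0].tag != "customer":
        raise ValueError("unexpected structure")
    if any(c.tag != "field" for c in children[1:]):
        raise ValueError("unexpected structure")
    # No attributes and no nested elements are allowed anywhere.
    if any(c.attrib or len(c) for c in children):
        raise ValueError("unexpected structure")
    customer = children[0].text or ""
    if not (customer.isascii() and customer.isdigit() and len(customer) <= 9):
        raise ValueError("bad customer")
    if int(customer) <= 0:
        raise ValueError("bad customer")
    fields = [c.text for c in children[1:]]
    if not all(f in ALLOWED_FIELDS for f in fields):
        raise ValueError("bad field")
    return {"customer_id": int(customer), "fields": fields}

request = {"customer_id": 7, "fields": ["name", "email"]}
print(from_xml(to_xml(request)) == request)
```

Because the validation server repeats the checks on its own side of the link, a compromised webserver that sends hand-built XML gains nothing over one that sends hand-built JSON.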
The webserver won’t send queries unless the user is authenticated. The validation server can check authorization. And so on. Each server really need not trust those on the other side – meaning that if a layer is compromised, the attacker gains very little advantage.
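An authorization check on the validation server could be as small as the sketch below. The user names, roles and field names are invented for illustration. The one design choice worth noting is that the validation server looks up the role in its own records rather than accepting a role supplied by the webserver.

```python
# Hypothetical records held on the validation server itself.
USER_ROLES = {"alice": "support", "bob": "billing"}
ROLE_FIELDS = {
    "support": frozenset({"name", "postcode"}),
    "billing": frozenset({"name", "email", "postcode"}),
}

def authorize(user: str, fields: list) -> list:
    """Return the requested fields if the user may read them all."""
    role = USER_ROLES.get(user)
    if role is None:
        raise PermissionError("unknown user")
    allowed = ROLE_FIELDS.get(role, frozenset())
    denied = sorted(set(fields) - allowed)
    if denied:
        raise PermissionError("not permitted: " + ", ".join(denied))
    return list(fields)

print(authorize("bob", ["name", "email"]))
```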
Of course, a normal network link still has to pass the data itself. For solutions requiring the highest security, we can use data diodes and similar high assurance gateways. These perform the syntax checks in hardware before sending the data using a physical link that is entirely one-way – think of a torch signalling over a chasm. In our application, we’d place a bidirectional pair of these between the webserver and validation server.
These devices make sending crafted data across the security boundary very hard indeed – and to even attempt it an attacker has to have compromised the webserver already. Meanwhile, we have achieved complete network isolation between the database network (holding the validation and database servers) and the webserver, which has internet access.
The above, sometimes referred to as the simplify-and-check pattern, is recommended by the UK’s National Cyber Security Centre (NCSC) as a key defensive measure that can be used in a number of situations. It has been written up in a recent NCSC blog post, with reference to hardware options that combine the simplify-and-check pattern with a data diode to guarantee that data flows one way only.
No security solution is ever perfect. Some bug in the validation server’s authorization code might allow an attacker to gain access to data they should not have permission for – though careful testing should reduce that risk. Some bug in the database might return the wrong data, leaking secrets. This particular architecture cannot provide end-to-end security (though more complex architectures can). But on the face of it, the chance of an attacker being able to dump the entire database with this architecture is very low indeed.
In an age where leaks of huge batches of customer data are worryingly common, patterns like this can massively reduce risk. And in a world where customers are increasingly aware of their need for privacy, and the harm breaches can do to them financially, it is ever more important to consider how to reduce these risks.