Web Security: Blacklists, Whitelists and WAFs

Consider a computer or network that is protected by a “firewall”; there are two basic ways to configure that firewall:

  • blacklist, ie: everything is permitted except for these items
  • whitelist, ie: everything is forbidden except for these items

…oh, and there’s the third form:

…but nobody likes that one.

It took us a long time to get even this far; in the early days of network firewalls we started with blacklists, reasoning that it was sufficient to stop people from connecting to key services like FTP and Telnet…

Blacklists

We were so naive.

Entire classes of protocol such as RPC were falsely assumed to be “secure” because we had blocked port 111 access to the RPC directory server (rpcbind/portmap); but there was nothing actually protecting the RPC services themselves from being connected to. Anyone who was willing to sit on the internet and try talking to every port number between 30000 and 50000 would eventually stumble across the NIS password database server. Also, NFS generally resided on port 2049, whilst too many firewalls were configured to block only ports between 1 and 1023, thus affording NFS no protection at all.
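For illustration, here is a minimal sketch – in Python, against a made-up hostname – of the kind of probing I mean: a plain connect() walk across the high port range, which a “block ports 1 to 1023” blacklist never even notices:

    # Sketch only: probe the high port range that a "block ports 1-1023"
    # blacklist leaves wide open.  The target hostname is hypothetical.
    import socket

    TARGET = "victim.example.com"   # hypothetical host behind a blacklist firewall

    for port in range(30000, 50001):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(0.5)
        try:
            s.connect((TARGET, port))
            print("open:", port)    # an RPC service (e.g. NIS) may be lurking here
        except OSError:
            pass                    # closed or filtered -- keep walking
        finally:
            s.close()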

So Blacklists were an architectural wipeout; thus, Whitelists!

Whitelists – and Magic

Whitelists also had their pitfalls, but at least we started from a safe, default-deny stance.

This seemed (and was) a much better bet.

But it turned out that protocols like FTP were far richer and more complex than the small fragments that we used day-to-day – so when we tried filtering FTP we frequently broke non-interactive, expensive-to-test FTP-based tools such as mirror.pl and its replacements.

When – to get around this – we opened up extra windows of port-range connectivity to permit non-passive FTP to work, those windows were exploited by hackers; they also meant that we had to maintain specially-customised versions of FTP software which would use only ports in the “magic” 20000-21000 range.

Custom software = more cost.

Eventually we developed filtration software which could support FTP without requiring magic and without breaking the protocol too much, but we still all heaved sighs of relief when older, richer protocols were deprecated in favour of passive, easy-to-filter HTTP retrieval.
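To see why FTP was so awkward to filter, consider the PORT command from RFC 959: in non-passive FTP the client announces an address and port on its control channel, and the server then connects back to that endpoint for the data transfer. A filter which cannot read the control channel has to guess – which is where the breakage and the “magic” port windows came from. A rough Python sketch of the parsing involved (illustrative only, not a real application-level gateway):

    import re

    def parse_port_command(line):
        """Return the (ip, port) announced by an FTP 'PORT h1,h2,h3,h4,p1,p2' command."""
        m = re.match(r"PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", line.strip())
        if not m:
            return None
        h1, h2, h3, h4, p1, p2 = (int(x) for x in m.groups())
        return "%d.%d.%d.%d" % (h1, h2, h3, h4), p1 * 256 + p2

    # A protocol-aware filter watches the control channel and opens a one-shot
    # inbound hole for exactly this endpoint -- nothing more, nothing magic.
    print(parse_port_command("PORT 192,168,1,10,78,40"))   # ('192.168.1.10', 20008)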

Statefulness

Along the way to whitelist-based firewalls we also learned about statefulness; this was important because early attempts at packet filtering technology were extremely naive and enforced logic that looked like:

Consider this packet:

  1. IF it is coming in from the internet
  2. AND IF on the internet side it is coming from port 80
  3. THEN allow it to pass

…because surely it must be a reply to an outbound web request?

This logic looks fine but it is false (ie: insecure), because all an internet-based attacker need do is bind his port-scanner’s source port to 80 and the firewall will simply wave the traffic through.

Nobody in their right mind would bind a source port to port 80, but hackers are not in their right minds.  I would know…
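The trick itself is embarrassingly small; a hedged sketch (hypothetical target, and binding to port 80 needs root on the scanning machine):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("0.0.0.0", 80))                    # our *source* port is now 80
    s.connect(("victim.example.com", 2049))    # e.g. probing NFS behind the filter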

Generally speaking, the problem here is that the firewall rules did not accurately reflect the state of the machines that they were protecting; in this case it was insufficient to assume that a packet was a response to some pre-existing query – unless you had seen that query on its outbound path, or could infer the same by equivalent behavioural means such as TCP flags.[1]
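The stateful alternative, in miniature: only admit an inbound packet if the matching outbound query was seen first. Real stateful firewalls also track TCP flags, sequence numbers and timeouts, but the shape of the idea is just this (a sketch, not an implementation):

    outbound_seen = set()   # (src_ip, src_port, dst_ip, dst_port) tuples

    def note_outbound(src_ip, src_port, dst_ip, dst_port):
        outbound_seen.add((src_ip, src_port, dst_ip, dst_port))

    def allow_inbound(src_ip, src_port, dst_ip, dst_port):
        # Inbound traffic is only a "reply" if we previously talked to its sender.
        return (dst_ip, dst_port, src_ip, src_port) in outbound_seen

    note_outbound("10.0.0.5", 51234, "203.0.113.7", 80)
    print(allow_inbound("203.0.113.7", 80, "10.0.0.5", 51234))   # True: a genuine reply
    print(allow_inbound("198.51.100.9", 80, "10.0.0.5", 2049))   # False: the port-80 trick fails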

Learnings

So eventually we learned that firewalls must:

  • whitelist
  • be clean – a client should either be blocked or not-blocked; if they are not blocked then they should be able to use the whole protocol
  • be transparent – no client or server customisation or magic is required to support the firewall
  • accurately reflect or understand the state and behaviour of the services that they protect; essentially the reciprocal of cleanliness and transparency

…or else they will fail; and when a firewall fails it is not merely a failure of security, it is also a cost burden.

If a firewall fails or is of no practical benefit then why have you been adding all the extra latency, administrative hassle and cost to your architecture to support it?

So that’s where we are today. We understand this now, right?

Web Application Firewalls / WAFs

The best I can say is “kinda”.

Web Application Firewalls are faintly trendy – and are treated somewhat as a panacea – but they must follow exactly the same rules as above:

  • They must be whitelist-based; a WAF must forbid all forms of access to the webserver it protects, except that which is explicitly permitted.
  • They must be clean; a WAF must permit full access to all parts of the website to those people who are permitted to use it.
  • They must be transparent; a WAF must not require client or server customisation in order for it to be deployed.

…and…

  • A WAF must accurately reflect or understand the state and behaviour of the services it protects.

…and this is where it all gets complicated, perhaps terminally.

Leaving aside the issue of decapsulating “secure” HTTPS so that the WAF can look at your traffic, there’s a question of how much and how deeply the WAF needs to understand your application.

It’s not just a question of whether the WAF should track which session cookies have been issued to what users – and deny access to invalid session cookies. That’s just basic stuff.
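For the sake of argument, the “basic stuff” looks something like this sketch – the WAF remembers which session cookies the application has actually issued and refuses anything else (names such as jsessionid and issued_sessions are illustrative):

    issued_sessions = set()

    def on_response_set_cookie(value):
        """Called when the protected application issues 'Set-Cookie: jsessionid=...'."""
        issued_sessions.add(value)

    def request_allowed(value):
        """Deny any request presenting a jsessionid the application never issued."""
        return value is None or value in issued_sessions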

Amongst the many problems is that webservers (and appservers) diverge greatly in how they treat web traffic, and unless your WAF mirrors that behaviour precisely it will be as useless as the “port 80 firewall” example above. If a malicious hacker sends duplicate cookie headers within a single request:

Cookie: jsessionid=value1
Accept: */*
Cookie: jsessionid=value2

…then which one should the WAF honour?  The first? The last? Both? Neither? Which one would the application honour? Should we reject the request? Would that break transparency? Would the application have rejected the request?
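Even a toy example shows the ambiguity – the same duplicated header yields at least three defensible readings, and whichever one the WAF picks, some webserver or appserver somewhere picks another (a sketch):

    raw_headers = [
        ("Cookie", "jsessionid=value1"),
        ("Accept", "*/*"),
        ("Cookie", "jsessionid=value2"),
    ]

    cookies = [v for (k, v) in raw_headers if k.lower() == "cookie"]

    first_wins = cookies[0]                    # 'jsessionid=value1'
    last_wins  = dict(raw_headers)["Cookie"]   # 'jsessionid=value2' (dict keeps the last)
    merged     = "; ".join(cookies)            # 'jsessionid=value1; jsessionid=value2'

    print(first_wins, last_wins, merged, sep="\n")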

Aside: An excellent resource for illustrating these kinds of problems is The Tangled Web – it’s a must-read for anyone needing to understand the diversity of HTTP security problems.

Prescriptive security geeks might respond:

“It doesn’t matter. We shall program the WAF to provide a cross-check on all traffic, and forbid the traffic that we think is wrong, in order to defend the application.”

– but in truth that’s just another way of saying: we don’t mind the service being slightly (majorly?) broken in the name of security, for some people, because we’ll fix it eventually…

Except that we won’t. Eg: The webserver will offer some nifty piece of functionality which the firewall will break, and there will be argument over why it is broken, and how to fix it, and who should bear the cost. Principles of cleanliness and transparency will be broken.

None of these things make for an attractive web service offering.

But imagine then that through some superhuman effort we solve all these problems; that we create a WAF which understands the application’s architecture perfectly, has no bugs and cannot be spoofed, which is missing no functionality and blocks no valid user access.

Well then: by the time a WAF is secure and advanced enough to understand the state and behaviour of the services it protects, it might as well – in the name of simplicity – serve the data back to the client itself, at which point the WAF could simply replace the webserver it protects. Plus you could leave HTTPS intact and leverage extra trust off it with client-side certificates, perhaps?

So why ever have a WAF?

I don’t know. I’d rather have a decent appserver and less complexity.

Summary

All questions of security eventually collapse to one of:

Can my service be persuaded to do something that I would consider unexpected or undesirable?

When faced with the richness of transactional state afforded by HTTP, an entire webserver, and its total software stack, my only response will ever be to mitigate at the web server itself.

To use only simple software of known robustness.

Filtration is simply not viable at that level.

UPDATE: I haven’t even covered how – in the pursuit of monitoring application state – the WAF and the application webserver might take different (conflicting) approaches towards parsing URIs and brokering access to web resources; see The Tangled Web for that, as it takes a chapter or two to explain.
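A small, hypothetical taste of that problem: two defensible readings of the same request path disagree about which resource is being asked for, and a WAF that normalises differently from the webserver behind it is checking the wrong thing.

    from urllib.parse import unquote
    import posixpath

    raw_path = "/public/%2e%2e/admin/secret"

    # Reading A: match the literal path against the rules -- it appears to live
    # under /public/, so "allow".
    waf_view = raw_path

    # Reading B: decode first, then collapse "..", much as a server resolving the
    # path would -- it is really /admin/secret.
    app_view = posixpath.normpath(unquote(raw_path))

    print(waf_view)   # /public/%2e%2e/admin/secret
    print(app_view)   # /admin/secret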

UPDATE 2: continues in part 2


[1] This distinction becomes important in the filtering of TCP (with state flags) versus UDP (without); UDP filtering typically requires session state, timers, heartbeats and luck to work properly.
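A sketch of what that means in practice for UDP – with no flags to infer a connection from, the filter has to fake one, remembering each outbound datagram and accepting replies only within a guessed timeout (the 30-second figure is an assumption for illustration, not a standard):

    import time

    UDP_TIMEOUT = 30.0      # assumed lifetime of a UDP "session", per outbound datagram
    pseudo_sessions = {}    # (src, sport, dst, dport) -> time last seen outbound

    def note_outbound_udp(src, sport, dst, dport):
        pseudo_sessions[(src, sport, dst, dport)] = time.monotonic()

    def allow_inbound_udp(src, sport, dst, dport):
        # A "reply" is only plausible if we recently sent the mirror-image datagram.
        seen = pseudo_sessions.get((dst, dport, src, sport))
        return seen is not None and (time.monotonic() - seen) < UDP_TIMEOUT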

 

Written by: Alec Muffett