CASE STUDIES
from the files of
Networking Unlimited, Inc.

WebQuery@NetworkingUnlimited.com
14 Dogwood Lane, Tenafly, NJ 07670
Phone: +1 201 568-7810

Automatic Redundant Firewall Failover

Copyright © 1999, Networking Unlimited, Inc. All Rights Reserved.

A client with very conservative firewall administration policies had installed redundant firewalls, but applications could not automatically take advantage of the redundancy: the static routing in use continued to deliver outbound packets to a firewall that was down. Networking Unlimited, Inc. designed a router and firewall configuration that allowed the firewalls to take over for one another automatically, without degrading security or modifying the existing rules on firewall utilization.

Background

The client, a large financial institution with an international private routed network, required all users and networks which were not up to corporate security standards to be isolated by firewalls. Among other constraints on the firewalls, only two modes of access were allowed: firewall proxies and network address translation packet filtering. All routing to addresses on the other side of the firewall was required to be static on the "clean" side of the firewall so that there were no dependencies on correct operation on the "dirty" side of the firewall.

As a result of this policy, any time a firewall was installed, static routes had to be installed on the serving routers on the "clean" side of the firewall to advertise reachability of "dirty" side addresses. Since the firewalls maintained state information on connections, it was also essential that all packets in any transaction be delivered to the same firewall. In addition, each firewall presented a different address range on the "dirty" side, which was translated to the same "clean" addresses. Consequently, even though redundant firewalls and routers were installed as a matter of routine, any failure of a firewall left users unable to reach across the firewalls unless addresses (or static routes on the routers) were modified.

While some applications were written to understand that the same server could be reached via two addresses, this approach was frequently not available when using commercial products or a proxy server. Instead, users were presented with two icons on their desktops for each application, one for each set of firewall address combinations. When a firewall was lost, the user had to restart the application using the "backup" access icon and start over. While this was better than nothing, it was not a particularly user-friendly approach.

Technical Approach

Providing automatic failover in the event of firewall failure required Networking Unlimited, Inc. to combine a number of seemingly unrelated router features. Network Address Translation (NAT) was implemented on the border routers serving the firewalls (using Cisco IOS 11.2) to map the different addresses used by the redundant firewalls back to a common set of address pairings. This allowed the users on each end of the connection (on opposite sides of the firewall) to use the same addresses regardless of which firewall was actually used.
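The address mapping described above can be illustrated with a minimal Python sketch. It is a model of the idea only, not the Cisco IOS configuration used for the client. The address ranges (taken from the blocks reserved for documentation), the firewall names, and the functions to_common and to_firewall are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical address plan; none of these ranges come from the case study.
COMMON_NET = ipaddress.ip_network("198.51.100.0/24")  # addresses the users see
FIREWALL_NETS = {
    "fw-a": ipaddress.ip_network("192.0.2.0/24"),     # range presented by firewall A
    "fw-b": ipaddress.ip_network("203.0.113.0/24"),   # range presented by firewall B
}


def to_common(firewall: str, addr: str) -> str:
    """Map a firewall-specific address to the common address (same host offset)."""
    net = FIREWALL_NETS[firewall]
    ip = ipaddress.ip_address(addr)
    if ip not in net:
        raise ValueError(f"{addr} is not in the range presented by {firewall}")
    offset = int(ip) - int(net.network_address)
    return str(COMMON_NET.network_address + offset)


def to_firewall(firewall: str, common_addr: str) -> str:
    """Reverse mapping: common address to the address a given firewall presents."""
    ip = ipaddress.ip_address(common_addr)
    if ip not in COMMON_NET:
        raise ValueError(f"{common_addr} is not in the common range")
    offset = int(ip) - int(COMMON_NET.network_address)
    return str(FIREWALL_NETS[firewall].network_address + offset)
```

Under this assumed plan, to_common("fw-a", "192.0.2.10") and to_common("fw-b", "203.0.113.10") both give 198.51.100.10, so the user sees one address regardless of which firewall carried the traffic.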

Since the firewalls were stateful, this approach required all traffic for a particular user pair to always use the same firewall, so a connection would still be broken if its firewall was lost. The difference is that if the second firewall could be made to fill in for the first, the user would simply restart the application as if nothing had changed, and the NAT would make the change in firewall addresses invisible. The remaining challenge was to provide a mechanism for switching the static routes to the firewalls automatically.

We could not simply run a routing protocol across the firewalls between the "clean" and "dirty" routers, as security was not willing to trust any routing information from "dirty" sources (a totally valid concern). What we could do, however, was use a routing protocol strictly to determine if a path existed between the "clean" and "dirty" routers. BGP was selected for this purpose as it required minimal processing on the routers and could be very tightly controlled on both sides (after all, from the point of view of some "dirty" networks, they were the "clean" network and the client's network was the "dirty" untrustworthy one).

The key is that rather than propagating the reachability of real destinations, the BGP exchanges were used only to determine if a loopback port on the router on the other side of the firewall was reachable. Based on the dynamically learned reachability of the loopback ports on the other side of the firewall, floating static routes were activated (or deactivated) to direct all traffic to the appropriate router and firewall. This way, a failed firewall could be routed around without requiring either side of the firewall to trust the other.
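The route selection logic can be sketched in Python as follows. This is a simplified model, not router behavior copied from the client's configuration: it assumes each static route is usable only while the far-side loopback it depends on is reachable, and that the usable route with the lowest administrative distance wins. The prefix, loopback addresses, firewall names, and distance values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class StaticRoute:
    prefix: str        # destination on the far side of the firewalls
    via_loopback: str  # far-side loopback whose reachability is learned via BGP
    firewall: str      # firewall the traffic crosses when this route is active
    distance: int      # administrative distance; the lowest usable value wins


# Hypothetical routes: firewall A is preferred, firewall B is the floating backup.
ROUTES = [
    StaticRoute("198.51.100.0/24", "192.0.2.1", "fw-a", distance=1),
    StaticRoute("198.51.100.0/24", "203.0.113.1", "fw-b", distance=200),
]


def active_route(routes: Iterable[StaticRoute],
                 reachable_loopbacks: Set[str]) -> Optional[StaticRoute]:
    """Return the usable route with the lowest distance, or None if none is usable."""
    usable = [r for r in routes if r.via_loopback in reachable_loopbacks]
    return min(usable, key=lambda r: r.distance, default=None)
```

In this model, while both loopbacks are reachable the route through fw-a is selected. If the loopback behind fw-a disappears from the set of reachable addresses, the route through fw-b takes over, and if neither is reachable no route is installed. Only the reachability of the loopbacks is consulted; no destination learned from the other side is ever used.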

Bottom Line Results

Networking Unlimited, Inc. designed a win-win solution for the client, providing most of the benefits of dynamic routing through the firewalls without exposing either network to its dangers. Security was satisfied that the safety of their network was not compromised in any way. At the same time, users were happy that failure recovery could be automatic.

This solution appears to have utility for many applications outside this client. For applications where firewalls are used with static Network Address Translation (NAT) and stateless packet filtering, it should be possible to provide failover from one firewall to the other that is totally transparent to the users and their applications. For applications based on proxy servers, dynamic NAT assignments, or stateful packet filtering, care must be taken when setting up the static routes to ensure consistent routing at all times. It should also be recognized that under these conditions, the user application will be interrupted any time the primary firewall changes state, which could be troublesome if the application or supporting systems cannot handle "hanging" TCP connections cleanly.



Copyright 1999-2000 © Networking Unlimited Inc. All rights reserved.