From owner-sage-members@usenix.org Wed Aug 3 13:43:49 2005 Received: from usenix.org (voyager.usenix.org [131.106.3.1]) by eldwist.darkuncle.net (8.12.11/8.12.9) with ESMTP id j73KhWGT014393 for ; Wed, 3 Aug 2005 13:43:46 -0700 (PDT) Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j73Kg5V6025202 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 3 Aug 2005 13:42:05 -0700 (PDT) Received: from localhost (majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) with SMTP id j73Kg3Uo025199; Wed, 3 Aug 2005 13:42:03 -0700 (PDT) Received: by voyager.usenix.org (bulk_mailer v1.13); Wed, 3 Aug 2005 13:35:06 -0700 Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j73KZ3V6024785 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 3 Aug 2005 13:35:05 -0700 (PDT) Received: (from majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) id j73KZ3Un024784 for sage-members-outgoing; Wed, 3 Aug 2005 13:35:03 -0700 (PDT) Received: from peter.smxy.org (smxy.org [64.32.179.41]) by usenix.org (8.12.10/8.12.10) with ESMTP id j73KYwV6024776 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Wed, 3 Aug 2005 13:35:01 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by peter.smxy.org (Postfix) with ESMTP id DFB35215E for ; Wed, 3 Aug 2005 16:34:55 -0400 (EDT) Received: from smxy.org ([127.0.0.1]) by localhost (peter.smxy.org [127.0.0.1]) (amavisd-new, port 10025) with ESMTP id 27661-10 for ; Wed, 3 Aug 2005 16:34:55 -0400 (EDT) Received: from [192.168.32.111] (unknown [65.199.146.162]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by peter.smxy.org (Postfix) with ESMTP for ; Wed, 3 Aug 2005 16:34:55 -0400 (EDT) Message-ID: <42F12A63.7070609@smxy.org> Date: Wed, 03 Aug 2005 16:34:43 -0400 From: "Shaun T. 
Erickson" User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Sage Members Subject: [SAGE] Primary site to secondary site fail-over, on a shoestring budget. Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: by amavisd-new-20030616-p9 at smxy.org Sender: owner-sage-members@usenix.org Precedence: bulk Status: RO Content-Length: 1686 Lines: 31 I work for a small startup, and we are an ASP. Customers run our software from two webservers in our DMZ (that use public IPs), which communicate with two database servers behind the firewall (that use private, non-routable IPs). The two systems in the DMZ are a Linux box and a Windows 2000 Server box. Ditto for the two database servers - the Linux box runs Sybase, and the Windows box runs MS-SQL. I need to replicate this at an off-site location, so that if our primary site goes down for some reason, customers can access our application from the remote site. I have two primary impediments to doing this: I have virtually zero budget, and I've never done anything like this before. I presume that we can find a suitable place to host our systems, and that I can clone my four primary systems and get them up and running at the other location. I also presume, that with some discipline, and some set procedures, we can remember to update the OSes, and our software, any time the primary systems are updated, to keep them in sync. As for the data in the databases, we cannot afford, at this time, to pay the license fees to be able to set them up to stay instantly up to date with each other, so the idea would be to dump transactions every 15 minutes or so, push them to the remote site, and load them up. That would keep the remote site fairly up to date. 
What I don't know how to do is to set up the remote site to automatically determine that the primary site is down, and to then take over for it, or how we should handle the switch back, when the primary comes back up. I would appreciate any comments, suggestions and pointers you have. Thanks. -ste From owner-sage-members@usenix.org Wed Aug 3 15:38:55 2005 Received: from usenix.org (voyager.usenix.org [131.106.3.1]) by eldwist.darkuncle.net (8.12.11/8.12.9) with ESMTP id j73Mcg91004835 for ; Wed, 3 Aug 2005 15:38:53 -0700 (PDT) Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j73MXKV6027819 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 3 Aug 2005 15:33:20 -0700 (PDT) Received: from localhost (majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) with SMTP id j73MXJsl027817; Wed, 3 Aug 2005 15:33:19 -0700 (PDT) Received: by voyager.usenix.org (bulk_mailer v1.13); Wed, 3 Aug 2005 15:29:08 -0700 Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j73MT7V6027502 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 3 Aug 2005 15:29:07 -0700 (PDT) Received: (from majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) id j73MT7pi027501 for sage-members-outgoing; Wed, 3 Aug 2005 15:29:07 -0700 (PDT) Received: from vhost109.his.com (vhost109.his.com [216.194.225.101]) by usenix.org (8.12.10/8.12.10) with ESMTP id j73MT4V6027492 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 3 Aug 2005 15:29:05 -0700 (PDT) Received: from [10.0.1.3] (localhost.his.com [127.0.0.1]) by vhost109.his.com (8.12.11/8.12.3) with ESMTP id j73MT21i008683 for ; Wed, 3 Aug 2005 18:29:03 -0400 (EDT) (envelope-from brad@stop.mail-abuse.org) Mime-Version: 1.0 Message-Id: In-Reply-To: <42F12A63.7070609@smxy.org> References: <42F12A63.7070609@smxy.org> Date: Thu, 4 Aug 2005 00:27:44 +0200 To: 
SAGE Members Mailing List From: Brad Knowles Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget. Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-sage-members@usenix.org Precedence: bulk Status: RO Content-Length: 5069 Lines: 112 At 4:34 PM -0400 2005-08-03, Shaun T. Erickson wrote: > What I don't know how to do is to set up the remote site to > automatically determine that the primary site is down, With just two data sources, that's an impossible problem to solve. You don't know if the problem is really with the secondary site and the primary is fine, or if there is a communications problem between the secondary and the primary (which is fine), or if there is a real problem at the primary, etc.... And then there are problems in the switch-over process that you've got to handle. To be able to solve these kinds of problems, you need additional information from other sources. This means you need to set up additional monitoring installations, so that you have enough pieces of information to be able to have a higher probability of determining which particular scenario is correct. Generally speaking, the minimum number of monitoring installations is five (which presumably includes your primary and secondary data center, meaning you'd need three more around the world). With a total of three, if there is a failure in one of the monitoring installations as well as a failure in one of the two datacenters, you'd be back to the same two-site problem. With four installations, you may have a tie. With five installations, you should have a majority which says that everything at the primary is okay, or something is wrong, and you should also be able to have the monitoring installations cross-monitor each other and gather information on the probability of a remote site failure. You will also need to address the issue of where your primary network monitoring/administration facility is located. 
You can't put it at your primary data center, because you're assuming that may fail. You can't put it at your secondary data center, because that might fail while the primary stays in operation. You may note that they have five supposedly identical computers onboard the space shuttle, and they use a similar scheme. Note also that NTP uses a similar "Byzantine Agreement" scheme. Getting this kind of thing right is really tough. The problem is that if you don't get it right, you run the risk of making the situation much worse than it was with just the one primary site. Compare this to using RAID -- do you really want to trust your vital data to buggy RAID? No. In all probability, a single disk with a cold-spare backup (made every night, requiring manual intervention to physically swap the drives) would be preferable to a buggy RAID implementation. > and to then > take over for it, The take-over is relatively easy, depending on what level that needs to be done at. If you run your own AS (Autonomous System) at these sites, with your own routers speaking BGP to your network providers, then you could use the same sort of "anycast" mechanism that ISC uses for f.root-servers.net. ISC has a number of good white papers at , but you particularly want to take a look at and . That would allow you to use the same public IP addresses everywhere, and the DNS doesn't have to change -- you just have to make live updates to the Internet routing tables for that AS, and that's your failover mechanism. The equivalent to the "cold spare disk" method mentioned above would be to simply periodically copy the data from the primary to the secondary, use low TTLs for the critical public IP addresses, and then use Dynamic DNS to switch the IP addresses when your monitoring system detects a failure. 
You still need to have additional external monitoring installations, additional external secondary nameservers, and you need to make sure that you have alternative ways of contacting the secondary data center as well as the additional external monitoring installations and additional external secondary nameservers, in the case of a failure at the primary site (this may even be a modem connection to a private dial-in server, but you need to have a backup). It won't be pretty, and you'll probably have to do the most critical bits manually. But you can still learn a lot that may be applicable to this solution by reading the ISC Tech Notes. > or how we should handle the switch back, when > the primary comes back up. That's relatively easy -- you just reverse the takeover procedures. > I would appreciate any comments, suggestions and pointers you have. Thanks. Hope this helps! -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. 
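The majority-vote argument above (three installations can degrade to the two-site problem, four can tie, five give a usable majority) can be sketched in Python. This is only an illustration under stated assumptions: the monitor names and the `primary_down` function are hypothetical and do not come from the thread, and a real deployment would also need the cross-monitoring and manual confirmation discussed above.

```python
from typing import Mapping, Optional


def primary_down(reports: Mapping[str, Optional[bool]]) -> bool:
    """Return True only if a strict majority of ALL monitors see the primary as down.

    reports maps a (hypothetical) monitor name to:
      True  -> the monitor can reach the primary site
      False -> the monitor cannot reach the primary site
      None  -> the monitor itself did not report (counts as no vote)
    """
    total = len(reports)
    down_votes = sum(1 for reachable in reports.values() if reachable is False)
    # Strict majority of every installed monitor, not just the ones that answered,
    # so a silent monitor can never tip the decision toward failover.
    return down_votes * 2 > total


if __name__ == "__main__":
    # Five installations: one is silent, three cannot reach the primary.
    five = {"primary": None, "secondary": False, "mon-a": False,
            "mon-b": False, "mon-c": True}
    print(primary_down(five))
```

Counting silent monitors in the denominator is a deliberately conservative choice: with three installations, one silent monitor plus one "down" vote is not enough to trigger a takeover, which matches the point that three installations leave you with the two-site problem.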
From owner-sage-members@usenix.org Wed Aug 3 17:52:41 2005 Received: from usenix.org (voyager.usenix.org [131.106.3.1]) by eldwist.darkuncle.net (8.12.11/8.12.9) with ESMTP id j740qdTC018010 for ; Wed, 3 Aug 2005 17:52:40 -0700 (PDT) Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j740frV6000936 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 3 Aug 2005 17:41:54 -0700 (PDT) Received: from localhost (majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) with SMTP id j740frZO000928; Wed, 3 Aug 2005 17:41:53 -0700 (PDT) Received: by voyager.usenix.org (bulk_mailer v1.13); Wed, 3 Aug 2005 17:38:42 -0700 Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j740cfV6000622 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 3 Aug 2005 17:38:41 -0700 (PDT) Received: (from majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) id j740cfOS000621 for sage-members-outgoing; Wed, 3 Aug 2005 17:38:41 -0700 (PDT) Received: from hamhock.hoovers.com (hamhock-outbound.hoovers.com [66.179.38.26]) by usenix.org (8.12.10/8.12.10) with ESMTP id j740ccV5000616 for ; Wed, 3 Aug 2005 17:38:39 -0700 (PDT) Received: from exchange.hoovers.com (gamma.hoovers.com [66.179.38.8]) by hamhock.hoovers.com (HamHock-OUTBOUND) with ESMTP id 1DAB219D9E8; Wed, 3 Aug 2005 19:38:33 -0500 (CDT) Received: from hoovers-59.hoovers.com ([66.179.38.59]) by exchange.hoovers.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2658.27) id PY06MYAP; Wed, 3 Aug 2005 19:38:32 -0500 Date: Wed, 03 Aug 2005 19:38:32 -0500 From: Frank Smith To: "Shaun T. Erickson" , Sage Members Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget. 
Message-ID: <9D2F60224D4810D13E62B0FB@hoovers-59.hoovers.com> In-Reply-To: <42F12A63.7070609@smxy.org> References: <42F12A63.7070609@smxy.org> X-Mailer: Mulberry/3.1.6 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: owner-sage-members@usenix.org Precedence: bulk Content-Length: 3984 Lines: 55 --On Wednesday, August 03, 2005 16:34:43 -0400 "Shaun T. Erickson" wrote: > I work for a small startup, and we are an ASP. Customers run our software from two webservers in our DMZ (that use public IPs), which communicate with two database servers behind the firewall (that us private, non-routable IPs). The two systems in the > DMZ are a Linux box and a Windows 2000 Server box. Ditto for the two database servers - the Linux box runs Sybase, and the Windows box runs MS-SQL. > > I need to replicate this at an off-site location, so that if our primary site goes down for some reason, customers can access our application from the remote site. I have two primary impediments to doing this: I have virtually zero budget, and I've > never done anything like this before. > > I presume that we can find a suitable place to host our systems, and that I can clone my four primary systems and get them up and running at the other location. I also presume, that with some discipline, and some set procedures, we can remember to > update the OSes, and our software, any time the primary systems are updated, to keep them in sync. As for the data in the databases, we cannot afford, at this time, to pay the license fees to be able to set them up to stay instantly up to date with > each other, so the idea would be to dump transactions every 15 minutes or so, push them to the remote site, and load them up. That would keep the remote site fairly up to date. 
> > What I don't know how to do is to set up the remote site to automatically determine that the primary site is down, and to then take over for it, or how we should handle the switch back, when the primary comes back up. > > I would appreciate any comments, suggestions and pointers you have. Thanks. > > -ste Are you sure you need automatic failover? As Brad Knowles has pointed out, it is difficult to determine if a site is completely down and not just unreachable from where you are. For a small site it might be much more cost-effective to pay one of the site monitoring services to page you if your site is unavailable from all of their agents, and then you can manually do the changeover if needed (since you may be able to restore your site quickly without having to worry about database synchronization). The switch back is also an issue, and may require another outage while you replicate your database updates back to the main site (since you said you are doing batch updates). Also, you may have some discrepancies between the databases since whatever happened at the old site between the time the last batch was sent and the time the site died will have to somehow be merged with the batch you move back from the failover site. You also need to decide how you are going to get your users to go to the failover site. If you think a DNS change is going to work, you will be very surprised. DNS changes don't propagate anywhere near as fast as you want (after our last site move we were still redirecting hits at the old IP for nearly a week after the DNS change, and that was with a TTL that had been 1 hour or less for weeks). The DNS results get cached in so many places that you just can't rely on it for failover. A BGP announcement of a new route could be done in a few seconds, but could be much more difficult to implement, depending on your equipment, skills, and network topology. 
Take a hard look at the anticipated cost of downtime, the likelihood of it happening, and the costs of doing a failover. There's a wide range of costs involved from doing a manual failover with some downtime to a completely automatic and seamless failover that the customer never notices, even while in the middle of a transaction, so look at costs and risks together before making your choice. Frank -- Frank Smith fsmith@hoovers.com Sr. Systems Administrator Voice: 512-374-4673 Hoover's Online Fax: 512-374-4501 From owner-sage-members@usenix.org Mon Aug 8 10:58:19 2005 Received: from usenix.org (voyager.usenix.org [131.106.3.1]) by eldwist.darkuncle.net (8.12.11/8.12.9) with ESMTP id j78HwIpR008586 for ; Mon, 8 Aug 2005 10:58:18 -0700 (PDT) Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78Huq4I029722 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 8 Aug 2005 10:56:52 -0700 (PDT) Received: from localhost (majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) with SMTP id j78HteAD029690; Mon, 8 Aug 2005 10:55:41 -0700 (PDT) Received: by voyager.usenix.org (bulk_mailer v1.13); Mon, 8 Aug 2005 10:39:02 -0700 Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78Hd14I028119 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 8 Aug 2005 10:39:01 -0700 (PDT) Received: (from majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) id j78Hd060028117 for sage-members-outgoing; Mon, 8 Aug 2005 10:39:00 -0700 (PDT) Received: from peter.smxy.org (smxy.org [64.32.179.41]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78Hct4I028111 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Mon, 8 Aug 2005 10:38:59 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by peter.smxy.org (Postfix) with ESMTP id 26C3B20DE for ; Mon, 8 Aug 2005 13:38:49 -0400 (EDT) Received: from 
smxy.org ([127.0.0.1]) by localhost (peter.smxy.org [127.0.0.1]) (amavisd-new, port 10025) with ESMTP id 72346-02 for ; Mon, 8 Aug 2005 13:38:47 -0400 (EDT) Received: from [192.168.32.111] (unknown [65.199.146.162]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by peter.smxy.org (Postfix) with ESMTP for ; Mon, 8 Aug 2005 13:38:47 -0400 (EDT) Message-ID: <42F79899.7090701@smxy.org> Date: Mon, 08 Aug 2005 13:38:33 -0400 From: "Shaun T. Erickson" User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Sage Members Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget. References: <42F12A63.7070609@smxy.org> In-Reply-To: <42F12A63.7070609@smxy.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: by amavisd-new-20030616-p9 at smxy.org Sender: owner-sage-members@usenix.org Precedence: bulk Content-Length: 1645 Lines: 31 First, my thanks to Brad and Frank for their responses - they gave me a lot to think about. I understand, now, the need for multiple monitoring sites, but we don't want to tackle running 5 or so of them ourselves, so we will likely hire a monitoring service to do that for us. Recommendations, as to whom we might hire, are welcome. We also decided not to have the backup site automatically take over for the primary site, at least initially, but, instead, have the monitoring service contact us, whereupon we'd try to bring the primary back up first, and failing that, manually have the backup site take over. I understand the issue with nameservers ignoring the TTLs, so we'd rather not rely on DNS to get the customers to go to the backup site. BGP sounds like the way to go, but I'd like to know more about it. I'm ok when it comes to LAN networking, but am lost when it comes to the WAN. All I know about BGP is that it's a routing protocol (I think). 
:) Question: I presume that, until a takeover, the secondary system would go by a different name than the primary system, since two systems can't go by the same FQDN. That means that after the routing is changed, via a BGP announcement, that we'd have to rename and re-ip the secondary systems, so that they can act as the primaries. I know I could script that for our linux systems, but what about the Windows boxen? I want to keep the steps required to cutover, to a minimum, in case I have to 1) do it fast (we can only be down a max of two hours) and 2) talk a non-IT person through it, if I'm not near a computer at the time (I'm the entire IT dept.). -ste From owner-sage-members@usenix.org Mon Aug 8 11:17:34 2005 Received: from usenix.org (voyager.usenix.org [131.106.3.1]) by eldwist.darkuncle.net (8.12.11/8.12.9) with ESMTP id j78IHVbx008249 for ; Mon, 8 Aug 2005 11:17:31 -0700 (PDT) Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78IG24I000438 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 8 Aug 2005 11:16:02 -0700 (PDT) Received: from localhost (majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) with SMTP id j78IFwSr000436; Mon, 8 Aug 2005 11:15:58 -0700 (PDT) Received: by voyager.usenix.org (bulk_mailer v1.13); Mon, 8 Aug 2005 11:12:25 -0700 Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78ICN4I000139 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 8 Aug 2005 11:12:24 -0700 (PDT) Received: (from majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) id j78ICNxW000138 for sage-members-outgoing; Mon, 8 Aug 2005 11:12:23 -0700 (PDT) Received: from chopin.co-prosperity.org (chopin.co-prosperity.org [24.196.66.98]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78ICL4I000132 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 8 Aug 2005 
11:12:22 -0700 (PDT) Received: from chopin.co-prosperity.org (chopin [127.0.0.1]) by chopin.co-prosperity.org (8.12.5/8.12.5) with ESMTP id j78IJRsN029932; Mon, 8 Aug 2005 13:19:27 -0500 Received: from localhost (nmedbery@localhost) by chopin.co-prosperity.org (8.12.5/8.12.5/Submit) with ESMTP id j78IJRNR029929; Mon, 8 Aug 2005 13:19:27 -0500 X-Authentication-Warning: localhost.localdomain: nmedbery owned process doing -bs Date: Mon, 8 Aug 2005 13:19:27 -0500 (CDT) From: nmedbery@museverte.net X-X-Sender: nmedbery@localhost.localdomain To: "Shaun T. Erickson" cc: Sage Members Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget. In-Reply-To: <42F79899.7090701@smxy.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-sage-members@usenix.org Precedence: bulk Content-Length: 1054 Lines: 26 Shaun, Windows (2000 and later I believe) has a built-in network shell that can be scripted as you desire. I have never actually done much with it, but run: netsh from a command prompt and use the built in help system to figure out what you need. It seems fairly robust. Hopefully it proves useful. -Nate On Mon, 8 Aug 2005, Shaun T. Erickson wrote: > Question: I presume that, until a takeover, the secondary system would > go by a different name than the primary system, since two systems can't > go by the same FQDN. That means that after the routing is changed, via a > BGP announcement, that we'd have to rename and re-ip the secondary > systems, so that they can act as the primaries. I know I could script > that for our linux systems, but what about the Windows boxen? I want to > keep the steps required to cutover, to a minimum, in case I have to 1) > do it fast (we can only be down a max of two hours) and 2) talk a non-IT > person through it, if I'm not near a computer at the time (I'm the > entire IT dept.). 
> > -ste > From owner-sage-members@usenix.org Mon Aug 8 12:16:57 2005 Received: from usenix.org (voyager.usenix.org [131.106.3.1]) by eldwist.darkuncle.net (8.12.11/8.12.9) with ESMTP id j78JGteT029834 for ; Mon, 8 Aug 2005 12:16:56 -0700 (PDT) Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78JFo4I001947 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 8 Aug 2005 12:15:51 -0700 (PDT) Received: from localhost (majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) with SMTP id j78JFnj9001943; Mon, 8 Aug 2005 12:15:50 -0700 (PDT) Received: by voyager.usenix.org (bulk_mailer v1.13); Mon, 8 Aug 2005 12:11:42 -0700 Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78JBe4I001568 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 8 Aug 2005 12:11:41 -0700 (PDT) Received: (from majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) id j78JBexn001567 for sage-members-outgoing; Mon, 8 Aug 2005 12:11:40 -0700 (PDT) Received: from vhost109.his.com (vhost109.his.com [216.194.225.101]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78JBY4I001559 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 8 Aug 2005 12:11:39 -0700 (PDT) Received: from [10.0.1.3] (localhost.his.com [127.0.0.1]) by vhost109.his.com (8.12.11/8.12.3) with ESMTP id j78JBCQg026153; Mon, 8 Aug 2005 15:11:18 -0400 (EDT) (envelope-from brad@stop.mail-abuse.org) Mime-Version: 1.0 Message-Id: In-Reply-To: <42F79899.7090701@smxy.org> References: <42F12A63.7070609@smxy.org> <42F79899.7090701@smxy.org> Date: Mon, 8 Aug 2005 21:09:27 +0200 To: "Shaun T. Erickson" From: Brad Knowles Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget. 
Cc: Sage Members Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-sage-members@usenix.org Precedence: bulk Content-Length: 1842 Lines: 43 At 1:38 PM -0400 2005-08-08, Shaun T. Erickson wrote: > Question: I presume that, until a takeover, the secondary system would go > by a different name than the primary system, since two systems can't go by > the same FQDN. To make this sort of switch with BGP, it would be better to have a separate "published" set of IP addresses in a separate Autonomous System (AS), and then change the route announcements for that AS. This way, you can always get to the systems in question via the back door (the private network), and you have control over how the public network is routed. Neither the public nor the private names or addresses change, and you do all internal backups, etc... via the private network. But all data is accessed externally via the public network/names/addresses -- which also don't change, but for which the routing does change, and is controlled by you. > That means that after the routing is changed, via a > BGP announcement, that we'd have to rename and re-ip the secondary > systems, so that they can act as the primaries. Not necessary. See above. You should be able to do the BGP cutover in a matter of just a few minutes. If you use a general-purpose machine as your route server, running a BGP-capable server such as Quagga or Zebra, you should be able to automate that process easily. The actual routing would be handled by real network hardware, but the machine advertising the routes would be the *nix box. -- Brad Knowles, "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin (1706-1790), reply of the Pennsylvania Assembly to the Governor, November 11, 1755 SAGE member since 1995. See for more info. 
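The "automate that process" step above might look roughly like the following Python sketch, which only builds the command line for Quagga's `vtysh` shell and does not run it. Everything specific here is an assumption for illustration: the AS number 64512 (from the private range), the prefix 192.0.2.10/32 (from the documentation range), and the function name are hypothetical, and the sketch presumes bgpd is already configured with its neighbors. Check the commands against the installed Quagga or Zebra version before relying on them.

```python
from typing import List


def vtysh_route_command(local_as: int, prefix: str, announce: bool) -> List[str]:
    """Build the argv for a vtysh call that adds or removes a BGP network statement.

    announce=True  -> start advertising the prefix from this site
    announce=False -> withdraw the prefix so the other site can take over
    """
    statement = f"network {prefix}" if announce else f"no network {prefix}"
    return [
        "vtysh",
        "-c", "configure terminal",
        "-c", f"router bgp {local_as}",
        "-c", statement,
    ]


if __name__ == "__main__":
    # Hypothetical values; nothing is executed here, the command is only printed.
    cmd = vtysh_route_command(64512, "192.0.2.10/32", announce=True)
    print(" ".join(cmd))
```

On the route server, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Keeping command construction separate from execution makes it possible to review or log exactly what a cutover would do before anyone runs it, which matters when the steps may have to be talked through over the phone.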
From owner-sage-members@usenix.org Mon Aug 8 12:39:48 2005 Received: from usenix.org (voyager.usenix.org [131.106.3.1]) by eldwist.darkuncle.net (8.12.11/8.12.9) with ESMTP id j78Jdkpb002191 for ; Mon, 8 Aug 2005 12:39:47 -0700 (PDT) Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78Jcb4I002716 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 8 Aug 2005 12:38:37 -0700 (PDT) Received: from localhost (majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) with SMTP id j78JcaQr002714; Mon, 8 Aug 2005 12:38:36 -0700 (PDT) Received: by voyager.usenix.org (bulk_mailer v1.13); Mon, 8 Aug 2005 12:36:13 -0700 Received: from voyager.usenix.org (localhost [127.0.0.1]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78JaB4I002504 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 8 Aug 2005 12:36:12 -0700 (PDT) Received: (from majordomo@localhost) by voyager.usenix.org (8.12.10/8.12.10/Submit) id j78JaBwS002503 for sage-members-outgoing; Mon, 8 Aug 2005 12:36:11 -0700 (PDT) Received: from duc.auburn.edu (im1.duc.auburn.edu [131.204.2.245]) by usenix.org (8.12.10/8.12.10) with ESMTP id j78Ja84I002498 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Mon, 8 Aug 2005 12:36:09 -0700 (PDT) Received: from ([131.204.12.5]) by im1.duc.auburn.edu with ESMTP id KP-BXZ15.1569980; Mon, 08 Aug 2005 14:35:42 -0500 Received: from localhost (doug@localhost) by goodall.eng.auburn.edu (8.9.3+Sun/8.6.4) with ESMTP id OAA07598; Mon, 8 Aug 2005 14:35:42 -0500 (CDT) X-Authentication-Warning: goodall.eng.auburn.edu: doug owned process doing -bs Date: Mon, 8 Aug 2005 14:35:41 -0500 (CDT) From: Doug Hughes To: Brad Knowles cc: "Shaun T. Erickson" , Sage Members Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget. 
In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-sage-members@usenix.org Precedence: bulk Content-Length: 2662 Lines: 60 On Mon, 8 Aug 2005, Brad Knowles wrote: > At 1:38 PM -0400 2005-08-08, Shaun T. Erickson wrote: > > > Question: I presume that, until a takeover, the secondary system would go > > by a different name than the primary system, since two systems can't go by > > the same FQDN. > > To make this sort of switch with BGP, it would be better to have > a separate "published" set of IP addresses in a separate Autonomous > System (AS), and then change the route announcements for that AS. > This way, you can always get to the systems in question via the back > door (the private network), and you have control over how the public > network is routed. > > Neither the public nor the private names or addresses change, and > you do all internal backups, etc... via the private network. > > But all data is accessed externally via the public > network/names/addresses -- which also don't change, but for which the > routing does change, and is controlled by you. > > > That means that after the routing is changed, via a > > BGP announcement, that we'd have to rename and re-ip the secondary > > systems, so that they can act as the primaries. > > Not necessary. See above. You should be able to do the BGP > cutover in a matter of just a few minutes. If you use a > general-purpose machine as your route server, running a BGP-capable > server such as Quagga or Zebra, you should be able to automate that > process easily. > > The actual routing would be handled by real network hardware, but > the machine advertising the routes would be the *nix box. > For what it's worth, we do this for a production database system. If you have one provider, you can get a private AS number that they can delegate to you with a small network (we use a single /32 for the DB instance). 

Only one of the machines in question runs Zebra at a time, announcing the
BGP route. When that one is down, we bring up the other one. If, for some
reason, both machines have BGP running at the same time, we have the routes
preferenced so that one will always be a better route than the other (your
provider and/or network group can advise you on how to do this; there are
several ways to accomplish the same goal).

Our failover time is less than 3 seconds. One machine is in Phoenix, AZ and
one is in Rochester, NY. It takes a lot longer than that to arrange all of
the DB primary/secondary roles.

If you have two separate providers (one at each location), it becomes harder
because of IP address assignment and CIDR aggregation, unless you have some
sort of arrangement with both or a legacy class C (e.g.) address space that
you can use.

Doug

From owner-sage-members@usenix.org Mon Aug 8 12:50:38 2005
Message-ID: <42F7B65F.3030907@smxy.org>
Date: Mon, 08 Aug 2005 15:45:35 -0400
From: "Shaun T. Erickson"
To: Sage Members
Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget.
References: <42F12A63.7070609@smxy.org> <42F79899.7090701@smxy.org>

Brad Knowles wrote:

> At 1:38 PM -0400 2005-08-08, Shaun T. Erickson wrote:
>
>> Question: I presume that, until a takeover, the secondary system would
>> go by a different name than the primary system, since two systems can't
>> go by the same FQDN.
>
> To make this sort of switch with BGP, it would be better to have a
> separate "published" set of IP addresses in a separate Autonomous System
> (AS), and then change the route announcements for that AS. This way, you
> can always get to the systems in question via the back door (the private
> network), and you have control over how the public network is routed.

My apologies if I seem dense, but this is all new to me, and I have to be
able to at least explain it to my management, even if only to say "we may
need to bring someone in to set this up" ...

Ok, so if I understand correctly, the backup systems should have two sets of
IP addresses, one set to be used only by us, to keep the systems in sync, to
perform backups, and whatever other administrative tasks are needed.

The other set of addresses would be the same as used by the primary systems?
So if my primary server goes by the name of foo.bar.com with an IP address
of 1.2.3.4, then its backup system would ALSO have the same hostname and IP
address, reachable only if the routing were changed to direct Internet
traffic to that location?

-ste

From owner-sage-members@usenix.org Mon Aug 8 13:34:58 2005
In-Reply-To: <42F7B65F.3030907@smxy.org>
References: <42F12A63.7070609@smxy.org> <42F79899.7090701@smxy.org> <42F7B65F.3030907@smxy.org>
Date: Mon, 8 Aug 2005 13:29:28 -0700
To: "Shaun T. Erickson", Sage Members
From: Brent Chapman
Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget.

At 3:45 PM -0400 8/8/05, Shaun T. Erickson wrote:
>Brad Knowles wrote:
>>At 1:38 PM -0400 2005-08-08, Shaun T. Erickson wrote:
>>
>>> Question: I presume that, until a takeover, the secondary system would go
>>> by a different name than the primary system, since two systems can't go by
>>> the same FQDN.
>>
>> To make this sort of switch with BGP, it would be better to
>>have a separate "published" set of IP addresses in a separate
>>Autonomous System (AS), and then change the route announcements for
>>that AS. This way, you can always get to the systems in question
>>via the back door (the private network), and you have control over
>>how the public network is routed.
>
>My apologies if I seem dense, but this is all new to me, and I have
>to be able to at least explain it to my management, even if only to
>say "we may need to bring someone in to set this up" ...
>
>Ok, so if I understand correctly, the backup systems should have two
>sets of IP addresses, one set to be used only by us, to keep the
>systems in sync, to perform backups, and whatever other administrative
>tasks are needed.
>
>The other set of addresses would be the same as used by the primary
>systems?
>So if my primary server goes by the name of foo.bar.com
>with an IP address of 1.2.3.4, then its backup system would ALSO
>have the same hostname and IP address, reachable only if the routing
>were changed to direct Internet traffic to that location?
>
> -ste

Yes, essentially.

Keep in mind that the changeover will _not_ be instantaneous, as the
routing change propagates throughout the Internet; it might take a few
minutes. During that time, you'll likely have a lot of failed TCP
connections, as packets that are part of connections that were in progress
with the old servers will get sent to the new servers, which won't know
anything about those connections and so will kill them. Of course, if
you're making the changeover because the old server has already died (or
become unreachable due to networking failures, or whatever), then there
won't be any connections already in progress.

Attempts to create new connections might also fail for a while, until
(from the point of view of the client) the route to the old servers is
replaced by the route to the new servers.

Anyway, your application needs to be able to cope gracefully with failed
and interrupted TCP sessions, through whatever retry and reconnection
methods are appropriate.

Also, depending on the nature of your service (for instance, if there's
state data that needs to be maintained for the user), it might be Very Bad
if both old and new servers are accepting TCP connections simultaneously;
it might be even worse than having the service be unavailable for a short
period of time. So, you might need to set things up such that when you
decide to throw the Big Red Routing Switch, you first shut down the old
servers to ensure that existing connections get dropped and no additional
ones get created, then bring up the new servers so that they're ready to
accept new (or re-established) connections, and then finally propagate the
routing change.
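
The ordering described above (stop the old servers, start the new ones, and
only then change the routing) can be sketched in Python as follows. The three
callables are hypothetical hooks standing in for whatever out-of-band tooling
is actually used; the sketch shows only the sequencing and the rule that a
failed step halts the procedure before the route is moved.

```python
from typing import Callable, List


def cut_over(stop_old: Callable[[], bool],
             start_new: Callable[[], bool],
             announce_route: Callable[[], bool]) -> List[str]:
    """Run the cutover steps in order and stop at the first failure.

    Each hook returns True on success. The return value lists the steps
    that completed, so the operator can see how far the cutover got.
    """
    steps = [
        ("stop_old", stop_old),              # drop existing connections first
        ("start_new", start_new),            # new servers ready to accept
        ("announce_route", announce_route),  # only now move the traffic
    ]
    completed: List[str] = []
    for name, action in steps:
        if not action():
            break  # never move traffic past a failed step
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Hypothetical run in which the old site cannot be reached.
    print(cut_over(lambda: False, lambda: True, lambda: True))
```

If the first hook fails, nothing else runs, which is the conservative choice
when both sites accepting connections at once would be worse than a short
outage.
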

This presumes that you have reliable communications to both sets of
servers, through some out-of-band mechanism. In such circumstances, one of
the worst possible situations you can get yourselves into is that you can't
reach your old servers to shut them down prior to making the routing
switch, but some of your customers _can_ and are still using them.

-Brent
--
Brent Chapman -- Great Circle Associates, Inc.
Specializing in network infrastructure for Silicon Valley since 1989
For info about us and our services, please see http://www.greatcircle.com/
Network Automation blog: http://www.greatcircle.com/blog/network_automation

From owner-sage-members@usenix.org Mon Aug 8 14:36:22 2005
In-Reply-To: <42F7B65F.3030907@smxy.org>
References: <42F12A63.7070609@smxy.org> <42F79899.7090701@smxy.org> <42F7B65F.3030907@smxy.org>
Date: Mon, 8 Aug 2005 23:10:57 +0200
To: "Shaun T. Erickson"
From: Brad Knowles
Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget.
Cc: Sage Members

At 3:45 PM -0400 2005-08-08, Shaun T. Erickson wrote:

> The other set of addresses would be the same as used by the primary systems?

No. You have three sets of addresses. One set of addresses for the private
network of the primary systems. One set of addresses for the private
network of the secondary systems. Then a third set of "service" addresses
which are the ones officially published in the DNS for external access to
the services.

This third set of service addresses is used on both the primary and
secondary systems. When the primary fails, the secondary systems already
have the same public IP addresses, and the only thing that needs to change
is the routing advertisements.

> So if my primary server goes by the name of foo.bar.com with an IP address
> of 1.2.3.4, then its backup system would ALSO have the same hostname and
> IP address, reachable only if the routing were changed to direct Internet
> traffic to that location?

Mostly correct.

Think about typical host-level fault-resilient failover. You have two
machines with their own private addresses. You have a third set of
addresses which are used by the "service".
Whichever machine is currently active will have those IP addresses assigned
to it, but if that machine should fail then the secondary "steals" the
service addresses, forces an ARP cache refresh at the routers and other
necessary devices on the network, and then continues operation as the new
active server.

If these two machines are directly connected to the same resources (disk
drives, etc...), then you don't have to worry about trying to keep them in
sync (although you may have to worry about the filesystem being in a
consistent state as of the time of the switchover).

You're trying to do the same thing, at a network level. In your case, you
don't "steal" the service addresses. In your case, the service addresses
are already assigned to the public interfaces of both the primary and
secondary sets of machines (which have different private network
addresses).

Whatever administrative access you need, whatever backups need to be done,
etc... all go over the private network.

What controls where packets from the outside world get routed to is your
*nix box running Zebra or Quagga, and what routes it advertises as to the
path that packets should take to get to the previously mentioned public IP
addresses. No ARP cache timeouts. Just propagation time for BGP route
advertisements.

--
Brad Knowles

"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."

    -- Benjamin Franklin (1706-1790), reply of the Pennsylvania
    Assembly to the Governor, November 11, 1755

SAGE member since 1995. See for more info.

From owner-sage-members@usenix.org Mon Aug 8 14:49:47 2005
Message-ID: <42F7D123.8060505@smxy.org>
Date: Mon, 08 Aug 2005 17:39:47 -0400
From: "Shaun T. Erickson"
To: Brad Knowles
Cc: Sage Members
Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget.
References: <42F12A63.7070609@smxy.org> <42F79899.7090701@smxy.org> <42F7B65F.3030907@smxy.org>

Brad Knowles wrote:
>
> [snip]
> No. You have three sets of addresses.
> [snip]

Ok, I follow all of that.

> What controls where packets from the outside world get routed to is
> your *nix box running Zebra or Quagga, and what routes it advertises as
> to the path that packets should take to get to the previously mentioned
> public IP addresses. No ARP cache timeouts. Just propagation time for
> BGP route advertisements.

Ok, I get this too, except for one thing: do I need one or two of these
boxes? If I had just one, at the secondary site, couldn't it just say "go
there" until I told it to say "come here"? Or would I do it the way Doug
Hughes described, with one at each location, but with only one normally up
and running, or yet another scenario?

I think I'm actually starting to understand this (not enough to implement
it, but at least enough to describe it now). Thanks.

-ste

From owner-sage-members@usenix.org Mon Aug 8 15:05:31 2005
Date: Mon, 8 Aug 2005 16:51:32 -0500 (CDT)
From: Doug Hughes
To: "Shaun T. Erickson"
cc: Brad Knowles, Sage Members
Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget.
In-Reply-To: <42F7D123.8060505@smxy.org>

On Mon, 8 Aug 2005, Shaun T. Erickson wrote:

> Brad Knowles wrote:
> >
> > [snip]
> > No. You have three sets of addresses.
> > [snip]
>
> Ok, I follow all of that.
>
> > What controls where packets from the outside world get routed to is
> > your *nix box running Zebra or Quagga, and what routes it advertises as
> > to the path that packets should take to get to the previously mentioned
> > public IP addresses. No ARP cache timeouts. Just propagation time for
> > BGP route advertisements.
>
> Ok, I get this too, except for one thing: do I need one or two of these
> boxes? If I had just one, at the secondary site, couldn't it just say
> "go there" until I told it to say "come here"? Or would I do it the way
> Doug Hughes described, with one at each location, but with only one
> normally up and running, or yet another scenario?
>
> I think I'm actually starting to understand this (not enough to
> implement it, but at least enough to describe it now). Thanks.
>

We run Zebra "on the box" that hosts the service we want to make redundant,
one on each box. It advertises the IP route that is shared by both. The
easiest way is to have mechanisms so that only one of them is running the
route injector (Zebra/Quagga) at a time. There are other ways to do it, but
this is the simplest. (You can do weighted preferences for the route, but
then you have to be careful about what happens if the network bifurcates or
routing has a problem and both boxes become primary at the same time, since
you then have to reconcile the updates made on each.)
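
To make the two approaches above concrete, here is a small Python model of
what the outside world sees for the shared service address. With a single
route injector running, traffic follows it; with both running, the
announcement with the higher preference wins. The site names come from
earlier in the thread, but the preference values, the choice of preferred
site, and the function name active_site are assumptions for illustration.
Real BGP best-path selection involves many more steps than the single
"higher preference wins" rule modelled here.

```python
from typing import Dict, Optional


def active_site(announcements: Dict[str, int]) -> Optional[str]:
    """Pick the site that receives traffic for the shared service address.

    `announcements` maps each site currently running its route injector
    to a preference value; the highest preference wins.
    """
    if not announcements:
        return None  # nobody is announcing: the service is unreachable
    return max(announcements, key=lambda site: announcements[site])


if __name__ == "__main__":
    # Normal operation: only one injector is running.
    print(active_site({"phoenix": 200}))
    # Both injectors running by accident: the preference breaks the tie.
    print(active_site({"phoenix": 200, "rochester": 100}))
    # Failover: the first site's injector is stopped.
    print(active_site({"rochester": 100}))
```

The model says nothing about data consistency: if the network splits and each
side sees only its own announcement, both boxes serve traffic, which is the
reconciliation problem mentioned above.
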

From owner-sage-members@usenix.org Thu Aug 11 21:42:58 2005
Date: Fri, 12 Aug 2005 00:28:45 -0400
From: Rodrick Brown
To: "Shaun T. Erickson"
Subject: Re: [SAGE] Primary site to secondary site fail-over, on a shoestring budget.
Cc: Sage Members
In-Reply-To: <42F12A63.7070609@smxy.org>
References: <42F12A63.7070609@smxy.org>

On 8/3/05, Shaun T. Erickson wrote:
> I work for a small startup, and we are an ASP. Customers run our
> software from two webservers in our DMZ (that use public IPs), which
> communicate with two database servers behind the firewall (that use
> private, non-routable IPs). The two systems in the DMZ are a Linux box
> and a Windows 2000 Server box. Ditto for the two database servers - the
> Linux box runs Sybase, and the Windows box runs MS-SQL.
>
> I need to replicate this at an off-site location, so that if our primary
> site goes down for some reason, customers can access our application
> from the remote site. I have two primary impediments to doing this: I
> have virtually zero budget, and I've never done anything like this before.
>
> I presume that we can find a suitable place to host our systems, and
> that I can clone my four primary systems and get them up and running at
> the other location. I also presume, that with some discipline, and some
> set procedures, we can remember to update the OSes, and our software,
> any time the primary systems are updated, to keep them in sync. As for
> the data in the databases, we cannot afford, at this time, to pay the
> license fees to be able to set them up to stay instantly up to date with
> each other, so the idea would be to dump transactions every 15 minutes
> or so, push them to the remote site, and load them up. That would keep
> the remote site fairly up to date.
>
> What I don't know how to do is to set up the remote site to
> automatically determine that the primary site is down, and to then take
> over for it, or how we should handle the switch back, when the primary
> comes back up.
>
> I would appreciate any comments, suggestions and pointers you have. Thanks.
>
> -ste
>

Enterprise solutions call for enterprise dollars. I did just this with the
Veritas Global Cluster Option for Veritas Cluster; we run a steward process
at a third, remote datacenter to help prevent split brain between the two
datacenter clusters. Right now I can fail over a very complex system to a
remote datacenter in under 15 minutes.

Here is a logical topology:
http://www.rodrickbrown.com/docs/images/datashare.png

I know cost was a big concern. One thing you could look at is the IP-based
replication technologies from companies like FalconStor or storageengine.

--
Rodrick R. Brown
Unix Systems Architect
The City of New York (DoITT)
http://www.nyc.gov/doitt
http://www.rodrickbrown.com