High-Availability Linux Project

(Last updated: 1 September, 2004) http://linux-ha.org/


Friends of Linux-HA
IBM	SuSE	SGI	MSC Linux
Conectiva	Tummy.com	Emageon	Intel

Security Announcement

6/25/2003: A potential remote exploit has been discovered in heartbeat. Please upgrade to version 1.0.3 or later at your earliest convenience. See the security announcement for more details.

Preview New Web Site

You can get a sneak preview of our new website at http://linuxha.trick.ca/. This site should be in production by the end of 2004.

Goals

The basic goal of the High Availability Linux project is to:

Provide a high-availability (clustering) solution for Linux which promotes reliability, availability, and serviceability (RAS) through a community development effort.

The Linux-HA project is widely used, and is an important component in many interesting High-Availability solutions. We currently have a few thousand installations up in production in the real world. We also work well with the LVS (Linux Virtual Server) project and expect to collaborate with them in the future, since our goals are complementary. Interest in this project is growing very rapidly. These web pages are averaging thousands of hits per day, and we see several thousand downloads of the software per month. Heartbeat now ships as part of SuSE Linux, Conectiva Linux, Mandrake Linux, MSC Linux, and Debian GNU/Linux. Mission Critical Linux built one of their earlier products on it. UltraMonkey and several companies' embedded systems are also based on it.

Although this is called the Linux-HA project, the software is highly portable and runs on FreeBSD and Solaris as well.

Things are moving along quite nicely. More than 45 people have made substantive contributions to the project, but we still have lots of work to do.

There have been several articles written on this project and software. Here are a few examples:

Project leader Alan Robertson wrote an article entitled Highly-Affordable High-Availability, which appeared in the November 2003 issue of the US publication Linux Magazine.

Andre Bonhote wrote an article on HA-NFS entitled High-Availability NFS Server with Linux Heartbeat in the August 2003 issue of the European publication Linux Magazine.

There is an article on HA-NFS with heartbeat entitled Heartbeat macht NFS-Server redundant in the July 2003 issue of the German publication Linux Magazin. (I believe this may be the same article as the previous one.)

There is an article in the December 2002 issue of Linux Journal entitled High-Availability LDAP Clusters with Heartbeat by Jay Allen and Cliff White. It looks quite similar to the article they wrote for Developer Domain.

There is an article in the September 2001 issue of SysAdmin magazine entitled High-Availability File Server with heartbeat. It is an excellent introduction to basic HA issues with our software. Unfortunately, it fails to mention STONITH, an important component of an HA file server with shared storage.

There is an interview with Alan Robertson in the September 2002 Linux Journal about the history of the Linux-HA project, and the related Open Cluster Framework (OCF) effort. Linux-HA is aiming at becoming the OCF reference implementation.

There is an article in IBM's Developer Domain online magazine entitled "Non-stop authentication with Linux clusters" about configuring an LDAP server based on heartbeat. This is an excellent article which also shows how to use LDAP's built-in replication to avoid needing shared disks or writing your own replication software.

Ken Bantoft has written an article entitled Highly Available VPNs on Linux which is available on the FreeS/WAN web site.

IBM has published a customer case study called Delivering Ten Second Failover for High-Volume Transactional Telco Applications with IBM DB2(c) Universal Database V 8.1. The paper uses the heartbeat software for its measurements.

IBM has published a Redbook called IBM eServer BladeCenter, Linux, and Open Source: Blueprint for e-business on demand. This redbook documents how to create highly available core internet services using a variety of open source tools including the Linux-HA software.

IBM has published a red paper on Linux-HA entitled Linux on IBM zSeries and S/390: High Availability for z/VM and Linux. It contains good information on configuring the Linux-HA software properly in the zSeries environment. This paper provides information to help readers plan for and install a high availability solution for Linux for zSeries running under z/VM. It is written for customers, technical presales people, and technical managers, to help them discuss the high availability possibilities of a Linux for zSeries environment running under z/VM and to guide them in implementing a solution.

Thomas Olausson has written a document entitled Apache failover with heartbeat and Mon, which is a mini-howto for how to configure heartbeat and Mon together to monitor Apache.

There is an article in the November 2000 issue of LinuxFocus entitled "High Availability systems under Linux" on configuring a High-Availability Apache server with Heartbeat.

Omer Faruk Sen <ofsen at enderunix.org> has written an introductory article on Yuksek Erisilebilir Sistemler (High Availability) in Turkish which explains some concepts of high availability and explains how to install the Linux-HA heartbeat software.

Hsing-Foo Wang <hsing-foo.wang at star-support.com> has written an HA HOWTO.

We have put our foot on the road to being competitive with commercial systems similar to those described in D. H. Brown's 1998 or March 2000 analysis of RAS cluster features and functions. It seems possible for Linux HA software to achieve D. H. Brown level of competitiveness in the next few years.

Components

High-Availability systems need a whole range of components in order to provide complete solutions covering every need. These components are:

Membership services
Communication Services
Cluster management
Resource (I/O) fencing
Resource Monitoring
Storage Sharing/Replication. Storage sharing can be accomplished as:
- Storage Sharing (shared SCSI, FDDI, etc.)
- Storage Replication
  - Application protocol (DNS, NIS, etc.)
  - rsync, etc.
  - drbd, nbd, etc.

Not every deployment needs all these components. The last two functions are currently parts of other projects, with links to them below.

We now have a page of reference sites to provide a few real-life examples of how people use heartbeat in production. Submissions for this page are quite welcome.

MSC software commissioned Jamie Cameron to write a webmin module for configuring heartbeat. You can read about webmin at the Webmin home page, or just download the heartbeat module if you already use webmin.

Heartbeat has a TODO list. As always, comments are very much in order. Even better is signing up for the pieces described :-).

Heartbeat is in the process of transforming itself to be aligned with the Open Cluster Framework (OCF).

What Linux-HA can do now

The "heartbeat" code implements serial, UDP, and PPP/UDP heartbeats together with IP address takeover, and includes a nice resource model with resource groups. It currently supports multiple IP addresses and a simple two-node primary/secondary model. It is both extremely useful and quite stable at this point in time. The following types of applications are typical:
- Web servers
- LVS director servers
- Mail servers
- Database servers
- Firewalls
- File servers
- DNS servers
- DHCP servers
- Proxy Caching servers
- etc.
For a sample list of production users, please see our reference sites.
Fake provides IP address takeover (courtesy of Horms). Heartbeat incorporates updated versions of this code as part of itself.
Tom Vogt has written a separate UDP HA heartbeat subsystem called Heart. Tom's code provided the initial UDP heartbeat framework used in "heartbeat" above.
Note: The Linux Virtual Server Project recommends "heartbeat" to provide front end router (director) redundancy.
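
As a rough illustration of the two-node primary/secondary model described above, the following Python sketch shows how a node might decide that its peer is dead and which resources it should then take over. It is not heartbeat's actual code: the class name HeartbeatMonitor, the deadtime parameter, the function resources_to_take_over, and the resource names are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Tracks when each peer node was last heard from.

    deadtime is an illustrative name for the number of seconds of
    silence after which a peer is declared dead.
    """
    deadtime: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, node: str, now: float) -> None:
        # Called whenever a heartbeat packet from `node` arrives.
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> List[str]:
        # A node is dead once it has been silent for longer than deadtime.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.deadtime
        )


def resources_to_take_over(monitor: HeartbeatMonitor,
                           owners: Dict[str, str],
                           now: float) -> List[str]:
    """Return the resources whose current owner is considered dead."""
    dead = set(monitor.dead_nodes(now))
    return sorted(res for res, owner in owners.items() if owner in dead)
```

In this sketch the secondary would call record_heartbeat for every packet received over the serial or UDP link, and periodically call resources_to_take_over with a map such as {"192.168.1.100": "primary", "httpd": "primary"}. Passing the clock in as an argument keeps the logic easy to test; a real system would also need fencing (STONITH) before taking over shared storage.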

Links

Linux-HA-specific Links

General Project Links
- Heartbeat home page.
- The Linux-HA HOWTO. This is the document which started the Linux-HA project. It is full of good thoughts about High-Availability systems. It was written by Harald Milz.
- Go to the contact page to subscribe to the linux-ha mailing list, or any of our other mailing lists.
- There is a CVS repository for Linux-HA. You can find it at cvs.linux-ha.org. Read-only access is via login guest, password guest, module name linux-ha. More details are to be found in the announcement email. It is also available through the web using viewcvs at http://cvs.linux-ha.org/viewcvs/viewcvs.cgi/linux-ha/. Go to the contact page to subscribe to the CVS commits mailing list.
LAN Mirroring technologies: LAN Mirroring technologies are an important HA technology now, and will remain a key ingredient in many inexpensive HA systems. They allow two machines to mirror a filesystem between themselves with only two NICs and a crossover cable.
- DRBD: Philipp Reisner's system for mirroring filesystems across the LAN. It is RAID and LAN aware with a quick sync option for bringing the mirror up to date quickly. This is the general replication technology which is most commonly used by members of the Linux-HA community.
- NBD server software - the complement to the in-kernel NBD driver.
- Linux Journal article by P.T. Breuer, A. Marin Lopez and Arturo Garcia Ares on The Network Block Device. A network block device (NBD) driver makes a remote resource look like a local device in Linux, allowing a cheap and safe real-time mirror to be constructed.
- ODR: Jerome Etienne's Online Disk Replicator proposal. ODR is similar to DRBD, but supports more than two nodes, and has support for journalling. Marcelo says it's really cool. For now, here's a link to Jerome's last note on it. More info will show up here when I have more.
- Enhanced Network Block Device project by P.T. Breuer. ENBD is a project to make the NBD work right. It doesn't combine the NBD and RAID together like DRBD does.
- GNBD: The GFS Network Block Device
- Network RAID mini-mini HOWTO: Marcelo Tosatti has documented how to use RAID1 + the Network Block Device + his RAID write-only patch to implement network RAID for Linux-HA. This link is no longer valid.
Volume Managers:
- Heinz Mauelshagen's LVM: Linux Volume Manager. LVM allows the growth and shrinkage of partitions, as one would expect, but also allows the creation of snapshot partitions, enabling consistent online backups or recovery in filesystems and database systems.
- IBM's EVMS Enterprise volume management system. The EVMS Project has the goal of providing unparalleled flexibility and extensibility in managing storage. It represents a new approach to logical volume management, as the architecture introduces a plug-in model that allows for easy expansion or customization of various levels of volume management.
Filesystem technologies: Two types of filesystem technologies are important for Linux-HA: journalling filesystems and cluster filesystems. Journalling filesystems allow takeover of shared/mirrored filesystems to occur rapidly in cases of failover. Cluster filesystems allow disks to be shared read-write between all members of a cluster.
- Open GFS: The (Open) Global Filesystem - fault tolerant, distributed, very cool, etc. GFS is both a cluster filesystem and a journalled filesystem.
- Reiserfs: A tree-structured filesystem which journals updates. Reiserfs is thought to be production quality.
- Ext3: Stephen Tweedie's journalling version of the ext2 filesystem.
- JFS: IBM's journalled filesystem technology.
- XFS: SGI's journalled filesystem technology.
- Timpanogas Research's FENRIS NetWare-compatible file system. Available now. Not sure if it supports journalling or not.
- Timpanogas Research's M2CS cluster file system. Early release copies of their filesystem available in Q3 2000.
- Intermezzo: A distributed file system which lets systems replicate directory trees. It is now in most 2.4 and later kernels.
- Lustre: Lustre is a novel storage and file system architecture and implementation which does not follow the block device paradigm. The central target in this project is the development of a next-generation cluster file system which can serve clusters with 10,000's of nodes and petabytes of storage, and move 100's of GB/sec.
- LinLogFS A log structured filesystem for Linux
Other Linux-HA Management Software
- SGI's Linux FailSafe project. Linux FailSafe is the most complete and functional open source HA software.
- Mission Critical Linux's Kimberlite Open Source High-Availability system.
- UltraMonkey: LVS and heartbeat and a GUI
- Jerome Etienne's VRRP implementation. Jerome implemented VRRPv2 as specified by RFC 2338.
- Fake: Redundant server switch software home page
- Failoverd: Failover daemon home page. Failoverd is now VRRP-based.
- Heart website (Tom Vogt's work -- see Software link)

Monitoring and Administration Software for Linux

lm-sensors: hardware health monitoring project.
SAF-TE SCSI Accessed Fault-Tolerant Enclosure device. If you have a SAF-TE compatible storage enclosure, this program will let you read the enclosure configuration, fetching things such as the number of fans, power supplies, and slots, and also the mapping of slots to SCSI IDs. This program is really just a proof of concept at the moment. The SAF-TE spec is on the web and an addendum is at this location.
Linux RAS project. Very ambitious. Quite cool. Mailing list information can be found on the web.
Linux RAM ECC monitoring with a corresponding Mailing list.
Chris Brady's x86 Memory Testing program (memtest86). This ships with newer versions of SuSE Linux.
Mon: service monitoring daemon.
OpenNMS is an open-source project dedicated to the creation of an enterprise grade network management platform.
Spumoni enables any program which can be queried via local commands to be health-checked via SNMP. This allows admins to use enterprise-level monitoring programs such as OpenNMS, Tivoli, OpenView, MRTG and RRDTool for even non-SNMP-enabled applications.
Monit: Monit is a utility for monitoring daemons or similar programs running on a Unix system. It will start specified programs if they are not running and restart programs not responding.
VACM: VA Cluster Manager. VACM provides cluster monitoring and control at a very fundamental level.
PIKT: Problem Informant Killer Tool
NOCOL/SNIPS - system and network monitoring tool
Big Brother - Systems and Network monitor. Maintained by some nice guys!
Nagios - Network monitor (formerly Netsaint)
MAT - Monitoring and Administration Tool. MAT is an easy to use network enabled UNIX configuration and monitoring tool. It provides an integrated tool for many common system administration tasks, including Backups, and Replication
WebRAT - a web based administration tool, to administer several nodes on a network, from a central host (administration server). The main purpose of WebRAT is to administer a network with many nodes, remotely. The more the nodes on the network, the more WebRAT will seem to be irreplaceable.
SRRD, the Service Routing Redundancy Daemon, was designed and developed starting in September 2002 by Amir Guindehi as a master's thesis at the Swiss Federal Institute of Technology in Zurich.
Keepalived: keepalive facility for LVS. The main goal of the keepalived project is to add a strong and robust keepalive facility to the Linux Virtual Server project. This project is similar to the MON project, but it is written in C with multilayer TCP/IP stack checks. Keepalived implements a framework based on three families of checks: Layer3, Layer4 & Layer5.
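
Several of the tools above decide whether a service is healthy by probing it over the network. As a minimal sketch of what a Layer 4 style check can look like, the following Python function simply tries to open a TCP connection. It is not taken from Keepalived, Mon, or any other tool listed here; the function name tcp_check and its default timeout are assumptions made for the example.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds in time.

    This only proves that something accepts connections on the port;
    it says nothing about whether the application behind it is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A monitor could call tcp_check("127.0.0.1", 80) at a fixed interval and raise an alert, or trigger a failover, after several consecutive failures. A Layer 5 style check would go one step further and exchange application data, for example by requesting a known URL.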

HA-Related Links and Information

SCTP: Stream Control Transmission Protocol (RFC 2960). From Joe Grimm: The HA features of SCTP revolve around using multi-homed hosts to provide failover paths implemented at the transport layer; essentially this becomes a transparent issue to SCTP applications. This is just one of the features in SCTP, one network standard being driven by the Carrier Grade community to provide 5 nines networking on top of the Internet. The Linux kernel project to implement this can be found at: http://www.sf.net/projects/lksctp.
The IETF's Reliable Server Pools working group. The purpose of the working group is to develop an architecture and protocols for the management and operation of server pools supporting highly reliable applications, and for client access mechanisms to a server pool.
IP Channel Bonding for Linux. Channel bonding is the technique of using more than one ethernet link to increase bandwidth and availability on IP networks.
Extended Transmission Control Protocol (ETCP). ETCP helps within migration clusters because it makes it possible to have a network connection follow a process, instead of TCP I/O being relayed through the original node. Given application state synchronization, this technique could also be useful for failover clusters.
IEEE Task Force On Cluster Computing High Availability Information Page
Linux Clustering Information Center
IEEE Task Force On Cluster Computing
Ninja: Highly Available, Scaleable Computing Environment.
High Performance Cluster Computing resource page. High Performance Cluster Computing is a book edited by Dr. Rajkumar Buyya of Monash University's School of Computer Science and Software Engineering.
MySQL Clustering toolkit.
Dominique Chabord's Shaman-X software - providing open source and free high-availability, disaster tolerance, and crisis management software.
Ericsson's Eddie open source High-Availability Server farm project. It is written in Ericsson's programming language Erlang/OTP.
Mnesia is a distributed DataBase Management System designed to handle failovers as part of Eddie.
Clusterit: parallel clustering software for *BSD systems.
Ron Resnick's Taxonomy of High-Availability (originally prepared for Nortel). Basic definitions, and discussion of a few technologies.
Andreas Muller's HA-failover project. It seems like a nice implementation, mainly targeted at Solaris. Here is a summary from email he sent me:

Consists of two components: a daemon that stores state information about services served by the machine, and a shell that checks the state of all the services on remote machines (heartbeat).
Shell is programmable in Tcl.
Heartbeat uses either TCP or UDP, TCP always non-blocking.
Monitoring interfaces: text-based, curses, and HTML.
The complete documentation is on the web site. There is also a detailed example on the server that shows how to do IP address failover, including sending a gratuitous ARP on Solaris (a tool for sending gratuitous ARPs using DLPI is included, but there is also a less carefully tested Linux version).
As the thing is programmable in Tcl, doing failover for disk systems or other kinds of services is quite easy to implement. The only missing component is that the machine must be able to find out about the state of a disk, and it must publish it in the daemon. I guess Mon can do such things; I usually use scripts written for the occasion.

Andrew Barnett's pl-cluster software. Andrew wasn't happy with heartbeat, so he wrote his own version. It uses multicast, and can run in listen-only mode. According to Andrew: It is alpha at best, YMMV.
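
The HA-failover summary above mentions sending a gratuitous ARP as part of IP address failover. As a hedged illustration of what such a packet contains, the following Python sketch builds the raw Ethernet frame for a gratuitous ARP request, in which the sender and target IP addresses are both the address being taken over. The function name build_gratuitous_arp and the example addresses are assumptions for this sketch; it is not the code used by heartbeat, Fake, or HA-failover.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for ip."""
    hw = bytes.fromhex(mac.replace(":", ""))
    if len(hw) != 6:
        raise ValueError("MAC address must be 6 bytes")
    addr = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType ARP.
    ethernet = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)

    # ARP payload for IPv4 over Ethernet (28 bytes).
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # hardware and protocol address lengths
        1,            # operation: request
        hw, addr,     # sender hardware and protocol address
        b"\x00" * 6,  # target hardware address (unknown)
        addr,         # target protocol address equals the sender's
    )
    return ethernet + arp
```

Calling build_gratuitous_arp("00:11:22:33:44:55", "192.168.1.100") returns a 42-byte frame. Actually transmitting it requires a raw link-layer socket and administrative privileges, which is platform specific and left out of this sketch.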