
Beowulf Clusters

In this case, Beowulf is a relatively new way of building supercomputers out of clusters of inexpensive, commodity off-the-shelf computers, and Wiglaf was the first example of this extreme way of building massively parallel machines.

In the ancient saga of Beowulf, Wiglaf was his trusted aide.

The Beowulf project started at NASA in 1994, when they needed a single-user supercomputer for some data-processing tasks but could not buy one of sufficient power at a price they could afford. The solution they came up with was radical: they built a cluster of 16 PCs with 66 MHz 486 processors, standard off-the-shelf network cards, and the free operating system Linux, and called it Wiglaf. Configured as a single multi-processor machine, it was found to be of comparable speed (74 Mflops) to the CM-5 supercomputer, for problems that fit its architecture, while being massively cheaper.

They also found that they had various advantages over the Connection Machine (CM-5) that they had not thought about. For example, they could get more power simply by adding more processors (one Beowulf cluster has over 1000 PCs), and they could speed it up by upgrading the machines (Hrothgar replaced Wiglaf in 1995, with 16 Pentium PCs, and reached 280 Mflops). Most of the problems encountered so far have been due to not having the parallel algorithms needed to take advantage of the architecture.

The Extreme Linux CD is the result of the work of the Beowulf project at NASA to produce a cheaper supercomputer. It integrates all of the parallel processing tools they needed with Red Hat Linux 5.0.

You do not have to buy the Extreme Linux CD, however, as most of the software is readily available in most distributions of Linux, with the remaining and newer tools being downloadable from the internet.

It is not even necessary to use Linux now, as some clusters have been put together using Windows NT. As far as I know, though, almost all of the development work on newer software is going on on the Linux platform, and Beowulf clusters would not have existed as an architecture had Linux not been freely available with its source code when the first one needed to be built.

Xyroth is currently trying to build a Beowulf cluster using the Extreme Linux CD, 32 486 PCs, and the book How to Build a Beowulf.

Now for the technical stuff. All you need to qualify as a Beowulf cluster is an SMP (Symmetric Multi-Processing) box with 2 or more processors, or 2 single-processor boxes where you can log in to either from the other, plus a "home" directory shared via NFS. There you have the simplest cluster you can build. As you can see, if your machine runs Linux with more than one processor, you qualify, so the number of clusters out there is increasing dramatically.
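For the two-box version described above, the shared home directory amounts to a couple of lines of configuration. This is only a sketch, and the hostnames "node1" and "node2" are invented for the example:

```
# /etc/exports on node1, the machine that owns /home:
/home  node2(rw,sync)

# then on node2, mount node1's home directory over your own:
mount -t nfs node1:/home /home
```

With that in place, plus mutual logins, both machines see the same files and you have your minimal cluster.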

The simplest form of parallel processing is where you run multiple copies of the same application on different processors using different data. This takes very little programming know-how and no program modification. It is difficult to bind a specific task to a specific processor using SMP on Linux at the moment, but people are working on it; this is easier to do on a multi-node system than on SMP at present.

You can write your program in multiple parts communicating via POSIX threads, which is very fast, but this only really works on SMP machines and does not scale up well.

The preferred way is to develop your program using one of the message passing libraries: if you use the right ones, you can run it immediately on supercomputers, or distributed over the various machines in your cluster, or just test it on an SMP box. In fact, this is becoming a standard way to develop programs for a cluster: you develop them on an SMP box (probably attached to the network), and once you have them working there, you can farm them out to the cluster.

A good example of a cluster in use is the Google search engine, which currently (2001/07) runs on a cluster of 8000 Red Hat-based Linux machines.

See also Zyra's Beowulf review.
