Computer cluster

A computer cluster consists of a set of loosely or tightly connected computers that work together so that, in many respects, they can be viewed as a single system. The components of a cluster are usually connected to each other through fast local area networks, with each node running its own instance of an operating system.

They are usually deployed to improve performance and availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.

Types of clusters

Computer clusters have a wide range of applicability and deployment, ranging from small business clusters with a handful of nodes, such as those supporting a web service, to the computation-intensive scientific calculations of the fastest supercomputers in the world today.

“Load-balancing clusters” are configurations in which cluster nodes share the computational workload to provide better overall performance. A web server cluster, for example, may assign different queries to different nodes to optimize overall response time, while a high-performance cluster used for scientific computation may balance its load with more specialized algorithms.
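The simplest dispatch policy a load-balancing cluster can use is round-robin: each incoming query goes to the next node in turn. The sketch below illustrates the idea with hypothetical node names; a real web server cluster would dispatch over the network to actual hosts.

```python
from itertools import cycle

# Hypothetical node names; a real cluster would use hostnames or IP addresses.
nodes = ["node-a", "node-b", "node-c"]
assignments = {name: [] for name in nodes}

# Round-robin dispatch: each incoming query goes to the next node in turn,
# so the workload is spread evenly across the cluster.
dispatcher = cycle(nodes)
for query_id in range(7):
    node = next(dispatcher)
    assignments[node].append(query_id)

print(assignments)
# {'node-a': [0, 3, 6], 'node-b': [1, 4], 'node-c': [2, 5]}
```

Production load balancers typically refine this with weighting or least-connections policies, but the round-robin core is the same.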

“High-availability clusters” (also known as failover clusters, or HA clusters) improve the availability of the cluster approach. They operate by having redundant nodes, which are used to provide service when system components fail. HA cluster implementations attempt to use redundancy of cluster components to eliminate single points of failure.

“Fail-over clusters” consist of two or more network-connected computers with a separate heartbeat connection between the hosts. The heartbeat connection between the machines is used to monitor whether all services are still operational: as soon as a service on one machine fails, the other machines attempt to take it over.
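The heartbeat mechanism can be sketched in miniature: each node periodically records a heartbeat timestamp, and a node whose last beat is older than a timeout is declared failed, signalling the standby to take over. The node names, timeout value, and functions below are illustrative assumptions, not part of any particular HA product.

```python
import time

# Hypothetical heartbeat state: each node's name maps to the time of its
# most recent heartbeat. TIMEOUT is the silence threshold for failover.
last_beat = {}
TIMEOUT = 2.0  # seconds without a heartbeat before a node is declared failed

def record_heartbeat(node, now=None):
    """Record that `node` is alive at time `now` (monotonic clock by default)."""
    last_beat[node] = time.monotonic() if now is None else now

def failed_nodes(now=None):
    """Return nodes whose last heartbeat is older than TIMEOUT."""
    now = time.monotonic() if now is None else now
    return [n for n, t in last_beat.items() if now - t > TIMEOUT]

record_heartbeat("primary", now=0.0)
record_heartbeat("standby", now=0.0)
record_heartbeat("standby", now=3.0)   # standby keeps beating; primary goes silent
print(failed_nodes(now=3.5))           # ['primary'] -> standby should take over
```

Real implementations such as Linux-HA run this exchange over a dedicated link precisely so that a failure of the main network is not mistaken for a failure of the peer node.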

Benefits

  • Clusters are primarily designed with performance in mind.
  • Clustering servers is a fully scalable solution: resources can be added to the cluster afterwards.
  • If a server in the cluster needs maintenance, it can be stopped while its load is handed over to other servers.
  • Low frequency of maintenance routines, resource consolidation and centralized management.
  • Capable of enabling data recovery in the event of a disaster and providing parallel data processing and high processing capacity.

Implementation

The GNU/Linux world supports various cluster software. For application clustering there are distcc and MPICH. Linux Virtual Server and Linux-HA are director-based clusters that allow incoming requests for services to be distributed across multiple cluster nodes. MOSIX, LinuxPMI, Kerrighed and OpenSSI are full-blown clusters integrated into the kernel that provide automatic process migration among homogeneous nodes. OpenSSI, openMosix and Kerrighed are single-system-image implementations.
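What all of these packages automate is the division of one job into pieces executed on many workers. As a single-machine analogue (an assumption for illustration only, not how distcc or MPICH are actually invoked), a thread pool can stand in for the cluster nodes and a map call for the director that spreads tasks across them:

```python
from concurrent.futures import ThreadPoolExecutor

def node_work(task):
    # Stand-in for the work one cluster node would perform, e.g. compiling
    # one source file under distcc or computing one rank's share under MPI.
    return task * task

# Four pool workers stand in for four cluster nodes; the executor plays the
# role of the director that distributes incoming tasks across them.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(node_work, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

In a real cluster the same pattern runs over the network, with middleware handling task distribution, result collection, and node failure.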

Microsoft Windows Compute Cluster Server 2003, based on the Windows Server platform, provides components for high-performance computing such as the Job Scheduler, the MSMPI library and management tools.

gLite is a set of middleware technologies created by the Enabling Grids for E-sciencE (EGEE) project.

Slurm is also used to schedule and manage some of the largest supercomputer clusters.

source: Wikipedia
