Cluster Computing & Grid Computing
Cluster Computing
A computer cluster is a group of linked computers that work together so closely that in many respects they form a single computer. The components of a cluster are connected to each other through fast local area networks.
Requirements for computing are increasing fast
• More data to process.
• More compute-intensive algorithms available.
Approaches to meeting the demand
• Qualitative: optimized algorithms, faster processors, more memory.
• Quantitative: cluster computing, grid computing, etc.
Cluster categorizations
• High Availability Cluster
• Load Balancing Cluster
• HPC Cluster
High Availability Clusters
• Also known as failover clusters; implemented mainly to improve the availability of the services that the cluster provides.
• They operate by having redundant nodes; when a node fails, the standby node takes over.
• Types of high availability clusters: one-way and two-way.
• Often used for critical databases, network file sharing, and business applications.
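
The failover behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` and `FailoverCluster` classes, the node names, and the use of `ConnectionError` to signal a failed node are all hypothetical, and a real high availability cluster would detect failures with heartbeats over the network rather than an in-process flag.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical cluster node; `healthy` stands in for a real health check."""
    name: str
    healthy: bool = True

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class FailoverCluster:
    """Active/standby pair: requests go to the active node until it fails."""

    def __init__(self, primary: Node, standby: Node) -> None:
        self.active = primary
        self.standby = standby

    def handle(self, request: str) -> str:
        try:
            return self.active.handle(request)
        except ConnectionError:
            # Failover: promote the standby and retry the request once.
            self.active, self.standby = self.standby, self.active
            return self.active.handle(request)


cluster = FailoverCluster(Node("node-a"), Node("node-b"))
first = cluster.handle("query-1")   # served by node-a
cluster.active.healthy = False      # simulate a failure of node-a
second = cluster.handle("query-2")  # the standby node-b takes over
print(first)
print(second)
```

The client only ever talks to the cluster object, so the switch from one node to the other is invisible to it, which is the point of a failover cluster.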
Load Balancing Clusters
• Multiple computers connected together to share the computational workload.
• Logically they are multiple computers, but they function as a single virtual computer.
• Requests initiated by users are distributed among all the nodes by one or more load balancers.
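
As a rough illustration of how a load balancer spreads user requests over the nodes, here is a minimal round-robin sketch. Round-robin is an assumption chosen for simplicity (the text does not name a distribution policy), and the `RoundRobinBalancer` class and node names are hypothetical.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = cycle(nodes)

    def dispatch(self, request):
        # Return the node chosen for this request together with the request.
        node = next(self._nodes)
        return node, request


balancer = RoundRobinBalancer(["node-1", "node-2", "node-3"])
assignments = [balancer.dispatch(f"req-{i}")[0] for i in range(6)]
load = Counter(assignments)  # how many requests each node received
print(assignments)
print(load)
```

Real load balancers often weigh nodes by current load or response time instead of rotating blindly; the rotation above is just the simplest policy that shares the work evenly.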
HPC Clusters
• HPC clusters are mainly used to increase performance by splitting a computational task across different nodes.
• Mainly used in scientific computing.
• Popular HPC cluster implementations use nodes running Linux and free software to implement the parallelism.
• When the jobs running on the cluster nodes require little or no inter-node communication, this is sometimes called “Grid Computing”.
• The local scheduling software manages load balancing across the cluster nodes.
• Middleware such as MPI (Message Passing Interface) or PVM (Parallel Virtual Machine) makes cluster programs portable to a wide variety of clusters.
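
The idea of splitting one computational task across nodes can be sketched as a scatter, compute, gather pattern. The example below is an assumption-laden stand-in: it uses threads from Python's standard library in place of real cluster nodes and does not use MPI or PVM; the function names and the sum-of-squares workload are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Work done independently by one 'node' on its own chunk."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(data, nodes=4):
    chunks = split(data, nodes)  # scatter the input
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        # Each worker computes its partial result without talking to the others.
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)  # gather and reduce the partial results


numbers = list(range(1, 101))
total = cluster_sum_of_squares(numbers, nodes=4)
print(total)
```

Because the workers never exchange data while computing, this is the "little or no inter-node communication" case; tightly coupled problems need message passing between nodes during the computation, which is what MPI provides.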
Grid Computing
What is Grid?
Grid computing refers to the combination of computer resources from multiple administrative domains to reach a common goal. A Grid:
• Coordinates resources that are not subject to centralized control.
• Uses standard, open, general-purpose protocols and interfaces.
• Delivers nontrivial qualities of service.
Why Grid?
• Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed.
• The overall motivation for “Grids” is to facilitate the routine interactions of these resources in order to support large-scale science and engineering.
• A Virtual Organization (VO) is a dynamic set of individuals and/or institutions defined around a set of resource-sharing rules and conditions.
• In a VO, multiple organizations function as one unit through the use of their shared competencies and resources for the purpose of one or more identified goals.
• Example: LHC: 1800 physicists, 150 institutes, 32 countries; 100 PB of data by 2010; 50,000 CPUs.
Components of Grid
Grid Architecture
• Grid architecture can be described as layers of building blocks, where each layer has a specific function, which together make up the Grid computing infrastructure.
Grid middleware
• A mediator layer that provides consistent and homogeneous access to resources that are managed locally with different syntax and access methods.
• It provides a uniform interface to the Grid for users and handles all the complexity that arises from heterogeneous systems.
• Middleware software is a layer between grid applications and the low-level functionality of the grid.
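
One way to picture the "uniform interface over locally different syntax" role of middleware is an adapter layer. The sketch below is an assumption for illustration only: the class names and site names are hypothetical, PBS and Slurm are picked as example local resource managers (the text does not name any), and the code only builds command lines without running them.

```python
from abc import ABC, abstractmethod


class LocalResourceManager(ABC):
    """Common interface the middleware expects from every site adapter."""

    @abstractmethod
    def submit_command(self, name, script):
        """Return the site-specific command line for submitting a job."""


class PbsAdapter(LocalResourceManager):
    def submit_command(self, name, script):
        # PBS-style submission: qsub with -N for the job name.
        return ["qsub", "-N", name, script]


class SlurmAdapter(LocalResourceManager):
    def submit_command(self, name, script):
        # Slurm-style submission: sbatch with --job-name.
        return ["sbatch", f"--job-name={name}", script]


class Middleware:
    """Hypothetical mediator: one submit() call, whatever the site runs."""

    def __init__(self):
        self._sites = {}

    def register(self, site, adapter):
        self._sites[site] = adapter

    def submit(self, site, name, script):
        return self._sites[site].submit_command(name, script)


grid = Middleware()
grid.register("site-a", PbsAdapter())
grid.register("site-b", SlurmAdapter())
cmd_a = grid.submit("site-a", "demo", "run.sh")
cmd_b = grid.submit("site-b", "demo", "run.sh")
print(cmd_a)
print(cmd_b)
```

The user issues the same call for both sites; only the adapters know the local syntax, which is the separation the middleware layer is meant to provide.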
Popular middleware
• Globus Toolkit – Globus Alliance
• gLite – EGEE
• Gridbus – University of Melbourne
• UNICORE (Uniform Interface to Computing Resources) – Institute for Advanced Simulation, Jülich, Germany
• OMII – Open Middleware Infrastructure Institute
Functionalities
1. Security
    • Information security
        • Secure communication
        • Authentication: single sign-on and delegation
        • Authorization: resource level, VO level
    • Infrastructure-level security
        • Host security
2. Job Management
    • Support for open job description languages: RSL, JDL, JSDL
    • Submission, status query, cancel and destroy, getting output and error
    • Transferring input/output data from/to a remote source/destination
    • Support for serial and parallel jobs (heterogeneous and homogeneous)
    • Integration with all local resource managers
3. Data Management
    Two basic categories of data management:
    • Data movement: secure, robust, efficient, with third-party movement
    • Data replication: one or more copies (replicas) of the data, to survive loss, make data easily available, reduce access latency, and improve performance for distributed applications
4. Information System
    • Provides mechanisms for discovery and monitoring of resources
    • Designed to provide various characteristics of resources, computations, services, and other entities
    • Provides access to static and dynamic information regarding system components
    • Access to information is subject to authentication and authorization mechanisms
    • Information sources are distributed
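
The job management operations listed above (submission, status query, cancel, getting output and error) can be illustrated with a small in-memory sketch. Everything here is hypothetical: the `JobManager` class, its method names, and the job states are assumptions for illustration and do not correspond to the API of Globus, gLite, or any other middleware.

```python
from enum import Enum
from itertools import count


class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    CANCELLED = "cancelled"


class JobManager:
    """Toy job manager: jobs are plain Python callables run locally."""

    def __init__(self):
        self._ids = count(1)
        self._jobs = {}

    def submit(self, func, *args):
        job_id = next(self._ids)
        self._jobs[job_id] = {"func": func, "args": args,
                              "state": JobState.PENDING,
                              "output": None, "error": None}
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]["state"]

    def cancel(self, job_id):
        # Only jobs that have not started yet can be cancelled here.
        job = self._jobs[job_id]
        if job["state"] is JobState.PENDING:
            job["state"] = JobState.CANCELLED
            return True
        return False

    def run_pending(self):
        for job in self._jobs.values():
            if job["state"] is not JobState.PENDING:
                continue
            job["state"] = JobState.RUNNING
            try:
                job["output"] = job["func"](*job["args"])
            except Exception as exc:
                # Keep the failure as the job's error instead of crashing.
                job["error"] = repr(exc)
            job["state"] = JobState.DONE

    def result(self, job_id):
        job = self._jobs[job_id]
        return job["output"], job["error"]


manager = JobManager()
a = manager.submit(pow, 2, 10)
b = manager.submit(int, "not a number")  # this job will fail
c = manager.submit(sum, [1, 2, 3])
manager.cancel(c)                        # cancelled before it runs
manager.run_pending()
print(manager.status(a), manager.result(a))
print(manager.status(b), manager.result(b))
print(manager.status(c), manager.result(c))
```

In a real grid the submit call would carry a job description (for example in RSL, JDL, or JSDL) and the job would run on a remote resource, so status and output would be fetched over the network rather than read from a local dictionary.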
Applications
• Sequential jobs for a particular platform
• Concurrent sequential jobs for different platforms
• Homogeneous parallel jobs for a particular OS
• Heterogeneous parallel jobs
• Bioinformatics applications
• High energy physics applications
• Weather modelling and predicting ocean currents
• Disaster management
• Aerodynamic simulations
Advantages
• Can solve larger, more complex problems in a shorter time
• Easier to collaborate with other organizations
• Makes better use of existing hardware
Disadvantages
• Grid software and standards are still evolving
• Learning curve to get started
• Non-interactive job submission