Saturday, August 3, 2013

Grid Computing


This article was written with the help of my colleague Mohamed Rasmy


What is Grid Computing?
Grid computing is a computing architecture that combines computer resources from multiple administrative domains to reach a common goal. The computers on the network can work on a task together, functioning as a virtual supercomputer. Typically, a grid works on many different tasks within a network, but it can also be applied collectively to a single specialized application. It is designed to solve problems that are too large for any single supercomputer while retaining the flexibility to process many smaller tasks. Computing grids deliver a multi-user infrastructure that accommodates the intermittent demands of large-scale information processing.

Put simply, in grid computing the resources of each computer can be shared with every other computer in the network: disk drives, mass storage, printers and RAM are all shared over the network. Grid computing differs from a conventional supercomputer in that a supercomputer has many processors connected by a local high-speed bus, whereas a grid connects many complete computers via a network. The computers in a grid also need not be physically close to one another or even in the same building.
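The core idea above, splitting one job into independent work units that run on separate machines and combining the partial results, can be sketched in a few lines. This is a toy illustration only: here local worker threads stand in for grid nodes, whereas a real grid ships work units over the network to other computers.

```python
# Toy sketch of grid-style task farming: one job is split into independent
# work units, each handled by a separate "node" (here, a worker thread),
# and the partial results are combined at the end.
from concurrent.futures import ThreadPoolExecutor

def work_unit(chunk):
    """One independently computable piece of the larger job."""
    return sum(x * x for x in chunk)

def run_job(data, n_nodes=4):
    # Split the job into one chunk per "node".
    size = len(data) // n_nodes
    chunks = [data[i * size:(i + 1) * size] for i in range(n_nodes - 1)]
    chunks.append(data[(n_nodes - 1) * size:])  # last node takes the remainder
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(work_unit, chunks))
    return sum(partials)  # combine the partial results

print(run_job(list(range(10))))  # → 285 (sum of squares 0..9)
```

Because each work unit depends only on its own chunk, the nodes never need to communicate with each other, which is exactly why loosely networked computers can cooperate on such jobs.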

Open Grid Services Architecture
Open Grid Services Architecture (OGSA) is a set of standards that defines how information sharing should take place in large heterogeneous grid systems.
OGSA definitions and criteria apply to hardware, platforms and software in standards-based grid computing. OGSA addresses issues and challenges such as authentication, authorization, policy negotiation and enforcement, administration of service-level agreements, management of virtual organizations and customer data integration.
For a Web service to be considered a grid service, it must allow clients to easily discover, update, modify and remove information about the service's state, define how the service evolves and ensure ongoing compatibility with other services. The goal is to optimize communication and allow disparate information systems from multiple parties to readily work together and exchange data among resources of all types.

Open Grid Services Infrastructure
Open Grid Services Infrastructure (OGSI) defines mechanisms for creating, managing and exchanging information among grid services. OGSI is intended to provide the infrastructure layer for OGSA. It builds on Web services and includes extensions of Web service definitions, asynchronous notification of state change, references to service instances, collections of service instances, and service state data that extends the expressive capabilities of XML Schema definitions.
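Two of the OGSI ideas above, a service that exposes queryable state data and asynchronous notification of state change, can be illustrated with a small sketch. This is not real OGSI or Globus code; the class and method names are invented for the example.

```python
# Hedged sketch of two OGSI concepts: queryable service state data and
# asynchronous notification of state change via subscribed callbacks.
class GridService:
    def __init__(self, name):
        self.name = name
        self._state = {}          # the service's "state data elements"
        self._subscribers = []    # callbacks awaiting change notifications

    def subscribe(self, callback):
        """Register a client callback to be notified of state changes."""
        self._subscribers.append(callback)

    def query_state(self, key):
        """Clients can discover the service's current state."""
        return self._state.get(key)

    def set_state(self, key, value):
        self._state[key] = value
        for notify in self._subscribers:   # push the change to listeners
            notify(self.name, key, value)

events = []
svc = GridService("job-manager")
svc.subscribe(lambda name, key, value: events.append((name, key, value)))
svc.set_state("status", "RUNNING")
# events now holds ("job-manager", "status", "RUNNING")
```

The point of the pattern is that clients learn about state changes as they happen, rather than having to poll the service repeatedly.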

Grid Middleware
Middleware is software that connects two separate and different applications by creating a link between them. Middleware allows data contained in one database to be accessed through another. It strengthens and simplifies complex distributed applications, and encompasses web servers, application servers and messaging systems that support application development and delivery. Middleware provides services beyond those provided by the operating system, enabling the diverse components of a grid system to communicate and manage data.
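The "data in one database accessed through another" role of middleware can be sketched minimally: a thin layer that gives one application a single lookup call over two differently shaped backends. The store classes and their APIs below are invented stand-ins, not any real database driver.

```python
# Toy sketch of the middleware idea: one uniform interface in front of two
# backends with incompatible APIs. All names here are illustrative.
class SqlStyleStore:
    """Stand-in for a relational store keyed by integer id."""
    def __init__(self):
        self._rows = {1: "alice", 2: "bob"}
    def select_by_id(self, key):
        return self._rows.get(key)

class DocStyleStore:
    """Stand-in for a document store keyed by string id."""
    def __init__(self):
        self._docs = {"3": {"name": "carol"}}
    def find(self, key):
        return self._docs.get(key)

class Middleware:
    """Translates one lookup call into whichever backend API applies."""
    def __init__(self, sql, docs):
        self.sql, self.docs = sql, docs
    def lookup(self, key):
        # Try the relational backend first, then fall back to documents.
        return self.sql.select_by_id(key) or (
            (self.docs.find(str(key)) or {}).get("name"))

mw = Middleware(SqlStyleStore(), DocStyleStore())
```

The application calls only `mw.lookup(...)` and never needs to know which backend, or which query dialect, actually answered.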
Examples of middleware
ARC
The Advanced Resource Connector (ARC) is an open-source middleware introduced by NorduGrid. It provides a common interface for submitting computational tasks to different clusters of computing systems, enabling grid infrastructures of different sizes and complexity to work together. This middleware was first used in the Large Hadron Collider experiments at CERN.
EMI
The European Middleware Initiative (EMI) is middleware for high-performance grids, developed and distributed by the EMI project. EMI middleware supports broad scientific experiments and initiatives such as the Worldwide LHC Computing Grid.


Malaysian Grid - KnowledgeGrid
The Knowledge Grid Malaysia, also known as the Malaysian grid, was launched in 2007 by MIMOS, Malaysia's leading applied research organisation, to provide a national infrastructure that takes full advantage of high-performance computing resources to accelerate research and industrial development for national wealth and value creation. To implement the grid, MIMOS partnered with local Malaysian universities, foreign research institutions, multinational companies and local industries, building a strong foundation of data and expertise.
The Knowledge Grid Malaysia was approached in three phases. The first, based on a Service Oriented Architecture (SOA), comprised middleware, grid applications and security, as well as establishing research collaborations with local universities; MIMOS currently works with 13 local universities and has partnered with several multinational companies to provide the necessary tools and resources for the project. The second phase focuses on deepening this knowledge, and the final phase involves rolling out the grid nationwide.
The KnowledgeGrid is governed by the National Grid Coordinating Committee, assisted by two subcommittees: the National Steering Committee and the International Advisory Panel.


Worldwide LHC Computing Grid
The Worldwide LHC Computing Grid (WLCG) is an international collaboration comprising, as of 2012, some 170 computing centres in 36 countries, mainly in Europe. It is a project of CERN (the European Organization for Nuclear Research) to handle the enormous volume of data produced by the Large Hadron Collider experiments. The grid enables physicists all around the world to contribute to the search for the Higgs boson.
It is considered the largest computing grid, with around 200,000 processing cores and 150 petabytes of storage, and analyses around 25 petabytes of Large Hadron Collider data every year. The data stream from the detectors provides approximately 300 GB/s, which after filtering for "interesting events" results in a "raw data" stream of about 300 MB/s. The CERN computer centre, considered "Tier 0" of the LHC Computing Grid, has a dedicated 10 Gbit/s connection to the counting room.
The grid used the Advanced Resource Connector as its middleware at the beginning, and later switched to the European Middleware Initiative for higher performance.

GARUDA Grid Computing System
GARUDA is a nationwide collaboration of scientific researchers and experimenters on a grid of computational nodes, mass storage and scientific instruments that aims to provide the technological advances required to enable data- and compute-intensive science for the 21st century.
GARUDA aims to pool research and engineering in grid technologies, architectures and applications, build a nationwide computing grid, lay the foundation for the next generation by tackling long-term research issues in grid computing, and give users access to supercomputing facilities on which to run their applications.
The GARUDA grid deploys the Globus Toolkit, version 4.0.7 (GT4), for its operational middleware. Resource management and scheduling in GARUDA are based on industry-grade schedulers deployed in a hierarchical architecture. At the cluster level, scheduling is handled by LoadLeveler on AIX platforms and Torque on Solaris and Linux clusters. At the grid level, the GridWay meta-scheduler enables reliable and efficient sharing of computing resources managed by different LRM (Local Resource Management) systems.
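The hierarchical scheduling described above can be reduced to its core decision for illustration: the grid-level meta-scheduler picks a cluster, and that cluster's local resource manager then schedules the job internally. This sketch is not GridWay code; the cluster names and slot counts are invented, and real meta-schedulers weigh many more factors than free slots.

```python
# Illustrative sketch of grid-level scheduling: choose the local resource
# manager (LRM) with the most free slots, then hand the job to it.
# All cluster names and numbers below are invented for the example.
clusters = {
    "aix-loadleveler": {"total_slots": 64,  "busy": 60},
    "linux-torque":    {"total_slots": 128, "busy": 40},
    "solaris-torque":  {"total_slots": 32,  "busy": 10},
}

def free_slots(info):
    return info["total_slots"] - info["busy"]

def pick_cluster(clusters):
    """Grid-level step: select an LRM; the LRM then schedules locally."""
    return max(clusters, key=lambda name: free_slots(clusters[name]))

print(pick_cluster(clusters))  # → linux-torque (88 free slots)
```

The two-level split matters because the meta-scheduler never needs to know how LoadLeveler or Torque queue jobs internally; it only compares cluster-level availability.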
The grid is managed by 22 regional authorities in 11 cities across India, under the supervision of the Indian Grid Certification Authority (IGCA).

CPU Scavenging
CPU scavenging, also called cycle scavenging or shared computing, creates a grid from the unused resources of a network of participating computers, whether worldwide or internal to an organization. The approach was commercialized in 1997 by distributed.net and in 1999 by SETI@home, to solve CPU-intensive research problems by combining the power of networked PCs around the world. CPU scavenging uses the spare instruction cycles of desktop computers: at night, during lunch breaks, and in the scattered seconds throughout the day when a computer is waiting for user input or for slow devices. In practice, participating computers also contribute disk storage space, RAM and network bandwidth.
Today many volunteer computing projects, such as BOINC, use the CPU-scavenging model. Nodes in such a grid are likely to go offline from time to time, because their owners use the resources for their own primary purposes.
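The scavenging behaviour described above can be modelled in a few lines: a volunteer node processes work units only during time slices when the host is idle, and yields immediately whenever the owner needs the machine. This is a toy model; a real client such as BOINC watches actual user activity and CPU load rather than a prearranged idle schedule.

```python
# Toy model of cycle scavenging: science work runs only on "idle" time
# slices; on busy slices the owner's tasks take absolute priority.
def scavenge(work_units, idle_signal):
    """Consume work units only on ticks where the host reports idle."""
    done, pending = [], list(work_units)
    for idle in idle_signal:                   # one entry per time slice
        if idle and pending:
            done.append(pending.pop(0) ** 2)   # the "science" computation
        # if not idle: do nothing this slice, the owner has priority
    return done, pending

# Owner is active on slices 3 and 4, so two units remain unprocessed.
done, pending = scavenge([1, 2, 3, 4, 5], [True, True, False, False, True])
# done → [1, 4, 9], pending → [4, 5]
```

Because unfinished units simply stay pending, nodes that go offline cost only time, not correctness, which is what makes the model tolerant of volunteers coming and going.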
  
