Grid Talk
What is a grid?
A mechanism (software/middleware) which allows using multiple computers
and multiple data sources for general-purpose computation.
We already can do that! (think clusters)
But not for multiple "domains"
New requirements:
- Diverse locations
- WAN involved
- secure communication required across network.
- search methods needed
- computation and (multiple) data resources likely to be at
different sites
- require (transparent) data transport
- platform and file system heterogeneity must be assumed
- Multiple policy domains:
multiple authentication/authorization mechanisms must be handled
- N vs N^2
- single sign-on
- scheduling & dispatching jobs, retrieving results, ...
- a 2002 definition by Ian Foster
- Include instruments as an additional dimension.
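The "N vs N^2" and single sign-on items above can be illustrated by counting trust relationships. This is a minimal sketch, assuming that without single sign-on every pair of sites sets up its own agreement, and that with it each site only has to trust one shared authority (for example, a certificate authority). The function names are illustrative and do not come from any grid middleware.

```python
def pairwise_agreements(n_sites: int) -> int:
    """Agreements needed when every pair of sites trusts each other directly.

    Assumption: one agreement per unordered pair, so the count grows as O(N^2).
    """
    return n_sites * (n_sites - 1) // 2


def shared_authority_agreements(n_sites: int) -> int:
    """Agreements needed when each site trusts a single shared authority.

    Assumption: one agreement per site, so the count grows as O(N).
    """
    return n_sites


# Compare the two approaches for a few grid sizes.
for n in (4, 10, 100):
    print(n, pairwise_agreements(n), shared_authority_agreements(n))
```

With 4 sites the difference is small (6 vs 4), but with 100 sites it is 4950 vs 100, which is why a single authentication mechanism matters as a grid grows.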
Our Original Grid Project
- The NC BioGrid Bioinformatics
The project was centered at NCSC/MCNC and was started
under the leadership of former MCNC VP Dr. Thom Dunning.
Given the major emphasis on genomics and bioinformatics
at all of the Triangle universities, and the interest of the
NC Biotechnology Center (NCBC) and local biotechnology companies,
the project focused on applying grid technology to this area.
The lead architect for this project was Phil Emer, and Chuck Kesler
led the systems work. The Working Group
involved people from NC universities, corporations and non-profit
organizations. It was emerging from the testbed phase,
described in a GridToday article, and moving into the production phase.
As a result of organizational changes, the work has moved to the
NC Bioportal Project.
- Initial (test bed) NCBG configuration (purchased by NCSC/MCNC)
- Duke - Linux cluster (IBM)
- NCSC - IBM p690 Linux cluster, IBM e1300 cluster, and Sunfire 3800
- NCSU - Sunfire V880 with 8 CPUs
- UNC-CH - Linux cluster (IBM)
- Initial middleware - Globus or Avaki or both
- Test bed operated with some genomics/bioinformatics applications
Commonality with mainstream IT development efforts
Grid computing is composed of many IT threads.
Few, if any, are new. What's new is putting them together.
- compute power, data, instruments
- (possibly) different platforms
- (possibly) different sites / administrative domains
- full site control of policies (authorization)
- secure transmission
- single authentication - O(N) rather than O(N^2)
- "discovery" methods for compute power and data
- coordination of job submission and output
- "delivery of services" rather than "operation of computers"
Theresa-Marie's Visualization in Bioinformatics
Genomics and Bioinformatics Applications
Genomics and Bioinformatics - related reading:
- I3C, the Interoperable Informatics
Infrastructure Consortium (which announced the "Life Science Identifier
(LSID), which defines a common, logical naming convention for biological
data.")
Services on the Grid:
Other grid projects:
Middleware developers:
- Globus (not for profit) Open Source
(also see links under Vendors below)
IBM has a very good "Redbook",
Introduction to Grid Computing with Globus, which discusses grid computing
generally, in the context of Globus grid middleware.
- Avaki (corporation) which grew
out of Legion
- Sun (corporation)
(homogeneous - Solaris/Linux on SPARC), particularly their
Grid Engine software
Note: Sun uses a slightly different terminology than I use above.
They use the term grid to include a "cluster grid", in which the grid
consists of one cluster. When referring to a grid of the more general
type as described above, they use the term "global grid".
- UNICORE Forum "is an open,
non-profit association which promotes the development and distribution
of the UNICORE GRID system."
- A screensaver "grid" (open source)
Vendors:
Grid-like?
- TurboWorx scalable
computing, including "compute farms"
- Xgrid (Apple) uses Rendezvous
to automatically "network" computers and other devices into a
cluster-like arrangement
Additional reading:
Copyright 2002, 2003, 2004, 2005, 2006 by Henry E. Schaffer
Comments and suggestions are welcome, and should go to hes@ncsu.edu
Last modified 2/5/2006
Disclaimer - Information is provided for your use. No endorsement
is implied.