This column follows on my previous column named Meta Clusters. In that column I described the cluster and meta cluster concepts and some of the major components and challenges of a meta cluster environment. I also referenced some current sites where meta clusters are being used.
This time I'll look at an operational meta cluster, at some that are in the early stages of development, and at what may come in the future as global meta clusters become a standard in research and business.
Meta clusters were first driven by the need for complex analysis of extremely large data collections in scientific studies, such as particle physics. In fact, the World Wide Web grew out of an effort by a group of physicists to access data on remote systems. Like the Web, meta clusters will expand from a science specialty into many aspects of our lives.
Meta clusters are currently being expanded to run massive scientific simulations that study the weather, transonic flight, the evolution of the universe and other challenging scientific problems. Maybe this is where the Enterprise's warp drive will come from.
The next step in meta cluster use will move into business processing. As business grows its use of this resource, prices will decline and other uses will be found, evolving into the oft predicted 'Information Utility'. In a close analogy to the development of electric power utilities and electric motors, the complex of competing meta clusters will deliver new services into each home and business.
Between now and then, we'll have to jump a few hurdles in security, multi-level systems management, reliable software and the 'last mile' delivery of high speed bandwidth. By this I mean 100 megabits per second to the home and 1 gigabit per second to a small business, not DSL, broadband or satellite digital feeds.
To gain perspective on where we are, let me look back to the 1960s, when being a computer programmer was unusual and automatically made people think you were a genius. NASA's computing environment then was a classic mainframe aided by weird (to programmers) things like analog tapes and satellite data transmission. I had brief contact with the OGO-F (Orbiting Geophysical Observatory) satellite data in the late 60s.
Back then, OGO-F data was transmitted on an analog channel to a high speed analog recorder at Greenbelt, Maryland. This analog signal was then converted to digital data and decommutated (demultiplexed) on a Univac 1108, a mainframe class machine. The net result was dozens of reels of magnetic tape, at 556 or 800 bpi, from each analog tape. To process the data, you wrote the processing program (FORTRAN and Assembler on cards!), selected the appropriate tapes from a printed catalog, listed them on a job form, and submitted the run to the computer center.
Then you waited, hoping that no glitches would happen. Eventually, a paper printout was placed in a mailbox and you collected your results. Storage of the tape reels took an immense amount of space and people support, and tapes got lost or became unreadable despite the staff's best efforts. In retrospect, it's amazing we got as much done as we did. It cost a lot of late nights and lost hair follicles.
Now take a look at NASA's current IT environment, the IPG. The Information Power Grid (IPG) is a national collection of supercomputers and clusters, connected by gigabit channels, involving both government and academic research centers. Clicking on 'About IPG' and then on 'Concepts' brings up a good overview of what IPG is designed to accomplish.
"IPG is an example of a Grid computing environment [1], and the vision for IPG is to revolutionize the use of computing in NASA's science and engineering activities."
By October 2000, the prototype IPG was operational, including 600 processors and more than 30 terabytes of data. There is a very good conceptual diagram of the IPG in a PDF file. Click on 'Presentations' and select "Grids as Production Computing Environments: The Engineering Aspects of NASA's Information Power Grid." You will find the diagram on page 6.
IPG uses the Globus toolkit and works with other groups to extend the prototype. They coordinate their efforts with others through the Grid Forum.
The major participants in the IPG are:
The six references at the bottom of the 'Concepts' page are a first-class guide to the essentials of an MC system. The 'Vision' button opens a 4 MB PDF file that shows an example of IPG's potential: how its distributed resources could be used to simulate a national aircraft safety system.
What is the future of NASA's IPG? The challenges are substantial just in terms of data volumes. From the Vision document:
Two TB per day and increasing. This makes the OGO-F satellite look like a drop in the IPG bucket. Just handling and storing that much data is difficult. The computational requirements for analysis exceed even our biggest single clusters available now.
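To put that figure in perspective, here is a rough back-of-the-envelope sketch of what two terabytes per day means as a sustained data rate and as a yearly volume. It assumes decimal units (1 TB = 10**12 bytes) and a perfectly even flow around the clock; the function names are my own, made up for illustration, and are not part of any IPG software.

```python
# Rough sizing sketch. Assumptions: decimal units and a perfectly steady flow.
SECONDS_PER_DAY = 24 * 60 * 60
TB = 10**12  # decimal terabyte (assumption; binary units would differ by about 10%)


def sustained_rate_mbit_per_s(bytes_per_day):
    """Average link rate, in megabits per second, needed to move a daily volume."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def yearly_volume_tb(bytes_per_day, days=365):
    """Volume accumulated over a year, in terabytes, if the daily rate never grows."""
    return bytes_per_day * days / TB


if __name__ == "__main__":
    daily = 2 * TB
    print(f"Sustained rate: {sustained_rate_mbit_per_s(daily):.1f} Mbit/s")
    print(f"Yearly volume:  {yearly_volume_tb(daily):.0f} TB")
```

Under those assumptions the flow works out to roughly 185 megabits per second, every second of the day, and about 730 TB per year before any growth is counted.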
Ultimately, petaflop performance (1 PF = 1 million gigaflops) will be required to keep up with analysis of the data flow, and much more for simulation and combined analysis such as the Vision document proposes. It's clear that this is good news for companies supplying supercomputers and cluster hardware, and that it opens up plenty of software opportunities for infrastructure and middleware developers.
The future home of physics data may be developing in the Particle Physics Data Grid (PPDG). PPDG is a project of the Next Generation Internet (NGI) and is a collaboration between nine organizations.
PPDG is very ambitious, even for physicists. Its first objective is to develop the infrastructure for multi-petabyte (1 PB = 10**15 bytes, or a million gigabytes) data sets to be analyzed, with support for hundreds to thousands of experimenters. The second objective is to build on that infrastructure to support large scale collaborative science.
While this may seem extreme, it is important to remember that today's Web is a direct result of some physicists who wanted easy access to the experimental data at other sites. From that, HTTP and browsers were born. Like a lot of big projects, this is expected to take time and develop in stages. For a detailed project plan, see PPDG Proposal.
Currently, the first stage involves the Globus toolkit, now available, and integration with software at SLAC (Stanford Linear Accelerator Center) and other participants.
The planned next phase involves the PPDG development. Experiments and measurements are under way at SLAC to explore the requirements for very large data transfers.
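To see why very large transfers deserve their own experiments, here is a small sketch estimating how long it takes to move one petabyte over links of different speeds. It is idealized: it assumes decimal units, a dedicated link, and an `efficiency` factor that I have invented to stand in for protocol overhead and contention. None of these numbers come from the SLAC measurements.

```python
# Idealized transfer-time estimate; not based on any measured PPDG figures.
SECONDS_PER_DAY = 24 * 60 * 60
PB = 10**15  # decimal petabyte, as defined earlier in this column


def transfer_days(num_bytes, link_bits_per_s, efficiency=1.0):
    """Days needed to move num_bytes over a link running at the given efficiency.

    efficiency is a hypothetical fudge factor between 0 and 1 for overhead.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = num_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    links = [("100 Mbit/s", 1e8), ("1 Gbit/s", 1e9), ("10 Gbit/s", 1e10)]
    for label, rate in links:
        print(f"{label:>10}: {transfer_days(PB, rate):8.1f} days per petabyte")
```

Even on a perfect gigabit link, a single petabyte takes about three months to move, which is why multi-gigabit networks figure so heavily in these plans.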
The final phase envisioned is the Grid Physics Network (GriPhyN), an international web of multi-gigabit networks with collaboration and analysis software, plus access to the thousands of petabytes of collected physics data.
This infrastructure and network would equally well handle other scientific environments such as chemistry and bioscience. Development of such a network would change the way discoveries are made in many sciences, benefiting us in ways we cannot foresee today.
In the last column I wrote briefly about the Distributed Terascale Facility (DTF) and IBM's plan to build a meta cluster for commercial access.
"IBM will invest US$4 billion to build 50 computer server farms worldwide, a computing power grid that will allow customers to buy computing power and storage capacity over the Internet on demand."
In light of operational MCs today and those proposed for tomorrow, what can we forecast from existing and planned MCs?
In ten years or so, life will start to look like one of today's science fiction shows. But, after all, it will be the future.