The recent Supercomputing 2000 show highlighted everything from very-high-end networking to super displays for data, and the new approach to distributed computing - Information Grids.
SC2000, the eleventh annual exposition of supercomputing technology, infrastructure and applications, held November 4 through 10 in the Dallas Convention Center, set new conference records for both computer and network performance. The scientific network is described here and includes a useful diagram.
Not surprisingly, a supercomputer conference produces a large amount of information, much of it quite technical. The papers in the proceedings describe the concepts and give further references. To explore the relevance these scientific studies have to our lives and to today's systems, I'll take selected papers from the proceedings and relate them to what they mean for us. Some of the effects are more than a year away, some as many as five years out.
For those who are interested in the gritty technical details, all of the conference papers are available from Papers. Scroll down to one of the papers listed below. All of the papers are available in PDF V4 format only. If you have PDF V3, some colors will translate to black, which makes the graphics much less valuable, though the text is fine. Because the primary site uses frames instead of direct links, reaching an item may require selecting from the left-hand menu and scrolling down to the appropriate title.
In part one of two (this report), I'll be looking at how supercomputers handle data, networking, I/O and very large databases. I'll also take a first look at the Grid concept, an innovative structure to connect a wide variety of systems. Part two will look at special applications of supercomputers and how they will affect our future. Grids are a major development in information technology and have important implications for our future. Don't miss the Information Grids section at the end.
SC2000 was a great example of high performance networking. Aggregate bandwidth between SC2000 and seven super POPs (major commercial network nodes) exceeded 10 gigabits/second for the production network. Internally, there were four separate but interconnected networks operating at speeds from one gigabit/second to OC-48 (2.4 Gb/s). A separate demonstration was the experimental Xnet, using prototype 10 gigabit Ethernet hardware and software.
During the conference, there were demonstrations of cross country high performance applications, with two award winners sustaining more than one gigabit per second. The SCinet (SC2000 network) document gives full details of the network infrastructure for the show, including a block diagram. Only major Internet nodes even come close to this capacity.
For a list of the network performance tests and other details, go to SC2000 Network. Webcasts are also available from Webcasts. You will need RealPlayer to view them.
While networking is an important area of development, it was just another component in the SC2000 conference. What was more important was the demonstration of sustained high speed Internet connections and the use of those connections in enabling future technologies such as distributed science applications and Grids.
High resolution displays with full color rendition (1600 x 1200 pixels x 32 bit color) represent the practical economic limit for a high performance desktop, yet this is inadequate for displaying real time models from supercomputer applications such as climate modeling. Past solutions have used other supercomputers for graphics display, separate from the modeling calculations. Another approach built special large displays with a custom interface. The expense and lack of portability of these solutions have limited their application.
One reason very large displays are so important is that there is no computer or program that can recognize patterns and deduce causes like an experienced human. Large scale simulations can generate hundreds of times more data than would fit on a single screen. Enabling humans to see the whole dynamic picture on a large display can lead to insights unobtainable by any other method.
The paper "Distributed Rendering for Scalable Displays" by Greg Humphreys, Ian Buck, Matthew Eldridge, and Pat Hanrahan, reports on the results of a technique called 'WireGL', where the OpenGL calls for display are intercepted before being rendered. WireGL then separates the commands by spatial location and sends them to separate display engines, each driving a 1024 x 768 monitor.
Experiments on an 8x4 matrix of displays, an effective 8192 x 3072 pixel display, showed that for commands with good partitioning, graphic speedups were proportional to the number of displays. Where rendering overlapped displays, performance diminished due to multiple communications and rendering operations. There is a planned extension to this experiment that will use parallel extensions to WireGL to eliminate the bottlenecks in scalable performance.
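To make the sorting step concrete, here is a minimal sketch in Python. It is not WireGL's code: the function name tiles_for_bbox and the use of an inclusive pixel bounding box are my own assumptions for illustration. Only the tile size (1024 x 768) and the 8x4 layout come from the paper's experiment.

```python
from typing import List, Tuple

TILE_W, TILE_H = 1024, 768  # resolution of each monitor in the experiment
COLS, ROWS = 8, 4           # 8x4 wall, 8192 x 3072 pixels overall


def tiles_for_bbox(x0: int, y0: int, x1: int, y1: int) -> List[Tuple[int, int]]:
    """Return the (col, row) of every tile overlapped by a pixel bounding box.

    The box is inclusive: (x0, y0) is the top-left pixel and (x1, y1) the
    bottom-right pixel. Hypothetical helper, not part of any real API.
    """
    # Clamp the box to the wall so off-screen geometry maps to no tile.
    x0, y0 = max(0, x0), max(0, y0)
    x1 = min(COLS * TILE_W - 1, x1)
    y1 = min(ROWS * TILE_H - 1, y1)
    if x0 > x1 or y0 > y1:
        return []
    # Integer division turns a pixel coordinate into a tile index.
    return [
        (col, row)
        for row in range(y0 // TILE_H, y1 // TILE_H + 1)
        for col in range(x0 // TILE_W, x1 // TILE_W + 1)
    ]


# A small object inside one monitor goes to a single display engine.
inside = tiles_for_bbox(10, 10, 100, 100)
# An object straddling a corner must be sent to four engines.
straddling = tiles_for_bbox(1000, 700, 1100, 800)
print(inside, straddling)
```

The second call shows why overlap hurts: the same commands are transmitted and rendered four times, once for each display the object touches.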
In our world, multiple displays have been the province of the financial types on Wall Street, with two or four separate screens driven by a workstation with special software and graphics adapters. It's too expensive for most people, but now that may change. The 'divide and conquer' approach of WireGL can also lend itself to a pure integrated circuit technique. With a special chip implementing WireGL on the display card feeding multiple graphics chips, a single display card could drive a matrix of flat screens as one large screen right on your desktop. Quad 18 inch flat screens anyone?
On my workstation are two 4 GB drives with over 40,000 files stored on them. If there were no pattern to where files were stored, I would be hard pressed to find anything without a major search effort. By splitting categories by drive letter, such as OS on C, D and E, executables on F, development on G and most data on H, I can usually find files with only a little searching. At least once a week, I use the OS/2 search application to list files by name pattern and/or text strings. Sometimes even that doesn't find what I'm looking for.
Now consider a Unix system without drive letters, and scientists (or users) with no control over where their files are placed. Add to this the typical super application which generates thousands of files and hundreds of gigabytes each run, and file naming and reuse can be a logistical nightmare. To rescue scientists and users like me, Scientific Data Manager (SDM) has been created.
The paper "Integrating Parallel File I/O and Database Support for High-Performance Scientific Data Management", by Jaechun No, Rajeev Thakur, and Alok Choudhary, reports on SDM system goals, design and performance. The goals of the SDM project were: High performance I/O; A high level API; Convenient data retrieval.
The basic concept of SDM is to separate the details of the file name, size and location from the metadata about the content of the files. This enables SDM to hide the details, yet makes the retrieval of the data simple by use of the metadata which relates to the content.
It is like my being able to ask "Find the text document I wrote for Byte.com last year about solving Y2K and backup problems, first draft." This is all content related. I don't really care where the file is, I just want to get it back. Suppose I had written a thousand columns (!), with several drafts each, on hundreds of topics, not all mentioned in the title. That's roughly equivalent to the lookup problem a scientist faces with the output of a set of runs of one application.
SDM operates by storing metadata (data about content from the command stream) in a database, linking that to the specific file names and locations. A single run of a science application is for a specific set of parameters. At certain points during the run, a snapshot is written of each dataset, and separate sets of visualization data and restart data may also be written. A single run may snapshot every four steps for several thousand steps, and a single experiment may require dozens to hundreds of runs.
With a total dataset size of terabytes over tens of thousands of file sets, finding a set of snapshots in this huge dataset makes the traditional needle in a haystack look easy. SDM hides the irrelevant details of file name and location, providing easy access, for example: "Parameter set 6, snapshots 16 thru 32 of visualization set." The SDM database retrieves the records that match the request and makes the files available.
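The idea can be sketched with any database. The example below uses Python's built-in sqlite3 module. It is my own illustration, not SDM's design or API: the table layout, the column names and the file paths are all invented for the example.

```python
import sqlite3


def build_catalog(rows):
    """Create an in-memory metadata catalog (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snapshot ("
        "param_set INTEGER, kind TEXT, snap INTEGER, path TEXT)"
    )
    conn.executemany("INSERT INTO snapshot VALUES (?, ?, ?, ?)", rows)
    return conn


def find_files(conn, param_set, kind, first, last):
    """Return file paths for a range of snapshots, described by content only."""
    cur = conn.execute(
        "SELECT path FROM snapshot "
        "WHERE param_set = ? AND kind = ? AND snap BETWEEN ? AND ? "
        "ORDER BY snap",
        (param_set, kind, first, last),
    )
    return [row[0] for row in cur]


# Made-up runs: 8 parameter sets, 2 kinds of output, 64 snapshots each.
rows = [
    (p, kind, s, f"/scratch/run{p:03d}/{kind}_{s:05d}.dat")
    for p in range(1, 9)
    for kind in ("viz", "restart")
    for s in range(1, 65)
]
catalog = build_catalog(rows)

# "Parameter set 6, snapshots 16 thru 32 of visualization set."
files = find_files(catalog, 6, "viz", 16, 32)
print(len(files), files[0])
```

The caller never builds a file name. The request is phrased in terms of what the data is, and the catalog supplies where it lives.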
SDM is conceptually simple and enables multiple ways to retrieve a set of files based on meaningful content. I wonder when it will be available for my desktop.
Email is still the Internet's killer application, and will probably remain so for quite a while. Replacing email with, say, instant faxing would be almost as effective if we eliminated paper and fax machines and just sent computer to computer. Except for one little detail: each transfer requires a separate call setup, connect and disconnect. The phone system would collapse under the multi-billion call load of a typical email day.
So, even just for mail, there is really no substitute for the Internet as a communications medium. Most Byte readers know that TCP/IP is the underlying protocol for the Internet, and everyone implicitly counts on its reliability. So it was with some concern that I read "The Failure of TCP in High-Performance Computational Grids", by W. Feng and P. Tinnakornsrisuphap.
Everyone has experienced local delays and outages on the Internet. Some of these have been blamed on hardware failure, software failure or a farmer with a backhoe, and some on overload and congestion. I always assumed that TCP (Transmission Control Protocol) did not have inherent failure modes. It turns out that TCP itself has failure modes due to the nature of the windowing algorithm. This problem first appeared in high performance networks.
The specific failures of TCP are due to the nature of its congestion control in the currently popular version called TCP Reno. This control method keeps doubling the transmission size (TS) until packets are lost, then cuts TS in half and increases it by one for each successful send. On a local Ethernet, this does not cause a problem. But when the line is very fast and the response time is long, a great deal of data is outstanding at any moment: a one gigabit per second line with a 0.1 second response time has 12.5 MB in flight (1 Gb/s x 0.1 s = 100 megabits) before TCP finds out some packets didn't make it.
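The arithmetic is easy to check, and it also shows why recovery is so slow on such a line. The sketch below is my own back-of-the-envelope model, not code from the paper. It assumes a 1500-byte segment and reads "increasing by one" as one segment per round trip, which is the usual description of Reno's behavior after a loss. Both function names are made up for the example.

```python
SEGMENT_BYTES = 1500  # assumption: Ethernet-sized segments, not from the paper


def bytes_in_flight(bits_per_second: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes sent before any feedback can return."""
    # Divide by 1000 for milliseconds and by 8 for bits to bytes.
    return bits_per_second * rtt_ms // 8000


def recovery_seconds(window_segments: int, rtt_ms: int) -> float:
    """Idealized time to regrow a halved window at one segment per round trip."""
    round_trips = window_segments - window_segments // 2
    return round_trips * rtt_ms / 1000


in_flight = bytes_in_flight(1_000_000_000, 100)  # 1 Gb/s line, 0.1 s response
window = in_flight // SEGMENT_BYTES              # full window, in segments
seconds = recovery_seconds(window, 100)
print(in_flight, window, round(seconds))
```

Under these assumptions the full window is about 8,300 segments, and a single loss takes roughly seven minutes of loss-free transmission to recover from. On a local Ethernet with a response time near a millisecond, the same loss is repaired almost instantly.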
The recovery process is part of the problem. The packets to be retransmitted are an additional load and increase the total size of the next set of transmissions, which tends to increase the probability that packets will again be lost. In short, the congestion control induces bursts rather than smoothing the flow of data. It is these bursts of data which flow somewhat like waves that make high speed TCP traffic perform poorly.
The faster the line and the longer the delay, the worse the problem becomes. New congestion protocols are being tested and one called TCP Vegas shows improvements over the Reno version. But TCP Vegas is a ways from being fully tested, and even longer from being implemented in the Internet. So now in addition to being worried about the farmer with a backhoe (I live in farm country), I wonder if the new high speed lines will get here before the improved protocol does.
Grids are a new concept in our industry. Just as your desktop system is made up of components like a CPU, disk, memory, motherboard and display, so a supercomputer system is made up of groups of components, including multiple compute, networking and storage systems. A supercomputer site, such as SDSC (San Diego Supercomputer Center), is a group of supercomputer systems, together with supporting elements like power, housing, administration and cooling.
A Grid takes this concept of a system made up of components to a new level. A Grid is a distributed group of connected sites, usually with a common entry point called a Portal. At the level of the Grid, new technical and management challenges demand the development of new support tools.
The paper "Computing and Data Grids for Science and Engineering," by William E. Johnston, Dennis Gannon, Bill Nitzberg, Leigh Ann Tanner, Bill Thigpen, and Alex Woo defines the Grid concept. They describe the goals and requirements for grids in general and NASA's Information Power Grid (IPG) prototype project.
Grids enable new classes of projects to be undertaken. Projects such as SETI@home, which use resources distributed over multiple sites, currently face a difficult and time-consuming effort to allocate resources and coordinate processing. The grid is a way to enable routine use of these disparate and widely distributed resources, and to apply them to complex and very large problems.
The grid environment will enable scientists and engineers to easily collect resources and attack such problems as:
Several Grid projects are now underway:
Grids are the leading edge of harnessing information technology to solve problems. Today they are being applied to massive scientific and engineering challenges, but tomorrow they will change how we do computing. Current grids are a prototype of the future "Information Utility," a ubiquitous source of information and processing, as far ahead of the Internet as the Internet is ahead of ENIAC and mechanical punch card equipment.
The information utility will operate like the electricity distribution system, also called an 'electric grid'. Who does the processing and storage, and where, will matter to you about as much as who generates your electric power and where, and the service will probably be billed in the same manner. Scattered throughout your house will be invisible computers, much like the invisible electric motors that drive your refrigerator, dishwasher, disposal, washer, dryer, air conditioner, disk drives, printers, VCRs, CD and DVD drives, heating system and many car accessories.
The invisible embedded computers will see to our health, entertainment, comfort, education, communications, work, and play. They will usher in the next great change in living environments, of which the personal computer and the Internet are but a pale imitation. Personal computers won't disappear; they will simply be the most visible outlets of this invisible essential, the Information Utility.
Next time, in part two, I'll explore the super applications of supercomputers at SC2000.
Update: The SC2001 site can be found here.