In the scientific community, portals bring a common interface to a diversity of hardware and software resources. They simplify the logistics of access and scheduling. The objective of the portal is to provide easy Web access to distributed resources, including software, data storage and computational facilities.
Portals don't simply tie together a random assembly of computers; they address groups of them, called Grids. Grids are coordinated sets of computer resources which may be distributed over geographic areas and across different institutions. To be part of a Grid means the systems have a basic set of common functions that support remote access and control.
Portals and Grids are a recent development in large-scale scientific computing. Rapidly increasing computer resources, coupled with exploding demand from astronomers to zoologists for new computer models, have forced scientists to master computer arcana to get their work done. If they wished to transfer to a different system, the work had to be done all over again.
Handling thousands of large jobs at big computer centers such as those of NASA or SDSC led the managers to create control software and a network to connect scientists to a remote system. The software was later enhanced to mask some of the differences between systems. Then web interfaces were built, and the prototype portal and Grid appeared. Computational evolution in action.
There is some confusion about the differences between the Internet and a Grid. A Grid is not simply a more powerful Internet, or Internet 2. The Grid is on the Internet because it uses standard protocols, but it has some unique characteristics. Those characteristics are dynamic sharing (DS), single authentication (SA) and virtual collaborations (VC).
Dynamic sharing is not P2P or client-server. DS is an ability to add or change sharing during operation of an application, either in response to changing requirements, or from a participant's command. For example, a task force evaluating earthquake protection may bring a materials specialist into the process for specific data or computation, and then detach his connection when done while the task force continues to work.
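The earthquake task force example can be sketched in a few lines of Python. This is only an illustration of membership changing while a session keeps running; the class `SharingSession`, its methods and the participant names are hypothetical and do not come from any Grid toolkit.

```python
class SharingSession:
    """A running collaboration whose membership can change mid-session."""

    def __init__(self, members):
        self.members = set(members)

    def attach(self, member):
        """Bring a new participant into the session while it is running."""
        self.members.add(member)

    def detach(self, member):
        """Remove a participant; the remaining members keep working."""
        self.members.discard(member)

    def share(self, sender, item):
        """Return the members who would receive an item from the sender."""
        if sender not in self.members:
            raise PermissionError(f"{sender} is not part of this session")
        return sorted(self.members - {sender})


# The task force starts with two members, then brings in a specialist.
session = SharingSession(["seismologist", "engineer"])
session.attach("materials_specialist")
recipients_during = session.share("materials_specialist", "steel fatigue data")

# The specialist leaves; the task force continues without interruption.
session.detach("materials_specialist")
recipients_after = session.share("engineer", "draft report")
```

The point of the sketch is that `attach` and `detach` act on a live session, rather than requiring the collaboration to be torn down and rebuilt.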
Single authentication means that you identify yourself once, and your authorization is automatically forwarded wherever it is needed. This capability already exists within a single network of trusted systems and users, but on a Grid SA extends to global authorization from a single login. Thus, across geographic and organizational boundaries, your authentication enables automatic access to all functions that you are entitled to use.
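A toy version of "identify once, be authorized everywhere" might look like the following Python sketch. It assumes a single signing key shared by all trusted sites, which is a simplification chosen for brevity and not how a production Grid is secured. The names `SHARED_KEY`, `ENTITLEMENTS`, `login` and `authorize`, as well as the user and resource names, are invented for the example.

```python
import hashlib
import hmac

# Assumption: one secret shared by the trusted sites (demo only).
SHARED_KEY = b"demo-key-not-for-production"

# Hypothetical entitlement table: which user may use which resource.
ENTITLEMENTS = {"alice": {"site-a-storage", "site-b-compute"}}


def _sign(user: str) -> str:
    """Compute the signature that proves the user logged in."""
    return hmac.new(SHARED_KEY, user.encode(), hashlib.sha256).hexdigest()


def login(user: str) -> str:
    """Authenticate once and return a signed credential."""
    return f"{user}:{_sign(user)}"


def authorize(credential: str, resource: str) -> bool:
    """Any site checks the credential without asking for another login."""
    user, _, signature = credential.partition(":")
    if not hmac.compare_digest(signature, _sign(user)):
        return False  # forged or corrupted credential
    return resource in ENTITLEMENTS.get(user, set())


# One login, then access checks at two different sites.
credential = login("alice")
storage_ok = authorize(credential, "site-a-storage")
compute_ok = authorize(credential, "site-b-compute")
```

The user presents the same credential to every site, and each site decides locally what that user is entitled to do.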
Virtual collaborations are a major result of a Grid. A VC can exist for an hour, a week, or several years. It may involve one site and small amounts of data, or be like the high-energy physicists' VC, involving worldwide operation on Petabytes (10**15 bytes) of data from colliding particles.
Most often it will involve groups of specialists pooling their knowledge, data and software to solve a specific problem. During its existence, the collaborators will share limited amounts of their knowledge under specific security and sharing rules, accessing agreed data and software, with the results interpreted by the domain experts. When it is finished, the results are delivered and the VC disappears until the next time it is needed, perhaps then with other participants.
The features of dynamic sharing, single authentication and virtual collaborations cannot be built on the current Internet. Each of these capabilities depends on functions that do not exist in the Internet. The required functions are:
These tasks are particularly difficult because they must operate seamlessly over multiple networks, heterogeneous computer systems, and distributed resources of people and systems, and must do so in spite of interruptions in one or more systems.
The base that a Grid is built on starts with a low level interface for communicating on a common protocol, such as TCP/IP, and being able to query for all of the information required for those functions. This query language becomes a higher level protocol for discovering resources and enabling secure access and other functions. Because it is a query-response language, each system can implement it in any convenient manner that meets the language standards. One such existing language is LDAP, the Lightweight Directory Access Protocol, which could be used to identify resources specified with standard grid nomenclature.
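To make the query-response idea concrete, here is a minimal Python sketch of attribute-based resource discovery. It is not LDAP and does not use an LDAP library; an in-memory list stands in for the directory, and the entries, attribute names and the `search` function are all hypothetical.

```python
# Hypothetical directory entries; attribute names are illustrative only.
DIRECTORY = [
    {"name": "host-a1", "kind": "compute", "site": "site-a", "cpus": 128},
    {"name": "host-a2", "kind": "storage", "site": "site-a", "terabytes": 40},
    {"name": "host-b1", "kind": "compute", "site": "site-b", "cpus": 512},
]


def search(directory, **required):
    """Return the entries whose attributes equal every required value."""
    return [
        entry
        for entry in directory
        if all(entry.get(key) == value for key, value in required.items())
    ]


# Query: which compute resources exist, and which of them are at site-a?
all_compute = [e["name"] for e in search(DIRECTORY, kind="compute")]
compute_at_a = [e["name"] for e in search(DIRECTORY, kind="compute", site="site-a")]
```

Because the caller only sees queries and responses, each system is free to store its resource information however it likes, as long as it answers in the agreed form.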
Built on top of the query language would be tools to manage the specific functions listed above. On top of those tools would reside specific administration and control policies which would be set by each organization. Finally, above the policies are the collaboration building tools which by human direction manipulate the lower level tools and functions to accomplish what the group needs done.
All this looks more complex than necessary, but that appearance is deceptive. The layering approach is a well established technique to create clean interfaces that can be used in various ways, some of which may not be anticipated by the original designers. More importantly, once a set of layers and functions is established, new systems usually have to build only the lowest level layer, and are able to reuse existing higher layers. This has the important feature of enabling rapid access to new systems.
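The reuse argument can be illustrated with a small Python sketch. It collapses the stack to three layers and leaves out the policy layer for brevity. All class names, the `free_cpus` request and the site names are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """Lowest layer: the only piece a new system has to implement."""

    @abstractmethod
    def query(self, request: str) -> str:
        """Answer a request in the common query-response form."""


class ResourceTools:
    """Middle layer: tools written only against the Transport interface."""

    def __init__(self, transport: Transport):
        self.transport = transport

    def free_cpus(self) -> int:
        return int(self.transport.query("free_cpus"))


class Collaboration:
    """Top layer: uses the tools and knows nothing about the systems."""

    def __init__(self, tools_by_site):
        self.tools_by_site = tools_by_site

    def best_site(self) -> str:
        """Pick the site reporting the most free CPUs."""
        return max(self.tools_by_site, key=lambda s: self.tools_by_site[s].free_cpus())


class ExistingCluster(Transport):
    def query(self, request: str) -> str:
        return {"free_cpus": "12"}[request]


class NewMachine(Transport):
    """Adding a system means writing this class and nothing else."""

    def query(self, request: str) -> str:
        return {"free_cpus": "64"}[request]


collab = Collaboration({
    "existing-cluster": ResourceTools(ExistingCluster()),
    "new-machine": ResourceTools(NewMachine()),
})
chosen = collab.best_site()
```

`ResourceTools` and `Collaboration` are untouched when `NewMachine` is added, which is the rapid access to new systems that layering is meant to provide.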
Much additional information is available from the NPACI (National Partnership for Advanced Computational Infrastructure) search page. Check the Online and EnVision boxes and enter the string 'Grid Computing' in search. I show 53 hits on that phrase, which is a great start to discovery.
Portals have been in existence for some years as a window on a particular server. Yahoo and many others created customizable screens so that every individual could have their important information conveniently displayed. These portals are useful for both customer and supplier, because the customization saves time for the user and provides marketing information for the supplier.
Grid portals differ in several ways from current customizable portals. The objective of a grid portal is to provide a consistent, easily used interface to a complex environment so the user can rapidly get his or her work done without being a computer expert. Where single server Internet portals are designed to tie individuals to a service, grid portals are designed to enable easy access to a diverse set of resources. Naming both types as 'Portals' could create confusion. Further reference to Portals in this column will mean grid portals.
The development of grid portals has become such a hot subject in the scientific community that there is already a toolkit to assist in the development of a Grid Portal. Not surprisingly, it is called GridPort. It is a collection of tools for the development of user portals on computational grids. The toolkit and GridPort application information is available at: GridPort. More information about GridPort and other portal supporting tools is available in a news release at: News.
"GridPort brings the ideal of the computational grid a huge leap closer to reality," said Sid Karin, director of SDSC and the National Partnership for Advanced Computational Infrastructure (NPACI). "With GridPort, application portal developers can worry a lot less about the underlying grid technologies and more quickly put researchers on the path to tomorrow's scientific discoveries."
This portal building toolkit uses advanced security and computing tools to provide secure services. User portals built with GridPort tools enable access to any components of the computational grid through a web interface from any web browser. While this toolkit is aimed at scientific computation facilities and is designed to connect to a grid environment, in principle this toolkit approach could enable building a common user interface across many different systems in corporations.
Current portals, such as the PACI GridPort, are built with several other components. The portal uses the GridPort toolkit and the CA Client from SDSC, Globus and the Grid Index Information Service from the Globus Project, and the community-supported Grid Security Infrastructure. A news release with more detail is at: PACI News.
Several applications have already been built with the GridPort toolkit:
Just how big a Grid could get was demonstrated in 1999 with 300 computers at 15 sites across Europe and the United States. The press release has an interesting quote:
"A prototype for future 'computational grids,' which will provide supercomputing power on demand, just as a power grid provides electricity, was demonstrated in San Jose ..."
NASA, which has a number of distributed supercomputer systems in its "Information Power Grid" (IPG), is also using the NPACI-based code for its IPG HotPage.
In summary, Grid Portals are sprouting like a field of flowers in spring, all over the world. To get more information on this and related subjects, you can use the search engine at: NPACI search. Check the top four boxes for current information and enter search strings of 'Grid Portal' and 'Information Power Grid' as well as other keywords that turn up in the results. There is a lot more information about these subjects easily available there.