I've been writing about meta class systems for some time now, see:
Meta is often used as a prefix to other words, such as metadata, which means higher level data, or data about data. Analogously, metatools are tools which control tools. This hierarchy may be extended beyond two levels. Meta Tools (MT) are tools that enable use and control of resources. This implies that an MT is logically a higher level function than the resources themselves.
In this column, I use Meta Tools to define a class of tools that enable the use of complex distributed resources. These individual resources may themselves be complex entities that use a different set of tools to control their own components.
I'll start with an open implementation of the Lightweight Directory Access Protocol (OpenLDAP). This provides an interface to a database of information about people and resources, defining permissions and access from one to the other. Its use is not limited solely to that application.
Next will be the Open Archive Initiative (OAI), which defines an interface for harvesting metadata from an archive, usually containing published scientific papers. It too is not limited to that one specific application.
Finally, I'll explore some of the capabilities of the Globus Toolkit, which is a set of tools that enables the operation of a collection of distributed computer systems as a single resource. Each of these systems may itself consist of multiple computers and storage devices on a local network.
LDAP is a lightweight open protocol for accessing information resources. It is an alternative to the X.500 Directory Access Protocol (DAP) for use on the Internet. It runs over TCP/IP rather than the more complex OSI protocol stack.
OpenLDAP treats a directory as a database rather than a rigidly defined set of names and pointers to files. The information in this database is usually related to people and their permissions to use resources defined there. However, this is not the limit of its application.
Staying within the broad definition of resource directory, resources for computers as well as people can be defined. This could also be used as a powerful library card file replacement, where books and other types of information sources could be defined, and access enabled through the powerful search functions.
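To make the "directory as database" idea concrete, here is a minimal sketch in Python. It does not use OpenLDAP or any LDAP library; the `Entry` class, the `search` function, and all of the sample names under `dc=example,dc=org` are hypothetical and exist only to illustrate how people, computers, and library items can sit in one tree and be found by attribute.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One directory entry: a distinguished name plus multi-valued attributes."""
    dn: str
    attrs: dict = field(default_factory=dict)


def search(entries, base_dn, **equals):
    """Return entries at or below base_dn that hold every requested attribute value.

    This is a toy equality match, not the LDAP filter language.
    """
    base = base_dn.lower()
    hits = []
    for entry in entries:
        dn = entry.dn.lower()
        in_scope = dn == base or dn.endswith("," + base)
        matches = all(
            value.lower() in [v.lower() for v in entry.attrs.get(name, [])]
            for name, value in equals.items()
        )
        if in_scope and matches:
            hits.append(entry)
    return hits


# Hypothetical sample data: a person, a computer, and a library item.
DIRECTORY = [
    Entry("cn=Ada Example,ou=people,dc=example,dc=org",
          {"objectClass": ["person"], "cn": ["Ada Example"]}),
    Entry("cn=render01,ou=hosts,dc=example,dc=org",
          {"objectClass": ["device"], "cn": ["render01"]}),
    Entry("cn=Grid Computing Primer,ou=library,dc=example,dc=org",
          {"objectClass": ["document"], "cn": ["Grid Computing Primer"]}),
]

books = search(DIRECTORY, "dc=example,dc=org", objectClass="document")
print([entry.dn for entry in books])
```

A real directory server adds a schema, access control, and a much richer filter syntax, but the shape of the query is the same: a starting point in the tree plus conditions on attributes.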
OpenLDAP is designed to be used in a global environment. It has replication capabilities to distribute information that was entered locally, and can access information from other LDAP servers. A combination of OpenLDAP using UDDI could operate as a directory of computer services that can be discovered and used by computer agents, enabling completely automated operation.
OpenLDAP is available from the project's web site. The current release of OpenLDAP is 2.0.22. Downloads of the LDAP server, replicator and libraries, including Java class libraries contributed by Novell, are available at that site.
Version 3 of LDAP is a "Proposed Standard" and is documented by the following RFCs:
Copies of these RFCs are available. Version 3 of the standard is currently being revised for publication as a "Draft Standard," and will replace the current V2.
In essence, OpenLDAP is a powerful enabler of services, starting with its own service of search and replication. It could replace the function of DNS (the Domain Name System), though the overhead of a general purpose directory used for DNS would be larger than that of a special purpose DNS server. The big difference is that OpenLDAP is not limited to one function, and can be easily extended without forcing a complete software upgrade.
Its uses are bounded only by our imagination.
The Open Archive Initiative (OAI) is an organization that is developing tools that enable metadata harvesting with a standard interface. This means that search and directory tools can collect metadata from any OAI compliant system for access by people or computers.
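As a rough illustration of what a "standard interface" means in practice, the sketch below builds the kind of HTTP request a harvester sends to an archive. The verb names (`Identify`, `ListRecords`) and the `metadataPrefix` and `from` arguments come from the OAI protocol; the base URL and the `harvest_url` helper are hypothetical, and the code only constructs the URL without contacting any server.

```python
from urllib.parse import urlencode

# Hypothetical archive address, used for illustration only.
BASE_URL = "http://archive.example.org/oai"


def harvest_url(base_url, verb, **arguments):
    """Build an OAI harvesting request URL for the given verb and arguments.

    A trailing underscore is stripped from argument names so that the
    protocol's "from" argument can be passed as from_ (from is a Python keyword).
    """
    params = {"verb": verb}
    for name, value in arguments.items():
        if value is not None:
            params[name.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


# Ask the archive to describe itself.
print(harvest_url(BASE_URL, "Identify"))

# Ask for Dublin Core records added or changed since a given date.
print(harvest_url(BASE_URL, "ListRecords",
                  metadataPrefix="oai_dc", from_="2002-01-01"))
```

Because every compliant archive answers the same small set of requests, a harvester written once can collect metadata from any of them by changing only the base URL.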
The initial release of OAI 1.0 was in early 2001. It was specifically planned as a prototype to develop experience for a year before beginning work on an enhanced version. Work on V2 has just been organized, but OAI has already achieved significant use and is a part of a number of tools that are freely available.
One of the original motivations for OAI was to solve a problem with scientific papers posted (archived) on the web. The problems existed both for the users of the archive and the archive supporters. Each group would have to develop its own tools and metadata, and each user would have to get or build tools to access that metadata.
Once there were more than a few archives, this would be so expensive as to prevent broad use of this new source of information. People involved in the development of Digital Libraries (see [column]) discovered this early on and started the development of OAI. Here are some links to selected OAI web sites from members of the OAI community:
OAI is now used by many organizations, such as arXiv, which uses software developed internally. Eprints software is currently used by many other organizations; a list of links is on their site. Version two of eprints is now in alpha and is expected to be released about the time this column hits the web (late February).
OAI is designed for large collections of peer reviewed information, typically scientific papers. OAI can be used by individuals, but that was not its original purpose and it is more complex than needed for that purpose.
So that individuals do not have to develop or adapt a full OAI server, a part of the Digital Library Group at Old Dominion University has developed an open software implementation named Kepler. Kepler is an OAI-compliant package for individuals that has been tested on Windows, Solaris and Linux. It is written in Java and includes the tested Java runtime.
One of the major issues that Kepler solves is the problem of unreliable access to the small archives they call archivelets. An organization that handles OAI professionally can support high levels of uptime. The individuals and small organizations who are likely to use Kepler don't have the resources for sustained uptime. Kepler addresses this through an architecture designed to be robust in the face of unreliable access.
The installation instructions for Kepler are less than a page long. An article in D-Lib Magazine for April 2001 has a good overview of Kepler. This package should fit the 'small community' model for shared access to published information while making part or all of it available more broadly.
The Globus Project has developed a toolkit needed to build computational grids (meta computing). Grids are virtual environments, collections of computers with a common interface. This toolkit enables computer and information resources to be used from any location despite their being owned by organizations in geographically distributed locations.
The Globus Toolkit (GT) has just reached the first public beta of version two. There is a detailed list of enhancements. Current support includes toolkits for Linux 2.x, IRIX 6.5, and Solaris 2.8.
Along with the toolkit, the Globus Project has developed a specification for integrating grids and web services named Open Grid Services Architecture (OGSA). This new architecture will be the design goal for version three of the GT.
A prototype of the OGSA toolkit was demonstrated on Jan 29, 2002 at Argonne National Laboratory during the Globus Toolkit tutorials. The demonstration integrated a weather service that was implemented and exposed on XMethods.com, which has an extensive list of remotely accessible services, along with their type, function and implementation.
If you're interested in the facilities that have been built and the research that has been done with Globus and its grid capability, the Globus Project has a list of technical papers that cover five areas:
Currently, over 75 papers are listed in PS and/or PDF formats, and range from The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration to Enabling Technologies for Web-Based Ubiquitous Supercomputing.
The new features in the Globus Toolkit 2.0 are centered around three core areas that provide the main functions. They are the Data Grid, the Metacomputing Directory Service (MDS), and the Globus Resource Allocation Manager (GRAM). The whole system has been enhanced by substantial packaging and installation enhancements.
The Globus Data Grid project is currently developing the following core capabilities for data and communications:
The Metacomputing Directory Service (MDS) is the information services component of the Globus Toolkit. Major feature enhancements include the following:
The Globus Resource Allocation Manager (GRAM) is the resource management component of the Globus Toolkit. The major feature change in GRAM 1.5 vs. earlier versions is the addition of new GRAM protocol and API features to support more robust job submission and management capabilities. The API specification is not yet available on the web.
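To give a feel for what a job submission looks like, here is a small sketch that assembles a job description in the Globus Resource Specification Language (RSL), the text format GRAM requests are conventionally written in. RSL is not described in this column, so treat that pairing as my assumption; the `build_rsl` and `rsl_value` helpers are hypothetical, the quoting rule is simplified, and nothing here calls the GRAM API itself.

```python
def rsl_value(value):
    """Render one value, quoting it unless it is a plain token (simplified rule)."""
    text = str(value)
    if text and all(ch.isalnum() or ch in "/._-" for ch in text):
        return text
    # Quote the value and double any embedded double quotes.
    return '"' + text.replace('"', '""') + '"'


def build_rsl(**attributes):
    """Join attribute=value relations into a single conjunction request."""
    relations = []
    for name, value in attributes.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        rendered = " ".join(rsl_value(v) for v in values)
        relations.append("(%s=%s)" % (name, rendered))
    return "&" + "".join(relations)


# Hypothetical job: run /bin/echo twice with two arguments.
request = build_rsl(executable="/bin/echo",
                    arguments=["hello world", "-n"],
                    count=2)
print(request)
```

The point is that the request is a declarative description of the job; deciding where and how it runs is left to the resource manager on the receiving side.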
In addition to these areas of development, Globus has made installation and configuration substantially easier by a complete repackaging of the products. They have now enabled installation from binary or source, whole systems or selected parts, patches to track third party software changes, and easier configuration. This enhanced packaging system for the Globus Toolkit provides easier installation for Grid builders, application developers, and end users.
After more than ten years of open software development in the scientific community, open software now holds a preeminent place in the operation of the computing community. The three products I have written about simply scratch the surface of the powerful tools available. OpenLDAP and OAI both enable a wide variety of sharing and automated access.
The Globus Toolkit is a major accomplishment in integration of diverse systems into a virtual global supercomputer. While much has been accomplished, much remains to be done. The way ahead for Globus is pointed to by the Open Grid Services Architecture document.
As an observer of all this activity in the software space, I continue to be excited about the potential for our country and the world in the near future. We are approaching the capability to model the real world accurately, with benefits for everyone.
With specific reference to weather, we will be able to model storms, tornados and hurricanes well enough to understand their structure in detail, and learn ways to steer their paths away from populated areas. For cancer and other illnesses, the development of models of the disease should lead to better understanding and treatment, and possibly even cures.
The key to all of these breakthroughs is understanding. Computer models assist this process by processing data and images, testing theoretical models against reality, and searching real data and models for ways to change them. The intellectual effort needed to develop an accurate model contributes to that understanding as well.
I can't wait to see what develops next!