On dependability of corporate grids

Ubiquity, Volume 2005 Issue December | BY Kemal A. Delic

Like many other great technologies, computing grids originated in academic research, evolving from a curiosity into serious scientific applications. Decomposable applications that could be instantiated in several parallel threads were the principal grid applications. Sharing of underutilized resources and virtualization of common services are the key architectural principles of computing grids. These principles provided new computational efficiency, usage flexibility and cost reductions. Consequently, grid computing standards have recently emerged as the most important subject of discussion.

It now seems that, in the parallel world of commerce, corporate computing grids are gaining closer attention and growing in importance. Having witnessed fifty years of evolution of business computing, from mainframes via distributed computing to omnipresent mobile computing devices, we project that corporate grids might be a future replacement for mainframes, offering better cost/benefit ratios and comparable or superior dependability.

Academic grids were not much concerned with dependability, as their applications have been neither business critical (entrusting multibillion-dollar business to highly reliable computing fabrics) nor exposed to bad guys (the major threat to privacy and intellectual property). Therefore, I see the necessity of major developments in the field of grid dependability, followed by the development of important corporate grid applications.

Grid Dependability

Grid dependability in general can be expressed as the trust needed to entrust business-critical applications and private datasets to shared fabrics while ensuring integrity and flawless operation. Dependability should be thought through well in advance (during the architecture and design phases) and maintained constantly (through sound engineering).

Grids can be abstracted into three architectural layers, which also represent a common typology of grids: computing grids, resource grids and data grids. These correspond to service-centric, computation-centric and data-centric views of grid computing (Figure 1). As computational infrastructure, grids provide collaboration, communication and coordination services. They can be thought of as abstracted IT services delivered from IT infrastructure. In the three indicated layers we observe the importance of architecture, design and engineering as the ultimate factors of overall grid dependability. I have indicated security as a concern separate from dependability to stress its ultimate importance for any serious business use, although, technically speaking, it is a component of dependability.

Figure 1. Corporate Grid - Conceptual Architecture

Architecting dependable grids means aiming to make them resilient, that is, able to survive a major disaster. One should note the importance of avoiding architecture and design errors, as they may have a very large impact in later phases. Dependable grid design may mean making grids fault-tolerant (able to cope gracefully with faults). Grid engineering will imply making grids always available, with either full or reduced functionality. These dependability aims will hopefully stimulate development of a whole range of new technologies termed adaptive, autonomic, utility and self-* (recovery, healing, configuration, etc.).

A loose definition of software dependability relates the importance of the task we would like to execute to the system's reliability, security, safety and availability characteristics. It can be thought of as a composite, joint indicator of application criticality in relation to the resilience factors of the computing infrastructure.

Software dependability research has produced important advances in the practice of dependable systems. The dependability of very large-scale distributed systems in particular has also been explored in the past. This fits well with the forthcoming need for research into grid dependability. Some notable institutions in the US and the EU have already launched such research. One way to study the dependability of grids will be to reuse insights from the investigation of large-scale software systems, which are now seen as a mixture of open-source software, proprietary commercial off-the-shelf (COTS) software and glueware (customization software). Together, these make up a composite "gridware".

At the conceptual and functional level, computing grids are often likened to power and communication grids. They should be seen as producers and consumers of computing power, storage and services. Hundreds of years of technological advances and engineering practice have resulted in the reliability of power grids (typically 99.9%) and phone systems (99.99%) and the availability of communication networks (99.999%), figures that indicate the maturity of these technologies.
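To make these percentages more tangible, the short sketch below converts an availability figure into the yearly downtime it implies. It assumes a 365-day year and treats each quoted percentage as a steady-state availability. The function name is chosen here for illustration only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime (in minutes) implied by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # The three figures quoted in the text above.
    for label, percent in [
        ("power grid", 99.9),
        ("phone system", 99.99),
        ("communication network", 99.999),
    ]:
        minutes = downtime_minutes_per_year(percent)
        print(f"{label}: {percent}% -> about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, each additional "nine" cuts the tolerated downtime by a factor of ten: from roughly 8.8 hours per year at 99.9% to a little over five minutes per year at 99.999%.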

Grids will also pose some specific problems for dependability modeling due to their heterogeneity, geographical dispersion, very large scale, lively dynamics and tricky (virtual) sharing of ownership. Generally speaking, grid modeling will have to address problems of scale and the high uncertainty related to inherent grid complexity.

An interesting research question will be to explore the scalability of known modeling methods and techniques versus the development of grid-specific approaches. As we know, modeling research nearly always precedes sound, practical advances.

Corporate Grid Applications

To make inroads into corporate computing, grids need a few remarkable, suitable applications that succeed in the market. These will stimulate further developments and ultimately lead to wider, profitable business deployments of corporate grids.

We can think roughly of two ways of creating grid applications: porting existing applications or creating entirely new (hopefully breakthrough) applications. They will address important problems characterized by scalability challenges, massive data sets (in the petabyte range) and/or collections requiring huge computing power (in the petaflops range).

Corporations today are actually virtual organizations encompassing customers, clients, partners and suppliers. Using grids to host such virtual business ecosystems will pose huge financial risks. Therefore, consideration of grid dependability is a must. To illustrate the scope of this challenge, I will describe an imaginary case of a service business hosting several clients and customers on a corporate grid.

A managed services business provides contractual IT management services, either for a specific, narrow domain or covering entire IT operations (Fig. 2). To achieve economies of scale, we provide shared infrastructure for cost-conscious customers and dedicated infrastructure for security-conscious clients. Aiming at standardization, we limit the number of applications to a standard set of IT services.

Fig. 2 Corporate Grid for Managed Services Business

Customers are typically globally dispersed and require a certain amount of customization. We observe that infrastructure and services are constantly evolving and adapting to changing circumstances. As customers share the same IT fabrics, we should keep clients and customers separated, ensure security and protect their privacy. The challenge is to deliver IT services from corporate grid infrastructure to a wide variety of customers while respecting contractual service level agreement (SLA) clauses, performing to service level objectives (SLOs) and achieving financial goals.
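As a minimal illustration of what "performing to SLOs" could mean in practice, the sketch below compares measured availability per customer against an agreed target and lists the customers whose objective was missed. The record layout, customer names and target values are hypothetical and not taken from any real contract or tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceRecord:
    """One customer's measurements for a reporting period (hypothetical layout)."""
    customer: str
    slo_availability_percent: float  # agreed target, e.g. 99.9
    uptime_minutes: float            # minutes the service was actually available
    total_minutes: float             # length of the reporting period in minutes


def measured_availability_percent(record: ServiceRecord) -> float:
    """Availability actually delivered over the reporting period."""
    return 100.0 * record.uptime_minutes / record.total_minutes


def slo_breaches(records: List[ServiceRecord]) -> List[str]:
    """Return the customers whose measured availability fell below their target."""
    return [
        r.customer
        for r in records
        if measured_availability_percent(r) < r.slo_availability_percent
    ]


if __name__ == "__main__":
    thirty_days = 30 * 24 * 60  # 43,200 minutes
    records = [
        ServiceRecord("shared-customer-A", 99.9, thirty_days - 10, thirty_days),
        ServiceRecord("dedicated-client-B", 99.99, thirty_days - 10, thirty_days),
    ]
    print(slo_breaches(records))
```

In this example the same ten minutes of downtime satisfies the 99.9% objective of the first customer but breaches the stricter 99.99% objective of the second, which is why shared fabrics serving differently contracted customers need per-customer tracking.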

At yet another, higher level of abstraction, corporate grids should be thought of as service-rendering engines, which different IT vendors call by different names: utility computing, on-demand computing, adaptive enterprise, etc.

Another potential use of corporate grids will be content rendering for various media companies. Other candidates include financial artifacts for the banking industry, insurance and investment artifacts (risk management for very large populations of customers), biomedical and genetic research, medical assessment and diagnosis assistance, computational biology, seismic data analysis for the oil industry, weather forecasting, etc. Generally speaking, ideal grid applications will be those that run several times with different datasets, rendering various artifacts and services for profit with guaranteed qualities.

The success of corporate grids will be marked by their ability to run a couple of major, complex applications for many clients and customers in varying circumstances across the globe. Thus, seeing large ERP, CRM and SCM applications running on corporate grids will be a clear sign of the success of grids in corporate environments.

Corporate Grids: Emerging Computational Fabrics

The key strength of mainframe computing lies in its ability to share resources among a very large number of applications without degrading performance, thanks to sophisticated resource management systems and sheer computing power. The quality of mainframe computing also embodies the accumulated wisdom and knowledge of several generations of computing professionals, thus offering unprecedented reliability and cost/price advantages. This also means a highly profitable business.

In my view, corporate grids have all the characteristics of a new, emerging computational fabric that exhibits structural and performance adaptability and is able to host and support various types of businesses. Blade servers, modular storage and open-source software are crucial technological components of contemporary commercial grids. The dependability of these key constituents will determine the overall, composite grid dependability.
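One common first approximation of how constituent dependability combines into composite dependability is the series/parallel availability model. The sketch below assumes that component failures are independent, which the article does not claim and which real grids may well violate. The component availabilities are hypothetical values chosen only for illustration.

```python
from math import prod
from typing import Iterable


def series_availability(availabilities: Iterable[float]) -> float:
    """All components must be up at once: multiply their availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities: Iterable[float]) -> float:
    """At least one redundant replica must be up: 1 minus the product of unavailabilities."""
    return 1.0 - prod(1.0 - a for a in availabilities)


if __name__ == "__main__":
    # Hypothetical availabilities (fractions, not percentages).
    blade_server = 0.99
    modular_storage = 0.995
    software_stack = 0.98

    # A chain with no redundancy is weaker than its weakest link.
    plain = series_availability([blade_server, modular_storage, software_stack])

    # Duplicating the blade server raises that stage's availability.
    redundant_blades = parallel_availability([blade_server, blade_server])
    improved = series_availability([redundant_blades, modular_storage, software_stack])

    print(f"no redundancy:    {plain:.6f}")
    print(f"redundant blades: {improved:.6f}")
```

Under the independence assumption, a chain of constituents is less available than any single one of them, while redundancy within a stage pushes that stage's availability up. This is one simple way to see why architecture and design choices weigh so heavily on overall grid dependability.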

Dependability of corporate grids will be key for this to happen in the not-so-distant future. It will require scientific exploration of the dependability of large-scale systems as well as technological advances. Interest will follow in grid monitoring and management, which are closely related to the aggregation and abstraction of computing resources. Questions of grid control and management have not posed serious issues for academic grids, as these are based on shared usage and loose (cooperative) control. They might, however, be a key obstacle to the commercial use of grids.

Concluding Thoughts

The development of dependability models for grids is currently either non-existent or in its very early stages. Research advances in this domain may create the conditions for breakthrough grid applications. These should be followed by important advances in grid monitoring, control and maintenance, which are closely related to grid dependability. It is also very likely that corporate grids will spawn not only novel technologies but also some innovative business models.

Concerns about grid dependability are closely related to grid complexity. However, I would guess that, by their very nature, corporate grids will have higher resilience and may gradually replace mainframes. This might happen in the next decade or two, when corporate grids achieve higher dependability and order-of-magnitude improvements in the price-performance ratio.

The study of grid dependability is a precondition for the gradual replacement of mainframes with computing grids. A grid management cockpit is envisioned as a distinctive, novel approach to grid management. Dashboards collate information flows into telling business indicators and are presentation-oriented. Business cockpits provide the functionality of dashboards with a decision-making orientation, thus closing the action loop. Consequently, decision-makers are not only aware of the nature and magnitude of the risk but are also able to act upon it.

Developments improving grid dependability over time will evolve into trust in novel technology, wider grid acceptance and ultimate business success.

About the Author

Kemal Delic is a lab scientist with Hewlett-Packard's operations R&D and a senior enterprise architect with experience in knowledge management, Bayesian nets modeling and real-time intelligent systems.
