Dedicated and Desktop Grids: two members of a bigger computing ecosystem
What is Grid Computing about?
The Grid’s goal is to share resources that belong to multiple administrative entities and provide them as a unified resource to end-users, who may themselves belong to multiple administrative domains. CERN is betting heavily on this concept with the LHC Computing Grid project (also known as “LCG”), in an effort to tackle the huge storage and computational demands of the deluge of data projected to come out of the LHC detectors. The LCG example, consisting of Tier-1 and Tier-2 clusters, is a dedicated grid. In the most holistic view, the Grid could be the world-wide infrastructure encompassing any networking, storage and computing resources that are available globally for shared usage. In this ecosystem we could envision supercomputers accessible by means of digital certificates, clusters big and small, storage devices, scientific instruments of any kind, networking facilities and … perhaps desktop systems. But could the machine you are reading this article on possibly have anything to offer next to modern, petaFLOP-scale, super-capable clusters and the constellations thereof?
Recycling PCs
The idea of recycling PCs that sit mostly idle on office desks is not new at all. Already since the previous millennium, the SETI@HOME project, spawned by Prof. David Anderson of the University of California at Berkeley, has been analyzing radio signals in the hope of finding interesting patterns in them. No alien discovery has taken place, other than the realization that volunteered PC resources can actually help science somewhat. The fact is, over two million CPU-years of aggregate computing time have been donated. That infrastructure has now evolved to serve multiple scientific disciplines and has also grown in scale: it hums along at 4.725 petaFLOPS as of Q1 2010 with the help of more than 600,000 donating hosts [1]. Another related project, Folding@Home, has by now also exceeded the 4.2 sustained petaFLOPS mark, mostly thanks to contributions in kind: PlayStation 3 units and GPU cycles. To be sure, there is still more to come, as more and more people invest in gaming technology with strong GPUs, hook onto the network, read the news and … ride the wave.
Where do Desktop Grids fit in the picture?
The LCG project is driven by CERN and is based on gLite, a grid middleware stack developed over the last decade to incorporate data and computing centers across different countries into a single infrastructure, following specific standards. Since all this equipment performs under a common umbrella, for a common target and on a continuous basis, this is called a Dedicated Grid. Dedicated Grids have certain parameters well tuned: confidentiality, reliability and software compatibility included. At the very opposite extreme are donated resources like SETI@HOME: low trust, low reliability and availability, and often questionable compatibility with any particular software library or stack. The gap between these two extremes is where desktop grids may flourish: collections of PC resources that are not completely random in their reliability and trust parameters (even if both are low!) and that may even present a somewhat more homogeneous software appearance through the use of Virtual Machines. Counteracting these varying factors has already been explored extensively by the relevant academic community, and there is a good body of knowledge to count on. For instance, a common counter-argument against desktop systems concerns their trustworthiness and reliability; this is commonly addressed with very short jobs, job replication or resubmission policies, voting techniques, reputation management and so on (see the sketch below). Current grid workflow tools already support many of these facilities as default features, since Dedicated Grids themselves are not 100% reliable. The more desktop PCs grow in capability, the more attractive this computing style becomes; and the gaming industry seems to be making sure that this happens rapidly.
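As an illustration of the replication-and-voting idea mentioned above, here is a minimal sketch in Python; it is not taken from any particular middleware, and the host objects, replica count and quorum value are purely illustrative assumptions.

from collections import Counter

def run_replicated(job, hosts, replicas=3, quorum=2):
    """Send the same work unit to several volunteer hosts and accept a result
    only if at least `quorum` of them agree; otherwise retry on fresh hosts.
    `host.execute(job)` stands in for real middleware dispatch."""
    if len(hosts) < replicas:
        raise RuntimeError("ran out of volunteer hosts without reaching a quorum")
    results = [host.execute(job) for host in hosts[:replicas]]
    value, votes = Counter(results).most_common(1)[0]   # most frequent answer
    if votes >= quorum:
        return value                                     # consensus reached
    return run_replicated(job, hosts[replicas:], replicas, quorum)  # resubmit elsewhere

Real volunteer-computing systems combine such checks with reputation bookkeeping, so that hosts which frequently disagree with the majority are trusted less over time.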
What is this LiveWN fuss about?
There are two ends that must meet: the donating user and the domain scientist. For this to happen, the burden of making the connection should rest with the fabric rather than with re-training people. One such effort is LiveWN [3], standing for “Live Worker Node”, which tries to address both needs in a user-friendly way. The idea is that we can distribute a pre-configured OS image on a CD, DVD or USB disk that is easy to use and requires, if possible, zero knowledge from the end-user. Here, the end-user could be either the donating user or the scientist. LiveWN started as a spin-off activity to assist ATLAS/CERN physicists located at NTUA in Athens, Greece. A recent publication in the Journal of Grid Computing [4] describes how such a technology can be deployed in the Greek School Network, so that the better subset of its 60,000 desktop machines can not only become part of the existing HellasGrid country-wide multi-site grid infrastructure but actually act as a component of the global grid environment. This attempt keeps compatibility and dependability aspects in mind, in particular in the context of the third-generation multi-lambda GRNET network backbone, which includes certain “corners” of working grid sites. The research team, in which a CSCS/ETHZ employee is involved, has done the preparatory work to show that this model is viable and that the prototype does deliver the promised advantages. At this stage, what remains to be done is large-scale deployment and … giving municipalities incentives to cover the tiny (but noticeable on a frugal budget) electricity costs.
What makes LiveWN an interesting technology?
The particular aspect of LiveWN highlighted here is its ability to function in virtually any environment: within Virtual Machines, behind firewalls, in private-address-space networks, while still providing a full service environment that makes the most of the available resources (CPU, network, storage). To clarify: the scientist can assume, for example, that the same datasets available in the default grid environment are also available on the desktop resource, accessed in the same manner and with the same tools, even when constrained by bandwidth or other factors. In fact, LiveWN can itself be used as a prototyping environment because it implements a User Interface: a scientist can initiate a workflow from a LiveWN instance directly (see the sketch below). This is of particular interest in the case of Switzerland, because the existing Shibboleth/AAI infrastructure allows a working out-of-the-box implementation. Think of this for a moment: you boot from a disk in your pocket, anywhere in the world, you supply your institutional credentials and you have direct access to supercomputers, clusters on the grid and desktop resources all at once! You can give and take resources, unbound from any physical constraint. And if end-user security can be enhanced by a hardware token carried along, that can be taken advantage of as well!
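To make the idea of a “User Interface in your pocket” concrete, the following Python sketch shows what submitting a trivial job from a LiveWN instance could look like, assuming the standard gLite UI tools are present on the image; the VO name, file names and job contents are placeholders rather than a prescription.

import subprocess

# A trivial gLite job description (JDL); the output files come back in the sandbox.
jdl = """Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
"""
with open("hello.jdl", "w") as f:
    f.write(jdl)

# Obtain a VO proxy, then submit through the gLite Workload Management System.
# "somevo.example.org" is a placeholder VO name, not a real virtual organisation.
subprocess.run(["voms-proxy-init", "-voms", "somevo.example.org"], check=True)
subprocess.run(["glite-wms-job-submit", "-a", "-o", "jobids.txt", "hello.jdl"], check=True)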
What could be the practical benefit of this approach?
Is there a problem where a research scientist would have to use these different types of resources within the same problem? Consider the following scenario: a meteorologist wants to spawn a new computational model. Naturally, due to the tightly coupled nature of the computation, an HPC system is the appropriate environment for controlling the model with full features enabled. Once its basic parameters have been established, the model can be scaled down for the smaller multi-core systems commonly found in clusters and run in a parametric manner spanning longer time periods and more geographical zones. Once this step is done, the results can be quickly correlated with existing datasets by exploiting the many disk spindles and cheap parallel capacity of a desktop grid. One area in which a desktop grid can deliver very interesting performance is data mining in general: searching large volumes of text, scientific datasets, financial reports, data from High Energy Physics detectors and so on. The reason is that, even though each individual processor is modest, many memories and disks are exploited in parallel; GPUs can amplify the advantage further. Once the host population exceeds five digits or so, all these very interesting applications can emerge on a desktop grid and perhaps even make sense financially.
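As a toy illustration of why such scans parallelize so naturally, the following Python sketch splits a search over pre-split chunks of data; the file names and search pattern are made up, and a local process pool merely stands in for the many desktop hosts that would each scan a chunk from their own disk.

from multiprocessing import Pool

def scan_chunk(path, pattern="interesting event"):
    """Count occurrences of a pattern in one chunk; chunks are independent,
    so the workers need no coordination at all."""
    with open(path, errors="ignore") as f:
        return sum(line.count(pattern) for line in f)

if __name__ == "__main__":
    # Hypothetical pre-split dataset; on a desktop grid each chunk would be
    # shipped to (or already reside on) a different volunteer PC.
    chunks = ["dataset_part_%03d.txt" % i for i in range(100)]
    with Pool() as pool:              # stand-in for many desktop hosts
        hits = pool.map(scan_chunk, chunks)
    print("total matches:", sum(hits))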
Not only has data mining always been an interesting area for Desktop Grids, but the record for ecological data sorting (sorted data per unit of energy) was recently broken again by an Atom/SSD-based system, now heading for an ACM SIGMOD medal! [5]
Is this concept going to fly? We will only know if we try.
[1] http://en.wikipedia.org/wiki/Berkeley_Open_Infrastructure_for_Network_Computing
[2] http://en.wikipedia.org/wiki/List_of_distributed_computing_projects
[3] LiveWN and gLiteDVD: true grid scavenging solutions http://www.isgtw.org/?pid=1000655
[4] A Grid-enabled CPU Scavenging Architecture and a Case Study of its Use in the Greek School Network, Journal of Grid Computing, http://springerlink.com/openurl.asp?genre=article&id=doi:10.1007/s10723-009-9143-2
[5] http://itnews.com.au/News/170749,netbook-processors-break-power-efficiency-benchmark.aspx
http://sortbenchmark.org/
http://hardware.slashdot.org/story/10/03/29/0251207/Atom-Processors-Set-New-Record-For-Power-Efficient-Sorting?art_pos=1&art_pos=8&art_pos=8