In SPIN
By Tom Kennedy
Scientists of all kinds, from astronomers to DNA code breakers, require huge computational power to make sense of enormous volumes of data. Linking powerful computers through a grid, an approach known as grid-computing, is giving Irish-based scientists the computational power they need to store massive amounts of data, run predictive models, and even test out concepts of the origin of matter. The process has been facilitated by the HEA, which set up a high-capacity network called HEAnet.
Not only can researchers share enormous amounts of data, but complexity is no longer the barrier it was. Scientists can construct virtual molecules and model climate change, and physicists now have sufficient computational power to test out concepts on the origin of matter. As Professor Luke Drury, DIAS, one of Ireland's champions of super-computing, remarks, science, having gone from in vivo to in vitro, has now moved on to in silico.
The average computer sitting on the desk is an amazingly fast machine, performing millions of calculations a second, and storing gigabytes of data in memory. Processing speed continues to rise, doubling every year, but even so, the most powerful office machine has only a fraction of the processing power required for such tasks as molecular modelling or simulation of natural processes, such as earthquakes.
Scientists were quick to learn that machine computing is a very useful tool indeed, but the processing power of ordinary computers could never match their ambitions. Linking individual machines can overcome some of those problems, because the processing of data can be shared. By forming clusters, researchers in a number of institutions found that they could harness the processing capacity of a dozen or more machines. Instead of lying idle for some, or even most, of the time, the processors on otherwise unused machines could be set to work. The application of parallel processing made a big difference to scientists, and it meant that they were no longer constrained by the limitations of a single machine.
However, linking up scores of PCs has its limitations, and while Irish scientists were doing what they could through clustering, they realised that their computing resources were quite limited compared with the best international standards. In theory, any number of machines can be run in parallel, but in practice, technical and financial considerations limit how large such a system can become. This pooling of PCs, while very useful, can never be much more than a dedicated facility at a specific location.
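The idea of sharing out a job among clustered machines can be illustrated with a minimal Python sketch. It is not how any of the Irish clusters were actually programmed; the function names are hypothetical, and a pool of threads on one machine simply stands in for the separate computers of a cluster. The point is the pattern: split the job into chunks, let each worker handle one chunk, then combine the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(bounds):
    """Work done by one 'machine': sum the squares of the integers in [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(n, parts):
    """Divide range(n) into `parts` contiguous chunks of nearly equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for k in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + step + (1 if k < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def clustered_sum_of_squares(n, workers=4):
    """Farm the chunks out to a pool of workers and combine the partial results."""
    # Assumption: threads stand in for the separate machines of a real cluster,
    # so this shows the division of labour rather than a genuine speed-up.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split_range(n, workers)))


if __name__ == "__main__":
    print(clustered_sum_of_squares(1_000_000, workers=8))
```

On a real cluster the chunks would be sent to different machines over the network, which is where the technical limits on scaling begin to bite.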
For universities and institutions, such as DIAS and Met Éireann, this form of clustering was never going to be enough, so they invested in larger, more powerful machines. These machines are powerful mainly because they already incorporate a number of processors working in parallel.
However, even these machines do not have the processing power required for advanced computing. Like PCs, they too can be linked into clusters, and this remains the basic approach of grid-computing. Grid-computing was the real breakthrough into high-performance computing.
The existence of HEAnet facilitated the linking up of sites. HEAnet was developed as a high-capacity backbone for education and academic users. Apart from making use of that bandwidth, the development of grid computing gave the network's operators an opportunity to gain experience in managing more complex systems.
The existence of a high-capacity academic network made it possible for a few different groups to enter the realm of grid-computing. This sharing out of high-end computing resources can be compared to an electrical power distribution grid. As with a power grid, peaks in demand at one point can be met by excess capacity elsewhere, so resources are used efficiently, while the grid itself remains stable. With a computing grid, the computers are used more efficiently, and through good management, operators can schedule jobs to use some, or, if required, all of the available capacity.
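The load-balancing idea behind the power grid analogy can be sketched in a few lines of Python. This is an illustration only, not a description of how any real grid scheduler works: the site names and job sizes are invented, and the rule used (send each job to whichever site currently has the least work) is just one simple policy among many.

```python
import heapq


def schedule(jobs, sites):
    """Assign each (name, cpu_hours) job to the site with the least work so far.

    Returns (placement, loads): which site each job went to, and the total
    CPU hours queued at each site afterwards.
    """
    # Min-heap of (current_load, site) so the least-loaded site is always on top.
    heap = [(0, site) for site in sites]
    heapq.heapify(heap)
    placement = {}
    # Placing the biggest jobs first tends to give a more even spread.
    for name, hours in sorted(jobs, key=lambda job: -job[1]):
        load, site = heapq.heappop(heap)
        placement[name] = site
        heapq.heappush(heap, (load + hours, site))
    loads = {site: load for load, site in heap}
    return placement, loads


if __name__ == "__main__":
    # Hypothetical jobs and sites, for illustration only.
    jobs = [("climate_run", 5), ("galaxy_sim", 3), ("seismic_model", 2)]
    placement, loads = schedule(jobs, ["site_1", "site_2"])
    print(placement)
    print(loads)
```

In this example a peak of demand is absorbed by spreading the three jobs so that both sites end up with five CPU hours of work each.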
In Ireland there is now a fairly large and growing community of high performance computer users, and various groups have emerged. At TCD the Centre for High Performance Computing began in 1997.
Collaboration between the Geological Survey of Ireland, the Marine Institute and four universities led to the development of the MarineGrid. With HEA PRTLI funding, the Institute for Information Technology and Advanced Computing was established.
One of these grid-computing groups, CosmoGrid, brings together a number of partners working on the computational physics of natural phenomena. CosmoGrid, led by DIAS, involves the participation of DCU, NUIG, UCD, HEAnet, Met Éireann, Armagh Observatory, and Grid Ireland. CosmoGrid was established under the HEA PRTLI 3, with funding of €12 million to cover the purchase, commissioning and support of a 256-processor (CPU) node cluster. Like the other grid computing groups, CosmoGrid developed around a core of researchers working on themes that have something in common, but as experience has shown, the groups themselves have similar needs. This, in turn, has led to a great deal of convergence and to the emergence of a powerful national grid infrastructure.
As Prof Drury reports, setting up CosmoGrid provided a platform on which to build grid computing expertise. To date, 18 research students have embarked on or finished PhDs, over 100 peer-reviewed articles have been published, and a number of training courses in computational methods have been run.
Grid computing is driven by two main groups: the scientists who want to use more powerful tools, and the IT experts who develop and provide the technology. By working together in groups such as CosmoGrid, scientists and computer specialists have gained valuable experience, and as Prof Drury has pointed out, this is an essential part of building up the national infrastructure. Grid computing has progressed rapidly in the US and Europe, and unless scientists here have access to a strong enough infrastructure, Irish research could be left out on the periphery of international science.
The success of CosmoGrid and the other groups led DIAS to propose the development of e-INIS, the Irish National e-Infrastructure. The idea, explains Prof Drury, is to consolidate and build on the initiatives to date. Without groups such as CosmoGrid, he said, we could not now be thinking of taking a favourable position in the international arena. The initial progress, he explains, made it possible for Irish researchers to link up with similar groups abroad, and with further development that position would improve. Certainly this is a view shared by Science Foundation Ireland, which, like the HEA, has been a major supporter of grid-computing. Grid computing is now looked upon in much the same way as buildings and laboratories once were: as a vital building block to support the emergence of Irish science.
Prof Drury believes that grid-computing will move Irish science up to a higher level, and he said we can expect it to open up a lot of new opportunities. For example, it is now possible to link observatories to combine data on deep space. Irish astrophysicists already have valuable experience in grid-computing, so Ireland could become one of the points in an international network of observatories.
Because they have gained experience in high-end grid computing, Irish researchers have been able to participate in European projects such as JETSET, an astrophysics network, and VOLUME, a geophysics group involved in modelling large-scale geological processes. This ability to become involved in computing-intensive projects is being fostered, so Irish researchers are not likely to be left out of the in silico revolution. Greater recognition is being given to the skills scientists need to benefit from high-end computing. Workshops and training courses have become more common, and the universities plan to back these initiatives up with graduate courses for scientists in advanced computational methods.
Apart from researchers, Prof Drury would like everyone to know what grid computing can do for science. To create more awareness, an exhibition showing how advanced computing is being used in astronomy and studies of climate change is to be launched next year.
To reach beyond the needs of specialised groupings, the Irish Centre for High-End Computing, ICHEC, was formed to bring all the major players together. ICHEC, which is funded by SFI and the HEA (through PRTLI support of CosmoGrid), has been built, through collaboration, on existing projects. ICHEC is a distributed project, operating remotely between Dublin, Cork, and Galway. There is a great deal of sharing, not just of resources, but of people. Until last year, when Dr Jean Desplat came in from the Edinburgh Parallel Computing Centre to head up ICHEC, Prof Luke Drury from CosmoGrid was the Acting Director.
No matter what discipline scientists are working in, their computational needs are usually much the same. As Prof Luke Drury explains, physicists studying a whole range of natural phenomena, from earthquakes to supernova explosions, face the same computational challenges. The calculations applying to fluid dynamics also govern convection in the Earth's core, waves in the atmosphere, and blasts from supernovae. Until the development of grid computing, the scientists involved in these studies had little opportunity to interact. CosmoGrid not only provided a common platform; it also created a 'virtual department' for scientists working at the higher end of computing.
How fast is fast? At one Gigaflop, a processor can perform 1,000,000,000 basic multiplications, referred to as floating point operations, a second. To keep up with the rest of international computing, Irish researchers need machines 1,000 times faster again to enter the Teraflop class. A Teraflop represents 1,000,000,000,000 (10^12) floating point operations per second.
Two years ago, Dr Andy Shearer, an astrophysicist at NUI Galway, and one of those involved in ICHEC, noted that Ireland had an installed capacity of 6,000 Gigaflops, and up to twenty installations had 64 Gigaflops or more. Since then that capacity has been rising fast, as has the experience needed to undertake computationally intensive projects. Prof Drury at DIAS said that we are now in a position to think seriously about joining international projects as equal partners.
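For readers who like to check the arithmetic, the relationship between Gigaflops and Teraflops can be worked through in a short Python sketch. The function names and the size of the example job are invented for illustration; only the definitions of the two units come from the text.

```python
GIGAFLOP = 10**9   # floating point operations per second
TERAFLOP = 10**12  # 1,000 times a Gigaflop


def gigaflops_to_teraflops(gigaflops):
    """Convert a speed given in Gigaflops to Teraflops."""
    return gigaflops * GIGAFLOP / TERAFLOP


def seconds_for(operations, flops):
    """Time needed for a job of `operations` floating point operations
    on a machine sustaining `flops` operations per second."""
    return operations / flops


if __name__ == "__main__":
    # Hypothetical job of a million million operations.
    job = 10**12
    print(seconds_for(job, GIGAFLOP))   # time on a one-Gigaflop machine
    print(seconds_for(job, TERAFLOP))   # time on a one-Teraflop machine
```

A job that keeps a one-Gigaflop processor busy for 1,000 seconds, roughly a quarter of an hour, would take a one-Teraflop system a single second, assuming both machines sustain their rated speed.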