
Productivity and Innovation - Distributed & High Performance Systems


Projects within the Distributed and High Performance Systems theme include research into a unified grid programming methodology for global e-science, and new approaches to debugging, which will be applicable to very large 'petascale' parallel supercomputers.

Research in this area in the Faculty of Information Technology is primarily led by researchers in the Centre for Distributed Systems and Software Engineering (DSSE).

Researchers from our Faculty working in this area include: Professor David Abramson, Associate Professor Shonali Krishnaswamy (Director, DSSE), Associate Professor Arkady Zaslavsky and Associate Professor Andrew Paplinski.

Adaptive data stream processing in heterogeneous distributed computing environments using real-time context

Researchers: A/Prof Shonali Krishnaswamy, A/Prof Arkady Zaslavsky (Luleå University of Technology, Sweden), Dr Mohamed Gaber
Centre: DSSE
Funding: ARC Discovery 2008-2010
Project outline: This project falls within the ARC research priority goal, Smart Information Use.
The innovative contributions of this project, through the development of adaptive data stream mining algorithms for heterogeneous devices, will have an impact on a range of emerging application areas, such as: meeting the time-critical, intelligent information needs of the mobile workforce (e.g. mobile healthcare professionals, stockbrokers); improving intelligent transportation systems via in-vehicle analysis and crash prevention; and facilitating 'on-board' analysis in sensors that monitor the environment and patients. The project will enhance Australia's leading international role in the area of data stream processing in distributed computing environments.
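To give a concrete flavour of this kind of adaptation, the sketch below shows context-aware load shedding for a data stream in Python. It is a minimal illustration under assumed names (read_context, target_fraction), not an algorithm from the project: the battery and queue-length readings stand in for the real-time context the project refers to, and the fraction of elements processed is adjusted as that context changes.

# Illustrative sketch only, not a project algorithm: context-aware load shedding
# for a data stream. The hypothetical context readings (battery level, queue length)
# stand in for the real-time context the project refers to.
import random

def read_context():
    # Hypothetical context provider; a real deployment would query the device or OS.
    return {"battery": random.uniform(0.1, 1.0), "queue_length": random.randint(0, 100)}

def target_fraction(ctx):
    # Process fewer stream elements when the battery is low or the queue is backing up.
    fraction = ctx["battery"]
    if ctx["queue_length"] > 50:
        fraction *= 0.5
    return max(0.05, min(1.0, fraction))

def process(element):
    return element * element  # placeholder for the real analysis step

def run(stream):
    results, kept = [], 0
    fraction = 1.0
    for i, element in enumerate(stream):
        if i % 20 == 0:  # re-read the context periodically rather than per element
            fraction = target_fraction(read_context())
        if random.random() <= fraction:  # probabilistic load shedding
            results.append(process(element))
            kept += 1
    print(f"processed {kept} of {len(stream)} elements")
    return results

if __name__ == "__main__":
    run(list(range(1000)))

In a real deployment the adaptation policy would be far richer; the point here is only the feedback loop between observed context and processing rate.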

A Unified Grid Programming Methodology for Global e-Science

Researchers: Prof David Abramson
Centre: DSSE
Funding: ARC Discovery 2007-2011
Project outline: Modern science requires huge computational resources and has become Global e-Science, going far beyond individual supercomputers.
Grids harness geographically distributed resources: dozens of supercomputers, workstations, clusters of computers and databases, together with scientific instruments such as telescopes or synchrotrons. Currently, Grids are difficult to use because they lack key software infrastructure. We shall develop this infrastructure both by extending available Grid services and by building new software tools. Australian e-Science case studies will be pursued in the environmental sciences, life and health sciences, and geosciences, and will link to global Grids, extending Australia's scientific capabilities globally.

A Scalable Debugging Framework for Petascale Computers

Researchers: Prof David Abramson (David.Abramson@monash.edu), Donny Kurniawan, Minh Ngoc Dinh (Ngoc.Minh.Dinh@monash.edu), Luiz de Rose, Bob Moench
Partners: Cray Inc.
Centre: DSSE
Funding: ARC Linkage 2008-2011
Project website:  https://messagelab.monash.edu.au/FundedProjects
Project outline: This project concerns a new approach to debugging, which will be applicable to very large 'petascale' parallel supercomputers. Our approach will make it possible to find errors in programs as they are moved from smaller systems to the new generation of enormous machines.

Building on our prior successful ARC-funded Discovery grants, we will devise techniques that scale to tens of thousands of processors. The outcomes will be a range of new debugging mechanisms, as well as a commercial-quality implementation that can be exploited by our industry partner.

Over the years, supported in part by two ARC Discovery projects and commercial sponsorship, we have devised (and patented) a new technique for debugging programs, called 'relative debugging'. Relative debugging is very different from conventional techniques: it helps identify problems that arise when a program is modified, or ported to another system. It does this by allowing a user to run two copies of a program at the same time (possibly on different machines), and to automatically compare data structures across the programs. This makes it possible to identify the first point at which the two programs diverge, which can then be used to isolate errors in the new code (a simplified sketch of this idea follows the list below). Importantly, relative debugging augments existing techniques. It is extremely effective because it focuses on data rather than control, and thus the user's experience scales up as machine size increases. We have performed many case studies in which errors have been isolated extremely quickly, in some cases by programmers who were not familiar with the programs they were debugging. We have produced a number of research prototypes of relative debugging (called Guard), and have vested the IP in a start-up company called Guardsoft.

Relative debugging techniques address many important issues in debugging programs on petascale machines. However, our current implementation strategies have not been designed to scale to a very large number of processors, and the techniques for analysing (and comparing) large distributed data structures do not scale beyond some tens of processors. We believe that there are a number of ways to expand this dramatically to meet the petascale challenges cited, but it is critical that we perform research on the most appropriate and effective techniques. In this proposal we aim to address the shortcomings of our current implementation strategies. Specifically, we plan to:
• design scalable strategies that allow distributed data to be compared and analysed on very large machines;
• explore a range of new comparison techniques (and metrics) that might be more appropriate for large machines;
• design a debugging engine that scales to tens of thousands of processors;
• build a proof of concept implementation and test this on Cray’s Cascade systems.
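The project's contribution is the scalable machinery behind such comparisons; the toy Python sketch below only illustrates the basic relative-debugging idea of running a reference and a suspect version side by side and reporting the first divergence in a chosen data structure. The function names and the injected bug are invented for the example and are not part of Guard.

# Toy illustration only, not the Guard implementation. It mimics the core idea of
# relative debugging: run a reference and a suspect version side by side and report
# the first point at which a chosen data structure diverges.
def reference_step(state, i):
    return [x + i for x in state]  # "known good" version of one computation step

def suspect_step(state, i):
    # Hypothetical ported version with an injected bug at iteration 3.
    return [x + i + (1 if i == 3 else 0) for x in state]

def relative_debug(steps=6, size=4, tolerance=0.0):
    ref = [0.0] * size
    sus = [0.0] * size
    for i in range(steps):
        ref = reference_step(ref, i)
        sus = suspect_step(sus, i)
        # "Assertion" point: compare the two data structures after each step.
        diffs = [j for j, (a, b) in enumerate(zip(ref, sus)) if abs(a - b) > tolerance]
        if diffs:
            print(f"first divergence at step {i}, elements {diffs}")
            return i, diffs
    print("no divergence observed")
    return None, []

if __name__ == "__main__":
    relative_debug()

On a petascale machine the two runs would be large parallel programs and the comparison itself would need the scalable, distributed strategies listed above.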

A Scalable Development Environment for Peta-scale Computing

Researchers: Prof David Abramson, Greg Watson
Partners: US Department of Energy
Centre: DSSE
Funding: US Department of Energy
Project website: https://messagelab.monash.edu.au/FundedProjects
Project outline:

Traditional sequential debuggers provide functions that allow a user to control an executing program, and then to observe the internal state of the system. These actions are typically provided by breakpoints and associated inspection commands.
Debugging typically follows a process of iterative refinement in which the user repeatedly compares the state of the program at various staging points with their expectations. The process can be long and complicated, and may involve multiple iterations in which the code is restarted. If the errors are non-deterministic, then debugging becomes extremely complex, because the same error may not occur more than once.

Debugging parallel programs is significantly more difficult than debugging sequential ones, for two reasons. First, most parallel programs have multiple independent threads of control (or tasks), and any of these can cause an error. Second, the state of the art in parallel debuggers is even more basic than that of sequential ones. Almost all parallel debuggers simply reproduce the same functions found in sequential ones, and in addition allow a user to control more than one thread of execution. Thus, a user needs to plant breakpoints in multiple processes, and then examine the state of each independent process. If a data structure has been decomposed and distributed across multiple processors, then the user must understand not only the expected state of the structure, but also the location of its sub-components. In spite of this, parallel debuggers have been produced and have been effective on relatively small programs. However, it is extremely unlikely that the approaches taken to date will scale to the numbers of processes expected in the next generation of parallel machines.

In this proposal, we aim to explore debugging constructs that can be used on much larger parallel machines, and plan to investigate a data-driven approach rather than the current task-centric view of debugging. We aim to provide mechanisms that offer collective operations for debugging multiple processes. The basic idea is that in most circumstances users will not want to interact with an unrelated set of parallel control threads, but will want to examine and reason about large distributed data structures. This idea stems from prior work on relative debugging, but our intention is to apply the ideas to debugging single parallel programs.
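As an indication of what a data-centric, collective debugging operation might look like, the sketch below uses mpi4py (assumed to be available alongside an MPI installation) to check a block-distributed array against an expected formula with a single collective reduction, rather than inspecting each process individually. It is a minimal sketch, not the project's design; the even block distribution, the function names and the injected fault are assumptions of the example.

# Illustrative sketch, not the project's tool: a collective, data-centric "debug assert"
# over a block-distributed array using mpi4py. Each rank checks its local slice, and one
# collective reduction tells every rank whether, and where, the global structure first
# diverges from expectation.
from mpi4py import MPI
import numpy as np

SENTINEL = np.iinfo(np.int64).max  # means "no divergence on this rank"

def collective_assert(local, expected_fn, comm):
    rank = comm.Get_rank()
    n_local = local.shape[0]
    offset = rank * n_local  # assumes an even block distribution
    expected = expected_fn(np.arange(offset, offset + n_local))
    bad = np.nonzero(~np.isclose(local, expected))[0]
    local_first = int(offset + bad[0]) if bad.size else SENTINEL
    # One collective reduction replaces per-process inspection.
    global_first = comm.allreduce(local_first, op=MPI.MIN)
    return None if global_first == SENTINEL else int(global_first)

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    n_local = 4
    data = np.arange(rank * n_local, (rank + 1) * n_local, dtype=float) ** 2
    if rank == 1:
        data[2] += 1.0  # inject a fault on one rank
    first_bad = collective_assert(data, lambda i: i.astype(float) ** 2, comm)
    if rank == 0:
        print("first divergent global index:", first_bad)

Run under mpirun with several ranks, every process learns the first divergent global index in one step; making such comparisons scale to very large, irregular distributed structures is where the research questions lie.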


Adaptive Data Stream Classification for Wireless Sensor Networks

Researchers: Dr Mohamed Medhat Gaber, Mr Ary Shiddiqi
Centre: DSSE
Project outline: The project is concerned with developing a novel adaptive data stream classification technique for wireless sensor networks.
The new technique is coined RA-Class, in reference to its resource awareness and adaptability. The technique has many important applications in classifying detected events in wireless sensor networks. Experimental results have demonstrated the validity of the technique, showing high accuracy and adaptability. To the best of our knowledge, RA-Class is the first deployed distributed in-network classifier for wireless sensor networks.
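The published RA-Class algorithm is not reproduced here; the Python sketch below only conveys the flavour of a resource-aware, in-network classifier: a nearest-centroid model whose size shrinks when a hypothetical memory budget is exceeded. The class and parameter names are invented for the example.

# Illustrative sketch only; the published RA-Class algorithm is not reproduced here.
# A nearest-centroid classifier for a sensor node that merges centroids when a
# hypothetical memory budget (max_centroids) is exceeded.
import math

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ResourceAwareClassifier:
    def __init__(self, max_centroids=20):
        self.max_centroids = max_centroids  # stands in for the node's memory budget
        self.centroids = []                 # list of [vector, label, weight]

    def learn(self, x, label):
        # Fold the example into the nearest centroid of the same label, or create one.
        same = [c for c in self.centroids if c[1] == label]
        if same:
            nearest = min(same, key=lambda c: _dist(c[0], x))
            vec, _, w = nearest
            nearest[0] = [(v * w + xi) / (w + 1) for v, xi in zip(vec, x)]
            nearest[2] = w + 1
        else:
            self.centroids.append([list(x), label, 1])
        if len(self.centroids) > self.max_centroids:
            self._shrink()

    def _shrink(self):
        # Resource adaptation: merge the two closest centroids that share a label.
        pairs = [(_dist(a[0], b[0]), a, b)
                 for i, a in enumerate(self.centroids)
                 for b in self.centroids[i + 1:] if a[1] == b[1]]
        if not pairs:
            return
        _, a, b = min(pairs, key=lambda p: p[0])
        wa, wb = a[2], b[2]
        a[0] = [(x * wa + y * wb) / (wa + wb) for x, y in zip(a[0], b[0])]
        a[2] = wa + wb
        self.centroids.remove(b)

    def classify(self, x):
        if not self.centroids:
            return None
        return min(self.centroids, key=lambda c: _dist(c[0], x))[1]

In use, a node would call learn(reading, event_label) for labelled events and classify(reading) otherwise, with the centroid budget set from the node's actual memory.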

Improved Models of Calcium Ion Dynamics

Researchers: Prof David Abramson, Anna Sher, David Gavaghan, Denis Noble and Penelope Noble (Oxford University)
Centre: DSSE
Project outline: This work aims to improve existing ventricular cell models by replacing their description of Ca2+ dynamics with local Ca2+ control models.
This required that the parameters of the Ca2+ subsystem be re-fitted; Nimrod/O was used to optimise these.
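As an indication of what the re-fitting step involves, the sketch below uses scipy.optimize as a stand-in for Nimrod/O (whose interface is not shown here), fitting the parameters of a deliberately simple two-parameter Ca2+ transient shape to reference data by least squares. The model, parameter values and data are toy assumptions, not the published cell model.

# Illustrative sketch: scipy.optimize stands in for Nimrod/O, and the two-parameter
# transient shape and "reference" data are toy assumptions, not the published model.
import numpy as np
from scipy.optimize import minimize

t = np.linspace(0.0, 1.0, 200)  # time (s)

def ca_transient(params, t):
    amplitude, tau = params
    return amplitude * (t / tau) * np.exp(1.0 - t / tau)  # simple calcium-transient shape

reference = ca_transient([1.2, 0.15], t)  # plays the role of the target behaviour

def objective(params):
    # Sum-of-squares mismatch between the candidate parameters and the reference.
    return float(np.sum((ca_transient(params, t) - reference) ** 2))

result = minimize(objective, x0=[0.5, 0.05], method="Nelder-Mead")
print("fitted amplitude and tau:", result.x)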

Robust Face Detection and Recognition for Computer-based Security Surveillance

Researchers: A/Prof Andrew Paplinski, Prof Bala Srinivasan and Dr J Sherrah
Centre: CRIS, DSSE
Funding: ARC Linkage 2004-2007 (Completed)
Project outline: The research aims at improving existing, and creating new, automated face detection and recognition methods by making them invariant, firstly to head pose, orientation, scale and rotation, and then to occlusion, lighting conditions and facial expressions.
A robust face detector will be developed first, followed by a new face recognition algorithm that continues to learn identity-specific discriminants on-line by collecting incremental face exemplars. The result of the research will be an algorithm that can improve its performance on-line, adapting each identity model to the correct facial examples in a stable learning process. The research has significant practical implications for visual surveillance, increasing the robustness of identifying a person's identity, state and intent.
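To illustrate the on-line, exemplar-driven part of this idea, the sketch below keeps one running-mean model per identity and only folds in a new exemplar when the match is confident, which is one simple way to keep incremental learning stable. The embedding step, class names and threshold are assumptions for the example, not the project's method.

# Minimal sketch, not the project's algorithm: on-line accumulation of identity-specific
# face exemplars. Each identity model is a running mean of feature vectors and is only
# updated when the match is confident, keeping the incremental learning stable.
import numpy as np

def embed(face_image):
    # Hypothetical placeholder for the face-detection and feature-extraction stages.
    return np.asarray(face_image, dtype=float).ravel()

class IncrementalFaceRecogniser:
    def __init__(self, accept_distance=0.5):
        self.accept_distance = accept_distance
        self.models = {}  # identity -> (mean feature vector, exemplar count)

    def enrol(self, identity, face_image):
        self.models[identity] = (embed(face_image), 1)

    def recognise(self, face_image):
        x = embed(face_image)
        if not self.models:
            return None, float("inf")
        identity, (mean, _) = min(self.models.items(),
                                  key=lambda kv: np.linalg.norm(kv[1][0] - x))
        return identity, float(np.linalg.norm(mean - x))

    def update(self, face_image):
        # Incorporate a new exemplar only when the match is confident enough.
        identity, distance = self.recognise(face_image)
        if identity is not None and distance <= self.accept_distance:
            mean, n = self.models[identity]
            self.models[identity] = ((mean * n + embed(face_image)) / (n + 1), n + 1)
        return identity, distance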