Parallel and Incremental Analysis

This project examines the performance of trace analysis tools and the scalability to multiple cores. Doctoral
student T4D1 will study the problem of constructing a model of the traced system incrementally despite
missing information, because of lost events, or missing initial values not present in the trace or not yet
computed by other parallel analysis tasks. He will then propose new algorithms to tolerate the missing
information, determine which state information can still be computed, and label attributes with missing or
uncertain values. The missing or uncertain information may be corrected later using subsequent information, or information from parallel analysis tasks that become available at a later time. This will enable a
more robust modeled state computation algorithm. It will also facilitate the decoupling of the state computation for different time interval sections of the trace, allowing the otherwise sequential processing to
be efficiently performed in parallel. This project has three research axes:

  • Provide a framework to measure the interactive analysis performance of the Tracing and Monitoring Framework. He will then characterize the scalability of the algorithms in terms of different parameters such as the trace duration, the degree of overlap for the state intervals, and the model attribute tree size. He will study the effect of different possible trade-offs such as generating a partial state history tree, trading storage and bandwidth for computation costs, and propose a more efficient configuration and architecture, pipelining different tasks in paralle threads.
  • Propose new algorithms to split the state history database along different dimensions that can be computed in parallel and stored separately, such as separate attribute subtrees, subtrees for different resources, or separate time intervals. The challenge for time intervals is to insure that a sufficient proportion of states can be computed without the initial state values at the beginning of the interval, such that the parallel computation can be performed efficiently, minimizing the 
    need for fixing up later the missing information once state for ealier intervals is computed. In addition, the disk based format for the state history tree will need to be extended in order to allow for the parallel computation of the state for different sections.
  • Test, integration and performance analysis of the proposed algorithms on large industrial problems [With the help of Ericsson].