This page describes the work being undertaken in the NWO-funded project Foundations for Massively Parallel on-chip Architectures using Microthreading - Microgrids. This is a four-year project exploring some novel foundations for multi- and many-core chips. The project started in September 2005 and will finish in August 2009. It has already produced successful outcomes: the microthreading model has been adopted as the SANE Virtual Processor (the SVP model), where SANE stands for Self-Adaptive Network Entity, in the FP6 European Integrated Project AETHER. Work has also started in the FP7 European STREP project Apple-CORE on developing an infrastructure of compilers and tools for the SVP/Microthread model.
This project addresses a number of fundamental research questions in its attempt to provide a systematic approach to the development of many-core chips. Although the goals are very ambitious, the questions are posed incrementally, so the work has a direct impact on current developments while also providing a framework for future developments right up to the end of silicon scaling. The research questions can be summarised as follows:
Is it possible, through the introduction of simple and explicit concurrency controls, to develop a systematic approach to:
all within the context of ten to fifteen years of silicon-technology scaling (i.e. a more than thousandfold increase in chip density)?
The issues that are being researched in the Microgrids project are:
Microthreading is an execution model that breaks code down into fragments that can execute simultaneously. It provides data-driven synchronisation close to the processor in a distributed register file, which manages dependencies in pipeline operations. Memory is assumed to be slow and is synchronised in bulk using barriers on the model's families of threads. Recently we have extended the model to one in which complete programs can be decomposed into a parallel control structure over many threads. This control structure is built dynamically and is constrained by resources. A fragment may be as small as a single instruction.
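The data-driven synchronisation described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the project's hardware design: the `SyncRegister` class, the `run` scheduler and the `("read", ...)` / `("write", ...)` request tuples are hypothetical names introduced here. Each register carries a full/empty state, a thread that reads an empty register is suspended, and a later write to that register wakes it.

```python
from collections import deque


class SyncRegister:
    """Hypothetical register with a full/empty state, used for synchronisation."""

    def __init__(self):
        self.full = False
        self.value = None
        self.waiting = []  # threads suspended on a read of this register


def run(threads):
    """Run generator-based threads until none is ready.

    A thread yields ("read", reg) or ("write", reg, value).
    Reading an empty register suspends the thread until a write arrives.
    """
    ready = deque((t, None) for t in threads)
    while ready:
        thread, send_value = ready.popleft()
        try:
            request = thread.send(send_value)
        except StopIteration:
            continue  # thread has terminated
        if request[0] == "write":
            _, reg, value = request
            reg.value, reg.full = value, True
            # Wake every thread that was suspended on this register.
            for waiter in reg.waiting:
                ready.append((waiter, value))
            reg.waiting.clear()
            ready.append((thread, None))
        else:  # "read"
            _, reg = request
            if reg.full:
                ready.append((thread, reg.value))
            else:
                reg.waiting.append(thread)  # suspend until the data arrives


log = []
r = SyncRegister()


def consumer():
    x = yield ("read", r)  # suspends: r is still empty
    log.append(("consumer got", x))


def producer():
    log.append(("producer writes", 7))
    yield ("write", r, 7)


# The consumer is scheduled first but only completes after the producer's write.
run([consumer(), producer()])
```

In this model the order of execution is determined by the availability of data rather than by the order in which the threads were started, which is the property the register-file synchronisation is meant to provide.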
Microthreading can be implemented in any instruction set by adding support for the following instructions:
In addition to these instructions, some form of control stream is required to identify thread end points and context switch points. An example of the use of create for executing a loop concurrently is shown to the right. Create can also be used to represent task concurrency and instruction-level concurrency.
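As a rough software analogy for using create to execute a loop concurrently, the sketch below runs one thread per loop index and returns once the whole family has finished. The function signature (`start`, `limit`, `step`, `body`, `max_threads`) is an assumption made for illustration and is not the project's instruction encoding; `max_threads` merely stands in for the resource constraints mentioned in the text.

```python
from concurrent.futures import ThreadPoolExecutor


def create(start, limit, step, body, max_threads=4):
    """Hypothetical software stand-in for a create over a loop.

    Runs body(i) for every i in range(start, limit, step) as a family of
    threads, with at most max_threads running at once, and returns only
    when every thread in the family has finished.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [pool.submit(body, i) for i in range(start, limit, step)]
        for future in futures:
            future.result()  # re-raises any exception from a thread


# Sequential loop being replaced:
#     for i in range(0, 8): c[i] = a[i] + b[i]
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
c = [0] * 8


def body(i):
    # Each thread writes a distinct element, so the iterations are independent.
    c[i] = a[i] + b[i]


create(0, 8, 1, body)
```

The loop iterations here have no dependencies on one another, so no synchronisation is needed between threads of the family. A loop with dependencies between iterations would need some form of synchronisation between neighbouring threads in addition to the final wait.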
Implementations of microgrids use a tiled floorplan partitioned into clusters of processors forming allocation units. These units are allocated dynamically at any level of create, allowing concurrency to recursively unfold over a chip (or many chips) according to resource utilisation models and the dynamic metadata associated with each level of create instruction. The key features of this model are:
We have performed extensive cycle-accurate simulations of a microgrid. The results below are for the FFT. They show the speedup for FFTs of length 2^8, 2^12, 2^16 and 2^20 on n processors relative to the performance of a single processor. The same results are plotted on two different scales for clarity. These results are translated into performance in the final figure, assuming a 1.5 GHz clock.
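The two quantities plotted can be related to simulated cycle counts as in the sketch below. It rests on assumptions not stated on this page: the function names are hypothetical, and the operation count uses the common 5 N log2 N convention for a radix-2 complex FFT of length N, which may differ from the metric used in the figures.

```python
import math


def speedup(cycles_one_processor, cycles_n_processors):
    """Speedup of an n-processor run relative to a single processor."""
    return cycles_one_processor / cycles_n_processors


def fft_gflops(length, cycles, clock_hz=1.5e9):
    """Estimate FFT performance in GFLOP/s from a cycle count.

    Assumes 5 * N * log2(N) floating-point operations for an FFT of
    length N (a common convention, not confirmed by the source) and a
    clock of clock_hz, which defaults to the 1.5 GHz quoted in the text.
    """
    flops = 5 * length * math.log2(length)
    seconds = cycles / clock_hz
    return flops / seconds / 1e9
```

Given the cycle count of a simulated run, `speedup` yields the value on the speedup plots and `fft_gflops` converts the same count into an absolute performance figure at the assumed clock rate.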