Hardware cannot continue to scale at the rate users demand. Current technology has hit frequency and power walls. To continue improving power usage and application performance beyond these walls, we must move to platforms with multiple processing elements running at reduced frequency and power.
As new hardware is developed with tens of cores, and in the near future hundreds of cores, software engineers face the problem of programming these systems. How will the programmer determine an optimal mapping of computation to processing elements?
The rise of heterogeneous multi-core systems adds another dimension to the problem. Processing elements in the same system can have varying capabilities, instruction set architectures, modes of communication and synchronization, models of computation, and operating frequencies. How will the programmer orchestrate these heterogeneous cores, using their individual strengths to increase application performance and reduce power consumption?
Chimera is a set of tools that helps programmers meet the challenges of developing applications for heterogeneous multi-core systems. The Chimera tools assist in parallelizing both existing and new software, and they automate the orchestration of low-level compilers to build optimized hybrid code.
The solution consists of four main elements:
- Source-level program analysis supporting "guided" parallelization and fragmentation of code into concurrently executable code snippets (sourcelets).
- A compiler orchestration engine and binary-level component framework for heterogeneous code composition.
- A code planning engine for mapping computation to processing elements.
- A run-time code optimization engine for increasing power/performance efficiency.
Program analysis slices the program into sourcelets, each of which can be compiled into codelets targeting one or more of the available heterogeneous processors. The source-to-source transformation engine achieves this with pluggable transformation strategies that transform the programmer's source code into source code that the underlying platform's target compilers can compile.
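To make the idea of pluggable transformation strategies concrete, here is a minimal Python sketch. It is not Chimera's API: the registry, the `register_strategy` decorator, the target names `cpu` and `accelerator`, and the `__kernel` prefix are all hypothetical, chosen only to show how one sourcelet might be rewritten differently for each target compiler.

```python
from typing import Callable, Dict, List

# Hypothetical type: a strategy rewrites sourcelet text into target-specific source text.
Strategy = Callable[[str], str]

# Registry of strategies keyed by an assumed target name.
_STRATEGIES: Dict[str, Strategy] = {}


def register_strategy(target: str) -> Callable[[Strategy], Strategy]:
    """Return a decorator that plugs a strategy into the registry under `target`."""
    def decorator(fn: Strategy) -> Strategy:
        _STRATEGIES[target] = fn
        return fn
    return decorator


@register_strategy("cpu")
def cpu_strategy(sourcelet: str) -> str:
    # Assumption: the host compiler accepts the sourcelet unchanged.
    return sourcelet


@register_strategy("accelerator")
def accelerator_strategy(sourcelet: str) -> str:
    # Illustrative only: prepend a made-up kernel qualifier for the target compiler.
    return "__kernel " + sourcelet


def transform(sourcelet: str, targets: List[str]) -> Dict[str, str]:
    """Apply the registered strategy for each requested target to one sourcelet."""
    return {target: _STRATEGIES[target](sourcelet) for target in targets}


variants = transform("void scale(float *x, int n);", ["cpu", "accelerator"])
```

Adding support for another processor type in this sketch means registering one more function; the `transform` driver does not change.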
Orchestrating multiple existing compilers relies on Chimera's ability to form hybrid binaries, together with a loader/execution framework that can deploy and execute the individual components appropriately. Chimera's hybrid code architecture provides abstractions of general memory transfer, execution control, and resource control that any target processing element can provide. As the main line executes, the binary interface integrated with the target codelets distributes and controls concurrently executing code.
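The three abstractions named above (memory transfer, execution control, and resource control) could be captured in an interface along the following lines. This is a sketch under stated assumptions, not Chimera's binary interface: the class `ProcessingElement`, its method names, the in-process `HostElement`, and the `dispatch` helper are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict

# Hypothetical codelet shape: a callable that works on the element's named buffers.
Codelet = Callable[[Dict[str, bytes]], None]


class ProcessingElement(ABC):
    """Assumed minimal contract that any target processing element would implement."""

    @abstractmethod
    def acquire(self) -> None: ...                        # resource control

    @abstractmethod
    def release(self) -> None: ...                        # resource control

    @abstractmethod
    def write(self, name: str, data: bytes) -> None: ...  # memory transfer in

    @abstractmethod
    def read(self, name: str) -> bytes: ...               # memory transfer out

    @abstractmethod
    def run(self, codelet: Codelet) -> None: ...          # execution control


class HostElement(ProcessingElement):
    """Trivial in-process element, used only to exercise the interface."""

    def __init__(self) -> None:
        self._memory: Dict[str, bytes] = {}
        self._acquired = False

    def acquire(self) -> None:
        self._acquired = True

    def release(self) -> None:
        self._acquired = False
        self._memory.clear()

    def write(self, name: str, data: bytes) -> None:
        self._memory[name] = bytes(data)

    def read(self, name: str) -> bytes:
        return self._memory[name]

    def run(self, codelet: Codelet) -> None:
        if not self._acquired:
            raise RuntimeError("element must be acquired before running a codelet")
        codelet(self._memory)


def reverse_codelet(memory: Dict[str, bytes]) -> None:
    # Stand-in for compiled target code: reverse the input buffer.
    memory["out"] = memory["in"][::-1]


def dispatch(element: ProcessingElement, codelet: Codelet, payload: bytes) -> bytes:
    """What a main-line hook might do: acquire, copy in, execute, copy out, release."""
    element.acquire()
    try:
        element.write("in", payload)
        element.run(codelet)
        return element.read("out")
    finally:
        element.release()


result = dispatch(HostElement(), reverse_codelet, b"abc")
```

Because `dispatch` depends only on the abstract interface, the same main-line code could in principle drive any element that implements these five operations.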
The Chimera code planning and optimization engines work hand in hand to map codelets to processing elements in a way that maintains user-defined power and performance thresholds, through a combination of run-time analysis and machine learning. Although run-time analysis is used, the Chimera tool supports "stamping out" optimized applications at any point during optimization. Stamped-out solutions, represented as fragments of source code (i.e., sourcelets) and a modified main line program that incorporates hooks into sourcelet execution, allow Chimera-built systems to be statically verified for the purposes of system certification.
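As a rough illustration of threshold-driven mapping, the sketch below picks, for each codelet, the lowest-power processing element that still meets a time threshold, and falls back to the fastest element when none does. This simple rule stands in for Chimera's planner and is not its actual algorithm; it involves no machine learning, and the `plan` function, the profile layout, and every measurement value are made up for the example.

```python
from typing import Dict, Tuple

# Assumed profile layout: codelet -> element -> (time in ms, power in W).
Profile = Dict[str, Dict[str, Tuple[float, float]]]


def plan(profile: Profile, max_time_ms: float) -> Dict[str, str]:
    """Map each codelet to an element under a per-codelet time threshold."""
    mapping: Dict[str, str] = {}
    for codelet, options in profile.items():
        # Keep only the elements that meet the performance threshold.
        feasible = {e: tp for e, tp in options.items() if tp[0] <= max_time_ms}
        if feasible:
            # Among feasible elements, prefer the one drawing the least power.
            best = min(feasible, key=lambda e: feasible[e][1])
        else:
            # No element meets the threshold: fall back to the fastest one.
            best = min(options, key=lambda e: options[e][0])
        mapping[codelet] = best
    return mapping


# Made-up run-time measurements for two hypothetical elements.
profile: Profile = {
    "fft":    {"cpu": (12.0, 8.0), "dsp": (4.0, 2.5)},
    "parse":  {"cpu": (3.0, 6.0),  "dsp": (14.0, 2.0)},
    "filter": {"cpu": (5.0, 7.0),  "dsp": (8.0, 2.0)},
    "render": {"cpu": (11.0, 9.0), "dsp": (15.0, 3.0)},
}

mapping = plan(profile, max_time_ms=10.0)
```

A mapping computed this way is plain data, so it could be frozen at any point, which is in the spirit of stamping out a fixed configuration for static verification.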
Chimera orchestrates existing compilers for each individual processing element and optimizes hybrid code for heterogeneous multi-core systems. It embraces an open-architecture approach to systems development and can help new and existing applications quickly take advantage of the significant performance benefits of next-generation heterogeneous multi-core platforms.