🤖 AI Summary
This work addresses two limitations. Traditional application-specific hardware accelerators suffer from large area overhead and low utilization. Existing reconfigurable processors, in turn, cannot support dynamic control flow at the microcode level (loops, conditional branches, and exception handling), which limits their efficiency on compute-intensive tasks with complex control logic. To overcome these challenges, the paper introduces a dynamic control-flow execution mechanism for the microcode of a runtime-reconfigurable processor. Accelerator configurations can be switched during execution, so the general-purpose core and the configurable accelerators can work together efficiently. The approach is evaluated on applications from four domains (object detection, ocean simulation, artificial intelligence, and security) and achieves significant speedups over execution on general-purpose processors.
📝 Abstract
As the need for more computing power grows, traditional methods are hitting their limits. To boost performance, we're expanding Central Processing Unit (CPU) capabilities and using specialized hardware accelerators. For example, mobile devices usually have camera, video-encoding, and audio accelerators. To perform their different tasks, these accelerators execute microcode programs. These accelerators, however, take up chip area and often sit idle. Reconfigurable processors offer a solution. They have a normal core connected to several accelerator slots. These slots can be filled at runtime to suit the application that is currently running. Once one application finishes and another starts, the accelerators can be switched, for example when the user plays music after using the camera.
In this work, we introduce dynamic control-flow execution for the microcode of runtime-reconfigurable processors, i.e., support for loops, conditional jumps, and exception handling. We benchmark four applications from four domains (object detection, ocean movement simulation, artificial intelligence, and security) that are all compute-intensive and require dynamic control flow when executed on reconfigurable processors. We show that dynamic control flow allows these different applications to be executed with significant speedup compared with execution on general-purpose processors.
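To make the three control-flow features concrete, the following is a minimal software sketch of a microcode interpreter with a loop, a conditional jump, and a fault handler. It is only an illustration: the instruction names (`set`, `add`, `dec`, `jnz`, `div`, `halt`), the `MicroMachine` class, and the handler convention are all assumptions made for this example and do not describe the paper's actual microcode format or hardware.

```python
from dataclasses import dataclass, field
from typing import Optional


class MicrocodeFault(Exception):
    """Raised when a fault occurs and no handler address is installed."""


@dataclass
class MicroMachine:
    """Toy interpreter for a hypothetical microcode instruction set."""
    program: list                   # list of (opcode, *operands) tuples
    handler: Optional[int] = None   # program index to jump to on a fault
    regs: dict = field(default_factory=dict)
    pc: int = 0
    max_steps: int = 10_000         # guard against runaway loops

    def run(self) -> dict:
        for _ in range(self.max_steps):
            if self.pc >= len(self.program):
                return self.regs
            op, *args = self.program[self.pc]
            self.pc += 1
            if op == "set":         # set reg, constant
                self.regs[args[0]] = args[1]
            elif op == "add":       # add dst, src  (dst += src)
                self.regs[args[0]] += self.regs[args[1]]
            elif op == "dec":       # dec reg
                self.regs[args[0]] -= 1
            elif op == "jnz":       # conditional jump: taken if reg != 0
                if self.regs[args[0]] != 0:
                    self.pc = args[1]
            elif op == "div":       # div dst, src; faults on a zero divisor
                if self.regs[args[1]] == 0:
                    self._fault("div_by_zero")
                else:
                    self.regs[args[0]] //= self.regs[args[1]]
            elif op == "halt":
                return self.regs
            else:
                self._fault("illegal_op")
        raise MicrocodeFault("step limit exceeded")

    def _fault(self, cause: str) -> None:
        # Exception handling: record the cause and redirect to the handler.
        if self.handler is None:
            raise MicrocodeFault(cause)
        self.regs["cause"] = cause
        self.pc = self.handler


# Loop with a conditional backward jump: acc = 4 + 3 + 2 + 1.
SUM_LOOP = [
    ("set", "acc", 0),
    ("set", "n", 4),
    ("add", "acc", "n"),   # index 2: loop body
    ("dec", "n"),
    ("jnz", "n", 2),       # jump back while n != 0
    ("halt",),
]

# Division by zero that is redirected to a handler at index 4.
FAULTING = [
    ("set", "a", 8),
    ("set", "b", 0),
    ("div", "a", "b"),
    ("halt",),
    ("set", "a", -1),      # index 4: handler stores a sentinel value
    ("halt",),
]

sum_result = MicroMachine(SUM_LOOP).run()
fault_result = MicroMachine(FAULTING, handler=4).run()
print(sum_result["acc"])       # 10
print(fault_result["cause"])   # div_by_zero
```

Without `jnz`, the loop would have to be fully unrolled for a fixed trip count, and without a handler address the faulting program could only abort. These are the kinds of restrictions that dynamic control flow removes.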