The problem of partitioning the elements of a control flow graph of a computation between dedicated hardware blocks and hard processors/digital signal processors (DSPs) of a system on programmable chip (SOPC) is dealt with in this paper.
The crux of the proposed algorithm is to locate all possible single-entry/single-exit partitions of the control flow graph. For each such partition, the improvement in latency due to moving from software to hardware and the cost in hardware area are computed by sorting the partitions in order of roughly increasing area and decreasing latency improvements. The intersection point of these two curves is a partition such that choosing any other partition either leads to a suboptimal latency improvement or a suboptimal area allocation. Repeated iterations lead to allocation of partitions to hardware blocks until no more hardware area is left. The results show that this algorithm performs better than previously known algorithms although it can, in comparison to a genetic algorithm, take longer to do the partitioning.
The paper is an interesting if somewhat dense read for those interested in SOPCs and algorithms to automatically partition program elements across the processor and programmable hardware. It does, however, propose a heuristic without any proof that the heuristic can be applied or results in close to optimal partitions. It does, however, offer a very thorough evaluation.