Improving FPGA Performance and Area Using an Adaptive Logic Module

This paper showed how the key building block of logic in modern FPGAs, the look-up table (LUT), could be changed to allow a more efficient device. Prior work had shown that look-up tables with four inputs yielded the densest programmable logic, as a 4-LUT best balanced the silicon area needed against the amount of logic that could be implemented in a single LUT. Larger LUTs were known to enable faster designs, but building an FPGA from them would incur a significant area penalty.

The authors of this paper proposed a new logic structure, an adaptive logic module (ALM), that could achieve the speed of a larger LUT architecture, while matching or beating the density of a 4-LUT architecture. The key innovation was adding a small amount of circuitry to a 6-LUT, plus two extra routing ports to the programmable interconnect, so that it could be programmed to be either a single 6-LUT, or “fractured” into two 5-input or smaller LUTs that together used no more than 8 inputs. The authors simultaneously changed the FPGA synthesis flow to understand that using smaller LUTs cost less area, and a 6-LUT should be used only when the timing or logic reduction gain justified it.
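
The packing rule described above can be illustrated with a minimal sketch. It encodes only the constraints stated in this summary (one function of up to 6 inputs, or two functions of up to 5 inputs each that together use no more than 8 distinct inputs). The function name `fits_in_alm` and the constants are hypothetical, chosen for illustration; the sketch is not the authors' tool flow and ignores any other operating modes a real ALM may offer.

```python
from typing import Iterable, Optional

# Limits taken from the description above; the names are illustrative only.
MAX_SINGLE_LUT_INPUTS = 6     # unfractured mode: one 6-LUT
MAX_FRACTURED_LUT_INPUTS = 5  # fractured mode: each half is a 5-LUT or smaller
MAX_ALM_INPUTS = 8            # distinct inputs available to the whole ALM


def fits_in_alm(func_a: Iterable[str],
                func_b: Optional[Iterable[str]] = None) -> bool:
    """Return True if the given function(s) satisfy the stated input limits.

    Each function is represented only by the names of its input signals.
    """
    a = set(func_a)
    if func_b is None:
        # Single-function mode: the ALM acts as one 6-LUT.
        return len(a) <= MAX_SINGLE_LUT_INPUTS
    b = set(func_b)
    # Fractured mode: each half is limited to 5 inputs, and shared signals
    # count once toward the 8-input total.
    return (len(a) <= MAX_FRACTURED_LUT_INPUTS
            and len(b) <= MAX_FRACTURED_LUT_INPUTS
            and len(a | b) <= MAX_ALM_INPUTS)
```

Under this model, two 5-input functions can share one ALM only if they have at least two inputs in common, whereas two 4-input functions always fit, which is why a synthesis flow that favors smaller LUTs can recover density.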

The net impact was a more efficient FPGA: 15% faster than a 4-LUT architecture, while simultaneously saving area. Altera/Intel FPGAs from Stratix II onwards and Xilinx FPGAs from Virtex-5 onwards have used logic elements based on some variation of the 6-input fracturable LUT proposed and evaluated in this paper.

Endorsement by: Vaughn Betz, Professor, University of Toronto