Sie befinden Sich nicht im Netzwerk der Universität Paderborn. Der Zugriff auf elektronische Ressourcen ist gegebenenfalls nur via VPN oder Shibboleth (DFN-AAI) möglich. mehr Informationen...
Design of a Low Power and Area Efficient Bfloat16 based Generalized Systolic Array for DNN Applications
Ist Teil von
2022 32nd International Conference Radioelektronika (RADIOELEKTRONIKA), 2022, p.1-5
Ort / Verlag
IEEE
Erscheinungsjahr
2022
Quelle
IEEE Electronic Library Online
Beschreibungen/Notizen
Nowadays demand for artificial intelligence (AI) enabled mobile platforms is increasing. From healthcare services to defense and from remote to urban area, there is a huge demand of secured and power efficient devices. The performance of these platforms can be enhanced by providing an efficient compute engine. These compute engines perform a huge amount of matrix operations. The most popular choice for large matrix computation is a systolic array. In general, the systolic array performance degrades for the large input matrices, due to the trade off between resource utilization and computation delay. To address this issue, we need a systolic array with a control unit to re-configure the array according to the requirement of the computation. Computation array can be further improved by handling the negative weights and reduce the MAC operations. In this paper, we proposed a generalized bfloat16 based systolic array in which the sign of the partial sum (PS) is predicted before computation. The PS sign aids in network pruning which enhances system performance. The proposed system is implemented on a Virtex-7 FPGA board and it performs 2.21× and 4.19× better in terms of area and power compared to single-precision based systolic array.