

IEEE solid-state circuits letters, 2018-12, Vol.1 (12), p.217-220


Author(s) / Contributors

Title

A Scalable Multi-TeraOPS Core for AI Training and Inference

Is part of

Place / Publisher

Piscataway: IEEE

Year of publication

2018

Link to full text

Source

IEEE Electronic Library (IEL)

Descriptions/Notes

This letter presents a multi-TOPS AI accelerator core for deep learning training and inference. With a programmable architecture and custom ISA, this engine achieves >90% sustained utilization across the range of neural network topologies by employing a dataflow architecture to provide high throughput and an on-chip scratchpad hierarchy to meet the bandwidth demands of the compute units. A custom 16b floating point (fp16) representation with 1 sign bit, 6 exponent bits, and 9 mantissa bits has also been developed for high model accuracy in training and inference as well as 1b/2b (binary/ternary) integer for aggressive inference performance. At 1.5 GHz, the AI core prototype achieves 1.5 TFLOPS fp16, 12 TOPS ternary, or 24 TOPS binary peak performance in 14-nm CMOS.
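To make the number format and the peak figures in the abstract concrete, here is a minimal Python sketch. It decodes a 16-bit word laid out as 1 sign bit, 6 exponent bits, and 9 mantissa bits, and it divides each quoted peak rate by the 1.5 GHz clock to get operations per cycle. The abstract gives only the bit widths, so the exponent bias of 31, the implicit leading one, the subnormal handling, and the absence of infinity/NaN encodings are all assumptions made for illustration, not details of the published design. The names `decode_fp16_1_6_9` and `ops_per_cycle` are hypothetical.

```python
EXP_BITS = 6   # exponent width, from the abstract
MAN_BITS = 9   # mantissa width, from the abstract
# ASSUMPTION: IEEE-style bias of 2^(e-1) - 1 = 31; the abstract does not state the bias.
EXP_BIAS = 2 ** (EXP_BITS - 1) - 1


def decode_fp16_1_6_9(word: int) -> float:
    """Decode a 16-bit word as sign(1) | exponent(6) | mantissa(9).

    ASSUMPTIONS: implicit leading one for nonzero exponents, IEEE-style
    subnormals for a zero exponent, and no reserved infinity/NaN codes.
    """
    if not 0 <= word < (1 << 16):
        raise ValueError("word must fit in 16 bits")
    sign = -1.0 if (word >> (EXP_BITS + MAN_BITS)) & 1 else 1.0
    exponent = (word >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    fraction = (word & ((1 << MAN_BITS) - 1)) / (1 << MAN_BITS)
    if exponent == 0:
        # Assumed subnormal range: no implicit leading one.
        return sign * fraction * 2.0 ** (1 - EXP_BIAS)
    return sign * (1.0 + fraction) * 2.0 ** (exponent - EXP_BIAS)


def ops_per_cycle(peak_ops_per_second: int, clock_hz: int) -> int:
    """Operations per clock cycle implied by a peak rate and a clock frequency."""
    return peak_ops_per_second // clock_hz


if __name__ == "__main__":
    print(decode_fp16_1_6_9(0x3E00))  # exponent field 31, zero mantissa
    clock_hz = 1_500_000_000  # 1.5 GHz, from the abstract
    for label, peak in [("fp16", 1_500_000_000_000),
                        ("ternary", 12_000_000_000_000),
                        ("binary", 24_000_000_000_000)]:
        print(label, ops_per_cycle(peak, clock_hz))
```

Under these assumptions, the quoted peaks correspond to 1000 fp16 operations, 8000 ternary operations, and 16000 binary operations per cycle. Compared with the 5-bit exponent and 10-bit mantissa of IEEE half precision, the 6-bit exponent trades one bit of precision for a wider dynamic range.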

Language: English
Identifiers: ISSN: 2573-9603
eISSN: 2573-9603
DOI: 10.1109/LSSC.2019.2902738
Title ID: cdi_proquest_journals_2296109007

Format: –
Keywords: Accelerators, artificial intelligence, Bandwidth, CMOS, Computer architecture, dataflow, Deep learning, deep learning (DL), Floating point arithmetic, Hardware, Inference, machine learning, Model accuracy, Network topologies, Neural networks, Solid state circuits, Training
