Advanced CNN Accelerator IP

Optimized for Convolutional Neural Networks

The Lattice Semiconductor Advanced CNN Accelerator IP Core is a calculation engine for deep neural networks with fixed-point weights. It computes full neural network layers, including convolution, pooling, batch normalization, and fully connected layers, by executing a sequence of firmware code together with weight values, both generated by the Lattice SensAI™ Neural Network Compiler. The engine is optimized for convolutional neural networks, so it suits vision-based applications such as classification or object detection and tracking. The IP Core does not require an extra processor; it performs all required calculations by itself.
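To make the idea of fixed-point weights concrete, here is a minimal Python sketch of a one-dimensional convolution carried out entirely in integer arithmetic. It is an illustration only, not the IP Core's firmware or the compiler's output: the function names, the 16-bit word width, and the 8 fractional bits are all assumptions chosen for the example.

```python
def quantize(values, frac_bits=8, total_bits=16):
    """Convert floats to signed fixed-point integers, saturating on overflow.

    frac_bits and total_bits are illustrative assumptions, not IP parameters.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]


def conv1d_fixed(x_q, w_q, frac_bits=8):
    """Valid-mode 1-D convolution (no kernel flip) on fixed-point integers."""
    k = len(w_q)
    out = []
    for i in range(len(x_q) - k + 1):
        # Products carry 2 * frac_bits fractional bits; accumulate at full width.
        acc = sum(x_q[i + j] * w_q[j] for j in range(k))
        # Shift back to frac_bits (arithmetic shift, rounds toward -infinity).
        out.append(acc >> frac_bits)
    return out


def dequantize(values_q, frac_bits=8):
    """Convert fixed-point integers back to floats for inspection."""
    return [v / (1 << frac_bits) for v in values_q]


if __name__ == "__main__":
    x_q = quantize([1.0, 2.0, 3.0, 4.0])
    w_q = quantize([0.5, 0.25])
    print(dequantize(conv1d_fixed(x_q, w_q)))
```

Keeping the accumulator wider than the operands and rescaling once at the end is a common way to limit rounding error in fixed-point multiply-accumulate loops.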

Higher Throughput – 64-bit data path engine for Avant FPGAs, and 32-bit data path engine for Avant and CertusPro-NX (CPNX) FPGAs.

Faster Run Time – Vector ALU for enhanced pixelwise operations, and accelerated pre/post ML image processing algorithms.

Improved Performance – Supports 1 to 4 convolution engines.

Features

  • Support for convolution layer, max pooling layer, global average pooling layer, batch normalization layer, and fully connected layer
  • AXI4 for external memory interface
  • Configurable number of memory blocks for tradeoff between resource and performance
  • Support for lookup table-based activation functions after convolution and fully connected layers
  • Support for new Filter module in Vector ALU for specialized kernels like min/max/median
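As an illustration of what a lookup table-based activation function involves, the following Python sketch precomputes an activation for every possible fixed-point input and then applies it by indexing. It does not describe the IP Core's actual table layout: the 8-bit input width, the 4 fractional bits, and the names `build_lut` and `apply_lut` are assumptions made for this example.

```python
import math


def build_lut(fn, in_bits=8, frac_bits=4):
    """Precompute fn for every signed fixed-point input of width in_bits.

    The widths are illustrative assumptions, not documented IP parameters.
    """
    size = 1 << in_bits
    half = size >> 1
    scale = 1 << frac_bits
    lut = []
    for index in range(size):
        # Interpret the table index as a two's-complement integer.
        raw = index - size if index >= half else index
        lut.append(round(fn(raw / scale) * scale))
    return lut


def apply_lut(lut, values_q, in_bits=8):
    """Apply the activation by table lookup; inputs must fit in in_bits."""
    mask = (1 << in_bits) - 1
    return [lut[v & mask] for v in values_q]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    relu_lut = build_lut(lambda x: max(0.0, x))
    sigmoid_lut = build_lut(sigmoid)
    print(apply_lut(relu_lut, [-16, 0, 24]))
    print(apply_lut(sigmoid_lut, [0]))
```

A table replaces the per-element evaluation of a nonlinear function with a single memory read, at the cost of storing one entry per representable input value. In this sketch, inputs outside the `in_bits` range wrap around, so a caller would need to saturate them first.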

Block Diagram

Resource Utilization

Configuration: VE_TYPE = “VE64” (64b engine)
  Parameters: LRAM_NUM : 8, SPD_NUM : 8, CONV_NUM : 1, CONV_M4 : 1, EN_VE : 0, EN_VE_FILT : 0, EN_CONV5 : 1, EN_CONV7 : 1, EN_ARGMAX_POOL : 0, EN_MAXPOOL_S1 : 0 (Both Scale and FC LUT activations disabled)
  FPGA: Avant-E-500
  clk, aclk (MHz) (see note 2): 200, 100
  Registers: 50512
  LUTs: 46039
  LRAMs: Avant has no LRAMs, so ML SPD is made of EBRs
  EBRs: 407
  DSP MULT (in terms of 18x18 MULT): 282 (Uses DOT PROD DSP primitive)

Configuration: VE_TYPE = “VE32” (32b engine)
  Parameters: LRAM_NUM : 8, SPD_NUM : 8, CONV_NUM : 1, CONV_M4 : 1, EN_VE : 0, EN_VE_FILT : 0, EN_CONV5 : 1, EN_CONV7 : 1, EN_ARGMAX_POOL : 0, EN_MAXPOOL_S1 : 0 (Both Scale and FC LUT activations disabled)
  FPGA: Avant-E-500
  clk, aclk (MHz) (see note 2): 200, 100
  Registers: 27393
  LUTs: 25574
  LRAMs: Avant has no LRAMs, so ML SPD is made of EBRs
  EBRs: 271
  DSP MULT (in terms of 18x18 MULT): 74 (Uses DOT PROD DSP primitive)

Configuration: VE_TYPE = “VE32” (32b engine)
  Parameters: LRAM_NUM : 7, SPD_NUM : 8, CONV_NUM : 1, CONV_M4 : 0, EN_VE : 1, EN_VE_FILT : 0, EN_CONV5 : 1, EN_CONV7 : 1, EN_ARGMAX_POOL : 1, EN_MAXPOOL_S1 : 1 (Both Scale and FC LUT activations disabled)
  FPGA: LFCPNX-100
  clk, aclk (MHz) (see note 2): 96, 96
  Registers: 36546
  LUTs: 50006
  LRAMs: 7
  EBRs: 174
  DSP MULT (in terms of 18x18 MULT): 91

1. Performance may vary when using a different software version or targeting a different device density or speed grade.
2. The clk and aclk numbers are from timing closure in the ML demo and reference designs released with sensAI 6.0.
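The two Avant-E-500 configurations listed above differ only in VE_TYPE, so their resource figures can be compared directly. The short Python sketch below computes the VE64-to-VE32 ratio for each resource using the numbers from the table; the dictionary keys and the `ratios` helper are names invented for this example.

```python
# Resource figures copied from the two Avant-E-500 configurations above.
ve64 = {"registers": 50512, "luts": 46039, "ebrs": 407, "dsp_mult": 282}
ve32 = {"registers": 27393, "luts": 25574, "ebrs": 271, "dsp_mult": 74}


def ratios(a, b):
    """Return a[k] / b[k] for each resource, rounded to two decimals."""
    return {key: round(a[key] / b[key], 2) for key in a}


if __name__ == "__main__":
    for name, value in ratios(ve64, ve32).items():
        print(f"{name}: {value}x")
```

These ratios describe only the listed configurations; per note 1, results may differ for other software versions, device densities, or speed grades.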

Ordering Information

Part Number by License Type

Device Family    Multi-site Perpetual     Single Machine Annual
Avant-AT-E       CNNADV-ACCEL-AVE-UT      CNNADV-ACCEL-AVE-US
CertusPro-NX     CNNADV-ACCEL-CPNX-UT     CNNADV-ACCEL-CPNX-US

Documentation

Quick Reference
Title: Advanced CNN Accelerator IP Core - User Guide
Number: FPGA-IPUG-02224
Version: 2.0
Date: 1/9/2024
Format: PDF
Size: 615.2 KB