Convolutional neural networks (CNNs) have achieved great success across a wide range of computer vision tasks, and many computer vision applications now need to be deployed on embedded devices. However, because CNNs have a large number of parameters and a high computational cost, many embedded devices cannot meet this requirement. Field-programmable gate arrays (FPGAs) offer parallel computation and low power consumption, which makes them well suited to deploying CNN models. In this paper, we use a high-level synthesis (HLS) tool to design a convolution IP (intellectual property) core and a pooling IP core that adapt to different kernel sizes and strides, and we deploy them on the ZYNQ7020 heterogeneous FPGA platform. With these two basic modules, CNN inference can be accelerated effectively. To verify our circuit architecture, we train a CNN model for handwritten digit recognition and, after data quantization, deploy it on this heterogeneous FPGA system. At a clock frequency of 100 MHz, the FPGA needs only 74 ms to recognize one handwritten digit image, with an accuracy of 98.89%.
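To make the idea of a core that adapts to different kernel sizes and strides concrete, the following is a minimal software sketch in C++, the language HLS tools typically accept. It is not the design described in this paper: the function names (`out_dim`, `conv2d`, `max_pool2d`), the single-channel layout, the absence of padding, and the 8-bit input and weight types are all assumptions made for illustration, and no HLS pragmas or bus interfaces are shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Output length along one dimension for a window with no padding.
// (Hypothetical helper; the paper's padding scheme is not stated here.)
int out_dim(int in, int kernel, int stride) {
    assert(kernel > 0 && stride > 0 && kernel <= in);
    return (in - kernel) / stride + 1;
}

// Single-channel 2D convolution whose kernel size and stride are runtime
// arguments. Inputs and weights are assumed to be quantized to int8;
// products are accumulated in int32 to avoid overflow.
std::vector<int32_t> conv2d(const std::vector<int8_t>& in, int h, int w,
                            const std::vector<int8_t>& k, int ksize, int stride) {
    const int oh = out_dim(h, ksize, stride);
    const int ow = out_dim(w, ksize, stride);
    std::vector<int32_t> out(oh * ow, 0);
    for (int r = 0; r < oh; ++r) {
        for (int c = 0; c < ow; ++c) {
            int32_t acc = 0;
            for (int i = 0; i < ksize; ++i) {
                for (int j = 0; j < ksize; ++j) {
                    const int in_idx = (r * stride + i) * w + (c * stride + j);
                    acc += int32_t(in[in_idx]) * int32_t(k[i * ksize + j]);
                }
            }
            out[r * ow + c] = acc;
        }
    }
    return out;
}

// Max pooling with a runtime window size and stride.
std::vector<int32_t> max_pool2d(const std::vector<int32_t>& in, int h, int w,
                                int ksize, int stride) {
    const int oh = out_dim(h, ksize, stride);
    const int ow = out_dim(w, ksize, stride);
    std::vector<int32_t> out(oh * ow);
    for (int r = 0; r < oh; ++r) {
        for (int c = 0; c < ow; ++c) {
            int32_t best = in[(r * stride) * w + (c * stride)];
            for (int i = 0; i < ksize; ++i) {
                for (int j = 0; j < ksize; ++j) {
                    const int32_t v = in[(r * stride + i) * w + (c * stride + j)];
                    if (v > best) best = v;
                }
            }
            out[r * ow + c] = best;
        }
    }
    return out;
}
```

Because the kernel size and stride are ordinary arguments rather than compile-time constants, the same two routines can serve every convolution and pooling layer of a network. In an HLS flow, the inner loops are the natural candidates for pipelining or unrolling, though how that is done in the actual design is not specified in this summary.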