Volume 17, Issue 3 (7-2025)                   itrc 2025, 17(3): 19-33

Sadough A, Gharaee H, Amiri P, Maghami M H. CNN Accelerator Adapted to Quasi Structured Pruning and Dense Mode. itrc 2025; 17(3): 19-33
URL: http://ijict.itrc.ac.ir/article-1-730-en.html
1- Department of AI, Donders Center for Cognition, Radboud University, Netherlands
2- ITRC, gharaee@itrc.ac.ir
3- Department of Electrical Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
Abstract:
In recent years, Convolutional Neural Networks (CNNs) have been used extensively in image-related machine learning algorithms due to their exceptional accuracy. The multiply-accumulate (MAC) operations in convolutional layers make these layers computationally expensive; they account for about 90% of the total computation. Several researchers have exploited pruning of weights and activations to overcome the high computation and memory-bandwidth demands. These techniques fall into two categories: 1) unstructured pruning of the weights can achieve heavy pruning, but in the process it unbalances data access and computation; consequently, the compression coding needed to index non-zero data grows, which requires much more memory. 2) Structured pruning removes weights according to a specified pattern and regularizes both computation and memory access, but it does not support pruning ratios as high as unstructured pruning does. In this paper, we propose Quasi Structured Pruning (QSP), which profits from the high pruning ratio of unstructured pruning while also incorporating the load-balancing property of structured pruning. Implementation results of our accelerator running VGG16 on a Xilinx XC7Z100 show 616.94 GOP/s in dense mode and 1437.7 GOP/s in sparse mode at just 7.8 W of power consumption. Experimental results show that the accelerator achieves 1.38×, 1.1×, 2.77×, 2.87×, 1.91×, and 1.18× better DSP efficiency than previous accelerators in dense mode. Likewise, it achieves 1.9×, 2.92×, 1.67×, and 1.11× higher DSP efficiency and 4.52×, 5.31×, 10.38×, and 1.1× better energy efficiency than other state-of-the-art sparse accelerators.
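For context, the reported throughput at 7.8 W corresponds to roughly 79.1 GOP/s/W in dense mode and 184.3 GOP/s/W in sparse mode. The abstract does not specify how QSP selects which weights to keep, so the sketch below only contrasts the two baseline pruning styles it describes; the threshold, block size, and function names are illustrative assumptions, not the paper's method.

```python
# Minimal NumPy sketch contrasting unstructured and structured (block-regular)
# weight pruning. All parameters here (75% sparsity, block of 4, keep 1 per
# block) are hypothetical examples, not values from the paper.
import numpy as np

def unstructured_prune(w, sparsity=0.75):
    """Zero the smallest-magnitude weights individually (irregular pattern)."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    thresh = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

def structured_prune(w, block=4, keep_per_block=1):
    """Within each group of `block` consecutive weights, keep only the
    `keep_per_block` largest magnitudes (regular, load-balanced pattern).
    Assumes w.size is divisible by `block`."""
    flat = w.reshape(-1, block)
    # Indices of the (block - keep_per_block) smallest magnitudes per group.
    idx = np.argsort(np.abs(flat), axis=1)[:, :block - keep_per_block]
    out = flat.copy()
    np.put_along_axis(out, idx, 0.0, axis=1)
    return out.reshape(w.shape)

w = np.random.randn(8, 8).astype(np.float32)
print("unstructured nonzeros:", np.count_nonzero(unstructured_prune(w)))
print("structured nonzeros:  ", np.count_nonzero(structured_prune(w)))
```

The block-regular pattern guarantees the same number of non-zero weights per group, so in hardware each processing element receives an equal workload; this is the load-balancing property the QSP scheme retains while aiming for the higher pruning ratios of the unstructured style.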
Full-Text [PDF 1114 kb]
Type of Study: Research | Subject: Network

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.