1- Department of Computer Engineering, Bonab Branch, Islamic Azad University, Bonab, Iran
2- Iran Telecommunication Research Center (ITRC), Tehran, Iran, itrc.ahmadkhademzadeh@gmail.com
3- Department of Computer Engineering, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
Abstract:
One of the most important issues in the design of CNN accelerators is how well the accelerator exploits the opportunities offered by the type and processing of its input data, a task that falls mostly to the dataflow. One such opportunity is that the input feature map and the filters of a CNN have the same number of channels, which makes a Channel Dimension Stationary (CDS) dataflow desirable. In addition, designing computations around the Cartesian product is less complex because of its all-to-all nature, especially in CDS dataflows. However, the Cartesian product method generates useless products, which reduce performance and energy efficiency, so this type of design has attracted less interest. This paper presents a framework called FUCA for Cartesian product-based dataflows that avoids the operations leading to useless products. The analysis shows that FUCA reduces runtime and energy consumption in the Cartesian product-based dataflow by 1.5x, potentially surpassing the sliding window-based dataflow.
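To make the notion of a useless product concrete, the following minimal sketch counts them for a toy case. It is an illustration only and not FUCA's mechanism, which the abstract does not describe. It assumes a single channel, stride 1, and no padding; the function names `conv_sliding` and `conv_cartesian` are hypothetical. Under these assumptions, an input element at (y, x) multiplied by a weight at (r, s) is useful only if the output position (y - r, x - s) lies inside the output feature map.

```python
def conv_sliding(inp, flt):
    """Reference sliding-window convolution (stride 1, no padding)."""
    H, W, R, S = len(inp), len(inp[0]), len(flt), len(flt[0])
    out = [[0] * (W - S + 1) for _ in range(H - R + 1)]
    for oy in range(H - R + 1):
        for ox in range(W - S + 1):
            for r in range(R):
                for s in range(S):
                    out[oy][ox] += inp[oy + r][ox + s] * flt[r][s]
    return out


def conv_cartesian(inp, flt):
    """All-to-all (Cartesian product) convolution that skips useless products.

    Returns the output map plus the number of useful and useless
    input-weight pairs encountered.
    """
    H, W, R, S = len(inp), len(inp[0]), len(flt), len(flt[0])
    out_h, out_w = H - R + 1, W - S + 1
    out = [[0] * out_w for _ in range(out_h)]
    useful = useless = 0
    for y in range(H):
        for x in range(W):
            for r in range(R):
                for s in range(S):
                    oy, ox = y - r, x - s  # output position this pair maps to
                    if 0 <= oy < out_h and 0 <= ox < out_w:
                        out[oy][ox] += inp[y][x] * flt[r][s]
                        useful += 1
                    else:
                        useless += 1  # product would fall outside the output
    return out, useful, useless


# Toy example: 4x4 input, 3x3 filter (values chosen arbitrarily).
inp = [[4 * i + j for j in range(4)] for i in range(4)]
flt = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
out, useful, useless = conv_cartesian(inp, flt)
```

In this toy case the naive Cartesian product forms 16 x 9 = 144 pairs, of which only 2 x 2 x 9 = 36 contribute to an output; skipping the rest leaves the result identical to the sliding-window computation.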