Upd 2D Conv[DepthWise] kernels
Context
Part of #31
Update the 2D convolution kernels of the Conv and ConvDepthWise operators to perform faster computation.
Protocol
Tests were run on an x86-64 CPU architecture. The code was compiled with the GCC 9.4.0 / Clang 10.0.0 compilers. Each measurement used 50 trials with no warm-up, and the time measured was the CPU time returned by the clock() function from the ctime library.
Results are reported as the ratio new_kernel_time / old_kernel_time, so a value below 1 means the new kernel is faster.
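The measurement described above can be sketched as follows. This is a minimal illustration, not the benchmark code used for this merge request: `averageCpuTime` and `speedRatio` are hypothetical helper names, and timing the whole loop at once (rather than each trial separately) is an assumption.

```cpp
#include <cassert>
#include <ctime>
#include <functional>

// Hypothetical helper: average CPU time in seconds of `kernel`
// over `trials` runs, with no warm-up run beforehand.
double averageCpuTime(const std::function<void()>& kernel, int trials = 50) {
    assert(trials > 0);
    const std::clock_t start = std::clock();  // CPU time, not wall-clock time
    for (int i = 0; i < trials; ++i) {
        kernel();
    }
    const std::clock_t end = std::clock();
    return static_cast<double>(end - start) / CLOCKS_PER_SEC / trials;
}

// Hypothetical helper: the reported metric.
// A value below 1 means the new kernel is faster than the old one.
double speedRatio(double newKernelTime, double oldKernelTime) {
    assert(oldKernelTime > 0.0);
    return newKernelTime / oldKernelTime;
}
```

A call would look like `speedRatio(averageCpuTime(runNewKernel), averageCpuTime(runOldKernel))`, where `runNewKernel` and `runOldKernel` stand for whatever callables wrap the two implementations.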
I observed that timings could vary by up to 20% between two sets of trials run at different times.
Convolution
Each trial is based on a convolution with the following parameters:
Conv2D(in_channels = 16,
out_channels = 16,
kernels = [3,3],
stride = [1,1],
dilation = [1,1])
input = Tensor([8,16,64,64]) # shape
Then, a single parameter is tweaked from this base convolution:
- kernel size
- dilation size
- stride size
- number of input channels
- number of output channels
- batch size
- size of each dimension of the feature map
Additionally, a comparison was performed with the parameters that appeared least favorable to the new convolution implementation. Even though this case is rare, I found it relevant to show. It is labeled "special" in the results.
Conv2D(in_channels = 16,
out_channels = 16,
kernels = [5,5],
stride = [3,3],
dilation = [2,2])
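To give a sense of how much work each configuration represents, the output feature map size can be computed from the parameters. The sketch below assumes no padding (the description does not mention any) and assumes the "special" case uses the same 64x64 input as the base case; `convOutputSize` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: output size along one spatial dimension of a
// convolution without padding (padding = 0 is an assumption).
std::size_t convOutputSize(std::size_t in, std::size_t kernel,
                           std::size_t stride, std::size_t dilation) {
    assert(kernel > 0 && stride > 0 && dilation > 0);
    // Span of input covered by one dilated kernel window.
    const std::size_t extent = dilation * (kernel - 1) + 1;
    assert(in >= extent);
    return (in - extent) / stride + 1;
}
```

Under these assumptions, the base configuration gives `convOutputSize(64, 3, 1, 1) == 62` and the "special" configuration gives `convOutputSize(64, 5, 3, 2) == 19` per spatial dimension.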
Convolution depth-wise
The parameters and protocol are the same as for convolution. The only exception is that in_channels and out_channels are merged into a single parameter, channels.
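The single channels parameter follows from how a depth-wise convolution works: each channel is filtered by its own kernel, so the number of output channels equals the number of input channels. A naive reference version can be sketched as below. This is not the optimized kernel from this merge request; `convDepthWise2D` is a hypothetical name, and the NCHW layout, absence of padding and bias, and a single stride and dilation value shared by both spatial dimensions are assumptions made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive depth-wise 2D convolution (reference sketch, not optimized).
// input:   [batch, channels, inH, inW]  (NCHW, assumed)
// weights: [channels, kH, kW]           (one filter per channel)
// returns: [batch, channels, outH, outW]
std::vector<float> convDepthWise2D(const std::vector<float>& input,
                                   const std::vector<float>& weights,
                                   std::size_t batch, std::size_t channels,
                                   std::size_t inH, std::size_t inW,
                                   std::size_t kH, std::size_t kW,
                                   std::size_t stride, std::size_t dilation) {
    assert(kH > 0 && kW > 0 && stride > 0 && dilation > 0);
    assert(input.size() == batch * channels * inH * inW);
    assert(weights.size() == channels * kH * kW);
    const std::size_t extH = dilation * (kH - 1) + 1;
    const std::size_t extW = dilation * (kW - 1) + 1;
    assert(inH >= extH && inW >= extW);
    const std::size_t outH = (inH - extH) / stride + 1;
    const std::size_t outW = (inW - extW) / stride + 1;

    std::vector<float> output(batch * channels * outH * outW, 0.0f);
    for (std::size_t b = 0; b < batch; ++b) {
        for (std::size_t c = 0; c < channels; ++c) {
            // Channel c only ever reads input channel c and filter c.
            const float* in = &input[(b * channels + c) * inH * inW];
            const float* w = &weights[c * kH * kW];
            float* out = &output[(b * channels + c) * outH * outW];
            for (std::size_t oy = 0; oy < outH; ++oy) {
                for (std::size_t ox = 0; ox < outW; ++ox) {
                    float acc = 0.0f;
                    for (std::size_t ky = 0; ky < kH; ++ky) {
                        for (std::size_t kx = 0; kx < kW; ++kx) {
                            const std::size_t iy = oy * stride + ky * dilation;
                            const std::size_t ix = ox * stride + kx * dilation;
                            acc += in[iy * inW + ix] * w[ky * kW + kx];
                        }
                    }
                    out[oy * outW + ox] = acc;
                }
            }
        }
    }
    return output;
}
```

A loop of this shape is also a convenient correctness baseline when comparing an optimized kernel against expected outputs.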
Results
Conv
ConvDepthWise
Activity
added EnhancementPerformance ⚡️ TopicOperator labels
assigned to @pineapple
added 5 commits
- bfc12b95...2a745b7d - 2 commits from branch dev
- e7eec22f - improve conv kernel speed
- 9349f51e - [Upd] Conv[DW] 2D kernels
- 890f2cfa - [Add] new test cases for conv[dw] and upd 2D kernels
changed milestone to %aidge_backend_cpu v0.4.0
enabled an automatic merge when the pipeline for 890f2cfa succeeds
mentioned in commit 382a5508
mentioned in issue aidge#228 (closed)
mentioned in merge request !124 (merged)