Upd 2D Conv[DepthWise] kernels
Part of #31
Update the 2D Convolution kernels of the Conv and ConvDepthWise Operators to perform faster computation.
Tests were run on an x86-64 CPU architecture. The code was compiled with GCC 9.4.0 / Clang 10.0.0. Each measurement consisted of 50 trials with no warm-up, and the time measured was the CPU time obtained with the clock() function from the ctime library.
The results are reported as the ratio new_kernel_time / old_kernel_time. I observed that timings could vary by up to 20% between two sets of trials run at different times.
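For reference, the sketch below shows one way such a protocol can be implemented; `benchmarkCpuTimeMs` and `run_conv_forward` are hypothetical names used for illustration and are not part of the Aidge code base.

```cpp
// Illustrative timing harness matching the protocol described above:
// 50 trials, no warm-up, CPU time measured with std::clock() from <ctime>.
#include <ctime>

double benchmarkCpuTimeMs(void (*run_conv_forward)(), int nbTrials = 50) {
    double totalMs = 0.0;
    for (int i = 0; i < nbTrials; ++i) {
        const std::clock_t start = std::clock();
        run_conv_forward();  // one forward pass of the kernel under test
        const std::clock_t end = std::clock();
        totalMs += 1000.0 * static_cast<double>(end - start) / CLOCKS_PER_SEC;
    }
    return totalMs / nbTrials;  // mean CPU time per trial, in milliseconds
}

// The reported ratio would then be:
// benchmarkCpuTimeMs(runNewKernel) / benchmarkCpuTimeMs(runOldKernel)
```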
Each trial is based on a convolution with the following parameters:
Conv2D(in_channels = 16,
out_channels = 16,
kernels = [3,3],
stride = [1,1],
dilation = [1,1])
input = Tensor([8,16,64,64]) # shape
Then, a single parameter at a time is tweaked from this base convolution.
Additionally, a comparison was also performed with the parameters that appeared least favorable to the new convolution implementation. Even though this case is rare, I found it relevant to show; it is labelled "special" in the results.
Conv2D(in_channels = 16,
out_channels = 16,
kernels = [5,5],
stride = [3,3],
dilation = [2,2])
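For context, assuming zero padding (padding is not listed in either configuration), the standard output-size formula shows how much smaller the output feature map becomes with the "special" parameters:

```cpp
// Standard output-size formula for a 2D convolution, padding assumed to be 0.
#include <cstddef>

constexpr std::size_t convOutDim(std::size_t in, std::size_t kernel,
                                 std::size_t stride, std::size_t dilation) {
    return (in - dilation * (kernel - 1) - 1) / stride + 1;
}

// Base case:  convOutDim(64, 3, 1, 1) == 62  -> 62x62 output per channel
// "special":  convOutDim(64, 5, 3, 2) == 19  -> 19x19 output per channel
```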
For ConvDepthWise, the parameters and protocol are the same as for the convolution benchmark. The only exception is that in_channels and out_channels are merged into a single channels parameter.
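To illustrate why the two channel counts merge in the depthwise case, a naive reference of a depthwise 2D convolution forward pass is sketched below: each input channel is convolved with its own kernel and written to the matching output channel. This is an unoptimized illustration assuming NCHW layout, no padding and no bias; it is not the optimized ConvDepthWiseImpl2D_cpu_forward_kernel modified by this merge request.

```cpp
// Naive depthwise 2D convolution forward pass (NCHW layout, no padding, no bias).
#include <array>
#include <cstddef>

void depthwiseConv2dForward(const float* input,   // [N, C, H, W]
                            const float* weights, // [C, kH, kW]
                            float* output,        // [N, C, oH, oW]
                            std::size_t N, std::size_t C,
                            std::size_t H, std::size_t W,
                            const std::array<std::size_t, 2>& kernel,
                            const std::array<std::size_t, 2>& stride,
                            const std::array<std::size_t, 2>& dilation) {
    const std::size_t oH = (H - dilation[0] * (kernel[0] - 1) - 1) / stride[0] + 1;
    const std::size_t oW = (W - dilation[1] * (kernel[1] - 1) - 1) / stride[1] + 1;
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t c = 0; c < C; ++c)  // one kernel per channel
            for (std::size_t oy = 0; oy < oH; ++oy)
                for (std::size_t ox = 0; ox < oW; ++ox) {
                    float acc = 0.0f;
                    for (std::size_t ky = 0; ky < kernel[0]; ++ky)
                        for (std::size_t kx = 0; kx < kernel[1]; ++kx) {
                            const std::size_t iy = oy * stride[0] + ky * dilation[0];
                            const std::size_t ix = ox * stride[1] + kx * dilation[1];
                            acc += input[((n * C + c) * H + iy) * W + ix] *
                                   weights[(c * kernel[0] + ky) * kernel[1] + kx];
                        }
                    output[((n * C + c) * oH + oy) * oW + ox] = acc;
                }
}
```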