Add support for the MatMul operator
The goal of this MR is to allow the quantization of models containing MatMul
operators, using the PTQ pipeline.
It is important to note that there are in fact two very different cases where the `MatMul` operator is used.

The first one is to represent an `FC` node which has no bias. To handle this case without adding complexity to the PTQ pipeline, we can use the `fuseMatMultoFC()` recipe. But we must first ensure that the weight is connected to input 1 of the `MatMul` node (and not to input 0). That's why a `reorderMatMulInputs()` recipe is needed.
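To make the reordering concrete, here is a minimal sketch of what such a recipe could look like. The helper `swapAndTransposeWeight()` and the exact traversal are illustrative assumptions, not the actual Aidge recipe:

```cpp
// Illustrative sketch only, NOT the actual Aidge recipe: detect MatMul
// nodes whose weight Producer sits on input 0 and move it to input 1.
#include "aidge/graph/GraphView.hpp"
#include "aidge/graph/Node.hpp"

// Hypothetical helper (not implemented here): swaps the two connections
// and transposes the weight tensor. For a vector activation x, we have
// W.x == x.W^T, so the MatMul output is unchanged.
void swapAndTransposeWeight(std::shared_ptr<Aidge::Node> matMulNode);

void reorderMatMulInputsSketch(std::shared_ptr<Aidge::GraphView> graphView) {
    for (const std::shared_ptr<Aidge::Node>& node : graphView->getNodes()) {
        if (node->type() != "MatMul")
            continue;
        std::shared_ptr<Aidge::Node> parent0 = node->getParent(0);
        std::shared_ptr<Aidge::Node> parent1 = node->getParent(1);
        // Weight on input 0 and data on input 1: reorder so that later
        // passes (e.g. fuseMatMultoFC) see the expected FC layout.
        if (parent0 && parent0->type() == "Producer" &&
            (!parent1 || parent1->type() != "Producer")) {
            swapAndTransposeWeight(node);
        }
    }
}
```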
The second one is the case where the two inputs of the `MatMul` are actual data (i.e. not weights). In this case, we need to modify the different steps of the PTQ pipeline to ensure that the scaling ratios flow correctly through the graph. The general idea is to multiply the two input scaling ratios that come from the branches merged by the `MatMul` operator.
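Why multiply rather than combine the ratios some other way? Because the matrix product is bilinear: scaling each input scales the output by the product of the two factors. A tiny self-contained check of this identity (plain C++, independent of Aidge):

```cpp
#include <cassert>

// Bilinearity of the matrix product:
//   (r0 * A) x (r1 * B) == (r0 * r1) * (A x B)
// which is why the two accumulated input ratios are multiplied when
// two branches merge at a MatMul node.
int main() {
    const double r0 = 0.5, r1 = 4.0; // example scaling ratios
    const double A[2][2] = {{1, 2}, {3, 4}};
    const double B[2][2] = {{5, 6}, {7, 8}};
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            double scaled = 0.0, plain = 0.0;
            for (int k = 0; k < 2; ++k) {
                scaled += (r0 * A[i][k]) * (r1 * B[k][j]);
                plain += A[i][k] * B[k][j];
            }
            assert(scaled == r0 * r1 * plain); // exact: ratios are powers of two
        }
    }
    return 0;
}
```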
To handle the first case:

- modify the `isAffine()` function to catch `MatMul` nodes connected to a weight `Tensor` (using the `isWeighted` tag); a sketch of the modified classification follows these lists
- add a `reorderMatMulInputs()` recipe that ensures that the weight `Producer` is connected to input 1
- handle the `MatMul`s that are connected to a weight without replacing them with `FC` nodes

To handle the second case:

- modify the `isMerging()` function to catch `MatMul` nodes not connected to a weight `Tensor` (using the `isWeighted` tag)
- modify the `normalizeParameters()` and `normalizeActivations()` functions to multiply the two input accumulated ratios
- modify the `quantizeNormalizeNetwork()` function to rescale the `MatMul` scaling twice as much
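As announced above, here is a sketch of the classification change for both cases. The signatures are assumptions; the real `isAffine()` and `isMerging()` live in `PTQ.cpp` and may differ:

```cpp
// Sketch under assumed signatures (the actual functions are in PTQ.cpp):
// a MatMul tagged as weighted behaves like an FC node (affine), while a
// MatMul fed by two data branches is a merging node whose accumulated
// input ratios must be multiplied.
#include "aidge/graph/Node.hpp"

bool isAffine(std::shared_ptr<Aidge::Node> node) {
    if (node->type() == "FC") // other affine operator types elided
        return true;
    // MatMul previously tagged as having a weight Producer
    return node->type() == "MatMul" &&
           node->attributes()->hasAttr("isWeighted");
}

bool isMerging(std::shared_ptr<Aidge::Node> node) {
    if (node->type() == "Add" || node->type() == "Concat")
        return true;
    // MatMul whose two inputs are actual data
    return node->type() == "MatMul" &&
           !node->attributes()->hasAttr("isWeighted");
}
```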
Overall:

- Files changed: mostly `PTQ.cpp`, but also various PTQ-related files (`CLE.cpp`, headers, ...)
- Note: several other changes were made to improve the code quality (e.g. `hasAttr()`, `addAttr()`, ...)