PTQ signature is too big
The current PTQ signature is a little overwhelming:
void quantizeNetwork(std::shared_ptr<GraphView> graphView,
                     std::uint8_t nbBits,
                     std::vector<std::shared_ptr<Tensor>> inputDataSet,
                     Clipping clippingMode,
                     DataType targetType,
                     bool noQuant,
                     bool optimizeSigns,
                     bool singleShift,
                     bool useCuda,
                     bool foldGraph,
                     bool bitshiftRounding,
                     bool verbose);
Proposed modifications:
- verbose: Should be deprecated in favor of the Aidge log system.
- useCuda: Should be removed; whether to use CUDA should be decided locally.
- foldGraph: Shouldn't be a parameter or attribute. The user should run the folding afterward; the PTQ method should not handle it.
- nbBits: isn't it redundant with targetType?
- noQuant: This parameter seems badly named. What is it even used for?
- optimizeSigns: More documentation is required; I don't understand what it does.
- singleShift: What will we do when we introduce fixed point? Add another boolean? I think we should create an enum ScalingType and use it as an argument instead.
- bitshiftRounding: I would be in favor of removing this argument and using the proposed ScalingType enum instead.
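To make the proposal concrete, here is a rough sketch of what the reduced signature could look like. It is only an illustration, not a final API: the `ScalingType` enumerators and the two helper predicates are hypothetical names, and `GraphView`, `Tensor`, `Clipping` and `DataType` are empty stand-ins so the snippet compiles on its own. `nbBits`, `noQuant` and `optimizeSigns` are kept as-is because the points above leave them as open questions.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Stand-ins so the sketch compiles alone; the real types live in Aidge.
struct GraphView {};
struct Tensor {};
enum class Clipping { Placeholder };  // stand-in, not the real enumerators
enum class DataType { Placeholder };  // stand-in, not the real enumerators

// Hypothetical enum replacing the singleShift / bitshiftRounding booleans.
// The enumerator names are assumptions made for this sketch.
enum class ScalingType {
    FloatingPoint,       // scaling factors kept as floating-point values
    FixedPoint,          // future fixed-point scaling, no new boolean needed
    SingleShift,         // scaling approximated by a single bit shift
    SingleShiftRounded   // same, with rounding applied to the shift
};

// Hypothetical helpers showing how code could query the scaling mode.
constexpr bool usesBitShift(ScalingType s) {
    return s == ScalingType::SingleShift || s == ScalingType::SingleShiftRounded;
}

constexpr bool roundsBitShift(ScalingType s) {
    return s == ScalingType::SingleShiftRounded;
}

// Sketch of the reduced signature (declaration only):
// verbose, useCuda and foldGraph are gone, and the two shift-related
// booleans are merged into a single ScalingType argument.
void quantizeNetwork(std::shared_ptr<GraphView> graphView,
                     std::uint8_t nbBits,
                     std::vector<std::shared_ptr<Tensor>> inputDataSet,
                     Clipping clippingMode,
                     DataType targetType,
                     bool noQuant,
                     bool optimizeSigns,
                     ScalingType scalingType);
```

With an enum, adding fixed point later means adding one enumerator rather than another boolean, and invalid flag combinations (for example rounding requested without any bit shift) can no longer be expressed.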
Edited by Cyril Moineau