PTQ signature is too big

The current PTQ signature is a little overwhelming:

void quantizeNetwork(std::shared_ptr<GraphView> graphView, 
        std::uint8_t nbBits,
        std::vector<std::shared_ptr<Tensor>> inputDataSet,
        Clipping clippingMode,
        DataType targetType,
        bool noQuant,
        bool optimizeSigns,
        bool singleShift,
        bool useCuda,
        bool foldGraph,
        bool bitshiftRounding,
        bool verbose);

Proposed modifications:

  • verbose: should be deprecated in favor of the Aidge log system
  • useCuda: should be removed; whether to use CUDA should be decided locally
  • foldGraph: shouldn't be a parameter or an attribute; the user should run the folding afterward, and the PTQ method should not handle it
  • nbBits: isn't it redundant with targetType?
  • noQuant: this parameter seems badly named; what is it even used for?
  • optimizeSigns: more documentation is required, I don't understand what it does
  • singleShift: what will we do when we introduce fixed point? Add another boolean? I think we should create a ScalingType enum and use it as an argument instead
  • bitshiftRounding: I would be in favor of removing this argument and using the proposed ScalingType enum instead
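
To make the proposal concrete, here is a rough sketch of what the reduced signature could look like. Only the name ScalingType comes from the list above; the enumerator names, the helper scalingFromLegacyFlags, and the assumption that bitshiftRounding only matters together with singleShift are hypothetical. GraphView, Tensor, Clipping and DataType are forward-declared stand-ins so the snippet compiles on its own. nbBits, noQuant and optimizeSigns are kept because the points above leave them as open questions.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Stand-in declarations (assumption): the real types come from the Aidge headers.
class GraphView;
class Tensor;
enum class Clipping;
enum class DataType;

// Hypothetical enum replacing the singleShift / bitshiftRounding booleans.
// Enumerator names are guesses, not existing Aidge API.
enum class ScalingType {
    FloatingPoint,       // neither boolean set today
    SingleShift,         // today: singleShift = true
    SingleShiftRounded,  // today: singleShift = true, bitshiftRounding = true
    FixedPoint           // future mode, no extra boolean needed
};

// Hypothetical migration helper mapping the legacy booleans to the enum.
// Assumes bitshiftRounding has no effect when singleShift is false.
inline ScalingType scalingFromLegacyFlags(bool singleShift, bool bitshiftRounding) {
    if (!singleShift) {
        return ScalingType::FloatingPoint;
    }
    return bitshiftRounding ? ScalingType::SingleShiftRounded
                            : ScalingType::SingleShift;
}

// Sketch of the reduced signature (declaration only): verbose, useCuda and
// foldGraph are gone, and the two scaling booleans collapse into scalingType.
void quantizeNetwork(std::shared_ptr<GraphView> graphView,
                     std::uint8_t nbBits,
                     std::vector<std::shared_ptr<Tensor>> inputDataSet,
                     Clipping clippingMode,
                     DataType targetType,
                     ScalingType scalingType,
                     bool noQuant,
                     bool optimizeSigns);
```

With something like this, adding fixed point later means adding one enumerator rather than a thirteenth parameter, and existing callers could migrate through the helper.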
Edited by Cyril Moineau