Draft: Dev PTQ

Cyril Moineau requested to merge DevPTQ into master

TODO

  • Normalize layers (see the weight-normalization sketch after this list)
    • Get the max weight value for each layer (FC/Conv) -> alpha
    • Divide weights by alpha
    • Divide bias by alpha
    • Add PerOutputChannel normalization
      • TODO : add steps to do this ...
  • Normalize activations (see the activation-range hook sketch after this list)
    • Normalize stimuli to [0; 1]
      • or to [-1; 1]
    • Forward on the validation dataset
    • Get the max activation value for each layer -> beta
      • Develop hook system
      • Develop get max activation hook
    • Add scaling factor beta to the activation
    • Add scaling factor beta to the bias
    • Rescale activation by parent bias scaling
  • Quantization (see the quantization sketch after this list)
    • Input: multiply by 2^(n-1) - 1 (signed) or 2^n - 1 (unsigned)
    • Weights: multiply by 2^(n-1) - 1, round, and store as a signed integer of nbbits size
    • Biases: multiply by 2^(n-1) - 1 (the weight scaling), then by 2^(n-1) - 1 (signed input) or 2^n - 1 (unsigned input), round, and store as a signed integer of nbbits size
    • Activation scaling:
      • Input scaling: 2^(n-1) - 1 (weight scaling) multiplied by 2^(n-1) - 1 (signed) or 2^n - 1 (unsigned)
      • Output scaling: 2^(n-1) - 1 (signed) or 2^n - 1 (unsigned)
      • Activation scaling: divide by (input scaling / output scaling)
  • Activation clipping (see the clipping sketch after this list)
    • Generate a histogram of outputs for each layer (using the validation dataset)
    • MSE
      • TODO : add steps to do this
    • KL-Divergence
      • TODO : add steps to do this
  • Weights clipping
    • TODO : add steps to do this
  • Better scaling operation (see the scaling sketch after this list)
    • Fixed-point scaling
    • Single shift scaling
    • Double shift scaling
  • Bind method in Python
  • LeNet integration test
  • Refactor OutputRange hook to use Tensor Getter/Setter
  • Add docstring to OutputRange hook
  • Refactor hook system to not use a registrar?
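
Below are minimal Python/NumPy sketches of the items above. They are illustrations of the intended algorithms, not the Aidge implementation: every function and parameter name in them is hypothetical.

Weight-normalization sketch. Assumes alpha is the maximum absolute weight value (the list only says "max value") and that axis 0 of the weight tensor is the output-channel axis.

```python
import numpy as np

def normalize_layer(weights, bias):
    # Per-tensor normalization: alpha is the layer's max absolute weight.
    alpha = np.abs(weights).max()
    return weights / alpha, bias / alpha, alpha

def normalize_layer_per_channel(weights, bias):
    # PerOutputChannel variant: one alpha per output channel (axis 0).
    alpha = np.abs(weights).reshape(weights.shape[0], -1).max(axis=1)
    broadcast = alpha.reshape((-1,) + (1,) * (weights.ndim - 1))
    return weights / broadcast, bias / alpha, alpha
```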
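
Activation-range hook sketch. A stand-in for the OutputRange hook: it records beta, the maximum absolute output value seen while forwarding the validation dataset. Calling the hook manually is a placeholder for the hook system being developed in this MR.

```python
import numpy as np

class OutputRangeHook:
    """Tracks beta = max |output| for one layer across forward passes."""
    def __init__(self):
        self.beta = 0.0

    def __call__(self, output):
        self.beta = max(self.beta, float(np.abs(output).max()))

# Placeholder usage: forward the validation set, feed each layer's output
# to its hook, then scale the layer's activation (and bias) by beta.
hook = OutputRangeHook()
for batch in np.random.rand(4, 8, 16):   # stand-in validation batches
    layer_output = batch * 2.0 - 1.0      # stand-in layer forward
    hook(layer_output)
print(hook.beta)
```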
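
Quantization sketch, a direct transcription of the scaling rules above with n = nbbits: a signed range contributes a factor 2^(n-1) - 1 and an unsigned range a factor 2^n - 1.

```python
import numpy as np

def quantize_weights(weights, nb_bits):
    # Weights: multiply by 2^(n-1) - 1, round, store as signed integers.
    w_scale = 2 ** (nb_bits - 1) - 1
    return np.round(weights * w_scale).astype(np.int32)

def quantize_biases(biases, nb_bits, signed_input):
    # Biases: weight scaling times input scaling, then round.
    w_scale = 2 ** (nb_bits - 1) - 1
    in_scale = 2 ** (nb_bits - 1) - 1 if signed_input else 2 ** nb_bits - 1
    return np.round(biases * w_scale * in_scale).astype(np.int64)

def activation_scaling(nb_bits, signed_input, signed_output):
    # Divide the accumulator output by input scaling / output scaling.
    w_scale = 2 ** (nb_bits - 1) - 1
    in_scale = w_scale * (2 ** (nb_bits - 1) - 1 if signed_input else 2 ** nb_bits - 1)
    out_scale = 2 ** (nb_bits - 1) - 1 if signed_output else 2 ** nb_bits - 1
    return in_scale / out_scale
```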
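
Clipping sketch. The MSE and KL steps are still TODO above, so this shows only one common MSE formulation: scan candidate thresholds over the output histogram and keep the one that best trades clipping error against quantization (resolution) error, the latter modeled as uniform noise of variance step^2 / 12. A KL variant would instead score each threshold by the KL divergence between the original and quantized histograms.

```python
import numpy as np

def mse_clipping_threshold(outputs, nb_bits=8, num_bins=1024):
    """Pick a clipping threshold for one layer's outputs (hypothetical helper)."""
    hist, edges = np.histogram(np.abs(outputs), bins=num_bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    levels = 2 ** (nb_bits - 1) - 1
    total = hist.sum()
    best_t, best_mse = edges[-1], np.inf
    for i in range(1, num_bins + 1):
        t = edges[i]
        step = t / levels
        clip_err = np.sum(hist[i:] * (centers[i:] - t) ** 2)   # values clipped to t
        quant_err = np.sum(hist[:i]) * step ** 2 / 12.0        # uniform-noise model
        mse = (clip_err + quant_err) / total
        if mse < best_mse:
            best_mse, best_t = mse, t
    return best_t
```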
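
Scaling sketch, one plausible reading of the three "better scaling" options as ways to replace the floating-point activation scaling with integer operations; the formulation eventually implemented may differ. Assumes a downscaling factor 0 < scale <= 1.

```python
import math

def fixed_point_scaling(scale, fractional_bits=16):
    # Fixed-point: x * scale ≈ (x * multiplier) >> fractional_bits
    return round(scale * (1 << fractional_bits)), fractional_bits

def single_shift_scaling(scale):
    # Single shift: x * scale ≈ x >> shift (nearest power of two,
    # cheapest option, least precise).
    return round(-math.log2(scale))

def double_shift_scaling(scale):
    # Double shift: x * scale ≈ (x >> s1) + (x >> s2), i.e. the largest
    # power of two below scale plus a second shift for the residual.
    s1 = math.ceil(-math.log2(scale))
    residual = scale - 2.0 ** -s1
    s2 = round(-math.log2(residual)) if residual > 0 else None
    return s1, s2
```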