Channel-wise quantization in the PTQ
The first step toward supporting transformers in Post-Training Quantization (PTQ) is to implement channel-wise quantization for the Producers, meaning that the scaling factor is computed independently for each channel. This issue covers only the constant parts of the graph, specifically the Producers of weights and biases. Channel-wise quantization of activations will be addressed in a separate issue with its own Merge Request.
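To make "one scaling factor per channel" concrete, here is a minimal sketch in plain Python. It is not the PTQ implementation: the function names `per_channel_scales` and `quantize_per_channel` are hypothetical, and it assumes symmetric signed quantization, a weight tensor given as a list of channels (channel axis first), and a scale defined as the channel's maximum absolute value divided by the largest representable integer. Bias handling is left out.

```python
def per_channel_scales(weights, n_bits=8):
    """Return one scaling factor per channel (hypothetical helper).

    weights: list of channels, each a flat list of floats.
    Assumes symmetric signed quantization, so the integer range
    is [-q_max, q_max] with q_max = 2^(n_bits - 1) - 1.
    """
    q_max = 2 ** (n_bits - 1) - 1
    scales = []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        # Fall back to 1.0 for an all-zero channel to avoid dividing by zero.
        scales.append(max_abs / q_max if max_abs > 0 else 1.0)
    return scales


def quantize_per_channel(weights, n_bits=8):
    """Quantize each channel with its own scale (hypothetical helper)."""
    q_max = 2 ** (n_bits - 1) - 1
    scales = per_channel_scales(weights, n_bits)
    quantized = [
        # Round to the nearest integer, then clamp to the signed range.
        [max(-q_max, min(q_max, round(w / s))) for w in channel]
        for channel, s in zip(weights, scales)
    ]
    return quantized, scales


# Two channels with very different dynamic ranges.
weights = [[0.5, -1.0, 0.25], [0.01, 0.02, -0.04]]
quantized, scales = quantize_per_channel(weights)
```

In this example the second channel has a much smaller range than the first. With a single per-tensor scale its values would collapse onto a handful of integer levels, whereas a per-channel scale lets each channel use the full integer range.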