Aidge Compression Module

A module for compressing Convolutional Neural Networks (CNNs).

This module decomposes convolutional layers into multiple smaller convolution operations, reducing the total number of computations and improving inference speed.

It provides a C++/Python interface for compression.

Usage

Example usage with ResNet-18, where the first convolution is ignored and manual ranks are assigned to specific layers. A rank of -1 ignores the layer; a value in [0., 1.] represents a weakening factor:

import aidge_core
import aidge_onnx
import aidge_compressor

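# Load the ONNX model as an Aidge graph.
model = aidge_onnx.load_onnx("resnet18.onnx")

# Layers excluded from compression (here, the first convolution).
ignores = {"conv1_Conv"}
# Manual per-layer ranks; -1 ignores the layer.
ranks = {
    "layer1_layer1_1_conv1_Conv": -1.,
    "layer2_layer2_0_conv1_Conv": -1.,
}

# Compress the graph in place; the same `model` object is exported below.
aidge_compressor.Compressor(ignores, ranks, 0.8).compress(model)

# Export the compressed graph back to ONNX.
aidge_onnx.export_onnx(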
    model,
    "resnet18_0.8.onnx",
    inputs_dims={
        list(model.get_input_nodes(aidge_core.InputCategory.Data))[0].name(): [
            [1, 3, 224, 224]
        ]
    },
    outputs_dims={list(model.get_output_nodes())[0].name(): [[1, 100]]},
    opset=18,
)

More complete examples are provided in the examples/ folder.

Compression methods

This module compresses CNNs through tensor decomposition: a single convolutional weight tensor is replaced with several smaller tensors, which reduces the computational cost of the layer.
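To make the saving concrete, here is a small back-of-the-envelope sketch that counts multiply-accumulate operations (MACs) for one convolution before and after a Tucker-2 style split. The helper names and the example layer sizes are illustrative, not part of the module, and the count assumes stride 1 so that all three replacement layers work on the same spatial size.

```python
def conv_macs(c_in, c_out, k, h_out, w_out):
    """MACs of a k x k convolution producing an h_out x w_out output map."""
    return c_in * c_out * k * k * h_out * w_out


def tucker2_macs(c_in, c_out, k, h_out, w_out, r_in, r_out):
    """MACs of the three-layer replacement (assumes stride 1):
    1x1 conv (c_in -> r_in), k x k conv (r_in -> r_out), 1x1 conv (r_out -> c_out).
    """
    first = c_in * r_in * h_out * w_out
    core = r_in * r_out * k * k * h_out * w_out
    last = r_out * c_out * h_out * w_out
    return first + core + last


# Hypothetical layer: 256 -> 256 channels, 3x3 kernel, 14x14 output, ranks 64/64.
original = conv_macs(256, 256, 3, 14, 14)
compressed = tucker2_macs(256, 256, 3, 14, 14, r_in=64, r_out=64)
print(f"original: {original}, compressed: {compressed}, "
      f"ratio: {original / compressed:.2f}x")
```

The ratio depends entirely on the chosen ranks: smaller ranks give a cheaper layer but a coarser approximation of the original weights.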

Module internal pipeline

The process starts with rank selection for each layer's decomposition:

  • Without a training dataset: The module uses VBMF (Variational Bayesian Matrix Factorization) to estimate the rank for each layer.

  • With a training dataset: The module runs inference on the dataset to assess how decomposing each layer affects the network.

  • With a partial dataset: The same data-driven method can be used on the available data, provided the samples are representative of the cases the network will encounter.
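The data-free case can be pictured as estimating one rank per channel mode of the kernel from its singular values. The sketch below is not VBMF and is not the module's code: it swaps in a much simpler singular-value energy threshold as a stand-in, only to show the shape of the computation (unfold the kernel along a channel axis, inspect the spectrum, pick a rank). The function names and the `keep` parameter are hypothetical.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize `tensor` so that axis `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def energy_rank(matrix, keep=0.9):
    """Smallest rank whose singular values hold at least `keep` of the
    squared-norm energy (a simple stand-in heuristic, NOT VBMF)."""
    s = np.linalg.svd(matrix, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    rank = int(np.searchsorted(cumulative, keep)) + 1
    return min(rank, len(s))  # guard against round-off when keep is close to 1


def select_tucker2_ranks(weight, keep=0.9):
    """Return (r_out, r_in) for a kernel shaped (c_out, c_in, kh, kw)."""
    return energy_rank(unfold(weight, 0), keep), energy_rank(unfold(weight, 1), keep)


# Synthetic kernel built to have channel ranks of at most 4 (output) and 3 (input).
rng = np.random.default_rng(0)
core = rng.standard_normal((4, 3, 3, 3))
u_out = rng.standard_normal((16, 4))
u_in = rng.standard_normal((8, 3))
weight = np.einsum("oa,ib,abhw->oihw", u_out, u_in, core)
print(select_tucker2_ranks(weight, keep=0.99))
```

A fixed energy threshold needs tuning per network, whereas VBMF estimates the rank from the data itself; that is the practical reason to prefer it when no dataset is available.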

Following rank selection, each layer undergoes Tucker-2 decomposition or WeightSVD:

Figure: Tucker-2 decomposition.

  • WeightSVD: Applied to 2D tensors, replacing a single tensor with two smaller ones.

  • Tucker-2: Applied to 4D tensors, replacing a 4D tensor with two 2D tensors and a smaller 4D tensor.
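Both decompositions can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the module's implementation: Tucker-2 is computed here with a plain truncated HOSVD on the two channel modes (no iterative refinement), kernels are assumed to be laid out as (c_out, c_in, kh, kw), and all function names are made up for the example.

```python
import numpy as np


def weight_svd(weight, rank):
    """Split a (out, in) matrix into A (out, rank) and B (rank, in), A @ B ~= weight."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    root = np.sqrt(s[:rank])  # share the singular values between both factors
    return u[:, :rank] * root, root[:, None] * vt[:rank]


def tucker2(weight, r_out, r_in):
    """Truncated HOSVD on the channel modes of a (c_out, c_in, kh, kw) kernel.

    Returns two 2D factors and a smaller 4D core of shape (r_out, r_in, kh, kw).
    """
    c_out, c_in = weight.shape[:2]
    # Leading left singular vectors of the mode-0 and mode-1 unfoldings.
    u_out = np.linalg.svd(weight.reshape(c_out, -1), full_matrices=False)[0][:, :r_out]
    mode1 = np.moveaxis(weight, 1, 0).reshape(c_in, -1)
    u_in = np.linalg.svd(mode1, full_matrices=False)[0][:, :r_in]
    # Project the kernel onto both bases to obtain the core.
    core = np.einsum("oihw,oa,ib->abhw", weight, u_out, u_in)
    return u_out, core, u_in


def tucker2_reconstruct(u_out, core, u_in):
    """Rebuild the (approximate) full kernel from the three factors."""
    return np.einsum("abhw,oa,ib->oihw", core, u_out, u_in)


rng = np.random.default_rng(0)
kernel = rng.standard_normal((8, 6, 3, 3))
u_out, core, u_in = tucker2(kernel, r_out=4, r_in=3)
error = np.linalg.norm(kernel - tucker2_reconstruct(u_out, core, u_in))
print(core.shape, error / np.linalg.norm(kernel))
```

In a network, the three Tucker-2 factors map naturally onto three layers: `u_in` becomes a 1x1 convolution from c_in to r_in channels, `core` a kh x kw convolution from r_in to r_out channels, and `u_out` a 1x1 convolution from r_out to c_out channels.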

TODO list

  • Tucker-2 implementation
  • Automatic VBMF rank selection
  • WeightSVD for linear tensors
  • Validation-dataset-aided rank selection
  • CP decomposition
  • Automatic hybrid selection between CP and Tucker
  • Robust network decomposition