Draft: CMSIS_NN integration

Closed Wissam Boussella requested to merge wboussella/aidge_export_arm_cortexm:master into dev
13 unresolved threads

Context

#6 (closed) This merge request contains tools for exporting quantized neural networks to Arm Cortex M targets, using the Arm CMSIS_NN library.

It contains:

  • New templates for generating code (Fully Connected, Conv, Max Pooling, Relu)
  • A merge of the Conv Relu Scaling and Fully Connected Relu Scaling nodes
  • A function to convert a scaling parameter into a shift and a multiplier
  • Unit tests in the form of small layer exports, with expected outputs in the console
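The scaling-to-shift-and-multiplier conversion mentioned in the list can be sketched as follows. This is a minimal sketch, not the function from this merge request: it assumes the common fixed-point convention in which the scale is written as a Q31 multiplier times a power of two, and the names `quantize_multiplier` and `apply_scale` are hypothetical. The rounding in `apply_scale` is a simple round-half-up and is not claimed to be bit-exact with CMSIS-NN.

```python
import math
from typing import Tuple


def quantize_multiplier(scale: float) -> Tuple[int, int]:
    """Split a positive float scale into (multiplier, shift).

    Assumed convention: scale ~= multiplier * 2**(shift - 31),
    with multiplier stored as a Q31 integer in [2**30, 2**31).
    """
    if scale == 0.0:
        return 0, 0
    # frexp returns mantissa in [0.5, 1) and exponent with scale = mantissa * 2**exponent
    mantissa, shift = math.frexp(scale)
    multiplier = round(mantissa * (1 << 31))
    # Rounding can push the mantissa up to exactly 1.0; renormalize in that case.
    if multiplier == (1 << 31):
        multiplier //= 2
        shift += 1
    return multiplier, shift


def apply_scale(acc: int, multiplier: int, shift: int) -> int:
    """Rescale an integer accumulator using integer arithmetic only.

    Assumes shift <= 30 so that the total right shift is at least 1.
    """
    total_shift = 31 - shift
    return (acc * multiplier + (1 << (total_shift - 1))) >> total_shift


if __name__ == "__main__":
    m, s = quantize_multiplier(0.0078125)  # 1 / 128
    print(m, s, apply_scale(1000, m, s))
```

Keeping the multiplier normalized to the upper half of the 32-bit range preserves as many significant bits of the scale as possible, which is why the exponent is carried separately as a shift.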

This export is optimized for CMSIS_NN 6.0.

For the moment, only 8-bit quantization is available; 4-bit and 16-bit versions will soon be available.

Note: several TODOs remain, including an unresolved datatype problem and an input problem after quantization.

Modified files

  • operator.py: add a CMSIS_NN option for many operators
  • export.py: merge operators, copy CMSIS_NN files, delete identity layers

TODO

  • Int4 and Int16 support for CMSIS_NN
  • Use get_operator.Datatype()
  • No input after quantization
  • Operators: LSTM, Softmax, SVDF, Reshape
Edited by Grégoire Kubler

Closed by Olivier BICHLER 4 months ago (Jan 16, 2025 2:40pm UTC)

Merge details

  • The changes were not merged into dev.

Activity

             str(Path(export_folder) / "include" ))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_math_types.h"),
             str(Path(export_folder) / "include" ))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_types.h"),
             str(Path(export_folder) / "include" ))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nnsupportfunctions.h"),
             str(Path(export_folder) / "include" ))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "Internal" / "arm_nn_compiler.h"),
             str(Path(export_folder) / "include/Internal" ))



def cmsis_nn_fuse_node(model):
    for node in model.get_nodes():
        if node.type() == "Producer":
            node.get_operator().set_attr("Constant", True)
  • Cyril Moineau
  •     fuse_op = aidge_core.GenericOperator("ConvScaling", nb_data=1, nb_param=2, nb_out=1, name=f"ConvScaling_{cpt}")
        fuse_op.get_operator().set_forward_dims(ConvReluScaling(fuse_op).compute)

        for name, value in attributes.items():
            fuse_op.get_operator().set_attr(name, value)

        for i in range(len(producers)):
            producers[i].add_child(fuse_op, 0, i + 1)

        new_nodes = set([fuse_op] + producers)
        aidge_core.GraphView.replace(node_to_replace, new_nodes)


    def remove_identity(model):
        # Remove Identity (not necessary with the next patch of aidge_quantization)
  • import aidge_core
    import aidge_onnx
    import aidge_backend_cpu
    import aidge_export_arm_cortexm
    import aidge_quantization
    import numpy as np
    import aidge_export_cpp

    import torch
    import torch.nn as nn
    import torch.onnx
    import os
    current_dir = os.path.join( os.getcwd(),"uni_tests")
  •     samples = np.random.rand(*dims)
        samples = normalize_and_convert_to_int8(samples, -128, 127)

        print("Input Values: ", samples)
        input_array = np.reshape(samples, dims)
        output_array = propagate(aidge_model, scheduler, input_array)

        aidge_export_cpp.generate_input_file(array_name="input_uni_test", array=samples.reshape(-1), export_folder="uni_tests")

        print("Output values: ", np.round(output_array, 2))

        return aidge_model



    def uni_test_fully_connected():
  • @wboussella We need to resolve every thread before merging :smile:

  • I think unit_test should be addressed before merging. The changes are not that big and should take little time (about half a day).

  • ##############################################


    def copy_cmsis_nn_file(export_folder):
        copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nnfunctions.h"),
                 str(Path(export_folder) / "include" ))
        copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_math_types.h"),
                 str(Path(export_folder) / "include" ))
        copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_types.h"),
                 str(Path(export_folder) / "include" ))
        copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nnsupportfunctions.h"),
                 str(Path(export_folder) / "include" ))
        copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "Internal" / "arm_nn_compiler.h"),
                 str(Path(export_folder) / "include/Internal" ))

    # Dictionary of tokens and their associated responses
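The five near-identical `copyfile` calls in `copy_cmsis_nn_file` could be driven from a list of header paths instead. The following is a minimal sketch under assumptions: it uses the standard library's `shutil.copy` rather than the project's own `copyfile` helper (whose definition is not shown in this thread), and `copy_headers` and `CMSIS_NN_HEADERS` are hypothetical names.

```python
import shutil
from pathlib import Path

# Header paths relative to the CMSIS-NN "Include" directory, as listed in the snippet.
CMSIS_NN_HEADERS = [
    "arm_nnfunctions.h",
    "arm_nn_math_types.h",
    "arm_nn_types.h",
    "arm_nnsupportfunctions.h",
    "Internal/arm_nn_compiler.h",
]


def copy_headers(include_root, export_folder, headers=CMSIS_NN_HEADERS):
    """Copy each header into <export_folder>/include, keeping sub-directories."""
    copied = []
    for rel in headers:
        src = Path(include_root) / rel
        dst = Path(export_folder) / "include" / rel
        # Create "include" (and "include/Internal") on demand.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(src, dst)
        copied.append(dst)
    return copied
```

With this shape, adding or removing a header is a one-line change to the list, and the `include/Internal` destination follows from the relative path rather than being spelled out separately.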
  • FIXED_MULT32 = 2
    SINGLE_SHIFT = 3
    DOUBLE_SHIFT = 4
    CMSIS_NN = 5


    def __init__(self, scaling_factor=0.0, nb_bits=8) -> None:
        self.scaling_factor = scaling_factor
        self.nb_bits = nb_bits


    def calculate_shift_and_multiplier(self,scaling_factor):
        """
        This function takes a floating-point scaling factor and transforms it into a shift and multiplier for the CMSIS NN library.

        Parameters:
  • self.datatype = aidge_datatype2ctype(node.get_operator().get_output(0).dtype())
    self.scaling = Scaling()("no_scaling")
    self.activation = "Linear"
    self.batch = 1
  •             bias_name=self.inputs[2].name(),
                output_name=self.name,
                debug = True
            ))


            return list_actions


    @operator_register("ConvScaling")
    class ConvReluScaling_ARMCortexM(Conv_ARMCortexM):
        def __init__(self, node, board, library):
            super(Conv_ARMCortexM, self).__init__(node, board, library)

            if self.operator.has_attr("BeginEndBorders"):
                self.padding = self.operator.get_attr("BeginEndBorders")
  •             self.operator.get_attr("quantizedNbBits"))("floating_point")



    @operator_register("ConvReluScaling")
    class ConvReluScaling_ARMCortexM(Conv_ARMCortexM):
        def __init__(self, node, board, library):
            super(Conv_ARMCortexM, self).__init__(node, board, library)

            self.activation = "Rectifier"
            self.activation_min = 0
            self.activation_max = 127
            if self.dataformat == "int8_t":
                self.activation_max = 127
            elif self.dataformat == "int16_t":
                self.activation_max = 255
  • super(FCScaling_ARMCortexM, self).__init__(node, board, library)

    self.activation = "Rectifier"
    self.activation_min = 0
    self.activation_max = 127
    if self.dataformat == "int8_t":
        self.activation_max = 127
    elif self.dataformat == "int16_t":
        self.activation_max = 255
  • I added a review of operator.py, which I skipped in my first review...

    The code quality doesn't look very good: there appear to be mistakes in the code and a lot of hardcoded values.

    Since I am doing a massive refactoring of the export, I didn't go in depth on the code quality changes. We should look at this when updating this module with the new refactored export.
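    One hardcoded value visible in the snippets above is the `int16_t` activation ceiling of 255, whereas a signed 16-bit integer tops out at 32767. A minimal sketch of deriving the bounds from the bit width instead, assuming signed outputs with a zero output offset (a non-zero offset would shift the lower bound); `relu_activation_range` is a hypothetical helper, not part of this merge request:

```python
from typing import Tuple


def relu_activation_range(nb_bits: int) -> Tuple[int, int]:
    """Return (activation_min, activation_max) for a ReLU on signed nb_bits outputs.

    Assumes the quantized zero point is 0, so ReLU clamps to [0, 2**(nb_bits-1) - 1].
    """
    if nb_bits < 2:
        raise ValueError("a signed integer type needs at least 2 bits")
    return 0, (1 << (nb_bits - 1)) - 1


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(bits, relu_activation_range(bits))
```

    Computing the range from `nb_bits` also covers the 4-bit case listed in the TODO without adding another branch.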

  • Grégoire Kubler changed the description

  • Cyril Moineau marked this merge request as draft

  • Closing this MR, as it is outdated and now superseded by !18.
