Draft: CMSIS_NN integration
Context
#6 (closed) This merge request contains tools for exporting quantized neural networks to Arm Cortex-M targets, using the Arm CMSIS-NN library.
It contains:
- New templates for generating code (Fully Connected, Conv, Max Pooling, ReLU)
- A fusion of the Conv-ReLU-Scaling and FullyConnected-ReLU-Scaling node sequences
- A function to convert a floating-point scaling factor into a shift and a multiplier (see the sketch after this list)
- Unit tests in the form of small layer exports, with expected outputs printed to the console
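For illustration, a minimal sketch of the standard decomposition of a float scale into a fixed-point multiplier and shift (the helper name, the Q31 convention, and the edge-case handling are assumptions here; the MR's calculate_shift_and_multiplier may differ):

```python
import math

def to_multiplier_and_shift(scaling_factor: float, q_bits: int = 31):
    """Decompose scaling_factor so that scaling_factor ~= multiplier * 2**(shift - q_bits)."""
    if scaling_factor <= 0.0:
        return 0, 0
    # frexp gives scaling_factor == mantissa * 2**shift with 0.5 <= mantissa < 1
    mantissa, shift = math.frexp(scaling_factor)
    multiplier = round(mantissa * (1 << q_bits))  # fixed-point (Q31) mantissa
    if multiplier == (1 << q_bits):               # rounding pushed the mantissa up to 1.0
        multiplier >>= 1
        shift += 1
    return multiplier, shift
```

For example, `to_multiplier_and_shift(0.75)` returns `(1610612736, 0)`, i.e. 0.75 encoded in Q31 with no shift.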
This export is optimized for CMSIS-NN 6.0.
For the moment, only 8-bit quantization is available; 4-bit and 16-bit versions will follow.
Note: several TODOs remain, including an unresolved datatype problem, an input problem after quantization, etc.
Modified files
- operator.py: add a CMSIS-NN option for many operators
- export.py: merge operators, copy the CMSIS-NN files, delete Identity layers
TODO
- Int4 and Int16 support for CMSIS-NN
- Use get_operator().Datatype()
- No input after quantization
- Operators LSTM, Softmax, SVDF, Reshape
Merge request reports
Activity
added Feature 🚀 label
requested review from @cmoineau and @pineapple
assigned to @wboussella
Hello @wboussella,
- Are there differences with this MR: !1 (closed)?
- The TODOs should be turned into issues for future work
I will review this ASAP.
Cheers, Cyril
Both are pretty similar, but I did it for organizational reasons because I wanted to adapt to @vtemplier's last big commit. As for the TODOs, they essentially come from the fact that after quantization we still don't have access to the datatype and dataformat attributes of the layers.
I created issue #18 to do this later.
```python
# ... (excerpt; continues a copyfile call from the preceding lines)
         str(Path(export_folder) / "include"))
copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_math_types.h"),
         str(Path(export_folder) / "include"))
copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_types.h"),
         str(Path(export_folder) / "include"))
copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nnsupportfunctions.h"),
         str(Path(export_folder) / "include"))
copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "Internal" / "arm_nn_compiler.h"),
         str(Path(export_folder) / "include/Internal"))


def cmsis_nn_fuse_node(model):
    for node in model.get_nodes():
        if node.type() == "Producer":
            node.get_operator().set_attr("Constant", True)
```

Maybe we can do this in the refactoring task here: aidge#128 (closed)
- Resolved by Cyril Moineau
```python
fuse_op = aidge_core.GenericOperator("ConvScaling", nb_data=1, nb_param=2, nb_out=1, name=f"ConvScaling_{cpt}")
fuse_op.get_operator().set_forward_dims(ConvReluScaling(fuse_op).compute)

for name, value in attributes.items():
    fuse_op.get_operator().set_attr(name, value)

for i in range(len(producers)):
    producers[i].add_child(fuse_op, 0, i + 1)

new_nodes = set([fuse_op] + producers)
aidge_core.GraphView.replace(node_to_replace, new_nodes)


def remove_identity(model):
    # Remove Identity (not necessary with the next patch of aidge_quantization)
```
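The body of remove_identity is cut off in the diff. A minimal sketch of what such a pass could look like, reusing only the GraphView.replace API visible above (the empty-set deletion idiom is an assumption, not confirmed by this MR):

```python
def remove_identity(model):
    # Remove Identity (not necessary with the next patch of aidge_quantization)
    for node in model.get_nodes():
        if node.type() == "Identity":
            # Assumed semantics: replacing a node with an empty set deletes it
            # and reconnects its parents directly to its children.
            aidge_core.GraphView.replace(set([node]), set())
```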
- uni_tests/operator_unitest.py (new file, 0 → 100644)

```python
import aidge_core
import aidge_onnx
import aidge_backend_cpu
import aidge_export_arm_cortexm
import aidge_quantization
import numpy as np
import aidge_export_cpp

import torch
import torch.nn as nn
import torch.onnx
import os

current_dir = os.path.join(os.getcwd(), "uni_tests")
```

Please use pathlib instead of os for handling paths, see: https://gitlab.eclipse.org/groups/eclipse/aidge/-/wikis/python%20path
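For reference, the pathlib equivalent of that last line would look like this:

```python
from pathlib import Path

# Same directory as os.path.join(os.getcwd(), "uni_tests"), built with pathlib
current_dir = Path.cwd() / "uni_tests"
```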
- uni_tests/operator_unitest.py (new file, 0 → 100644)
```python
samples = np.random.rand(*dims)
samples = normalize_and_convert_to_int8(samples, -128, 127)

print("Input Values: ", samples)
input_array = np.reshape(samples, dims)
output_array = propagate(aidge_model, scheduler, input_array)

aidge_export_cpp.generate_input_file(array_name="input_uni_test", array=samples.reshape(-1), export_folder="uni_tests")

print("Output values: ", np.round(output_array, 2))

return aidge_model


def uni_test_fully_connected():
```

Python unit tests should use the unit test framework, see the wiki: https://gitlab.eclipse.org/groups/eclipse/aidge/-/wikis/python%20test
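A minimal sketch of what that could look like with the standard unittest module (the class name and assertion are illustrative, not taken from the MR):

```python
import unittest


class TestOperatorExport(unittest.TestCase):
    def test_fully_connected(self):
        # Hypothetical wrapper around the existing uni_test_fully_connected()
        model = uni_test_fully_connected()
        self.assertIsNotNone(model)


if __name__ == "__main__":
    unittest.main()
```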
@wboussella We need to resolve every thread before merging
```python
##############################################


def copy_cmsis_nn_file(export_folder):
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nnfunctions.h"),
             str(Path(export_folder) / "include"))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_math_types.h"),
             str(Path(export_folder) / "include"))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nn_types.h"),
             str(Path(export_folder) / "include"))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "arm_nnsupportfunctions.h"),
             str(Path(export_folder) / "include"))
    copyfile(str(ROOT / "_CMSIS-NN" / "CMSIS-NN" / "Include" / "Internal" / "arm_nn_compiler.h"),
             str(Path(export_folder) / "include/Internal"))

# Dictionary of tokens and their associated responses

# ...
    FIXED_MULT32 = 2
    SINGLE_SHIFT = 3
    DOUBLE_SHIFT = 4
    CMSIS_NN = 5


    def __init__(self, scaling_factor=0.0, nb_bits=8) -> None:
        self.scaling_factor = scaling_factor
        self.nb_bits = nb_bits


    def calculate_shift_and_multiplier(self, scaling_factor):
        """
        This function takes a floating-point scaling factor and transforms it into a shift and multiplier for the CMSIS-NN library.

        Parameters:
        # ... (docstring truncated in the diff)
```

Python docstrings should follow: https://gitlab.eclipse.org/groups/eclipse/aidge/-/wikis/Writing%20documentation#python-docstring
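For comparison, a docstring in the style the wiki presumably asks for (Sphinx/reST field lists; this exact formatting is an assumption, check the linked page):

```python
def calculate_shift_and_multiplier(self, scaling_factor):
    """Transform a floating-point scaling factor into a shift and a
    multiplier for the CMSIS-NN requantization kernels.

    :param scaling_factor: Floating-point scaling factor to decompose.
    :type scaling_factor: float
    :return: The (shift, multiplier) pair encoding the scaling factor.
    :rtype: tuple
    """
```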
```python
                bias_name=self.inputs[2].name(),
                output_name=self.name,
                debug = True
            ))

        return list_actions


@operator_register("ConvScaling")
class ConvReluScaling_ARMCortexM(Conv_ARMCortexM):
    def __init__(self, node, board, library):
        super(Conv_ARMCortexM, self).__init__(node, board, library)

        if self.operator.has_attr("BeginEndBorders"):
            self.padding = self.operator.get_attr("BeginEndBorders")
        # ...
            self.operator.get_attr("quantizedNbBits"))("floating_point")


@operator_register("ConvReluScaling")
class ConvReluScaling_ARMCortexM(Conv_ARMCortexM):
    def __init__(self, node, board, library):
        super(Conv_ARMCortexM, self).__init__(node, board, library)

        self.activation = "Rectifier"
        self.activation_min = 0
        self.activation_max = 127
        if self.dataformat == "int8_t":
            self.activation_max = 127
        elif self.dataformat == "int16_t":
            self.activation_max = 255

# ...
        super(FCScaling_ARMCortexM, self).__init__(node, board, library)

        self.activation = "Rectifier"
        self.activation_min = 0
        self.activation_max = 127
        if self.dataformat == "int8_t":
            self.activation_max = 127
        elif self.dataformat == "int16_t":
            self.activation_max = 255
```

I added a review of operator.py, which I skipped in my first review... The code quality doesn't look very good, with what look like mistakes in the code (e.g. the "ConvScaling" registration on a class named ConvReluScaling_ARMCortexM) and a lot of hardcoded values.
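For the hardcoded activation bounds, a sketch of deriving them from the datatype instead (an illustrative helper, not part of the MR; note also that the int16_t branch above caps at 255, where the signed 16-bit maximum is 32767):

```python
def relu_activation_range(dataformat: str):
    """Return (min, max) ReLU clamping bounds for a signed C integer type."""
    width = {"int8_t": 8, "int16_t": 16}[dataformat]
    # ReLU output is non-negative; the upper bound is the signed-type maximum.
    return 0, 2 ** (width - 1) - 1
```

With this, the constructors above would reduce to `self.activation_min, self.activation_max = relu_activation_range(self.dataformat)`.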
Since I am doing a massive refactoring of the export, I didn't go in depth on the code quality changes. We should look at this when updating this module with the new refactored export.
Closing this MR, as it is outdated and now superseded by !18.