[Add] benchmark scripts
Context
The aim is to add a universal system to measure the accuracy and time performance of each Aidge module, whether it is a backend or an export module. It is part of #262.
Usage
To explain how this benchmarking system works, we will walk through benchmarking the current Conv2D operator implementations.
1. Create a test configuration file
The first step to benchmark an Operator is to list the parameter and input configurations that are to be tested. For that, we use a JSON configuration file.
Several test config files are provided in this MR, including one for Conv2D. A test configuration is made of 4 parts:
1. Description of the tested Operator
This description provides parameters for the ONNX file generation.
"operator": "Conv", // operator type
"opset_version": 21, // opset
"initializer_rank": 1, // number of data inputs (does not include initializers)
2. Meta-data about the test
Exports are designed to infer a stream of data, so their current implementation does not support multiple batches. For now, this part only indicates whether inputs have multiple batches.
"test_meta_data": {
"multiple_batchs": false
},
3. Base configuration
Each individual test should tweak a single parameter to assess the impact of this parameter on performance and outputs (for example the number of input channels). Any other parameter should keep its base configuration value.
"base_configuration": {
"input_shapes": [
["input_0", [1, 10, 200, 200]],
["weight_1", [10, 10, 3, 3]],
["bias_2", [10]]
],
"attributes": {
"kernel_shape": [3, 3],
"strides": [1, 1],
"dilations": [1, 1]
}
},
4. Tested parameters
This part lists the values of each tested parameter, grouped by parameter name. Each value leads to an independent test.
Below, the other_parameters section allows updating the operator attributes and input shapes accordingly (a resolution sketch follows the full configuration below).
"test_configuration": {
"main_parameters": {
"feature_map_size": [
10,100,500
],
"kernel_shape": [
[1, 1],
[3, 3],
[5, 5]
],
"strides": [
[1, 1],
[2, 2],
[3, 3]
],
"dilations": [
[1, 1],
[2, 2],
[3, 3]
]
},
"other_parameters": {
"feature_map_size": {
"10": {
"attributes": {},
"input_shapes": [
["input_0", [1, 10, 10, 10]]
]
},
"100": {
"attributes": {},
"input_shapes": [
["input_0", [1, 10, 100, 100]]
]
},
"500": {
"attributes": {},
"input_shapes": [
["input_0", [1, 10, 500, 500]]
]
}
},
"kernel_shape": {
"[1, 1]": {
"attributes": {},
"input_shapes": [
["weight_1", [10, 10, 1, 1]]
]
},
"[3, 3]": {
"attributes": {},
"input_shapes": [
["weight_1", [10, 10, 3, 3]]
]
},
"[5, 5]": {
"attributes": {},
"input_shapes": [
["weight_1", [10, 10, 5, 5]]
]
}
},
"strides": {
"[1, 1]": {
"attributes": {},
"input_shapes": []
},
"[2, 2]": {
"attributes": {},
"input_shapes": []
},
"[3, 3]": {
"attributes": {},
"input_shapes": []
}
},
"dilations": {
"[1, 1]": {
"attributes": {},
"input_shapes": []
},
"[2, 2]": {
"attributes": {},
"input_shapes": []
},
"[3, 3]": {
"attributes": {},
"input_shapes": []
}
}
}
}
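To make the override mechanism concrete, here is a minimal sketch (not the actual benchmark.py logic; resolve_test_case is a hypothetical helper name) of how one test case could be resolved by overriding the base configuration with the matching other_parameters entry:

# Illustrative sketch, not the actual benchmark.py logic: resolve one test case by
# overriding the base configuration with the matching "other_parameters" entry.
# `resolve_test_case` is a hypothetical helper name.
import json

def resolve_test_case(config: dict, param_name: str, value) -> dict:
    base = config["base_configuration"]
    attributes = dict(base["attributes"])
    input_shapes = dict(base["input_shapes"])  # [name, shape] pairs -> {name: shape}

    # If the tested parameter is an ONNX attribute (e.g. "strides"), set it directly.
    if param_name in attributes:
        attributes[param_name] = value

    # Apply the per-value overrides declared under "other_parameters".
    overrides = config["test_configuration"]["other_parameters"].get(param_name, {})
    case = overrides.get(str(value), {"attributes": {}, "input_shapes": []})
    attributes.update(case["attributes"])
    input_shapes.update(dict(case["input_shapes"]))

    return {"attributes": attributes, "input_shapes": input_shapes}

with open("conv2d_config.json") as f:
    cfg = json.load(f)
print(resolve_test_case(cfg, "kernel_shape", [3, 3]))

For the kernel_shape -- [3, 3] test of the configuration above, this keeps the base input_0 and bias_2 shapes, sets the kernel_shape attribute to [3, 3], and replaces the weight_1 shape with [10, 10, 3, 3].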
2. Assess modules
The main script to run the benchmark is benchmark/benchmark.py.
usage: benchmark.py [-h] --config-file CONFIG_FILE --module-to-bench MODULE_TO_BENCH
[--compare-with-onnxruntime] [--time] --results-directory RESULTS_DIRECTORY
[--results-filename RESULTS_FILENAME]
Operator Kernel Performance Benchmarking
options:
-h, --help show this help message and exit
--config-file CONFIG_FILE, -cf CONFIG_FILE
Path to configuration JSON with operator type, attributes, and input sizes.
--module-to-bench MODULE_TO_BENCH, -mtb MODULE_TO_BENCH
Name of the module containing the inference functions
--compare-with-onnxruntime, -cwo
Compare output with ONNXRuntime
--time, -t Compute inference time
--results-directory RESULTS_DIRECTORY
Directory to save the results
--results-filename RESULTS_FILENAME
Name of the saved result file. If not provided, it will default to
'<operator_name>_<module_to_bench>.json'. If a file with that name at that location
already exists, it will be overridden, with elements individually replaced only if new
ones are computed
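The module passed with --module-to-bench must expose the inference functions that benchmark.py calls; the exact interface is defined by the scripts added in this MR and is not reproduced here. Purely as an illustration, an onnxruntime-style adapter could look like the following sketch (the names prepare_inference and run_inference are assumptions, not the actual API):

# Rough illustration only: the function names and signatures below are assumptions,
# not the interface actually required by benchmark.py.
import onnxruntime as ort

def prepare_inference(onnx_model_path: str) -> ort.InferenceSession:
    # Load the generated ONNX model once, outside of any timed section.
    return ort.InferenceSession(onnx_model_path, providers=["CPUExecutionProvider"])

def run_inference(session: ort.InferenceSession, inputs: dict) -> list:
    # Single forward pass; returns every model output as a numpy array.
    return session.run(None, inputs)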
Here, let's assess the aidge_backend_cpu, aidge_backend_cuda, aidge_export_cpp, onnxruntime and torch libraries. I will only show the command for aidge_backend_cpu:
python aidge/aidge_core/benchmark/benchmark.py \
--time \
--compare-with-onnxruntime \
--config-file ./aidge/aidge_core/benchmark/operator_config/conv2d_config.json \
--results-directory ./aidge/aidge_core/benchmark/results/ \
--module-to-bench aidge_backend_cpu
This outputs the following:
'aidge_backend_cpu' module successfully imported
Starting tests...
▷ feature_map_size -- 10
├┬─Measuring kernel inference time...
│└─[ time = 1.80e-05 ± 1.21e-06 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ feature_map_size -- 100
├┬─Measuring kernel inference time...
│└─[ time = 7.82e-04 ± 9.67e-06 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ feature_map_size -- 500
├┬─Measuring kernel inference time...
│└─[ time = 2.02e-02 ± 3.05e-03 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ kernel_shape -- [1, 1]
├┬─Measuring kernel inference time...
│└─[ time = 5.70e-04 ± 1.45e-04 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ kernel_shape -- [3, 3]
├┬─Measuring kernel inference time...
│└─[ time = 2.95e-03 ± 4.63e-06 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ kernel_shape -- [5, 5]
├┬─Measuring kernel inference time...
│└─[ time = 4.74e-02 ± 3.76e-04 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ strides -- [1, 1]
├┬─Measuring kernel inference time...
│└─[ time = 2.99e-03 ± 2.19e-04 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ strides -- [2, 2]
├┬─Measuring kernel inference time...
│└─[ time = 1.96e-03 ± 1.91e-05 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ strides -- [3, 3]
├┬─Measuring kernel inference time...
│└─[ time = 8.89e-04 ± 3.84e-06 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ dilations -- [1, 1]
├┬─Measuring kernel inference time...
│└─[ time = 2.97e-03 ± 5.48e-05 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ dilations -- [2, 2]
├┬─Measuring kernel inference time...
│└─[ time = 2.00e-02 ± 6.51e-05 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
▷ dilations -- [3, 3]
├┬─Measuring kernel inference time...
│└─[ time = 1.96e-02 ± 9.84e-05 seconds ]
└┬─Assessing results are the same as 'onnxruntime' module...
└─[ o ]
Printing results to JSON './aidge/aidge_core/benchmark/results/conv2d_aidge_backend_cpu.json'
As requested, you get the average time over 50 iterations and the result of the comparison with ONNXRuntime. If the results are not equal, you get [ x ] instead.
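For reference, the reported time corresponds to a mean and standard deviation over repeated runs, and the [ o ] / [ x ] verdict to an element-wise comparison against the onnxruntime output. Here is a minimal sketch of how such measurements can be taken (helper names, warm-up count and tolerances are illustrative, not the values hard-coded in benchmark.py):

# Minimal sketch of the two measurements reported above; helper names, warm-up
# count and tolerances are illustrative, not taken from benchmark.py.
import statistics
import time

import numpy as np

def measure_time(run, warmup: int = 10, iterations: int = 50):
    # Warm up before measuring, then time `iterations` independent runs.
    for _ in range(warmup):
        run()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

def same_results(output, reference) -> str:
    # "[ o ]" if the outputs match the reference within a tolerance, "[ x ]" otherwise.
    return "[ o ]" if np.allclose(output, reference, rtol=1e-5, atol=1e-6) else "[ x ]"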
3. Generate a graph to view results
Using the generated results JSON files, you can generate a bar plot comparing the implementations' relative performance for each parameter with the generate_graph.py Python script.
You must have results for the reference library and for each other library. Here is the command line and the generated image:
python benchmarks/generate_graph.py \
--operator-config benchmarks/operator_config/conv2d_config.json \
--ref benchmarks/results/conv2d_onnxruntime.json \
--libs benchmarks/results/conv2d_torch.json \
benchmarks/results/conv2d_aidge_backend_cpu.json \
benchmarks/results/conv2d_aidge_backend_cuda.json \
benchmarks/results/conv2d_aidge_export_cpp.json
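The schema of the results JSON files is defined by benchmark.py and is not detailed here. Purely as an illustration, assuming each file maps a test label to a mean time in seconds, the relative-performance bars could be computed and drawn roughly like this (paths, key names and file layout are assumptions):

# Illustrative sketch only: assumes each results JSON maps a test label to a mean
# time in seconds, which is an assumption and NOT the documented schema.
import json

import matplotlib.pyplot as plt
import numpy as np

def load_times(path: str) -> dict:
    with open(path) as f:
        return json.load(f)

ref = load_times("results/conv2d_onnxruntime.json")
libs = {
    "torch": load_times("results/conv2d_torch.json"),
    "aidge_backend_cpu": load_times("results/conv2d_aidge_backend_cpu.json"),
}

labels = list(ref)
x = np.arange(len(labels))
width = 0.8 / len(libs)

for i, (name, times) in enumerate(libs.items()):
    # Relative performance: values above 1 mean faster than the reference library.
    speedup = [ref[label] / times[label] for label in labels]
    plt.bar(x + i * width, speedup, width, label=name)

plt.axhline(1.0, color="black", linewidth=0.8)  # reference baseline
plt.xticks(x + width * (len(libs) - 1) / 2, labels, rotation=45, ha="right")
plt.ylabel("speed relative to the reference")
plt.legend()
plt.tight_layout()
plt.show()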
Major modifications
- add: some first operator configuration files
- add: benchmark python script benchmark.py
- add: inference and output scripts for the torch and onnxruntime libraries
- upd: new main cpp scripts for export
- add: generate_graph.py to generate a bar plot of the results
Next improvements
- choose the number of warmup runs and iterations via command line parameters of benchmark.py
- add compatibility with complete models
- allow running benchmark.py with several libraries at once to avoid creating and loading an ONNX model once per library, which can be heavy with big input Tensors