
Add MatMulTiling recipe

Olivier BICHLER requested to merge matmultiling into dev

Add the MatMulTiling recipe. Its goal is to tile any matrix multiplication into several fixed-size matrix multiplications. For instance, for a MatMul of size 80x80 and a tiling of 16x16, the recipe replaces the MatMul operator with 25 (5 by 5) MatMul operators of size 16x16, with Slice operators inserted at the inputs and Concat operators inserted at the outputs.
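
The arithmetic of the 80x80 example can be checked with a minimal pure-Python sketch. It does not use the Aidge API: `matmul` and `tiled_matmul` are hypothetical helper names, and the sketch assumes that the tile size refers to the output tile, with the inner (reduction) dimension left whole.

```python
def matmul(a, b):
    """Plain matrix product of two 2-D lists of numbers."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def tiled_matmul(a, b, tile_rows, tile_cols):
    """Compute a @ b as a grid of output tiles of at most tile_rows x tile_cols.

    Returns the full result and the number of small matmuls performed.
    Assumption: only the output dimensions are tiled, not the inner one.
    """
    m, n = len(a), len(b[0])
    result = []
    count = 0
    for i in range(0, m, tile_rows):
        a_slice = a[i:i + tile_rows]                       # Slice on rows of input 0
        row_tiles = []
        for j in range(0, n, tile_cols):
            b_slice = [row[j:j + tile_cols] for row in b]  # Slice on columns of input 1
            row_tiles.append(matmul(a_slice, b_slice))     # one small MatMul
            count += 1
        # Concat the tiles of this row band along the column axis,
        # then append the band to the result (concat along the row axis).
        for r in range(len(a_slice)):
            result.append([v for tile in row_tiles for v in tile[r]])
    return result, count


# 80x80 inputs with small integer entries, so results compare exactly.
a = [[(i * j) % 7 for j in range(80)] for i in range(80)]
b = [[(i + 2 * j) % 5 for j in range(80)] for i in range(80)]
tiled, n_tiles = tiled_matmul(a, b, 16, 16)
print(n_tiles, tiled == matmul(a, b))
```

With 16x16 tiles on an 80x80 output, the loop performs 5 x 5 = 25 small multiplications, matching the count given above.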

This is especially useful when matrix multiplications must be mapped to hardware with a fixed maximum size, such as a TPU (Tensor Processing Unit) or an MMA (Matrix Multiplication Accelerator). The recipe can be combined with the ConvToMatMul recipe, to convert convolutions to matrix multiplications beforehand, and with the ConstantFolding recipe, to fold sliced constant tensors.

Detailed list of changes:

  • Fix Concat forward bug with negative axis;
  • Fix SliceImpl registration bug on backend CPU;
  • Fix removeNode recipe to use SinglePassGraphMatching, as GraphRegex is buggy on complex cases;
  • Add MatMulTiling recipe.

READY TO BE REVIEWED

Here is the graph generated by a single step of the MatMulTiling recipe (after the very first matrix multiplication split):

%%{init: {'flowchart': { 'curve': 'monotoneY'}, 'fontFamily': 'Verdana' } }%%
flowchart TB

Producer_7(<em>Producer#7</em>):::producerCls
MatMul_1(<em>MatMul#1</em>)
Concat_0(<em>Concat#0</em>)
Producer_1(<em>Producer#1</em>):::producerCls
Producer_2(<em>Producer#2</em>):::producerCls
Producer_3(<em>Producer#3</em>):::producerCls
Producer_4(<em>Producer#4</em>):::producerCls
Producer_5(<em>Producer#5</em>):::producerCls
Producer_6(<em>Producer#6</em>):::producerCls
Identity_0(<em>Identity#0</em>):::rootCls
Slice_0(<em>Slice#0</em>)
Producer_0(<em>Producer#0</em>):::producerCls
MatMul_0(<em>MatMul#0</em>)
Identity_1(<em>Identity#1</em>)
Slice_1(<em>Slice#1</em>)
Producer_7-->|"0 [2]&rarr;4"|Slice_1
MatMul_1-->|"0 [2, 3, 64, 80]&rarr;1"|Concat_0
Producer_1-->|"0 [2]&rarr;2"|Slice_0
Producer_2-->|"0 [2]&rarr;3"|Slice_0
Producer_3-->|"0 [2]&rarr;4"|Slice_0
Producer_4-->|"0 [2]&rarr;1"|Slice_1
Producer_5-->|"0 [2]&rarr;2"|Slice_1
Producer_6-->|"0 [2]&rarr;3"|Slice_1
Identity_0-->|"0 [2, 3, 80, 80]&rarr;0"|Slice_0
Identity_0-->|"0 [2, 3, 80, 80]&rarr;0"|Slice_1
Slice_0-->|"0 [2, 3, 16, 80]&rarr;0"|MatMul_0
Producer_0-->|"0 [2]&rarr;1"|Slice_0
MatMul_0-->|"0 [2, 3, 16, 80]&rarr;0"|Concat_0
Identity_1-->|"0 [2, 3, 80, 80]&rarr;1"|MatMul_1
Identity_1-->|"0 [2, 3, 80, 80]&rarr;1"|MatMul_0
Slice_1-->|"0 [2, 3, 64, 80]&rarr;0"|MatMul_1
input0((in#0)):::inputCls--->|"&rarr;0[2, 3, 80, 80]"|Identity_0
input1((in#1)):::inputCls--->|"&rarr;0[2, 3, 80, 80]"|Identity_1
Concat_0--->|"0 [2, 3, 80, 80]&rarr;"|output0((out#0)):::outputCls
classDef inputCls fill:#afa
classDef outputCls fill:#ffa
classDef externalCls fill:#ccc
classDef producerCls fill:#ccf
classDef genericCls fill:#f9f9ff,stroke-width:1px,stroke-dasharray: 5 5
classDef metaCls stroke-width:5px
classDef rootCls stroke:#f00
classDef producerCls_rootCls stroke:#f00,fill:#ccf
classDef genericCls_rootCls stroke:#f00,fill:#f9f9ff,stroke-width:1px,stroke-dasharray: 5 5
classDef metaCls_rootCls stroke:#f00,stroke-width:5px
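
The single step shown in the graph above slices the first input along its row axis into a block of 16 rows (Slice#0) and a remainder of 64 rows (Slice#1), multiplies each by the full second input, and concatenates the two results. A minimal sketch of that step, applied repeatedly until the remainder fits, might look like the following. The names `matmul` and `tile_rows` are hypothetical, the code works on plain 2-D lists rather than Aidge tensors, and the assumption that the recipe repeats the same split on the remainder is inferred from the phrase "single step", not confirmed by the source.

```python
def matmul(a, b):
    """Plain matrix product of two 2-D lists of numbers."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def tile_rows(a, b, max_rows):
    """Split a @ b along the row axis of `a` into blocks of at most max_rows rows."""
    if len(a) <= max_rows:
        return matmul(a, b)                  # small enough: no further split
    head, tail = a[:max_rows], a[max_rows:]  # the two Slice operators of one step
    # Concat along the row axis: the head product followed by the tiled remainder.
    return matmul(head, b) + tile_rows(tail, b, max_rows)


a = [[(3 * i + j) % 11 for j in range(80)] for i in range(80)]
b = [[(i * j) % 7 for j in range(80)] for i in range(80)]
print(tile_rows(a, b, 16) == matmul(a, b))
```

After the first call, `head` has 16 rows and `tail` has 64, which corresponds to the [2, 3, 16, 80] and [2, 3, 64, 80] edges in the graph once the two leading batch dimensions are ignored.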

Initial graph:

%%{init: {'flowchart': { 'curve': 'monotoneY'}, 'fontFamily': 'Verdana' } }%%
flowchart TB

MatMul_0("matmul1<br/><sub><em>(MatMul#0)</em></sub>"):::rootCls
Producer_1("w1<br/><sub><em>(Producer#1)</em></sub>"):::producerCls
Producer_0("dataProvider<br/><sub><em>(Producer#0)</em></sub>"):::producerCls
MatMul_0--->|"0 [2, 3, 80, 80]&rarr;"|output0((out#0)):::outputCls
Producer_1-->|"0 [2, 3, 80, 80]&rarr;1"|MatMul_0
Producer_0-->|"0 [2, 3, 80, 80]&rarr;0"|MatMul_0
classDef inputCls fill:#afa
classDef outputCls fill:#ffa
classDef externalCls fill:#ccc
classDef producerCls fill:#ccf
classDef genericCls fill:#f9f9ff,stroke-width:1px,stroke-dasharray: 5 5
classDef metaCls stroke-width:5px
classDef rootCls stroke:#f00
classDef producerCls_rootCls stroke:#f00,fill:#ccf
classDef genericCls_rootCls stroke:#f00,fill:#f9f9ff,stroke-width:1px,stroke-dasharray: 5 5
classDef metaCls_rootCls stroke:#f00,stroke-width:5px
Edited by Olivier BICHLER
