Add unary and batch support

Merge Request Summary

This MR enhances the XNNPACK export functionality with two main improvements:

1. Unary Operator Support

Refactored the clamping implementation into a more flexible unary operator framework:

  • Replaced clamp_f32_ctx.hpp with unary_f32_ctx.hpp and updated the template files to support generalized unary operators
  • Replaced the relu.py operator module with the new unary.py module

This change makes it easier to add new unary operators without duplicating code: create a new class in unary.py and set its op_type to the operator being added.
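
As a rough illustration of that pattern, the sketch below shows how a new operator might be declared. The actual contents of unary.py are not shown in this MR description, so the `UnaryOp` base class, the `context()` method, and the parameter names are hypothetical; only the idea of a class that sets `op_type` comes from the description above.

```python
class UnaryOp:
    """Hypothetical base class; the real one in unary.py may differ."""

    op_type = None  # subclasses set this to the operator name

    def __init__(self, name, **params):
        if self.op_type is None:
            raise TypeError("subclasses must set op_type")
        self.name = name
        self.params = params

    def context(self):
        """Return the values a shared unary template could consume."""
        return {
            "name": self.name,
            "op_type": self.op_type,
            "params": dict(self.params),
        }


class Clamp(UnaryOp):
    op_type = "clamp"

    def __init__(self, name, min_val, max_val):
        super().__init__(name, min=min_val, max=max_val)


class Relu(Clamp):
    """ReLU expressed as a clamp to [0, +inf) (an assumption of this sketch)."""

    def __init__(self, name):
        super().__init__(name, 0.0, float("inf"))


class Sigmoid(UnaryOp):
    # Adding a new operator only requires naming its op_type.
    op_type = "sigmoid"
```

Under these assumptions, `Sigmoid("act0").context()` yields a dictionary with `op_type` set to `"sigmoid"` and no extra parameters, which a single generalized template could then render.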

2. Batch Support

Added batch dimension handling across multiple operator templates:

  • Convolution layers: conv2d_nhwc_f32, conv2d_act_nhwc_f32
  • Pooling layers: avgpool2d_nhwc_f32, maxpool2d_nhwc_f32, globalavgpool2d_nhwc_f32
  • Fully connected layers: fc_f32
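
For readers unfamiliar with what a batch dimension means for an NHWC tensor, the following minimal sketch shows the usual flat-buffer indexing once a leading batch index is included. It is not taken from the templates in this MR; the function and parameter names are hypothetical and serve only to illustrate the layout.

```python
def nhwc_offset(n, h, w, c, height, width, channels):
    """Flat offset of element (n, h, w, c) in a contiguous NHWC buffer."""
    return ((n * height + h) * width + w) * channels + c


def nhwc_size(batch, height, width, channels):
    """Total number of elements in a contiguous NHWC tensor."""
    return batch * height * width * channels


# With batch == 1 the n term vanishes, which matches the unbatched layout.
first_of_second_image = nhwc_offset(1, 0, 0, 0, height=2, width=3, channels=4)
total = nhwc_size(2, 2, 3, 4)
```

Each additional batch element starts `height * width * channels` entries after the previous one, so a template that previously assumed a single image mainly needs the batch size passed through to its shape arguments.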
