Inconsistent batch size after inference
Problem description
When evaluating the model, the batch size of the output tensor does not match the batch size of the input tensor.
Reproducible example code
In the provided notebook (notebook.tar.gz), a data provider supplies a data_batch tensor of shape [16, 784], which is fed into the scheduler.forward() method. After inference, output #1 of the model has shape [64, 10]. The batch size of the output should be 16, not 64.
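The notebook is not reproduced here, but the factor-of-four jump (16 to 64) is the kind of mismatch that can appear when forward() reshapes its input with a hard-coded feature size. The snippet below is a minimal, hypothetical PyTorch sketch (TinyClassifier, its layer sizes, and the reshape are assumptions for illustration, not code from notebook.tar.gz) that reproduces the same symptom and adds a simple batch-size check during evaluation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a forward() that reshapes with a hard-coded
# feature size can silently change the batch dimension.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(196, 10)

    def forward(self, x):
        # view(-1, 196) folds the 784 features of each sample into
        # four rows, so a [16, 784] batch becomes [64, 196].
        x = x.view(-1, 196)
        return self.fc(x)

data_batch = torch.randn(16, 784)       # input batch of 16 samples
output = TinyClassifier()(data_batch)   # output ends up [64, 10]
print(data_batch.shape, output.shape)

# A guard like this during evaluation catches the inconsistency
# (it trips for the buggy model above):
assert output.shape[0] == data_batch.shape[0], (
    f"batch size changed: {data_batch.shape[0]} -> {output.shape[0]}"
)
```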