Fuse bn

- Fix Add + Mul recipes
- Add fuseBN recipes
- Fix parameters issue #8 (closed)
- Update tensor to get/set values
- Add approxEq Tensor utils method to check if two tensors are approximately equal (see the sketch after this list)
- Update the way the scheduler is updated
- Add unit tests for:
  - fuse Add + Mul
  - remove padding
- Add unit test for the fuseBN recipes (TODO in aidge_backend_cpu because of the dependency on the backend for the test)
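For context, a minimal sketch of what such an approximate-equality check could look like (illustrative only; the container type, accessor names and default tolerances are assumptions, not the actual Aidge API). It follows the usual |a - b| <= absolute + relative * |b| criterion and validates that both tolerances are non-negative:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Illustrative sketch: element-wise approximate equality in the spirit of
// numpy.isclose / torch.allclose. Works on any container exposing size()
// and operator[] (e.g. std::vector<float>); not the actual Aidge signature.
template <typename Container>
bool approxEq(const Container& t1, const Container& t2,
              float relative = 1e-5f, float absolute = 1e-8f) {
    assert(relative >= 0.0f && absolute >= 0.0f);  // tolerances must be non-negative
    if (t1.size() != t2.size()) {
        return false;  // mismatched number of elements
    }
    for (std::size_t i = 0; i < t1.size(); ++i) {
        const float a = static_cast<float>(t1[i]);
        const float b = static_cast<float>(t2[i]);
        if (std::fabs(a - b) > absolute + relative * std::fabs(b)) {
            return false;
        }
    }
    return true;
}
```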
Activity
changed milestone to %v0.1.0
assigned to @cmoineau
marked the checklist item Fix parameters issue #8 (closed) as completed
added 1 commit
- 6dbe2ca8 - Update fix #8 (closed) by removing the assumption that size_t is unsigned.
changed this file in version 11 of the diff
actually, a TensorImpl is required to store a contiguous 1D array of bytes. Thus retrieving the address of this array is not virtual, nor is accessing a given byte. Besides, getRaw is thus not virtual and totally equivalent to the default ::operator[](std::size_t). Unless I missed something, I propose to remove it entirely.

Making lazyInit explicit can indeed solve the issue of useless calls (even though we still have to solve how to manage user manually generated Tensors). Couldn't getRaw() return a (void*) pointer and ::operator[](std::size_t) return a typed value? Also, as we discussed, getRaw() could return only a device-usable "address".

What should ::operator[](std::size_t) return? I am not sure yet.

When I created getRaw the goal was to NOT return a standard CPU addr. The goal is to return a pointer to the real data (RAW) to mimic the PyTorch behavior of https://pytorch.org/docs/stable/generated/torch.Tensor.data_ptr.html.
This is a useful feature for interoperability, especially with PyTorch CUDA.
Maybe we should add another method (getHostPtr) to access a CPU memory addr.
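As a rough illustration of the split being discussed, here is a minimal sketch (the class and method names are hypothetical, not the actual Aidge interface): getRaw() returns an opaque handle to the backend's raw storage, getHostPtr() returns a CPU-dereferenceable address, and a typed accessor is built only on top of the latter.

```cpp
#include <cstddef>

// Hypothetical sketch of the interface split discussed above; not Aidge's API.
class TensorImplSketch {
public:
    virtual ~TensorImplSketch() = default;

    // Opaque handle to the backend's storage, in the spirit of
    // torch.Tensor.data_ptr(). It may be a device address and must not be
    // dereferenced from the host.
    virtual void* getRaw() = 0;

    // Host-dereferenceable pointer; a device backend would have to copy or
    // map the data (possibly expensive) before it can return one.
    virtual void* getHostPtr() = 0;
};

// A typed, host-side accessor (what ::operator[](std::size_t) could forward to)
// is only legal on the host pointer, never on the raw handle.
template <typename T>
T getValue(TensorImplSketch& impl, std::size_t idx) {
    return static_cast<T*>(impl.getHostPtr())[idx];
}
```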
AFAIK, there is no such raw pointer for CUDA. A CUDA pointer is only usable by CUDA: if you dereference it from the CPU, it is undefined behavior, and you can be certain that if it does not segfault, you will not get the expected value: https://stackoverflow.com/questions/20607546/dereferencing-pointer-in-cuda-c#:~:text=Fundamental%20CUDA%20rules%20(ignoring%20cuda,device%20pointer%20in%20host%20code
CUDA has made the choice to represent its memory location identifier as a pointer, but it is confusing for the user.
Besides, other targets may return any kind of type to represent a memory location (see, for instance, the handle types in Windows). Yet this pointer cannot be used for reading values, and it seems that the actual implementation cannot prohibit a client from trying to dereference it.
Typically the getter cannot work, as it is implemented, on a CUDA backend.
Reading single values from the host would probably be excessively expensive and acceptable only if this access is seldom.

We could (should) indeed have separate interfaces for host and device memory locations. Yet there is no common type for the device side (it is not necessarily a value stored in a pointer).
See, for instance, what is done with the standard thread API: https://en.cppreference.com/w/cpp/thread/thread/native_handle
Yet it is simpler in that case because there is only one underlying library: the native_handle type can be anything, but it is known at compile time.
Our use case is more complicated: we can return a void * to a memory location handle and, in a translation unit that is aware of the device, cast it to a "handle" * that can be manipulated with device routines.
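A minimal sketch of that pattern, assuming a CUDA backend and hypothetical function names: backend-agnostic code only ever holds the opaque void *, and the cast back to a device-usable pointer happens solely in the CUDA-aware translation unit.

```cpp
#include <cstddef>

// Backend-agnostic declaration (hypothetical): callers pass the opaque handle
// around but never dereference it themselves.
void copyToHost(void* handle, float* dst, std::size_t count);

// CUDA-aware translation unit (compiled against the CUDA runtime): only here
// do we know that the opaque handle is in fact a device pointer.
#include <cuda_runtime.h>

void copyToHost(void* handle, float* dst, std::size_t count) {
    const float* devPtr = static_cast<const float*>(handle);
    cudaMemcpy(dst, devPtr, count * sizeof(float), cudaMemcpyDeviceToHost);
}
```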
Sorry, I was speaking about the ongoing refactoring. Yet, see issue #13 (closed) or #14 (closed): these data may effectively go back to Tensor.
changed this file in version 11 of the diff
conversion functions will be excessively expensive. As the number of dimensions is practically a constant for a given tensor, I propose to preallocate a vector of coordinates of the right size and pass it by reference to the coordinate computation function, rather than returning it. The actual implementation makes reallocations and then a full vector copy. Even if the compiler is smart, it is not that smart.

Actually I'm not sure how to do that properly in order to avoid threading issues, dangling references and so on...

I think that the best solution is to pass a reference to a user-preallocated vector of coordinates to getCoord. It should be documented as an unsafe function: if the input vector size is not the right one, the behavior is undefined.

changed this file in version 11 of the diff
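A sketch of the getCoord signature proposed above (hypothetical free-function form, row-major layout assumed): the caller allocates the coordinate vector once and the function only writes into it, so there is no per-call allocation or copy.

```cpp
#include <cstddef>
#include <vector>

// Convert a flat (row-major) index into per-dimension coordinates, writing
// into a caller-preallocated vector. Unsafe by contract: coords.size() must
// equal dims.size(), otherwise the behavior is undefined.
void getCoord(const std::vector<std::size_t>& dims,
              std::size_t flatIdx,
              std::vector<std::size_t>& coords) {
    for (std::size_t i = dims.size(); i-- > 0;) {
        coords[i] = flatIdx % dims[i];
        flatIdx /= dims[i];
    }
}

// Usage: allocate once, then reuse across calls.
// std::vector<std::size_t> coords(dims.size());
// getCoord(dims, 42, coords);
```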
requested review from @pineapple
added 1 commit
- f8a49175 - [TensorUtils] Add approxEq method to check if two tensors are approximatly equalts.
@pineapple Can you please review f8a49175? Thanks in advance :)
- Resolved by Cyril Moineau
- Resolved by Cyril Moineau
- Resolved by Cyril Moineau
added 1 commit
- 5a104feb - [approxEq] relactive and absolute T -> float.
added 1 commit
- 9cabcb41 - [Scheduler] Update the way progress bar is computed.
added 1 commit
- 5fc900c6 - [approxEq] Check range for relative and absolute parameters.
added 48 commits
- 5fc900c6...44afa3b4 - 47 commits from branch main
- 5946998f - Merge branch 'main' into fuseBN
Hi @pineapple @cmoineau: I am interested in this merge request.
mentioned in merge request !16 (merged)
mentioned in merge request aidge_backend_cpu!6 (merged)
added 1 commit
- 811b7e16 - [Add] FuseMulAdd test and fix typo in test_tensor
added 1 commit
- 4f141900 - [Paramerter] Add exit(-1) to end get recursion.
added 28 commits
- 4f141900...8228ba63 - 27 commits from branch main
- d2d94507 - Merge branch 'main' into fuseBN