
Fuse bn

Merged Cyril Moineau requested to merge fuseBN into main
4 unresolved threads
  • Fix Add + Mul recipes
  • Add fuseBN recipes (see the folding sketch below)
  • Fix parameters issue #8 (closed)
  • Update tensor to get/set values
  • Add approxEq Tensor utils method to check whether two tensors are approximately equal
  • Update the way the scheduler is updated
  • Add unit test for:
    • fuse Add + Mul
    • remove padding
    • Add unit test for fuseBN recipes (TODO in aidge_backend_cpu because of the test's dependency on the backend)
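
For reference, fusing a BatchNorm into the preceding operator amounts to folding the BN statistics into its weights and bias. A minimal standalone sketch of that folding is given below; the function name, parameter names, and flat channel-major weight layout are illustrative assumptions, not the actual FuseBatchNorm recipe.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Fold BatchNorm parameters into the preceding convolution, per output channel c:
//   scale = gamma[c] / sqrt(var[c] + eps)
//   w'[c] = w[c] * scale
//   b'[c] = (b[c] - mean[c]) * scale + beta[c]
// All names and the flat weight layout are assumptions for illustration.
void foldBatchNorm(std::vector<float>& weights,   // outCh * weightsPerChannel, channel-major
                   std::vector<float>& bias,      // outCh
                   const std::vector<float>& gamma,
                   const std::vector<float>& beta,
                   const std::vector<float>& mean,
                   const std::vector<float>& variance,
                   float eps = 1e-5f)
{
    const std::size_t outCh = bias.size();
    const std::size_t weightsPerChannel = weights.size() / outCh;
    for (std::size_t c = 0; c < outCh; ++c) {
        const float scale = gamma[c] / std::sqrt(variance[c] + eps);
        for (std::size_t k = 0; k < weightsPerChannel; ++k) {
            weights[c * weightsPerChannel + k] *= scale;
        }
        bias[c] = (bias[c] - mean[c]) * scale + beta[c];
    }
}
```
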
Edited by Maxence Naud

Merge request reports


Activity

  • laurent soulier @lsoulier started a thread on commit 53f2e8b8
    • Actually, a TensorImpl is required to store a contiguous 1D array of bytes, so retrieving the address of this array is not virtual, nor is accessing a given byte. getRaw is thus not virtual and totally equivalent to the default ::operator[](std::size_t). Unless I missed something, I propose to remove it entirely.

    • getRaw() also runs the lazyInit() functionality and does not return a typed pointer (it returns void*).

    • According to our design meeting on Tensor, I was of the opinion to make lazyInit() an explicit function and not hide it in low-level calls, where it might often be called for nothing, leading to overhead.

    • Making lazyInit explicit can indeed solve the issue of useless calls (even though we still have to solve how to manage Tensors manually created by users). Couldn't getRaw() return a void* pointer and ::operator[](std::size_t) return a typed value?

      Also, as we discussed, getRaw() could return only a device-usable "address"

      What should ::operator[](std::size_t) return? I am not sure yet.

    • ::operator[](std::size_t) is the default array access operator. It can be used only on a properly typed array, not on raw storage nor on a void*.
      Thus it can be used only if getRaw returns a correctly typed address (and, as you say, a standard CPU memory address).

    • Author Maintainer

      When I created getRaw, the goal was NOT to return a standard CPU address.

      The goal is to return a pointer to the real (raw) data, to mimic the PyTorch behavior of https://pytorch.org/docs/stable/generated/torch.Tensor.data_ptr.html.

      This is a useful feature for interoperability, especially with PyTorch CUDA.

      Maybe we should add another method (getHostPtr) to access a CPU memory address.

    • AFAIK, there is no such raw pointer for CUDA. A CUDA pointer is only usable by CUDA; if you dereference it from the CPU, it is undefined behavior, and even if it does not segfault, you will not get the expected value: https://stackoverflow.com/questions/20607546/dereferencing-pointer-in-cuda-c#:~:text=Fundamental%20CUDA%20rules%20(ignoring%20cuda,device%20pointer%20in%20host%20code
      CUDA has made the choice to represent its memory location identifier as a pointer, but it is confusing for users.
      Besides, other targets may return any kind of type to represent a memory location (see, for instance, the handle types in Windows).

    • Author Maintainer

      That is what I tried to say: getRaw should return the memory pointer on the CUDA device.

    • Yet this pointer cannot be used for reading values, and it seems that the actual implementation cannot prevent a client from trying to dereference it.
      Typically the getter cannot work, as it is implemented, on a CUDA backend.
      Reading single values from the host would probably be excessively expensive and acceptable only if this access is rare.

    • Author Maintainer

      The device (GPU) pointer is used to read values using a synchronization method with the host (CPU).

      The goal of getRaw is to get the DEVICE pointer; the (normal) getter should return a HOST pointer.

    • We could (should) indeed have separate interfaces for host and device memory locations. Yet there is no common type for the device side (it is not necessarily a value stored in a pointer).
      See, for instance, what is done with the standard thread API: https://en.cppreference.com/w/cpp/thread/thread/native_handle
      Yet it is simpler in that case because there is only one underlying library, so the native_handle type can be anything but is known at compile time.
      Our use case is more complicated (we can return a void* to a memory location handle and, in a translation unit that is aware of the device, cast it to a "handle"* that can be manipulated with device routines).

    • BTW, wouldn't GetNativeStorageHandle() be less error-prone?
      template<???> auto GetNativeStorageHandle()?
      Possibly not possible like that.

    • Author Maintainer

      This is why getRawPtr returns a void* (see the sketch after this thread).

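A possible shape of the host/device split discussed in this thread is sketched below. The names (hostPtr, deviceHandle) and the explicit lazyInit are assumptions for illustration, not the actual TensorImpl API; the device handle stays an opaque void* that only backend-aware code may interpret.

```cpp
#include <cstddef>

// Illustrative only: a possible host/device accessor split for a tensor
// implementation, with allocation made explicit instead of hidden in getters.
class TensorImplSketch {
public:
    virtual ~TensorImplSketch() = default;

    // Explicit allocation; callers decide when the backing storage is created,
    // avoiding a lazyInit() check on every low-level access.
    virtual void lazyInit(std::size_t nbBytes) = 0;

    // Host-side address, only meaningful for CPU-accessible storage.
    virtual void* hostPtr() = 0;

    // Opaque device-side handle (e.g. a CUDA device pointer). It must not be
    // dereferenced on the host; backend code casts it to whatever it needs.
    virtual void* deviceHandle() = 0;
};
```
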
  • laurent soulier @lsoulier started a thread on an outdated change in commit 53f2e8b8
    • Conversion functions will be excessively expensive. As the number of dimensions is practically a constant for a given tensor, I propose to preallocate a vector of coordinates of the right size and pass it by reference to the coordinate computation function rather than returning it. The actual implementation makes reallocations and then a full vector copy; even if the compiler is smart, it is not that smart.

    • Author Maintainer

      I do not understand your comment; do you want to memoize the getIdx and getCoord methods?

    • Actually I am not sure how to do that properly in order to avoid threading issues, dangling references and so on...
      I think that the best solution is to pass a reference to a user-preallocated vector of coordinates to getCoord (see the sketch after this thread).
      It should be documented as an unsafe function: if the input vector size is not the right one, the behavior is undefined.

    • changed this file in version 11 of the diff

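A minimal sketch of the preallocated-output proposal above (hypothetical free function, not the actual Tensor API): the caller allocates the coordinate vector once and getCoord fills it in place, so no reallocation or copy happens per call.

```cpp
#include <cstddef>
#include <vector>

// Fill `coords` with the coordinates of flat index `idx`, assuming a row-major
// layout. As discussed in the thread, this is "unsafe": behavior is undefined
// if coords.size() != dims.size().
void getCoord(const std::vector<std::size_t>& dims,
              std::size_t idx,
              std::vector<std::size_t>& coords)
{
    for (std::size_t i = dims.size(); i-- > 0;) {
        coords[i] = idx % dims[i];
        idx /= dims[i];
    }
}
```

The caller would allocate `std::vector<std::size_t> coords(dims.size());` once and reuse it across calls.
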
  • Cyril Moineau added 3 commits

    • ade2edaf - [Producer] Add setOutputTensor method.
    • 689a11f5 - [GraphView.replaceWith] Fix issue when replacing a node with parameters.
    • 6682b885 - [Recipies] Add method FuseBatchNorm.

  • Cyril Moineau marked the checklist item Update tensor to get/set values as completed

  • Cyril Moineau marked the checklist item Add fuseBN recipes as completed

  • Cyril Moineau changed the description

  • Maxence Naud added 1 deleted label

  • Maxence Naud requested review from @pineapple

  • Cyril Moineau added 1 commit

    • f8a49175 - [TensorUtils] Add approxEq method to check if two tensors are approximatly equalts.

  • Cyril Moineau changed the description

  • Author Maintainer

    @pineapple Can you please review f8a49175? Thanks in advance :)

  • Cyril Moineau added 1 commit

    • 5a104feb - [approxEq] relactive and absolute T -> float.

  • Cyril Moineau added 1 commit

    • 9cabcb41 - [Scheduler] Update the way progress bar is computed.

  • Cyril Moineau changed the description

  • Cyril Moineau added 1 commit

    • 5fc900c6 - [approxEq] Check range for relative and absolute parameters.
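
    For context, an element-wise comparison with relative and absolute tolerances typically looks like the sketch below (a standalone illustration with assumed names, not the actual approxEq signature); the range check mentioned in commit 5fc900c6 corresponds to rejecting negative tolerance values.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// NumPy-style check: every pair must satisfy |a - b| <= absolute + relative * |b|.
bool approxEqSketch(const std::vector<float>& a, const std::vector<float>& b,
                    float relative = 1e-5f, float absolute = 1e-8f)
{
    if (relative < 0.0f || absolute < 0.0f) { return false; } // parameter range check
    if (a.size() != b.size()) { return false; }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::fabs(a[i] - b[i]) > absolute + relative * std::fabs(b[i])) {
            return false;
        }
    }
    return true;
}
```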

  • Maxence Naud added 48 commits

  • Hi @pineapple @cmoineau: I am interested in this merge request.

  • Maxence Naud added 2 commits

    • a2248696 - [Fix] function getParents() typo to getParent()
    • eb8bda50 - [Fix] coord conversion functions in Tensor

  • Olivier BICHLER mentioned in merge request !16 (merged)

  • mentioned in merge request aidge_backend_cpu!6 (merged)

  • Cyril Moineau marked the checklist item Add unit test for fuseBN recipes (TODO in aidge_backend_cpu because of the test's dependency on the backend) as completed

  • Cyril Moineau marked the checklist item Add unit test for: as completed

  • Maxence Naud added 3 commits

    • e98dd055 - [Upd] Matmul to MatMul
    • 6fd58bac - [Fix] FuseMulAdd python test and fusemuladd function
    • cf344748 - [Fix] MSVC warnings

  • Maxence Naud added 1 commit

    • 811b7e16 - [Add] FuseMulAdd test and fix typo in test_tensor

  • Cyril Moineau added 3 commits

    • 05361ff3 - [GitLabCI] Add numpy dependancy for unittest.
    • c8bc585b - [Paramerter] Add noreturn for MSVC.
    • 2f181c25 - Merge branch 'fuseBN' of gitlab.eclipse.org:eclipse/aidge/aidge_core into fuseBN

  • Cyril Moineau added 1 commit

    • d3dbe179 - [Paramerter] Update get method.

  • Cyril Moineau added 1 commit

    • 4f141900 - [Paramerter] Add exit(-1) to end get recursion.

  • Maxence Naud marked the checklist item Fix Add + Mul recipes as completed

  • Maxence Naud approved this merge request

  • Cyril Moineau added 28 commits

  • Cyril Moineau added 1 commit

  • merged
