Commit b26dbb97 authored by Boris Baldassari's avatar Boris Baldassari

Minor fixes in aice_aura_demonstrator

parent 423fb08e
---
title: "AICE: The Aura Demonstrator"
title: "AICE: The AURA Demonstrator"
date: 2022-01-20T10:00:00-04:00
layout: "single"
footer_class: "footer-darker"
---
## Introduction
This document describes the work done on the AICE Working Group demonstrator with the AURA use case. This project constitutes the first iteration of the AICE OpenLab and intends to demonstrate the benefits of a shared, common platform for collaborative work on AI workflows.
### About the AICE working group
The Eclipse AI, Cloud & Edge (AICE) Working Group is a special interest working group hosted at the Eclipse Foundation, where participants discuss and work on AI, cloud & edge ecosystems to innovate and grow with open source. The aim of the AICE Working Group is to accelerate the adoption of AI, cloud & edge technologies and standards through the provision and operation of a collaborative work and test environment for its participants, engagement with research and innovation initiatives, and the promotion of open source projects to AI, cloud & edge developers.
The AICE OpenLab has been initiated to provide a common shared platform to test, evaluate and demonstrate AI workflows developed by partners. This enables open collaboration and discussion on AI solutions, and fosters portability and standardisation. The AICE OpenLab is currently working on two use cases: AURA, as described in this document, and Eclipse Graphene, a general-purpose scheduler for AI workflows.
More information about AICE:
* AICE Working Group wiki: https://wiki.eclipse.org/AICE_WG/
* Eclipse Graphene / AI4EU Experiments: https://ai4europe.eu.
### About AURA
AURA is a non-profit French organisation that designs and develops a patch to detect epileptic seizures before they happen and warn patients ahead of time for safety purposes. To this end, AURA is creating a multidisciplinary community integrating open source and open hardware philosophies with the health and research worlds. The various partners of the initiative (patients, neurologists, data scientists, designers) each bring their experience and expertise to build an open, science-backed workflow that can actually help end users. In the end, this device could be a life-changer for the 10 million people with drug-resistant epilepsy worldwide.
In this context, our first practical goal was to train the model on a large dataset of EEGs/ECGs from Temple University Hospital (TUH). The TUH dataset is composed of EDF files recording the electrocardiogram signal, along with annotation files that classify the time ranges as either noise or an epileptic seizure. The full dataset has 5,600+ EDF files and as many annotations, representing 692 patients, 1,074 hours of recordings and 3,500+ seizures. Its size on disk is 67 GB.
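To give a concrete idea of what those annotations look like, here is a small sketch of parsing such labelled time ranges. The per-line format (start time, stop time, label, confidence) is an assumption based on the TUH seizure corpus conventions, not the actual AURA ingestion code:

```python
# Hypothetical parser for TUH-style annotation files, where each data line
# reads: <start_seconds> <stop_seconds> <label> <confidence>.
# This is an illustrative sketch, not the actual AURA workflow code.

def parse_annotations(lines):
    """Return (start_s, stop_s, label) tuples, skipping headers and blanks."""
    ranges = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip version headers and empty lines
        try:
            start, stop = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # not a data line
        ranges.append((start, stop, parts[2]))
    return ranges

example = [
    "version = tse_v1.0.0",
    "",
    "0.0000 36.8868 bckg 1.0000",
    "36.8868 183.3055 seiz 1.0000",
]
print(parse_annotations(example))
```

Ranges labelled as seizures can then be matched against the corresponding EDF signal segments when building training sets.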
AI-related research and development activities, even if they rely on smaller datasets during the early stages of the setup, require a more complete dataset for the fine-tuning and exploitation of the model. The TUH database was not used with the previous AURA workflow, as its full execution would take more than 20 hours on the developers' computers. Executions often failed because of wrong input data, and switching to more powerful computers was difficult because of the complex setup.
In this context, the established objectives of the project were to:
* Propose a proper, industrial-grade process to open up and improve collaboration on the work done in the lab:
* Provide up-to-date documentation and passing tests.
* Set up a process to automatically build the Docker images to allow multiple execution methods: Airflow/pure python, Docker/Compose, Kubernetes or Eclipse Graphene.
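The per-step image builds mentioned above could be scripted along these lines. This is a minimal sketch: the step names, registry and tag scheme are hypothetical placeholders, not AURA's actual build scripts:

```python
# Illustrative sketch of building and pushing one Docker image per
# workflow step. Step names and the registry are assumptions.

STEPS = ["import", "clean", "extract-features", "train"]  # hypothetical steps

def build_commands(registry="example.org/aura", tag="latest"):
    """Return the docker build/push command lines for every workflow step."""
    cmds = []
    for step in STEPS:
        image = f"{registry}/{step}:{tag}"
        # each step is assumed to live in its own directory with a Dockerfile
        cmds.append(["docker", "build", "-t", image, f"./{step}"])
        cmds.append(["docker", "push", image])
    return cmds

if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
        # e.g. subprocess.run(cmd, check=True) to actually build and push
```

Running such a script in CI keeps the images in the registry in sync with the repository, which is what makes the Docker/Compose, Kubernetes and Eclipse Graphene execution methods possible from a single source.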
Building upon the current resources in use at AURA for AI workflows, the following directory structure was adopted:
```
├── data => Data samples for tests
```
We defined and enforced a contributing guide, making tests and documentation mandatory in the repository. We also set up a Travis job to execute the full Python test suite at every commit, and made its status visible through a badge in the repository's README. We adopted a simple Git workflow to maintain a clean branching structure. The newly agreed development process helped clean up the repository: each time a set of scripts was added, we knew exactly where they should go and how to reuse them in the overall workflow.
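A Travis configuration along these lines would run the test suite on every commit. This is a hedged sketch; the project's actual CI file and test commands may differ:

```yaml
# Hypothetical .travis.yml sketch, not the project's actual CI file.
language: python
python:
  - "3.8"
install:
  - pip install -r requirements.txt
script:
  - pytest            # run the full test suite on every commit
```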
### Portability: Building the AURA Containers
One key aspect of the work achieved was to make the AI workflow easy to run anywhere, from the researchers' computers to our Kubernetes cluster. This implies having a set of scripts and resources to automatically build a Docker image for each identified step of the process. On top of drastically improving portability, it also means that the very same workflow can be reproduced identically on different datasets.
### Portability and deployment
Once the new Docker images are built and pushed to a Docker registry, they can be pulled from any computer in order to run the full workflow without any local installation or specific knowledge. The provided docker-compose files automatically pull the required images and execute the full workflow on any dataset and on any host. An example is provided to fine-tune and run the multi-container setup easily. For the teams at AURA, it means they can now run their workflows on any type of hosting provided by their partners.
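Such a compose file might look as follows. The service and image names here are illustrative assumptions, not the actual files shipped with the project:

```yaml
# Illustrative docker-compose sketch; service and image names are assumptions.
services:
  extract-features:
    image: example.org/aura/extract-features:latest
    volumes:
      - ./data:/data        # dataset mounted from the host
  train:
    image: example.org/aura/train:latest
    depends_on:
      - extract-features
    volumes:
      - ./data:/data
```

Pointing the mounted volume at a different directory is all it takes to run the same workflow on another dataset or another host.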
We also installed a fresh instance of AI4EU Experiments on our dedicated hardware for the onboarding of the models, and plan to make stable, verified images available on the marketplace in the upcoming months.
## Conclusion
It has been a fantastic collaborative effort, building upon the expertise of the AURA data scientists and AICE MLOps practitioners to deliver exciting and pragmatic outcomes. The result is a set of optimised, reliable processes, with new perspectives and possibilities, and better confidence in the developed pipeline. All actors learned a lot, and the lessons of this work will be replicated in forthcoming projects in both teams.
Besides the team benefits, the project itself hugely benefited from the various improvements and optimisations. It is now very easy to run the full stack on different datasets for development, and the new container deployment method will be extended to partners and healthcare centres (L'Institut La Teppe).