The epileptic seizure is the cornerstone for the management of the disease by health professionals. A precise mapping of seizures in daily life is a way to better qualify the effectiveness of treatments and care. This is a first step towards the forecasting of epileptic seizures, which would allow people to gain better control over their epilepsy and regain autonomy in their daily lives.
There are a myriad of different forms and origins of epilepsy and epileptic seizures. The symptoms and physical signs differ widely from one patient to another: research has been conducted on electroencephalograms (EEGs), electrocardiograms (ECGs), movement detection, electrodermal activity, and even using dogs. As a result it is impossible — as of today at least — to devise a general-purpose diagnostic or prediction method.
However, machine learning (ML) methods have been used extensively in recent years to build viable seizure detection and forecasting, and to tackle the variability across patients. Neurophysiologists use a visualisation tool like Grafana to enter annotations, defining time ranges as either normal activity (noise) or epileptic activity (seizures).
These annotations are used as a reference dataset for training various ML models. Available datasets are usually split into one part for training and another for verifying the trained model. A typical workflow is then to predict epileptic seizures from an ECG signal and check whether the human annotations confirm the seizure.
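As an illustration, here is a minimal sketch of that split-and-verify pattern. The synthetic data and the random forest classifier are stand-ins for the real features and model, which are not described in this article; in practice `X` would hold features computed from the ECG and `y` the labels derived from the human annotations.

```python
# Minimal sketch of the split-and-verify workflow, with synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Stand-in for per-window ECG features (X) and annotation labels (y),
# where 0 = normal activity (noise) and 1 = epileptic seizure.
X, y = make_classification(n_samples=2000, n_features=12, weights=[0.9, 0.1],
                           random_state=42)

# One part for training, another one to verify the trained model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Check the predicted seizures against the held-out reference annotations.
print(classification_report(y_test, model.predict(X_test),
                            target_names=["noise", "seizure"]))
```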
More information on epileptic seizure detection:
* Methods for seizure detection: https://en.aura.healthcare/analyse-des-donn%C3%A9es
* Seizure dogs: https://www.epilepsy.com/living-epilepsy/seizure-first-aid-and-safety/seizure-dogs
### Existing workflow
We started from an existing workflow previously designed by the AURA organisation. As input it uses:
* the raw signal data stored in the European Data Format (EDF, see https://www.edfplus.info/),
* the annotations stored in `.tse_bi` files, with a 1-to-1 association with the EDF signal files (the sketch below illustrates this pairing).
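To make that input layout concrete, here is a small sketch that pairs each EDF recording with its annotation file and reads the annotated time ranges. The line format shown for `.tse_bi` (one `start stop label confidence` entry per range, with `bckg`/`seiz` labels) follows the TUH convention, and the local directory layout is an assumption.

```python
# Sketch: pair each EDF recording with its .tse_bi annotation file and read
# the annotated time ranges. The bckg/seiz labels and the line format follow
# the TUH convention; the data/tuh layout is an assumption.
from pathlib import Path

def load_annotations(tse_bi_path: Path):
    """Return (start_s, stop_s, label) tuples from a .tse_bi file."""
    ranges = []
    for line in tse_bi_path.read_text().splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[2] in ("bckg", "seiz"):
            start, stop, label, _confidence = parts
            ranges.append((float(start), float(stop), label))
    return ranges

dataset_dir = Path("data/tuh")
for edf_file in dataset_dir.rglob("*.edf"):
    annotation_file = edf_file.with_suffix(".tse_bi")  # 1-to-1 association
    if annotation_file.exists():
        print(edf_file.name, load_annotations(annotation_file)[:3])
```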
Python scripts are used to generate the Seizure Detection Model as described by the following schema:
{{< grid/div isMarkdown="false" >}}
<img src="/images/articles/aice_aura_demonstrator/ecg_workflow.png" alt="The AURA AI process - before" class="img-responsive">
{{</ grid/div >}}
## Objectives of the project
In this context, our first practical goal was to train the model on a large dataset of EEGs/ECGs from Temple University Hospital (TUH). The TUH dataset is composed of EDF files recording the electrocardiogram signal, along with their annotation files that classify the time ranges as either noise or as an epileptic seizure. The full dataset has 5600+ EDF files and as many annotations, representing 692 patients, 1074 hours of recordings and 3500+ seizures. Its size on disk is 67GB.
AI-related research and development activities, even if they rely on smaller datasets in the early stages of the setup, require a more complete dataset when it comes to fine-tuning and exploiting the model. The TUH database was not used with the previous AURA workflow, as its full execution would take more than 20 hours on the developers' computers. Executions often failed because of wrong input data, and switching to more powerful computers was difficult because of the complex setup.
The established objectives of the project were to:
* Propose a proper, industrial-like process to open up and improve collaboration on the work done in the lab:
* improve sustainability for better collaborative work -- both in and outside of the lab,
* improve reliability regarding missing/incomplete data,
More information:
* Temple university dataset homepage: https://isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tusz
* Temple university dataset reference: Obeid I., Picone J. (2016). The Temple University Hospital EEG Data Corpus. Front. Neurosci. 10:196. doi:10.3389/fnins.2016.00196
## Areas of improvement
We identified four areas of improvement:
* Portability
* Performance
* Visualisation
* Industrialisation
### Portability: Building the AURA Containers
One key aspect of the work achieved was to make the AI workflow easy to run anywhere, from the researchers' computers to our Kubernetes cluster. This implies having a set of scripts and resources to automatically build a Docker image for each identified step of the process. On top of drastically improving portability, it also means that the very same workflow can be reproduced identically on different datasets.
We developed three Docker images to easily execute the full workflow or specific steps:
* Simple direct Python execution, using either the command line or a workflow orchestration tool like Airflow.
* Simple Docker images, executed independently with: \
`docker run -v $(pwd)/data/:/data bbaldassari/aura_dataprep bash /aura/scripts/run_bash_pipeline.sh`
Another step was to refactor the scripts to identify and remove performance bottlenecks. Things that work well on a small dataset can become unusable on a larger scale. By running the workflow on larger datasets, up to thousands of files (i.e. the TUH dataset), we encountered unexpected cases and fixed them along the way. We now have a set of scripts that (1) can run on the entire TUH dataset (67GB) without major issues, and (2) is compatible with the two data formats most used by the AURA researchers: TUH and La Teppe.
The performance gain enabled us to run more precise and resource-consuming operations in order to refine the training. For example, we reduced the length of the sliding window used when computing the rr-intervals from 9 seconds to 1 second, which generates a substantial amount of extra computation while seriously improving the predictions from the ML training.
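As a rough illustration of what such a sliding-window computation looks like, here is a small sketch. The feature set (mean RR and RMSSD), the window step and the synthetic R-peaks are illustrative only, not the actual AURA implementation.

```python
# Sketch: sliding-window features over rr-intervals. window_s is the window
# length discussed above (9 s reduced to 1 s); the step and the feature set
# are illustrative, not the exact AURA pipeline.
import numpy as np

def window_features(r_peak_times_s, window_s=1.0, step_s=1.0):
    r_peak_times_s = np.asarray(r_peak_times_s)
    features = []
    t = r_peak_times_s[0]
    while t + window_s <= r_peak_times_s[-1]:
        in_window = r_peak_times_s[(r_peak_times_s >= t) &
                                   (r_peak_times_s < t + window_s)]
        rr = np.diff(in_window)  # rr-intervals (seconds) within the window
        if rr.size:
            features.append({
                "t_start": float(t),
                "mean_rr": float(rr.mean()),
                "rmssd": float(np.sqrt(np.mean(np.diff(rr) ** 2))) if rr.size > 1 else 0.0,
            })
        t += step_s
    return features

# Synthetic R-peaks roughly every 0.8 s over one minute.
peaks = np.cumsum(np.full(75, 0.8))
print(window_features(peaks, window_s=1.0)[:3])
```

A shorter window produces many more windows over the same recording, which is where the extra computation (and the finer-grained predictions) comes from.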
We identified atomic steps that could be executed independently and built them as parallel execution jobs. As an example, the cleaning and preparation of data files can be executed simultaneously on different directories to accelerate the overall step. By partitioning the dataset into subsets of roughly 10GB and running 6 data preparation containers concurrently, we went down from almost 17 hours to 4 hours on the same host, as the sketch below illustrates.
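Below is one possible way to launch those concurrent data preparation containers from Python, reusing the `bbaldassari/aura_dataprep` image and pipeline script shown earlier; the `data/part_*` directory layout for the roughly 10GB subsets is an assumption.

```python
# Sketch: run the data preparation container concurrently on pre-partitioned
# subsets of the dataset. Image and script are those shown earlier in the
# article; the data/part_0 ... data/part_5 layout is an assumption.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def prepare_partition(part_dir: Path):
    subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{part_dir.resolve()}:/data",
         "bbaldassari/aura_dataprep",
         "bash", "/aura/scripts/run_bash_pipeline.sh"],
        check=True,
    )

partitions = sorted(Path("data").glob("part_*"))
# Six containers in parallel, as in the 17h -> 4h measurement above.
with ThreadPoolExecutor(max_workers=6) as pool:
    list(pool.map(prepare_partition, partitions))
```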
Also, being able to run the process anywhere meant we could execute it on several hardware configurations with different capabilities. This allowed us to check (and fix) portability while getting a better understanding of the resource requirements of each step. The following plot shows the evolution of performance in various situations:
An example of rr-intervals with the associated annotations (blue/red bottom line):
![ECG and annotations](/images/articles/aice_aura_demonstrator/ecg_annotations.png)
The process of importing the rr-intervals and annotations is time- and resource-consuming, so we decided to apply the same guidelines as for the training workflow and built a dedicated container for the mass import of ECG signals with their annotations. By partitioning the dataset and setting up multiple containers we are able to run several import threads in parallel, thus massively improving the overall performance of the import. This enabled us to:
* execute the import on a powerful machine thanks to the container's portability, and
* drastically reduce the import time thanks to the parallel runs.
It is also very important to visually interpret and discuss the outcomes of the AI-based seizure detector with healthcare professionals, in order to build trust and assess the limitations of the algorithm. Having an easy way to import ECGs so as to visualise and annotate them is a major benefit in this context, especially in healthcare centers where teams do not always have the resources and knowledge to set up a complex software stack. We are now working on a database dump that will enable end users to import specific datasets into their own Postgres / Grafana instance in a few clicks, thus fostering the usage of and research on open datasets.
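As a hint of what such an import looks like on the database side, here is a hedged sketch of bulk-inserting rr-intervals and their labels into Postgres for visualisation in Grafana. The table name, columns and connection parameters are hypothetical; the actual AURA import container defines its own schema.

```python
# Sketch: bulk-insert rr-intervals and their annotation labels into Postgres
# so they can be charted in Grafana. Table, columns and credentials are
# hypothetical placeholders, not the schema used by the AURA import container.
import psycopg2
from psycopg2.extras import execute_values

rows = [
    # (timestamp, rr interval in milliseconds, label from the annotations)
    ("2021-01-01T00:00:00Z", 812, "bckg"),
    ("2021-01-01T00:00:01Z", 790, "seiz"),
]

conn = psycopg2.connect("dbname=aura user=aura password=secret host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS rr_intervals (
            ts     TIMESTAMPTZ,
            rr_ms  INTEGER,
            label  TEXT
        )""")
    execute_values(cur,
                   "INSERT INTO rr_intervals (ts, rr_ms, label) VALUES %s",
                   rows)
conn.close()
```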
### Industrialisation: Cleaning/Refactoring of the repository
The work done by the AURA researchers and data scientists on ECGs had been organised across a number of GitHub repositories, with different people using different tools and structures. The first step was to identify the parts required to run the complete workflow, and extract them from the various repositories and branches to build a unified structure. The requirements of this repository structure are:
We defined and enforced a contributing guide, making tests and documentation mandatory in the repository. We also set up a Travis job to execute the full Python test suite at every commit, and made it visible through a badge in the repository's README. Regarding Git, we used a simple workflow to maintain a clean branching structure. The newly agreed development process definitely helped clean up the repository: each time a set of scripts was added, we knew exactly where they should go and how to reuse them in the overall workflow.
## Benefits
### Industrialisation of the solution
The new repository has a sound and clean structure, with passing tests, complete documentation to exploit and run the various steps, and everything needed for further developments. All scripts are stored under the src/ directory and are copied into the Docker images during the build, thus always relying on a single source of tested truth.
Furthermore, the automatic building of containers for multiple execution targets (Airflow, Docker, Kubernetes) can easily be reproduced. As a result the new, improved structure will be reused and is set to become the reference implementation for the next developments.
### Portability and deployment
Once the new Docker images are built and pushed to a Docker registry, they can be pulled from any computer in order to run the full workflow without any local install or specific knowledge. The provided docker-compose files will automatically pull the required images and execute the full workflow on any dataset and on any host. An example is provided to fine-tune and run the multi-container setup easily. For the teams at AURA, it means they can now run their workflows on any type of hosting provided by their partners.
We also installed a fresh instance of AI4EU Experiments on our dedicated hardware for the onboarding of the models, and plan to make stable, verified images available on the marketplace in the upcoming months.
### Better performance
The major performance gain was achieved by setting up dedicated containers to run atomic tasks (e.g. data preparation, visualisation imports) in parallel. Most computers, both in the lab and on high-end execution platforms, have multiple threads and enough memory to manage several containers simultaneously, and we need to take advantage of the full computing power available. Another major gain was obviously to run the process on a more powerful system, with enough memory, CPUs and disk throughput.
All things considered, we were able to bring the full execution time on the TUH dataset down from 17 hours on the lab's laptop to roughly 4 hours in our cluster.
### Visualisation
## Conclusion
It has been a fantastic collaborative effort, building upon the expertise of the AURA data scientists and the AICE MLOps practitioners to deliver exciting and pragmatic outcomes. The result is a set of optimised, reliable processes, with new perspectives and possibilities, and better confidence in the developed pipeline. All actors learned a lot, and the outcomes of this work will be carried over into forthcoming projects in both teams.
Besides the team benefits, the project itself hugely benefited from the various improvements and optimisations. It is now very easy to run the full stack on different datasets for development, and the new container deployment method will be extended to partners and healthcare centers (L'Institut La Teppe).
We identified a few areas of improvement, though. One aspect we lacked in this experience was a precise benchmarking process and framework for the various steps at each optimisation round. We are currently working on a monitoring solution based on Prometheus, Node exporter and Grafana to address this, and we will soon publish a more detailed report on the performance gains.