Commit 7efd8d97 authored by Boris Baldassari

Fix aura article.

parent 08cd9011
Pipeline #3576 passed
@@ -67,7 +67,7 @@ More information:
## Objectives of the project
In this context, our first practical goal was to train the model on a large dataset of ECGs from the Temple University Hospital (TUH). The TUH dataset is composed of EDF files recording the electrocardiogram signal, along with their annotation files that classify the time ranges as either noise or as an epileptic seizure. The full dataset has 5600+ EDF files and as many annotations, representing 692 patients, 1074 hours of recordings and 3500+ seizures. Its size on disk is 67 GB.
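To give a concrete idea of what handling one recording involves, here is a minimal sketch in Python. It assumes the MNE library for reading EDF files and a simple plain-text annotation layout of `start stop label` per line; both are illustrative assumptions rather than the project's actual tooling or the exact TUH annotation format.

```python
# Sketch: load one EDF recording and its annotation file.
# Assumptions: MNE-Python is available, file names are placeholders, and
# annotations are plain-text lines "<start_s> <stop_s> <label>".
import mne

def load_recording(edf_path, ann_path):
    # Read the EDF signal; preload=True pulls the samples into memory.
    raw = mne.io.read_raw_edf(edf_path, preload=True, verbose="error")
    print(f"{edf_path}: {len(raw.ch_names)} channels, "
          f"{raw.n_times / raw.info['sfreq']:.0f} s at {raw.info['sfreq']} Hz")

    # Parse the annotation file into (start, stop, label) tuples.
    events = []
    with open(ann_path) as fh:
        for line in fh:
            parts = line.split()
            if len(parts) < 3:
                continue
            try:
                start, stop = float(parts[0]), float(parts[1])
            except ValueError:
                continue  # skip headers or comments
            events.append((start, stop, parts[2]))
    return raw, events

# Placeholder file names, not actual TUH paths.
raw, events = load_recording("recording.edf", "recording.tse")
seizures = [e for e in events if e[2] not in ("bckg", "noise")]
print(f"{len(seizures)} annotated seizure segment(s)")
```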
AI-related research and development activities, even if they rely on smaller datasets in the early stages of the setup, require a more complete dataset when it comes to the fine-tuning and exploitation of the model. The TUH database was not used often with the previous AURA workflow, as its full execution would take more than 20 hours on the developers' computers. Executions often failed because of invalid input data, and switching to more powerful computers was difficult because of the complex setup.
@@ -130,7 +130,7 @@ Another step was to refactor the scripts to identify and remove performance bott
The performance gain enabled us to run more precise and resource-consuming operations in order to refine the training. For example, we reduced the length of the sliding window used to compute the RR-intervals from 9 seconds to 1 second, which generates a substantial amount of additional computation while significantly improving the predictions of the ML training.
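The idea of windowing the RR-intervals can be illustrated with the small sketch below; the R-peak timestamps, the window parameters and the aggregated features are placeholders chosen for the example, not the project's actual feature set.

```python
# Sketch: aggregate RR-intervals over a sliding window.
# The 1-second window (instead of the previous 9 seconds) and the
# mean/std features are illustrative assumptions.
import numpy as np

def rr_features(r_peaks, window_s=1.0, step_s=1.0):
    r_peaks = np.asarray(r_peaks, dtype=float)
    rr = np.diff(r_peaks)          # RR-intervals in seconds
    rr_times = r_peaks[1:]         # timestamp of each interval
    features = []
    t = rr_times[0]
    while t + window_s <= rr_times[-1]:
        mask = (rr_times >= t) & (rr_times < t + window_s)
        if mask.any():
            w = rr[mask]
            features.append((t, w.mean(), w.std(), len(w)))
        t += step_s
    return features

# Shorter windows produce many more feature rows, hence the extra
# computations mentioned above.
print(rr_features([0.0, 0.8, 1.6, 2.5, 3.3, 4.0, 4.9, 5.7])[:3])
```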
We identified atomic steps that could be executed independently and built them as parallel execution jobs. As an example, the cleaning and preparation of the data files can be executed simultaneously on different directories to accelerate the overall step. By partitioning the dataset into subsets of roughly 10 GB and running 6 data preparation containers concurrently, we went down from almost 17 hours to 4 hours on the same reference host.
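A minimal sketch of such a parallel launch is shown below; the container image name, the mount points and the partition layout are hypothetical placeholders, and the real workflow orchestration may differ.

```python
# Sketch: run one data-preparation container per dataset partition,
# at most 6 at a time. "aura/data-prep", the paths and the CLI flags
# are hypothetical placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

PARTITIONS = [f"/data/tuh/part_{i:02d}" for i in range(12)]  # ~10 GB each

def run_prep(partition):
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{partition}:/input:ro",
        "-v", "/data/prepared:/output",
        "aura/data-prep",            # hypothetical image name
        "--input", "/input", "--output", "/output",
    ]
    return subprocess.run(cmd, check=False).returncode

with ThreadPoolExecutor(max_workers=6) as pool:
    codes = list(pool.map(run_prep, PARTITIONS))

failed = [p for p, rc in zip(PARTITIONS, codes) if rc != 0]
print(f"{len(PARTITIONS) - len(failed)} partitions prepared, {len(failed)} failed")
```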
{{< grid/div isMarkdown="false" >}}
<img src="/images/articles/aice_aura_demonstrator/aura_process_multi.png" alt="The AURA AI process" class="img-responsive">
@@ -142,13 +142,13 @@ Also by being able to run the process everywhere, we could execute it on several
* A high-range server (label: SDIA), HDD disks and 2 x Xeon (48 threads).
* With a single container for data preparation vs. multiple containers executed in parallel (label: Mono / Multi).
We could identify different behaviours regarding performance. The data preparation step relies heavily on I/O, and improving the disk throughput (e.g. an NVMe SSD instead of a classic HDD) brings a 30% gain. The ML training, on the other hand, is very CPU- and memory-intensive, and running it on a node with a large number of threads (48 in our case) brings a stunning 10x performance improvement compared to a laptop equipped with an Intel i7. The following plot shows the evolution of performance in these various situations:
{{< grid/div isMarkdown="false" >}}
<img src="/images/articles/aice_aura_demonstrator/benchmark_perf.png" alt="Execution time benchmark" class="img-responsive">
{{</ grid/div >}}
### Visualisation process