Performance
===========

Overview
--------

`mepylome` is designed to be a highly efficient tool for methylation data
analysis, significantly outperforming tools such as `minfi` and `conumee2.0`
in both speed and memory consumption. This section summarizes its performance
metrics and explains the mechanisms that contribute to its efficiency.

Performance Comparison
-----------------------

`mepylome` vs. `minfi` and `conumee2.0`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- **Import Time**: Importing `minfi` or `conumee2.0` takes from several
  seconds up to a minute, whereas `mepylome` imports in a fraction of a
  second.
- **Data Extraction and CNV Analysis**: `mepylome` performs data extraction
  and CNV (Copy Number Variation) analysis several times faster than `minfi`
  and `conumee2.0`.
- **Overall Speed**: On average, `mepylome` is >60 times faster than `minfi`
  and >10 times faster than `conumee2.0`.
- **Memory Consumption**: `mepylome` uses >3 times less memory than `minfi`
  (data extraction) and `conumee2.0` (CNV analysis).

Performance Test
~~~~~~~~~~~~~~~~

| Tested on: 12th Gen Intel Core i5-12500, 12 cores at 3.0 GHz, 2 threads per core, 250 GB SSD, 96 GB RAM, Ubuntu 22.04 LTS
| Preprocessing method: `illumina`
| No. of cases: 250 cases for data extraction, 20 reference cases for CNV analysis
|
| 1. **Import Time**:
|     - `mepylome`: **0.73** seconds
|     - `minfi`: 10.51 seconds
|     - `conumee2.0`: 39.48 seconds
| 2. **Methylation Data Extraction Per Case**:
|     - `mepylome`: **0.06** seconds
|     - `minfi`: 3.73 seconds
| 3. **CNV Analysis Per Case**:
|     - `mepylome`: **2.08** seconds
|     - `conumee2.0`: 20.44 seconds
| 4. **Memory Consumption Data Extraction**:
|     - `mepylome`: **0.39** GB
|     - `minfi`: 1.49 GB
| 5. **Memory Consumption CNV Analysis**:
|     - `mepylome`: **2.29** GB
|     - `conumee2.0`: 7.78 GB

How It Works
------------

Caching Mechanism
~~~~~~~~~~~~~~~~~

`mepylome` employs an efficient caching mechanism to enhance performance.
The first operation in a session might take slightly longer due to the initial
setup and data loading. However, subsequent operations are significantly
faster as the data is cached in memory.

Utilization of the `numpy` Package
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`mepylome` extensively uses the `numpy` package for numerical computations.
`numpy` is known for its speed and efficiency, particularly with large arrays
and matrices. This contributes significantly to the performance gains observed
in `mepylome`.

Lazy Loading
~~~~~~~~~~~~

The package employs lazy loading techniques to delay the loading of data until
it is actually needed. This reduces the initial load time and memory usage.

Parallel Processing
~~~~~~~~~~~~~~~~~~~

`mepylome` leverages parallel processing where applicable, distributing tasks
across multiple CPU cores to speed up computation.
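
The four techniques described in this section can be illustrated with a small,
generic sketch. It is not taken from the `mepylome` source code: the names
``load_annotation``, ``Sample``, ``mean_ratio`` and ``analyze`` are
hypothetical, and random arrays stand in for real annotation and intensity
data. It only shows how in-memory caching, lazy loading, vectorized `numpy`
arithmetic and a worker pool typically fit together in Python.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import cached_property, lru_cache

import numpy as np

N_PROBES = 100_000  # arbitrary size chosen for this illustration


@lru_cache(maxsize=None)
def load_annotation(array_type):
    """Hypothetical expensive setup step; the result is cached in memory."""
    # Stand-in for parsing a large annotation file for `array_type`.
    rng = np.random.default_rng(0)
    return rng.random(N_PROBES)


class Sample:
    """Hypothetical sample whose data is generated only on first access."""

    def __init__(self, seed):
        self.seed = seed

    @cached_property
    def intensities(self):
        # Lazy loading: runs on first access, then the array is reused.
        rng = np.random.default_rng(self.seed)
        return rng.random(N_PROBES)

    def mean_ratio(self):
        # Vectorized numpy arithmetic instead of a Python-level loop.
        reference = load_annotation("450k")
        ratio = self.intensities / (self.intensities + reference + 1.0)
        return float(np.mean(ratio))


def analyze(seeds, workers=4):
    """Process several samples concurrently with a pool of workers."""
    samples = [Sample(seed) for seed in seeds]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(Sample.mean_ratio, samples))


if __name__ == "__main__":
    print(analyze(range(4)))
```

A thread pool is used here only to keep the sketch short and portable. For
workloads dominated by pure Python code, a process pool would likely be the
better fit for spreading work across CPU cores.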