Compares two ONT sequencing samples for differential signals caused by mutations and modifications.
If you find a bug, please add it to the issues on GitHub with a detailed description.
To install Magnipore, we recommend using Conda. Magnipore is available for linux-64 and osx-64.
conda install mamba
mamba create -n magnipore -c jannessp magnipore
conda activate magnipore
If you want to basecall your ONT data, you also need a Guppy version from Oxford Nanopore Technologies.
Magnipore is a tool written in Python 3 to analyze and compare, pairwise, samples from Oxford Nanopore Technologies (ONT) sequencing.
Magnipore compares two ONT samples at the signal level to find differential signals between them at single-base resolution. Such differences are caused by mutations or modifications. Magnipore classifies these differences and provides the user with a position-wise comparison.
Magnipore depends on/requires other tools to preprocess and analyze the data.
Conda dependencies:
- python (>=3.8,<3.11)
- h5py>=3.7
- biopython>=1.80
- mafft>=7.508
- matplotlib>=3.6.2
- numpy>=1.23
- scipy>=1.9
- winnowmap>=2.0
- pandas>=1.5
- seaborn>=0.12
- psutil>=5.9
- hdf5plugin>=3.3.1
- ontvbzhdf_plugin>=1.0.1
- pytest>=7.1
- gzip>=1.12
- f5c>=1.2
- read5>=1.2.0
For each sample in the comparison, Magnipore takes:
- (FASTA) exactly ONE reference sequence
- (FAST5) the raw sequencing data from ONT
- (optional FASTQ) basecalls, if you do not have the Guppy binary or do not want to basecall the raw ONT data (again)
If you are not using the Conda package, replace "magnipore" with "python3 magnipore.py".
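Because each sample needs exactly one reference sequence, it can help to check the FASTA file before starting a run. The following is a minimal sketch, not part of Magnipore: the helper names `count_fasta_records` and `has_single_reference` are hypothetical, and the check assumes a plain-text (uncompressed) FASTA file in which every record starts with a `>` header line.

```python
def count_fasta_records(fasta_path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(fasta_path, "r", encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def has_single_reference(fasta_path):
    """Return True if the file holds exactly one reference sequence."""
    return count_fasta_records(fasta_path) == 1
```

Calling `has_single_reference("reference_first_sample.fa")` on each reference (the file name here is only a placeholder) gives a quick sanity check before launching the comparison.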
magnipore raw_data_first_sample reference_first_sample label_first_sample raw_data_sec_sample reference_sec_sample label_sec_sample working_dir --basecalls_first_sample basecalls_first_sample --basecalls_sec_sample basecalls_sec_sample
magnipore raw_data_first_sample reference_first_sample label_first_sample raw_data_sec_sample reference_sec_sample label_sec_sample working_dir --guppy_bin PATH --guppy_model PATH
magnipore --basecalls_first_sample basecalls_first_sample --basecalls_sec_sample basecalls_sec_sample raw_data_first_sample reference_first_sample label_first_sample raw_data_sec_sample reference_sec_sample label_sec_sample working_dir
Using the same reference sequence for both samples results in no reported mutations. Magnipore will only report potential modifications in this case. If you assume there are mutations between the samples, try to provide different reference sequences containing these mutations.
Complete help messages can be found here!
Use either the basecalling arguments or provide basecalls:
- basecalling arguments:
  - guppy_bin : Path to guppy binary
  - guppy_model : Path to guppy model used for basecalling
  - (optional) guppy_device : Device used for basecalling (cpu or gpu cuda:0)
- provided basecalls (FASTQ):
  - basecalls_first_sample : Path
  - basecalls_sec_sample : Path
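When Magnipore is called from a script, the two modes described above can be assembled into an argument list. The sketch below is an illustration under assumptions, not Magnipore's own code: `build_magnipore_cmd` is a hypothetical helper, it uses only the positional order and the flags shown in the usage examples (`--basecalls_first_sample`, `--basecalls_sec_sample`, `--guppy_bin`, `--guppy_model`), and it accepts exactly one of the two modes, which is a simplification. Magnipore itself may accept other combinations.

```python
def build_magnipore_cmd(raw_first, ref_first, label_first,
                        raw_sec, ref_sec, label_sec, working_dir,
                        basecalls_first=None, basecalls_sec=None,
                        guppy_bin=None, guppy_model=None):
    """Build a magnipore argument list for one of the two basecalling modes."""
    cmd = ["magnipore", raw_first, ref_first, label_first,
           raw_sec, ref_sec, label_sec, working_dir]

    have_basecalls = basecalls_first is not None and basecalls_sec is not None
    have_guppy = guppy_bin is not None and guppy_model is not None

    # Assumption of this sketch: exactly one complete mode must be given.
    if have_basecalls == have_guppy:
        raise ValueError("provide either both basecall files or guppy_bin and guppy_model")

    if have_basecalls:
        cmd += ["--basecalls_first_sample", basecalls_first,
                "--basecalls_sec_sample", basecalls_sec]
    else:
        cmd += ["--guppy_bin", guppy_bin, "--guppy_model", guppy_model]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list instead of a shell string avoids quoting problems with paths that contain spaces.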
The .magnipore file is a TSV containing the following columns.
The same columns are reported for the second sample:
- ref2, pos2, base2, motif2, signalmean2, signalstd2, ndatapoints2, containeddatapoints2, nsegments2, containedsegments2, nreads2
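Since the .magnipore file is a TSV, it can be read with any tabular reader. The following is a minimal sketch using only the Python standard library. The helper names `read_magnipore_tsv` and `filter_min_value` are hypothetical, the file is assumed to start with a header row, and any column name passed in must match the header of your actual file, which may be spelled differently from the names listed above.

```python
import csv


def read_magnipore_tsv(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def filter_min_value(rows, column, minimum):
    """Keep rows whose numeric value in `column` is at least `minimum`."""
    return [row for row in rows if float(row[column]) >= minimum]
```

For example, `filter_min_value(rows, "nreads2", 10)` would keep positions covered by at least 10 reads in the second sample, assuming the header really contains a column with that name.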
Errors of first sample:
- 119: Cannot basecall .slow5/.blow5 with guppy
- 120: Could not find raw data or unknown file format
- 121: Guppy basecalling failed
- 122: mapping failed
- 123: Samtools indexing failed
- 124: f5c index failed
- 125: f5c eventalign failed
- 126: Could not find provided fastq files
Errors of second sample:
- 219: Cannot basecall .slow5/.blow5 with guppy
- 220: Could not find raw data or unknown file format
- 221: Guppy basecalling failed
- 222: mapping failed
- 223: Samtools indexing failed
- 224: f5c index failed
- 225: f5c eventalign failed
- 226: Could not find provided fastq files
- 227: f5c eventalign file is empty
The -e parameter of nanosherlock specifies the leading number of the error code. The default is 0.
- 019: Cannot basecall .slow5/.blow5 with guppy
- 020: Could not find raw data or unknown file format
- 021: Guppy basecalling failed
- 022: mapping failed
- 023: Samtools indexing failed
- 024: f5c index failed
- 025: f5c eventalign failed
- 026: Could not find provided fastq files
- 027: f5c eventalign file is empty
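The error codes above share one pattern: the leading number identifies the source (1 for the first sample, 2 for the second, or the value of -e for nanosherlock) and the last two digits identify the failure. The following is a minimal sketch that splits a code accordingly. `decode_error_code` and `ERROR_MESSAGES` are hypothetical names, and the sketch assumes the same two-digit table applies to every leading number, although 127 is not listed for the first sample.

```python
# Messages keyed by the last two digits of the error code, copied from the lists above.
ERROR_MESSAGES = {
    19: "Cannot basecall .slow5/.blow5 with guppy",
    20: "Could not find raw data or unknown file format",
    21: "Guppy basecalling failed",
    22: "mapping failed",
    23: "Samtools indexing failed",
    24: "f5c index failed",
    25: "f5c eventalign failed",
    26: "Could not find provided fastq files",
    27: "f5c eventalign file is empty",
}


def decode_error_code(code):
    """Split an error code into (leading number, message)."""
    leading, suffix = divmod(int(code), 100)
    return leading, ERROR_MESSAGES.get(suffix, "unknown error")
```

For instance, `decode_error_code(222)` yields a leading number of 2 (second sample) together with the message "mapping failed".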