BiRD / packages / deblurpipeline 0.2.0

Installers

  • linux-64 v0.2.0

conda install

To install this package run one of the following:
conda install bird::deblurpipeline

Description

This pipeline analyzes amplicon sequencing data with the Deblur tool, a greedy deconvolution algorithm based on Illumina MiSeq/HiSeq error profiles.

Prerequisites

  • The computing grid is expected to run on a BeeGFS partition (or at least a multi-thread capable partition)
  • Miniconda3 is required

Input data

  • The fastq(.gz) files need to be gathered in a single directory. The path to this directory is specified in config.json. The fastq files can be paired-end (like F3D141_S207_L001_R1_001.fastq) or single-end.
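
Before launching a paired-end run, it can help to check that every sample in the input directory has both mates. The following is a minimal sketch, not part of the pipeline: the helper names `pair_fastq_names` and `pair_fastq_dir` are hypothetical, and the regular expression only assumes the `_R1.fastq(.gz)` / `_R2.fastq(.gz)` naming pattern stated for FASTQTYPE in the config.json section.

```python
import re
from pathlib import Path

# Assumed naming pattern, taken from the FASTQTYPE description:
# <sample>_R1.fastq(.gz) and <sample>_R2.fastq(.gz)
PAIR_RE = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq(?:\.gz)?$")


def pair_fastq_names(names):
    """Group file names by sample; return (pairs, unpaired)."""
    by_sample = {}
    unpaired = []
    for name in sorted(names):
        match = PAIR_RE.match(name)
        if match is None:
            # Name does not follow the expected pattern at all
            unpaired.append(name)
            continue
        by_sample.setdefault(match.group("sample"), {})[match.group("read")] = name

    pairs = {}
    for sample, reads in by_sample.items():
        if "1" in reads and "2" in reads:
            pairs[sample] = (reads["1"], reads["2"])
        else:
            # Only one mate was found for this sample
            unpaired.extend(reads.values())
    return pairs, sorted(unpaired)


def pair_fastq_dir(fastq_dir):
    """Apply pair_fastq_names to the regular files found in fastq_dir."""
    return pair_fastq_names(p.name for p in Path(fastq_dir).iterdir() if p.is_file())
```

Calling `pair_fastq_dir` on the directory given as FASTQPATH returns the complete pairs keyed by sample name, plus a sorted list of files that have no mate or do not follow the pattern.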

Creation of the virtual environments

~~~
conda create -n myVirtualEnvironment deblurpipeline -c bird -c conda-forge -c bioconda
conda info --env # To get the path of the directory of myVirtualEnvironment : myCondaPath
cd myCondaPath/deblurpipeline/
conda env create -f virtualEnvs/conda/QiimeEnv.yml
~~~

config.json

~~~
|------------------------|-------------------------------------------------------------------------------------------|
| FASTQPATH:             | directory of the fastq data (input)                                                       |
|------------------------|-------------------------------------------------------------------------------------------|
| FASTQTYPE:             | "singleEnd" for single-end fastq files and "pairedEnd" for paired-end fastq files         |
|                        | for paired-end files, the expected name pattern is _R1.fastq(.gz) and _R2.fastq(.gz)    |
|------------------------|-------------------------------------------------------------------------------------------|
| OUTPUTPATH:            | directory of the intermediate and final results of the data analysis                      |
|------------------------|-------------------------------------------------------------------------------------------|
| QIIMEMETADATA:         | optional - Metadata mapping files (comma-separated if more than one)                      |
|------------------------|-------------------------------------------------------------------------------------------|
| QIIMEBARCODE:          | optional - The barcode read fastq files (comma-separated if more than one)                |
|------------------------|-------------------------------------------------------------------------------------------|
| QIIMEQUALITYTHRESHOLD: | The maximum unacceptable Phred quality score (e.g., for Q20 and better, specify -q 19)    |
|------------------------|-------------------------------------------------------------------------------------------|
| DEBLURTRIMMING:        | The sequence trim length (any read that is shorter will be omitted)                       |
|------------------------|-------------------------------------------------------------------------------------------|
| DEBLURMINREADS:        | remove sOTUs with a total read count (across all samples) lower than the given threshold  |
|------------------------|-------------------------------------------------------------------------------------------|
| DEBLUR_CPU:            | Deblur can operate in parallel (running more threads than available cores is not advised) |
|------------------------|-------------------------------------------------------------------------------------------|
~~~
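
As an illustration of how these keys fit together, here is a minimal sketch that builds a config.json and runs two basic checks before writing it. Only the key names and the two FASTQTYPE values come from the table above. Every value is a placeholder, the value types (numbers versus strings) and the use of an empty string for the optional entries are assumptions, and the helpers `check_config` and `write_config` are hypothetical rather than part of the pipeline.

```python
import json

# Placeholder values for illustration only; key names come from the table above.
config = {
    "FASTQPATH": "/path/to/fastq",
    "FASTQTYPE": "pairedEnd",
    "OUTPUTPATH": "/path/to/results",
    "QIIMEMETADATA": "",          # optional (empty string is an assumption)
    "QIIMEBARCODE": "",           # optional (empty string is an assumption)
    "QIIMEQUALITYTHRESHOLD": 19,  # 19 keeps Q20 and better, per the table
    "DEBLURTRIMMING": 150,        # hypothetical trim length
    "DEBLURMINREADS": 10,         # hypothetical read-count threshold
    "DEBLUR_CPU": 4,              # hypothetical thread count
}

REQUIRED_KEYS = (
    "FASTQPATH",
    "FASTQTYPE",
    "OUTPUTPATH",
    "QIIMEQUALITYTHRESHOLD",
    "DEBLURTRIMMING",
    "DEBLURMINREADS",
    "DEBLUR_CPU",
)


def check_config(cfg):
    """Return a list of problems found in cfg (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in cfg]
    if cfg.get("FASTQTYPE") not in ("singleEnd", "pairedEnd"):
        problems.append('FASTQTYPE must be "singleEnd" or "pairedEnd"')
    return problems


def write_config(cfg, path="config.json"):
    """Write cfg as JSON after checking it; raise ValueError on problems."""
    problems = check_config(cfg)
    if problems:
        raise ValueError("; ".join(problems))
    with open(path, "w") as handle:
        json.dump(cfg, handle, indent=2)
```

Calling `write_config(config)` writes the file to the current directory. The check is deliberately shallow: it does not verify that the paths exist or that the numeric values are sensible for a given dataset.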

Execution of the pipeline

~~~
source activate myVirtualEnvironment
snakemake -p --latency-wait 60 --cluster "qsub -o ./logs/ -e ./logs/" --jobs 100 --jobscript deblur.sh
~~~

Output data

  • reference-hit.biom : contains only Deblurred reads matching the positive filtering database. By default, a reference composed of 16S sequences is used, and only reads that recruit to it at a coarse level are retained in this table. Reads are also filtered against the negative reference, which by default removes any read that appears to be PhiX or adapter.
  • reference-hit.seqs.fa : a fasta file containing all the sequences in reference-hit.biom
  • reference-non-hit.biom : contains only Deblurred reads that did not align to the positive filtering database. Negative filtering is also applied to this table, so by default, PhiX and adapter are removed.
  • reference-non-hit.seqs.fa : a fasta file containing all the sequences in reference-non-hit.biom
  • all.biom : contains all Deblurred reads. This file represents the union of the "reference-hit.biom" and "reference-non-hit.biom" tables.
  • all.seqs.fa : a fasta file containing all the sequences in all.biom
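
Since all.biom is described as the union of the two reference tables, the three fasta files should relate in the same way. The sketch below compares them using plain Python and no extra dependencies. The helper names `read_fasta_sequences` and `compare_union` are hypothetical, and whether strict equality holds on a real run has not been verified here, so a non-empty result is a cue to investigate rather than proof of an error.

```python
from pathlib import Path


def read_fasta_sequences(path):
    """Return the set of sequences in a FASTA file (headers ignored, wrapped lines joined)."""
    sequences = set()
    current = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts: store the previous one, if any
                if current:
                    sequences.add("".join(current))
                current = []
            elif line:
                current.append(line)
    if current:
        sequences.add("".join(current))
    return sequences


def compare_union(output_dir):
    """Return (missing_from_all, only_in_all) for the three *.seqs.fa files."""
    out = Path(output_dir)
    hit = read_fasta_sequences(out / "reference-hit.seqs.fa")
    non_hit = read_fasta_sequences(out / "reference-non-hit.seqs.fa")
    everything = read_fasta_sequences(out / "all.seqs.fa")
    union = hit | non_hit
    return union - everything, everything - union
```

Both returned sets are empty when all.seqs.fa contains exactly the sequences found in the two reference files.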

References

  • https://github.com/biocore/deblur
  • http://qiime.org/scripts/split_libraries_fastq.html
