A bioinformatics tool for detecting Quasi Species in sorted NGS BAM alignment files.
QuasiGO is a bioinformatics tool for detecting Quasi Species in Next-Generation Sequencing (NGS) BAM alignment files. It generates detailed Excel reports and consensus sequences from the alignment.
QuasiGO provides comprehensive statistics for each column in the BAM file. This includes base counts, base quality, and more, offering a granular view of the data and aiding in-depth analysis.
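To make "per-column statistics" concrete, here is a minimal sketch of counting bases at each alignment column. It is not QuasiGO's implementation: it assumes the reads are already laid out as equal-length strings with parallel lists of Phred scores, and the names `column_stats` and `min_quality` are hypothetical.

```python
from collections import Counter

def column_stats(reads, quals, min_quality=25):
    """Count bases per alignment column, skipping gaps and low-quality calls.

    reads: list of equal-length strings ('-' marks a gap or no coverage)
    quals: list of lists of Phred scores, parallel to reads
    """
    if not reads:
        return []
    stats = []
    for col in range(len(reads[0])):
        counts = Counter()
        qual_sum = 0
        for read, qual in zip(reads, quals):
            base = read[col]
            # Ignore gaps and bases below the quality cutoff.
            if base == "-" or qual[col] < min_quality:
                continue
            counts[base] += 1
            qual_sum += qual[col]
        depth = sum(counts.values())
        stats.append({
            "position": col + 1,  # 1-based column index
            "counts": dict(counts),
            "depth": depth,
            "mean_quality": qual_sum / depth if depth else 0.0,
        })
    return stats

# Toy input: three reads over four columns.
reads = ["ACGT", "ACGA", "AC-T"]
quals = [[30, 30, 30, 30], [30, 10, 30, 30], [30, 30, 0, 30]]
for row in column_stats(reads, quals):
    print(row)
```

In a real pipeline the columns would come from a BAM pileup rather than from pre-aligned strings; the toy input only illustrates the bookkeeping.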
The tool generates consensus sequences based on majority and degenerate bases. This feature is crucial for analyzing genomic variation and understanding the nuances of genetic data.
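The difference between the two consensus types can be sketched as follows. This is an illustration under stated assumptions, not QuasiGO's actual algorithm: the majority consensus takes the most frequent base per column, and the degenerate consensus is assumed to merge every base at or above a percentage threshold into an IUPAC ambiguity code. The function name `consensus` and the parameter `threshold_percent` are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def consensus(column_counts, threshold_percent=25.0):
    """Return (majority, degenerate) consensus strings.

    column_counts: one dict per column mapping base -> read count.
    """
    majority, degenerate = [], []
    for counts in column_counts:
        depth = sum(counts.values())
        if depth == 0:
            # No coverage: emit N in both sequences.
            majority.append("N")
            degenerate.append("N")
            continue
        # Sorting the keys makes tie-breaking deterministic (alphabetical).
        majority.append(max(sorted(counts), key=lambda b: counts[b]))
        kept = frozenset(
            b for b, n in counts.items()
            if 100.0 * n / depth >= threshold_percent
        )
        # Unknown or empty base sets fall back to N.
        degenerate.append(IUPAC.get(kept, "N"))
    return "".join(majority), "".join(degenerate)

columns = [{"A": 10}, {"C": 7, "T": 3}, {"G": 9, "A": 1}, {}]
majority_seq, degenerate_seq = consensus(columns)
print(majority_seq)    # ACGN
print(degenerate_seq)  # AYGN
```

In the second column, T reaches 30% and is merged with C into `Y`; in the third, A at 10% falls below the threshold and is dropped.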
QuasiGO outputs a detailed Excel report with conditional formatting that highlights flagged positions, making the data easier to interpret.
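The flagging behind the report can be illustrated with the two thresholds documented under the `--quasicall` and `--minimum_depth` options. The sketch below is an assumption about how such flags could be computed for one column; the function `flag_column` and the returned field names are hypothetical and do not describe QuasiGO's internal code or its Excel layout.

```python
def flag_column(counts, quasicall=95.0, minimum_depth=10):
    """Flag one alignment column given its base counts.

    counts: dict mapping base -> read count for the column.
    """
    depth = sum(counts.values())
    # Percentage of reads supporting the most frequent base.
    majority_pct = 100.0 * max(counts.values()) / depth if depth else 0.0
    return {
        "depth": depth,
        "majority_percent": round(majority_pct, 2),
        # Coverage below the minimum depth is flagged.
        "low_depth": depth < minimum_depth,
        # A majority percentage below the threshold is a possible quasicall.
        "possible_quasicall": depth > 0 and majority_pct < quasicall,
    }

print(flag_column({"A": 98, "G": 2}))
print(flag_column({"A": 60, "G": 40}))
print(flag_column({"C": 5}))
```

The first column passes both checks, the second is a possible quasicall because the majority base is at 60%, and the third is flagged only for low depth.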
QuasiGO uses efficient algorithms and libraries, with a configurable number of cores, to process large genomic datasets quickly.
QuasiGO lets users specify the output directory and output file prefixes to keep results organized.
Ensure that you have Python 3 and conda installed on your system. You can then use the following command to create a conda environment with QuasiGO and its required libraries installed:
conda create -n <env_name> -c gosahan quasigo
Run QuasiGO using the following command to trigger interactive mode with default options:
quasigo
Run QuasiGO using the following command to trigger standalone mode with custom options:
quasigo -i <bam_file_path> -o <output_directory> -r <reference_file_path> [options]
-i <bam_file_path>
: Specify the path to the input BAM file. The default is the first *.bam file in the current working directory.
-o <output_directory>
: Define the path to the output directory. If the directory does not exist, it is created with the given name. The default is a directory named "quasiGO" in the current working directory.
-r <reference_file_path>
: Specify the path to the reference file. The default is the first *.fa file in the current working directory.
--majority_prefix <majority_prefix>
: Specify the prefix for the Majority Consensus output. Default is "majority_consensus".
--degenerate_prefix <degenerate_prefix>
: Specify the prefix for the Degenerate Consensus output. Default is "degenerate_consensus".
--quasicall
: Minimum threshold for the majority (%) column. Any value below the threshold is flagged as a possible quasicall. Default is 95.
--excel_prefix
: Specify the prefix for the Excel output. Default is "quasigoanalysisreport".
--chart_prefix
: Specify the prefix for the chart output. Default is "quasigodepthchart".
--no-chart
: Do not generate the coverage depth chart.
--majority_percent <majority_percent>
: Specify the majority percent threshold for calling. Enter a value between 0 and 100. Default is 25.
--minimum_quality <min_base_quality>
: Minimum base quality required to consider a nucleotide. Enter a positive integer. Default is 25.
--minimum_depth
: Minimum depth of coverage. Any value below the threshold is flagged in the Excel file. Default is 10.
--overwrite
: Overwrite output files if they already exist. Default is False.
-c, --cores <number_of_cores>
: Specify the number of cores to be used for processing. Default is 4.
-v, --version
: Display the version number of QuasiGO.
quasigo -i example.bam -o output_directory -m majority -d degenerate
This command will process the example.bam file, output the results to output_directory, and use majority and degenerate as the prefixes for the Majority and Degenerate Consensus outputs, respectively.
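When QuasiGO is called from a script or pipeline, the standalone command line can be assembled programmatically. The following is a minimal sketch that uses only flags documented above (`-i`, `-o`, `-r`, `--cores`, `--no-chart`); the helper name `build_quasigo_command` and the file name `reference.fa` are hypothetical.

```python
import shlex

def build_quasigo_command(bam, outdir, reference, cores=4, no_chart=False):
    """Build the argument list for a standalone QuasiGO run."""
    cmd = [
        "quasigo",
        "-i", bam,
        "-o", outdir,
        "-r", reference,
        "--cores", str(cores),
    ]
    if no_chart:
        # Skip generation of the coverage depth chart.
        cmd.append("--no-chart")
    return cmd

cmd = build_quasigo_command(
    "example.bam", "output_directory", "reference.fa", cores=8, no_chart=True
)
print(shlex.join(cmd))

# To actually run it (requires QuasiGO to be installed and on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems when paths contain spaces.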
QuasiGO generates the following output files:
nucleotide_frequency_data.pkl
: A pickle file containing the column statistics.
quasigo_analysis_report.xlsx
: An Excel file containing detailed column statistics with conditional formatting.
quasigo_summary_report.txt
: A text file listing rows with degenerate nucleotides.
majority_consensus.fasta and degenerate_consensus.fasta
: FASTA files containing the Majority and Degenerate consensus sequences, respectively.
coverage_depth_chart.svg, coverage_depth_chart.html, and coverage_depth_chart.json
: An SVG, HTML, and JSON file containing the coverage depth chart.
For any questions, issues, or additional assistance, please contact Gurasis Osahan at the National Microbiology Laboratory. Your queries will be addressed promptly, ensuring smooth and efficient use of QuasiGO.
QuasiGO is licensed under the Apache License, Version 2.0. You may use this work only in compliance with the License. For more details, see the Apache License, Version 2.0.
Copyright: Government of Canada
Written by: National Microbiology Laboratory, Public Health Agency of Canada
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.