
<center> <table table align="center" style="margin: 0px auto;"> <tr> <td> <img src="https://raw.githubusercontent.com/phac-nml/nanogo/main/extra/quasigo_logo.svg" alt="NanoGo Logo" width="150" height="auto"/> </td> <td> <h1>Quasi Species Detection Tool</h1> <a href="https://anaconda.org/gosahan/quasigo"> <img src="https://anaconda.org/gosahan/quasigo/badges/version.svg"/> </a> <a href="https://anaconda.org/gosahan/quasigo"> <img src="https://anaconda.org/gosahan/quasigo/badges/platforms.svg"/> </a> <a href="https://anaconda.org/gosahan/quasigo"> <img src="https://anaconda.org/gosahan/quasigo/badges/latest_release_date.svg"/> </a> </td> </tr> </table> </center>

## Overview

QuasiGO is a bioinformatics tool for detecting quasispecies in Next-Generation Sequencing (NGS) BAM alignment files. It generates detailed Excel reports and consensus sequences, giving a per-position view of the variation in the aligned reads.

## Features

### Detailed Column Statistics

QuasiGO reports statistics for each column (reference position) of the BAM alignment, including base counts and base quality, so the data can be inspected position by position.

### Majority and Degenerate Consensus Sequences

The tool generates two consensus sequences: one built from the majority base at each position and one built from degenerate bases. Comparing the two highlights positions where the reads support more than one nucleotide.

### Excel Report Generation

QuasiGO writes a detailed Excel report with conditional formatting, so flagged positions stand out when the data is reviewed.

### Efficient Processing

QuasiGO relies on efficient algorithms and libraries to process large genomic datasets, and the number of cores used for processing is configurable (see `--cores`).
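To make the majority and degenerate consensus feature described above more concrete, the following is a minimal Python sketch of how a single alignment column could be turned into a majority base and a degenerate (IUPAC) code. It is not QuasiGO's implementation: the helper name `call_column`, the tie-breaking order, and the rule that a base joins the degenerate call when its share of the column reaches a percentage threshold are all assumptions made for illustration.

```python
from collections import Counter

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they cover.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def call_column(bases, threshold_percent=25.0):
    """Return (majority_base, degenerate_code) for one column of read bases.

    Hypothetical helper, not part of QuasiGO. Assumption: a base is kept
    in the degenerate call when its share is >= threshold_percent.
    """
    counts = Counter(b for b in bases.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        # No usable bases at this position.
        return "N", "N"
    # Ties are resolved in A, C, G, T order (an arbitrary choice for this sketch).
    majority = max("ACGT", key=lambda b: counts[b])
    kept = frozenset(
        b for b in "ACGT" if 100.0 * counts[b] / total >= threshold_percent
    )
    if not kept:
        # With a high threshold no base may qualify; fall back to the majority base.
        kept = frozenset(majority)
    return majority, IUPAC[kept]


if __name__ == "__main__":
    # Six A, three G, one T: A is the majority, A and G both reach 25%.
    print(call_column("AAAAAAGGGT"))
```

In this sketch, a column where two bases both clear the threshold yields a two-base ambiguity code in the degenerate call while the majority call stays a single base, which is the kind of position a quasispecies analysis is interested in.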
### Customizable Output

QuasiGO lets users specify the output directory and file prefixes, which keeps results from different runs organized.

## Installation

Ensure that you have Python 3 installed on your system. You can then use the following command to create a conda environment with QuasiGO and its required libraries installed:

```bash
conda create -n <env_name> -c gosahan quasigo
```

## Usage

Run QuasiGO with the following command to start interactive mode with default options:

```bash
quasigo
```

Run QuasiGO with the following command to start standalone mode with custom options:

```bash
quasigo -i <bam_file_path> -o <output_directory> -r <reference_file_path> [options]
```

## Arguments

### Input/Output Options

- `-i <bam_file_path>`: Specify the path to the input BAM file. The default is the first `*.bam` file in the current working directory.
- `-o <output_directory>`: Define the path to the output directory. If the directory does not exist, it is created with the given name. The default is a directory named "quasiGO" in the current working directory.
- `-r <reference_file_path>`: Specify the path to the reference file. The default is the first `*.fa` file in the current working directory.

### Report Options

- `--majority_prefix <majority_prefix>`: Specify the prefix for the Majority Consensus output. Default is "majority_consensus".
- `--degenerate_prefix <degenerate_prefix>`: Specify the prefix for the Degenerate Consensus output. Default is "degenerate_consensus".
- `--quasicall`: Minimum threshold for the majority (%) column. Positions whose majority (%) falls below this value are flagged as possible quasicalls. Default is 95.
- `--excel_prefix`: Specify the prefix for the Excel output. Default is "quasigo_analysis_report".
- `--chart_prefix`: Specify the prefix for the chart output. Default is "quasigo_depth_chart".
- `--no-chart`: Do not generate the coverage depth chart.
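As an illustration of the `--quasicall` threshold described above, the sketch below flags positions whose majority (%) falls below a cutoff. It is a hedged example rather than QuasiGO's own code: the function names, the per-position dictionary of base counts, and the choice to skip positions with no coverage are assumptions introduced here.

```python
def majority_percent(counts):
    """Percentage of reads supporting the most common base, or None if no reads."""
    total = sum(counts.values())
    if total == 0:
        return None
    return 100.0 * max(counts.values()) / total


def flag_possible_quasicalls(columns, quasicall=95.0):
    """Return 1-based positions whose majority (%) is below the threshold.

    Hypothetical helper. `columns` is a list of {base: count} dictionaries,
    one per reference position. Positions without coverage are skipped,
    which is an assumption of this sketch.
    """
    flagged = []
    for position, counts in enumerate(columns, start=1):
        percent = majority_percent(counts)
        if percent is not None and percent < quasicall:
            flagged.append(position)
    return flagged


if __name__ == "__main__":
    columns = [
        {"A": 100},           # 100% majority
        {"A": 90, "G": 10},   # 90% majority, below the default cutoff
        {"C": 95, "T": 5},    # exactly 95%, not below the cutoff
        {},                   # no coverage
    ]
    print(flag_possible_quasicalls(columns))
```

With the default cutoff of 95, only the second position in the sample data is flagged, because a value equal to the threshold is not "below" it.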
### Threshold Options

- `--majority_percent <majority_percent>`: Specify the majority percent threshold for calling. Enter a value between 0 and 100. Default is 25.
- `--minimum_quality <min_base_quality>`: Minimum base quality required to consider a nucleotide. Enter a positive integer. Default is 25.
- `--minimum_depth`: Minimum depth of coverage. Positions below this value are flagged in the Excel file. Default is 10.

### Miscellaneous Options

- `--overwrite`: Overwrite output files if they already exist. Default is False.
- `-c, --cores <number_of_cores>`: Specify the number of cores to be used for processing. Default is 4.
- `-v`, `--version`: Display the version number of QuasiGO.

## Example

```bash
quasigo -i example.bam -o output_directory --majority_prefix majority --degenerate_prefix degenerate
```

This command processes the `example.bam` file, writes the results to `output_directory`, and uses `majority` and `degenerate` as the prefixes for the Majority and Degenerate Consensus outputs, respectively.

## Output

QuasiGO generates the following output files:

- `nucleotide_frequency_data.pkl`: A pickle file containing the column statistics.
- `quasigo_analysis_report.xlsx`: An Excel file containing detailed column statistics with conditional formatting.
- `quasigo_summary_report.txt`: A text file listing rows with degenerate nucleotides.
- `majority_consensus.fasta` and `degenerate_consensus.fasta`: FASTA files containing the Majority and Degenerate consensus sequences, respectively.
- `coverage_depth_chart.svg`, `coverage_depth_chart.html`, and `coverage_depth_chart.json`: The coverage depth chart in SVG, HTML, and JSON formats.

## Support

For questions, issues, or additional assistance, please contact [Gurasis Osahan](mailto:gurasis.osahan@phac-aspc.gc.ca) at the National Microbiology Laboratory.
## License

QuasiGO is licensed under the Apache License, Version 2.0. You may use this work only in compliance with the License. For more details, please visit [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).

## Legal

**Copyright**: Government of Canada

**Written by**: National Microbiology Laboratory, Public Health Agency of Canada

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

## Contact

[**Gurasis Osahan**](mailto:gurasis.osahan@phac-aspc.gc.ca) at National Microbiology Laboratory, Public Health Agency of Canada

---

*Ensuring public health through advanced genomics. Developed with unwavering commitment and expertise by National Microbiology Laboratory, Public Health Agency of Canada.*

Uploaded Wed Nov 27 00:22:33 2024
md5 checksum ddb282f2e73dfda5e4287d2e2550acb8
arch x86_64
build py310he53d0f1_0
depends alive-progress >=3.1.5, build >=0.10.0, matplotlib >=3.8.0, numpy >=1.26.4, openpyxl >=3.1.0, pandas >=2.2.3, pip >=24.3.1, plotly >=5.24.1, poetry-core >=1.9.1, psutil >=6.1.0, pysam >=0.22.1, python >=3.10,<3.11.0a0, python_abi 3.10.* *_cp310, scipy >=1.14.1, setuptools >=75.6.0, typing-extensions >=4.12.2, wheel >=0.45.1
has_prefix True
license Apache-2.0
license_family Apache
machine x86_64
operatingsystem linux
platform linux
subdir linux-64
target-triplet x86_64-any-linux
timestamp 1732666633088