|Logo| Pyrodigal |Stars| ======================== .. |Logo| image:: /_images/logo.png :scale: 40% :class: dark-light .. |Stars| image:: https://img.shields.io/github/stars/althonos/pyrodigal.svg?style=social&maxAge=3600&label=Star :target: https://github.com/althonos/pyrodigal/stargazers :class: dark-light *Cython bindings and Python interface to* `Prodigal `_, *an ORF finder for genomes and metagenomes*. **Now with SIMD!** |Actions| |Coverage| |PyPI| |Bioconda| |AUR| |Wheel| |Versions| |Implementations| |License| |Source| |Mirror| |Issues| |Docs| |Changelog| |Downloads| |Paper| |Citations| .. |Actions| image:: https://img.shields.io/github/actions/workflow/status/althonos/pyrodigal/test.yml?branch=main&logo=github&style=flat-square&maxAge=300 :target: https://github.com/althonos/pyrodigal/actions :class: dark-light .. |GitLabCI| image:: https://img.shields.io/gitlab/pipeline/larralde/pyrodigal/main?gitlab_url=https%3A%2F%2Fgit.embl.de&logo=gitlab&style=flat-square&maxAge=600 :target: https://git.embl.de/larralde/pyrodigal/-/pipelines :class: dark-light .. |Coverage| image:: https://img.shields.io/codecov/c/gh/althonos/pyrodigal?style=flat-square&maxAge=600 :target: https://codecov.io/gh/althonos/pyrodigal/ :class: dark-light .. |PyPI| image:: https://img.shields.io/pypi/v/pyrodigal.svg?style=flat-square&maxAge=3600 :target: https://pypi.python.org/pypi/pyrodigal :class: dark-light .. |Bioconda| image:: https://img.shields.io/conda/vn/bioconda/pyrodigal?style=flat-square&maxAge=3600 :target: https://anaconda.org/bioconda/pyrodigal :class: dark-light .. |AUR| image:: https://img.shields.io/aur/version/python-pyrodigal?logo=archlinux&style=flat-square&maxAge=3600 :target: https://aur.archlinux.org/packages/python-pyrodigal :class: dark-light .. |Wheel| image:: https://img.shields.io/pypi/wheel/pyrodigal?style=flat-square&maxAge=3600 :target: https://pypi.org/project/pyrodigal/#files :class: dark-light .. |Versions| image:: https://img.shields.io/pypi/pyversions/pyrodigal.svg?style=flat-square&maxAge=3600 :target: https://pypi.org/project/pyrodigal/#files :class: dark-light .. |Implementations| image:: https://img.shields.io/pypi/implementation/pyrodigal.svg?style=flat-square&maxAge=3600&label=impl :target: https://pypi.org/project/pyrodigal/#files :class: dark-light .. |License| image:: https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square&maxAge=3600 :target: https://choosealicense.com/licenses/mit/ :class: dark-light .. |Source| image:: https://img.shields.io/badge/source-GitHub-303030.svg?maxAge=3600&style=flat-square :target: https://github.com/althonos/pyrodigal/ :class: dark-light .. |Mirror| image:: https://img.shields.io/badge/mirror-EMBL-009f4d?style=flat-square&maxAge=3600 :target: https://git.embl.de/larralde/pyrodigal/ :class: dark-light .. |Issues| image:: https://img.shields.io/github/issues/althonos/pyrodigal.svg?style=flat-square&maxAge=600 :target: https://github.com/althonos/pyrodigal/issues :class: dark-light .. |Docs| image:: https://img.shields.io/readthedocs/pyrodigal?style=flat-square&maxAge=3600 :target: http://pyrodigal.readthedocs.io/en/stable/?badge=stable :class: dark-light .. |Changelog| image:: https://img.shields.io/badge/keep%20a-changelog-8A0707.svg?maxAge=3600&style=flat-square :target: https://github.com/althonos/pyrodigal/blob/main/CHANGELOG.md :class: dark-light .. |Downloads| image:: https://img.shields.io/pypi/dm/pyrodigal?style=flat-square&color=303f9f&maxAge=86400&label=downloads :target: https://pepy.tech/project/pyrodigal :class: dark-light .. |Paper| image:: https://img.shields.io/badge/paper-JOSS-9400ff?style=flat-square&maxAge=86400 :target: https://doi.org/10.21105/joss.04296 :class: dark-light .. |Citations| image:: https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fbadge.dimensions.ai%2Fdetails%2Fid%2Fpub.1147419140%2Fmetadata.json&query=%24.times_cited&style=flat-square&label=citations&maxAge=86400 :target: https://badge.dimensions.ai/details/id/pub.1147419140 :class: dark-light Overview -------- Pyrodigal is a Python module that provides bindings to Prodigal using `Cython `_. It directly interacts with the Prodigal internals, which has the following advantages: .. grid:: 1 2 3 3 :gutter: 1 .. grid-item-card:: :fas:`battery-full` Batteries-included Just add ``pyrodigal`` as a ``pip`` or ``conda`` dependency, no need for the Prodigal binary or any external dependency. .. grid-item-card:: :fas:`screwdriver-wrench` Flexible I/O Directly pass sequences to process as Python `str` objects, no need for intermediate files. .. grid-item-card:: :fas:`memory` Memory-efficient Benefit from conservative memory allocation and a reworked data layout for candidate nodes. .. grid-item-card:: :fas:`microchip` Faster computation Use the full power of your CPU with :wiki:`SIMD` instructions to filter out candidate genes prior to the scoring stage. .. grid-item-card:: :fas:`check` Consistent results Get the same results as Prodigal ``v2.6.3+31b300a``, with additional bug fixes compared to the latest stable Prodigal version. .. grid-item-card:: :fas:`toolbox` Feature-complete Access all the features of the original CLI through the :doc:`Python API ` or a :doc:`drop-in CLI replacement `. Features -------- The library now features everything from the original Prodigal CLI: - **run mode selection**: Choose between *single* mode, using a training sequence to count nucleotide hexamers, or *metagenomic* mode, using pre-trained data from different organisms (``prodigal -p``). - **region masking**: Prevent genes from being predicted across regions containing unknown nucleotides (``prodigal -m``). - **closed ends**: Genes will be identified as running over edges if they are larger than a certain size, but this can be disabled (``prodigal -c``). - **training configuration**: During the training process, a custom translation table can be given (``prodigal -g``), and the Shine-Dalgarno motif search can be forcefully bypassed (``prodigal -n``) - **output files**: Output files can be written in a format mostly compatible with the Prodigal binary, including the protein translations in FASTA format (``prodigal -a``), the gene sequences in FASTA format (``prodigal -d``), or the potential gene scores in tabular format (``prodigal -s``). See the :doc:`Output Formats ` section for supported formats. - **training data persistence**: Getting training data from a sequence and using it for other sequences is supported; in addition, a training data file can be saved and loaded transparently (``prodigal -t``). In addition, the **new** features are available: - **custom gene size threshold**: While Prodigal uses a minimum gene size of 90 nucleotides (60 if on edge), Pyrodigal allows to customize this threshold, allowing for smaller ORFs to be identified if needed. Several changes were done regarding **memory management**: - **digitized sequences**: Sequences are stored as raw bytes instead of compressed bitmaps. This means that the sequence itself takes 3/8th more space, but since the memory used for storing the sequence is often negligible compared to the memory used to store dynamic programming nodes, this is an acceptable trade-off for better performance when extracting said nodes. - **node buffer growth**: Node arrays are dynamically allocated and grow exponentially instead of being pre-allocated with a large size. On small sequences, this leads to Pyrodigal using about 30% less memory. - **lightweight genes**: Genes are stored in a more compact data structure than in Prodigal (which reserves a buffer to store string data), saving around 1KiB per gene. Setup ----- Run ``pip install pyrodigal`` in a shell to download the latest release and all its dependencies from PyPi, or have a look at the :doc:`Installation page ` to find other ways to install ``pyrodigal``. Citation -------- Pyrodigal is scientific software, with a `published paper `_ in the `Journal of Open-Source Software `_. Check the :doc:`Publications page ` to see how to cite Pyrodigal properly. Library ------- Check the following pages of the user guide or the API reference for more in-depth reference about library setup, usage, and rationale: .. toctree:: :maxdepth: 2 User Guide API Reference Related Projects ---------------- The following Python libraries may be of interest for bioinformaticians. .. include:: related.rst License ------- This library is provided under the `GNU General Public License v3.0 `_. The Prodigal code was written by `Doug Hyatt `_ and is distributed under the terms of the GPLv3 as well. See the :doc:`Copyright Notice ` section for the full GPLv3 license. *This project is in no way not affiliated, sponsored, or otherwise endorsed by the original* `Prodigal`_ *authors. It was developed by* `Martin Larralde `_ *during his PhD project at the* `European Molecular Biology Laboratory `_ *in the* `Zeller team `_. The project icon was derived from `UXWing `_ and is re-used under `their permissive license `_.