Masks

class pyrodigal.Masks

A list of masked regions within a Sequence.

Prodigal and Pyrodigal support masking regions that contain unknown nucleotides, which prevents genes from being predicted across them. This collection stores the coordinates of the masked regions, which is more compact than a plain bitmap flagging each position.
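
To illustrate why coordinates are more compact than a bitmap, here is a minimal pure-Python sketch that collects runs of unknown nucleotides as half-open (begin, end) pairs. It is not Pyrodigal's implementation: the helper name `find_masked_regions` and its `min_size` and `unknown` parameters are assumptions made for this example only.

```python
def find_masked_regions(sequence, min_size=1, unknown="N"):
    """Return half-open (begin, end) runs of `unknown` in `sequence`.

    Hypothetical helper for illustration; `min_size` is an assumed
    threshold below which a run is not reported.
    """
    regions = []
    run_start = None
    for i, base in enumerate(sequence.upper()):
        if base == unknown:
            # Open a new run if we are not already inside one.
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # The run ended just before position i (end is exclusive).
            if i - run_start >= min_size:
                regions.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(sequence) - run_start >= min_size:
        regions.append((run_start, len(sequence)))
    return regions


sequence = "ATGNNNNNCGTANNTT"
regions = find_masked_regions(sequence, min_size=3)

# The bitmap alternative needs one flag per position of the sequence.
bitmap = [base == "N" for base in sequence]
```

Here the single retained region takes one pair of integers, while the bitmap grows with the length of the sequence regardless of how many regions it contains.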

Note

Pyrodigal also improves the logic used for region masking. In Prodigal, every time a gene is found, all the masks of the sequence are tested to see whether the gene intersects any of them. However, since gene extraction happens sequentially, sorting the masks once makes it possible to skip the full scan, saving some time for sequences with many unknown regions.
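
The idea in this note can be sketched in pure Python as follows. This is only an illustration of scanning sorted intervals with a moving start index, not Pyrodigal's actual code; the function name `genes_hitting_masks` and the tuple representation are assumptions.

```python
def genes_hitting_masks(genes, masks):
    """Return the genes that overlap at least one mask.

    Both arguments are lists of half-open (begin, end) tuples.
    Hypothetical helper, written for illustration only.
    """
    genes = sorted(genes)
    masks = sorted(masks)
    hits = []
    first = 0  # index of the first mask that may still overlap a gene
    for g_begin, g_end in genes:
        # A mask ending at or before this gene's start cannot overlap
        # this gene, nor any later gene, so it is skipped for good.
        while first < len(masks) and masks[first][1] <= g_begin:
            first += 1
        # Only masks starting before the gene's end can overlap it.
        i = first
        while i < len(masks) and masks[i][0] < g_end:
            if masks[i][1] > g_begin:
                hits.append((g_begin, g_end))
                break
            i += 1
    return hits


genes = [(0, 10), (12, 20), (30, 40)]
masks = [(10, 12), (18, 25), (50, 60)]
overlapping = genes_hitting_masks(genes, masks)
```

Because both lists are sorted, each gene inspects only the masks near it instead of the whole list.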

__init__(*args, **kwargs)

clear()

Remove all masks from the list.

copy()

Return a copy of this list of masks.

class pyrodigal.Mask

The coordinates of a masked region.

Hint

The region indices follow the Python slice convention: start-inclusive, end-exclusive. This allows the original sequence to be indexed by the mask start and end coordinates easily.

Changed in version 2.0.0: The end coordinate is now exclusive.
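
As a small illustration of the slice convention, the sketch below indexes a plain string with begin and end coordinates. The `Region` named tuple is a hypothetical stand-in used so that the example runs without Pyrodigal; it is not part of the library.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical stand-in for a mask's begin/end coordinates."""
    begin: int
    end: int


sequence = "ATGNNNNNCGTA"
region = Region(begin=3, end=8)

# With an exclusive end, the coordinates can be used directly as a slice.
masked = sequence[region.begin:region.end]

# The length of the region is simply end - begin, with no off-by-one term.
length = region.end - region.begin

# The unmasked flanks are the two complementary slices.
flanks = sequence[:region.begin] + sequence[region.end:]
```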

intersects(begin, end)

Check whether the mask intersects a range of sequence coordinates.

Parameters:
  • begin (int) – The leftmost coordinate of the region to check for intersection (inclusive).

  • end (int) – The rightmost coordinate of the region to check for intersection (exclusive).
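
For half-open ranges, such a check is commonly written as the standard interval-overlap test sketched below. This is an assumption about the semantics, consistent with the documented example, rather than the library's source; the function name `ranges_intersect` is hypothetical.

```python
def ranges_intersect(mask_begin, mask_end, begin, end):
    """Overlap test for two half-open ranges.

    [mask_begin, mask_end) and [begin, end) share at least one position
    exactly when each range starts before the other one ends.
    """
    return begin < mask_end and mask_begin < end


# Same cases as the documented example, with a mask covering [3, 5).
results = [
    ranges_intersect(3, 5, 2, 5),
    ranges_intersect(3, 5, 1, 4),
    ranges_intersect(3, 5, 1, 3),  # range end is exclusive
    ranges_intersect(3, 5, 5, 7),  # mask end is exclusive
]
```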

Example

>>> mask = pyrodigal.Mask(3, 5)
>>> mask.intersects(2, 5)
True
>>> mask.intersects(1, 4)
True
>>> mask.intersects(1, 3)  # range end is exclusive
False
>>> mask.intersects(5, 7)  # mask end is exclusive
False

begin

The leftmost coordinate of the masked region.

Type:

int

end

The rightmost coordinate of the masked region, exclusive.

Type:

int