pacbio_data_processing package

Subpackages

Submodules

pacbio_data_processing.bam module

pacbio_data_processing.bam_file_filter module

pacbio_data_processing.bam_utils module

pacbio_data_processing.cigar module

This module provides basic ‘re-invented’ functionality to handle Cigars. A Cigar describes the differences between two sequences by providing a series of operations that one has to apply to one sequence to obtain the other one. For instance, given these two sequences:

sequence 1 (e.g. from the reference):

AAGTTCCGCAAATT

and

sequence 2 (e.g. from the aligner):

AAGCTCCCGCAATT

The Cigar that brings us from sequence 1 to sequence 2 is:

3=1X3=1I4=1D2=

where each number gives the count of bases that the symbol following it applies to; the symbols’ meanings are listed in the table below. The Cigar in the example is therefore shorthand for:

3 equal bases followed by 1 replacement followed by 3 equal bases followed by 1 insertion followed by 4 equal bases followed by 1 deletion followed by 2 equal bases
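The expansion above can be reproduced with a few lines of Python. This is a minimal sketch, not the code of this module: `parse_cigar`, `spell_out` and `NAMES` are hypothetical helpers introduced only for illustration, and the accepted symbols are assumed to be exactly those listed in the table.

```python
import re

# Hypothetical helpers for illustration; not part of pacbio_data_processing.
NAMES = {
    "=": "equal bases",
    "X": "replacement",
    "I": "insertion",
    "D": "deletion",
    "S": "soft clip",
    "H": "hard clip",
}


def parse_cigar(cigar: str) -> list[tuple[int, str]]:
    """Split a Cigar string into (count, symbol) pairs."""
    # Reject anything that is not a plain sequence of <number><symbol> items.
    if re.fullmatch(r"(\d+[=XIDSH])*", cigar) is None:
        raise ValueError(f"malformed Cigar: {cigar!r}")
    return [(int(n), sym) for n, sym in re.findall(r"(\d+)([=XIDSH])", cigar)]


def spell_out(cigar: str) -> str:
    """Write a Cigar string in words, one phrase per item."""
    return " followed by ".join(
        f"{n} {NAMES[sym]}" for n, sym in parse_cigar(cigar)
    )


print(spell_out("3=1X3=1I4=1D2="))
```

For the example Cigar, `parse_cigar` yields seven items, and `spell_out` produces the sentence given above.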

symbol    meaning
------    -----------
=         equal
I         insertion
D         deletion
X         replacement
S         soft clip
H         hard clip
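To see how the symbols act on the two sequences of the example, the following sketch walks through both sequences item by item and checks that the Cigar is consistent with them. The function `cigar_matches` is a hypothetical helper, not part of this module, and it handles only `=`, `X`, `I` and `D`; it assumes that an insertion consumes bases of sequence 2 only and a deletion consumes bases of sequence 1 only. Clips are left out of this sketch.

```python
import re


def cigar_matches(reference: str, query: str, cigar: str) -> bool:
    """Check that `cigar` describes how to go from `reference` to `query`.

    Hypothetical helper for illustration; handles =, X, I and D only.
    """
    i = j = 0  # current positions in reference and query
    for count, symbol in re.findall(r"(\d+)([=XID])", cigar):
        n = int(count)
        ref_part = reference[i:i + n]
        qry_part = query[j:j + n]
        if symbol == "=":
            # n bases must exist in both sequences and be identical
            if len(ref_part) != n or ref_part != qry_part:
                return False
            i += n
            j += n
        elif symbol == "X":
            # n bases must exist in both sequences and differ pairwise
            if len(ref_part) != n or len(qry_part) != n:
                return False
            if any(a == b for a, b in zip(ref_part, qry_part)):
                return False
            i += n
            j += n
        elif symbol == "I":
            j += n  # extra bases present only in the query
        else:  # "D"
            i += n  # bases present only in the reference
    # Both sequences must be consumed completely.
    return i == len(reference) and j == len(query)


print(cigar_matches("AAGTTCCGCAAATT", "AAGCTCCCGCAATT", "3=1X3=1I4=1D2="))
```

With the sequences and the Cigar of the example the check succeeds, whereas a Cigar such as `14=` fails because the two sequences are not identical.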

class pacbio_data_processing.cigar.Cigar(incigar)

Bases: object

__init__(incigar)
property diff_ratio

difference ratio: 1 means that each base is different; 0 means that all the bases are equal.

property number_diff_items
property number_diff_types
property number_pb_diffs
property number_pbs
property sim_ratio

similarity ratio: 1 means that all the bases are equal; 0 means that each base is different.

This is computed from the diff_ratio property.
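The two ratios can be illustrated with a short sketch. The exact formulas used by the class are not stated here, so the code below rests on explicit assumptions: the difference ratio is taken as the number of bases in items other than `=` divided by the number of bases in all items, and the similarity ratio as one minus the difference ratio. The function names are hypothetical and are not the properties of `Cigar`.

```python
import re


def count_bases(cigar: str) -> tuple[int, int]:
    """Return (bases in non-'=' items, bases in all items)."""
    items = [(int(n), sym) for n, sym in re.findall(r"(\d+)([=XIDSH])", cigar)]
    total = sum(n for n, _ in items)
    different = sum(n for n, sym in items if sym != "=")
    return different, total


def diff_ratio_sketch(cigar: str) -> float:
    """Assumed definition: share of bases that are not in '=' items."""
    different, total = count_bases(cigar)
    return different / total  # raises ZeroDivisionError for an empty Cigar


def sim_ratio_sketch(cigar: str) -> float:
    """Assumed definition: complement of the difference ratio."""
    return 1 - diff_ratio_sketch(cigar)


print(diff_ratio_sketch("3=1X3=1I4=1D2="), sim_ratio_sketch("3=1X3=1I4=1D2="))
```

Under these assumptions the example Cigar `3=1X3=1I4=1D2=` has 3 differing bases out of 15, a Cigar made only of `=` items has a difference ratio of 0, and one without any `=` item has a difference ratio of 1, in line with the descriptions above.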

pacbio_data_processing.constants module

pacbio_data_processing.errors module

exception pacbio_data_processing.errors.MissingGooeyError

Bases: ModuleNotFoundError

exception pacbio_data_processing.errors.SMAMergeError

Bases: SMAPipelineError

exception pacbio_data_processing.errors.SMAPipelineError

Bases: Exception

pacbio_data_processing.errors.high_level_handler(func)

pacbio_data_processing.external module

pacbio_data_processing.filters module

pacbio_data_processing.ipd module

pacbio_data_processing.logs module

pacbio_data_processing.methylation module

pacbio_data_processing.parameters module

pacbio_data_processing.plots module

pacbio_data_processing.sam module

pacbio_data_processing.sentinel module

class pacbio_data_processing.sentinel.Sentinel(checkpoint: Path)

Bases: object

This class creates objects that are expected to be used as context managers. At __enter__ a sentinel file is created. At __exit__ the sentinel file is removed. If the file is there before entering the context, or is not there when the context is exited, an exception is raised.
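The behaviour described above follows the usual context manager protocol. The sketch below is a simplified stand-in written for illustration, not the implementation of `Sentinel`: the class `SentinelSketch` and the exceptions `FileAlreadyThere` and `FileVanished` are hypothetical names, and the handling of abandoned sentinel files is left out.

```python
from pathlib import Path


class FileAlreadyThere(Exception):
    """Hypothetical stand-in: the file exists before entering the context."""


class FileVanished(Exception):
    """Hypothetical stand-in: the file is missing when leaving the context."""


class SentinelSketch:
    """Simplified illustration of a sentinel-file context manager."""

    def __init__(self, checkpoint: Path):
        self.checkpoint = checkpoint

    def __enter__(self):
        try:
            # exist_ok=False makes the creation fail if the file is there
            self.checkpoint.touch(exist_ok=False)
        except FileExistsError:
            raise FileAlreadyThere(str(self.checkpoint)) from None
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            self.checkpoint.unlink()
        except FileNotFoundError:
            raise FileVanished(str(self.checkpoint)) from None
        return False  # do not swallow exceptions raised in the body
```

A block such as `with SentinelSketch(Path("work.sentinel")): ...` then leaves the sentinel file in place only while the body runs; the file name is, again, just an example.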

__init__(checkpoint: Path)
_anti_aging()

Method that updates the modification time of the sentinel file every SLEEP_SECONDS seconds. This is part of the mechanism to ensure that the sentinel does not get fooled by an abandoned leftover sentinel file.

property is_file_too_old

Property that answers the question: is the sentinel file too old to be taken as an active sentinel file, or not?
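The interplay between refreshing the modification time and judging the age of the file can be sketched as follows. This is an illustration under stated assumptions, not the code of `Sentinel`: the value given to `SLEEP_SECONDS`, the threshold `MAX_AGE_SECONDS`, and the helpers `refresh` and `is_too_old` are all hypothetical.

```python
import os
import time
from pathlib import Path

SLEEP_SECONDS = 1.0                    # assumed value, for illustration only
MAX_AGE_SECONDS = 10 * SLEEP_SECONDS   # hypothetical staleness threshold


def refresh(path: Path) -> None:
    """Set the access and modification times of `path` to now."""
    os.utime(path)


def is_too_old(path: Path) -> bool:
    """Tell whether `path` was last modified longer ago than the threshold."""
    age = time.time() - path.stat().st_mtime
    return age > MAX_AGE_SECONDS
```

As long as a running process calls something like `refresh` at intervals shorter than the threshold, its file never looks stale; a file left behind by a process that died stops being refreshed and eventually exceeds the threshold.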

exception pacbio_data_processing.sentinel.SentinelFileFound

Bases: Exception

Exception expected when the sentinel file is there before its creation.

exception pacbio_data_processing.sentinel.SentinelFileNotFound

Bases: Exception

Exception expected if the sentinel file is missing before the Sentinel removes it.

pacbio_data_processing.sm_analysis module

pacbio_data_processing.sm_analysis_gui module

pacbio_data_processing.summary module

pacbio_data_processing.templates module

pacbio_data_processing.types module

pacbio_data_processing.utils module

Module contents

Top-level package for PacBio data processing.