Aligners

The sm-analysis program delegates the alignment of BAM files to an external aligner program, which must be accessible at runtime. The current version of PacBio Data Processing (v. 1.3.0) uses the Pbmm2 aligner by default, but Blasr can be used as well.

The aligner program will be called on demand: if a suitable aligned file is found, the alignment process will be skipped for that file.

By default, the aligner is searched for in the PATH. If it is not found there, you will get a runtime error message like:

[CRITICAL] [Errno 2] No such file or directory: 'pbmm2'

and the sm-analysis program itself will terminate.

In that case, the instructions in the following sections can help you.
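Before installing anything, it can be worth checking whether an aligner is already reachable through the PATH. The following is a minimal sketch using the standard shell built-in command -v; check_aligner is a hypothetical helper name chosen for illustration and is not part of PacBio Data Processing.

```shell
# Report whether an aligner executable can be found in the PATH.
# check_aligner is a hypothetical helper, not part of sm-analysis.
check_aligner() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found in PATH"
        return 1
    fi
}

# Check the default aligner; print a hint if it is missing.
check_aligner pbmm2 || echo "Install pbmm2 or pass its path with the -a option"
```

If the helper reports that the aligner is missing, sm-analysis would most likely fail with the error shown above unless the path to the aligner is passed explicitly.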

Pbmm2

The installation of pbmm2 is described in the pbmm2 repository. It can be installed using conda: have a look at Setting up Bioconda before installing pbmm2. Once conda is ready and a conda environment is active, install pbmm2:

$ conda install pbmm2

Upon success, you will have a pbmm2 executable in your conda environment, and you will be able to pass its path to sm-analysis if needed (see the section Telling sm-analysis where is the aligner below for details).
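To find the full path of the installed executable, one option is to rely on the CONDA_PREFIX environment variable, which conda sets when an environment is activated. The sketch below assumes that executables live under the bin directory of the environment, as is usual on Linux and macOS; conda_tool_path is a hypothetical helper name.

```shell
# Print the full path of an executable inside the active conda environment.
# CONDA_PREFIX is set by "conda activate"; conda_tool_path is a hypothetical helper.
conda_tool_path() {
    tool="$1"
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no active conda environment" >&2
        return 1
    fi
    if [ -x "$CONDA_PREFIX/bin/$tool" ]; then
        echo "$CONDA_PREFIX/bin/$tool"
    else
        echo "$tool is not installed in $CONDA_PREFIX" >&2
        return 1
    fi
}

# Print the path to pbmm2, or a hint if it cannot be resolved.
conda_tool_path pbmm2 || echo "pbmm2 could not be located in a conda environment"
```

The printed path is the value that can be given to sm-analysis when the conda environment is not active at the time sm-analysis runs.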

Blasr

Warning

PacBio no longer recommends Blasr as an aligner. The official recommendation is to use pbmm2. If, for whatever reason, you are still interested in using Blasr to align your BAM files, keep reading. Bear in mind, however, that since PacBio does not support Blasr, it may become hard to obtain this tool in the future, and the information in this section may be obsolete by the time you read it.

Probably the easiest way to install blasr is with conda. Have a look at Setting up Bioconda. Once you have followed those steps and the resulting conda environment is active, install blasr:

$ conda install blasr

Upon success, you will be able to pass the path to the blasr executable to sm-analysis if needed (see the section Telling sm-analysis where is the aligner below for details).

Warning

Note that, contrary to the suggestion given in PacBio & Bioconda, explicitly selecting the bioconda channel with the -c option of conda install (e.g., conda install -c bioconda ...) triggers a dependency error. DO NOT USE the -c bioconda option; just run conda install ... instead, as explained in the main text.

Note

At the time of this writing, the SMRT-link software server tool does not contain the blasr executable either.

Telling sm-analysis where is the aligner

If you install PacBio Data Processing and try to run sm-analysis, but it does not find the aligner program, you will get, as described before, an error like No such file or directory: 'pbmm2' (or ...: 'blasr', if you chose to use blasr).

If you don’t have an aligner on your target system, please read how to install one in the Pbmm2 or Blasr sections.

Once the aligner is installed, if it is not in the PATH, you still need to tell sm-analysis where it can be found. This is done with the command line option sm-analysis -a. The rest of this section explains that option with a little example.

Let us assume that PacBio Data Processing was installed inside a virtual environment located in:

/home/dave/.venvs/pdp

and let us assume that pbmm2 was installed in a conda environment at:

/home/dave/miniconda3

then, after activating the PacBio Data Processing virtual environment:

$ source /home/dave/.venvs/pdp/bin/activate

you can tell sm-analysis about pbmm2 by using the command line option sm-analysis -a, as follows:

$ sm-analysis -a /home/dave/miniconda3/bin/pbmm2

(the -a and --aligner options are equivalent).

On the other hand, if you want to use blasr, you must tell sm-analysis explicitly, using the sm-analysis --use-blasr-aligner option:

$ sm-analysis --use-blasr-aligner --aligner /home/dave/miniconda3/bin/blasr
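The rules in this section can be gathered in a small helper that prints the aligner-related options to add to an sm-analysis call. This is only a sketch: aligner_options is a hypothetical function, and it assumes that --aligner is needed only when the chosen aligner is not in the PATH. The only sm-analysis options used are --aligner and --use-blasr-aligner, both described above.

```shell
# Build the aligner-related options for sm-analysis.
# aligner_options is a hypothetical helper; only --aligner and
# --use-blasr-aligner come from the text above.
aligner_options() {
    name="$1"       # "pbmm2" or "blasr"
    fallback="$2"   # full path to use when the aligner is not in the PATH
    opts=""
    if [ "$name" = "blasr" ]; then
        opts="--use-blasr-aligner"
    fi
    if ! command -v "$name" >/dev/null 2>&1; then
        opts="$opts --aligner $fallback"
    fi
    # Trim the leading space left when only --aligner was added.
    echo "${opts# }"
}

# Print the options needed to run sm-analysis with blasr on this system.
aligner_options blasr /home/dave/miniconda3/bin/blasr
```

The printed options could then be placed in front of the remaining sm-analysis arguments. Because the output is meant to be split into words by the shell, this sketch would not work with paths that contain spaces.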