Aligners
The sm-analysis program delegates the alignment of BAM files to an external aligner program, which must be accessible at runtime. The current version of PacBio Data Processing (v. 1.3.0) uses the Pbmm2 aligner by default, but Blasr can be used as well.
The aligner program will be called on demand: if a suitable aligned file is found, the alignment process will be skipped for that file.
By default, the aligner is searched for in the PATH. If it is not found in the PATH, you will receive a common runtime error message:
[CRITICAL] [Errno 2] No such file or directory: 'pbmm2'
and the sm-analysis program itself will terminate.
In that case, the instructions in the following sections can help you.
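Before running sm-analysis, you can check whether the aligner is visible in the PATH. The following is a minimal sketch that relies on the POSIX `command -v` builtin; the helper name `check_aligner` is hypothetical and is not part of PacBio Data Processing.

```shell
#!/bin/sh
# check_aligner: hypothetical helper, not part of PacBio Data Processing.
# Prints what the shell resolves the given program name to (normally its
# full path) if it can be found; otherwise prints a hint to stderr and
# returns a non-zero status.
check_aligner() {
    aligner="$1"
    if resolved=$(command -v "$aligner"); then
        printf '%s\n' "$resolved"
    else
        printf '%s not found in PATH\n' "$aligner" >&2
        return 1
    fi
}

# Example: check for the default aligner without aborting the script.
check_aligner pbmm2 || true
```

If the check fails, the aligner is either not installed or installed outside the PATH; both cases are covered in the sections that follow.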
Pbmm2
The installation of the pbmm2 program is described in the pbmm2 repository: pbmm2 can be installed using conda. Have a look at Setting up Bioconda before installing pbmm2. Once conda is ready and you have an active conda environment, install pbmm2:
$ conda install pbmm2
Upon success, you will have a pbmm2 executable in your conda environment, and you will be able to pass the path to pbmm2 to sm-analysis if needed (see the section Telling sm-analysis where is the aligner below for details).
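To find the path that you would pass to sm-analysis, one option is to look inside the active conda environment. The sketch below assumes a Unix-like layout in which conda places executables under `bin/`, and it relies on the `CONDA_PREFIX` variable that conda sets when an environment is activated. The helper name `conda_tool_path` is hypothetical.

```shell
#!/bin/sh
# conda_tool_path: hypothetical helper, not part of conda or of
# PacBio Data Processing.
# Prints the path of an executable inside the active conda environment.
# Assumes that executables live under "$CONDA_PREFIX/bin".
conda_tool_path() {
    tool="$1"
    if [ -z "${CONDA_PREFIX:-}" ]; then
        printf 'no active conda environment\n' >&2
        return 1
    fi
    candidate="$CONDA_PREFIX/bin/$tool"
    if [ -x "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        printf '%s is not installed in %s\n' "$tool" "$CONDA_PREFIX" >&2
        return 1
    fi
}

# Example: print the path to pbmm2, if it is installed in the active environment.
conda_tool_path pbmm2 || true
```

The same helper works for blasr by passing `blasr` as the argument.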
Blasr
Warning
PacBio no longer recommends using Blasr as an aligner. The official recommendation is to use pbmm2. If, for whatever reason, you still want to use Blasr to align your BAM files, keep reading. Bear in mind that, since PacBio does not support Blasr, the tool may become hard to obtain in the future, and the information in this section may be obsolete by the time you read it.
Probably the easiest way to install blasr is with conda. Have a look at Setting up Bioconda. Once you have followed those steps and the resulting conda environment is active, install blasr:
$ conda install blasr
Upon success, you will be able to pass the path to the blasr executable to sm-analysis if needed (see the section Telling sm-analysis where is the aligner below for details).
Warning
Notice that, contrary to the suggestion given in PacBio & Bioconda, explicitly selecting the bioconda channel with the -c option of conda install (e.g., conda install -c bioconda ...) triggers a dependency error. DO NOT USE the -c bioconda option; just run conda install ... instead, as explained in the main text.
Note
At the time of this writing, the SMRT-link software server tool does not contain the blasr executable either.
Telling sm-analysis where is the aligner
If you install PacBio Data Processing and try to run sm-analysis but it cannot find the aligner program, you will, as described before, get an error like No such file or directory: 'pbmm2' (or ...: 'blasr', if you chose to use blasr).
If you don’t have an aligner on your target system, please read about how to install one at Pbmm2 or Blasr.
Once the aligner is installed, if it is not in the PATH, you still need to tell sm-analysis where it can be found with the command line option sm-analysis -a. The rest of this section explains that option with a small example.
Let us assume that PacBio Data Processing was installed inside a virtual environment located in:
/home/dave/.venvs/pdp
and let us assume that pbmm2 was installed in a conda environment at:
/home/dave/miniconda3
then, after activating the PacBio Data Processing virtual environment:
$ source /home/dave/.venvs/pdp/bin/activate
you can tell sm-analysis about pbmm2 by using the command line option sm-analysis -a, as follows:
$ sm-analysis -a /home/dave/miniconda3/bin/pbmm2
(the -a and --aligner options are equivalent).
On the other hand, if you want to use blasr, you must say so explicitly to the sm-analysis program with the sm-analysis --use-blasr-aligner option, like:
$ sm-analysis --use-blasr-aligner --aligner /home/dave/miniconda3/bin/blasr
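The two cases above differ only in the extra --use-blasr-aligner flag. As an illustration, the following sketch builds the aligner-related options for either program. The helper name `aligner_options` is hypothetical, and the paths are the example locations used in this section.

```shell
#!/bin/sh
# aligner_options: hypothetical helper, not part of PacBio Data Processing.
# Usage: aligner_options NAME LOCATION
# Prints the sm-analysis options needed to use the aligner NAME
# (pbmm2 or blasr) whose executable is found at LOCATION.
aligner_options() {
    name="$1"
    location="$2"
    case "$name" in
        pbmm2) printf '%s\n' "--aligner $location" ;;
        blasr) printf '%s\n' "--use-blasr-aligner --aligner $location" ;;
        *) printf 'unknown aligner: %s\n' "$name" >&2; return 1 ;;
    esac
}

# Example: prints "--use-blasr-aligner --aligner /home/dave/miniconda3/bin/blasr"
aligner_options blasr /home/dave/miniconda3/bin/blasr
```

The printed options can then be added to the sm-analysis command line together with the rest of its arguments. Note that this simple approach does not handle paths that contain spaces.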