HTSlib¶

PacBio Data Processing uses pysam, a wrapper around HTSlib, to read BAM files. Although installing PacBio Data Processing automatically installs pysam, HTSlib must be installed separately; otherwise PacBio Data Processing will fail at runtime.
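Since a missing HTSlib only shows up at runtime, it can help to check for it before launching a long job. The following is a minimal sketch, not part of PacBio Data Processing: `check_htslib` is a hypothetical helper, and it assumes that HTSlib's command-line tools (such as `htsfile`) were installed together with the library and end up on `PATH` whenever HTSlib is available. Finding them is only a proxy for the library being usable.

```shell
# Hypothetical helper, not part of PacBio Data Processing.
# Assumption: HTSlib's command-line tools (such as htsfile) are installed
# alongside the library and are on PATH whenever HTSlib is available.
check_htslib() {
    if command -v htsfile >/dev/null 2>&1; then
        echo "HTSlib tools found at: $(command -v htsfile)"
        return 0
    fi
    echo "HTSlib tools not found on PATH; install HTSlib or load its module first." >&2
    return 1
}
```

It could be called as `check_htslib || exit 1` before starting any executable that depends on HTSlib.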

Installing HTSlib¶

In the sections below, I briefly explain two ways to install HTSlib.

Standard installation¶

Probably the easiest way to install HTSlib is through your package manager. It can also be installed from source; see the HTSlib webpage for details.
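As an illustration of the package manager route, the sketch below prints a plausible install command for whichever package manager it finds. The function name `suggest_htslib_install` is hypothetical, and the package names (`libhts-dev`, `htslib-devel`, `htslib`) are assumptions based on common distributions; check your own system's repositories before running any of them.

```shell
# Hypothetical helper: prints a suggested command, it does not install anything.
# Package names are assumptions and may differ on your system.
suggest_htslib_install() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install libhts-dev"       # Debian/Ubuntu
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install htslib-devel"         # Fedora and relatives
    elif command -v brew >/dev/null 2>&1; then
        echo "brew install htslib"                   # Homebrew
    elif command -v conda >/dev/null 2>&1; then
        echo "conda install -c conda-forge -c bioconda htslib"
    else
        echo "No known package manager found; consider installing from source or with Spack." >&2
        return 1
    fi
}
```

Printing the command instead of running it keeps the sketch safe to try on a shared machine.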

Spack¶

Another simple way to install HTSlib is through Spack. It is especially convenient on a cluster, where using the system package manager is cumbersome or even impossible and installing from source is not appealing. In this case, the installation with Spack goes as follows.

(Optional) Choosing the compiler. HTSlib will be compiled from source code by Spack. You might need to choose an up-to-date compiler (clusters tend to have very stable, i.e. old, default compilers). See Using Spack for details.
Installing HTSlib itself. With the default compiler it would be:
```
$ spack install htslib
```
or if we want to install it with a specific compiler, say gcc-11.3:
```
$ spack install htslib%gcc@11.3
```
The result will be a module. In our case, the name of the module is htslib-1.14-gcc-11.3.0-22tiwx3; yours will differ depending on the HTSlib version, the compiler, and the Spack hash.
Using HTSlib. As mentioned above, PacBio Data Processing depends on HTSlib at runtime. This means that, after a successful installation, the created module must be loaded whenever HTSlib is needed:
```
$ module load htslib-1.14-gcc-11.3.0-22tiwx3
```
Warning

Remember to add the line:
```
module load htslib-1.14-gcc-11.3.0-22tiwx3
```
at the beginning of the Slurm batch scripts used to submit any executable from PacBio Data Processing.
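The warning above could translate into a batch script along the following lines. This is a sketch under assumptions: the file name `pdp_job.sh`, the `#SBATCH` values, and the final command line (`sm-analysis` with placeholder arguments) are illustrative and must be adapted to your cluster and to the executable you actually run. The module name is the one from the example above.

```shell
# Write a minimal Slurm batch script to a file (pdp_job.sh is a hypothetical name).
cat > pdp_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=pdp-example
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# Load HTSlib before running anything from PacBio Data Processing.
module load htslib-1.14-gcc-11.3.0-22tiwx3

# Placeholder: replace with the actual executable and its arguments.
sm-analysis input.bam reference.fasta
EOF

# Submit it with: sbatch pdp_job.sh
```

Keeping the `module load` line directly after the `#SBATCH` header ensures HTSlib is available before the first command that needs it.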
