Coding Station

BIOINFORMATICS

Over the last few years, biological data generation has expanded rapidly, particularly in the field of sequencing, with the emergence of Next Generation Sequencing (NGS) machines.

However, the analysis of these large datasets is becoming more complicated. In our bioinformatics workflows, we use state-of-the-art technologies and up-to-date, trustworthy databases to interpret our genomics data.

Pathogenicity prediction of genetic variants

De novo assembly

RNA quantification

Gene expression analysis

Quality control of NGS sequencing data

Reference mapping

Variant identification

Variant annotation

Standard bioinformatic pipeline

01

Sequencing

The Illumina next-generation sequencing (NGS) method is based on sequencing-by-synthesis with reversible dye terminators, which enable the identification of single bases as they are incorporated into DNA strands. Binary Base Call (BCL) files are the raw data files generated by Illumina sequencers.

02

Fastq generation

FASTQ is a text-based sequencing data file format that stores both raw sequence data and quality scores (FASTQ).
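To illustrate how a FASTQ record stores sequence and quality together, here is a minimal Python sketch. It assumes the common four-line record layout and Phred+33 quality encoding; the function name `parse_fastq` and the example read are hypothetical and are not part of our pipeline.

```python
def parse_fastq(text):
    """Yield (read_id, sequence, qualities) from FASTQ text.

    Assumes four lines per record and Phred+33 quality encoding.
    """
    lines = text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ records must span exactly four lines")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Phred+33: quality score = ASCII code of the character minus 33
        yield header[1:], seq, [ord(c) - 33 for c in qual]


# Hypothetical single-read example
example = "@read1\nACGT\n+\nII5!\n"
records = list(parse_fastq(example))
```

Each quality character maps to one base, so in this example the last base has a Phred score of 0, which a quality filter would normally flag.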

03

Adapter trimming, quality filtering

Sequences corresponding to the library adapters can be present in the FASTQ files and should be removed from reads because they interfere with downstream analyses, such as alignment of reads to a reference. FastQC aims to provide a simple way to do quality control checks on raw sequence data coming from high-throughput sequencing pipelines (trimmed FASTQ, FASTQC).

Software: FastQ Toolkit
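The idea behind 3' adapter trimming can be sketched in a few lines of Python. This is a simplified illustration using exact matching only, not a description of how FastQ Toolkit works internally; the function `trim_adapter`, the `min_overlap` parameter, and the example reads are assumptions made for the sketch. Production trimmers also tolerate mismatches.

```python
def trim_adapter(seq, qual, adapter, min_overlap=3):
    """Remove a 3' adapter (exact match only) from a read and its quality string."""
    # Case 1: the full adapter occurs somewhere in the read
    pos = seq.find(adapter)
    if pos == -1:
        # Case 2: the read ends with a prefix of the adapter (partial adapter)
        longest = min(len(adapter) - 1, len(seq))
        for overlap in range(longest, min_overlap - 1, -1):
            if seq.endswith(adapter[:overlap]):
                pos = len(seq) - overlap
                break
    if pos == -1:
        return seq, qual  # no adapter found, read is unchanged
    return seq[:pos], qual[:pos]


# Example adapter sequence, used here only for illustration
ADAPTER = "AGATCGGAAGAGC"

full = trim_adapter("ACGTACGT" + ADAPTER + "TT", "I" * 23, ADAPTER)
partial = trim_adapter("ACGTACGTAGATC", "I" * 13, ADAPTER)
clean = trim_adapter("ACGTACGT", "I" * 8, ADAPTER)
```

The quality string is cut at the same position as the sequence so that the two stay aligned base by base.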

04

Reference mapping

The graph-based alignment method uses alt-aware mapping: population haplotypes are stitched into the reference with known alignments, establishing alternate graph paths that reads can seed-map and align to. A BAM file is the compressed binary version of a SAM file that is used to represent aligned sequences (BAM).
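Because BAM is the binary form of SAM, the content of an alignment record is easiest to show on a SAM text line. The Python sketch below reads the first six mandatory SAM fields and decodes two standard FLAG bits; the `SamRecord` class, the `parse_sam_line` helper, and the example line are hypothetical and only meant as an illustration.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str   # read name
    flag: int    # bitwise FLAG
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description

    @property
    def is_unmapped(self):
        return bool(self.flag & 0x4)   # bit 0x4: read is unmapped

    @property
    def is_reverse(self):
        return bool(self.flag & 0x10)  # bit 0x10: read maps to the reverse strand


def parse_sam_line(line):
    """Parse one tab-separated SAM alignment line (not a header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 fields")
    return SamRecord(fields[0], int(fields[1]), fields[2],
                     int(fields[3]), int(fields[4]), fields[5])


# Hypothetical alignment line
record = parse_sam_line("read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII")
```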

05

Variant calling

The Variant Caller takes mapped and aligned DNA reads as input and calls SNPs and indels through a combination of column-wise detection and local de novo assembly of haplotypes. VCF is a text file format that contains information about variants found at specific positions in a reference genome (VCF).
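As a small illustration of the VCF layout, the Python sketch below reads the fixed columns of each data line and labels every alternate allele as a SNP or an indel by comparing allele lengths. The helpers `classify_variant` and `parse_vcf` and the two example variants are hypothetical; symbolic alleles such as `<DEL>` are not handled in this simplified version.

```python
def classify_variant(ref, alt):
    """Label a variant by comparing REF and ALT allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "indel"
    return "other"  # e.g. multi-nucleotide substitutions of equal length


def parse_vcf(text):
    """Yield (chrom, pos, ref, alt, kind) for each alternate allele."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information and the column header
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # ALT may list several alleles
            yield chrom, int(pos), ref, alt, classify_variant(ref, alt)


# Hypothetical two-variant example
example_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t12345\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t22222\t.\tAT\tA\t45\tPASS\t.\n"
)
variants = list(parse_vcf(example_vcf))
```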

06

Variant annotation

Nirvana provides clinical-grade annotation of genomic variants and is developed under a rigorous testing process to ensure accurate results and to enable embedding in other software with regulatory needs. VarSome Premium is a CE IVD-certified and HIPAA-compliant platform allowing fast and accurate variant discovery, annotation, and interpretation of NGS data (final report).

Software: Nirvana, VarSome Premium

If you’d like more information about our services and applications, get in touch today.
