OranGenomi X

Welcome to our website!

Welcome to our projects page!


We create genomics pipelines that scale in parallel across computer clusters.

We help researchers achieve their goals faster by reducing the programming burden and the infrastructure headaches.


We provide:



Well-tested Genomics Pipelines that run on Apache Spark.

This means pipelines that scale properly and make good use of cluster resources.



Scalable on both CPU and GPU.

Need more computing power? Just add more nodes, and your task will finish sooner.



Fully Integrated with Scala and Python's Data Science stack.

You can pick from Apache Spark-friendly languages.



Simplified infrastructure on AWS EMR using CloudFormation.

No need to hire an IT team and wait months to purchase hardware: just choose the type of cluster you need on EMR, and you will be running in minutes.



Sample Notebooks.

With examples to get you started quickly.
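
As an illustration of the "choose the type of cluster you need" idea, here is a minimal sketch of how a CloudFormation template for a Spark cluster on EMR could be generated from Python. It is not our actual template: the function name `emr_cluster_template`, the logical resource name `GenomicsCluster`, the cluster name, and the default instance type and release label are all assumptions chosen for the example. The role names are the AWS default EMR roles and may differ in your account.

```python
import json


def emr_cluster_template(core_nodes, instance_type="m5.xlarge",
                         release_label="emr-6.15.0"):
    """Build a minimal CloudFormation template (as a dict) for Spark on EMR.

    All names and defaults here are illustrative assumptions.
    """
    if core_nodes < 1:
        raise ValueError("core_nodes must be at least 1")
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical Spark cluster for genomics pipelines",
        "Resources": {
            "GenomicsCluster": {  # hypothetical logical resource name
                "Type": "AWS::EMR::Cluster",
                "Properties": {
                    "Name": "genomics-spark-cluster",  # hypothetical name
                    "ReleaseLabel": release_label,
                    "Applications": [{"Name": "Spark"}],
                    "Instances": {
                        # One master node coordinates the cluster.
                        "MasterInstanceGroup": {
                            "InstanceCount": 1,
                            "InstanceType": instance_type,
                        },
                        # Core nodes do the work; raise the count to scale out.
                        "CoreInstanceGroup": {
                            "InstanceCount": core_nodes,
                            "InstanceType": instance_type,
                        },
                    },
                    # AWS default EMR roles; assumed to exist in the account.
                    "JobFlowRole": "EMR_EC2_DefaultRole",
                    "ServiceRole": "EMR_DefaultRole",
                },
            }
        },
    }
```

Calling `json.dumps(emr_cluster_template(core_nodes=4), indent=2)` produces a JSON document that could be used as a starting point for a stack. Changing `core_nodes` or `instance_type` is the only edit needed to ask for a larger or different cluster.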

Spark DeepVariant



DeepVariant is a variant caller released by Google in 2017. Since then it has proven accurate in many different benchmarks. However, using it at scale to tackle collections of data too large for a single machine is not trivial.


We provide:


A fully optimized core algorithm that works about twice as fast as the original implementation.


Well-documented interface integrated into Jupyter Notebooks.


Optimized example generation stage that runs fully in parallel.


Optimized post-processing.


Streaming mode: drop an input file on S3 (or any other distributed file system of your preference) and start getting results in near real-time.


Optimized Dataset Downloader: download reference datasets faster and more safely.


Affordable: about 40 cents an hour, regardless of cluster size.
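
The streaming mode listed above follows a familiar pattern: watch a location, pick up each new file once, and hand it to the pipeline. The sketch below shows that pattern with the Python standard library, using a local directory as a stand-in for S3. It is not our implementation; the function names `new_files` and `watch` are hypothetical, and a real deployment would more likely rely on a file source in Spark Structured Streaming.

```python
import os
import time


def new_files(directory, seen):
    """Return paths of files in `directory` not yet in `seen`, sorted by name.

    Newly found names are added to `seen`, so each file is reported once.
    """
    fresh = sorted(
        name for name in os.listdir(directory)
        if name not in seen and os.path.isfile(os.path.join(directory, name))
    )
    seen.update(fresh)
    return [os.path.join(directory, name) for name in fresh]


def watch(directory, handle, polls, interval=1.0):
    """Poll `directory` `polls` times, calling `handle(path)` once per new file.

    Returns the list of values returned by `handle`, in processing order.
    """
    seen = set()
    results = []
    for _ in range(polls):
        for path in new_files(directory, seen):
            results.append(handle(path))
        time.sleep(interval)
    return results
```

For example, `watch("/data/incoming", print, polls=10)` would print the path of every file that appears in the (hypothetical) `/data/incoming` directory during ten one-second polls. Here `print` stands in for whatever function submits the file to the pipeline.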

OranGenomi X

CONTACT US




