BioMart is an amazing resource of well-curated genomic annotations - until you actually need to download data programmatically... I gave it a try for a couple of hours using the biomaRt R package, only to realise my query wouldn't be served in our lifetime... However, I then moved on to try BioMart's REST API. That's a … Continue reading Accessing BioMart with REST API and multi-threading (Python3)
Category: bioinformatics
Genotype counts & sports games
In today's post, we'll just be doing some... simple counting with Genotype Counts (GC) at a Cambridge pub. 😎 The general form of GC with a ref allele A and multiple alternative alleles (B, C, D, etc.) is: GC = AA, AB, BB, AC, BC, CC, AD, BD, CD, DD, ... GC fields basically capture the likelihood of two events occurring simultaneously at two spots. … Continue reading Genotype counts & sports games
gnomAD: expanding multi-allelic variants in VCF (Part 2)
VCF Playground - Level 2 Following my previous post on GC elements' order, I'm now going to present an empirical proof of this convention! As a reminder, the inferred order of elements within a GC field is: GC=AA, AB,BB, AC,BC,CC, AD,BD,CD,DD, AE,BE,CE,DE,EE, ... (1) Notation used: Reference Allele: A Alternative Alleles: B, C, D, ... (in order of appearance in the VCF … Continue reading gnomAD: expanding multi-allelic variants in VCF (Part 2)
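One way to use this ordering when expanding a multi-allelic entry is to look up each genotype's position directly. The sketch below is not taken from the post: it assumes the GC field follows the same genotype ordering the VCF specification defines for diploid per-genotype fields, where genotype j/k (with j <= k, and 0 standing for the reference allele) sits at index k(k+1)/2 + j. The helper names `gc_index` and `split_gc` and the example counts are hypothetical.

```python
def gc_index(j, k):
    """Position of diploid genotype j/k in a GC-style list (0 = ref allele)."""
    if j > k:
        j, k = k, j  # genotypes are unordered, so normalise to j <= k
    return k * (k + 1) // 2 + j


def split_gc(gc, n_alt):
    """For each alt allele, pick out (ref/ref, ref/alt, alt/alt) counts.

    Genotypes carrying two different alt alleles (e.g. BC) are skipped here,
    which is a simplification made for this sketch.
    """
    expected = (n_alt + 1) * (n_alt + 2) // 2
    if len(gc) != expected:
        raise ValueError(f"expected {expected} GC entries, got {len(gc)}")
    return [
        (gc[gc_index(0, 0)], gc[gc_index(0, i)], gc[gc_index(i, i)])
        for i in range(1, n_alt + 1)
    ]


# Made-up counts for ref A and alts B, C, in the order AA, AB, BB, AC, BC, CC
example_gc = [10, 3, 1, 2, 0, 4]
print(split_gc(example_gc, n_alt=2))  # [(10, 3, 1), (10, 2, 4)]
```

Whether dropping the alt/alt heterozygotes is acceptable depends on what the expanded records are used for, so treat that choice as something to revisit.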
gnomAD: expanding multi-allelic variants in VCF (Part 1)
VCF Playground - Level 1 TL;DR: Order of elements in the GC field of the gnomAD VCF for multi-allelic entries: GC = AA, AB, BB, AC, BC, CC, AD, BD, CD, DD, AE, BE, CE, DE, EE, ... Ref. allele: A; Alt. alleles: B, C, D, ... Ever tried to make sense of the infamous VCF format? … Continue reading gnomAD: expanding multi-allelic variants in VCF (Part 1)
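The ordering in the TL;DR can be generated rather than memorised. Here is a minimal sketch, assuming only the pattern shown above (for each allele in turn, pair it with every allele up to and including itself); the function name `gc_order` is made up for illustration.

```python
from string import ascii_uppercase


def gc_order(n_alleles):
    """Genotype labels in GC order for n_alleles alleles (A = ref, B.. = alts)."""
    alleles = ascii_uppercase[:n_alleles]
    return [
        alleles[j] + alleles[k]
        for k in range(n_alleles)   # second allele of the pair
        for j in range(k + 1)       # first allele, never after the second
    ]


print(", ".join(gc_order(4)))
# AA, AB, BB, AC, BC, CC, AD, BD, CD, DD
```

For n alleles in total (ref plus alts) this yields n(n+1)/2 labels, so a site with one ref and three alt alleles has 10 GC entries.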