geneplot
- geneplot.createGFFdb(gff3file)
Creates a sqlite3 database from a GFF3 file using the gffutils Python package.
- Parameters
gff3file (str) – path to the GFF3 file
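To illustrate what storing a GFF3 file in a SQLite database means, here is a minimal sketch that uses only the Python standard library. It is not how createGFFdb or gffutils work internally; gffutils builds a richer schema with parent/child relations. The function name gff3_to_sqlite, the table layout, and the example records are all assumptions made for illustration.

```python
import sqlite3


def gff3_to_sqlite(gff3_text, conn):
    """Load GFF3 feature lines into a simple 'features' table.

    Hypothetical helper with an illustrative schema; not part of geneplot
    or gffutils.
    """
    conn.execute(
        "CREATE TABLE features (seqid TEXT, source TEXT, featuretype TEXT, "
        "start_pos INTEGER, end_pos INTEGER, strand TEXT, attributes TEXT)"
    )
    for line in gff3_text.splitlines():
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:  # GFF3 feature lines have nine tab-separated columns
            continue
        seqid, source, ftype, start, end, _score, strand, _phase, attrs = cols
        conn.execute(
            "INSERT INTO features VALUES (?, ?, ?, ?, ?, ?, ?)",
            (seqid, source, ftype, int(start), int(end), strand, attrs),
        )
    conn.commit()
    return conn


# Made-up annotation for a single two-exon gene.
EXAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=mRNA1;Parent=gene1",
    "chr1\tdemo\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=mRNA1",
    "chr1\tdemo\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mRNA1",
])

conn = gff3_to_sqlite(EXAMPLE_GFF3, sqlite3.connect(":memory:"))
exons = conn.execute(
    "SELECT start_pos, end_pos FROM features "
    "WHERE featuretype = 'exon' ORDER BY start_pos"
).fetchall()
print(exons)
```

Once the annotation is in a database, features such as the exons of one transcript can be retrieved with a query instead of re-parsing the whole file.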
- class geneplot.genome(gff3file, iprfile=None, vcffiles=None)
Bases: object
Instantiates a genome object whose genome-associated data sources are stored as paths in class attributes. The data include the GFF3 file of the genome annotation (positional), the InterProScan output of protein domains identified on protein-coding genes (keyword), and a directory of VCF files of polymorphisms (keyword).
- Parameters
gff3file (str) – path to the GFF3 file
iprfile (str) – path to the InterProScan output file
vcffiles (str) – path to the VCF files directory
- class gene(mRNAid, proteinid=None, description=None)
Bases: object
Instantiates a gene object whose plot() method represents the intron/exon structure of the gene from a GFF3 file, the protein domain topology from the InterProScan output, and single nucleotide polymorphisms (SNPs) from VCF files.
- Parameters
mRNAid (str) – gene identifier (ID) according to the GFF3 file annotations.
proteinid (str) – protein identifier (ID) from the InterProScan output
description (str) – user-defined description of the gene
- getsnppos(sp, vcffiles, onlycoding=True)
Selects SNPs that overlap the genomic coordinates of the gene (the class instance) from the VCF file whose sample ID matches the “sp” parameter. SnpEff annotations are read from the VCF file; if they are absent, the selected SNPs are annotated de novo.
- Parameters
sp (str) – Species ID to select SNP data from the VCF file
vcffiles (str) – path to the VCF files directory
onlycoding (boolean) – if True, select only SNPs located in coding regions of the gene
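The selection logic described for getsnppos can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the function select_snps, its arguments, and the coordinates are hypothetical, and it works on bare positions rather than on VCF records. Coordinates are treated as 1-based and inclusive, as in GFF3 and VCF.

```python
def select_snps(snp_positions, gene_start, gene_end, cds_intervals, onlycoding=True):
    """Return SNP positions that fall inside the gene.

    Hypothetical helper for illustration. With onlycoding=True, positions
    outside every coding (CDS) interval are dropped as well.
    """
    selected = []
    for pos in snp_positions:
        # Keep only SNPs that overlap the gene span.
        if not (gene_start <= pos <= gene_end):
            continue
        in_cds = any(start <= pos <= end for start, end in cds_intervals)
        if onlycoding and not in_cds:
            continue
        selected.append(pos)
    return selected


# Made-up data: a gene spanning 100-900 with two coding intervals.
snps = [50, 150, 450, 700, 950]
cds = [(100, 300), (600, 900)]

coding_snps = select_snps(snps, 100, 900, cds)
gene_snps = select_snps(snps, 100, 900, cds, onlycoding=False)
print(coding_snps)  # [150, 700]
print(gene_snps)    # [150, 450, 700]
```

In this sketch the SNP at position 450 lies in the intron, so it is kept only when onlycoding is False.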
- plot(domtype='Pfam', sp=None, onlycoding=True)
Plots the features of the gene (the class instance) gathered by the other methods of the class: exons and UTRs (the latter only if present in the GFF3 file), InterPro protein domains, and SNPs. Each SNP is labelled with its genotype from the VCF file and colored by SnpEff impact: LOW green, MODERATE amber, MODIFIER pink, HIGH red. A PNG image is generated.
- Parameters
domtype (str) – protein domain type (as specified in the InterProScan output) to be plotted (Pfam by default).
sp (str) – Species ID to select SNP data from the VCF file.
onlycoding (boolean) – if True, plot only SNPs located in coding regions of the gene
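The impact-to-color scheme described for plot() can be written as a small lookup table. This is a sketch of the mapping only; the names IMPACT_COLORS and impact_color, the grey fallback, and the use of plain color names as labels are assumptions, since the exact color values used by the library are not stated here.

```python
# SnpEff impact categories mapped to the colors named in the plot() description.
# The strings are labels only; "amber" would need a concrete value (for example
# a hex code) before being passed to a plotting library.
IMPACT_COLORS = {
    "LOW": "green",
    "MODERATE": "amber",
    "MODIFIER": "pink",
    "HIGH": "red",
}


def impact_color(impact, default="grey"):
    """Return the color label for a SnpEff impact (hypothetical helper).

    Unknown impact values fall back to `default`, which is an assumption
    of this sketch and not documented library behavior.
    """
    return IMPACT_COLORS.get(impact.upper(), default)


print(impact_color("HIGH"))      # red
print(impact_color("moderate"))  # amber
```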