Upon accessing HmtDB, a web page introduces the user to the resources available via the top links. Clicking Access HmtDB displays the main menu, whose macro-functions are described below.
The query function allows users to retrieve data from HmtDB. This retrieval is performed according to an advanced search system carried out through the Boolean combination of different criteria, corresponding to the fields defined in the design of the database; the table below summarizes the whole set of criteria.
The retrieval output displays a list briefly describing the retrieved genomes. Clicking the Database Source Identifier or the Pubmed ID opens the corresponding NCBI link, which displays the GenBank entry or the reference abstract, respectively. Clicking the HmtDB Identifier displays the associated genome card. Multiple genomes may be browsed at once by selecting them with the checkboxes on the left and then clicking View Genome Cards at the bottom of the page. Both the multi-alignments and the sequences of the selected genomes can be downloaded; when downloading multi-alignment data, it is suggested to also download the rCRS reference sequence and to use the SeaView editor.
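The Boolean combination of criteria described above can be illustrated with a minimal Python sketch. It is not HmtDB's implementation: the `Genome` fields, the identifiers `G1` to `G4`, and the helper names `all_of`, `any_of`, and `negate` are all hypothetical, chosen only to show how AND, OR, and NOT compose over a few of the criteria.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Genome:
    # Hypothetical record holding a handful of the searchable fields.
    hmtdb_id: str
    continent: str
    complete: bool          # True for complete genomes, False for coding region only
    individual_type: str    # "healthy" or "pathologic"

Criterion = Callable[[Genome], bool]

def all_of(*criteria: Criterion) -> Criterion:
    """Boolean AND: every criterion must hold."""
    return lambda g: all(c(g) for c in criteria)

def any_of(*criteria: Criterion) -> Criterion:
    """Boolean OR: at least one criterion must hold."""
    return lambda g: any(c(g) for c in criteria)

def negate(criterion: Criterion) -> Criterion:
    """Boolean NOT."""
    return lambda g: not criterion(g)

# Invented example data, not real HmtDB entries.
genomes = [
    Genome("G1", "Europe", True, "healthy"),
    Genome("G2", "Asia", True, "pathologic"),
    Genome("G3", "Europe", False, "healthy"),
    Genome("G4", "Africa", True, "healthy"),
]

# Complete genomes AND (Europe OR Africa) AND NOT pathologic.
query = all_of(
    lambda g: g.complete,
    any_of(lambda g: g.continent == "Europe", lambda g: g.continent == "Africa"),
    negate(lambda g: g.individual_type == "pathologic"),
)

selected = [g.hmtdb_id for g in genomes if query(g)]
print(selected)
```

Representing each criterion as a predicate keeps the combinators independent of the fields, so further criteria can be added without changing `all_of`, `any_of`, or `negate`.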
List and description of criteria that may be combined when using the HmtDB retrieval system
|HmtDB Genome Identifier||A pop-up menu lets the user select a genome whose HmtDB genome identifier is known|
|Reference DB Identifier||A pop-up menu lets the user select a genome whose INSDC accession number is known|
|Subject's Geographical Origin||A pop-up menu lets the user select genomes whose associated subject belongs to a specific continent/country|
|Haplogroup User Code||A pop-up menu lets the user select genomes matching a specific haplogroup as assigned in the associated paper*|
|Complete Genomes/Only Coding Regions||Selects either complete genomes or genomes that do not include the D-loop region|
|SNP Position||Position in the rCRS reference sequence at which the selected genomes carry a mutation|
|Variation Type||In addition to the position search, it is possible to search for genomes whose selected position reports transitions or transversions only, and to search for a specific transition or transversion. It is also possible to search for genomes reporting insertions and/or deletions at specific positions|
|Subject's Age||Genomes whose related subject had a specific age at sampling time|
|Subject's Sex||Genomes from subjects of a specific sex|
|DNA Source||Genomes sequenced from samples extracted from a specific tissue|
|Individual Type||Genomes from healthy or pathologic datasets or from a phenotype related to a specific disease|
|References||Selects genomes related to a specific paper or to papers published by a specific author, or a genome identified in the paper by a specific haplotype code|
*The Haplogroup User Code may not match the haplogroup predicted by the classifier tool, because the former was assigned when the genome was published, while the latter is assigned according to the latest update of Phylotree. The Haplogroup User Code therefore reports both the user-assigned and the "best predicted" haplogroups, the latter obtained by applying the stand-alone version of mt-classifier. The two codes are separated by a /.
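The Variation Type criterion in the table above distinguishes transitions from transversions. As a reminder of the distinction, the following sketch classifies a single-base substitution: a transition swaps two purines (A, G) or two pyrimidines (C, T), while a transversion swaps a purine for a pyrimidine or vice versa. The function name `variation_type` is hypothetical and the code is not taken from HmtDB.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def variation_type(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected single bases from A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(variation_type("A", "G"))
print(variation_type("A", "C"))
```

Insertions and deletions are deliberately out of scope here; they would need a separate check on allele lengths.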
The classification procedure is performed on every genome stored in HmtDB and predicts its haplogroup on the basis of the classification available through Phylotree. It consists of the automatic comparison between a single human mitochondrial genome (query) and the RSRS reference sequence, with the aim of detecting the pattern of mtDNA SNPs in the query genome. The comparison is based on the MUSCLE software. The haplogroup of the query sequence is then predicted by matching the obtained pattern against the Phylotree haplogroup classification stored in HmtDB. A genome card of the subject genome is generated and displayed; the haplogroup prediction is expressed as a list of haplogroups for which a match was found, and for each haplogroup the percentage of detected variations with respect to the total number of variations defining the haplogroup is given (with a threshold of 95%).
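The percentage reported for each haplogroup can be illustrated with a short sketch. It assumes that variants are compared as plain labels and that the 95% threshold means "at or above 95%"; neither detail is stated above. The haplogroup names `HG_A` to `HG_C` and the variant labels are invented for illustration and do not come from Phylotree.

```python
def score_haplogroups(query_variants, haplogroup_definitions, threshold=95.0):
    """Return (haplogroup, percentage) pairs whose percentage meets the threshold.

    The percentage is the share of a haplogroup's defining variations
    that were detected in the query genome.
    """
    results = []
    for name, defining in haplogroup_definitions.items():
        if not defining:
            continue  # skip haplogroups with no defining variations
        matched = len(defining & query_variants)
        percentage = 100.0 * matched / len(defining)
        if percentage >= threshold:  # assumption: threshold is inclusive
            results.append((name, percentage))
    # Highest percentage first, ties broken by name.
    return sorted(results, key=lambda item: (-item[1], item[0]))

# Invented variant labels (position followed by the observed base).
query = {"100A", "200G", "300T", "400C"}
definitions = {
    "HG_A": {"100A", "200G"},                  # 2 of 2 detected
    "HG_B": {"100A", "200G", "300T", "999C"},  # 3 of 4 detected
    "HG_C": {"100A", "300T", "400C"},          # 3 of 3 detected
}

predictions = score_haplogroups(query, definitions)
print(predictions)
```

In this toy data `HG_B` reaches only 75% and is therefore dropped from the prediction list.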
MToolBox is an automated pipeline developed to analyze human mtDNA from High-Throughput Sequencing data. It has customizable parameters and can analyze multiple samples in a single run. MToolBox outputs a VCF file, a standard format for large-scale genotyping information, suitably customized for mitochondrial data by including the heteroplasmy fraction and its related confidence interval. MToolBox also provides users with essential analyses of the reconstructed mitochondrial genomes, e.g. haplogroup assignment and variant prioritization, exploiting a broad collection of annotation resources and thus supporting the recognition of candidate mitochondrial mutations in clinical studies. Complete reference can be found here.
MToolBox is available both as a command-line stand-alone tool (accessible on MToolBox - GitHub) and as a graphical-interface web version hosted on the MSeqDR website (accessible on MtoolBox - MSeqDR).
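To show how a heteroplasmy fraction and its confidence interval might be read from such a VCF, here is a minimal sketch. It relies only on the general VCF convention that the FORMAT column lists colon-separated keys and each sample column lists the matching values. The keys `HF`, `CILOW`, and `CIUP`, as well as the record itself, are assumptions made for illustration and should be checked against the header of an actual MToolBox output file.

```python
def parse_sample_fields(format_column: str, sample_column: str) -> dict:
    """Pair the colon-separated FORMAT keys with the sample's values."""
    keys = format_column.split(":")
    values = sample_column.split(":")
    return dict(zip(keys, values))

# Invented single-sample record; the HF/CILOW/CIUP keys are assumed, not confirmed.
line = "chrM\t3010\t.\tG\tA\t.\tPASS\t.\tGT:DP:HF:CILOW:CIUP\t0/1:1200:0.25:0.22:0.28"

columns = line.split("\t")
fields = parse_sample_fields(columns[8], columns[9])  # FORMAT and first sample

heteroplasmy = float(fields["HF"])
interval = (float(fields["CILOW"]), float(fields["CIUP"]))
print(heteroplasmy, interval)
```

A site with several alternate alleles may carry comma-separated values in these fields, which this sketch does not handle.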
The downloading function allows the download of the multi-alignments (MAs) of the entire healthy and pathologic samples, of continent-specific datasets, and of variability data obtained by applying the SiteVar/MitVarProt software.