Founders of HGMD

2026-01-25

How the Human Gene Mutation Database (HGMD) became the reference catalogue of published disease-causing gene variants and evolved into a licensed clinical infrastructure resource.

By the early 1990s, molecular genetics was producing a growing number of reports describing mutations responsible for inherited disease. Each study documented its variants in isolation, and the reports were scattered across hundreds of journals. Researchers and clinicians trying to interpret patient variants had to search the literature manually to determine whether a mutation had already been observed or linked to disease.

Interpretation, not discovery, became the practical bottleneck. Without a central catalogue, variant classification was slow, inconsistent, and heavily dependent on individual expertise. Diagnostic laboratories often repeated literature searches already performed elsewhere, and published mutations risked being overlooked or reported again as new.

HGMD emerged from efforts at the Institute of Medical Genetics at Cardiff University, led by David N. Cooper and collaborators, initially to study patterns and mechanisms of human gene mutation. Beginning in the mid-1990s, the project systematically collected published reports of germline mutations responsible for inherited disease, extracting and standardising variant descriptions into a structured database.

The technical shift was not algorithmic but operational. HGMD introduced continuous manual curation of the literature, so that each entry linked a mutation to its gene, disease context, and original publication using consistent nomenclature. Only mutations described in peer-reviewed reports were included, which kept speculative and unpublished data out of the database. Over time, HGMD expanded to cover diverse mutation classes, including missense changes, splice defects, insertions, deletions, regulatory variants, and complex rearrangements.

The result was a searchable reference allowing laboratories to check whether a variant had already been reported and under what clinical circumstances. Variant interpretation workflows could move from literature hunting to database lookup, substantially reducing interpretation effort.
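The move from literature hunting to database lookup can be sketched in a few lines of Python. This is a minimal illustration only: the `CuratedEntry` fields, the `build_index` and `lookup` helpers, and the sample records are all hypothetical and do not reflect HGMD's actual schema, API, or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuratedEntry:
    """One curated record. Field names are illustrative, not HGMD's schema."""
    gene: str            # gene symbol
    variant: str         # variant description in a consistent nomenclature
    mutation_class: str  # e.g. "missense", "splice", "deletion"
    disease: str         # disease context given in the publication
    citation: str        # original peer-reviewed report


def build_index(entries):
    """Index entries by (gene, variant) so one lookup replaces a literature search."""
    index = {}
    for entry in entries:
        key = (entry.gene.upper(), entry.variant)
        index.setdefault(key, []).append(entry)
    return index


def lookup(index, gene, variant):
    """Return every curated report for a variant, or an empty list if none exists."""
    return index.get((gene.upper(), variant), [])


# Made-up placeholder records, not real HGMD data.
entries = [
    CuratedEntry("GENE1", "c.100A>G", "missense", "Example disorder A", "Placeholder report 1"),
    CuratedEntry("GENE1", "c.100A>G", "missense", "Example disorder A", "Placeholder report 2"),
    CuratedEntry("GENE2", "c.250del", "deletion", "Example disorder B", "Placeholder report 3"),
]
index = build_index(entries)

for hit in lookup(index, "gene1", "c.100A>G"):
    print(hit.disease, "|", hit.citation)
```

Keying the index on a normalised gene symbol plus a consistently written variant description is what makes the lookup reliable, which is why the curation step of standardising nomenclature matters as much as the collection itself.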

| Item | Details |
| --- | --- |
| Founders / developers | David N. Cooper and collaborators at Cardiff University |
| Institutions | Institute of Medical Genetics, Cardiff University |
| First public release | 1996 (web availability) |
| Licensing model | Public academic version free; commercial and professional access licensed |
| Pricing model | Subscription pricing via commercial partner; exact pricing not publicly disclosed |
| Distribution mechanism | Public web database and licensed professional database distributions |
| Adoption scale | Used globally in clinical genetics and research laboratories |
| Maintenance model | Continuous expert manual curation at Cardiff University, supported by licensing income |
| Primary use | Identification and classification of inherited disease mutations |
| Current status | Actively maintained with quarterly professional releases |

Adoption spread pragmatically. Laboratories and researchers integrated HGMD into diagnostic and research workflows because it saved time and reduced uncertainty when evaluating candidate variants. As genetic testing expanded into clinical practice, the need for consistent variant interpretation made curated mutation databases essential infrastructure.

However, maintaining a manually curated mutation database required sustained funding and specialist effort. By the late 2000s, HGMD had grown dramatically in size, and continued curation demanded long-term financial support. A hybrid funding model emerged: academic users retained access to a public version of the database, while commercial distribution was licensed through industry partners. Initially marketed via BIOBASE GmbH and later through QIAGEN, a professional subscription version provided more current data and enhanced functionality.

Under this model, mutation data are made freely available to academic users after a delay, while commercial and clinical users subscribe to HGMD Professional for immediate access and advanced search capabilities. Licensing revenue supports continued curation and database maintenance. Subscription pricing is not publicly listed, but licenses are typically institutional and integrated into commercial clinical interpretation platforms.
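The two-tier access model described above can be pictured as a simple date filter. The sketch below is an assumption-laden illustration: the one-year `PUBLIC_DELAY`, the `visible_entries` function, the tier names, and the record layout are all hypothetical, since the text says only that public release follows "a delay".

```python
from datetime import date, timedelta

# Hypothetical embargo length; the actual delay is not stated in this article.
PUBLIC_DELAY = timedelta(days=365)


def visible_entries(entries, tier, today, delay=PUBLIC_DELAY):
    """Return the entries a given tier can see on a given date."""
    if tier == "professional":
        # Subscribers see every curated entry immediately.
        return list(entries)
    if tier == "public":
        # Public users see an entry only once the delay has elapsed.
        return [e for e in entries if e["curated_on"] + delay <= today]
    raise ValueError(f"unknown tier: {tier!r}")


# Made-up placeholder records, not real HGMD data.
entries = [
    {"id": "rec-1", "curated_on": date(2020, 1, 15)},
    {"id": "rec-2", "curated_on": date(2024, 6, 1)},
]
today = date(2024, 12, 1)

public_ids = [e["id"] for e in visible_entries(entries, "public", today)]
professional_ids = [e["id"] for e in visible_entries(entries, "professional", today)]
print(public_ids, professional_ids)
```

Under these assumptions both tiers draw on the same curated data, and only the release date differs.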

As early as the first short description of HGMD, published in 1997, the curators set out both its practical clinical value and the funding and distribution challenges that would shape its future:

“… In view of its potential usefulness, the curators of HGMD made the database publicly available through the World Wide Web in April 1996. However, because HGMD is partly dependent on industrial funding and involves considerable editorial work over and above mere literature screening (for example to ensure consistency of amino acid residue numbering, gene symbol usage and nucleotide sequence information), unresolved copyright problems have so far precluded, and might continue to preclude, HGMD from being downloadable in its entirety.

By November 1996, HGMD contained nearly 4,000 different single base-pair substitutions, 1,400 small (less than 20 bp) deletions, 470 small insertions and 44 small indels from a total of 574 human genes.”

— Krawczak M, Cooper DN. The Human Gene Mutation Database. Trends in Genetics, 1997;13:121–122. DOI: 10.1016/s0168-9525(97)01068-8 https://pubmed.ncbi.nlm.nih.gov/9066272/

The distinction between public and professional versions reflects operational realities rather than marketing choices: continuous expert curation, literature review, data validation, and database updates require ongoing specialist effort that cannot easily be funded through grants alone.

Today, HGMD functions as background infrastructure within many variant interpretation pipelines. Clinical laboratories, commercial annotation tools, and interpretation platforms routinely rely on HGMD data when assessing candidate disease variants. Users often encounter HGMD indirectly through other software rather than through direct database queries.

The lasting consequence of HGMD is that published mutation knowledge became searchable infrastructure rather than scattered literature. Variant interpretation moved from manual journal review toward standardised reference lookup, enabling clinical genomics workflows to scale.

The broader lesson from HGMD is operational. Infrastructure succeeds when it transforms repeated expert effort into shared reference systems. Once interpretation workflows depend on such resources, sustainability becomes as important as innovation.

HGMD public site: https://www.hgmd.cf.ac.uk/ac/index.php
HGMD Professional via QIAGEN: https://digitalinsights.qiagen.com/products-overview/clinical-insights-portfolio/human-gene-mutation-database/
Prof Cooper academic page: https://profiles.cardiff.ac.uk/staff/cooperdn

References

Recommended citation: Stenson PD, Mort M, Ball EV, et al. The Human Gene Mutation Database (HGMD®): optimizing its use in a clinical diagnostic or research setting. Human Genetics, 2020. https://pubmed.ncbi.nlm.nih.gov/32417917/

Other articles on HGMD development:

  • 2020 https://pubmed.ncbi.nlm.nih.gov/32417917/
  • 2017 https://pubmed.ncbi.nlm.nih.gov/28349240/
  • 2014 https://pubmed.ncbi.nlm.nih.gov/24077912/
  • 2009 https://pubmed.ncbi.nlm.nih.gov/19348700/
  • 2003 https://pubmed.ncbi.nlm.nih.gov/12754702/
  • 2000 https://pubmed.ncbi.nlm.nih.gov/10612821/
  • 1998 https://pubmed.ncbi.nlm.nih.gov/9399854/
  • 1997 https://pubmed.ncbi.nlm.nih.gov/9066272/
  • 1993 Book: Cooper DN, Krawczak M. Human Gene Mutation. BIOS Scientific Publishers, Oxford, 1993 (revised editions 1994, 1995).