Improving methodologies for whole-genome assembly and data processing of historical fungal collections to accelerate species discovery
Fungi are essential components of ecosystems, driving nutrient cycling, carbon sequestration, and countless symbiotic interactions. Yet, fewer than 10% of the estimated three million fungal species have been formally described. Historical fungarium collections, which capture much of this hidden diversity, represent an invaluable resource for biodiversity and evolutionary research. Despite challenges associated with degraded and contaminated DNA, recent advances in museomics are unlocking genome-scale data from these specimens, offering unprecedented opportunities for fungal systematics.
Whole-genome sequencing (WGS) of fungarium specimens has the potential to transform species discovery and resolve the fungal tree of life. However, the impact of such research depends critically on the quality of genome assemblies: incomplete assemblies can mislead phylogenetic inference and comparative analyses. The ongoing Defra-funded Fungarium Sequencing Project (FSP) at the Royal Botanic Gardens, Kew (RBG Kew) and the Natural History Museum (NHM) is generating over 7,000 fungal and lichen genomes and developing dedicated pipelines for handling degraded DNA.
This project will build on that foundation, testing the feasibility of k-mer–based species identification using fragmented WGS data and benchmarking its accuracy against traditional marker-based approaches. By integrating assembly quality metrics, specimen metadata, and comparative taxonomic analyses, the work will assess how genome-scale data improves species identification, phylogenetic resolution, and taxonomic placement. Outcomes will inform optimal assembly strategies for degraded material and establish new methodologies for fungariomics. Ultimately, this project aims to accelerate fungal species discovery, support the rediscovery of historical collections, and contribute to biodiversity conservation in the face of global environmental change.
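To make the idea of k-mer-based identification from fragmented data concrete, the following is a minimal Python sketch, not the project's or the FSP's actual pipeline. It assumes a simple containment score (the fraction of a specimen's k-mers that also occur in a reference) as a stand-in for whatever similarity measure the project eventually adopts. The function names, the toy sequences, and the small value of k are all hypothetical; practical tools use much larger k and sketching techniques to scale to whole genomes.

```python
from typing import Dict, Iterable, Set, Tuple

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq: str, k: int = 5) -> Set[str]:
    """Return the set of canonical k-mers in a sequence.

    A canonical k-mer is the lexicographically smaller of a k-mer and its
    reverse complement, so reads from either strand give the same set.
    """
    seq = seq.upper()
    kmers: Set[str] = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip k-mers containing ambiguous bases such as N
        revcomp = kmer.translate(_COMPLEMENT)[::-1]
        kmers.add(min(kmer, revcomp))
    return kmers


def containment(query: Set[str], reference: Set[str]) -> float:
    """Fraction of query k-mers that are also present in the reference."""
    if not query:
        return 0.0
    return len(query & reference) / len(query)


def best_match(
    reads: Iterable[str], references: Dict[str, str], k: int = 5
) -> Tuple[str, float]:
    """Pool k-mers from fragmented reads and return the best-scoring reference."""
    query: Set[str] = set()
    for read in reads:
        query |= canonical_kmers(read, k)
    scores = {
        name: containment(query, canonical_kmers(seq, k))
        for name, seq in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Toy reference sequences (hypothetical, for illustration only).
    refs = {
        "species_A": "ACGTACGTTAGCCGATAGGCTTAACG",
        "species_B": "TTTTGGGGCCCCAAAATTGGCCAATT",
    }
    # Short fragments standing in for degraded fungarium reads.
    fragments = ["ACGTACGTTAGC", "CGATAGGCTTAACG"]
    print(best_match(fragments, refs))
```

Containment is used here rather than a symmetric measure because fragmented reads typically cover only part of a genome, so a score normalised by the query alone is less penalised by missing regions. Benchmarking such scores against marker-based identifications, as the project proposes, would show whether this intuition holds for degraded material.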
The student will be based at the Royal Botanic Gardens, Kew, supervised by Dr. Ester Gaya and co-supervised by Dr. Wu Huang, a bioinformatician who leads the Fungarium Sequencing Project (FSP) bioinformatics team, together with a co-supervisor from a selected university. Dr. Huang will provide close mentorship and day-to-day guidance on bioinformatics workflows and data analyses. The student will have access to the CropDiversity high-performance computing cluster, along with training materials and support. The broader Science Directorate at Kew runs regular seminars, journal clubs, and technical workshops to support skills development and scientific exchange.
This project provides comprehensive training at the intersection of fungal genomics, bioinformatics, and taxonomy, equipping the candidate with a versatile skill set that is highly transferable across academia, industry, and the public sector.
The project directly supports career development in mycology, evolutionary biology, systematics, and biodiversity informatics. The student will gain expertise in genome assembly, species identification, and comparative genomics, preparing them for postdoctoral positions and academic careers in fungal biology, museomics, or computational biology. The focus on handling degraded DNA and historical collections also positions them well for work in ancient DNA research, museum genomics, and conservation genomics, fields that are expanding with the increasing digitisation of natural history collections.
The technical skills developed (large-scale data management, genome assembly optimisation, k-mer analysis, and contamination handling) are highly sought after in the biotechnology, pharmaceutical R&D, and environmental monitoring industries. Graduates will also be competitive for roles in applied genomics companies, such as those focused on food safety, pathogen detection, agricultural biotechnology, and microbiome analysis. In addition, the project’s emphasis on biodiversity data integration and species identification links directly to career opportunities in governmental and intergovernmental organisations, conservation NGOs, and bioinformatics infrastructure providers, where genomics increasingly informs policy and management strategies.
Overall, the project fosters a skill profile that combines scientific depth with broad applicability, ensuring strong employability in both research-intensive and applied professional environments.
