Bioinformatics Datasets

This page lists curated bioinformatics databases covering drug discovery, disease ontologies, protein sequences, and biological pathways. These resources support biomedical research with detailed information about molecular interactions, drug properties, and disease mechanisms.

1. ChEMBL

ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs.

ChEMBL
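For readers who want to pull ChEMBL records programmatically, the following is a minimal sketch rather than an official client. It assumes the public ChEMBL web services are served under `https://www.ebi.ac.uk/chembl/api/data` and that a molecule record can be requested by its ChEMBL ID. The helper names `molecule_url` and `fetch_molecule` are made up for this illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the ChEMBL web services; verify against the ChEMBL docs.
CHEMBL_API = "https://www.ebi.ac.uk/chembl/api/data"


def molecule_url(chembl_id: str, fmt: str = "json") -> str:
    """Build the URL for a single molecule record (hypothetical helper)."""
    clean_id = chembl_id.strip().upper()
    return f"{CHEMBL_API}/molecule/{quote(clean_id)}.{fmt}"


def fetch_molecule(chembl_id: str) -> dict:
    """Download and decode one molecule record. Requires network access."""
    with urlopen(molecule_url(chembl_id), timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_molecule() to perform the request.
    print(molecule_url("CHEMBL25"))
```

Building the URL separately from the download keeps the part that can be checked offline apart from the part that depends on the remote service.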

2. DISEASES

DISEASES is a weekly updated web resource that integrates evidence on disease-gene associations from automatic text mining, manually curated literature, cancer mutation data, and genome-wide association studies. It further unifies the evidence by assigning confidence scores that facilitate comparison of the different types and sources of evidence.

DISEASES

3. Disease Ontology

The Disease Ontology is a standardized ontology for human disease. Its purpose is to provide the biomedical community with consistent, reusable and sustainable descriptions of human disease terms, phenotype characteristics and disease concepts from related medical vocabularies. It is developed through collaborative efforts with biomedical researchers, coordinated by the University of Maryland School of Medicine, Institute for Genome Sciences.

Disease Ontology
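To make the idea of standardized disease terms more concrete, here is a rough sketch of reading terms from an OBO-style text file, one of the formats ontologies such as this are commonly distributed in. The sample stanzas are hand-written for illustration and heavily simplified, so the IDs and names should be checked against a current release. `parse_obo_terms` and `ancestors` are hypothetical helpers, not part of any Disease Ontology tooling.

```python
# Hand-written, simplified sample in OBO style (illustrative only).
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: DOID:4
name: disease

[Term]
id: DOID:14566
name: disease of cellular proliferation
is_a: DOID:4 ! disease

[Term]
id: DOID:162
name: cancer
is_a: DOID:14566 ! disease of cellular proliferation
"""


def parse_obo_terms(text: str) -> dict:
    """Collect [Term] stanzas as {term id: {tag: [values]}}."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
        elif current is not None and ": " in line:
            tag, _, value = line.partition(": ")
            value = value.split(" ! ")[0].strip()  # drop the trailing comment
            current.setdefault(tag, []).append(value)
            if tag == "id":
                terms[value] = current
    return terms


def ancestors(terms: dict, term_id: str) -> list:
    """Follow is_a links upward and return every ancestor id once."""
    seen = []
    stack = list(terms[term_id].get("is_a", []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(terms.get(parent, {}).get("is_a", []))
    return seen


if __name__ == "__main__":
    terms = parse_obo_terms(SAMPLE_OBO)
    print(terms["DOID:162"]["name"][0], ancestors(terms, "DOID:162"))
```

A real release contains many more tags (definitions, synonyms, cross-references) and quoting rules that this sketch ignores; a dedicated OBO parser is the safer choice for production use.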

4. Mondo

Mondo is a semi-automatically constructed ontology that merges multiple disease resources into a single coherent ontology.

Mondo

5. Drug Approvals

The FDA's drug approvals database provides information on approved drugs, including new drug applications, generic drug approvals, and drug safety updates. It serves as the authoritative source for drug approval information in the United States.

Drug Approvals

6. DrugCentral

DrugCentral provides information about active pharmaceutical ingredients, their mechanisms of action, pharmaceutical products, drug labels, and more. It serves as an online drug compendium integrating structure, bioactivity, regulatory, and pharmaceutical information.

DrugCentral

7. UniProt

UniProt is a comprehensive, high-quality resource of protein sequence and functional information. It combines the Swiss-Prot, TrEMBL, and PIR databases to provide detailed protein annotations, including function, structure, and cross-references to other databases.

UniProt
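Protein sequences from UniProt are often exchanged as FASTA text, where each header starts with `>db|accession|entry_name` followed by a free-text description. The sketch below parses that layout under the assumption that every header follows it. The two sample records are invented toy entries, not real UniProt data, and `parse_uniprot_fasta` is a hypothetical helper.

```python
from typing import NamedTuple


class FastaRecord(NamedTuple):
    database: str     # "sp" for Swiss-Prot (reviewed), "tr" for TrEMBL (unreviewed)
    accession: str
    entry_name: str
    description: str
    sequence: str


# Invented toy records in UniProt-style FASTA layout (not real entries).
SAMPLE_FASTA = """\
>sp|P00000|TOY1_HUMAN Hypothetical toy protein 1 OS=Homo sapiens OX=9606 GN=TOY1 PE=1 SV=1
MKTAYIAKQR
QISFVKSHFS
>tr|Q00000|TOY2_HUMAN Hypothetical toy protein 2 OS=Homo sapiens OX=9606 GN=TOY2 PE=4 SV=1
MSDNELLK
"""


def parse_uniprot_fasta(text: str) -> list:
    """Split FASTA text into records, joining wrapped sequence lines."""
    records = []
    header = None
    residues = []

    def flush() -> None:
        # Emit the record collected so far, if any.
        if header is None:
            return
        ident, _, description = header.partition(" ")
        database, accession, entry_name = ident.split("|")
        records.append(
            FastaRecord(database, accession, entry_name, description, "".join(residues))
        )

    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            flush()
            header, residues = line[1:], []
        elif line:
            residues.append(line)
    flush()
    return records


if __name__ == "__main__":
    for record in parse_uniprot_fasta(SAMPLE_FASTA):
        print(record.accession, record.entry_name, len(record.sequence))
```

Headers that do not contain exactly two `|` separators will raise a `ValueError` here, which is a deliberate simplification for FASTA files that come from a single, consistent source.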

8. KEGG Pathway

KEGG (Kyoto Encyclopedia of Genes and Genomes) Pathway is a collection of manually drawn pathway maps representing molecular interaction and reaction networks for metabolism, genetic information processing, cellular processes, and various diseases.

KEGG Pathway

9. Reactome Pathway

Reactome is a free, open-source, curated and peer-reviewed pathway database. It provides bioinformatics tools for visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, and systems biology.

Reactome Pathway

10. GeneCards

GeneCards is a searchable, integrative database that provides comprehensive, user-friendly information on all annotated and predicted human genes. The knowledgebase automatically integrates gene-centric data from ~200 web sources, including genomic, transcriptomic, proteomic, genetic, clinical and functional information.

GeneCards