Welcome to PHIDIAS!
The Pathogen-Host Interaction Data Integration and Analysis System (PHIDIAS) is a web-based, modular system and centralized resource that lets biomedical researchers investigate integrated genome sequences, curated literature information, and gene expression data related to pathogen-host interactions (PHI, also called host-pathogen interactions or HPI) for pathogens with high priority in public health and biological defense. Infectious diseases remain among the most common and most fatal diseases. According to World Health Organization estimates, infectious diseases caused 14.7 million deaths in 2001, accounting for 26% of total global mortality. An infectious disease is the result of an interactive relationship between a pathogen and its host. Integrating and analyzing the various data related to pathogens and pathogen-host interactions will yield a better understanding of infectious diseases and better means of controlling them. PHIDIAS aims to organize these data and to elucidate fundamental insights into PHI.
Genomic information from completely sequenced host and pathogen organisms is valuable not only for identifying and reconstructing intra-organismic processes but also for studying interactions between host and microbial organisms. To facilitate genome analysis and comparison, PHIDIAS integrates genome data from more than 20 sources (e.g., NCBI RefSeq and Swiss-Prot) and provides a genome browser that allows users to browse and compare more than 30 microbial genomes. PHIDIAS also links to publicly available human and mouse genome browsers so that users can browse and analyze human and mouse genomes. Conserved domains are critical for assessing protein functions, which provide important clues to microbial pathogenesis and to interactions between pathogens and hosts. While NCBI CDD contains conserved domains derived from various eukaryotic and prokaryotic organisms, it is difficult to use for comparing and analyzing pathogen-specific conserved domains. Therefore, we have developed PHIDIAS to search and store all pathogen-specific conserved domains. All sequence information is available for comparison and analysis using our customized BLAST programs.
A large amount of information about human and animal pathogens has been acquired, stored, and displayed through different resources, both electronically and by other means. Most electronic resources are formatted in HTML and/or PDF. While these resources are good for viewing and navigation, they do not permit machine-based data transfer and query. To allow machine-readable data exchange of the voluminous pathogen information, Dr. Yongqun He and colleagues at the Virginia Bioinformatics Institute (VBI) developed the XML-based Pathogen Information Markup Language (PIML, Ref: He et al., Bioinformatics. 2005;21(1):116–21). This language represents comprehensive pathogen-oriented information including pathogen taxonomy, genomic information, life cycle, epidemiology, induced diseases in hosts, diagnosis, treatment, and relevant laboratory analysis. A list of PIML documents addressing pathogens has been created and is available through a public VBI web service. However, compared to relational databases, XML databases do not support efficient query functions and scalability. These deficiencies prompted us to design a web-based relational database for general PHI information. This allows storage, integration, query and data mining of parsed PIML data and other PHI-related information, for example, data related to the pathobiology and management of pathogen-infected laboratory animals from the Hazards in Animal Research Database (HazARD).
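The move from XML documents to a relational store can be illustrated with a minimal sketch. It assumes a simplified, hypothetical pathogen record: the element names (`pathogen`, `name`, `disease`, `host`), the table layout, and the helper functions are invented for illustration and do not reproduce the actual PIML schema or the PHIDIAS database design. The point is only that, once parsed into tables, the data can be queried with an ordinary SQL join.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified records. These element names are illustrative
# assumptions and are NOT the real PIML schema.
SAMPLE_XML = """
<pathogens>
  <pathogen id="p1">
    <name>Brucella abortus</name>
    <disease>Brucellosis</disease>
    <host>Cattle</host>
    <host>Human</host>
  </pathogen>
  <pathogen id="p2">
    <name>Bacillus anthracis</name>
    <disease>Anthrax</disease>
    <host>Human</host>
  </pathogen>
</pathogens>
"""


def load_into_sqlite(xml_text):
    """Parse the XML text and load it into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pathogen (id TEXT PRIMARY KEY, name TEXT, disease TEXT)"
    )
    conn.execute("CREATE TABLE pathogen_host (pathogen_id TEXT, host TEXT)")

    root = ET.fromstring(xml_text)
    for node in root.findall("pathogen"):
        pathogen_id = node.get("id")
        conn.execute(
            "INSERT INTO pathogen VALUES (?, ?, ?)",
            (pathogen_id, node.findtext("name"), node.findtext("disease")),
        )
        # One row per host, so hosts can be queried and joined directly.
        for host in node.findall("host"):
            conn.execute(
                "INSERT INTO pathogen_host VALUES (?, ?)",
                (pathogen_id, host.text),
            )
    conn.commit()
    return conn


def pathogens_for_host(conn, host):
    """Return the names of pathogens recorded for a given host, sorted."""
    rows = conn.execute(
        "SELECT p.name FROM pathogen p "
        "JOIN pathogen_host h ON h.pathogen_id = p.id "
        "WHERE h.host = ? ORDER BY p.name",
        (host,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = load_into_sqlite(SAMPLE_XML)
    print(pathogens_for_host(connection, "Human"))
```

Storing hosts in a separate table is one possible design choice; it keeps repeated XML elements queryable without string parsing at query time.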
The molecular functions of pathogen and host genes, as well as their roles in microbial pathogenesis and host immunity, have been extensively studied. However, a systematic collation from the literature of these molecules and their PHI functions is lacking. Although richly documented in the literature, the networks of microbial and host molecular and cellular interactions that occur during pathogenic infections of hosts are underrepresented in current database systems. PHIDIAS aims to integrate and curate PHI-specific molecules and their interaction networks from publicly available databases (e.g., KEGG and MiMI) and through manual curation. PHIDIAS also incorporates data from MINet, which is based on the XML-based Molecular Interaction Network Markup Language (MINetML, also known as ProNet), and includes a program to convert PHI-specific network data into the Biological Pathways Exchange format (BioPAX). Additionally, a network visualization tool has been developed for browsing PHI-specific networks graphically.
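As a rough illustration of combining interaction data from several origins, the following sketch merges edge lists into one undirected network while recording which source reported each interaction. The function names, the source labels, and the placeholder molecule names are all hypothetical; this is not the PHIDIAS data model, and it does not attempt BioPAX export.

```python
from collections import defaultdict


def merge_interactions(*sources):
    """Merge (source_name, edges) pairs into one undirected network.

    Returns a dict that maps each interaction, stored as a sorted
    2-tuple of molecule names, to the set of sources reporting it.
    """
    provenance = defaultdict(set)
    for source_name, edges in sources:
        for a, b in edges:
            # Sort the pair so (a, b) and (b, a) count as one interaction.
            key = tuple(sorted((a, b)))
            provenance[key].add(source_name)
    return dict(provenance)


def neighbors(network, molecule):
    """Return the sorted interaction partners of a molecule."""
    partners = set()
    for a, b in network:
        if a == molecule:
            partners.add(b)
        elif b == molecule:
            partners.add(a)
    return sorted(partners)


if __name__ == "__main__":
    # Placeholder molecule names, not real curated interactions.
    network = merge_interactions(
        ("public_database", [("host_X", "pathogen_A")]),
        ("manual_curation", [("pathogen_A", "host_X"), ("pathogen_A", "host_Y")]),
    )
    print(network)
    print(neighbors(network, "pathogen_A"))
```

Keeping the source set per interaction means an edge reported by both a database and a curator is stored once but remains traceable to both.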
Large-scale experimental techniques such as microarrays and mass spectrometry yield abundant PHI data that were previously unavailable to investigators. While a large amount of PHI-related gene expression data is publicly available in different databases, it is often difficult to query. PHIDIAS stores information on gene expression experiments related to pathogens and host-pathogen interactions, drawn from public gene expression repositories including NCBI GEO and EBI ArrayExpress. PHIDIAS provides a one-stop gateway through which users can query PHI gene expression data and follow links to the original data sources for further analysis.
PHIDIAS uses online data submission systems for efficient data curation, making the integrated PHI data more comprehensive. All PHIDIAS components are scalable, and more pathogens and PHI systems may be added to the system over time. As the number of pathogens in PHIDIAS and the amount of information in the literature continue to grow, an ongoing challenge will be to curate all significant genes and to keep the PHI-related information in PhiDB current. A future direction will be to explore ontology-based natural language processing and statistical methods to promote efficient literature acquisition and curation. We are currently upgrading and implementing a literature mining and curation system (Limix) that we originally developed for Brucella genome annotation in the Brucella Bioinformatics Portal (BBP). Additionally, a web-based pipeline for PHI gene expression data analysis and modeling will be developed. We anticipate that PHIDIAS will become a system with which researchers can address scientific PHI questions, with the ultimate goal of successfully fighting infectious diseases.
Overall, PHIDIAS includes the following components:
Phidias was a great Greek sculptor and architect of the 5th century B.C. It was said that Phidias alone had seen the exact images of the gods. He established the enduring images of Zeus, in the Statue of Zeus at Olympia, and of Athena, in the Parthenon. It is our wish that PHIDIAS will help us to elucidate the fundamental mechanisms behind the intimate interactions between pathogens and hosts.