IDRB: Research Projects
RESEARCH PROJECTS

Our research projects in computer-aided drug design, computational biology and bioinformatics are listed below:

ONGOING RESEARCH PROJECTS
  1. Comparative study of nature-derived FDA-approved drugs and Traditional Chinese Medicine (TCM) to reveal the mechanism of TCM. The mechanism of most TCM remains unclear. This study compares FDA-approved drugs and TCM from a phylogenetic perspective to understand the distribution patterns of the two drug systems, which is expected to deepen our understanding of the mechanism of TCM.
  2. Deriving stable microarray cancer-differentiating signatures by machine learning and feature-elimination methods, and evaluating consensus scoring over multiple random samplings and the consistency of gene ranking. The signatures identified reflect disease mechanisms and can provide indicators for disease diagnosis. Our current interest lies in identifying biomarkers for breast cancer and major depression.
  3. Identifying next-generation innovative therapeutic targets for specific disease types, such as obesity, major depression and cancer. Several methods are applied collectively: A. genetic sequence similarity analysis between drug-binding domains; B. computation of the number of human similarity proteins, the number of affiliated human pathways, and the number of human tissues of a target; C. structural comparison between drug-binding domains; D. target classification based on physicochemical characteristics detected by machine learning.
  4. Leading and conducting the development of bioinformatics databases that collect information from biology, pharmacy, chemistry and related fields. We are also interested in constructing innovative software for drug discovery and bioinformatics, including the design and implementation of an integrated bioinformatics software system for the exploration of novel therapeutic targets and agents.
  5. Conducting biostatistical studies on the distribution of molecules with therapeutic effect, especially approved and clinical-trial drugs, across all biological species, and identifying key species for ecological protection. Conducting comprehensive biostatistical studies on clinical-trial therapeutic targets, with comparative analysis against the targets of approved drugs. Studying correlated groups of genes by using graph theory to filter complex gene correlation networks. The genetic variations identified indicate complex inter- and intra-individual differences.
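The consensus scoring over multiple random samplings mentioned in project 2 can be illustrated with a small sketch. This is not the group's actual pipeline: the ranking criterion (absolute difference of class means), the function names `rank_genes` and `consensus_signature`, and all parameters are assumptions chosen only to show the idea of counting how often a gene lands in the top ranks across random subsamples.

```python
import random
from collections import Counter


def rank_genes(samples, labels):
    """Rank gene indices by absolute difference of class means (largest first).

    This simple score is a stand-in (assumption) for a real machine-learning
    ranking such as feature elimination.
    """
    n_genes = len(samples[0])
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]

    def mean(rows, g):
        return sum(r[g] for r in rows) / len(rows)

    scores = [abs(mean(pos, g) - mean(neg, g)) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


def consensus_signature(samples, labels, top_k=2, n_rounds=50, frac=0.8, seed=0):
    """Return, per gene, the fraction of random subsamples in which it ranks in the top_k."""
    rng = random.Random(seed)
    counts = Counter()
    indices = list(range(len(samples)))
    size = max(2, int(frac * len(indices)))
    done = 0
    for _ in range(n_rounds):
        chosen = rng.sample(indices, size)
        sub_labels = [labels[i] for i in chosen]
        if len(set(sub_labels)) < 2:
            continue  # skip subsamples that contain only one class
        sub_samples = [samples[i] for i in chosen]
        counts.update(rank_genes(sub_samples, sub_labels)[:top_k])
        done += 1
    if done == 0:
        return {}
    return {g: c / done for g, c in counts.items()}
```

Genes whose selection frequency stays close to 1.0 across subsamples would be candidates for a stable signature under this toy criterion; genes selected only occasionally would be treated as unstable.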
PREVIOUS RESEARCH PROJECTS
  1. Drug discovery prospect from untapped species: indications from approved natural product drugs
    Due to the extensive bioprospecting efforts of the past and technology factors, there have been questions about the drug discovery prospects of untapped species. We analyzed recent trends of approved drugs derived from previously untapped species, which show no sign of untapped drug-productive species being near extinction and suggest a high probability of deriving new drugs from new species in existing drug-productive species families and clusters. Case histories of recently approved drugs reveal useful strategies for deriving new drugs from the scaffolds and pharmacophores of the natural product leads of these untapped species. New technologies such as cryptic gene-cluster exploration may generate novel natural products with highly anticipated potential impact on drug discovery.

  2. Clustered patterns of species origins of nature-derived FDA approved and clinical-trial drugs and clues for future bioprospecting
    Many drugs are nature derived. Low drug productivity has renewed interest in natural products as drug-discovery sources. Nature-derived drugs are composed of dozens of molecular scaffolds generated by specific secondary-metabolite gene clusters in selected species. It can be hypothesized that drug-like structures probably are distributed in selective groups of species. We compared the species origins of 939 approved and 369 clinical-trial drugs with those of 119 preclinical drugs and 19,721 bioactive natural products. In contrast to the scattered distribution of bioactive natural products, these drugs are clustered into 144 of the 6,763 known species families in nature, with 80% of the approved drugs and 67% of the clinical-trial drugs concentrated in 17 and 30 drug-prolific families, respectively. Four lines of evidence from historical drug data, 13,548 marine natural products, 767 medicinal plants, and 19,721 bioactive natural products suggest that drugs are derived mostly from preexisting drug-productive families. Drug-productive clusters expand slowly by conventional technologies. The lack of drugs outside drug-productive families is not necessarily the result of under-exploration or late exploration by conventional technologies. New technologies that explore cryptic gene clusters, pathways, interspecies crosstalk, and high-throughput fermentation enable the discovery of novel natural products. The potential impact of these technologies on drug productivity and on the distribution patterns of drug-productive families is yet to be revealed.

  3. Construction of Therapeutic Target Database (TTD): a resource for facilitating target-oriented drug discovery
    Knowledge and investigation of therapeutic targets (responsible for drug efficacy) and the targeted drugs facilitate target and drug discovery and validation. Therapeutic Target Database (TTD, http://bidd.nus.edu.sg/group/ttd/ttd.asp) has been developed to provide comprehensive information about efficacy targets and the corresponding approved, clinical trial and investigative drugs. Since its last update, major improvements and updates have been made to TTD. In addition to the significant increase of data content (from 1894 targets and 5028 drugs to 2025 targets and 17,816 drugs), we added target validation information (drug potency against target, effect against disease models and effect of target knockout, knockdown or genetic variations) for 932 targets, and 841 quantitative structure activity relationship models for active compounds of 228 chemical types against 121 targets. Moreover, we added the data from our previous drug studies including 3681 multi-target agents against 108 target pairs, 116 drug combinations with their synergistic, additive, antagonistic, potentiative or reductive mechanisms, 1427 natural product-derived approved, clinical trial and pre-clinical drugs and cross-links to the clinical trial information page in the ClinicalTrials.gov database for 770 clinical trial drugs. These updates are useful for facilitating target discovery and validation, drug lead discovery and optimization, and the development of multi-target drugs and drug combinations.
    This work has been highlighted and reported by:
    • "FACULTYof1000" as "the top 2% of published articles in biology and medicine" and "a most useful resource for scientists and companies working on drug discovery and validation, drug lead discovery and optimization, and the development of multi-target drugs and drug combinations".
    • Prof. Chris Southan in his blog as "Therapeutic Target Database in PubChem".

  4. Construction of PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequences
    Sequence-derived structural and physicochemical features have been extensively used for analyzing and predicting structural, functional, expression and interaction profiles of proteins and peptides. PROFEAT has been developed as a web server for computing commonly used features of proteins and peptides from amino acid sequence. To facilitate more extensive studies of proteins and peptides, numerous improvements and updates have been made to PROFEAT. We added new functions for computing descriptors of protein-protein and protein-small molecule interactions, segment descriptors for local properties of protein sequences, topological descriptors for peptide sequences and small molecule structures. We also added new feature groups for proteins and peptides (pseudo-amino acid composition, amphiphilic pseudo-amino acid composition, total amino acid properties and atomic-level topological descriptors) as well as for small molecules (atomic-level topological descriptors). Overall, PROFEAT computes 11 feature groups of descriptors for proteins and peptides, and a feature group of more than 400 descriptors for small molecules plus the derived features for protein-protein and protein-small molecule interactions. Our computational algorithms have been extensively tested and used in a number of published works for predicting proteins of specific structural or functional classes, protein-protein interactions, peptides of specific functions and quantitative structure activity relationships of small molecules. PROFEAT is accessible free of charge at http://bidd.cz3.nus.edu.sg/cgi-bin/prof/protein/profnew.cgi.

  5. Identification of next generation innovative therapeutic targets by genetic, structural, physicochemical and system profile of successful targets
    Low target discovery rate has been linked to inadequate consideration of multiple factors that collectively contribute to druggability. These factors include sequence, structural, physicochemical, and systems profiles. Methods individually exploring each of these profiles for target identification have been developed, but they have not been collectively used. We evaluated the collective capability of these methods in identifying promising targets from 1019 research targets based on the multiple profiles of up to 348 successful targets. The collective method combining at least three profiles identified 50, 25, 10, and 4% of the 30, 84, 41, and 864 phase III, II, I, and nonclinical trial targets as promising, including eight to nine targets of positive phase III results. This method dropped 89% of the 19 discontinued clinical trial targets and 97% of the 65 targets that failed in high-throughput screening or knockout studies. Collective consideration of multiple profiles demonstrated promising potential in identifying innovative targets.

  6. Analysis of mechanisms of drug combinations from interaction and network perspectives
    Understanding the molecular mechanisms underlying synergistic, potentiative and antagonistic effects of drug combinations could facilitate the discovery of novel efficacious combinations and multi-targeted agents. In this article, we describe an extensive investigation of the published literature on drug combinations for which the combination effect has been evaluated by rigorous analysis methods and for which relevant molecular interaction profiles of the drugs involved are available. Analysis of the 117 drug combinations identified reveals general and specific modes of action, and highlights the potential value of molecular interaction profiles in the discovery of novel multicomponent therapies.

  7. Homology-free prediction of functional class of proteins and peptides by Support Vector Machines (SVM)
    Protein and peptide sequences contain clues for functional prediction. A challenge is to predict the function of sequences that show low or no homology to proteins or peptides of known function. A machine learning method, support vector machines (SVM), has recently been explored for predicting the functional class of proteins and peptides from sequence-derived properties irrespective of sequence similarity, and has shown impressive performance for predicting a wide range of protein and peptide classes including certain low- and non-homologous sequences. This method serves as a new and valuable addition to complement the extensively used alignment-based, clustering-based, and structure-based functional prediction methods. This article evaluates the strategies, current progress, reported prediction performances, available software tools, and underlying difficulties in using SVM for predicting the functional class of proteins and peptides.

  8. Analysis on the trends of anticancer targets exploration and the strategies used to enhance the efficacy of drug targeting
    A number of therapeutic targets have been explored for developing anticancer drugs. Continuous efforts have been directed at the discovery of new targets as well as the improvement of therapeutic efficacy of agents directed at explored targets. There are 84 targets of marketed drugs and 488 targets of investigational drugs for the treatment of cancer or cancer-related illness. Analysis of these targets, particularly those of drugs in clinical trials and US patents, provides useful information and perspectives about the trends, strategies and progress in targeting key cancer-related processes and in overcoming the difficulties in developing efficacious drugs against these targets. The efficacy of anticancer drugs directed at these targets is frequently compromised by counteractive molecular interactions and network crosstalk, negative and adverse secondary effects of drugs, and undesired ADMET profiles. Multi-component therapies directed at multiple targets and improved drug targeting methods are being explored for alleviating these efficacy-reducing processes. Investigation of the modes of actions of these combinations and targeting methods offers clues to aid the development of more effective anticancer therapies.
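For the target-identification project listed above, which flags a target as promising when at least three of its sequence, structural, physicochemical and systems profiles match those of successful targets, the combination rule can be sketched as follows. The `TargetProfiles` class, the `is_promising` function and the boolean encoding of each profile are hypothetical simplifications; how each individual profile is actually evaluated is not described here and is not modeled.

```python
from dataclasses import dataclass


@dataclass
class TargetProfiles:
    """Hypothetical record: whether a target passes each of the four profiles."""
    name: str
    sequence: bool
    structural: bool
    physicochemical: bool
    systems: bool


def is_promising(target, min_profiles=3):
    """Return True if the target passes at least `min_profiles` of the four profiles."""
    passed = sum([
        target.sequence,
        target.structural,
        target.physicochemical,
        target.systems,
    ])
    return passed >= min_profiles


# Illustrative, made-up targets (not taken from any database).
candidates = [
    TargetProfiles("target_A", True, True, True, False),
    TargetProfiles("target_B", True, False, False, True),
]
promising = [t.name for t in candidates if is_promising(t)]
```

Keeping the threshold as a parameter makes it easy to compare a stricter rule (all four profiles) with the three-profile rule described in the project summary.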
IDRB: Databases
DATABASE CONSTRUCTION

Our experience in database construction has led to several bioinformatics and pharmacoinformatics databases, listed below:

TTD: THERAPEUTIC TARGET DATABASE

    Database URL: http://database.idrb.cqu.edu.cn/ttd/

    Last Update: Jan, 2012

    TTD is a database that provides information about known and explored therapeutic protein and nucleic acid targets, the targeted diseases, pathway information and the corresponding drugs directed at each of these targets. It also links to relevant databases containing information about target function, sequence, 3D structure, ligand binding properties, enzyme nomenclature and drug structure, therapeutic class and clinical development status. All information provided is fully referenced. The database currently contains 2,025 targets, including 364 successful, 286 clinical trial, 44 discontinued and 1,331 research targets, and 17,816 drugs, including 1,540 approved, 1,423 clinical trial and 14,853 experimental drugs, and 3,681 multi-target agents (14,170 small molecules and 652 antisense drugs with available structure or oligonucleotide sequence). Targets and drugs in this database cover 61 protein biochemical classes and 140 drug therapeutic classes, respectively. TTD Version 4.3.02 was released on Aug 25th, 2011. The new version of TTD provides Quantitative Structure-Activity Relationship (QSAR) models and target validation information for TTD drug targets. Multi-target agent data with structure and potency information, drug combinations, and nature-derived drugs together with drug species-origin data are also included. TTD is accessible free of charge at http://bidd.nus.edu.sg/group/ttd/.

    Structures from this database have been available in PubChem since Feb 21st, 2012.
    This database has been highlighted and reported by:
    • "FACULTYof1000" as "the top 2% of published articles in biology and medicine" and "a most useful resource for scientists and companies working on drug discovery and validation, drug lead discovery and optimization, and the development of multi-target drugs and drug combinations".
    • Prof. Chris Southan in his blog as "Therapeutic Target Database in PubChem".
TVD: TARGET VALIDATION DATABASE

    Database URL: http://bidd.nus.edu.sg/group/TVDtest/TVD.asp

    Last Update: Dec, 2011

    Target validation is important in selecting the right targets for drug discovery. It evaluates multiple profiles, including the expression and relevance of targets in disease models, the potencies of drugs in modulating target activity and disease models, and the correlation of these activities with the claimed therapeutic effects. Target validation data, particularly those of successful and clinical trial targets, are useful for facilitating target discovery, validation and analysis. We updated the Therapeutic Target Database to provide several types of target validation data for 243 (348) successful, 233 (292) clinical trial and 154 research targets. The data include the potencies of drugs against their efficacy targets and against disease-relevant cell lines expressing these targets (IC50, Ki, EC50), and the effects of target knock-out or of variation in target sequence, expression and activity in disease models. These data can be accessed via the "target validation" button, which links to a separate search page where a particular target can be found by keyword search or selected from a target list. The relevant data are also shown on the corresponding target page in TTD. The Target Validation Database (TVD) provides validation information for therapeutic targets, covering both in vivo and in vitro tests. TVD is accessible free of charge at http://bidd.nus.edu.sg/group/TVDtest/TVD.asp.

DPD: DRUG POTENCY DATABASE

    Database URL: http://bidd.nus.edu.sg/group/IDAD/IDAD_Home.asp

    Last Update: Apr, 2010

    DPD is a relational database that provides activity information for drugs, clinical trial compounds and experimental agents directed at their corresponding therapeutic targets. It also links to relevant databases containing information about target function, sequence, 3D structure, ligand binding properties, enzyme nomenclature and drug structure, therapeutic class and clinical development status. All information provided is fully referenced. The database currently contains 5000 activity records for around 3000 compounds directed at about 500 targets. Nearly 300 compounds in DPD have already been approved or are in clinical trials. All information provided can be traced back to TTD: Therapeutic Target Database. DPD is accessible free of charge at http://bidd.nus.edu.sg/group/IDAD/IDAD_Home.asp.

IDRB: Software
SOFTWARE DEVELOPMENT

Our experience in software and server development has led to several bioinformatics and pharmacoinformatics servers, listed below:

MOLFEAT: MOLECULAR FEATURE SERVER

    Server URL: http://jing.cz3.nus.edu.sg/cgi-bin/molfeat/molfeat.cgi

    Last Update: Dec, 2011

    MOLFEAT (Version 2012) is designed for computing molecular fingerprints and molecular descriptors of molecules from their 3D structures, and for computing the activity of compounds of specific chemical types against selected targets based on published Quantitative Structure-Activity Relationship (QSAR) models. The current version of MOLFEAT covers 1,114 molecular fingerprints, which encode Hierarchic Element Counts, Ring Sets in a canonic ESSSR, Simple Atom Pairs, Simple Atom Nearest Neighbors, Detailed Atom Neighborhoods, Simple SMARTS patterns, Complex SMARTS patterns, and Molecular Frameworks in Drugs. MOLFEAT also includes 21 published QSAR models, which cover 16 chemical types and 12 targets. MOLFEAT is accessible free of charge at http://jing.cz3.nus.edu.sg/cgi-bin/molfeat/molfeat.cgi.

PROFEAT: PROTEIN FEATURE SERVER

    Server URL: http://bidd.cz3.nus.edu.sg/cgi-bin/prof/protein/profnew.cgi

    Last Update: Jun, 2011

    Sequence-derived structural and physicochemical features have been extensively used for analyzing and predicting structural, functional, expression and interaction profiles of proteins and peptides. PROFEAT has been developed as a web server for computing commonly used features of proteins and peptides from amino acid sequence. To facilitate more extensive studies of proteins and peptides, numerous improvements and updates have been made to PROFEAT. We added new functions for computing descriptors of protein-protein and protein-small molecule interactions, segment descriptors for local properties of protein sequences, topological descriptors for peptide sequences and small molecule structures. We also added new feature groups for proteins and peptides (pseudo-amino acid composition, amphiphilic pseudo-amino acid composition, total amino acid properties and atomic-level topological descriptors) as well as for small molecules (atomic-level topological descriptors). Overall, PROFEAT computes 11 feature groups of descriptors for proteins and peptides, and a feature group of more than 400 descriptors for small molecules plus the derived features for protein-protein and protein-small molecule interactions. Our computational algorithms have been extensively tested and used in a number of published works for predicting proteins of specific structural or functional classes, protein-protein interactions, peptides of specific functions and quantitative structure activity relationships of small molecules. PROFEAT is accessible free of charge at http://bidd.cz3.nus.edu.sg/cgi-bin/prof/protein/profnew.cgi.
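To give a feel for what a sequence-derived feature looks like, the sketch below computes amino acid composition, the fraction of each of the 20 standard residues in a sequence. It is an illustration only, not PROFEAT's code: the function name `amino_acid_composition` and the choice to ignore non-standard characters are assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Characters outside the 20 standard residues are ignored (an assumption
    of this sketch); an error is raised if no standard residue is present.
    """
    counts = Counter(c for c in sequence.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


composition = amino_acid_composition("AAGC")
```

The result is a fixed-length vector of 20 values regardless of sequence length, which is what makes descriptors of this kind convenient inputs for machine learning methods.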

IDRB: Innovative Drug Research and Bioinformatics Group

All rights are reserved by: Innovative Drug Research and Bioinformatics Group (IDRB)
College of Pharmaceutical Sciences, Zhejiang University
Hangzhou, P.R. China, 310058.
Contact number: (86-571)88208444
