Identifying drug-target interactions is valuable for understanding drug effects. In recent years, researchers have designed many models and systems to predict these interactions. Among them, the bipartite local model (BLM) is well known for its high prediction accuracy. BLM treats drug-target interactions as a bipartite graph, with drugs on one side and targets on the other. However, BLM is a supervised learning model and does not address the important problem of predicting drug-target interactions for new drugs. The authors of this paper propose a new bipartite local model with neighbor-based inferring (BLMN).
The key difference between BLM and BLMN is that BLMN predicts drug-target interactions not only for existing drugs but also for new drugs; this is shown in the paper's first equation. For a new drug, BLMN first finds its nearest neighbors using pairwise similarity. It then predicts the new drug's interactions from the known interactions of those neighbors. The inputs to BLMN are pairwise similarities for all drugs, pairwise similarities for all targets, and the known drug-target interactions for all drugs and targets. Pairwise similarities of drugs and targets are calculated from chem-seq, network-based, and hybrid inputs: “chem-seq denotes that chemical similarity is used for the drug and sequence similarity is used for the target; network-based denotes that the drug-drug similarity and target-target similarity are derived from the existing interaction network; [and] hybrid denotes that the drug-drug similarity and target-target similarity are combinations of the two types of similarities.” A binary matrix A represents the interactions between a set of drugs and a set of targets: A(i,j) is one if drug i interacts with target j, and zero otherwise.
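To make the neighbor step concrete, here is a minimal sketch of how an interaction profile for a new drug might be inferred from the matrix A. It is not the authors' formulation: the function name infer_profile, the parameter k, and the similarity-weighted average are my own assumptions for illustration, and the sketch covers only the neighbor-based inference, not the local models that BLMN builds on top of it.

```python
def infer_profile(similarities, interactions, k=2):
    """Estimate interaction scores for a new drug from its k most similar known drugs.

    similarities: similarity of the new drug to each known drug (one value per row of A)
    interactions: binary matrix A as a list of rows; A[i][j] == 1 if drug i interacts with target j
    k: number of nearest neighbors to use (hypothetical parameter, not from the paper)
    """
    # Rank known drugs by similarity to the new drug, most similar first.
    order = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)
    neighbors = order[:k]

    n_targets = len(interactions[0])
    total = sum(similarities[i] for i in neighbors)
    if total == 0:
        # No informative neighbor: fall back to an all-zero profile.
        return [0.0] * n_targets

    # Assumed weighting: similarity-weighted average of the neighbors' rows of A.
    return [
        sum(similarities[i] * interactions[i][j] for i in neighbors) / total
        for j in range(n_targets)
    ]


# Toy data: three known drugs, three targets.
A = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
sim_new = [0.5, 0.5, 0.1]  # similarity of the new drug to each known drug

scores = infer_profile(sim_new, A, k=2)
print(scores)  # [1.0, 0.5, 0.0]
```

In this toy case the two nearest neighbors both interact with the first target, so it receives the highest score; the third target is known only to the dissimilar drug and scores zero.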
BLMN was evaluated on four datasets: enzyme, ion channel, G protein-coupled receptor (GPCR), and nuclear receptor. The experimental results indicate that BLMN has the highest scores for the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision-recall curve (AUPR), based on hybrid similarity calculations, on all four datasets. In addition, the authors compared BLM and BLMN on the nuclear receptor dataset with three different types of similarity: chem-seq, network-based, and hybrid. BLMN outperforms BLM for all three types of similarity.
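For readers unfamiliar with the two metrics, the following sketch shows one common way to compute them from predicted scores and true labels. The function names are mine, the toy labels and scores are invented for illustration and are not results from the paper, and AUPR is approximated here by average precision, which may differ from the estimator the authors used.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values at each true positive.

    Tied scores are kept in input order here; a careful evaluation would handle ties explicitly.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Toy example: 1 marks a known interaction, 0 marks no known interaction.
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.1]

auc = roc_auc(y_true, y_score)             # 3 of 4 positive-negative pairs ordered correctly
aupr = average_precision(y_true, y_score)  # (1/1 + 2/3) / 2
print(auc, round(aupr, 4))  # 0.75 0.8333
```

Because known interactions are rare relative to all possible drug-target pairs, AUPR penalizes highly ranked false positives more heavily than AUC does, which is presumably why the paper reports both.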
I found three drawbacks to the paper. First, BLMN is a supervised learning method, which relies on a labeled training dataset. Supervised learning methods generally perform better than semi-supervised and unsupervised methods, but they require many more resources. Second, the third equation includes a parameter β that focuses the model on the neighbors with the highest similarity; however, the experimental section neither provides a value for β nor explains how different values of β influence the model’s performance. Third, the paper relies heavily on an earlier paper on BLM. Without reading that paper, readers cannot easily understand how BLMN works.
Readers whose research interests are data mining and social analysis in bioinformatics are the paper’s target audience.