Journal
A Comprehensive Evaluation Framework for Benchmarking Multi-Objective Feature Selection in Omics-Based Biomarker Discovery
Identifying lncRNA and mRNA Co-Expression Modules from Matched Expression Data in Ovarian Cancer
Long non-coding RNAs (lncRNAs) are involved in multiple biological processes and play critical roles in tumorigenesis. Numerous lncRNAs have been discovered in diverse species, but the functions of most remain unclear, and their expression patterns and regulatory mechanisms are far from fully understood. With advances in high-throughput technologies, the increasing availability of genomic data creates opportunities for deciphering the molecular mechanisms and underlying pathogenesis of human diseases. Here, we develop an integrative framework called JONMF to identify lncRNA-mRNA co-expression modules from sample-matched lncRNA and mRNA expression profiles. We formulate module detection as an optimization problem based on joint orthogonal non-negative matrix factorization, which can effectively prevent multicollinearity and yields a clear modular interpretation. The constructed lncRNA-mRNA co-expression network and the gene-gene interaction network are used as network-regularized constraints to improve module accuracy, while sparsity constraints are simultaneously applied to obtain sparse modular solutions. We applied JONMF to a human ovarian cancer dataset, and the experimental results demonstrate that the proposed method can effectively discover biologically functional co-expression modules, which may provide insights into the functions of lncRNAs and the molecular mechanisms of human diseases.
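The shared-factor idea behind a joint factorization of matched lncRNA and mRNA profiles can be sketched in a few lines. This is a simplified illustration, not the JONMF algorithm: it uses plain joint NMF with standard multiplicative updates and omits the orthogonality, network-regularization, and sparsity terms described in the abstract. The function names, the toy matrices, and the choice of two modules are all assumptions made for the example.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mult_update(F, num, den, eps=1e-9):
    """Element-wise multiplicative update; keeps every entry non-negative."""
    return [[f * n / (d + eps) for f, n, d in zip(rf, rn, rd)]
            for rf, rn, rd in zip(F, num, den)]

def loss(X1, X2, W, H1, H2):
    """Sum of squared reconstruction errors over both data types."""
    total = 0.0
    for X, H in ((X1, H1), (X2, H2)):
        R = matmul(W, H)
        total += sum((x - r) ** 2 for rx, rr in zip(X, R) for x, r in zip(rx, rr))
    return total

def joint_nmf(X1, X2, k, n_iter=200, seed=0):
    """Factor X1 ~ W H1 and X2 ~ W H2 with a sample factor W shared by both."""
    rng = random.Random(seed)
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(len(X1))]
    H1 = [[rng.random() + 0.1 for _ in range(len(X1[0]))] for _ in range(k)]
    H2 = [[rng.random() + 0.1 for _ in range(len(X2[0]))] for _ in range(k)]
    history = [loss(X1, X2, W, H1, H2)]
    for _ in range(n_iter):
        Wt = transpose(W)
        WtW = matmul(Wt, W)
        H1 = mult_update(H1, matmul(Wt, X1), matmul(WtW, H1))
        H2 = mult_update(H2, matmul(Wt, X2), matmul(WtW, H2))
        num = add(matmul(X1, transpose(H1)), matmul(X2, transpose(H2)))
        gram = add(matmul(H1, transpose(H1)), matmul(H2, transpose(H2)))
        W = mult_update(W, num, matmul(W, gram))
        history.append(loss(X1, X2, W, H1, H2))
    return W, H1, H2, history

def assign_modules(H):
    """Assign each feature (column of H) to its highest-loading component."""
    return [max(range(len(H)), key=lambda c: H[c][j]) for j in range(len(H[0]))]

# Toy matched data (hypothetical): rows are 4 samples, columns are features.
lncrna = [[5, 4, 0], [4, 5, 0], [0, 1, 5], [1, 0, 4]]
mrna = [[5, 0, 1], [4, 1, 0], [0, 5, 4], [0, 4, 5]]

W, H_lnc, H_mrna, history = joint_nmf(lncrna, mrna, k=2)
lnc_modules = assign_modules(H_lnc)
mrna_modules = assign_modules(H_mrna)
```

Because both data types share `W`, a lncRNA and an mRNA that load on the same component are placed in the same candidate module. The extra constraints named in the abstract would add further terms to the objective and to these update rules.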
Deep Pathway Analysis V2.0: A Pathway Analysis Framework Incorporating Multi-Dimensional Omics Data
Pathway analysis is essential in cancer research, particularly when scientists attempt to derive interpretations from genome-wide high-throughput experimental data. If pathway information is organized into a network topology, its use in interpreting omics data can become very powerful. In this paper, we propose a topology-based pathway analysis method, called DPA V2.0, which can combine multiple heterogeneous omics data types in its analysis. In this method, each pathway route is encoded as a Bayesian network initialized with a sequence of conditional probabilities specifically designed to encode the directionality of the regulatory relationships defined in the pathway. Unlike other topology-based pathway tools, DPA is capable of identifying pathway routes as representatives of perturbed regulatory signals. We demonstrate the effectiveness of our model by applying it to two well-established TCGA data sets, namely the breast cancer study (BRCA) and the ovarian cancer study (OV). The analysis combines mRNA-seq, mutation, copy number variation, and phosphorylation data publicly available for both TCGA data sets. We performed survival analysis and patient subtype analysis, and the outcomes revealed the anticipated strengths of our model. We hope that the availability of our model encourages wet lab scientists to generate additional data sets to reap the benefits of using multiple data types in pathway analysis. The majority of the pathways distinguished can be confirmed by the biological literature. Moreover, the proportion of correctly identified pathways is 10 percent higher than in previous work, where only mRNA-seq and mutation data were incorporated for breast cancer patients. Consequently, such an in-depth pathway analysis incorporating more diverse data can improve the accuracy of perturbed pathway detection.
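To make the idea of encoding a pathway route as a chain of conditional probabilities concrete, the following minimal sketch propagates an activity probability along a linear route. It is only an illustration of a chain-structured Bayesian network with binary nodes. The probability values, the edge labels, and the function name `propagate_route` are assumptions chosen for the example and are not taken from DPA V2.0.

```python
# Illustrative conditional probabilities (assumed values, not from DPA V2.0).
# Each entry gives P(child active | parent state) for one type of regulation.
CPT = {
    "activates": {"parent_on": 0.9, "parent_off": 0.1},
    "inhibits": {"parent_on": 0.1, "parent_off": 0.9},
}

def propagate_route(p_source_active, edge_types):
    """Return the marginal P(active) of every node along a linear route.

    The route is a chain: node 0 -> node 1 -> ... with one edge type per link.
    Each step marginalizes over the two states of the parent node.
    """
    p = p_source_active
    marginals = [p]
    for edge in edge_types:
        cpt = CPT[edge]
        p = p * cpt["parent_on"] + (1.0 - p) * cpt["parent_off"]
        marginals.append(p)
    return marginals

# Hypothetical three-gene route: gene A activates gene B, which inhibits gene C.
route = ["activates", "inhibits"]
marginals = propagate_route(1.0, route)
```

In this toy setting, an activating edge passes a high activity probability downstream while an inhibiting edge reverses it, which is one simple way directionality of regulation can be expressed through conditional probabilities.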
BiModule: Biclique Modularity Strategy for Identifying Transcription Factor and microRNA Co-Regulatory Modules
Systematic identification of gene regulatory modules can provide invaluable knowledge for understanding aberrant transcriptional/post-transcriptional collaborative regulatory (co-regulatory) effects in cancer. Transcription factors (TFs) and microRNAs (miRNAs) are two classes of prominent regulators that play crucial roles in gene regulation. Existing studies on gene regulatory module identification have mainly focused on the miRNA-mediated regulatory network, and few have considered these two regulators in a co-occurring network. In this study, we developed a computational method called BiModule for systematically identifying TF-miRNA co-regulatory modules. BiModule operates in two main stages: it first constructs a cancer-specific regulator-mRNA network and then identifies modules based on maximal bicliques by employing a biclique modularity strategy, which is a novel, flexible method for bipartite graph mining. We applied our model to a cervical cancer dataset. The results showed that the TF-miRNA co-regulatory modules identified by BiModule exhibit denser connections and stronger expression correlations than those identified by an existing related method. Moreover, the BiModule modules exhibit high biological functional enrichment. In addition, based on Kaplan-Meier survival analysis, we found a number of modules with significant prognostic associations. Availability: the R source code of BiModule is available at https://github.com/chupan1218/BiModule.
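The notion of a maximal biclique in a regulator-mRNA bipartite network can be illustrated with a brute-force enumeration. This sketch is not the BiModule biclique modularity strategy and is not drawn from the BiModule R code; it only shows what a maximal biclique is on a toy network. The regulator and gene names, the size thresholds, and the function name `maximal_bicliques` are hypothetical, and the exhaustive search is practical only for very small inputs.

```python
from itertools import combinations

def maximal_bicliques(targets_of, min_regulators=2, min_targets=2):
    """Enumerate maximal bicliques of a bipartite regulator-target network.

    targets_of maps each regulator to the set of mRNAs it regulates.
    A biclique is a set of regulators and a set of targets in which every
    regulator is linked to every target; it is maximal when no regulator
    or target can be added without breaking that property.
    """
    regulators = sorted(targets_of)
    found = set()
    for size in range(min_regulators, len(regulators) + 1):
        for subset in combinations(regulators, size):
            # Targets regulated by every regulator in the subset.
            common = set.intersection(*(targets_of[r] for r in subset))
            if len(common) < min_targets:
                continue
            # Close the regulator side: add all regulators covering `common`.
            closure = frozenset(r for r in regulators if common <= targets_of[r])
            found.add((closure, frozenset(common)))
    return found

# Toy regulator-mRNA network (hypothetical names).
network = {
    "TF_A": {"g1", "g2", "g3"},
    "TF_B": {"g1", "g2"},
    "miR_1": {"g1", "g2", "g4", "g5"},
    "miR_2": {"g4", "g5"},
}

modules = maximal_bicliques(network)
```

On this toy network the search yields two candidate modules: TF_A, TF_B, and miR_1 jointly regulating g1 and g2, and miR_1 and miR_2 jointly regulating g4 and g5. A modularity-based method would then go further and score or merge such bicliques.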
A Corresponding Region Fusion Framework for Multi-Modal Cervical Lesion Detection
Cervical lesion detection (CLD) using multi-modal colposcopic images (acetic and iodine) is critical to computer-aided diagnosis (CAD) systems for accurate, objective, and comprehensive cervical cancer screening. To robustly capture lesion features and conform with clinical diagnostic practice, we propose a novel corresponding region fusion network (CRFNet) for multi-modal CLD. CRFNet first extracts feature maps and generates proposals for each modality, then performs proposal shifting to obtain corresponding regions under large position shifts between modalities, and finally fuses those region features with a new corresponding channel attention to detect lesion regions in both modalities. To evaluate CRFNet, we build a large multi-modal colposcopic image dataset collected from our collaborating hospital. We show that the proposed CRFNet surpasses known single-modal and multi-modal CLD methods and achieves state-of-the-art performance, especially in terms of Average Precision.
Institute of Electrical and Electronics Engineers (IEEE)
1545-5963