Multi-omic Data Integration Using Multi-project and Multi-profile Kernel
Joint Non-negative Matrix Factorization to Identify and Analyze
Co-modules in Lung Adenocarcinoma
Abstract
Multi-omic data integration analyzes a vast amount of biological data
and contributes to understanding the biological processes underlying
organisms. Multiple machine learning techniques have been proposed to
solve this task, including extensions of the joint Non-negative Matrix
Factorization (jNMF) method, such as the Multi-project and Multi-profile
jNMF (M&M-jNMF). This method jointly factorizes input matrices from two
projects into low-rank matrices which have clustering properties.
However, the M&M-jNMF method does not capture the non-linear patterns
of the data. This paper proposes M&M-KjNMF, an extension of the M&M-jNMF
approach that projects the data into high-dimensional spaces through
kernel functions. We compared the standard
M&M-jNMF and M&M-KjNMF methods using three different omic profiles of
lung adenocarcinoma data. As with M&M-jNMF, we used data from
experimental and observational data sources. We evaluated the performance
of both methods by comparing the cophenetic coefficient, AUC, and
biological score. We found that M&M-KjNMF outperforms M&M-jNMF. The
newly proposed method enables the identification of molecule co-modules
enriched in pathways tightly related to lung cancer emergence and
progression.
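
To make the idea of joint factorization concrete, the following is a minimal sketch of plain jNMF with a shared basis matrix and standard multiplicative updates. It is not the M&M-jNMF or M&M-KjNMF algorithm described in this paper: it omits the multi-project structure, any constraint terms, and the kernel projection. The function name joint_nmf, its parameters, and the synthetic input matrices are hypothetical and chosen only for illustration.

```python
import numpy as np


def joint_nmf(matrices, rank, n_iter=200, eps=1e-9, seed=0):
    """Sketch of joint NMF: factorize each X_i (samples x features_i)
    as W @ H_i, with one non-negative W shared across all profiles.

    Assumption: Frobenius-norm objective with multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n_samples = matrices[0].shape[0]
    # Random non-negative initialization of the shared basis and the
    # profile-specific coefficient matrices.
    W = rng.random((n_samples, rank))
    Hs = [rng.random((rank, X.shape[1])) for X in matrices]

    for _ in range(n_iter):
        # Update the shared W using information from every profile.
        numer = sum(X @ H.T for X, H in zip(matrices, Hs))
        denom = W @ sum(H @ H.T for H in Hs) + eps
        W = W * numer / denom
        # Update each H_i independently given the current W.
        WtW = W.T @ W
        Hs = [H * (W.T @ X) / (WtW @ H + eps) for X, H in zip(matrices, Hs)]
    return W, Hs


def reconstruction_error(matrices, W, Hs):
    """Sum of squared Frobenius errors over all profiles."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(matrices, Hs))


if __name__ == "__main__":
    # Synthetic stand-ins for three omic profiles measured on the same
    # 30 samples (the sizes are arbitrary, not taken from the paper).
    rng = np.random.default_rng(1)
    profiles = [rng.random((30, p)) for p in (40, 25, 15)]
    W, Hs = joint_nmf(profiles, rank=4)
    print(W.shape, [H.shape for H in Hs])
    print(reconstruction_error(profiles, W, Hs))
```

In a sketch like this, co-modules would typically be read off by selecting, for each of the rank components, the features with the largest entries in the corresponding row of every H_i. The selection rule used by the method proposed here may differ.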