Learning Relationships Between Disparate Representations of Objects with Transformers and Contrastive Losses
Abstract
Large language models are moving into a multi-modal space covering text, images, audio, and other media, yet scientific datasets possess features that this paradigm does not represent. In cosmology, for instance, galaxies are embedded within dark-matter-dominated collapsed objects, termed halos. A galaxy can be described in many ways: its star formation history (SFH), characterizing the evolution of its stellar content; magnitudes describing luminosity across different wavelengths (MAG); images; or the matter distribution of its host halo. Despite their differences, these descriptions represent the same cosmological object. We propose the Object Foundation Model (OFM) to learn disparate representations of objects in scientific domains. OFM learns a general underlying representation of sparse and indeterminate data and facilitates robust predictions on diverse inputs. Unlike traditional deep learning, where predictions are tied to a specific architecture, OFM predictions are requested via a language-based key that varies with the user query.
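To make the contrastive objective concrete, the following is a minimal sketch, not the paper's implementation, of a symmetric InfoNCE-style loss that aligns embeddings of two representations (e.g., SFH and MAG) of the same object; all function names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Project embeddings onto the unit sphere so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def contrastive_loss(z_a, z_b, temperature=0.1):
    """Symmetric cross-entropy over cosine-similarity logits.

    z_a, z_b: (batch, dim) embeddings from two modality encoders;
    row i of each describes the same underlying object, so the
    (i, i) pairs are positives and all other pairs are negatives.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature           # (batch, batch) similarity matrix
    idx = np.arange(len(logits))                 # positives lie on the diagonal
    # Log-softmax along each axis, evaluated at the diagonal entries.
    log_p_ab = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_ba = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    loss_ab = -log_p_ab[idx, idx].mean()         # match modality a to b
    loss_ba = -log_p_ba[idx, idx].mean()         # match modality b to a
    return 0.5 * (loss_ab + loss_ba)

# Toy check: paired (identical) embeddings yield a lower loss than
# embeddings with no correspondence between the two views.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned = contrastive_loss(z, z)
mismatched = contrastive_loss(z, rng.normal(size=(8, 16)))
print(aligned < mismatched)
```

Minimizing this loss pulls the two encoders' embeddings of the same object together while pushing apart embeddings of different objects, which is one standard way to learn a shared representation space across modalities.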