From molecules to materials: Pre-training large generalizable models for atomic property prediction

Published in International Conference on Learning Representations, 2024

Citation: Nima Shoghi, Adeesh Kolluru, John R. Kitchin, Zachary W. Ulissi, C. Lawrence Zitnick, Brandon M. Wood. "From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction." International Conference on Learning Representations, 2024. https://arxiv.org/abs/2310.16802

This paper introduces Joint Multi-domain Pre-training (JMP), a supervised pre-training strategy that trains a single model on multiple datasets from different chemical domains simultaneously. Leveraging a combined dataset of roughly 120M systems, JMP delivers significant improvements over training from scratch and achieves state-of-the-art performance on a wide range of downstream tasks, demonstrating the value of pre-training on diverse data for atomic property prediction across chemical domains.
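The core idea of joint multi-domain pre-training, drawing each training batch from one of several domain datasets so a shared model sees all of them, can be sketched as follows. This is an illustrative toy (not the authors' code): the sampler, dataset names, and size-proportional weighting are assumptions made for the example.

```python
import random

def make_sampler(dataset_sizes, seed=0):
    """Return a function that samples a dataset name with probability
    proportional to its size (one simple way to mix domains)."""
    rng = random.Random(seed)
    names = list(dataset_sizes)
    total = sum(dataset_sizes.values())
    weights = [dataset_sizes[n] / total for n in names]

    def sample():
        # Each call picks which domain the next batch comes from.
        return rng.choices(names, weights=weights, k=1)[0]

    return sample

# Hypothetical dataset sizes standing in for the combined ~120M systems.
sizes = {"catalysis": 100, "small_molecules": 60, "materials": 40}
sample = make_sampler(sizes)

# Over many steps, larger domains contribute proportionally more batches,
# while every domain still appears in the joint training stream.
draws = [sample() for _ in range(1000)]
```

In a real training loop, each sampled domain name would select the dataloader whose batch is fed to the shared model, with a per-domain head computing that batch's loss.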

Access the paper at https://arxiv.org/abs/2310.16802