Unlocking the Potential of Pre-training for Accelerated Discovery in Chemistry
This talk explores how pre-training can accelerate discovery in chemistry by learning general-purpose representations from large, diverse datasets. It builds on the speaker’s previous work on Joint Multi-domain Pre-training (JMP), which achieved state-of-the-art performance on a wide range of atomistic prediction tasks, and examines key challenges and opportunities: covering a vast chemical space with limited data, developing pre-training objectives that leverage abundant simulation data, and scaling models to billions of parameters.