Talks and presentations

Unlocking the Potential of Pre-training for Accelerated Discovery in Chemistry

September 20, 2024

Invited Talk, AI for Science Institute (AISI), Beijing, Remote

This talk explores the potential of pre-training methods to accelerate discovery in chemistry by learning general-purpose representations from large, diverse datasets. Building upon the speaker’s previous work on Joint Multi-Domain Pre-training (JMP), which achieved state-of-the-art performance on a wide range of atomistic prediction tasks, the talk dives into key challenges and opportunities, such as handling the vast chemical space with limited data, developing pre-training objectives that leverage abundant simulation data, and scaling models to billions of parameters.

Unlocking the Potential of Pre-training for Accelerated Discovery in Chemistry

August 27, 2024

Invited Talk, 2024 Machine Learning for Materials and Molecular Discoveries Symposium, Gothenburg, Sweden

This talk explores the potential of pre-training methods to accelerate discovery in chemistry by learning general-purpose representations from large, diverse datasets. Building upon the speaker’s previous work on Joint Multi-Domain Pre-training (JMP), which achieved state-of-the-art performance on a wide range of atomistic prediction tasks, the talk dives into key challenges and opportunities, such as handling the vast chemical space with limited data, developing pre-training objectives that leverage abundant simulation data, and scaling models to billions of parameters.

From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction

July 02, 2024

Invited Talk, King Abdullah University of Science and Technology (KAUST), Remote

This talk introduces Joint Multi-Domain Pre-training (JMP), a robust supervised pre-training approach that simultaneously trains on data from multiple chemical domains. JMP demonstrates state-of-the-art results on key small-molecule, large-molecule, and materials datasets and offers insights into how pre-training strategies influence fine-tuning.
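
To make the setup concrete, below is a minimal sketch of what supervised pre-training over multiple chemical domains can look like: a shared backbone with one prediction head per dataset, sampling a domain at each step. This is an illustrative sketch, not the actual JMP implementation; the `MultiDomainModel` and `pretrain` names, and the size-proportional sampling choice, are assumptions made for the example.

```python
# Illustrative sketch of joint multi-domain supervised pre-training
# (not the exact JMP implementation): a shared backbone produces features,
# and each dataset/domain gets its own output head.
import random
import torch
import torch.nn as nn

class MultiDomainModel(nn.Module):
    def __init__(self, backbone: nn.Module, head_dims: dict, hidden: int = 256):
        super().__init__()
        self.backbone = backbone                      # shared representation model (placeholder)
        self.heads = nn.ModuleDict({                  # one output head per domain
            name: nn.Linear(hidden, out_dim) for name, out_dim in head_dims.items()
        })

    def forward(self, domain: str, batch):
        features = self.backbone(batch)               # shared features, shape (batch, hidden)
        return self.heads[domain](features)           # domain-specific prediction

def pretrain(model, loaders: dict, steps: int, lr: float = 1e-4):
    """Sample a domain each step (weighted by dataset size) and take one optimizer update."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    names = list(loaders)
    # Size-proportional sampling; assumes map-style datasets with __len__.
    weights = [len(dl.dataset) for dl in loaders.values()]
    iters = {name: iter(dl) for name, dl in loaders.items()}
    loss_fn = nn.L1Loss()
    for _ in range(steps):
        domain = random.choices(names, weights=weights, k=1)[0]
        try:
            inputs, targets = next(iters[domain])
        except StopIteration:                         # restart an exhausted loader
            iters[domain] = iter(loaders[domain])
            inputs, targets = next(iters[domain])
        loss = loss_fn(model(domain, inputs), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
```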

From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction

April 10, 2024

Invited Talk, Molecular ML Reading Group, Remote

This talk introduces Joint Multi-Domain Pre-training (JMP), a robust supervised pre-training approach that simultaneously trains on data from multiple chemical domains. JMP demonstrates state-of-the-art results on key small-molecule, large-molecule, and materials datasets and offers insights into how pre-training strategies influence fine-tuning.

From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction

August 01, 2023

Talk, ACS Fall Meeting, San Francisco, CA

This talk introduces Joint Multi-Domain Pre-training (JMP), a robust supervised pre-training approach that simultaneously trains on data from multiple chemical domains. JMP demonstrates state-of-the-art results on key small-molecule, large-molecule, and materials datasets and offers insights into how pre-training strategies influence fine-tuning.

SmaQ: Smart Quantization for DNN Training by Exploiting Value Clustering

April 01, 2021

Talk, Georgia Institute of Technology, Atlanta, GA

This talk introduces the Smart Quantization (SmaQ) technique for DNN training. SmaQ is a novel quantization scheme that exploits the observed, approximately normally distributed value clustering in DNNs to quantize neural network weights, gradients, feature maps, gradient maps, and optimizer state values. SmaQ reduces memory usage during training by up to 6.7x with no loss in accuracy.
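
The core idea can be illustrated with a small sketch: summarize each tensor by its mean and standard deviation, then store each value as a low-bit index of its distance from the mean in standard deviations. This is a minimal illustration in the spirit of SmaQ, not its actual bit layout; the `quantize`/`dequantize` helpers and the ±4σ clipping range are assumptions made for the example.

```python
# Minimal sketch of normal-distribution-aware quantization (illustrative only,
# not the exact SmaQ format): store (mean, std) per tensor plus a few-bit code
# for how many standard deviations each value sits from the mean.
import numpy as np

def quantize(x: np.ndarray, bits: int = 8, clip_sigmas: float = 4.0):
    mu, sigma = float(x.mean()), float(x.std()) + 1e-12
    z = np.clip((x - mu) / sigma, -clip_sigmas, clip_sigmas)   # normalized values
    levels = 2 ** bits - 1
    q = np.round((z + clip_sigmas) / (2 * clip_sigmas) * levels).astype(np.uint8)
    return q, mu, sigma

def dequantize(q: np.ndarray, mu: float, sigma: float, bits: int = 8, clip_sigmas: float = 4.0):
    levels = 2 ** bits - 1
    z = q.astype(np.float32) / levels * (2 * clip_sigmas) - clip_sigmas
    return z * sigma + mu

# Example: an approximately Gaussian gradient tensor survives 8-bit storage well.
g = np.random.normal(0.0, 0.02, size=(1024, 1024)).astype(np.float32)
q, mu, sigma = quantize(g)
max_error = np.abs(dequantize(q, mu, sigma) - g).max()
```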

Legal Text Summarization Using Transformer Models

November 01, 2020

Talk, Georgia Institute of Technology, Atlanta, GA

This talk presents our work on a transformer-based encoder-decoder architecture for abstractive legal text summarization. The model combines PEGASUS’ (Zhang et al. 2020) pre-training objective with Longformer’s (Beltagy et al. 2020) dilated sliding-window attention mechanism, allowing it to handle extremely long input sequences when generating summaries of legal documents. It achieves state-of-the-art summarization performance on the BIGPATENT dataset.
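
As a rough illustration of the PEGASUS-style gap-sentence pre-training objective mentioned above, the toy sketch below removes the most "informative" sentences from a document and uses them as the generation target. The word-overlap scoring and the `gap_sentence_example` helper are assumptions standing in for the ROUGE-based sentence selection described by Zhang et al. (2020); only the `[MASK1]` token name follows that paper.

```python
# Toy sketch of gap-sentence generation (GSG) pre-processing: mask the most
# document-like sentences and train the model to regenerate them.
import re

def gap_sentence_example(document: str, mask_ratio: float = 0.3):
    """Build one (source, target) pre-training pair from a raw document."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    words = [set(s.lower().split()) for s in sentences]

    # Score each sentence by word overlap with the rest of the document
    # (a crude stand-in for the ROUGE1-F selection used in the paper).
    scores = []
    for i, w in enumerate(words):
        others = (words[j] for j in range(len(words)) if j != i)
        rest = set().union(*others) if len(words) > 1 else set()
        scores.append(len(w & rest) / max(len(w), 1))

    n_mask = max(1, int(mask_ratio * len(sentences)))
    masked = set(sorted(range(len(sentences)), key=lambda i: -scores[i])[:n_mask])

    # Encoder input: document with the selected sentences replaced by a mask token.
    # Decoder target: the selected sentences themselves, in document order.
    source = " ".join("[MASK1]" if i in masked else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(masked))
    return source, target
```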

Attention is All You Need: The Transformer Architecture

November 01, 2020

Talk, Georgia Institute of Technology, Atlanta, GA

This talk presents the seminal Transformer paper by Vaswani et al. (2017) and discusses its impact on the field of natural language processing. The Transformer architecture has revolutionized the field by introducing self-attention mechanisms that can model long-range dependencies in sequences, enabling parallelization and scalability in training.
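
For reference, the scaled dot-product self-attention at the heart of the architecture, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, can be written in a few lines. The NumPy sketch below is a single-head illustration, not the full multi-head Transformer; the shapes and the `self_attention` helper are assumptions made for the example.

```python
# Minimal single-head scaled dot-product self-attention (Vaswani et al. 2017).
import numpy as np

def self_attention(x: np.ndarray, w_q: np.ndarray, w_k: np.ndarray, w_v: np.ndarray) -> np.ndarray:
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                       # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over key positions
    return weights @ v                                    # weighted sum of value vectors

# Example: 5 tokens with d_model = 8 and a single head with d_k = 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)    # shape (5, 4)
```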