Education

Academic background and degrees

Georgia Institute of Technology

Ph.D. in Machine Learning, 4.0 GPA

2024 - 2028 (expected)
  • Advisors: Pan Li and Victor Fung
  • Research Focus: Foundation models for atomic-scale simulation and scientific discovery.
  • CSGF Fellowship Alternate List (2025)
  • NSF Graduate Research Fellowship Honorable Mention (2024)
Georgia Institute of Technology

M.S. in Computer Science (ML Focus), 4.0 GPA

2020 - 2021
  • Advisor: Hyesoon Kim
  • "Thank a Teacher" Award, Georgia Tech Center for Teaching and Learning (2020, 2021)
Georgia Institute of Technology

B.S. in Computer Science (ML Focus)

2015 - 2019
  • ACM SIGBED Student Research Competition Bronze Medal (2019)
  • Zell Miller Scholarship Recipient (2015 - 2019)
Druid Hills High School

International Baccalaureate Diploma

2011 - 2015

Work Experience

My professional and research history

ByteDance Seed

Research Scientist Intern, AI for Science

May 2025 - Present
  • Developed STAR-MD, a generative molecular dynamics model and the first to simulate stable protein dynamics at microsecond timescales, extrapolating up to 100x beyond its training data where all prior methods fail catastrophically. It achieves SOTA on ATLAS with ~33% higher coverage and ~60% higher structural quality, while providing up to a 1000x speedup over traditional MD. (Under review*)
  • Designed an SE(3)-equivariant diffusion architecture with novel joint spatiotemporal attention that avoids cubic memory bottlenecks, scaling to proteins with 1000+ residues and rollouts of thousands of frames (10+ μs).
  • Built an end-to-end pipeline for generative atomic dynamics: distributed training, physics-based relaxation, and quality/diversity evaluation.
Graph Computation and Machine Learning Lab @ GT

Aug 2024 - Present
  • Researching robust fine-tuning strategies for large-scale pre-trained GNN models.
  • Co-developed MatterTune, an open-source platform for fine-tuning atomistic foundation models (UMA, JMP, EquiformerV2, MACE, etc.) with parameter-efficient methods; achieved near-SOTA on the MatBench Discovery benchmark. (Digital Discovery 2025)
  • Contributed to a benchmark study of 8 robust fine-tuning methods across 6 molecular graph foundation models on 12 downstream tasks, informing the design of improved fine-tuning strategies for molecular property prediction. (NeurIPS 2025)
  • Prior work at the GT HPArch Lab: developed efficient inference strategies for diffusion models (latent-space sampling, quantization) and memory-reduction techniques for DNN training and inference (~7x memory reduction). (IEEE CAL 2021*, MemSys 2020*)
ProcessMiner

Machine Learning Intern

June 2024 - Aug 2024
  • Developed transformer models pre-trained on roughly 500,000 time-series data points from manufacturing processes to predict process outcomes and detect anomalies.
  • Fine-tuned these models on real-world manufacturing datasets, improving accuracy over the previous production models.
  • Researched efficient inference strategies for pre-trained image diffusion models, with a focus on generating diverse, high-quality images.
  • Developed an efficient sampling method for Denoising Diffusion Probabilistic Models (DDPMs) which leverages the structure of the latent space to guide sampling, reducing the number of samples needed for high-quality image generation.
  • Led the development of Joint Multi-domain Pre-training (JMP), a foundation model for atomic property prediction pre-trained on 120M+ structures from diverse chemical domains (catalysis and small molecules). Achieved SOTA on 35 of 41 downstream tasks, including out-of-distribution domains (large molecules, materials, protein-ligand complexes). (ICLR 2024*)
  • Co-authored a perspective on challenges for large-scale generalizable ML potentials in catalyst discovery (ACS Catalysis 2022), and co-developed attention-based transfer learning methods for GNNs across molecular and catalyst domains (J Chem Phys 2022).
  • Contributed to the Open Catalyst 2022 paper by running baseline model benchmarks for oxide electrocatalysts. (ACS Catalysis 2023)
  • Developed novel quantization techniques for efficient DNN training and inference: SmaQ exploits value clustering to achieve 6.7x memory reduction (IEEE CAL 2021*), and NNW-BDI compresses neural network weights for up to 7x memory reduction (MemSys 2020*).
  • Optimized SLAM and robotics algorithms for real-time deployment on resource-constrained platforms: achieved 5x speedup for ORB-SLAM2 on Raspberry Pi (SRC ESWEEK 2019, 3rd Place*), co-developed Pisces for 7.4x faster power-aware SLAM on FPGAs (DAC 2020), and developed context-aware task scheduling for mobile robots (IEEE Edge 2023).
  • Co-authored drone characterization studies with detailed software profiling of the ArduCopter flight stack (ISPASS 2020, ASPLOS 2021).
Georgia Institute of Technology

Graduate Teaching Assistant

Aug 2020 - May 2021
  • Led weekly recitations, graded assignments, and held office hours for CS 4510: Automata and Complexity, a senior-level undergraduate course on the theory of computation.
  • Received two "Thank a Teacher" awards from the Georgia Tech Center for Teaching and Learning in recognition of outstanding contributions and positive impact as a teaching assistant. (2020, 2021)