Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics
We present STAR-MD, a scalable SE(3)-equivariant diffusion model that generates physically plausible protein trajectories over microsecond timescales.
Abstract
Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics, but their computational cost limits access to biologically relevant timescales. Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation due to architectural constraints, error accumulation, and inadequate modeling of spatiotemporal dynamics.
We present STAR-MD (Spatio-Temporal Autoregressive Rollout for Molecular Dynamics), a scalable SE(3)-equivariant diffusion model that generates physically plausible protein trajectories over microsecond timescales. Our key innovation is a causal diffusion transformer with joint spatiotemporal attention that efficiently captures complex space-time dependencies while avoiding the memory bottlenecks of existing methods.
On the standard ATLAS benchmark, STAR-MD achieves state-of-the-art performance across all metrics, substantially improving conformational coverage, structural validity, and dynamic fidelity compared to previous methods. STAR-MD also extrapolates to stable microsecond-scale trajectories where baseline methods fail catastrophically, maintaining high structural quality throughout the extended rollout.
Our comprehensive evaluation reveals severe limitations in current models for long-horizon generation, while demonstrating that STAR-MD's joint spatiotemporal modeling enables robust dynamics simulation at biologically relevant timescales, paving the way for accelerated exploration of protein function.
Method Overview
STAR-MD uses a causal diffusion transformer with joint spatio-temporal attention for autoregressive protein trajectory generation.
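To make the idea of causal joint spatio-temporal attention concrete, here is a minimal NumPy sketch of one common way to realize it: flatten all (frame, residue) tokens into a single sequence and apply a block-causal mask, so that each token attends to every residue in its own and earlier frames but never to later frames. This is an illustration only, not STAR-MD's actual architecture. The function names `block_causal_mask` and `masked_attention`, the single-head layout, and the omission of SE(3)-equivariant features are all simplifying assumptions.

```python
import numpy as np


def block_causal_mask(num_frames: int, num_residues: int) -> np.ndarray:
    """Return a (T*N, T*N) boolean mask; True means "query may attend to key".

    Tokens are ordered frame by frame, so token index = frame * N + residue.
    """
    frame_of_token = np.repeat(np.arange(num_frames), num_residues)
    # A query in frame t may see keys in frames s <= t (all residues of them).
    return frame_of_token[:, None] >= frame_of_token[None, :]


def masked_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
                     mask: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product attention with a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for numerical stability; every row has at
    # least one allowed key (its own frame), so the maximum is finite.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


# Hypothetical toy sizes: 3 frames, 4 residues, 8 feature channels.
T, N, D = 3, 4, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(T * N, D))
out = masked_attention(x, x, x, block_causal_mask(T, N))
print(out.shape)  # (12, 8)
```

Because spatial and temporal context share one attention map, a residue in the current frame can attend directly to a different residue in a past frame. A factorized design would instead run separate spatial and temporal attention passes.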
Results
We evaluate STAR-MD on the ATLAS benchmark across multiple timescales (100 ns, 240 ns, and 1 μs). Results show that STAR-MD achieves state-of-the-art performance in conformational coverage, structural validity, and dynamic fidelity compared to existing methods. Best non-oracle values are marked with *.
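As a rough illustration of the distributional comparison behind a JSD-style coverage score, the sketch below computes the Jensen-Shannon divergence between histograms of a one-dimensional feature sampled from a reference trajectory and a generated one. It assumes that JSD here denotes Jensen-Shannon divergence. The choice of feature, the bin count, and the helper names are hypothetical and are not taken from the ATLAS benchmark protocol.

```python
import numpy as np


def jensen_shannon_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Inputs are non-negative weights over the same bins; they are normalized
    here. The result lies in [0, 1]: 0 for identical, 1 for disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a: np.ndarray, b: np.ndarray) -> float:
        nonzero = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[nonzero] * np.log2(a[nonzero] / b[nonzero])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def feature_jsd(reference: np.ndarray, generated: np.ndarray,
                bins: int = 50) -> float:
    """Histogram two samples of a 1-D feature on shared bins, then compare."""
    lo = min(reference.min(), generated.min())
    hi = max(reference.max(), generated.max())
    p, _ = np.histogram(reference, bins=bins, range=(lo, hi))
    q, _ = np.histogram(generated, bins=bins, range=(lo, hi))
    return jensen_shannon_divergence(p, q)
```

Lower values mean the generated ensemble covers the reference distribution more closely, which matches the "lower is better" direction given for JSD in the tables.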
Table 1: ATLAS 100 ns Evaluation
Coverage & Validity (JSD, Rec), Dynamic Fidelity (tICA, RMSD, AutoCor, VAMP-2), and Structural Validity (CA%, AA%, CA+AA%) metrics.
| Model | JSD ↓ | Rec ↑ | tICA ↑ | RMSD ↓ | AutoCor ↓ | VAMP-2 ↓ | CA% ↑ | AA% ↑ | CA+AA% ↑ |
|---|---|---|---|---|---|---|---|---|---|
| MD (Oracle) | 0.31 | 0.67 | 0.17 | 0.00 | 0.00 | 0.02 | 98.37 | 98.07 | 96.43 |
|  | 0.56 ± 0.01 | 0.28 ± 0.01 | 0.12 ± 0.00 | 0.38 ± 0.01 | 0.05 ± 0.00 | 0.38 ± 0.01 | 71.83 ± 1.90 | 95.03 ± 0.59 | 68.31 ± 2.20 |
|  | 0.59 ± 0.01 | 0.20 ± 0.01 | N/A | 3.31 ± 0.06 | 0.12 ± 0.01 | 1.56 ± 0.01 | 10.58 ± 0.09 | 0.82 ± 0.10 | 0.47 ± 0.04 |
|  | 0.52 ± 0.01 | 0.36 ± 0.01 | 0.15 ± 0.01 | 0.20 ± 0.00 | 0.08 ± 0.00 | 0.47 ± 0.02 | 56.94 ± 0.52 | 92.47 ± 0.25 | 52.06 ± 0.36 |
| STAR-MD (Ours) | *0.43 ± 0.01 | *0.54 ± 0.01 | *0.17 ± 0.00 | *0.07 ± 0.02 | *0.02 ± 0.00 | *0.10 ± 0.02 | *86.81 ± 0.64 | *98.18 ± 0.05 | *85.29 ± 0.62 |
Table 2a: ATLAS 240 ns Evaluation
Short-horizon evaluation comparing trajectory quality over 240 nanoseconds.
| Model | JSD ↓ | Rec ↑ | RMSD ↓ | AutoCor ↓ | CA% ↑ | AA% ↑ | CA+AA% ↑ |
|---|---|---|---|---|---|---|---|
| MD (Oracle) | 0.26 | 0.75 | 0.01 | 0.00 | 99.53 | 96.83 | 96.36 |
|  | 0.52 ± 0.01 | 0.38 ± 0.01 | 0.48 ± 0.01 | 0.25 ± 0.01 | 63.25 ± 2.10 | 87.83 ± 1.13 | 56.60 ± 2.10 |
|  | 0.57 ± 0.03 | 0.20 ± 0.03 | 1.76 ± 0.05 | 0.14 ± 0.01 | 8.96 ± 0.25 | 0.94 ± 0.12 | 0.63 ± 0.16 |
|  | 0.51 ± 0.01 | 0.42 ± 0.02 | 0.35 ± 0.01 | 0.39 ± 0.01 | 44.71 ± 1.55 | 73.13 ± 0.84 | 36.51 ± 1.22 |
| STAR-MD (Ours) | *0.44 ± 0.01 | *0.59 ± 0.01 | *0.20 ± 0.02 | *0.03 ± 0.01 | *85.16 ± 1.91 | *97.57 ± 0.13 | *83.15 ± 1.99 |
Table 2b: ATLAS 1 μs Evaluation
Long-horizon evaluation demonstrating STAR-MD's stability over microsecond timescales.
| Model | JSD ↓ | Rec ↑ | RMSD ↓ | AutoCor ↓ | CA% ↑ | AA% ↑ | CA+AA% ↑ |
|---|---|---|---|---|---|---|---|
| MD (Oracle) | 0.23 | 0.91 | 0.00 | 0.00 | 96.25 | 86.50 | 82.75 |
|  | 0.56 ± 0.01 | 0.36 ± 0.03 | 0.37 ± 0.02 | 0.39 ± 0.01 | 36.11 ± 7.34 | 56.99 ± 4.52 | 24.81 ± 4.30 |
|  | 0.65 | 0.20 | 0.78 ± 0.03 | *0.04 ± 0.01 | 9.64 ± 0.02 | 0.19 ± 0.00 | 0.06 ± 0.00 |
|  | 0.55 ± 0.02 | 0.45 ± 0.02 | 0.33 ± 0.02 | 0.38 ± 0.03 | 54.74 ± 1.79 | 62.32 ± 3.43 | 36.91 ± 1.39 |
| STAR-MD (Ours) | *0.46 ± 0.01 | *0.61 ± 0.02 | *0.13 ± 0.02 | 0.10 ± 0.02 | *88.47 ± 1.09 | *89.81 ± 0.65 | *79.93 ± 1.04 |
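One way to read the three tables together is to normalize each CA+AA% value by the MD oracle at the same timescale, since the oracle's own validity also drops at the microsecond horizon. The short sketch below does this arithmetic for STAR-MD, using only the mean values reported in Tables 1, 2a, and 2b. The dictionary layout and variable names are illustrative.

```python
# Mean CA+AA% values copied from Tables 1, 2a, and 2b (uncertainties omitted).
oracle = {"100 ns": 96.43, "240 ns": 96.36, "1 us": 82.75}
star_md = {"100 ns": 85.29, "240 ns": 83.15, "1 us": 79.93}

# Fraction of the oracle's validity that STAR-MD retains at each timescale.
relative_validity = {
    horizon: star_md[horizon] / oracle[horizon] for horizon in oracle
}

for horizon, ratio in relative_validity.items():
    print(f"{horizon}: {ratio:.1%} of oracle CA+AA validity")
```

By this measure STAR-MD stays within roughly 86% to 97% of the oracle at all three horizons. The ratio ignores the reported uncertainties, so it is a reading aid, not a statistical comparison.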
Figure 3: Long-horizon Validity Analysis
Comparison of CA + AA validity (%) across timescales (100 ns, 240 ns, and 1 μs). STAR-MD (red) maintains high structural validity even over microsecond-scale rollouts, significantly outperforming baseline methods, which degrade rapidly. Shaded regions indicate standard error across multiple runs.
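For readers reproducing a plot like this, the shaded band can be computed as the mean plus or minus the standard error of the mean over independent runs. The sketch below assumes the per-run validity scores are available as an array of shape (runs, timescales). The numbers in the usage example are placeholders, not values from the paper.

```python
import numpy as np


def mean_and_standard_error(scores: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return per-column mean and standard error of the mean.

    scores has shape (num_runs, num_timescales). The standard error uses the
    sample standard deviation (ddof=1) divided by sqrt(num_runs).
    """
    scores = np.asarray(scores, dtype=float)
    num_runs = scores.shape[0]
    mean = scores.mean(axis=0)
    sem = scores.std(axis=0, ddof=1) / np.sqrt(num_runs)
    return mean, sem


# Placeholder scores for 3 hypothetical runs at 100 ns, 240 ns, and 1 us.
runs = np.array([
    [80.0, 78.0, 75.0],
    [82.0, 80.0, 77.0],
    [84.0, 82.0, 79.0],
])
mean, sem = mean_and_standard_error(runs)
lower, upper = mean - sem, mean + sem  # edges of the shaded band
```

With only a handful of runs the standard error is itself a noisy estimate, so a narrow band should not be read as strong evidence of low variance.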
Generated Trajectories
Explore generated protein dynamics trajectories in 3D. STAR-MD produces stable, physically plausible conformational changes.