Sizhuang He

Second-Year Ph.D. Student in Computer Science
Yale University
sizhuang.he (at) yale.edu

Curriculum Vitae

Last updated: March 25, 2026.

PDF

Research Directions

LLM Agentic Systems: LLM multi-agent systems for large-scale DNA methylation data curation

Post-Training for LLM Agents: Designing dense, verifiable step-level reward chains for LLM agents over sparse outcome rewards

Generative Modeling: Flow Matching, Diffusion, Discrete Diffusion

Computational Biology: Single-cell Transcriptomics Data Modeling

Education

Yale University New Haven, CT
Ph.D. in Computer Science Aug. 2024 -- Present
  • Advisor: Dr. David van Dijk
  • Research Focus: Machine Learning for Computational Biology
University of Michigan, Ann Arbor Ann Arbor, MI
Bachelor of Science in Honors Mathematics (Minor in Computer Science) Sep. 2019 -- May 2023
  • Graduated with Highest Distinction
  • GPA: 4.0 / 4.0

Research Experience

LLM Agents
  • Actively developing an LLM multi-agent system for large-scale, automated curation of DNA methylation datasets.
    • Designed a fully automated and generalizable pipeline that adapts to unseen datasets and aligns with the workflow needs of biological researchers.
    • The pipeline scales to thousands of publicly available datasets and reduces curation workloads that previously required years of manual effort by multiple biologists to just a few hours.
    • Designed for deployment as a continuous system that automatically downloads, curates, and updates relevant public datasets, providing biologists with clean, standardized data to accelerate downstream research.
    • Developing a RAG system for agent memory storage and retrieval, enabling more accurate generation and allowing agents to self-evolve by learning from past experience.
  • Developing a post-training method for LLM agents that replaces sparse final-outcome rewards with dense, verifiable step-level supervision for multi-step reasoning and tool calling.
    • Designing a framework that extracts verifiable intermediate rewards from environmental responses without requiring human-annotated labels or pretrained reward models.
    • Training Qwen2.5-7B with GRPO on interactive benchmarks (ALFWorld, ScienceWorld) for stable multi-step policy optimization.
Generative Models and LLMs
  • Developed Soft-Rank Diffusion, the first discrete diffusion framework that lifts permutations into a continuous soft-rank space, enabling scalable generative modeling over the combinatorially intractable symmetric group.
    • Designed a forward process that replaces abrupt shuffle-based corruption with smooth reflected diffusion in continuous soft-rank space, yielding more structured trajectories that scale gracefully with sequence length.
    • Derived a hybrid reverse sampler that augments intractable discrete steps in permutation space with tractable continuous updates in soft-rank space, enabling efficient and principled generation.
    • Demonstrated superior performance on sorting and combinatorial optimization benchmarks, with particularly strong gains in the long-sequence regime where prior methods degrade.
    • Currently under peer review.
  • Developed CaDDi, a non-Markovian discrete diffusion framework that unifies discrete diffusion with causal LLMs, two core paradigms in natural language generation previously viewed as orthogonal.
    • By relaxing the longstanding Markovian assumption in discrete diffusion models, CaDDi achieves more coherent text generation, demonstrated by lower generative perplexity than prior discrete diffusion approaches.
    • Finetuned a Qwen LLM (1.5B parameters) to perform diffusion-style denoising, achieving improved performance on multiple reasoning benchmarks.
    • Published as a co-first author at NeurIPS 2025.
  • Developed CaLMFlow, a flow-matching paradigm that recasts generation as solving a Volterra integral equation rather than an ODE, mitigating the stiffness issues commonly encountered in ODE-based formulations.
    • Applied CaLMFlow to single-cell perturbation response prediction, demonstrating superior ability to model the underlying data distribution and significantly improved extrapolation to out-of-distribution (OOD) perturbations compared to specialized baselines.

Publications

  1. Learning Permutation Distributions via Reflected Diffusion on Ranks
    S. He*, Y. Zhang*, et al.
    In Review
  2. Non-Markovian Discrete Diffusion with Causal Language Models
    Y. Zhang*, S. He*, et al.
    NeurIPS 2025
  3. STRIDE: Post-Training LLMs to Reason and Refine Bio-Sequences via Edit Trajectories
    D. Zhang, S. Zhang, S. He, Y. Zhang and D. van Dijk
    In Review
  4. TANTE: Time-Adaptive Operator Learning via Neural Taylor Expansion
    Z. Wu, S. Wang, S. Zhang, S. He, et al.
    In Review
  5. Intelligence at the Edge of Chaos
    S. Zhang*, A. Patel*, S. Rizvi, N. Liu, S. He, et al.
    ICLR 2025
  6. COAST: Intelligent Time-Adaptive Neural Operators
    Z. Wu, S. Zhang, S. He, et al.
    AI4MATH Workshop at ICML 2025
  7. Scaling Large Language Models for Next-Generation Single-Cell Analysis
    S. Rizvi*, D. Levine*, A. Patel*, S. Zhang*, E. Wang*, S. He, et al.
    In Review
  8. CaLMFlow: Flow Matching using Causal Language Models
    S. He*, D. Levine*, et al.
    arXiv
  9. Operator Learning Meets Numerical Analysis: Improving Neural Networks through Iterative Methods
    E. Zappala, D. Levine, S. He, et al.
    arXiv

* denotes equal contribution.

Honors & Awards

Services
