Sayak Paul

Hi there 👋 I am Sayak Paul (সায়ক পাল). I work on 🧨 diffusion models at Hugging Face. You can learn more about me here. I also maintain a Google Doc that answers some FAQs at length; you can check it out here.

My external articles and other publishing engagements are listed here. Decks from my speaking engagements are listed here. A detailed account of the things that are not directly available from the top navbar (interviews, talks, etc.) can be found here.

The structure of this website is inspired by Omar’s site.

News

New work on few-step sampling of diffusion models: SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation, with NVIDIA (accepted to ICCV’25).
New work: From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning with Le Zhuo, Liangbing Zhao, and others (accepted to ICCV’25).
New work (CVPR’25): Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis with Bingda Tang, Boyang Zheng, Xichen Pan, and Saining Xie.
New work: Fine-Grained Perturbation Guidance via Attention Head Selection with Donghoon Ahn et al.
Gave an invited talk at Stanford (CS25) about DiTs: Transformers in Diffusion Models for Image Generation and Beyond (slides and recording).

To learn more about my projects, please refer to my GitHub profile. For an up-to-date list of my publications, refer to my Google Scholar page.

Apart from the blog posts here, I also try to write for other platforms. Please refer here for more details.