Sayak Paul
Hi there 👋 I am Sayak Paul (সায়ক পাল). I work on 🧨 diffusion models at Hugging Face. You can learn more about me here. I maintain a Google Doc that answers some FAQs at length. You can check it out here.
My external articles and other publishing engagements are listed here. Decks from my speaking engagements are listed here. A detailed account of the things that are not directly available from the top navbar (interviews, talks etc.) can be found here.
The structure of this website is inspired by Omar’s site.
News
- New work (CVPR’25): Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis with Bingda Tang, Boyang Zheng, Xichen Pan, and Saining Xie.
- SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation with NVIDIA (ICCV’25; Spotlight 💡).
- From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning with Le Zhuo, Liangbing Zhao, and others (ICCV’25).
- New work (NeurIPS’25): Fine-Grained Perturbation Guidance via Attention Head Selection with Donghoon Ahn et al.
- New work: Factuality Matters: When Image Generation and Editing Meet Structured Visuals with my good friend and collaborator Le Zhuo et al.
To learn more about my projects, please refer to my GitHub profile. For an up-to-date list of my publications, refer to my Google Scholar page.
Apart from the blog posts here, I also try to write for other platforms. Please refer here for more details.