Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback Paper • 2510.16888 • Published Oct 19 • 21
FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation Paper • 2509.25187 • Published Sep 29 • 2
Can Understanding and Generation Truly Benefit Together -- or Just Coexist? Paper • 2509.09666 • Published Sep 11 • 34
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published Jun 3 • 58
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published Jun 3 • 58
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published Nov 28, 2024 • 33
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published Nov 26, 2024 • 37
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published Nov 26, 2024 • 37
ConsisID Collection Identity-Preserving Text-to-Video Generation by Frequency Decomposition • 4 items • Updated Dec 3, 2024 • 12