next BIG future

New DeepSeek Janus-Pro-7B Beats OpenAI DALL-E 3 on Image Generation

By Brian Wang

Jan 27, 2025

DeepSeek just dropped a new open-source multimodal AI model, Janus-Pro-7B. It is released under the MIT open-source license.

It's multimodal (it can generate images as well as understand them) and beats OpenAI's DALL-E 3 and Stable Diffusion on the GenEval and DPG-Bench benchmarks.

This comes on top of all the hype around DeepSeek's R1 model.

Here is the link to the DeepSeek Janus-Pro-7B GitHub repository.

Here is the Hugging Face page for DeepSeek Janus-Pro-7B.

Janus-Pro is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate pathways, while still utilizing a single, unified transformer architecture for processing. The decoupling not only alleviates the conflict between the visual encoder’s roles in understanding and generation, but also enhances the framework’s flexibility. Janus-Pro surpasses previous unified models and matches or exceeds the performance of task-specific models. The simplicity, high flexibility, and effectiveness of Janus-Pro make it a strong candidate for next-generation unified multimodal models.
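To make the decoupling idea concrete, here is a toy sketch in plain Python. It is not DeepSeek's code: the function names, the 4-wide vectors, the 8-entry codebook, and the mean-pooling "backbone" are all assumptions chosen for illustration. The only point it shows is the routing: each task gets its own visual encoder, and both feed the same shared sequence model.

```python
from typing import Callable, Dict, List

Vector = List[float]
DIM = 4  # toy embedding width (assumption, not a Janus-Pro value)


def understanding_encoder(patches: List[float]) -> List[Vector]:
    """Hypothetical understanding pathway: continuous features per patch."""
    return [[p, p * p, 1.0 - p, 1.0] for p in patches]


def generation_encoder(patches: List[float], codebook_size: int = 8) -> List[Vector]:
    """Hypothetical generation pathway: quantize each patch to a discrete
    codebook id, then embed that id as a vector."""
    ids = [min(int(p * codebook_size), codebook_size - 1) for p in patches]
    return [[float(i), float(i % 2), float(i // 2), 0.0] for i in ids]


def shared_backbone(sequence: List[Vector]) -> Vector:
    """Stand-in for the single unified transformer: mean-pools the sequence.
    A real model would run autoregressive attention layers here."""
    n = len(sequence)
    return [sum(v[d] for v in sequence) / n for d in range(DIM)]


# Decoupling: the visual encoder is chosen per task, the backbone is not.
ENCODERS: Dict[str, Callable[[List[float]], List[Vector]]] = {
    "understanding": understanding_encoder,
    "generation": generation_encoder,
}


def run(task: str, patches: List[float], text: List[Vector]) -> Vector:
    """Encode the image with the task's own pathway, then process text and
    visual embeddings together in the shared backbone."""
    visual = ENCODERS[task](patches)
    return shared_backbone(text + visual)


if __name__ == "__main__":
    patches = [0.0, 0.5, 1.0]        # toy "image" of three patch intensities
    text = [[1.0, 0.0, 0.0, 0.0]]    # toy text embedding
    print(run("understanding", patches, text))
    print(run("generation", patches, text))
```

Swapping or adding an encoder in this sketch only touches the `ENCODERS` table, which is a small-scale version of the flexibility the paragraph above describes.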
