ERNIE-ViLG 2.0

image
open-source
ERNIE-ViLG 2.0 is Baidu's 24B-parameter text-to-image diffusion model. It generates high-quality images from Chinese tex...
Version: 2.0
Released: 11/07/2022
Pricing: free
Repository: Hugging Face

Architecture

  • parameters: 24 billion
  • context_length: N/A
  • inference_type: Latent diffusion text-to-image
  • training_data: Large-scale Chinese/English image-text pairs

Capabilities

  • Chinese
  • English
  • Text-to-image: high-fidelity image generation
  • Text-to-image: Chinese cultural themes

Benchmarks

  • MS-COCO zero-shot FID: 6.75 (lower is better; state-of-the-art at release)
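
For readers unfamiliar with the metric, FID (Fréchet Inception Distance) compares the statistics of Inception features from real and generated images. The standard definition is given below. The symbols are introduced here for illustration and do not appear in the listing: \mu_r, \Sigma_r are the mean and covariance of the real-image features, and \mu_g, \Sigma_g those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

A score of 0 would mean the two feature distributions have identical mean and covariance, so lower values indicate generated images that are statistically closer to the reference set.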

Safety

  • Blocks politically sensitive prompts (e.g. 'Tiananmen')
  • No RLHF; uses rule-based moderation
  • Can produce stereotyped content reflecting training data
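
Rule-based moderation of the kind listed above can be pictured as a keyword check that runs before the prompt reaches the model. The following is a minimal sketch under stated assumptions, not Baidu's actual filter: the names `BLOCKED_TERMS`, `normalize`, and `is_blocked` are hypothetical, and the single blocklist entry is taken from the example in this listing.

```python
import unicodedata

# Hypothetical blocklist for illustration only; the real rule set is not
# described in this listing. The single entry mirrors the example above.
BLOCKED_TERMS = {"tiananmen"}


def normalize(prompt: str) -> str:
    """Fold case and Unicode width variants so trivial respellings still match."""
    return unicodedata.normalize("NFKC", prompt).casefold()


def is_blocked(prompt: str, blocked_terms=BLOCKED_TERMS) -> bool:
    """Return True if any blocked term occurs as a substring of the normalized prompt."""
    text = normalize(prompt)
    return any(term in text for term in blocked_terms)
```

A caller would check `is_blocked(prompt)` and refuse generation when it returns True. Plain substring matching like this is cheap and predictable, but it can over-block unrelated prompts that happen to contain a listed term and under-block paraphrases, which is one reason rule-based moderation is usually considered a coarse safeguard.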

Deployment

  • regions: Global
  • hosting: Hugging Face, Baidu AI Studio
  • integrations: Used via Baidu AI open platform

Tags

text-to-image, diffusion, Chinese, open-source
