ERNIE-ViLG 2.0
image
open-source
ERNIE-ViLG 2.0 is Baidu's 24B-parameter text-to-image diffusion model. It generates high-quality images from Chinese tex...
Version: 2.0
Released: 11/07/2022
Pricing:
- details: free
Repository: Hugging Face
Architecture
- parameters: 24 billion
- context_length: N/A
- inference_type: Latent diffusion text-to-image
- training_data: Large-scale Chinese/English image-text pairs
Capabilities
- Chinese
- English
- Text-to-image: high-fidelity image generation
- Text-to-image: Chinese cultural themes
Benchmarks
- MS-COCO zero-shot FID: 6.75 (lower is better; state of the art at release)
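
For readers unfamiliar with the metric, the FID (Fréchet Inception Distance) score above follows the standard definition below. It compares Gaussian fits to Inception-network features of real and generated images. The symbols are the conventional ones, not taken from this listing, which does not state the sample count or evaluation protocol behind the reported number.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Here $\mu_r, \Sigma_r$ are the mean and covariance of the features of real images, and $\mu_g, \Sigma_g$ those of generated images. Identical distributions give a score of 0, so lower values indicate generated images that are statistically closer to the reference set.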
Safety
- Blocks politically sensitive prompts (e.g. 'Tiananmen')
- No RLHF; uses rule-based moderation
- Can produce stereotyped content reflecting training data
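
As an illustration of what rule-based moderation can look like, here is a minimal sketch of a keyword blocklist applied to prompts before generation. It is not Baidu's implementation: the function names, the blocklist contents (only the example term from this listing), and the normalization steps are all assumptions made for the example.

```python
import unicodedata

# Hypothetical blocklist: the actual rules are not published in this listing.
BLOCKED_TERMS = {"tiananmen"}


def normalize(prompt: str) -> str:
    """Fold case and Unicode width variants so trivial rewrites still match."""
    return unicodedata.normalize("NFKC", prompt).casefold()


def is_blocked(prompt: str, blocked_terms=BLOCKED_TERMS) -> bool:
    """Return True if any blocked term occurs as a substring of the prompt."""
    text = normalize(prompt)
    return any(term in text for term in blocked_terms)


def moderate(prompt: str) -> str:
    """Reject blocked prompts before they would reach the image model."""
    if is_blocked(prompt):
        raise ValueError("prompt rejected by rule-based filter")
    return prompt
```

A substring filter like this is cheap and predictable, but it cannot judge context, which is one reason such systems both over-block harmless prompts and miss paraphrases.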
Deployment
- regions: Global
- hosting: Hugging Face, Baidu AI Studio
- integrations: Used via Baidu AI open platform
Tags
- text-to-image
- diffusion
- Chinese
- open-source