Gemini 1.5 Pro
text, image, video, audio
paid
Google's multimodal foundation model for text, audio, video, and image understanding with long-context reasoning.
Version: 1.5-pro
Released: 02/15/2024
Pricing:
- tier: per-minute compute
- currency: USD
- details: Pricing not public; available via Vertex AI
Architecture
- family: Gemini
- parameters: Unknown
- training_data: Multimodal large-scale datasets
- context_length: 1000000 tokens
- inference_type: cloud
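As a rough illustration of what the context_length figure means in practice, the sketch below checks whether a request fits the window. The helper names are hypothetical, and it assumes, for illustration only, that prompt tokens and reserved output tokens count against the same 1,000,000-token budget. It also assumes the caller already has token counts from a tokenizer.

```python
# Context window size, taken from the context_length field above.
CONTEXT_LENGTH = 1_000_000


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int = 0) -> bool:
    """Return True if prompt plus reserved output stays within the window.

    Assumption: input and output share one token budget.
    """
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_output_tokens <= CONTEXT_LENGTH


def remaining_tokens(prompt_tokens: int) -> int:
    """Return how many tokens are left in the window, never below zero."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return max(CONTEXT_LENGTH - prompt_tokens, 0)
```

A caller might run fits_in_context(count, reserved_output_tokens=8192) before sending a long document, and split the input into chunks when the check fails.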
Capabilities
- multimodal-reasoning
- long-context-reasoning
- video-analysis
- audio-understanding
- text-generation
- code
Languages Supported
en, zh, hi, ja, fr, de, es
Benchmarks
- MMLU: 90.1
- GSM8K: 95
- VideoQA: 89.3
Safety
- content filtering
- DeepMind responsible AI policy
- High reliability and alignment focus.
Deployment
- regions: US, EU, APAC
- hosting: Google Cloud
- integrations: Google Cloud Vertex AI, Workspace AI, Android Studio
API Access
Auth: OAuth2
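OAuth2 access tokens are conventionally sent as a bearer token in the Authorization header. The sketch below shows only that step. The helper name build_auth_headers is hypothetical, the token is assumed to have been obtained elsewhere (for example through a Google Cloud credential flow), and no endpoint or request body is shown because the card does not specify one.

```python
def build_auth_headers(access_token: str) -> dict[str, str]:
    """Build HTTP headers carrying an OAuth2 bearer token.

    Assumption: the token was already obtained through an OAuth2 flow;
    this helper only formats it for a JSON request.
    """
    token = access_token.strip()
    if not token:
        raise ValueError("access token must not be empty")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.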
Tags
proprietary, multimodal, long-context, enterprise