Model Catalog
One API for all models. Search our library, then deploy and run inference on NVIDIA GPUs in seconds.
Generate up to about 4M tokens on a first deposit of $5.
| Category | Context | Provider | Parameters | Pricing tier | Discount | Input (/1M tokens) | Cached (/1M tokens) | Output (/1M tokens) |
|---|---|---|---|---|---|---|---|---|
| Vision | Per MiniMax M3 limits | MiniMax | MiniMax-M3 | Context <= 512K | 50% off | ~~$0.60~~ $0.30 | ~~$0.12~~ $0.060 | ~~$2.4~~ $1.2 |
| | | | | Context 512K to 1M | 20% off | ~~$0.75~~ $0.60 | ~~$0.15~~ $0.12 | ~~$3~~ $2.4 |
| Vision | Up to 256K tokens / multi-image | Alibaba (Cloud) | Undisclosed | | 20% off | ~~$0.50~~ $0.40 | ~~$0.50~~ $0.40 | ~~$2~~ $1.6 |
| Chat | 128K Tokens | Alibaba (Cloud) | Undisclosed | | 20% off | ~~$3.13~~ $2.5 | ~~$3.13~~ $2.5 | ~~$9.38~~ $7.5 |
| Chat | 300K Tokens | NVIDIA | Undisclosed | | 20% off | ~~$0.0863~~ $0.069 | ~~$0.0863~~ $0.069 | ~~$0.35~~ $0.28 |
| Vision | 256K Tokens | Moonshot AI | 1T (32B active) | | 20% off | ~~$1.12~~ $0.89 | ~~$0.11~~ $0.0894 | ~~$4.64~~ $3.71 |
| Chat | 256K Tokens (up to 1M) | NVIDIA | 120B (12B active) | | 20% off | ~~$0.43~~ $0.35 | ~~$0.43~~ $0.35 | ~~$1.29~~ $1.04 |
| Chat | 393,216 Tokens | DeepSeek | V4 family | | 20% off | ~~$0.17~~ $0.14 | ~~$0.035~~ $0.028 | ~~$0.34~~ $0.28 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | 744B (40B active) | Input <= 32k | 20% off | ~~$0.72~~ $0.57 | ~~$0.14~~ $0.12 | ~~$3.23~~ $2.58 |
| | | | | 32k < Input <= 200k | 20% off | ~~$1.08~~ $0.86 | ~~$0.21~~ $0.17 | ~~$3.94~~ $3.15 |
| Chat | 200K Tokens | MiniMax | MoE (MiniMax-M2.7) | | 20% off | ~~$0.37~~ $0.30 | ~~$0.37~~ $0.30 | ~~$1.5~~ $1.2 |
| Chat | 200K Tokens | MiniMax | 230B (10B active) | | 20% off | ~~$0.38~~ $0.30 | ~~$0.0763~~ $0.061 | ~~$1.52~~ $1.21 |
| Vision | 256K Tokens | Moonshot AI | 1T (32B active) | | 20% off | ~~$0.72~~ $0.57 | ~~$0.0713~~ $0.057 | ~~$3.76~~ $3.01 |
| Code | Up to 1M Tokens | Alibaba (Cloud) | | Input <= 32k | 20% off | ~~$1.25~~ $1 | ~~$0.13~~ $0.10 | ~~$6.25~~ $5 |
| | | | | 32k < Input <= 128k | 20% off | ~~$2.25~~ $1.8 | ~~$0.22~~ $0.18 | ~~$11.25~~ $9 |
| | | | | 128k < Input <= 256k | 20% off | ~~$3.75~~ $3 | ~~$0.37~~ $0.30 | ~~$18.75~~ $15 |
| | | | | 256k < Input <= 1M | 20% off | ~~$7.5~~ $6 | ~~$0.75~~ $0.60 | ~~$75~~ $60 |
| Vision | 262,144 Tokens | Moonshot AI | 1T (32B active) | | 20% off | ~~$1.19~~ $0.95 | ~~$0.24~~ $0.19 | ~~$5~~ $4 |
| Chat | 32K Tokens | MistralAI | 7.3B | | 20% off | ~~$0.11~~ $0.088 | ~~$0.11~~ $0.088 | ~~$0.19~~ $0.15 |
| Code | 262K Tokens | Alibaba (Cloud) | 79.7B (3B active) | Input <= 32k | 20% off | ~~$0.37~~ $0.30 | ~~$0.37~~ $0.30 | ~~$1.88~~ $1.5 |
| | | | | 32k < Input <= 128k | 20% off | ~~$0.63~~ $0.50 | ~~$0.63~~ $0.50 | ~~$3.13~~ $2.5 |
| | | | | 128k < Input <= 256k | 20% off | ~~$1~~ $0.80 | ~~$1~~ $0.80 | ~~$5~~ $4 |
| Chat | 128K | Alibaba (Cloud) | | Input <= 256k | 20% off | ~~$0.50~~ $0.40 | ~~$0.50~~ $0.40 | ~~$1.5~~ $1.2 |
| | | | | 256k < Input <= 1M | 20% off | ~~$1.5~~ $1.2 | ~~$1.5~~ $1.2 | ~~$4.5~~ $3.6 |
| Code | Up to 1M Tokens | Alibaba (Cloud) | | Input <= 32k | 20% off | ~~$0.37~~ $0.30 | ~~$0.0375~~ $0.030 | ~~$1.88~~ $1.5 |
| | | | | 32k < Input <= 128k | 20% off | ~~$0.63~~ $0.50 | ~~$0.0625~~ $0.050 | ~~$3.13~~ $2.5 |
| | | | | 128k < Input <= 256k | 20% off | ~~$1~~ $0.80 | ~~$0.100~~ $0.080 | ~~$5~~ $4 |
| | | | | 256k < Input <= 1M | 20% off | ~~$2~~ $1.6 | ~~$0.20~~ $0.16 | ~~$12~~ $9.6 |
| Vision | Up to 256K Tokens | Alibaba (Cloud) | | Input <= 32k | 20% off | ~~$0.25~~ $0.20 | ~~$0.25~~ $0.20 | ~~$2~~ $1.6 |
| | | | | 32k < Input <= 128k | 20% off | ~~$0.37~~ $0.30 | ~~$0.37~~ $0.30 | ~~$3~~ $2.4 |
| | | | | 128k < Input <= 256k | 20% off | ~~$0.75~~ $0.60 | ~~$0.75~~ $0.60 | ~~$6~~ $4.8 |
| Vision | Up to 256K Tokens | Alibaba (Cloud) | | Input <= 32k | 20% off | ~~$0.0625~~ $0.050 | ~~$0.0062~~ $0.005 | ~~$0.50~~ $0.40 |
| | | | | 32k < Input <= 128k | 20% off | ~~$0.0937~~ $0.075 | ~~$0.0094~~ $0.0075 | ~~$0.75~~ $0.60 |
| | | | | 128k < Input <= 256k | 20% off | ~~$0.15~~ $0.12 | ~~$0.015~~ $0.012 | ~~$1.2~~ $0.96 |
| Vision | Up to 256K tokens / 10 images | Alibaba Cloud | Undisclosed (frontier-scale) | Input <= 256k | 20% off | ~~$0.63~~ $0.50 | ~~$0.0625~~ $0.050 | ~~$3.75~~ $3 |
| | | | | 256k < Input <= 1M | 20% off | ~~$2.5~~ $2 | ~~$0.25~~ $0.20 | ~~$7.5~~ $6 |
| Chat | 128K Tokens | Alibaba (Cloud) | Undisclosed | Input <= 128k | 20% off | ~~$1.63~~ $1.3 | ~~$0.16~~ $0.13 | ~~$9.75~~ $7.8 |
| | | | | 128k < Input <= 256k | 20% off | ~~$2.5~~ $2 | ~~$0.25~~ $0.20 | ~~$15~~ $12 |
| Vision | 256K Tokens | Alibaba (Cloud) | 235B | | 20% off | ~~$0.50~~ $0.40 | ~~$0.50~~ $0.40 | ~~$2~~ $1.6 |
| Vision | 256K Tokens | Alibaba (Cloud) | 9B | | 20% off | ~~$0.080~~ $0.064 | ~~$0.080~~ $0.064 | ~~$0.080~~ $0.064 |
| Vision | 256K Tokens | Alibaba (Cloud) | 30B (3B active) | | 20% off | ~~$0.13~~ $0.10 | ~~$0.13~~ $0.10 | ~~$0.13~~ $0.10 |
| Chat | 128k Tokens | OpenAI | 117B | | 20% off | ~~$0.15~~ $0.12 | ~~$0.015~~ $0.012 | ~~$0.60~~ $0.48 |
| Code | 262K Tokens | Alibaba (Cloud) | 30.5B | Input <= 32k | 20% off | ~~$0.56~~ $0.45 | ~~$0.56~~ $0.45 | ~~$2.81~~ $2.25 |
| | | | | 32k < Input <= 128k | 20% off | ~~$0.94~~ $0.75 | ~~$0.94~~ $0.75 | ~~$4.69~~ $3.75 |
| | | | | 128k < Input <= 256k | 20% off | ~~$1.5~~ $1.2 | ~~$1.5~~ $1.2 | ~~$7.5~~ $6 |
| | | | | 256k < Input <= 1M | 20% off | ~~$3~~ $2.4 | ~~$3~~ $2.4 | ~~$18~~ $14.4 |
| Chat | 8192 Tokens | Microsoft | 7B | | 20% off | ~~$0.21~~ $0.17 | ~~$0.21~~ $0.17 | ~~$0.25~~ $0.20 |
| Chat | 128k Tokens | DeepSeek | 70B | | 20% off | ~~$0.70~~ $0.56 | ~~$0.70~~ $0.56 | ~~$0.80~~ $0.64 |
| OCR | 16K Tokens | Tencent Hunyuan | 1.0B | | | $0.17 | | $0.28 |
| Chat | 128K Tokens | Meta | 70B | | 20% off | ~~$0.12~~ $0.096 | ~~$0.12~~ $0.096 | ~~$0.38~~ $0.30 |
| Chat | 128k Tokens | NVIDIA | 31.6B Total / 3.2B Active | | 20% off | ~~$0.050~~ $0.040 | ~~$0.050~~ $0.040 | ~~$0.20~~ $0.16 |
| Chat | 256K Tokens | Alibaba Cloud | 80B (3.9B active) | | 20% off | ~~$0.19~~ $0.15 | ~~$0.19~~ $0.15 | ~~$1.5~~ $1.2 |
| Code | 262K Tokens | Alibaba Cloud | 480B (35B active) | Input <= 32k | 20% off | ~~$1.88~~ $1.5 | ~~$1.88~~ $1.5 | ~~$9.38~~ $7.5 |
| | | | | 32k < Input <= 128k | 20% off | ~~$3.38~~ $2.7 | ~~$3.38~~ $2.7 | ~~$16.88~~ $13.5 |
| | | | | 128k < Input <= 256k | 20% off | ~~$5.63~~ $4.5 | ~~$5.63~~ $4.5 | ~~$28.13~~ $22.5 |
| | | | | 256k < Input <= 1M | 20% off | ~~$11.25~~ $9 | ~~$11.25~~ $9 | ~~$112.5~~ $90 |
| Vision | 256K Tokens (up to 1M) | Alibaba Cloud | 235B (22B active) | | 20% off | ~~$0.50~~ $0.40 | ~~$0.50~~ $0.40 | ~~$5~~ $4 |
| Chat | 128K Tokens | Alibaba (Cloud) | 235B (22B active) | Input <= 32k | 20% off | ~~$1.5~~ $1.2 | ~~$1.5~~ $1.2 | ~~$7.5~~ $6 |
| | | | | 32k < Input <= 128k | 20% off | ~~$3~~ $2.4 | ~~$3~~ $2.4 | ~~$15~~ $12 |
| | | | | 128k < Input <= 256k | 20% off | ~~$3.75~~ $3 | ~~$3.75~~ $3 | ~~$18.75~~ $15 |
| Chat | 128K Tokens | DeepSeek | 671B (37B active) | | 20% off | ~~$0.72~~ $0.57 | ~~$0.72~~ $0.57 | ~~$2.87~~ $2.29 |
| Chat | 160K Tokens | DeepSeek | 685B (37B active) | | 20% off | ~~$0.36~~ $0.29 | ~~$0.0363~~ $0.029 | ~~$0.54~~ $0.43 |
| Chat | 128K Tokens | DeepSeek | 671B (37B active) | | 20% off | ~~$0.36~~ $0.29 | ~~$0.0713~~ $0.057 | ~~$1.43~~ $1.15 |
| Chat | 393,216 Tokens | DeepSeek | V4 family | | 20% off | ~~$2.06~~ $1.65 | ~~$0.17~~ $0.14 | ~~$4.13~~ $3.3 |
| Chat | 256K Tokens | Moonshot AI | 1T (32B active) | | 20% off | ~~$0.72~~ $0.57 | ~~$0.14~~ $0.12 | ~~$2.87~~ $2.29 |
| Vision | 1M Tokens (API) / 256K Tokens (self-hosted base) | Alibaba (Cloud) | 35B (3B active), hosted | | 20% off | ~~$0.13~~ $0.10 | ~~$0.0125~~ $0.010 | ~~$0.50~~ $0.40 |
| Vision | 1M Tokens (API) / 262K Tokens (self-hosted base) | Alibaba (Cloud) | 397B (17B active), hosted | Input <= 256k | 20% off | ~~$0.50~~ $0.40 | ~~$0.050~~ $0.040 | ~~$3~~ $2.4 |
| | | | | 256k < Input <= 1M | 20% off | ~~$0.63~~ $0.50 | ~~$0.0625~~ $0.050 | ~~$3.75~~ $3 |
| Vision | 256K Tokens (up to 1M) | Alibaba (Cloud) | 27B (dense) | | 20% off | ~~$0.37~~ $0.30 | ~~$0.37~~ $0.30 | ~~$3~~ $2.4 |
| Vision | 256K Tokens (up to 1M) | Alibaba (Cloud) | 35B (3B active) | | 20% off | ~~$0.31~~ $0.25 | ~~$0.31~~ $0.25 | ~~$2.5~~ $2 |
| Vision | 256K Tokens (up to 1M) | Alibaba (Cloud) | 122B (10B active) | | 20% off | ~~$0.50~~ $0.40 | ~~$0.50~~ $0.40 | ~~$4~~ $3.2 |
| Vision | 256K Tokens (up to 1M) | Alibaba (Cloud) | 27B | | 20% off | ~~$0.75~~ $0.60 | ~~$0.75~~ $0.60 | ~~$4.5~~ $3.6 |
| Vision | 256K Tokens (up to 1M) | Alibaba (Cloud) | 35B (A3B active) | | 20% off | ~~$0.31~~ $0.25 | ~~$0.31~~ $0.25 | ~~$1.86~~ $1.49 |
| Vision | 256K Tokens (up to 1M via Qwen3.5-Plus API) | Alibaba Cloud | 397B (17B active) | | 20% off | ~~$0.75~~ $0.60 | ~~$0.75~~ $0.60 | ~~$4.5~~ $3.6 |
| Chat | 205K Tokens | Z.ai (Zhipu AI) | 355B (32B active) | | 20% off | ~~$0.54~~ $0.43 | ~~$0.11~~ $0.086 | ~~$2.51~~ $2.01 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | Flagship MoE | | 20% off | ~~$1.75~~ $1.4 | ~~$0.33~~ $0.26 | ~~$5.5~~ $4.4 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | Foundation model | | 20% off | ~~$1.5~~ $1.2 | ~~$0.30~~ $0.24 | ~~$5~~ $4 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | 106B (12B active) | | 20% off | ~~$0.25~~ $0.20 | ~~$0.0375~~ $0.030 | ~~$1.38~~ $1.1 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | 30B | | | -- | -- | -- |
| Chat | 256K Tokens | Moonshot AI | 1T (32B active) | | 20% off | ~~$0.72~~ $0.57 | ~~$0.14~~ $0.12 | ~~$2.87~~ $2.29 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$0.75~~ $0.60 | ~~$0.14~~ $0.11 | ~~$2.75~~ $2.2 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$0.75~~ $0.60 | ~~$0.14~~ $0.11 | ~~$2.75~~ $2.2 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$2.75~~ $2.2 | ~~$0.56~~ $0.45 | ~~$11.13~~ $8.9 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$1.38~~ $1.1 | ~~$0.27~~ $0.22 | ~~$5.63~~ $4.5 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$0.0875~~ $0.070 | ~~$0.0125~~ $0.010 | ~~$0.50~~ $0.40 |
| Chat | 200K Tokens | Z.ai (Zhipu AI) | | | | -- | -- | -- |
| Chat | 128K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$0.13~~ $0.10 | | ~~$0.13~~ $0.10 |
| Vision | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$1.5~~ $1.2 | ~~$0.30~~ $0.24 | ~~$5~~ $4 |
| Vision | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$0.37~~ $0.30 | ~~$0.0625~~ $0.050 | ~~$1.13~~ $0.90 |
| Vision | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$0.050~~ $0.040 | ~~$0.005~~ $0.004 | ~~$0.50~~ $0.40 |
| Vision | 200K Tokens | Z.ai (Zhipu AI) | | | 20% off | ~~$0.75~~ $0.60 | ~~$0.14~~ $0.11 | ~~$2.25~~ $1.8 |
| Vision | 200K Tokens | Z.ai (Zhipu AI) | | | | -- | -- | -- |
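
To make the per-1M-token, tiered pricing concrete, here is a minimal Python sketch that estimates the cost of one request. It uses the discounted prices from the two-tier Z.ai (Zhipu AI) 744B (40B active) row above. The billing rule is an assumption, not something the catalog states: the whole request is charged at the tier selected by its input length, cached input tokens are charged at the cached rate instead of the input rate, and "32k" and "200k" are read as 32,000 and 200,000 tokens. The names `Tier`, `TIERS`, and `estimate_cost` are hypothetical and not part of any Qubrid API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_input_tokens: int  # inclusive upper bound on input length for this tier
    input_price: float     # USD per 1M uncached input tokens
    cached_price: float    # USD per 1M cached input tokens
    output_price: float    # USD per 1M output tokens


# Discounted prices copied from the two-tier 744B (40B active) row.
# Assumption: "32k" = 32,000 tokens and "200k" = 200,000 tokens.
TIERS = [
    Tier(32_000, 0.57, 0.12, 2.58),
    Tier(200_000, 0.86, 0.17, 3.15),
]


def estimate_cost(input_tokens, output_tokens, cached_tokens=0, tiers=TIERS):
    """Estimate the USD cost of one request under the assumed billing rule."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    # Pick the first tier whose bound covers the request's input length.
    for tier in tiers:
        if input_tokens <= tier.max_input_tokens:
            uncached = input_tokens - cached_tokens
            total = (
                uncached * tier.input_price
                + cached_tokens * tier.cached_price
                + output_tokens * tier.output_price
            )
            return total / 1_000_000  # prices are quoted per 1M tokens
    raise ValueError("input_tokens exceeds the largest pricing tier")


# 10,000 input tokens fall in the first tier.
print(f"{estimate_cost(10_000, 2_000):.5f}")  # 0.01086
# 100,000 input tokens (half of them cached) fall in the second tier.
print(f"{estimate_cost(100_000, 2_000, cached_tokens=50_000):.5f}")  # 0.05780
```

The same function works for any other tiered row if you swap in that row's bounds and prices. Confirm the actual billing rule with the provider before relying on these estimates.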
Sign up to get $1.00 free API credit on first deposit of $5. Test out the latest models now.
Access enterprise-grade open-source AI models including Llama 3, DeepSeek, Qwen, and more via our high-performance serverless API. Experience low-latency inference on the latest NVIDIA GPUs optimized for production workloads.
"Qubrid scaled our personalized outreach from hundreds to tens of thousands of prospects. AI-driven research and content generation doubled our campaign velocity without sacrificing quality."
Demand Generation Team
Marketing & Sales Operations
