APIs for running AI and machine learning model inference.
50 offerings on this page with service context, pricing, regions, and links.
| Service | Offering | Pricing model | Starting price | Regions | Features |
|---|---|---|---|---|---|
| Replicate | Replicate Open-Source AI Model API | Usage-based | $0.0002 per second CPU (varies by model) | 2 | 10,000+ Models: 10,000+; Custom Model Deployment: true; +3 more |
| Resemble AI | Resemble AI Voice API | Usage-based | $0.006 per second of audio | 1 | Instant Voice Cloning: 3 seconds of audio; Real-Time Synthesis: < 200ms latency; +3 more |
| RunPod | RunPod Serverless AI Inference | Usage-based | $0.0002 per second (RTX 3090) | 4 | Scale to Zero: $0 when idle; Custom Containers: Any Docker image; +3 more |
| Runway Research | Runway Research API | Usage-based | $0.050 per generation credit | 1 | Gen-3 Alpha: Text-to-video API; Image-to-Video: Animate reference images; +3 more |
| Salesforce | Salesforce Einstein AI | Subscription | $50 per user/month (Einstein for Sales) | 4 | Einstein Copilot: true; Predictive Scoring: true; +2 more |
| SambaNova Systems | SambaNova Cloud AI Inference | Usage-based | $0.0005 per 1K tokens | 1 | World-Record Speed: 2,300+ tokens/second; OpenAI Compatible: Drop-in replacement; +3 more |
| SAP Business Technology Platform | SAP AI Core | Usage-based | $0.020 per capability unit hour | 3 | Generative AI Hub: Multi-LLM gateway; Custom Model Training: Kubernetes-based; +2 more |
| Scale AI | Scale Generative AI Platform | Usage-based | Free | 1 | RLHF at Scale: Human feedback pipelines; Model Evaluation: Benchmarks & red teaming; +2 more |
| Snorkel AI | Snorkel Flow Model Training | Subscription | Free | 1 | Programmatic Labeling: Labeling functions; LLM-Assisted Labeling: Foundation model bootstrap; +2 more |
| Snowflake | Snowflake Cortex AI | Usage-based | $0.040 per 1M tokens (Llama 3 8B) | 4 | No Data Movement: AI runs in Snowflake; LLM Functions: Complete, Summarize, Translate; +2 more |
| Sora | Sora Video Generation API | Subscription | $20 per month (ChatGPT Plus) | 1 | Text-to-Video: Up to 1080p, 20 seconds; Image-to-Video: Animate still images; +3 more |
| Stability AI | Stability AI Inference API | Usage-based | $0.065 per image (SD3.5) | 1 | Stable Diffusion 3.5: State-of-the-art image generation; Image Editing: Inpaint, outpaint, transform; +3 more |
| Stability AI API | Stability AI API - Stable Diffusion | Usage-based | $0.003 per image (SDXL 512px) | 1 | Text-to-Image: SDXL & SD3 models; Image-to-Image: Guided transformation; +3 more |
| StepFun | Step 3.5 Flash | Pay-as-you-go | $0.100 per 1M input tokens | 0 | Context Window: 262144 tokens; Input Modalities: text |
| StepFun | Step 3.5 Flash | Free | — | 0 | Context Window: 256000 tokens; Input Modalities: text |
| Tabnine | Tabnine AI Code Inference | Freemium | Free | 1 | Private Deployment: Self-hosted or VPC; Custom Model Training: Codebase-specific models; +3 more |
| Tencent | Hunyuan A13B Instruct | Pay-as-you-go | $0.140 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text |
| Tencent Cloud AI | Tencent Cloud AI — NLP, Vision & LLM APIs | Usage-based | $0.0008 per 1000 tokens (Hunyuan Lite) | 5 | Hunyuan LLM: Bilingual Chinese-English LLM; Computer Vision: Face, OCR, image recognition; +3 more |
| Tenstorrent | Tenstorrent AI Inference Cloud | Usage-based | $0.500 per hour (Wormhole card) | 1 | LLM Inference: Cost-efficient token generation; Open Software Stack: TT-Buda inference framework; +1 more |
| Together AI | Together AI — Open-Source Model Inference Platform | Usage-based | $0.0001 per 1M tokens (Llama 3.2 8B) | 3 | Model Catalog: 100+ open-source models; Serverless Inference: Pay-per-token pricing; +3 more |
| Together AI | DeepSeek V3.1 on Together AI | Pay-as-you-go | $0.600 per 1M input tokens | 0 | Context Window: 128000 tokens; Input Modalities: text |
| Together AI | GLM-5 on Together AI | Pay-as-you-go | $1 per 1M input tokens | 0 | Context Window: 202752 tokens; Input Modalities: text |
| Together AI | GPT-OSS 120B on Together AI | Pay-as-you-go | $0.150 per 1M input tokens | 0 | Context Window: 128000 tokens; Input Modalities: text |
| Together AI | GPT-OSS 20B on Together AI | Pay-as-you-go | $0.050 per 1M input tokens | 0 | Context Window: 128000 tokens; Input Modalities: text |
| Together AI | Kimi K2.5 on Together AI | Pay-as-you-go | $0.500 per 1M input tokens | 0 | Context Window: 256000 tokens; Input Modalities: text, image |
| Together AI | Llama 4 Maverick on Together AI | Pay-as-you-go | $0.270 per 1M input tokens | 0 | Context Window: 524288 tokens; Input Modalities: text, image |
| Together AI | Qwen3.5 397B on Together AI | Pay-as-you-go | $0.600 per 1M input tokens | 0 | Context Window: 262144 tokens; Input Modalities: text |
| Together AI | Qwen3 Coder 480B on Together AI | Pay-as-you-go | $2 per 1M input tokens | 0 | Context Window: 256000 tokens; Input Modalities: text |
| TSMC | TSMC AI Chip Fabrication | Custom | $20,000 per wafer (N5 process) | 4 | Leading-Edge Processes: N3 and N2 nodes; CoWoS Packaging: HBM memory integration; +1 more |
| Unsloth | Unsloth Enterprise | Subscription | Free | 1 | 2-5x Faster Fine-tuning: Custom CUDA kernels; 80% Memory Reduction: Fit larger models on smaller GPUs; +2 more |
| Unsloth | Unsloth AI Inference Optimization | Freemium | Free | 1 | Triton Kernels: Custom CUDA rewrite; Memory Reduction: Up to 70% less VRAM; +3 more |
| Upstage | Upstage Solar API — Enterprise LLM & Document AI | Usage-based | $0.0001 per 1K tokens (Solar Mini) | 1 | Solar LLM: 10.7B parameter, top leaderboard; Document Parse API: Structured data extraction; +3 more |
| Vercel | Vercel AI SDK & Inference | Freemium | Free | 1 | Unified Streaming API: 20+ AI providers; AI Gateway: Multi-provider routing; +3 more |
| vLLM | vLLM — High-Throughput LLM Inference Engine | Open source | Free | 1 | PagedAttention: Near 100% KV cache utilization; Throughput: 2-24x vs HuggingFace Transformers; +3 more |
| WhyLabs | WhyLabs LLM Monitoring | Freemium | Free | 1 | Prompt Injection: Real-time detection; Toxicity Detection: Content safety scoring; +3 more |
| Writer | Writer — Full-Stack Enterprise Generative AI | Subscription | $18 per user per month | 1 | Palmyra LLM: Enterprise-tuned LLM family; Knowledge Graph: Structured enterprise knowledge; +3 more |
| Writer | Palmyra X5 | Pay-as-you-go | $0.600 per 1M input tokens | 0 | Context Window: 1040000 tokens; Input Modalities: text |
| Writesonic | Writesonic — AI Writing & SEO Platform | Freemium | Free | 1 | Chatsonic: ChatGPT alternative with web search; Content Templates: 100+ templates; +3 more |
| xAI | xAI API (Grok Models) | Usage-based | $0.0002 per 1K input tokens (Grok-2-Mini) | 1 | Real-Time Data: X (Twitter) integration; Context Window: 128K-131K tokens; +3 more |
| xAI | Grok (xAI) | Usage-based | $0.000 per input token (grok-4-1-fast) | 1 | Real-time X Data: true; Context Window: 131K tokens; +3 more |
| xAI | Grok 3 | Pay-as-you-go | $3 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text |
| xAI | Grok 3 Beta | Pay-as-you-go | $3 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text |
| xAI | Grok 3 Mini | Pay-as-you-go | $0.300 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text |
| xAI | Grok 3 Mini Beta | Pay-as-you-go | $0.300 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text |
| xAI | Grok 4 | Pay-as-you-go | $3 per 1M input tokens | 0 | Context Window: 256000 tokens; Input Modalities: image, text, file |
| xAI | Grok 4.1 Fast | Pay-as-you-go | $0.200 per 1M input tokens | 0 | Context Window: 2000000 tokens; Input Modalities: text, image, file |
| xAI | Grok 4.20 | Pay-as-you-go | $2 per 1M input tokens | 0 | Context Window: 2000000 tokens; Input Modalities: text, image |
| xAI | Grok 4.20 Beta | Pay-as-you-go | $3 per 1M input tokens | 0 | Context Window: 2000000 tokens; Input Modalities: text, image |
| xAI | Grok 4.20 Multi-Agent | Pay-as-you-go | $2 per 1M input tokens | 0 | Context Window: 2000000 tokens; Input Modalities: text, image, file |
| xAI | Grok 4 Fast | Pay-as-you-go | $0.200 per 1M input tokens | 0 | Context Window: 2000000 tokens; Input Modalities: text, image |
Showing 451–500 of 515 offerings
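
The starting prices in the table use different token units (per 1K, per 1000, and per 1M tokens), so they are easier to compare once normalized to one unit. The sketch below is one possible way to do that in Python. The function names `usd_per_million` and `estimate_cost` are hypothetical helpers and not part of any provider's API. The three sample prices are copied from the table. Listed prices are starting prices (input tokens only where the table says so), so actual bills may differ.

```python
from decimal import Decimal


def usd_per_million(price_usd: str, unit_tokens: int) -> Decimal:
    """Convert a listed price to USD per 1M tokens.

    price_usd   -- price as printed in the table, e.g. "0.0005"
    unit_tokens -- number of tokens that price covers (1_000 or 1_000_000)
    """
    return Decimal(price_usd) * 1_000_000 / unit_tokens


def estimate_cost(price_usd: str, unit_tokens: int, tokens: int) -> Decimal:
    """Estimate the USD cost of `tokens` tokens at the listed price."""
    return usd_per_million(price_usd, unit_tokens) * tokens / 1_000_000


# (listed price, tokens per pricing unit), copied from the table above.
listed = {
    "SambaNova Cloud AI Inference": ("0.0005", 1_000),
    "GPT-OSS 20B on Together AI": ("0.050", 1_000_000),
    "Grok 3 Mini": ("0.300", 1_000_000),
}

for name, (price, unit_tokens) in listed.items():
    print(f"{name}: ${usd_per_million(price, unit_tokens)} per 1M tokens")
```

`Decimal` is used instead of `float` so that small per-token prices are not distorted by binary rounding. For example, `estimate_cost("0.300", 1_000_000, 2_000_000)` gives the cost of two million input tokens at the Grok 3 Mini starting price.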