losclouds
Compare · AI Inference

What's on offer.

APIs for running AI and machine learning model inference.

Offerings

50 offerings on this page with service context, pricing, regions, and links.

Replicate
Replicate Open-Source AI Model API
Usage-based · $0.0002 per second CPU (varies by model) · 2 regions
10,000+ Models: 10,000+; Custom Model Deployment: true; +3 more

Resemble AI
Resemble AI Voice API
Usage-based · $0.006 per second of audio · 1 region
Instant Voice Cloning: 3 seconds of audio; Real-Time Synthesis: < 200ms latency; +3 more

RunPod
RunPod Serverless AI Inference
Usage-based · $0.0002 per second (RTX 3090) · 4 regions
Scale to Zero: $0 when idle; Custom Containers: Any Docker image; +3 more

Runway Research
Runway Research API
Usage-based · $0.050 per generation credit · 1 region
Gen-3 Alpha: Text-to-video API; Image-to-Video: Animate reference images; +3 more

Salesforce
Salesforce Einstein AI
Subscription · $50 per user/month (Einstein for Sales) · 4 regions
Einstein Copilot: true; Predictive Scoring: true; +2 more

SambaNova Systems
SambaNova Cloud AI Inference
Usage-based · $0.0005 per 1K tokens · 1 region
World-Record Speed: 2,300+ tokens/second; OpenAI Compatible: Drop-in replacement; +3 more

Scale AI
Scale Generative AI Platform
Usage-based · Free · 1 region
RLHF at Scale: Human feedback pipelines; Model Evaluation: Benchmarks & red teaming; +2 more

Snorkel AI
Snorkel Flow Model Training
Subscription · Free · 1 region
Programmatic Labeling: Labeling functions; LLM-Assisted Labeling: Foundation model bootstrap; +2 more

Snowflake
Snowflake Cortex AI
Usage-based · $0.040 per 1M tokens (Llama 3 8B) · 4 regions
No Data Movement: AI runs in Snowflake; LLM Functions: Complete, Summarize, Translate; +2 more

Sora
Sora Video Generation API
Subscription · $20 per month (ChatGPT Plus) · 1 region
Text-to-Video: Up to 1080p, 20 seconds; Image-to-Video: Animate still images; +3 more

Stability AI
Stability AI Inference API
Usage-based · $0.065 per image (SD3.5) · 1 region
Stable Diffusion 3.5: State-of-the-art image generation; Image Editing: Inpaint, outpaint, transform; +3 more

Stability AI API
Stability AI API - Stable Diffusion
Usage-based · $0.003 per image (SDXL 512px) · 1 region
Text-to-Image: SDXL & SD3 models; Image-to-Image: Guided transformation; +3 more

StepFun
Step 3.5 Flash
Pay-as-you-go · $0.100 per 1M input tokens · 0 regions
Context Window: 262,144 tokens; Input Modalities: text

StepFun
Step 3.5 Flash
Free · Price pending · 0 regions
Context Window: 256,000 tokens; Input Modalities: text

Tabnine
Tabnine AI Code Inference
Freemium · Free · 1 region
Private Deployment: Self-hosted or VPC; Custom Model Training: Codebase-specific models; +3 more

Tencent
Hunyuan A13B Instruct
Pay-as-you-go · $0.140 per 1M input tokens · 0 regions
Context Window: 131,072 tokens; Input Modalities: text

Tencent Cloud AI
Tencent Cloud AI: NLP, Vision & LLM APIs
Usage-based · $0.0008 per 1000 tokens (Hunyuan Lite) · 5 regions
Hunyuan LLM: Bilingual Chinese-English LLM; Computer Vision: Face, OCR, image recognition; +3 more

Tenstorrent
Tenstorrent AI Inference Cloud
Usage-based · $0.500 per hour (Wormhole card) · 1 region
LLM Inference: Cost-efficient token generation; Open Software Stack: TT-Buda inference framework; +1 more

Together AI
Together AI: Open-Source Model Inference Platform
Usage-based · $0.0001 per 1M tokens (Llama 3.2 8B) · 3 regions
Model Catalog: 100+ open-source models; Serverless Inference: Pay-per-token pricing; +3 more

Together AI
DeepSeek V3.1 on Together AI
Pay-as-you-go · $0.600 per 1M input tokens · 0 regions
Context Window: 128,000 tokens; Input Modalities: text

Together AI
GLM-5 on Together AI
Pay-as-you-go · $1 per 1M input tokens · 0 regions
Context Window: 202,752 tokens; Input Modalities: text

Together AI
GPT-OSS 120B on Together AI
Pay-as-you-go · $0.150 per 1M input tokens · 0 regions
Context Window: 128,000 tokens; Input Modalities: text

Together AI
GPT-OSS 20B on Together AI
Pay-as-you-go · $0.050 per 1M input tokens · 0 regions
Context Window: 128,000 tokens; Input Modalities: text

Together AI
Kimi K2.5 on Together AI
Pay-as-you-go · $0.500 per 1M input tokens · 0 regions
Context Window: 256,000 tokens; Input Modalities: text, image

Together AI
Llama 4 Maverick on Together AI
Pay-as-you-go · $0.270 per 1M input tokens · 0 regions
Context Window: 524,288 tokens; Input Modalities: text, image

Together AI
Qwen3.5 397B on Together AI
Pay-as-you-go · $0.600 per 1M input tokens · 0 regions
Context Window: 262,144 tokens; Input Modalities: text

Together AI
Qwen3 Coder 480B on Together AI
Pay-as-you-go · $2 per 1M input tokens · 0 regions
Context Window: 256,000 tokens; Input Modalities: text

TSMC
TSMC AI Chip Fabrication
Custom · $20,000 per wafer (N5 process) · 4 regions
Leading-Edge Processes: N3 and N2 nodes; CoWoS Packaging: HBM memory integration; +1 more

Unsloth
Unsloth Enterprise
Subscription · Free · 1 region
2-5x Faster Fine-tuning: Custom CUDA kernels; 80% Memory Reduction: Fit larger models on smaller GPUs; +2 more

Unsloth
Unsloth AI Inference Optimization
Freemium · Free · 1 region
Triton Kernels: Custom CUDA rewrite; Memory Reduction: Up to 70% less VRAM; +3 more

Upstage
Upstage Solar API: Enterprise LLM & Document AI
Usage-based · $0.0001 per 1K tokens (Solar Mini) · 1 region
Solar LLM: 10.7B parameter, top leaderboard; Document Parse API: Structured data extraction; +3 more

Vercel
Vercel AI SDK & Inference
Freemium · Free · 1 region
Unified Streaming API: 20+ AI providers; AI Gateway: Multi-provider routing; +3 more

vLLM
vLLM: High-Throughput LLM Inference Engine
Open source · Free · 1 region
PagedAttention: Near 100% KV cache utilization; Throughput: 2-24x vs HuggingFace Transformers; +3 more

WhyLabs
WhyLabs LLM Monitoring
Freemium · Free · 1 region
Prompt Injection: Real-time detection; Toxicity Detection: Content safety scoring; +3 more

Writer
Writer: Full-Stack Enterprise Generative AI
Subscription · $18 per user per month · 1 region
Palmyra LLM: Enterprise-tuned LLM family; Knowledge Graph: Structured enterprise knowledge; +3 more

Writer
Palmyra X5
Pay-as-you-go · $0.600 per 1M input tokens · 0 regions
Context Window: 1,040,000 tokens; Input Modalities: text

Writesonic
Writesonic: AI Writing & SEO Platform
Freemium · Free · 1 region
Chatsonic: ChatGPT alternative with web search; Content Templates: 100+ templates; +3 more

xAI
xAI API (Grok Models)
Usage-based · $0.0002 per 1K input tokens (Grok-2-Mini) · 1 region
Real-Time Data: X (Twitter) integration; Context Window: 128K-131K tokens; +3 more

xAI
Grok (xAI)
Usage-based · $0.000 per input token (grok-4-1-fast) · 1 region
Real-time X Data: true; Context Window: 131K tokens; +3 more

xAI
Grok 3
Pay-as-you-go · $3 per 1M input tokens · 0 regions
Context Window: 131,072 tokens; Input Modalities: text

xAI
Grok 3 Beta
Pay-as-you-go · $3 per 1M input tokens · 0 regions
Context Window: 131,072 tokens; Input Modalities: text

xAI
Grok 3 Mini
Pay-as-you-go · $0.300 per 1M input tokens · 0 regions
Context Window: 131,072 tokens; Input Modalities: text

xAI
Grok 3 Mini Beta
Pay-as-you-go · $0.300 per 1M input tokens · 0 regions
Context Window: 131,072 tokens; Input Modalities: text

xAI
Grok 4
Pay-as-you-go · $3 per 1M input tokens · 0 regions
Context Window: 256,000 tokens; Input Modalities: image, text, file

xAI
Grok 4.1 Fast
Pay-as-you-go · $0.200 per 1M input tokens · 0 regions
Context Window: 2,000,000 tokens; Input Modalities: text, image, file

xAI
Grok 4.20
Pay-as-you-go · $2 per 1M input tokens · 0 regions
Context Window: 2,000,000 tokens; Input Modalities: text, image

xAI
Grok 4.20 Beta
Pay-as-you-go · $3 per 1M input tokens · 0 regions
Context Window: 2,000,000 tokens; Input Modalities: text, image

xAI
Grok 4.20 Multi-Agent
Pay-as-you-go · $2 per 1M input tokens · 0 regions
Context Window: 2,000,000 tokens; Input Modalities: text, image, file

xAI
Grok 4 Fast
Pay-as-you-go · $0.200 per 1M input tokens · 0 regions
Context Window: 2,000,000 tokens; Input Modalities: text, image

Showing 451–500 of 515 offerings
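
Some entries on this page quote token prices per 1K tokens and others per 1M tokens, which makes side-by-side comparison awkward. The sketch below is not part of the source listing; it normalizes a few of the listed prices to USD per 1M tokens. The helper name per_million and the short labels are hypothetical, and the figures are copied from the entries on this page. Some entries quote input-token prices only, while others do not say whether the rate covers input, output, or both, so treat the result as a rough comparison.

```python
from decimal import Decimal

# Hypothetical helper: convert a price quoted per `unit_tokens` tokens
# into a price per 1M tokens. Decimal avoids binary floating-point rounding.
def per_million(price: str, unit_tokens: int) -> Decimal:
    return Decimal(price) * Decimal(1_000_000) / Decimal(unit_tokens)

# Prices copied from the listing; the unit size follows each entry's wording.
# The labels are shortened for illustration.
quotes = {
    "SambaNova Cloud AI Inference": ("0.0005", 1_000),
    "Upstage Solar Mini": ("0.0001", 1_000),
    "Tencent Hunyuan Lite": ("0.0008", 1_000),
    "Snowflake Cortex (Llama 3 8B)": ("0.040", 1_000_000),
    "Grok 3 Mini (input only)": ("0.300", 1_000_000),
}

normalized = {name: per_million(p, u) for name, (p, u) in quotes.items()}

# Print cheapest first.
for name, usd in sorted(normalized.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${usd:.3f} per 1M tokens")
```

Prices are passed as strings so that Decimal reads the quoted figure exactly rather than a binary float approximation of it.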