losclouds
Compare · AI Inference

What's on offer.

APIs for running AI and machine learning model inference.

Offerings

50 offerings on this page with service context, pricing, regions, and links.

Google Gemini

Offering

Gemma 3 4B

Offering details
Pay-as-you-go · $0.040 per 1M input tokens · 0 regions

Context Window: 131072 tokens; Input Modalities: text, image

Google Gemini

Offering

Gemma 4 26B A4B

Offering details
Pay-as-you-go · $0.130 per 1M input tokens · 0 regions

Context Window: 262144 tokens; Input Modalities: image, text, video

Google Gemini

Offering

Gemma 4 31B

Offering details
Pay-as-you-go · $0.140 per 1M input tokens · 0 regions

Context Window: 262144 tokens; Input Modalities: image, text, video

Google Workspace

Offering

Gemini for Google Workspace

Offering details
Subscription · $20 per user/month (Gemini Business add-on) · 1 region

Gmail AI: Draft, summarize, reply; Docs AI: Write, rewrite, proofread; +2 more

Grammarly

Offering

Grammarly AI Writing Assistant

Offering details
Freemium · Free · 1 region

Real-time Suggestions: true; Generative AI Writing: true; +3 more

Grammarly

Offering

Grammarly Business

Offering details
Subscription · $15 per member/month (billed annually, minimum 3 seats) · 1 region

Style Guide: Company style guide enforcement; Brand Tone: Custom tone settings; +2 more

Grammarly

Offering

Grammarly GO (AI Writing Features)

Offering details
Subscription · $12 per month (Pro plan, billed annually) · 1 region

Generation: Draft emails and documents; Rewriting: Full-paragraph rewrites; +2 more

Graphcore

Offering

Graphcore Poplar SDK

Offering details
Free · Free · 1 region

Compiler: Poplar Graph Compiler; PyTorch Integration: PopTorch; +2 more

Groq

Offering

Groq Compound

Offering details
Custom · Price pending · 0 regions

Context Window: 131072 tokens; Input Modalities: text

Groq

Offering

GPT-OSS 120B on Groq

Offering details
Pay-as-you-go · $0.150 per 1M input tokens · 0 regions

Context Window: 131072 tokens; Input Modalities: text

Groq

Offering

GPT-OSS 20B on Groq

Offering details
Pay-as-you-go · $0.075 per 1M input tokens · 0 regions

Context Window: 131072 tokens; Input Modalities: text

Groq

Offering

Groq LPU AI Inference API

Offering details
Usage-based · Free · 2 regions

Inference Speed: 500+ tokens/second; Latency: <1ms per token; +3 more

Groq

Offering

Groq LLaMA Inference

Offering details
Usage-based · $0.0001 per 1K input tokens (LLaMA 3 8B) · 1 region

Inference Speed: 750+ tokens/sec; Models Available: LLaMA 3, Mixtral, Gemma, Whisper; +3 more

Groq

Offering

Llama 3.1 8B Instant on Groq

Offering details
Pay-as-you-go · $0.050 per 1M input tokens · 0 regions

Context Window: 131072 tokens; Input Modalities: text

Groq

Offering

Llama 3.3 70B Versatile on Groq

Offering details
Pay-as-you-go · $0.590 per 1M input tokens · 0 regions

Context Window: 131072 tokens; Input Modalities: text

Groq

Offering

Llama 4 Scout on Groq

Offering details
Pay-as-you-go · $0.110 per 1M input tokens · 0 regions

Context Window: 131072 tokens; Input Modalities: text, image

Groq

Offering

Groq Mixtral Inference

Offering details
Usage-based · $0.0002 per 1K input tokens · 1 region

Architecture: Mixture of Experts 8x7B; Speed: 500+ tokens/sec; +3 more

Groq

Offering

Qwen3 32B on Groq

Offering details
Pay-as-you-go · $0.290 per 1M input tokens · 0 regions

Context Window: 131072 tokens; Input Modalities: text

H2O.ai

Offering

H2O.ai Model Deployment (Driverless AI + MLOps)

Offering details
Subscription · Free · 3 regions

Scoring: REST API + batch; Champion-Challenger: A/B traffic splitting; +2 more

Hailuo AI

Offering

Hailuo AI MiniMax API

Offering details
Usage-based · $0.0002 per 1K input tokens · 1 region

MiniMax-Text-01: 456B parameter model; Multimodal: Text, image, audio, video; +3 more

Helicone

Offering

Helicone - AI Gateway and Observability

Offering details
Freemium · Free · 1 region

Provider Support: OpenAI, Anthropic, Azure, Gemini, 30+; Request Logging: Full request/response capture; +3 more

Hugging Face

Offering

Hugging Face Inference Endpoints

Offering details
Usage-based · $0.032 per hour (CPU) · 4 regions

One-Click Deployment: true; Auto-scaling: true; +3 more

Humanloop

Offering

Humanloop LLM Development Platform

Offering details
Freemium · Free · 1 region

Prompt Management: true; Evaluation Framework: true; +3 more

HyperWrite

Offering

HyperWrite AI Models

Offering details
Subscription · $19.99 per month (Premium) · 1 region

Multi-Model Access: GPT-4, Claude, proprietary; Intelligent Routing: Auto model selection; +3 more

IBM Cloud

Offering

IBM Cloud - Watson Machine Learning

Offering details
Usage-based · Free · 5 regions

Model Serving: Online + batch scoring; AutoAI: Automated ML pipeline; +3 more

IBM Research

Offering

Granite 4.0 Micro

Offering details
Pay-as-you-go · $0.017 per 1M input tokens · 0 regions

Context Window: 131000 tokens; Input Modalities: text

IBM watsonx

Offering

IBM watsonx.ai Foundation Models

Offering details
Usage-based · $0.0001 per 1K tokens (Granite 3B) · 5 regions

Granite Models: 3B–20B parameters; Prompt Engineering: true; +3 more

Insilico Medicine

Offering

Insilico Medicine PandaOmics & Chemistry42

Offering details
Enterprise · Free · 1 region

PandaOmics: AI target discovery; Chemistry42: Generative drug design; +3 more

Offering

Intel Gaudi AI Inference

Offering details
Usage-based · $13.11 per hour (AWS DL1 instance, dl1.24xlarge) · 2 regions

Hardware: Gaudi 1/2/3 HPU; SDK: Optimum Habana (PyTorch); +2 more

InvokeAI

Offering

InvokeAI - Generative AI API

Offering details
Open source · Free · 1 region

Model Support: SD 1.x, SDXL, SD3; Workflows: Node-based pipeline editor; +2 more

Jasper

Offering

Jasper AI Marketing Content Platform

Offering details
Subscription · $39 per month · 1 region

Brand Voice: true; Long-form Content: true; +3 more

Kagi

Offering

Kagi - AI Search Summarization

Offering details
Subscription · $5 per month · 1 region

FastGPT: Instant AI answers; Universal Summarizer: Any URL summarization; +2 more

Khan Academy

Offering

Khanmigo - AI Tutoring Assistant

Offering details
Subscription · $4 per month · 1 region

Socratic Tutoring: Question-based guidance; Subject Coverage: Math, science, humanities; +2 more

Kling AI

Offering

Kling AI - Video & Image Generation API

Offering details
Usage-based · $0.140 per video generation · 1 region

Video API: 5-10 second clips, 1080p; Async Processing: Job queue + webhooks; +2 more

Lambda

Offering

Lambda - GPU Cloud AI Inference

Offering details
Usage-based · $0.500 per hour (A10 GPU) · 5 regions

GPU Options: H100, A100, A10, V100; Persistent Storage: Attached storage volumes; +3 more

LanceDB

Offering

LanceDB RAG & AI Application Backend

Offering details
Freemium · Free · 3 regions

RAG-Optimized Search: Hybrid ANN + BM25 retrieval; LangChain Integration: Native LangChain vector store; +3 more

LangChain

Offering

LangChain - AI Inference and LLM Integrations

Offering details
Open source · Free · 1 region

LLM Integrations: 100+ providers; LCEL: Composable chain syntax; +3 more

Lepton AI

Offering

Lepton AI Inference & Deployment Platform

Offering details
Usage-based · $0.0003 per 1K tokens (Llama 3 8B) · 2 regions

Python SDK: true; OpenAI-Compatible API: true; +3 more

Lightning AI

Offering

Lightning AI Inference & Deployment

Offering details
Usage-based · Free · 3 regions

PyTorch-Native Deployment: Lightning App + PyTorch Serve; Automatic Batching: Dynamic request batching; +3 more

Liquid AI

Offering

LFM2-24B-A2B

Offering details
Pay-as-you-go · $0.030 per 1M input tokens · 0 regions

Context Window: 32768 tokens; Input Modalities: text

Liquid AI

Offering

LFM2.5-1.2B-Instruct

Offering details
Free · Price pending · 0 regions

Context Window: 32768 tokens; Input Modalities: text
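
Token prices on this page are quoted in two units, per 1K input tokens and per 1M input tokens, so comparing them directly requires a common unit. The following is a minimal sketch of that conversion, not part of the listing itself. The helper name `per_million` and the `quotes` list are illustrative assumptions, and the figures are copied from four of the listings above.

```python
def per_million(price_usd: float, unit_tokens: int) -> float:
    """Convert a price quoted per `unit_tokens` tokens into a price per 1M tokens."""
    return price_usd * 1_000_000 / unit_tokens


# (offering, quoted input price in USD, tokens per quoted unit)
# Values copied from the listings on this page; the selection is illustrative.
quotes = [
    ("Groq LLaMA Inference (LLaMA 3 8B)", 0.0001, 1_000),
    ("Groq Mixtral Inference", 0.0002, 1_000),
    ("Llama 3.1 8B Instant on Groq", 0.050, 1_000_000),
    ("Granite 4.0 Micro", 0.017, 1_000_000),
]

# Normalize every quote to USD per 1M input tokens and sort cheapest first.
normalized = sorted(
    ((name, round(per_million(price, unit), 4)) for name, price, unit in quotes),
    key=lambda row: row[1],
)

for name, usd in normalized:
    print(f"{name}: ${usd:.3f} per 1M input tokens")
```

Under this conversion, a quote of $0.0001 per 1K input tokens corresponds to $0.10 per 1M input tokens. Hourly, per-seat, and per-generation prices are not comparable this way and are left out.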

Showing 151–200 of 515 offerings