APIs for running AI and machine learning model inference.
50 offerings on this page with service context, pricing, regions, and links.
| Service | Offering | Pricing model | Starting price | Regions | Features | Links |
|---|---|---|---|---|---|---|
| Google Gemini | Gemma 3 4B | Pay-as-you-go | $0.040 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text, image | |
| Google Gemini | Gemma 3 4B | Free | — | 0 | Context Window: 32768 tokens; Input Modalities: text, image | |
| Google Gemini | Gemma 3n 2B | Free | — | 0 | Context Window: 8192 tokens; Input Modalities: text | |
| Google Gemini | Gemma 3n 4B | Pay-as-you-go | $0.020 per 1M input tokens | 0 | Context Window: 32768 tokens; Input Modalities: text | |
| Google Gemini | Gemma 3n 4B | Free | — | 0 | Context Window: 8192 tokens; Input Modalities: text | |
| Google Gemini | Gemma 4 26B A4B | Pay-as-you-go | $0.130 per 1M input tokens | 0 | Context Window: 262144 tokens; Input Modalities: image, text, video | |
| Google Gemini | Gemma 4 31B | Pay-as-you-go | $0.140 per 1M input tokens | 0 | Context Window: 262144 tokens; Input Modalities: image, text, video | |
| Google Gemini | Lyria 3 Clip Preview | Free | — | 0 | Context Window: 1048576 tokens; Input Modalities: text, image | |
| Google Gemini | Lyria 3 Pro Preview | Free | — | 0 | Context Window: 1048576 tokens; Input Modalities: text, image | |
| Google Workspace | Gemini for Google Workspace | Subscription | $20 per user/month (Gemini Business add-on) | 1 | Gmail AI: Draft, summarize, reply; Docs AI: Write, rewrite, proofread; +2 more | |
| Grammarly | Grammarly AI Writing Assistant | Freemium | Free | 1 | Real-time Suggestions: true; Generative AI Writing: true; +3 more | |
| Grammarly | Grammarly Business | Subscription | $15 per member/month (billed annually, minimum 3 seats) | 1 | Style Guide: Company style guide enforcement; Brand Tone: Custom tone settings; +2 more | |
| Grammarly | Grammarly GO (AI Writing Features) | Subscription | $12 per month (Pro plan, billed annually) | 1 | Generation: Draft emails and documents; Rewriting: Full-paragraph rewrites; +2 more | |
| Graphcore | Graphcore Poplar SDK | Free | Free | 1 | Compiler: Poplar Graph Compiler; PyTorch Integration: PopTorch; +2 more | |
| Groq | Groq Compound | Custom | — | 0 | Context Window: 131072 tokens; Input Modalities: text | |
| Groq | GPT-OSS 120B on Groq | Pay-as-you-go | $0.150 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text | |
| Groq | GPT-OSS 20B on Groq | Pay-as-you-go | $0.075 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text | |
| Groq | Groq LPU AI Inference API | Usage-based | Free | 2 | Inference Speed: 500+ tokens/second; Latency: <1ms per token; +3 more | |
| Groq | Groq LLaMA Inference | Usage-based | $0.0001 per 1K input tokens (LLaMA 3 8B) | 1 | Inference Speed: 750+ tokens/sec; Models Available: LLaMA 3, Mixtral, Gemma, Whisper; +3 more | |
| Groq | Llama 3.1 8B Instant on Groq | Pay-as-you-go | $0.050 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text | |
| Groq | Llama 3.3 70B Versatile on Groq | Pay-as-you-go | $0.590 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text | |
| Groq | Llama 4 Scout on Groq | Pay-as-you-go | $0.110 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text, image | |
| Groq | Groq Mixtral Inference | Usage-based | $0.0002 per 1K input tokens | 1 | Architecture: Mixture of Experts 8x7B; Speed: 500+ tokens/sec; +3 more | |
| Groq | Qwen3 32B on Groq | Pay-as-you-go | $0.290 per 1M input tokens | 0 | Context Window: 131072 tokens; Input Modalities: text | |
| H2O.ai | H2O.ai Model Deployment (Driverless AI + MLOps) | Subscription | Free | 3 | Scoring: REST API + batch; Champion-Challenger: A/B traffic splitting; +2 more | |
| Hailuo AI | Hailuo AI MiniMax API | Usage-based | $0.0002 per 1K input tokens | 1 | MiniMax-Text-01: 456B parameter model; Multimodal: Text, image, audio, video; +3 more | |
| Helicone | Helicone - AI Gateway and Observability | Freemium | Free | 1 | Provider Support: OpenAI, Anthropic, Azure, Gemini, 30+; Request Logging: Full request/response capture; +3 more | |
| Hugging Face | Hugging Face Inference Endpoints | Usage-based | $0.032 per hour (CPU) | 4 | One-Click Deployment: true; Auto-scaling: true; +3 more | |
| Humanloop | Humanloop LLM Development Platform | Freemium | Free | 1 | Prompt Management: true; Evaluation Framework: true; +3 more | |
| HyperWrite | HyperWrite AI Models | Subscription | $19.99 per month (Premium) | 1 | Multi-Model Access: GPT-4, Claude, proprietary; Intelligent Routing: Auto model selection; +3 more | |
| IBM Cloud | IBM Cloud - Watson Machine Learning | Usage-based | Free | 5 | Model Serving: Online + batch scoring; AutoAI: Automated ML pipeline; +3 more | |
| IBM Research | Granite 4.0 Micro | Pay-as-you-go | $0.017 per 1M input tokens | 0 | Context Window: 131000 tokens; Input Modalities: text | |
| IBM watsonx | IBM watsonx.ai Foundation Models | Usage-based | $0.0001 per 1K tokens (Granite 3B) | 5 | Granite Models: 3B–20B parameters; Prompt Engineering: true; +3 more | |
| Inception Labs | Mercury | Pay-as-you-go | $0.250 per 1M input tokens | 0 | Context Window: 128000 tokens; Input Modalities: text | |
| Inception Labs | Mercury 2 | Pay-as-you-go | $0.250 per 1M input tokens | 0 | Context Window: 128000 tokens; Input Modalities: text | |
| Inception Labs | Mercury Coder | Pay-as-you-go | $0.250 per 1M input tokens | 0 | Context Window: 128000 tokens; Input Modalities: text | |
| Insilico Medicine | Insilico Medicine PandaOmics & Chemistry42 | Enterprise | Free | 1 | PandaOmics: AI target discovery; Chemistry42: Generative drug design; +3 more | |
| Intel Gaudi (Habana Labs) | Intel Gaudi AI Inference | Usage-based | $13.11 per hour (AWS DL1 instance, dl1.24xlarge) | 2 | Hardware: Gaudi 1/2/3 HPU; SDK: Optimum Habana (PyTorch); +2 more | |
| InvokeAI | InvokeAI - Generative AI API | Open source | Free | 1 | Model Support: SD 1.x, SDXL, SD3; Workflows: Node-based pipeline editor; +2 more | |
| Jasper | Jasper AI Marketing Content Platform | Subscription | $39 per month | 1 | Brand Voice: true; Long-form Content: true; +3 more | |
| Kagi | Kagi - AI Search Summarization | Subscription | $5 per month | 1 | FastGPT: Instant AI answers; Universal Summarizer: Any URL summarization; +2 more | |
| Khan Academy | Khanmigo - AI Tutoring Assistant | Subscription | $4 per month | 1 | Socratic Tutoring: Question-based guidance; Subject Coverage: Math, science, humanities; +2 more | |
| Kling AI | Kling AI - Video & Image Generation API | Usage-based | $0.140 per video generation | 1 | Video API: 5-10 second clips, 1080p; Async Processing: Job queue + webhooks; +2 more | |
| Lambda | Lambda - GPU Cloud AI Inference | Usage-based | $0.500 per hour (A10 GPU) | 5 | GPU Options: H100, A100, A10, V100; Persistent Storage: Attached storage volumes; +3 more | |
| LanceDB | LanceDB RAG & AI Application Backend | Freemium | Free | 3 | RAG-Optimized Search: Hybrid ANN + BM25 retrieval; LangChain Integration: Native LangChain vector store; +3 more | |
| LangChain | LangChain - AI Inference and LLM Integrations | Open source | Free | 1 | LLM Integrations: 100+ providers; LCEL: Composable chain syntax; +3 more | |
| Lepton AI | Lepton AI Inference & Deployment Platform | Usage-based | $0.0003 per 1K tokens (Llama 3 8B) | 2 | Python SDK: true; OpenAI-Compatible API: true; +3 more | |
| Lightning AI | Lightning AI Inference & Deployment | Usage-based | Free | 3 | PyTorch-Native Deployment: Lightning App + PyTorch Serve; Automatic Batching: Dynamic request batching; +3 more | |
| Liquid AI | LFM2-24B-A2B | Pay-as-you-go | $0.030 per 1M input tokens | 0 | Context Window: 32768 tokens; Input Modalities: text | |
| Liquid AI | LFM2.5-1.2B-Instruct | Free | — | 0 | Context Window: 32768 tokens; Input Modalities: text | |
Showing 151–200 of 515 offerings
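
The Starting price column mixes two units: some rows quote a price per 1K tokens and others a price per 1M tokens, so the raw figures are not directly comparable. The following is a minimal sketch of putting two rows on the same per-1M scale. The helper name `price_per_million` is hypothetical (not part of any provider's API), and the amounts are copied from the table, which may not reflect current pricing.

```python
from decimal import Decimal


def price_per_million(amount: str, unit_tokens: int) -> Decimal:
    """Scale a price quoted per `unit_tokens` tokens to a price per 1M tokens.

    `amount` is passed as a string so Decimal keeps the listed value exactly.
    """
    return Decimal(amount) * Decimal(1_000_000) / Decimal(unit_tokens)


# Amounts copied from the table; the units differ between the two rows.
groq_llama_inference = price_per_million("0.0001", 1_000)      # listed per 1K input tokens
llama_31_8b_instant = price_per_million("0.050", 1_000_000)    # listed per 1M input tokens

# Both values are now in dollars per 1M input tokens and can be compared.
print("Groq LLaMA Inference (LLaMA 3 8B):", groq_llama_inference)
print("Llama 3.1 8B Instant on Groq:", llama_31_8b_instant)
print("Cheaper per 1M input tokens:",
      "Llama 3.1 8B Instant on Groq"
      if llama_31_8b_instant < groq_llama_inference
      else "Groq LLaMA Inference")
```

Using `Decimal` rather than `float` avoids binary rounding noise on small prices such as $0.0001. The comparison covers input-token prices only, since the table lists no output-token prices.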