Groq
Ultra-fast AI inference with custom LPU hardware
Overview
Groq provides some of the fastest AI inference available, built on its custom Language Processing Unit (LPU) hardware. It runs open models such as Llama and Mixtral at hundreds of tokens per second.
Features
Ultra-fast inference (500+ tok/s)
Custom LPU hardware
OpenAI-compatible API
Free tier available
Multiple open models
Low latency
Getting Started
Sign up at console.groq.com for a free API key, then use the OpenAI-compatible endpoint by setting base_url='https://api.groq.com/openai/v1'.
Pricing
Free tier / Pay per token
Frequently Asked Questions
How fast is Groq?
Groq delivers 500+ tokens/second for Llama 3 — roughly 10x faster than typical GPU-based inference.