Groq

Ultra-fast AI inference with custom LPU hardware

Overview

Groq provides ultra-fast AI inference on its custom Language Processing Unit (LPU) hardware, running open models like Llama and Mixtral at hundreds of tokens per second.

Features

Ultra-fast inference (500+ tok/s)
Custom LPU hardware
OpenAI-compatible API
Free tier available
Multiple open models
Low latency

Getting Started

Sign up at console.groq.com for a free API key. Because the API is OpenAI-compatible, any OpenAI client works if you point it at base_url='https://api.groq.com/openai/v1'.
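As a minimal sketch, the request below is built in the OpenAI chat-completions format using only the Python standard library. The model name "llama3-8b-8192" is illustrative (check Groq's current model list), and GROQ_API_KEY is assumed to hold your key.

```python
import json
import os
import urllib.request

# Groq's OpenAI-compatible base URL (from the setup notes above).
BASE_URL = "https://api.groq.com/openai/v1"

def build_request(prompt: str, model: str = "llama3-8b-8192") -> urllib.request.Request:
    """Build a POST request in the OpenAI chat-completions format."""
    payload = {
        "model": model,  # illustrative model name — see Groq's model list
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('GROQ_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Say hello in one sentence.")
# Actually sending the request requires a valid GROQ_API_KEY:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same shape works with the official OpenAI Python SDK by passing the base_url and your Groq key when constructing the client.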

Pricing

Free tier / Pay per token


Frequently Asked Questions

How fast is Groq?

Groq delivers 500+ tokens/second for Llama 3 — roughly 10x faster than typical GPU-based inference.