ForkBench

Developer Tools

Compare LLMs Side-by-Side

Test the same prompt across multiple AI models simultaneously. See quality, cost, and latency in real time, and make an informed decision about which model to use.

Key Features

  • Side-by-side comparison of up to 4 models at once
  • Real-time streaming responses
  • Transparent cost tracking per request
  • Performance metrics (latency, tokens/sec, cost)

Supported Models

OpenAI

GPT-4o, GPT-4o Mini, GPT-4 Turbo

Anthropic

Claude 3.5 Sonnet, Opus, Haiku

Google & More

Gemini, DeepSeek, Sherlock

How It Works

  1. Enter your prompt in the text area
  2. Select the 2-4 models you want to compare
  3. Click "Run Test" and watch responses stream in real time
  4. Review metrics to find the best model for your use case
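
The steps above amount to sending one prompt to several models concurrently and timing each reply. The sketch below illustrates that pattern only and is not ForkBench's implementation: `fake_model`, `timed_call`, `run_test`, and `compare` are hypothetical names, and the model call is a local stub that sleeps instead of contacting a provider.

```python
import asyncio
import time


async def fake_model(name: str, prompt: str, delay: float) -> str:
    """Hypothetical stand-in for a real model call: waits, then echoes."""
    await asyncio.sleep(delay)
    return f"[{name}] reply to: {prompt}"


async def timed_call(name: str, prompt: str, delay: float) -> dict:
    """Run one stubbed model call and record its wall-clock latency."""
    start = time.perf_counter()
    text = await fake_model(name, prompt, delay)
    return {"model": name, "text": text, "latency_s": time.perf_counter() - start}


async def run_test(prompt: str, models: dict[str, float]) -> list[dict]:
    """Send the same prompt to every selected model at the same time."""
    if not 2 <= len(models) <= 4:
        raise ValueError("select between 2 and 4 models")
    tasks = [timed_call(name, prompt, delay) for name, delay in models.items()]
    # gather() returns results in the order the tasks were passed in.
    return await asyncio.gather(*tasks)


def compare(prompt: str, models: dict[str, float]) -> list[dict]:
    """Synchronous entry point; `models` maps a model name to a simulated delay."""
    return asyncio.run(run_test(prompt, models))
```

Calling `compare("Summarize this text", {"model-a": 0.05, "model-b": 0.02})` returns one result per model, in selection order, each with its own latency. Because the calls run concurrently, the total wait is close to the slowest model's delay, not the sum of all delays.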

Metrics Tracked

  • Response latency (how fast each model responds)
  • Token usage (input + output tokens)
  • Cost per request (based on actual usage)
  • Tokens per second (generation speed)
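
As a rough illustration of how these metrics relate to one another, the sketch below derives token usage, generation speed, and cost from a single request's raw numbers. The `RunStats` type and the per-million-token price fields are assumptions made for the example. Dividing output tokens by total latency is one plausible definition of tokens per second, and ForkBench may measure it differently.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Raw numbers for one request (hypothetical structure)."""
    input_tokens: int
    output_tokens: int
    latency_s: float
    input_price_per_mtok: float   # assumed USD per 1M input tokens
    output_price_per_mtok: float  # assumed USD per 1M output tokens


def total_tokens(s: RunStats) -> int:
    """Token usage: input plus output tokens."""
    return s.input_tokens + s.output_tokens


def tokens_per_second(s: RunStats) -> float:
    """Generation speed, taken here as output tokens over total latency."""
    return s.output_tokens / s.latency_s if s.latency_s > 0 else 0.0


def cost_usd(s: RunStats) -> float:
    """Cost per request from actual token counts and per-million prices."""
    return (
        s.input_tokens * s.input_price_per_mtok
        + s.output_tokens * s.output_price_per_mtok
    ) / 1_000_000
```

For example, a request with 1,000 input tokens and 500 output tokens that takes 2 seconds, priced at an illustrative $2 and $8 per million tokens, works out to 1,500 total tokens, 250 tokens per second, and $0.006.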

Get Started

Sign up for a free account and add your API keys. You'll need an OpenAI key and an OpenRouter key to access all models.

OpenRouter gives you access to Claude, Gemini, DeepSeek, and more through a single API.
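
The two-key setup suggests a simple routing rule: OpenAI models use the OpenAI key, and everything else goes through OpenRouter. The sketch below encodes that rule as inferred from the text above, not from ForkBench's code. `provider_for` and `key_for` are hypothetical helpers, and the key values in the usage note are placeholders.

```python
# Model names as listed under "Supported Models"; the routing rule is an assumption.
OPENAI_MODELS = {"GPT-4o", "GPT-4o Mini", "GPT-4 Turbo"}


def provider_for(model: str) -> str:
    """Return which key a model needs: 'openai' or 'openrouter'."""
    return "openai" if model in OPENAI_MODELS else "openrouter"


def key_for(model: str, keys: dict[str, str]) -> str:
    """Look up the API key for a model, failing clearly if it is missing."""
    provider = provider_for(model)
    key = keys.get(provider, "")
    if not key:
        raise KeyError(f"no {provider} key configured for {model}")
    return key
```

With `keys = {"openai": "<your-openai-key>", "openrouter": "<your-openrouter-key>"}`, `key_for("GPT-4o", keys)` selects the OpenAI key and `key_for("Gemini", keys)` selects the OpenRouter key. If the OpenRouter key is missing, only the OpenAI models remain usable.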