How to Build an AI App with Next.js, Supabase & OpenAI (2026 Guide)
A step-by-step guide to building an AI-powered application with Next.js 16, the Vercel AI SDK, Supabase (including pgvector for RAG), and OpenAI — from streaming chat to deployment on Vercel.
What You'll Build
By the end of this guide you'll have a production-ready AI app with streaming chat, optional RAG (retrieval-augmented generation), persistence in Supabase, and rate limiting. Here's the architecture:
```
[User] → [Next.js App] → [API Route]
                              ↓
            [Vercel AI SDK] streamText / useChat
                              ↓
            [OpenAI API] (GPT-5 Mini / GPT-5.2)
                              ↓
            [Supabase] conversations, embeddings (pgvector)
                              ↓
            [Stream] → [Client]
```

The AI App Stack
Cost Estimate (MVP)
- 1,000 chat messages/day with GPT-5 Mini ≈ $5–15/month
- RAG with 10,000 documents ≈ $0.50 one-time embedding cost
- The Supabase free tier handles most AI apps
- Total for an MVP: $0–25/month (mostly the OpenAI API)
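The chat estimate above can be sanity-checked with quick arithmetic. The per-message token counts below are assumptions, not measurements; adjust them for your own prompts and history length:

```typescript
// Rough monthly cost model for 1,000 messages/day on GPT-5 Mini.
const msgsPerMonth = 1000 * 30;
const inputTokensPerMsg = 300;  // prompt + history (assumed)
const outputTokensPerMsg = 200; // model response (assumed)

const inputCost = (msgsPerMonth * inputTokensPerMsg / 1_000_000) * 0.25; // $0.25/1M input
const outputCost = (msgsPerMonth * outputTokensPerMsg / 1_000_000) * 2.0; // $2.00/1M output

const monthlyCost = inputCost + outputCost;
console.log(monthlyCost.toFixed(2)); // ≈ "14.25" – inside the $5–15 range
```

Longer prompts (system instructions, RAG context) push the input side up fast, which is why the range is wide.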
Why This Stack?
Next.js plus the Vercel AI SDK is purpose-built for AI apps: streaming responses, edge functions, and React Server Components keep load times fast. According to Vercel's State of AI report, 79% of AI builders prioritize product features over chatbots, and 70% use vector databases. OpenAI has ~88% adoption among AI builders, though developers use an average of two LLM providers. This stack gives you the best default with room to add more providers later.
1. Set Up the Project
Create a new Next.js app and install the AI SDK, OpenAI provider, and Supabase client.
```bash
npx create-next-app@latest my-ai-app --typescript --tailwind --app
cd my-ai-app
npm install ai @ai-sdk/openai @supabase/supabase-js
```

2. Configure OpenAI
Add your OpenAI API key to .env.local and use the AI SDK's OpenAI provider. GPT-5 Mini is the cheapest capable model ($0.25/1M input, $2/1M output); use GPT-5.2 for harder tasks.
```bash
# .env.local
OPENAI_API_KEY=sk-...
```

```typescript
// lib/openai.ts (optional wrapper)
import { createOpenAI } from '@ai-sdk/openai';

export const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
```

3. Build a Chat Interface
Use the Vercel AI SDK useChat hook for chat UIs. It handles message history, loading state, and streaming automatically.
```typescript
'use client';

import { useChat } from 'ai/react';

export function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat',
  });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          {m.role}: {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} disabled={isLoading} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}
```

4. Create the API Route
Use streamText from the AI SDK for streaming. Streaming is critical for UX — users see output immediately instead of waiting for the full response.
```typescript
// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4o-mini'), // or 'gpt-5-mini' when available
    system: 'You are a helpful assistant.',
    messages,
  });

  return result.toDataStreamResponse();
}
```

5. Add Supabase for Persistence
Store conversations and user history in Supabase. Create tables for conversations and messages, and optionally track token usage for cost management.
```typescript
// After each completed response, save it to Supabase
// (assumes a `supabase` client created via createClient from @supabase/supabase-js)
const { error } = await supabase
  .from('messages')
  .insert({ conversation_id, role: 'assistant', content: fullContent, user_id });
if (error) console.error('Failed to save message:', error);
```

6. Implement RAG with pgvector
Enable the pgvector extension in Supabase. Generate embeddings with OpenAI text-embedding-3-small ($0.02/1M tokens). Store document chunks and their embeddings, then run similarity search and inject the top results into the prompt.
```sql
-- Enable pgvector in the Supabase SQL editor
create extension if not exists vector;

create table documents (
  id uuid primary key default gen_random_uuid(),
  content text,
  embedding vector(1536) -- dimension for text-embedding-3-small
);

-- Query the most similar chunks
select content from documents
order by embedding <-> $1::vector
limit 5;
```

7. Add Authentication
Protect API routes and track per-user usage. Use Supabase Auth or Clerk; both integrate cleanly with Next.js. Validate the session in your chat API route and attach user_id to conversations.
```typescript
// In the API route: resolve the user from Supabase Auth or Clerk
// (getSession is a placeholder for your provider's session helper)
const session = await getSession(req);
if (!session?.user) return new Response('Unauthorized', { status: 401 });
// Then use session.user.id for conversations and rate limits
```

8. Rate Limiting & Cost Control
Protect your API budget with per-user limits. Use Upstash Redis or in-memory rate limiting. Track token usage from the AI SDK response and set spending limits in the OpenAI dashboard.
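Setting up such a limiter could look like this sketch, assuming Upstash's `@upstash/ratelimit` and `@upstash/redis` packages are installed and their env vars are set; the 20-per-minute window is an arbitrary choice:

```typescript
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

// One shared limiter instance; keys are namespaced per user at call time
export const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(), // reads UPSTASH_REDIS_REST_URL / _TOKEN
  limiter: Ratelimit.slidingWindow(20, '1 m'), // 20 requests per minute per key
});
```

A sliding window avoids the burst-at-the-boundary problem of fixed windows, which matters when each request costs real money.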
```typescript
// Check the rate limit before calling OpenAI
// (assumes a `ratelimit` instance, e.g. from @upstash/ratelimit)
const key = `ratelimit:${userId}`;
const { success } = await ratelimit.limit(key);
if (!success) return new Response('Too many requests', { status: 429 });
```

9. Deploy to Vercel
Add environment variables (OpenAI, Supabase, Clerk if used) in Vercel. Edge functions work well for streaming AI responses. Push to your connected repo to deploy.
```typescript
// app/api/chat/route.ts – opt the chat route into the Edge runtime
export const runtime = 'edge'; // lower latency for streaming
```

OpenAI API Pricing (2026)
| Model | Input | Output |
|---|---|---|
| GPT-5 Mini | $0.25/1M | $2.00/1M |
| GPT-5.2 | $1.75/1M | $14.00/1M |
| GPT-5.2 Pro | $21.00/1M | $168.00/1M |
| text-embedding-3-small | $0.02/1M | — |
Batch API: 50% discount for async processing.
Cost Optimization Tips
- Use GPT-5 Mini for simple tasks; reserve GPT-5.2 for complex reasoning.
- Set spending limits in the OpenAI dashboard and enable usage alerts.
- Track token usage from `usage` in the AI SDK response and log it per user.
- Use the Batch API for non-real-time jobs (e.g. nightly embedding runs) to cut costs by 50%.
Going Further
Add multi-model support (e.g. OpenAI + Anthropic) via the AI SDK. Use function calling (tools) for structured outputs and actions. Explore agents and multi-step reasoning for more advanced flows. The same patterns — streaming, persistence, rate limiting — apply as you scale.
FAQ
Why use the Vercel AI SDK instead of calling OpenAI directly?
The AI SDK gives you streaming out of the box, the useChat/useCompletion hooks, and a provider-agnostic API so you can switch or add models (e.g. Anthropic) without rewriting your app.
Do I need a vector database for a simple chatbot?
No. Use RAG (pgvector) when you need the model to answer from your own documents or knowledge base. For general chat, a database for conversation history is enough.
How do I reduce OpenAI costs?
Use GPT-5 Mini for most requests, set usage limits and alerts, implement per-user rate limiting, and use the Batch API for non-real-time workloads. Track tokens per user to spot heavy usage.
Can I use Supabase Auth and Clerk together?
Typically you pick one. Supabase Auth is free and integrates with your existing Supabase project; Clerk offers more pre-built UI and org features. Both work well with Next.js and the AI SDK.
Is streaming required?
Not required, but strongly recommended for chat. Users see output immediately, which feels much faster. Use streamText for streaming and generateText when you need the full response before returning (e.g. for post-processing).