Year in Review 2025

Commercial VIN decoders charge $0.10-0.50 per lookup. They rate-limit you, require API keys, and make you pay for data that's public domain from NHTSA. The entire industry exists because nobody bothered to make the government data usable.
We fixed that, then built an entire automotive platform on top of it.
The Problem: VIN Decoding Is a Racket
NHTSA publishes the vPIC database - every vehicle specification for every VIN pattern ever registered in North America. It's free, public-domain data. But it's distributed as a 5GB+ SQL dump that's unusable without serious infrastructure.
So companies charge you to query it. Per lookup. With rate limits. Some charge $50-200/month for API access. Others charge per thousand lookups. It's rent-seeking on public data.
Corgi: 50MB, Fully Offline, Sub-Millisecond Decoding
We took NHTSA's database, optimized it down to 50MB, and released it as open source. Read the full technical breakdown in our Corgi deep dive.
The entire decoder runs offline. No API calls, no rate limits, no costs. It works in Node.js, browsers (via sql.js), and Cloudflare Workers (via D1). We wrote database adapters for each runtime so the core decoding logic stays identical.
Here's how it works:
Database optimization: We analyzed NHTSA's schema, removed redundant columns, normalized strings, and compressed the SQLite database. The final database is 50MB gzipped, 150MB uncompressed. Commercial solutions ship 1GB+ databases or require server roundtrips.
Pattern matching engine: VINs encode vehicle data in the first 11 characters. We built a pattern matcher that queries the optimized database and returns full vehicle specs in <1ms. No external API calls required.
Multi-runtime support: Same core logic, different database adapters: better-sqlite3 for Node.js, sql.js for browsers, a D1 binding for Cloudflare Workers. The decoder doesn't care where it runs.
```typescript
import { decodeVIN } from "@cardog/corgi";

const result = await decodeVIN("1HGBH41JXMN109186");
```

That's it. No API keys, no HTTP requests, no vendor lock-in. Just VIN in, vehicle data out.
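For a feel of how the runtime adapters fit together, here's a minimal sketch of the idea - the interface, table, and column names are illustrative, not Corgi's actual internals:

```typescript
// A minimal sketch of the multi-runtime adapter idea (names are illustrative).
// Each runtime supplies its own query implementation; the decoder only sees this interface.
interface DatabaseAdapter {
  query(sql: string, params: unknown[]): Promise<Record<string, unknown>[]>;
}

// Hypothetical pattern lookup: match the VIN's identifying characters
// against stored wildcard patterns in the bundled SQLite database.
async function lookupPatterns(db: DatabaseAdapter, vin: string) {
  const wmi = vin.slice(0, 3);        // world manufacturer identifier (chars 1-3)
  const descriptor = vin.slice(3, 8); // part of the vehicle descriptor section
  return db.query(
    `SELECT element, value FROM patterns
     WHERE wmi = ? AND ? GLOB pattern`,
    [wmi, descriptor]
  );
}
```

Because the decoding logic only depends on that narrow interface, swapping better-sqlite3, sql.js, or D1 underneath is an adapter change, not a rewrite.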
Market Intelligence: Real-Time Automotive Analytics
The core product is market analysis. We built a system that analyzes millions of vehicle listings and provides actual insights, not just aggregated averages that tell you nothing.
The Architecture
At the bottom is a Postgres database (on Neon) with every active vehicle listing in North America. On top of that, we built a market analysis service that computes:
- Price distribution histograms - Not just "average price" but actual distribution curves showing where listings cluster
- Odometer-price correlation - Depreciation analysis with enough data points to be statistically meaningful
- Geographic breakdowns - Where vehicles are listed and how prices vary by region
- Time series trends - Week-over-week and month-over-month price movements
- Trim-level analysis - Price deltas between trims, showing which configurations hold value
- Individual listing positioning - Where any specific listing sits relative to the market (percentile rank, pricing guidance)
- Similar listings algorithm - Proximity-based matching considering make/model/year/trim/mileage/price
The service layer is written in TypeScript using Drizzle ORM for type-safe SQL queries. Every endpoint is contract-validated with Zod schemas to prevent schema drift between API and clients.
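Two of the calculations above - the price histogram and the percentile rank - are worth showing because they're so simple once all the listings are in one place. In production they run as SQL aggregations over the listings table; the standalone helpers below are just an illustrative sketch of the math:

```typescript
// Illustrative sketch (not our service code): where does a listing sit
// relative to the prices of comparable listings?
function percentileRank(comparablePrices: number[], listingPrice: number): number | null {
  if (comparablePrices.length === 0) return null;
  const below = comparablePrices.filter((p) => p < listingPrice).length;
  return (below / comparablePrices.length) * 100; // e.g. 37 => priced above 37% of the market
}

// Illustrative histogram bucketing for the price distribution curve.
function priceHistogram(prices: number[], bucketSize = 1000): Map<number, number> {
  const buckets = new Map<number, number>();
  for (const price of prices) {
    const bucket = Math.floor(price / bucketSize) * bucketSize;
    buckets.set(bucket, (buckets.get(bucket) ?? 0) + 1);
  }
  return buckets;
}
```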
API routes look like this:
```typescript
app.get("/:make/:model/:year/overview", async (c) => {
  const { make, model, year } = c.req.valid("param");
  const svc = new MarketService(c.env.CORE_DB_URL);
  return c.json(await svc.overview(make, model, year));
});
```

Simple, fast, type-safe. Hono + Cloudflare Workers means sub-100ms response times globally. Explore the full API documentation.
Research Database: Vehicle Specs That Don't Lie
We maintain a comprehensive vehicle specification database covering every make/model/year/trim variant. This includes:
- Full trim lineups by manufacturer and year
- Technical specifications (engine, transmission, drivetrain)
- OEM imagery (exterior shots, interior, detail photos)
- Factory color options with hex codes
- Media categorization (shot types, angles, contexts)
The research API exposes all of this through clean REST endpoints:
```typescript
const lineup = await client.research.lineup("Toyota", { year: 2024 });

const images = await client.research.getImages({
  make: "Honda",
  model: "Civic",
  year: 2023,
});

const colors = await client.research.getColors({
  make: "Ford",
  model: "F-150",
  year: 2024,
});
```

This powers both our consumer app and dealer tools. Same API, different use cases.
Recalls Pipeline: Government Data That Actually Works
Government recall data is a mess. Transport Canada and NHTSA both publish recall information, but in different formats, with inconsistent schemas, and zero normalization.
We built beaver - an ETL pipeline that:
- Extracts raw recall CSVs from government sources
- Transforms the data into a normalized schema
- Loads it into SQLite with proper indexes
The schema is straightforward:
```sql
CREATE TABLE transport_canada_recalls (
  id TEXT PRIMARY KEY,
  recall_number TEXT NOT NULL,
  year INTEGER,
  make TEXT NOT NULL,
  model TEXT NOT NULL,
  units_affected INTEGER,
  system_type TEXT,
  comment TEXT,
  recall_date DATE
);
```

Now we can query recalls by VIN, make/model/year, or recall number instantly. The recalls tool exposes this through simple endpoints that return clean JSON instead of government CSV soup.
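Consuming the pipeline's output is then a plain SQLite query. Here's a minimal sketch using better-sqlite3 - the file path and helper are illustrative, not part of beaver itself:

```typescript
import Database from "better-sqlite3";

// "recalls.db" is a hypothetical path to the SQLite file the pipeline produces.
const db = new Database("recalls.db", { readonly: true });

// Illustrative helper: look up recalls for a decoded make/model/year.
function recallsFor(make: string, model: string, year: number) {
  return db
    .prepare(
      `SELECT recall_number, system_type, units_affected, recall_date, comment
       FROM transport_canada_recalls
       WHERE make = ? AND model = ? AND year = ?
       ORDER BY recall_date DESC`
    )
    .all(make, model, year);
}
```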
MCP Server: AI That Knows Cars
We built an MCP (Model Context Protocol) server that exposes our entire data platform to AI agents. This means Claude (or any MCP-compatible AI) can:
- Search vehicle listings with complex filters
- Analyze market conditions for specific vehicles
- Look up recalls by VIN
- Get comprehensive vehicle specifications
- Find similar listings
- Calculate market positioning
The tools are defined with Zod schemas and return structured JSON. For example:
```typescript
agent.server.tool(
  "get_market_overview",
  "Get market statistics for a make/model/year",
  makeModelYearSchema,
  async (params) => {
    const overview = await client.market.overview(
      params.make,
      params.model,
      params.year
    );
    return { content: [{ type: "text", text: JSON.stringify(overview) }] };
  }
);
```

This turns our entire platform into a knowledge base that AI can query programmatically. No prompt engineering required - just clean tool definitions and structured data.
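The makeModelYearSchema passed in above is just a set of shared Zod validators from our contracts package. A minimal illustrative version - the exact fields and constraints here are assumptions, not the published schema - looks roughly like this:

```typescript
import { z } from "zod";

// Illustrative stand-in for the shared make/model/year parameter schema.
const makeModelYearSchema = {
  make: z.string().min(1).describe("Vehicle manufacturer, e.g. Toyota"),
  model: z.string().min(1).describe("Vehicle model, e.g. Camry"),
  year: z.number().int().min(1981).describe("Model year"),
};
```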
The Client Layer: Type-Safe Everything
We abstracted all API interaction into a typed client library (@cardog/client) that both our web and mobile apps consume. It's built on top of Zod-validated contracts (@cardog/contracts), so if the API changes, the client breaks at compile time, not runtime.
The client exposes methods for every API endpoint and includes TanStack Query hooks for React:
```typescript
const { data, isLoading } = useMarketOverview("Toyota", "Camry", 2024);
```

No manual cache management, no stale data, no imperative fetching logic. Just declare what data you need and the hooks handle the rest.
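Each hook is a thin wrapper that pairs the typed client with TanStack Query. Roughly - the internals below are a sketch, not the published implementation:

```typescript
import { useQuery } from "@tanstack/react-query";
// `client` is the shared API client instance from @cardog/client (construction omitted here).

export function useMarketOverview(make: string, model: string, year: number) {
  return useQuery({
    // The query key encodes the inputs, so cache entries are scoped per vehicle.
    queryKey: ["market", "overview", make, model, year],
    queryFn: () => client.market.overview(make, model, year),
  });
}
```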
The Apps: Mobile-First, Web Second
We shipped two consumer-facing applications this year:
Mobile app - React Native using Expo 52. Multi-platform (iOS + Android) with native performance. Uses the same API client as web, same authentication system, same data contracts.
Web app - Next.js 15 on Vercel. Tailwind + shadcn/ui primitives. Server components where it makes sense, client components where we need interactivity. Fast, responsive, doesn't feel like a web app pretending to be native.
Both apps hit the same Hono API running on Cloudflare Workers. Authentication is handled by better-auth with custom plugins.
The Data Layer: Postgres + Drizzle
We use two Postgres databases on Neon:
- Core DB - Vehicle listings, specs, images, market data, recalls. Read-heavy, optimized for analytical queries.
- App DB - User accounts, auth, subscriptions, billing, garage. OLTP workload, optimized for transactional consistency.
Drizzle ORM gives us type-safe queries without the bloat of Prisma. Migrations are SQL files we actually understand. Schema definitions live in TypeScript and get inferred everywhere.
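To give a sense of what that looks like, a Drizzle table definition is plain TypeScript. The trimmed listings table below is an illustrative example, not our real schema:

```typescript
import { pgTable, text, integer, numeric, timestamp } from "drizzle-orm/pg-core";

// Illustrative (not our actual) listings table. The row type is inferred
// directly from this definition, so queries stay in sync with the schema.
export const listings = pgTable("listings", {
  id: text("id").primaryKey(),
  vin: text("vin").notNull(),
  make: text("make").notNull(),
  model: text("model").notNull(),
  year: integer("year").notNull(),
  price: numeric("price"),
  odometer: integer("odometer"),
  listedAt: timestamp("listed_at").defaultNow(),
});

export type Listing = typeof listings.$inferSelect;
```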
Neon's branching feature means we can spin up ephemeral database branches for testing without contaminating production data. Delete the branch when done.
The Monorepo: Turborepo + pnpm
Everything lives in a monorepo with apps/ and packages/:
Apps:
- apps/api - Hono + Cloudflare Workers API
- apps/crdg.ai - Next.js web application
- apps/cardog-native - React Native mobile app
- apps/corgi.vin - Corgi VIN decoder demo site
- apps/beaver - Recalls ETL pipeline
Packages:
- packages/client - API client library
- packages/contracts - Zod schemas (single source of truth)
- packages/corgi - Open source VIN decoder
- packages/data/postgres - Drizzle ORM schemas
- packages/mcp - MCP server implementation
Turborepo handles build orchestration. pnpm handles dependency management. We don't think about either one anymore - they just work.
What We Learned
Building automotive infrastructure taught us a few things:
- Government data is always worse than you expect - Plan for inconsistency, missing fields, and schema changes with zero notice.
- Offline-first is underrated - Corgi works offline because we bundled the data. No network = no problem. This matters more than you think.
- Type safety prevents production fires - Zod contracts between API and clients caught hundreds of bugs before they shipped. Worth the extra code.
- Monorepos scale - Sharing code between mobile and web is trivial when everything's in one repo with proper package boundaries.
- Simple architectures win - Hono + Workers + Postgres is fast, cheap, and easy to reason about. No Kubernetes, no microservices, no complexity budget blown on infrastructure.
What's Next
We're building dealer tools on top of this infrastructure. Same APIs, different UI. Market analysis for inventory optimization, listing insights, competitive intelligence.
The platform is flexible enough to support both consumer and B2B use cases without rewriting core services. That's the benefit of building proper abstractions from day one.
All the code runs in production today. Corgi is open source at github.com/cardog-ai/corgi. The rest powers cardog.app and our mobile app.
Not bad for a year of work.