Key Takeaways
- OpenAI's response_format with strict: true enforces schema compliance at the model level — decoding is constrained so the completion cannot deviate from your schema, eliminating defensive regex parsing.
- Define your Zod schema first and derive the JSON Schema from it using zodToJsonSchema — a single source of truth for both runtime validation and the API contract.
- Always distinguish API errors from Zod parse failures in your catch blocks — they require different remediation strategies.
- Route escalations synchronously before persisting: check priority immediately after parsing so P0s hit your on-call webhook in the same request cycle.
- gpt-4o-mini handles 90%+ of standard triage cases at a fraction of the cost — benchmark both models on your historical tickets before committing.
The Problem with Freeform AI Categorization
Classifying support tickets with a standard chat completion endpoint works until it doesn't. The model might return 'Billing Issue' in one response and 'billing_issue' in the next. It might include an explanation paragraph instead of a bare label. It might invent a priority level you never defined. Any of these breaks a downstream routing rule. You end up writing brittle parsers, adding retry logic, and still shipping incidents where a P0 ticket sat in the wrong queue.
OpenAI's structured outputs feature solves this at the protocol level. When you pass a JSON Schema via response_format with strict: true, the model is constrained to emit only tokens that keep the output valid against that schema. You get guaranteed JSON, every time, with the exact fields and enum values you declared.
Why Zod Is the Right Schema Layer
You need a schema definition that serves two purposes: generate the JSON Schema to send to OpenAI, and validate the parsed response at runtime. Maintaining two separate artifacts is a maintenance trap. Zod lets you define the shape once and derive both using zod-to-json-schema. Any mismatch between what you described and what the model returned surfaces as a Zod parse error, not a silent bug.
Step 1: Define the Schema
Step 2: Call gpt-4o with Structured Outputs
Step 3: Routing Logic
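A sketch of the routing step. Queue names and ONCALL_WEBHOOK_URL are hypothetical placeholders for your own infrastructure; the important pattern is that the P0 escalation fires before any persistence, in the same request cycle that parsed the ticket.

```typescript
// Queue names and the webhook env var are placeholders — adapt to your stack.
type Category = "billing" | "technical" | "account" | "feature_request" | "other";
type Priority = "P0" | "P1" | "P2" | "P3";
interface Triage { category: Category; priority: Priority; summary: string; }

const QUEUE_BY_CATEGORY: Record<Category, string> = {
  billing: "queue:billing",
  technical: "queue:engineering",
  account: "queue:support",
  feature_request: "queue:product",
  other: "queue:support",
};

// Pure queue selection, trivially unit-testable.
export function selectQueue(category: Category): string {
  return QUEUE_BY_CATEGORY[category];
}

export async function routeTicket(triage: Triage, ticketId: string): Promise<string> {
  // P0s page on-call synchronously, before any database write, so the
  // alert cannot be lost to a failed persistence step.
  if (triage.priority === "P0") {
    await fetch(process.env.ONCALL_WEBHOOK_URL!, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ticketId, summary: triage.summary }),
    });
  }
  return selectQueue(triage.category);
}
```

Because the category is a closed enum enforced by the schema, QUEUE_BY_CATEGORY is exhaustive by construction — there is no fall-through "unknown category" branch to maintain.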
What to Watch Out For
Schema restrictions: strict mode supports object, array, string, number, integer, boolean, null, and enum types, but every property must be listed in required and every object must set additionalProperties: false. Keywords like default, format, and minLength are not supported — the API returns 400 otherwise.
Schema caching: OpenAI caches your schema by name. If you change the schema, change the name — otherwise responses may be validated against your old schema.
Cost at scale: at ~500 input tokens per ticket, gpt-4o costs roughly $0.003-$0.005 per ticket. gpt-4o-mini drops this to under $2/day at 10k tickets. Benchmark both on your historical data before committing.
Frequently Asked Questions
Does this require a dedicated vector database or RAG setup?
No. Triage relies purely on zero-shot classification via the system prompt and the base intelligence of gpt-4o. You only need the raw email text.
What happens if the model hallucinates a category?
Structured outputs constrain token sampling so the response always conforms to the JSON Schema you supplied. The model cannot return a category outside the z.enum() array you define.