AscendLab
Generate JSON Schema from API Samples Without Overfitting

A practical guide to turning sample JSON into starter schemas while reviewing required fields, optional values, arrays, and real API variation.

Introduction

A JSON sample is a useful starting point for a schema, but it is not the full contract. One response can show shape, field names, and examples; it cannot prove which fields are always present or which values are allowed over time.

The JSON Schema Generator helps turn a pasted sample into a draft schema. Treat that draft as a checklist to review, not as a finished API contract.

Real-world scenario

You receive a sample response:

{
  "id": "ord_123",
  "status": "paid",
  "total": 42.5,
  "items": [{ "sku": "book", "quantity": 1 }]
}

A generator can infer strings, numbers, arrays, and objects from this sample, and it can record the values as examples. You still need to decide whether status is an enum, whether total can be null, and whether items can be empty.
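To make the limits of inference concrete, here is a minimal sketch of what a naive generator might do with the sample above. The function name infer_schema is hypothetical and this is not the tool's actual implementation; it only illustrates how type inference from one sample marks every observed key as required.

```python
def infer_schema(value):
    """Infer a naive JSON Schema fragment from one parsed JSON value."""
    if value is None:
        return {"type": "null"}
    # bool must be checked before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        if not value:
            # An empty array gives no evidence about item shape.
            return {"type": "array"}
        # Only the first element is inspected, so varying item shapes are missed.
        return {"type": "array", "items": infer_schema(value[0])}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(val) for key, val in value.items()},
            # Overfitting risk: every key seen in the sample becomes required.
            "required": sorted(value),
        }
    raise TypeError(f"not a JSON value: {type(value).__name__}")


sample = {
    "id": "ord_123",
    "status": "paid",
    "total": 42.5,
    "items": [{"sku": "book", "quantity": 1}],
}

draft = infer_schema(sample)
print(draft["required"])
```

Nothing in this draft says whether status is an enum, whether total can be null, or whether items can be empty. Those answers come from the API contract, not from the sample.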

What to review

Required fields. A single sample may include fields that are optional in production.

Null values. If real responses can return null, add that to the schema deliberately.

Arrays. Check whether arrays can be empty and whether item shapes vary.

Enums. Do not infer a strict enum from one value unless the API contract confirms it.
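The review points above can be applied as small, deliberate edits to a draft. The sketch below assumes a draft schema held as a Python dict; the helper names allow_null and make_optional are hypothetical, and the review outcomes (total may be null, items may be omitted) are invented for illustration rather than taken from any real contract.

```python
import copy


def allow_null(schema):
    """Return a copy of a schema fragment whose type also permits null."""
    out = copy.deepcopy(schema)
    types = out["type"] if isinstance(out["type"], list) else [out["type"]]
    if "null" not in types:
        types = types + ["null"]
    out["type"] = types
    return out


def make_optional(object_schema, field):
    """Return a copy of an object schema with one field dropped from required."""
    out = copy.deepcopy(object_schema)
    out["required"] = [name for name in out.get("required", []) if name != field]
    return out


draft = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string"},
        "total": {"type": "number"},
        "items": {"type": "array", "items": {"type": "object"}},
    },
    "required": ["id", "items", "status", "total"],
}

# Hypothetical review outcomes, for illustration only.
reviewed = make_optional(draft, "items")
reviewed["properties"]["total"] = allow_null(draft["properties"]["total"])
# status stays a plain string: no enum is added until the contract confirms one.
# items has no minItems keyword, so an empty array remains valid.
```

Each change maps to one confirmed decision, which keeps the diff between the generated draft and the reviewed schema easy to audit.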

Common mistakes

Overfitting to one response. Generate from more than one representative sample when possible.

Skipping formatter checks. Invalid JSON, including comments copied from documentation, can break schema generation, so validate the sample first.

Treating examples as validation rules. Examples help documentation; they do not define every allowed value.
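A formatter check can be as small as parsing the sample before generating anything. This sketch uses Python's standard json module; the helper name check_sample and the sample strings are assumptions made for illustration.

```python
import json


def check_sample(text):
    """Return (parsed_value, None) for valid JSON, or (None, message) otherwise."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"


# JSON has no comment syntax, so a comment pasted from documentation fails to parse.
with_comment = '{\n  "id": "ord_123", // copied from docs\n  "total": 42.5\n}'
clean = '{"id": "ord_123", "total": 42.5}'

for text in (with_comment, clean):
    value, problem = check_sample(text)
    print("ok" if problem is None else f"invalid JSON at {problem}")
```

Reporting the line and column makes it quicker to find the stray comment or trailing character in a long payload.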

Practical QA pass

Collect at least two samples, ideally three, when the API has meaningful variation: a normal success response, an empty result, and, if available, a response that includes optional fields. Generate a starter schema from the richest sample, then check it against the other samples and manually review which fields are truly required. This avoids making a field required only because it appeared in one happy-path payload.
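One way to compare samples is to count how often each top-level field appears and whether it was ever null. The sketch below is a rough aid, not a replacement for the API contract; the function name summarize_fields is hypothetical, and the second and third samples are invented variations.

```python
def summarize_fields(samples):
    """Report, per top-level field, how often it appears and whether it was ever null."""
    names = sorted({name for sample in samples for name in sample})
    summary = {}
    for name in names:
        present = [sample for sample in samples if name in sample]
        summary[name] = {
            "seen_in": len(present),
            "always_present": len(present) == len(samples),
            "ever_null": any(sample[name] is None for sample in present),
        }
    return summary


samples = [
    {"id": "ord_123", "status": "paid", "total": 42.5,
     "items": [{"sku": "book", "quantity": 1}]},
    # Hypothetical variations, invented for illustration.
    {"id": "ord_124", "status": "pending", "total": None, "items": []},
    {"id": "ord_125", "status": "paid", "total": 10.0,
     "items": [{"sku": "pen", "quantity": 2}], "coupon": "WELCOME"},
]

summary = summarize_fields(samples)
for name, facts in summary.items():
    print(name, facts)
```

A field that appears in every sample is still only a candidate for required. Three payloads cannot prove a field is always present, but one payload that omits a field does prove it is optional.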

Next, annotate decisions outside the schema if needed. For example, "status is currently paid, pending, or failed" is different from "status may never contain another value." A generated schema is stronger when paired with short notes about what the team has actually confirmed.

Data handling note

Based on the current public implementation, this tool processes input in the browser. Even so, avoid entering sensitive API payloads unless you have reviewed the implementation and your own data handling requirements.

Final practical note

After generating a schema, mark every field as confirmed, optional, nullable, or unknown. That review is where a generated draft becomes useful documentation.
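The marking step can be tracked with a small review table kept beside the schema. In this sketch the four statuses come from the note above, while the helper name unreviewed_fields, the schema, and the review entries are hypothetical.

```python
REVIEW_STATUSES = {"confirmed", "optional", "nullable", "unknown"}


def unreviewed_fields(schema, review):
    """Return top-level property names that lack a valid review status."""
    properties = schema.get("properties", {})
    return sorted(
        name for name in properties if review.get(name) not in REVIEW_STATUSES
    )


schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string"},
        "total": {"type": "number"},
        "items": {"type": "array"},
    },
}

# Hypothetical review state: items has not been looked at yet.
review = {"id": "confirmed", "status": "unknown", "total": "nullable"}

missing = unreviewed_fields(schema, review)
print(missing)  # ['items']
```

Keeping "unknown" as a valid status is deliberate: it records that someone looked at the field and could not confirm its behavior, which is different from never reviewing it.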
