Generic keywords

Reference: https://json-schema.org/understanding-json-schema/reference/generic.html

This chapter lists some miscellaneous properties that are available for all JSON types.

Annotations

JSON Schema includes a few keywords that aren’t strictly used for validation, but are used to describe parts of a schema. None of these “annotation” keywords are required, but they are encouraged for good practice, and can make your schema “self-documenting”.

The title and description keywords must be strings. A “title” will preferably be short, whereas a “description” will provide a more lengthy explanation about the purpose of the data described by the schema.

The default keyword specifies a default value. This value is not used to fill in missing values during the validation process. Non-validation tools such as documentation generators or form generators may use this value to give hints to users about how to use a value. However, default is typically used to express that if a value is missing, then the value is semantically the same as if the value was present with the default value. The value of default should validate against the schema in which it resides, but that isn’t required.

New in draft 6: The examples keyword is a place to provide an array of examples that validate against the schema. This isn’t used for validation, but may help with explaining the effect and purpose of the schema to a reader. Each entry should validate against the schema in which it resides, but that isn’t strictly required. There is no need to duplicate the default value in the examples array, since default will be treated as another example.

New in draft 7: The boolean keywords readOnly and writeOnly are typically used in an API context. readOnly indicates that a value should not be modified. It could be used to indicate that a PUT request that changes a value would result in a 400 Bad Request response. writeOnly indicates that a value may be set, but will remain hidden. It could be used to indicate that you can set a value with a PUT request, but it would not be included when retrieving that record with a GET request.

New in draft 2019-09: The deprecated keyword is a boolean that indicates that the instance value the keyword applies to should not be used and may be removed in the future.

{
  "title": "Match anything",
  "description": "This is a schema that matches anything.",
  "default": "Default value",
  "examples": [
    "Anything",
    4035
  ],
  "deprecated": true,
  "readOnly": true,
  "writeOnly": false
}
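
Because validation never fills in missing values, an application that wants defaults applied has to do so itself. The following is a minimal Python sketch of that idea, not part of JSON Schema: apply_defaults is a hypothetical helper, the sample schema is made up for illustration, and only top-level properties are handled.

```python
import copy

def apply_defaults(schema, instance):
    """Return a copy of `instance` with missing top-level properties filled
    from each property's "default" annotation (hypothetical helper)."""
    result = copy.deepcopy(instance)
    for name, subschema in schema.get("properties", {}).items():
        if name not in result and "default" in subschema:
            result[name] = copy.deepcopy(subschema["default"])
    return result

# Illustrative schema; the property names are assumptions for this sketch.
schema = {
    "type": "object",
    "properties": {
        "country": {"type": "string", "default": "United States of America"},
        "street_address": {"type": "string"},
    },
}

filled = apply_defaults(schema, {"street_address": "1600 Pennsylvania Avenue NW"})
print(filled)
```

Values that are already present are left untouched, so the helper can run before or after validation.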

Comments

New in draft 7 $comment

The $comment keyword is strictly intended for adding comments to a schema. Its value must always be a string. Unlike the annotations title, description, and examples, JSON Schema implementations aren’t allowed to attach any meaning or behavior to it whatsoever, and may even strip comments at any time. Therefore, comments are useful for leaving notes to future editors of a JSON Schema, but should not be used to communicate to users of the schema.

Enumerated values

The enum keyword is used to restrict a value to a fixed set of values. It must be an array with at least one element, where each element is unique.

The following is an example for validating street light colors:

{
  "enum": ["red", "amber", "green"]
}

ok: "red"

error: "blue"

You can use enum even without a type, to accept values of different types. Let’s extend the example to use null to indicate “off”, and also add 42, just for fun.

{
  "enum": ["red", "amber", "green", null, 42]
}
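
When checking enum by hand in a host language, the comparison has to follow JSON semantics rather than the language’s own equality. The Python sketch below illustrates one way to do that; json_equal and matches_enum are hypothetical names, and the sketch assumes numbers compare by mathematical value while booleans stay distinct from numbers.

```python
def json_equal(a, b):
    """Compare two decoded JSON values using JSON semantics (sketch)."""
    # Booleans are a distinct JSON type, even though Python treats True == 1.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    return type(a) is type(b) and a == b

def matches_enum(value, allowed):
    """True if `value` equals at least one entry of the enum array."""
    return any(json_equal(value, candidate) for candidate in allowed)

colors = ["red", "amber", "green", None, 42]
print(matches_enum("red", colors), matches_enum("blue", colors))
```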

Constant values

New in draft 6

The const keyword is used to restrict a value to a single value.

For example, if you only support shipping to the United States for export reasons:

{
  "properties": {
    "country": {
      "const": "United States of America"
    }
  }
}

ok: { "country": "United States of America" }

error: { "country": "Canada" }

Media: string-encoding non-JSON data

Reference: https://json-schema.org/understanding-json-schema/reference/non_json_data.html

New in draft 7

JSON schema has a set of keywords to describe and optionally validate non-JSON data stored inside JSON strings. Since it would be difficult to write validators for many media types, JSON schema validators are not required to validate the contents of JSON strings based on these keywords. However, these keywords are still useful for an application that consumes validated JSON.

contentMediaType

The contentMediaType keyword specifies the MIME type of the contents of a string, as described in RFC 2046. There is a list of MIME types officially registered by the IANA, but the set of types supported will be application and operating system dependent. Mozilla Developer Network also maintains a shorter list of MIME types that are important for the web.

contentEncoding

The contentEncoding keyword specifies the encoding used to store the contents, as specified in RFC 2045, section 6.1 and RFC 4648.

The acceptable values are 7bit, 8bit, binary, quoted-printable, base16, base32, and base64. If not specified, the encoding is the same as the containing JSON document.

Without getting into the low-level details of each of these encodings, there are really only two options useful for modern usage:

  • If the content is encoded in the same encoding as the enclosing JSON document (which for practical purposes, is almost always UTF-8), leave contentEncoding unspecified, and include the content in a string as-is. This includes text-based content types, such as text/html or application/xml.
  • If the content is binary data, set contentEncoding to base64 and encode the contents using Base64. This would include many image types, such as image/png or audio types, such as audio/mpeg.

contentSchema

New in draft 2019-09

Documentation Coming soon

Examples

The following schema indicates the string contains an HTML document, encoded using the same encoding as the surrounding document:

{
  "type": "string",
  "contentMediaType": "text/html"
}

ok: "<!DOCTYPE html><html xmlns=\"http://www.w3.org/1999/xhtml\"><head></head></html>"

The following schema indicates that a string contains a PNG image, encoded using Base64:

{
  "type": "string",
  "contentEncoding": "base64",
  "contentMediaType": "image/png"
}

ok: "iVBORw0KGgoAAAANSUhEUgAAABgAAAAYCAYAAADgdz34AAAABmJLR0QA/wD/AP+gvaeTAAAA..."
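
Since validators are not required to inspect the string contents, the consuming application usually decodes and checks them itself. The Python sketch below shows one possible check for the image/png case; decode_png_string is a hypothetical helper, and testing only the leading PNG signature is a simplifying assumption, not a full image validation.

```python
import base64
import binascii

# The first eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def decode_png_string(text):
    """Decode a Base64 JSON string and check that it starts like a PNG file.

    Returns the raw bytes, or None if the string is not valid Base64 or
    does not begin with the PNG signature.
    """
    try:
        raw = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return None
    return raw if raw.startswith(PNG_SIGNATURE) else None

# Round trip: binary data -> Base64 text suitable for a JSON string -> bytes.
encoded = base64.b64encode(PNG_SIGNATURE + b"...image data...").decode("ascii")
print(decode_png_string(encoded) is not None)
```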

Schema Composition

Reference: https://json-schema.org/understanding-json-schema/reference/combining.html

JSON Schema includes a few keywords for combining schemas together. Note that this doesn’t necessarily mean combining schemas from multiple files or JSON trees, though these facilities help to enable that and are described in Structuring a complex schema. Combining schemas may be as simple as allowing a value to be validated against multiple criteria at the same time.

These keywords correspond to well known boolean algebra concepts like AND, OR, XOR, and NOT. You can often use these keywords to express complex constraints that can’t otherwise be expressed with standard JSON Schema keywords.

The keywords used to combine schemas are:

  • allOf: (AND) Must be valid against all of the subschemas
  • anyOf: (OR) Must be valid against any of the subschemas
  • oneOf: (XOR) Must be valid against exactly one of the subschemas

All of these keywords must be set to an array, where each item is a schema.

In addition, there is:

  • not: (NOT) Must not be valid against the given schema

allOf

To validate against allOf, the given data must be valid against all of the given subschemas.

{
  "allOf": [
    { "type": "string" },
    { "maxLength": 5 }
  ]
}

ok: "short"

error: "too long"

Note:

allOf cannot be used to “extend” a schema to add more details to it in the sense of object-oriented inheritance. Instances must independently be valid against “all of” the schemas in the allOf. See the section on Extending Closed Schemas for more information.

anyOf

To validate against anyOf, the given data must be valid against any (one or more) of the given subschemas.

{
  "anyOf": [
    { "type": "string", "maxLength": 5 },
    { "type": "number", "minimum": 0 }
  ]
}

ok: "short", 12

error: "too long", -5

oneOf

To validate against oneOf, the given data must be valid against exactly one of the given subschemas.

{
  "oneOf": [
    { "type": "number", "multipleOf": 5 },
    { "type": "number", "multipleOf": 3 }
  ]
}

ok: 10, 9

error:

Not a multiple of either 5 or 3.

2

Multiple of both 5 and 3 is rejected.

15

not

The not keyword declares that an instance validates if it doesn’t validate against the given subschema.

For example, the following schema validates against anything that is not a string:

{ "not": { "type": "string" } }

ok: 42, { "key": "value" }

error: "I am a string"
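
The boolean-algebra reading of the four composition keywords can be sketched in a few lines of Python. This is only an illustration: each subschema is modelled as a plain predicate function, and all function names here are hypothetical rather than part of any validator library.

```python
def all_of(checks, value):
    """AND: every subschema must accept the value."""
    return all(check(value) for check in checks)

def any_of(checks, value):
    """OR: at least one subschema must accept the value."""
    return any(check(value) for check in checks)

def one_of(checks, value):
    """XOR: exactly one subschema must accept the value."""
    return sum(1 for check in checks if check(value)) == 1

def not_(check, value):
    """NOT: the subschema must reject the value."""
    return not check(value)

def is_string(value):
    return isinstance(value, str)

def is_number(value):
    # bool is a subclass of int in Python, but it is not a JSON number.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def max_length_5(value):
    # maxLength only constrains strings; other types pass vacuously.
    return not isinstance(value, str) or len(value) <= 5

def number_multiple_of(n):
    # Models { "type": "number", "multipleOf": n } as a single predicate.
    return lambda value: is_number(value) and value % n == 0

fives_or_threes = [number_multiple_of(5), number_multiple_of(3)]
print(one_of(fives_or_threes, 10), one_of(fives_or_threes, 15))
```

Counting the matches in one_of is what makes 15 fail: it satisfies both subschemas, so the count is two rather than one.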

Properties of Schema Composition

Illogical Schemas

Note that it’s quite easy to create schemas that are logical impossibilities with these keywords. The following example creates a schema that won’t validate against anything (since something may not be both a string and a number at the same time):

{
  "allOf": [
    { "type": "string" },
    { "type": "number" }
  ]
}

error: "No way", -1

Factoring Schemas

Note that it’s possible to “factor” out the common parts of the subschemas. The following two schemas are equivalent.

{
  "oneOf": [
    { "type": "number", "multipleOf": 5 },
    { "type": "number", "multipleOf": 3 }
  ]
}

is equivalent to

{
   "type": "number",
   "oneOf": [
     { "multipleOf": 5 },
     { "multipleOf": 3 }
   ]
 }
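
One informal way to convince yourself that the two forms agree is to evaluate both on a handful of values. The Python sketch below does this by hand; the predicates are simplified stand-ins for the keywords, the sample values are arbitrary, and agreement on samples is a sanity check rather than a proof.

```python
def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def multiple_of(n, value):
    # multipleOf only constrains numbers; other types pass vacuously.
    return not is_number(value) or value % n == 0

def unfactored(value):
    # oneOf over two subschemas that each repeat "type": "number".
    branches = [
        is_number(value) and multiple_of(5, value),
        is_number(value) and multiple_of(3, value),
    ]
    return sum(branches) == 1

def factored(value):
    # "type": "number" hoisted out, oneOf over the multipleOf parts only.
    branches = [multiple_of(5, value), multiple_of(3, value)]
    return is_number(value) and sum(branches) == 1

samples = [10, 9, 2, 15, 0, -3, 7.5, "10", None, True]
agree = all(unfactored(v) == factored(v) for v in samples)
print(agree)
```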

Applying Subschemas Conditionally

dependentRequired

The dependentRequired keyword conditionally requires that certain properties must be present if a given property is present in an object. For example, suppose we have a schema representing a customer. If you have their credit card number, you also want to ensure you have a billing address. If you don’t have their credit card number, a billing address would not be required. We represent this dependency of one property on another using the dependentRequired keyword. The value of the dependentRequired keyword is an object. Each entry in the object maps from the name of a property, p, to an array of strings listing properties that are required if p is present.

In the following example, whenever a credit_card property is provided, a billing_address property must also be present:

{
  "type": "object",

  "properties": {
    "name": { "type": "string" },
    "credit_card": { "type": "number" },
    "billing_address": { "type": "string" }
  },

  "required": ["name"],

  "dependentRequired": {
    "credit_card": ["billing_address"]
  }
}

ok:

{
  "name": "John Doe",
  "credit_card": 5555555555555555,
  "billing_address": "555 Debtor's Lane"
}

This is okay, since we have neither a credit_card nor a billing_address.

{
  "name": "John Doe"
}

Note that dependencies are not bidirectional. It’s okay to have a billing address without a credit card number.

{
  "name": "John Doe",
  "billing_address": "555 Debtor's Lane"
}

error:

This instance has a credit_card, but it’s missing a billing_address.

{
  "name": "John Doe",
  "credit_card": 5555555555555555
}
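
The rule behind these results is small enough to sketch directly. The Python function below is a hypothetical helper, not a library API: given an object and the value of dependentRequired, it lists the properties that are required but absent.

```python
def missing_dependents(instance, dependent_required):
    """Return the properties that dependentRequired demands but that are
    absent from `instance` (sketch; only meaningful for objects)."""
    missing = []
    for trigger, required in dependent_required.items():
        if trigger in instance:
            missing.extend(name for name in required if name not in instance)
    return missing

rules = {"credit_card": ["billing_address"]}

print(missing_dependents({"name": "John Doe", "credit_card": 5555555555555555}, rules))
```

An empty list means the dependentRequired constraint is satisfied; other keywords such as required are checked separately.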

To fix the last issue above (that dependencies are not bidirectional), you can, of course, define the bidirectional dependencies explicitly:

{
  "type": "object",

  "properties": {
    "name": { "type": "string" },
    "credit_card": { "type": "number" },
    "billing_address": { "type": "string" }
  },

  "required": ["name"],

  "dependentRequired": {
    "credit_card": ["billing_address"],
    "billing_address": ["credit_card"]
  }
}

error:

This instance has a credit_card, but it’s missing a billing_address.

{
  "name": "John Doe",
  "credit_card": 5555555555555555
}

This has a billing_address, but is missing a credit_card.

{
  "name": "John Doe",
  "billing_address": "555 Debtor's Lane"
}

Draft 4-7

Prior to Draft 2019-09, dependentRequired and dependentSchemas were one keyword called dependencies. If the dependency value was an array, it would behave like dependentRequired, and if the dependency value was a schema, it would behave like dependentSchemas.
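
The array-versus-schema distinction suggests a mechanical way to migrate an older schema. The Python sketch below is a hypothetical helper (split_dependencies is not a standard tool) and rewrites only the top level of a schema; nested subschemas would need the same treatment.

```python
def split_dependencies(schema):
    """Rewrite a Draft 4-7 `dependencies` keyword into dependentRequired
    and dependentSchemas (hypothetical migration helper, top level only)."""
    converted = {k: v for k, v in schema.items() if k != "dependencies"}
    for prop, dependency in schema.get("dependencies", {}).items():
        if isinstance(dependency, list):
            # Array form: a list of property names that become required.
            converted.setdefault("dependentRequired", {})[prop] = dependency
        else:
            # Anything else is treated as a schema.
            converted.setdefault("dependentSchemas", {})[prop] = dependency
    return converted

legacy = {
    "type": "object",
    "dependencies": {
        "credit_card": ["billing_address"],
        "name": {"required": ["customer_id"]},  # illustrative entry
    },
}
print(split_dependencies(legacy))
```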

dependentSchemas

The dependentSchemas keyword conditionally applies a subschema when a given property is present. This schema is applied in the same way allOf applies schemas. Nothing is merged or extended. Both schemas apply independently.

For example, here is another way to write the above:

{
  "type": "object",

  "properties": {
    "name": { "type": "string" },
    "credit_card": { "type": "number" }
  },

  "required": ["name"],

  "dependentSchemas": {
    "credit_card": {
      "properties": {
        "billing_address": { "type": "string" }
      },
      "required": ["billing_address"]
    }
  }
}

ok:

{
  "name": "John Doe",
  "credit_card": 5555555555555555,
  "billing_address": "555 Debtor's Lane"
}

This has a billing_address, but is missing a credit_card. This passes, because here billing_address just looks like an additional property:

{
  "name": "John Doe",
  "billing_address": "555 Debtor's Lane"
}

error:

This instance has a credit_card, but it’s missing a billing_address:

{
  "name": "John Doe",
  "credit_card": 5555555555555555
}
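
To make “applied in the same way allOf applies schemas” concrete, here is a deliberately tiny Python validator. It is a sketch that supports only the keywords used in this example (type, required, properties, dependentSchemas); is_valid is a hypothetical name and the code is not a substitute for a real implementation.

```python
def is_valid(schema, instance):
    """Tiny validator covering only the keywords used in this example."""
    type_checks = {
        "string": lambda v: isinstance(v, str),
        "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
        "object": lambda v: isinstance(v, dict),
    }
    if "type" in schema and not type_checks[schema["type"]](instance):
        return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
        for name, sub in schema.get("properties", {}).items():
            if name in instance and not is_valid(sub, instance[name]):
                return False
        # Each dependent schema is applied to the whole object, independently
        # of the parent schema, whenever its trigger property is present.
        for trigger, sub in schema.get("dependentSchemas", {}).items():
            if trigger in instance and not is_valid(sub, instance):
                return False
    return True

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "credit_card": {"type": "number"},
    },
    "required": ["name"],
    "dependentSchemas": {
        "credit_card": {
            "properties": {"billing_address": {"type": "string"}},
            "required": ["billing_address"],
        }
    },
}
print(is_valid(schema, {"name": "John Doe", "credit_card": 5555555555555555}))
```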

Draft 4-7

Prior to Draft 2019-09, dependentRequired and dependentSchemas were one keyword called dependencies. If the dependency value was an array, it would behave like dependentRequired, and if the dependency value was a schema, it would behave like dependentSchemas.

If-Then-Else

New in draft 7: The if, then and else keywords allow the application of a subschema based on the outcome of another schema, much like the if/then/else constructs you’ve probably seen in traditional programming languages.

If if is valid, then must also be valid (and else is ignored). If if is invalid, else must also be valid (and then is ignored).

If then or else is not defined, if behaves as if they have a value of true.

If then and/or else appear in a schema without if, then and else are ignored.

We can put this in the form of a truth table, showing the combinations of when if, then, and else are valid and the resulting validity of the entire schema:

if    then   else   whole schema
----  -----  -----  ------------
T     T      n/a    T
T     F      n/a    F
F     n/a    T      T
F     n/a    F      F
n/a   n/a    n/a    T
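
The same table can be written as a small Python function. This is only a sketch of the evaluation rule: conditional_result is a hypothetical name, and None is used here to stand for “the schema has no if keyword”.

```python
def conditional_result(if_valid, then_valid=True, else_valid=True):
    """Validity of a schema containing if/then/else, given each outcome.

    Pass if_valid=None when the schema has no `if` keyword. A missing
    `then` or `else` behaves like the schema `true`, hence the defaults.
    """
    if if_valid is None:
        return True  # `then` and `else` are ignored without `if`
    return then_valid if if_valid else else_valid

# Walk the rows of the truth table.
print(conditional_result(True, then_valid=False))
print(conditional_result(False, else_valid=True))
```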

For example, let’s say you wanted to write a schema to handle addresses in the United States and Canada. These countries have different postal code formats, and we want to select which format to validate against based on the country. If the address is in the United States, the postal_code field is a “zipcode”: five numeric digits followed by an optional four digit suffix. If the address is in Canada, the postal_code field is a six digit alphanumeric string where letters and numbers alternate.

{
  "type": "object",
  "properties": {
    "street_address": {
      "type": "string"
    },
    "country": {
      "default": "United States of America",
      "enum": ["United States of America", "Canada"]
    }
  },
  "if": {
    "properties": { "country": { "const": "United States of America" } }
  },
  "then": {
    "properties": { "postal_code": { "pattern": "[0-9]{5}(-[0-9]{4})?" } }
  },
  "else": {
    "properties": { "postal_code": { "pattern": "[A-Z][0-9][A-Z] [0-9][A-Z][0-9]" } }
  }
}

ok:

{
  "street_address": "1600 Pennsylvania Avenue NW",
  "country": "United States of America",
  "postal_code": "20500"
}
{
  "street_address": "1600 Pennsylvania Avenue NW",
  "postal_code": "20500"
}
{
  "street_address": "24 Sussex Drive",
  "country": "Canada",
  "postal_code": "K1M 1M4"
}

error:

{
  "street_address": "24 Sussex Drive",
  "country": "Canada",
  "postal_code": "10000"
}
{
  "street_address": "1600 Pennsylvania Avenue NW",
  "postal_code": "K1M 1M4"
}

Note

In this example, “country” is not a required property. Because the “if” schema also doesn’t require the “country” property, it will pass and the “then” schema will apply. Therefore, if the “country” property is not defined, the default behavior is to validate “postal_code” as a USA postal code. The “default” keyword doesn’t have an effect, but is nice to include for readers of the schema to more easily recognize the default behavior.
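
The conditional part of this schema can be mirrored in Python to see why a missing country falls through to the United States rule. The sketch below covers only if/then/else (not type or enum), postal_code_ok is a hypothetical name, and it assumes these simple patterns behave the same in Python’s re module as in the schema’s regular expression dialect.

```python
import re

US_PATTERN = r"[0-9]{5}(-[0-9]{4})?"
CA_PATTERN = r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]"

def postal_code_ok(address):
    """Evaluate only the if/then/else part of the schema (sketch)."""
    # `if`: `properties` ignores absent keys, so a missing country passes.
    if_valid = ("country" not in address
                or address["country"] == "United States of America")
    pattern = US_PATTERN if if_valid else CA_PATTERN
    code = address.get("postal_code")
    # `properties` only checks present keys and `pattern` only strings.
    if not isinstance(code, str):
        return True
    # Schema patterns are not implicitly anchored, so search rather than
    # match the whole string.
    return re.search(pattern, code) is not None

print(postal_code_ok({"street_address": "1600 Pennsylvania Avenue NW",
                      "postal_code": "20500"}))
```

Because the patterns are unanchored, a longer string that merely contains five consecutive digits would also pass the United States rule; adding ^ and $ to the patterns would tighten this.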

Unfortunately, the approach above doesn’t scale to more than two countries. You can, however, wrap pairs of if and then inside an allOf to create something that would scale. In this example, we’ll use United States and Canadian postal codes, but also add Netherlands postal codes, which are 4 digits followed by two letters. It’s left as an exercise to the reader to expand this to the remaining postal codes of the world.

{
  "type": "object",
  "properties": {
    "street_address": {
      "type": "string"
    },
    "country": {
      "default": "United States of America",
      "enum": ["United States of America", "Canada", "Netherlands"]
    }
  },
  "allOf": [
    {
      "if": {
        "properties": { "country": { "const": "United States of America" } }
      },
      "then": {
        "properties": { "postal_code": { "pattern": "[0-9]{5}(-[0-9]{4})?" } }
      }
    },
    {
      "if": {
        "properties": { "country": { "const": "Canada" } },
        "required": ["country"]
      },
      "then": {
        "properties": { "postal_code": { "pattern": "[A-Z][0-9][A-Z] [0-9][A-Z][0-9]" } }
      }
    },
    {
      "if": {
        "properties": { "country": { "const": "Netherlands" } },
        "required": ["country"]
      },
      "then": {
        "properties": { "postal_code": { "pattern": "[0-9]{4} [A-Z]{2}" } }
      }
    }
  ]
}

ok:

{
  "street_address": "1600 Pennsylvania Avenue NW",
  "country": "United States of America",
  "postal_code": "20500"
}
{
  "street_address": "1600 Pennsylvania Avenue NW",
  "postal_code": "20500"
}
{
  "street_address": "24 Sussex Drive",
  "country": "Canada",
  "postal_code": "K1M 1M4"
}
{
  "street_address": "Adriaan Goekooplaan",
  "country": "Netherlands",
  "postal_code": "2517 JX"
}

error:

{
  "street_address": "24 Sussex Drive",
  "country": "Canada",
  "postal_code": "10000"
}
{
  "street_address": "1600 Pennsylvania Avenue NW",
  "postal_code": "K1M 1M4"
}

Note

The “required” keyword is necessary in the “if” schemas or they would all apply if the “country” is not defined. Leaving “required” off of the “United States of America” “if” schema makes it effectively the default if no “country” is defined.

Note

Even if “country” were a required field, it’s still recommended to have the “required” keyword in each “if” schema. The validation result will be the same because “required” will fail, but not including it will add noise to error results because it will validate the “postal_code” against all three of the “then” schemas, leading to irrelevant errors.
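
The effect of “required” inside each “if” can be seen by modelling the allOf as a table of rules. The Python sketch below is illustrative only: RULES and failed_rules are hypothetical names, each tuple stands for one if/then pair, and the third field records whether that “if” schema lists “country” under “required”.

```python
import re

# (country constant, postal_code pattern, does the `if` require country?)
RULES = [
    ("United States of America", r"[0-9]{5}(-[0-9]{4})?", False),
    ("Canada", r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]", True),
    ("Netherlands", r"[0-9]{4} [A-Z]{2}", True),
]

def failed_rules(address, rules=RULES):
    """Return the countries whose `then` schema was applied and failed."""
    failures = []
    code = address.get("postal_code")
    for country, pattern, require_country in rules:
        if "country" in address:
            if_valid = address["country"] == country
        else:
            # Without `required`, an absent country satisfies the `if`.
            if_valid = not require_country
        if not if_valid or not isinstance(code, str):
            continue
        if re.search(pattern, code) is None:
            failures.append(country)
    return failures

# Dropping `required` everywhere makes all three `then` schemas apply
# when country is absent, which produces the noisy errors described above.
noisy = [(country, pattern, False) for country, pattern, _ in RULES]
print(failed_rules({"postal_code": "K1M 1M4"}))
print(failed_rules({"postal_code": "K1M 1M4"}, noisy))
```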

Implication

Before Draft 7, you could express an “if-then” conditional using the Schema Composition keywords and a boolean algebra concept called “implication”. A -> B (pronounced, A implies B) means that if A is true, then B must also be true. It can be expressed as !A || B, which can be expressed as a JSON Schema.

{
  "type": "object",
  "properties": {
    "restaurantType": { "enum": ["fast-food", "sit-down"] },
    "total": { "type": "number" },
    "tip": { "type": "number" }
  },
  "anyOf": [
    {
      "not": {
        "properties": { "restaurantType": { "const": "sit-down" } },
        "required": ["restaurantType"]
      }
    },
    { "required": ["tip"] }
  ]
}

ok:

{
  "restaurantType": "sit-down",
  "total": 16.99,
  "tip": 3.4
}
{
  "restaurantType": "fast-food",
  "total": 6.99
}
{ "total": 5.25 }

error:

{
  "restaurantType": "sit-down",
  "total": 16.99
}

Variations of implication can be used to express the same things you can express with the if/then/else keywords. if/then can be expressed as A -> B, if/else can be expressed as !A -> B, and if/then/else can be expressed as A -> B AND !A -> C.
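
These equivalences are easy to check exhaustively, since each schema outcome is just true or false. The Python sketch below does so over all eight combinations; implies and if_then_else are hypothetical helper names, and a missing then or else is modelled as True.

```python
from itertools import product

def implies(a, b):
    """A -> B, written as !A || B."""
    return (not a) or b

def if_then_else(a, b, c):
    """Outcome of if/then/else given the validity of each subschema."""
    return b if a else c

for a, b, c in product([False, True], repeat=3):
    # if/then alone: a missing else behaves like true.
    assert if_then_else(a, b, True) == implies(a, b)
    # if/else alone: a missing then behaves like true.
    assert if_then_else(a, True, c) == implies(not a, c)
    # Full if/then/else: A -> B AND !A -> C.
    assert if_then_else(a, b, c) == (implies(a, b) and implies(not a, c))

print("all combinations agree")
```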

Note:

Since this pattern is not very intuitive, it’s recommended to put your conditionals in $defs with a descriptive name and $ref it into your schema with "allOf": [{ "$ref": "#/$defs/sit-down-restaurant-implies-tip-is-required" }].

Declaring a Dialect

Reference: https://json-schema.org/understanding-json-schema/reference/schema.html

A version of JSON Schema is called a dialect. A dialect represents the set of keywords and semantics that can be used to evaluate a schema. Each JSON Schema release is a new dialect of JSON Schema. JSON Schema provides a way for you to declare which dialect a schema conforms to and provides ways to describe your own custom dialects.

$schema

The $schema keyword is used to declare which dialect of JSON Schema the schema was written for. The value of the $schema keyword is also the identifier for a schema that can be used to verify that the schema is valid according to the dialect $schema identifies. A schema that describes another schema is called a “meta-schema”.

$schema applies to the entire document and must be at the root level. It does not apply to externally referenced ($ref, $dynamicRef) documents. Those schemas need to declare their own $schema.

If $schema is not used, an implementation might allow you to specify a value externally or it might make assumptions about which specification version should be used to evaluate the schema. It’s recommended that all JSON Schemas have a $schema keyword to communicate to readers and tooling which specification version is intended. Therefore most of the time, you’ll want this at the root of your schema:

"$schema": "https://json-schema.org/draft/2020-12/schema"

Draft 4

The identifier for Draft 4 is http://json-schema.org/draft-04/schema#.

Draft 4 defined a value for $schema without a specific dialect (http://json-schema.org/schema#), which meant “use the latest dialect”. This has since been deprecated and should no longer be used.

You might come across references to Draft 5. There is no Draft 5 release of JSON Schema. Draft 5 refers to a no-change revision of the Draft 4 release. It does not add, remove, or change any functionality. It only updates references, makes clarifications, and fixes bugs. Draft 5 describes the Draft 4 release. If you came here looking for information about Draft 5, you’ll find it under Draft 4. We no longer use the “draft” terminology to refer to patch releases to avoid this confusion.

Draft 6

The identifier for Draft 6 is http://json-schema.org/draft-06/schema#.

Draft 7

The identifier for Draft 7 is http://json-schema.org/draft-07/schema#.

Draft 2019-09

The identifier for Draft 2019-09 is https://json-schema.org/draft/2019-09/schema.
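
Tooling that accepts schemas from several dialects typically starts by reading $schema at the root. The Python sketch below maps the identifiers listed in this section to dialect names; detect_dialect is a hypothetical helper, and the exact string comparison is a simplifying assumption, since a real implementation may normalize identifiers before comparing them.

```python
import json

DIALECTS = {
    "http://json-schema.org/draft-04/schema#": "Draft 4",
    "http://json-schema.org/draft-06/schema#": "Draft 6",
    "http://json-schema.org/draft-07/schema#": "Draft 7",
    "https://json-schema.org/draft/2019-09/schema": "Draft 2019-09",
    "https://json-schema.org/draft/2020-12/schema": "Draft 2020-12",
}

def detect_dialect(schema_text, fallback=None):
    """Return the dialect name declared at the root of a schema document,
    or `fallback` when $schema is missing or unrecognized."""
    schema = json.loads(schema_text)
    if not isinstance(schema, dict):
        return fallback  # a boolean schema has no place for $schema
    return DIALECTS.get(schema.get("$schema"), fallback)

text = '{"$schema": "http://json-schema.org/draft-07/schema#", "type": "object"}'
print(detect_dialect(text))
```

The fallback parameter corresponds to the case described above, where an implementation lets you specify the dialect externally or makes its own assumption.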

Vocabularies

New in draft 2019-09

Documentation Coming Soon

Draft 4-7

Before the introduction of Vocabularies, you could still extend JSON Schema with your custom keywords but the process was much less formalized. The first thing you’ll need is a meta-schema that includes your custom keywords. The best way to do this is to make a copy of the meta-schema for the version you want to extend and make your changes to your copy. You will need to choose a custom URI to identify your custom version. This URI must not be one of the URIs used to identify official JSON Schema specification drafts and should probably include a domain name you own. You can use this URI with the $schema keyword to declare that your schemas use your custom version.

Note

Not all implementations support custom meta-schemas and custom keyword implementations.

Guidelines

One of the strengths of JSON Schema is that it can be written in JSON and used in a variety of environments. For example, it can be used for both front-end and back-end HTML Form validation. The problem with using custom vocabularies is that every environment where you want to use your schemas needs to understand how to evaluate your vocabulary’s keywords. Meta-schemas can be used to ensure that schemas are written correctly, but each implementation will need custom code to understand how to evaluate the vocabulary’s keywords.

Meta-data keywords are the most interoperable because they don’t affect validation. For example, you could add a units keyword. This will always work as expected with a compliant validator.

{
  "type": "number",
  "units": "kg"
}

ok: 42

error: "42"

The next best candidates for custom keywords are keywords that don’t apply other schemas and don’t modify the behavior of existing keywords. An isEven keyword is an example. In contexts where some validation is better than no validation such as validating an HTML Form in the browser, this schema will perform as well as can be expected. Full validation would still be required and should use a validator that understands the custom keyword.

{
  "type": "integer",
  "isEven": true
}

ok:

2

This passes because the validator doesn’t understand isEven.

3

error:

The schema isn’t completely impaired when the validator doesn’t understand isEven; the type constraint still applies.

"3"

The least interoperable type of custom keyword is one that applies other schemas or modifies the behavior of existing keywords. An example would be something like requiredProperties that declares properties and makes them required. This example shows how the schema becomes almost completely useless when evaluated with a validator that doesn’t understand the custom keyword. That doesn’t necessarily mean that requiredProperties is a bad idea for a keyword, it’s just not the right choice if the schema might need to be used in a context that doesn’t understand custom keywords.

{
  "type": "object",
  "requiredProperties": {
    "foo": { "type": "string" }
  }
}

ok:

{ "foo": "bar" }

This passes because requiredProperties is not understood

{}

This passes because requiredProperties is not understood

{ "foo": 42 }
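
The difference between a validator that understands requiredProperties and one that doesn’t can be sketched with a toy Python validator that silently skips unknown keywords. Everything here is hypothetical: make_validator and required_properties are illustrative names, the semantics given to requiredProperties are assumed from the description above, and only the type keyword is built in.

```python
def check_type(expected, value):
    # Only the two types needed for this example.
    types = {"object": dict, "string": str}
    return isinstance(value, types[expected])

def required_properties(subschemas, value, validate):
    # Assumed semantics: every listed property must be present and must
    # validate against its subschema. Non-objects pass vacuously.
    if not isinstance(value, dict):
        return True
    return all(name in value and validate(sub, value[name])
               for name, sub in subschemas.items())

def make_validator(custom_keywords=None):
    custom = custom_keywords or {}

    def validate(schema, value):
        for keyword, argument in schema.items():
            if keyword == "type":
                if not check_type(argument, value):
                    return False
            elif keyword in custom:
                if not custom[keyword](argument, value, validate):
                    return False
            # Any other keyword is unknown and silently ignored.
        return True

    return validate

schema = {"type": "object", "requiredProperties": {"foo": {"type": "string"}}}
plain = make_validator()
aware = make_validator({"requiredProperties": required_properties})

print(plain(schema, {"foo": 42}), aware(schema, {"foo": 42}))
```

With the plain validator only the type check survives, which is why both {} and { "foo": 42 } pass in the example above.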