Introduction to JSON Schema
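JSON Schema acts as a virtual "contract" for data exchange.
A JSON validator checks for syntax errors, while JSON Schema provides conformance checking. JSON Schema is the data receiver's first line of defense, and a good tool for the data sender to save time and make sure the data is correct. JSON Schema can answer the following conformance questions:
- Is the data type of a value correct? A value can be constrained to be a number, a string, and so on.
- Is the required data present? The schema can state which data is required and which is not.
- Does the value have the form I need? The schema can specify ranges, minimum values, and maximum values.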
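The three questions above can be sketched in plain Python. This is a hand-written illustration using only the standard library, not a JSON Schema implementation; the function name `check_record` and the rules it applies are hypothetical and loosely follow the product example later in this article.

```python
def check_record(record):
    """Return a list of problems found in a hypothetical product record."""
    problems = []
    # Is the required data present?
    for key in ("id", "name"):
        if key not in record:
            problems.append(f"missing required field: {key}")
    # Is the data type correct? (bool is excluded because it subclasses int)
    if "id" in record:
        value = record["id"]
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append("id must be an integer")
    # Does the value have the form we need?
    price = record.get("price")
    if isinstance(price, (int, float)) and not isinstance(price, bool) and price <= 0:
        problems.append("price must be greater than 0")
    return problems
```

A schema expresses the same rules declaratively, so sender and receiver can share one description instead of each writing checks like these by hand.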
Keyword descriptions:
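Keyword | Description |
---|---|
$schema | States that this schema is written according to the draft v4 specification. |
title | Gives the schema a title. |
description | Describes the schema. |
type | Constrains the type of the JSON data; in the examples below, the first constraint is that the data must be a JSON object. |
properties | Defines the keys of an object and the schema for each value, such as its type and its minimum and maximum values. |
required | Lists the properties that must be present. |
minimum | Constrains a number: the smallest acceptable value. |
exclusiveMinimum | If present with the boolean value true, the instance is valid only if it is strictly greater than the value of minimum. |
maximum | Constrains a number: the largest acceptable value. |
exclusiveMaximum | If present with the boolean value true, the instance is valid only if it is strictly less than the value of maximum. |
multipleOf | A numeric instance is valid if dividing it by the value of this keyword yields an integer. |
maxLength | The maximum number of characters in a string instance. |
minLength | The minimum number of characters in a string instance. |
pattern | A string instance is valid if the regular expression matches it. |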
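As a rough illustration of how the numeric and string keywords in the table behave, here is a minimal Python sketch. The helper names `check_number` and `check_string` are hypothetical, and the code covers only the keywords listed above with draft v4 semantics (exclusiveMinimum and exclusiveMaximum as booleans); a real validator handles many more cases, including floating-point rounding in multipleOf.

```python
import re


def check_number(value, schema):
    """Check a number against draft v4 style range keywords."""
    if "minimum" in schema:
        low = schema["minimum"]
        # exclusiveMinimum is a boolean modifier of minimum in draft v4
        if schema.get("exclusiveMinimum", False):
            if not value > low:
                return False
        elif not value >= low:
            return False
    if "maximum" in schema:
        high = schema["maximum"]
        if schema.get("exclusiveMaximum", False):
            if not value < high:
                return False
        elif not value <= high:
            return False
    if "multipleOf" in schema:
        quotient = value / schema["multipleOf"]
        # valid only when the division result is an integer
        if quotient != int(quotient):
            return False
    return True


def check_string(value, schema):
    """Check a string against minLength, maxLength and pattern."""
    if "minLength" in schema and len(value) < schema["minLength"]:
        return False
    if "maxLength" in schema and len(value) > schema["maxLength"]:
        return False
    # pattern is not anchored: a match anywhere in the string counts
    if "pattern" in schema and re.search(schema["pattern"], value) is None:
        return False
    return True
```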
Basic Example
Here is a basic example of a JSON Schema:
{
"title": "Example Schema",
"type": "object",
"properties": {
"firstName": {
"type": "string"
},
"lastName": {
"type": "string"
},
"age": {
"description": "Age in years",
"type": "integer",
"minimum": 0
}
},
"required": ["firstName", "lastName"]
}
Walkthroughs
The two examples below are step-by-step guides into building a schema:
- a simple example which covers a classical product catalog description.
- a more advanced example, using JSON Schema to describe filesystem entries in a Unix-like /etc/fstab file.
The Space Telescope Science Institute has also published a guide aimed at schema authors.
Example data
Let’s pretend we’re interacting with a JSON based product catalog. This catalog has a product which has an id, a name, a price, and an optional set of tags.
Example JSON data for a product API
An example product in this API is:
{
"id": 1,
"name": "A green door",
"price": 12.50,
"tags": ["home", "green"]
}
While generally straightforward, that example leaves some open questions. For example, one may ask:
- What is id?
- Is name required?
- Can price be 0?
- Are all tags strings?
When you’re talking about a data format, you want to have metadata about what fields mean, and what valid inputs for those fields are. JSON Schema is a specification for standardizing how to answer those questions for JSON data.
Starting the schema
To start a schema definition, let’s begin with a basic JSON schema:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Product",
"description": "A product from Acme's catalog",
"type": "object"
}
The above schema has four properties called keywords. The title and description keywords are descriptive only, in that they do not add constraints to the data being validated. The intent of the schema is stated with these two keywords (that is, this schema describes a product).
The type keyword defines the first constraint on our JSON data: it has to be a JSON Object.
Finally, the $schema keyword states that this schema is written according to the draft v4 specification.
Defining the properties
Next let’s answer our previous questions about this API, starting with id.
What is id?
id is a numeric value that uniquely identifies a product. Since this is the canonical identifier for a product, it doesn’t make sense to have a product without one, so it is required.
In JSON Schema terms, we can update our schema to:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Product",
"description": "A product from Acme's catalog",
"type": "object",
"properties": {
"id": {
"description": "The unique identifier for a product",
"type": "integer"
}
},
"required": ["id"]
}
Is name required?
name is a string value that describes a product. Since there isn’t much to a product without a name, it also is required. Adding this gives us the schema:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Product",
"description": "A product from Acme's catalog",
"type": "object",
"properties": {
"id": {
"description": "The unique identifier for a product",
"type": "integer"
},
"name": {
"description": "Name of the product",
"type": "string"
}
},
"required": ["id", "name"]
}
Can price be 0?
According to Acme’s docs, there are no free products. In JSON Schema a number can have a minimum. By default this minimum is inclusive, so a minimum of 0 alone would still accept a price of 0; we also need to set exclusiveMinimum to true. Therefore we can update our schema with price:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Product",
"description": "A product from Acme's catalog",
"type": "object",
"properties": {
"id": {
"description": "The unique identifier for a product",
"type": "integer"
},
"name": {
"description": "Name of the product",
"type": "string"
},
"price": {
"type": "number",
"minimum": 0,
"exclusiveMinimum": true
}
},
"required": ["id", "name", "price"]
}
Are all tags strings?
Finally, we come to the tags property. Unlike the previous properties, tags has many values, and is represented as a JSON array. According to Acme’s docs, all tags must be strings, but you aren’t required to specify tags. We simply leave tags out of the list of required properties.
However, Acme’s docs add two constraints:
- there must be at least one tag,
- all tags must be unique.
The first constraint can be added with minItems, and the second one by specifying uniqueItems as being true:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Product",
"description": "A product from Acme's catalog",
"type": "object",
"properties": {
"id": {
"description": "The unique identifier for a product",
"type": "integer"
},
"name": {
"description": "Name of the product",
"type": "string"
},
"price": {
"type": "number",
"minimum": 0,
"exclusiveMinimum": true
},
"tags": {
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"uniqueItems": true
}
},
"required": ["id", "name", "price"]
}
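To make the array constraints concrete, here is a small Python sketch of what the tags subschema asks for. The function name `check_tags` is hypothetical, and the code hard-codes this one subschema rather than interpreting schemas in general.

```python
def check_tags(tags):
    """Mirror the tags subschema: an array of unique strings with at least one item."""
    if not isinstance(tags, list):        # "type": "array"
        return False
    if len(tags) < 1:                     # "minItems": 1
        return False
    if not all(isinstance(tag, str) for tag in tags):  # "items": {"type": "string"}
        return False
    return len(set(tags)) == len(tags)    # "uniqueItems": true
```

Because tags is not listed under required, a product without a tags property is still valid; these checks apply only when the property is present.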
Summary
The above example is by no means definitive of all the types of data JSON Schema can define. For more definitive information see the full standard draft.
As a final example, here’s a spec for an array of products, with the products having 2 new properties. The first is a dimensions property for the size of the product, and the second is a warehouseLocation field for where the warehouse that stores them is geographically located.
Also, since JSON Schema defines a reference schema for a geographic location, instead of coming up with our own, we’ll reference the canonical one.
Set of products:
[
{
"id": 2,
"name": "An ice sculpture",
"price": 12.50,
"tags": ["cold", "ice"],
"dimensions": {
"length": 7.0,
"width": 12.0,
"height": 9.5
},
"warehouseLocation": {
"latitude": -78.75,
"longitude": 20.4
}
},
{
"id": 3,
"name": "A blue mouse",
"price": 25.50,
"dimensions": {
"length": 3.1,
"width": 1.0,
"height": 1.0
},
"warehouseLocation": {
"latitude": 54.4,
"longitude": -32.7
}
}
]
Set of products schema:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Product set",
"type": "array",
"items": {
"title": "Product",
"type": "object",
"properties": {
"id": {
"description": "The unique identifier for a product",
"type": "number"
},
"name": {
"type": "string"
},
"price": {
"type": "number",
"minimum": 0,
"exclusiveMinimum": true
},
"tags": {
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"uniqueItems": true
},
"dimensions": {
"type": "object",
"properties": {
"length": {"type": "number"},
"width": {"type": "number"},
"height": {"type": "number"}
},
"required": ["length", "width", "height"]
},
"warehouseLocation": {
"description": "Coordinates of the warehouse with the product",
"$ref": "http://json-schema.org/geo"
}
},
"required": ["id", "name", "price"]
}
}
Purpose
This example shows a possible JSON representation of a hypothetical machine’s mount points as represented in an /etc/fstab file.
An entry in the fstab file can have many different forms. Here is a possible representation of a full fstab:
{
"/": {
"storage": {
"type": "disk",
"device": "/dev/sda1"
},
"fstype": "btrfs",
"readonly": true
},
"/var": {
"storage": {
"type": "disk",
"label": "8f3ba6f4-5c70-46ec-83af-0d5434953e5f"
},
"fstype": "ext4",
"options": [ "nosuid" ]
},
"/tmp": {
"storage": {
"type": "tmpfs",
"sizeInMB": 64
}
},
"/var/www": {
"storage": {
"type": "nfs",
"server": "my.nfs.server",
"remotePath": "/exports/mypath"
}
}
}
Not all constraints on an fstab file can be modeled using JSON Schema alone; however, it can already represent a good number of them. We will add constraints one after the other until we get to a satisfactory result.
Base schema
We will start with a base schema expressing the following constraints:
- the list of entries is a JSON object;
- the member names (or property names) of this object must all be valid, absolute paths;
- there must be an entry for the root filesystem (i.e., /).
We also want the schema to be regarded as a draft v4 schema, so we must specify $schema:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"/": {}
},
"patternProperties": {
"^(/[^/]+)+$": {}
},
"additionalProperties": false,
"required": [ "/" ]
}
Note how the valid paths constraint is enforced here:
- we have a properties keyword with only a / entry;
- we use patternProperties to match other property names via a regular expression (note that it does not match /);
- as additionalProperties is false, it constrains object properties to be either / or to match the regular expression.
You will notice that the regular expression is explicitly anchored (with ^ and $): in JSON Schema, regular expressions (in patternProperties and in pattern) are not anchored by default.
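The effect of anchoring, and the way properties, patternProperties and additionalProperties combine, can be sketched with Python's `re` module. The helper names below are hypothetical; `re.search` is used because it looks for a match anywhere in the string, which mirrors the unanchored behaviour described above.

```python
import re

# The same regular expression as in the base schema
PATH_PATTERN = r"^(/[^/]+)+$"


def allowed_property(name):
    """True if name is "/" (from properties) or matches the pattern (from patternProperties)."""
    return name == "/" or re.search(PATH_PATTERN, name) is not None


def check_fstab_keys(fstab):
    """Mirror "required": ["/"] together with "additionalProperties": false."""
    return "/" in fstab and all(allowed_property(key) for key in fstab)
```

Without the anchors, `re.search(r"(/[^/]+)+", "relative/path")` would find a match on the `/path` part, so a relative path would be accepted as a property name; with the anchors the whole name has to be an absolute path.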
For now, the schemas describing individual entries are empty: we will start describing the constraints in the following paragraphs, using another schema, which we will reference from the main schema when we are ready.
The entry schema - starting out
Here again we will proceed step by step. We will start with the global structure of our schema, which will be as such:
{
"id": "http://some.site.somewhere/entry-schema#",
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "schema for an fstab entry",
"type": "object",
"required": [ "storage" ],
"properties": {
"storage": {
"type": "object",
"oneOf": [
{ "$ref": "#/definitions/diskDevice" },
{ "$ref": "#/definitions/diskUUID" },
{ "$ref": "#/definitions/nfs" },
{ "$ref": "#/definitions/tmpfs" }
]
}
},
"definitions": {
"diskDevice": {},
"diskUUID": {},
"nfs": {},
"tmpfs": {}
}
}
You should already be familiar with some of the constraints:
- an fstab entry must be an object ("type": "object");
- it must have one property with name storage ("required": ["storage"]);
- the storage property must also be an object.
There are a couple of novelties:
- you will notice the appearance of JSON References, via the $ref keyword; here, all references used are local to the schema, and the fragment part is a URI encoded JSON Pointer;
- you will notice the appearance of an id: this is the URI of this resource; we assume here that this URI is the actual URI of this schema;
- the oneOf keyword is new in draft v4; its value is an array of schemas, and an instance is valid if and only if it is valid against exactly one of these schemas;
- finally, the definitions keyword is a standardized placeholder in which you can define inline subschemas to be used in a schema.
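As a rough sketch of two of these novelties, the Python below resolves a local reference such as `#/definitions/diskDevice` by walking a JSON Pointer, and expresses the "exactly one" rule of oneOf. The function names are hypothetical, only local references (starting with `#`) are handled, and the validators passed to `one_of` are plain callables rather than schemas.

```python
from urllib.parse import unquote


def resolve_local_ref(schema, ref):
    """Resolve a "#/a/b" style reference against the root of the schema document."""
    if not ref.startswith("#"):
        raise ValueError("only local references are handled in this sketch")
    pointer = unquote(ref[1:])  # the fragment is a URI-encoded JSON Pointer
    node = schema
    if pointer == "":
        return node
    for token in pointer.split("/")[1:]:
        # JSON Pointer escapes: ~1 stands for "/", ~0 stands for "~"
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def one_of(instance, validators):
    """oneOf: valid if and only if exactly one validator accepts the instance."""
    return sum(1 for accepts in validators if accepts(instance)) == 1
```

One consequence worth noting: while the four definitions are still empty, each alternative accepts any value, so an "exactly one" check matches four times and rejects every storage value until the definitions are filled in.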
The entry schema - adding constraints
The fstype, options and readonly properties
Let’s now extend this skeleton to add constraints to these three properties. Note that none of them are required:
- we will pretend that we only support ext3, ext4 and btrfs as filesystem types;
- options must be an array, and the items in this array must be strings; moreover, there must be at least one item, and all items should be unique;
- readonly must be a boolean.
With these added constraints, the schema now looks like this:
{
"id": "http://some.site.somewhere/entry-schema#",
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "schema for an fstab entry",
"type": "object",
"required": [ "storage" ],
"properties": {
"storage": {
"type": "object",
"oneOf": [
{ "$ref": "#/definitions/diskDevice" },
{ "$ref": "#/definitions/diskUUID" },
{ "$ref": "#/definitions/nfs" },
{ "$ref": "#/definitions/tmpfs" }
]
},
"fstype": {
"enum": [ "ext3", "ext4", "btrfs" ]
},
"options": {
"type": "array",
"minItems": 1,
"items": { "type": "string" },
"uniqueItems": true
},
"readonly": { "type": "boolean" }
},
"definitions": {
"diskDevice": {},
"diskUUID": {},
"nfs": {},
"tmpfs": {}
}
}
For now, all definitions are empty (an empty JSON Schema validates all instances). We will write schemas for individual definitions below, and fill these schemas into the entry schema.
The diskDevice storage type
This storage type has two required properties, type and device. The type can only be disk, and the device must be an absolute path starting with /dev. No other properties are allowed:
{
"properties": {
"type": { "enum": [ "disk" ] },
"device": {
"type": "string",
"pattern": "^/dev/[^/]+(/[^/]+)*$"
}
},
"required": [ "type", "device" ],
"additionalProperties": false
}
You will have noted that we need not specify that type must be a string: the constraint described by enum is enough.
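A hand-written Python equivalent of this definition may help show why enum alone is sufficient: comparing against the single allowed value already rules out every non-string. The function name `check_disk_device` is hypothetical and the code mirrors only this one definition.

```python
import re

DEVICE_PATTERN = r"^/dev/[^/]+(/[^/]+)*$"


def check_disk_device(storage):
    """Mirror the diskDevice definition for a storage object."""
    if set(storage) - {"type", "device"}:        # "additionalProperties": false
        return False
    if not {"type", "device"} <= set(storage):   # "required": ["type", "device"]
        return False
    if storage["type"] not in ["disk"]:          # "enum": ["disk"]
        return False
    device = storage["device"]
    return isinstance(device, str) and re.search(DEVICE_PATTERN, device) is not None
```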
The diskUUID storage type
This storage type has two required properties, type and label. The type can only be disk, and the label must be a valid UUID. No other properties are allowed:
{
"properties": {
"type": { "enum": [ "disk" ] },
"label": {
"type": "string",
"pattern": "^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
}
},
"required": [ "type", "label" ],
"additionalProperties": false
}
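The label pattern can be tried out directly with Python's `re` module. This is only a quick check of the regular expression from the definition above; the helper name `is_uuid_label` is hypothetical.

```python
import re

# The same pattern as in the diskUUID definition
UUID_PATTERN = (
    r"^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}"
    r"-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
)


def is_uuid_label(label):
    """True if label is a string in the 8-4-4-4-12 hexadecimal form."""
    return isinstance(label, str) and re.search(UUID_PATTERN, label) is not None
```

The label used for /var in the sample fstab passes this check, while a truncated value or one containing a non-hexadecimal character does not.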
The nfs storage type
This storage type has three required properties: type, server and remotePath. What is more, the server may be either a host name, an IPv4 address or an IPv6 address.
For the constraints on server, we use a new keyword: format. While it is not required that format be supported, we will suppose that it is here:
{
"properties": {
"type": { "enum": [ "nfs" ] },
"remotePath": {
"type": "string",
"pattern": "^(/[^/]+)+$"
},
"server": {
"type": "string",
"oneOf": [
{ "format": "hostname" },
{ "format": "ipv4" },
{ "format": "ipv6" }
]
}
},
"required": [ "type", "server", "remotePath" ],
"additionalProperties": false
}
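For the two IP address formats, Python's standard `ipaddress` module gives a feel for what a format check does. This is a sketch under assumptions: the helper names are hypothetical, the host name case is deliberately left out because its rules are more involved, and a real validator may apply slightly different rules for each format.

```python
import ipaddress


def matches_ipv4(value):
    """True if value is a string holding a dotted-quad IPv4 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def matches_ipv6(value):
    """True if value is a string holding an IPv6 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv6Address(value)
        return True
    except ValueError:
        return False
```

Since server sits under oneOf, a value is accepted only when exactly one of the listed formats matches it; a tool that ignores format treats every alternative as matching, which changes the outcome.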
The tmpfs storage type
This storage type has two required properties: type and sizeInMB. The size can only be an integer. What is more, we will require that the size be between 16 and 512, inclusive:
{
"properties": {
"type": { "enum": [ "tmpfs" ] },
"sizeInMB": {
"type": "integer",
"minimum": 16,
"maximum": 512
}
},
"required": [ "type", "sizeInMB" ],
"additionalProperties": false
}
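The tmpfs rules are simple enough to restate in a few lines of Python. The function name `check_tmpfs` is hypothetical; the sketch treats only Python `int` values as integers (so `64.0` and `True` are rejected), which is an assumption of this example rather than something the schema text settles.

```python
def check_tmpfs(storage):
    """Mirror the tmpfs definition for a storage object."""
    # required plus additionalProperties false: exactly these two keys
    if set(storage) != {"type", "sizeInMB"}:
        return False
    if storage["type"] not in ["tmpfs"]:   # "enum": ["tmpfs"]
        return False
    size = storage["sizeInMB"]
    if isinstance(size, bool) or not isinstance(size, int):  # "type": "integer"
        return False
    return 16 <= size <= 512               # minimum and maximum are inclusive
```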
The full entry schema
The resulting schema is quite large:
{
"id": "http://some.site.somewhere/entry-schema#",
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "schema for an fstab entry",
"type": "object",
"required": [ "storage" ],
"properties": {
"storage": {
"type": "object",
"oneOf": [
{ "$ref": "#/definitions/diskDevice" },
{ "$ref": "#/definitions/diskUUID" },
{ "$ref": "#/definitions/nfs" },
{ "$ref": "#/definitions/tmpfs" }
]
},
"fstype": {
"enum": [ "ext3", "ext4", "btrfs" ]
},
"options": {
"type": "array",
"minItems": 1,
"items": { "type": "string" },
"uniqueItems": true
},
"readonly": { "type": "boolean" }
},
"definitions": {
"diskDevice": {
"properties": {
"type": { "enum": [ "disk" ] },
"device": {
"type": "string",
"pattern": "^/dev/[^/]+(/[^/]+)*$"
}
},
"required": [ "type", "device" ],
"additionalProperties": false
},
"diskUUID": {
"properties": {
"type": { "enum": [ "disk" ] },
"label": {
"type": "string",
"pattern": "^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
}
},
"required": [ "type", "label" ],
"additionalProperties": false
},
"nfs": {
"properties": {
"type": { "enum": [ "nfs" ] },
"remotePath": {
"type": "string",
"pattern": "^(/[^/]+)+$"
},
"server": {
"type": "string",
"oneOf": [
{ "format": "hostname" },
{ "format": "ipv4" },
{ "format": "ipv6" }
]
}
},
"required": [ "type", "server", "remotePath" ],
"additionalProperties": false
},
"tmpfs": {
"properties": {
"type": { "enum": [ "tmpfs" ] },
"sizeInMB": {
"type": "integer",
"minimum": 16,
"maximum": 512
}
},
"required": [ "type", "sizeInMB" ],
"additionalProperties": false
}
}
}
Plugging this into our main schema
Now that all possible entries have been described, we can refer to the entry schema from our main schema. We will, again, use a JSON Reference here:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"/": { "$ref": "http://some.site.somewhere/entry-schema#" }
},
"patternProperties": {
"^(/[^/]+)+$": { "$ref": "http://some.site.somewhere/entry-schema#" }
},
"additionalProperties": false,
"required": [ "/" ]
}
Wrapping up
This example is much more advanced than the previous one: you will have learned about schema referencing and identification, and you will have been introduced to other keywords. There are also a few additional points to consider.
The schema can be improved
This is only an example for learning purposes. Some additional constraints could be described. For instance:
- it makes no sense for / to be mounted on a tmpfs filesystem;
- it makes no sense to specify the filesystem type if the storage is either NFS or tmpfs.
As an exercise, you can always try and add these constraints. It would probably require splitting the schema further.
Not all constraints can be expressed
JSON Schema limits itself to describing the structure of JSON data; it cannot express functional constraints.
If we take an NFS entry as an example, JSON Schema alone cannot check that the submitted NFS server’s hostname, or IP address, is actually correct: this check is left to applications.
Tools have varying JSON Schema support
While this is not a concern if you know that the schema you write will be used by you alone, you should keep this in mind if you write a schema which other people can potentially use. The schema we have written here has some features which can be problematic for portability:
- format support is optional, and as such other tools may ignore this keyword: this can lead to a different validation outcome for the same data;
- it uses regular expressions: care should be taken not to use any advanced features (such as lookarounds), since they may not be supported at the other end;
- it uses $schema to express the need to use draft v4 syntax, but not all tools support draft v4 (in fact, most don’t support it).