Conventions used in this book
Reference: https://json-schema.org/understanding-json-schema/conventions.html
Language-specific notes
The names of the basic types in JavaScript and JSON can be confusing when coming from another dynamic language. I’m a Python programmer by day, so I’ve notated here when the names for things are different from what they are in Python, and any other Python-specific advice for using JSON and JSON Schema. I’m by no means trying to create a Python bias to this book, but it is what I know, so I’ve started there. In the long run, I hope this book will be useful to programmers of all stripes, so if you’re interested in translating the Python references into Algol-68 or any other language you may know, pull requests are welcome!
The language-specific sections are shown with tabs for each language. Once you choose a language, that choice will be remembered as you read on from page to page.
For example, here’s a language-specific section with advice on using JSON in a few different languages:
Python
In Python, JSON can be read using the json module in the standard library.
Ruby
In Ruby, JSON can be read using the json gem.
C
For C, you may want to consider using Jansson to read and write JSON.
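As a minimal sketch of the Python advice above, here is how the standard-library json module reads and writes JSON. The sample record and its field names are made up for illustration.

```python
import json

# A small JSON document as text, such as the contents of a file or an HTTP response.
# The fields here are illustrative only.
text = '{"name": "George Washington", "tags": ["president", "general"], "age": null}'

# json.loads parses JSON text into ordinary Python objects.
record = json.loads(text)
name = record["name"]
first_tag = record["tags"][0]

# json.dumps goes the other way and serializes Python objects back to JSON text.
round_trip = json.dumps(record, sort_keys=True)

print(name, first_tag, round_trip)
```

For files, json.load and json.dump do the same job on open file objects instead of strings.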
Draft-specific notes
The JSON Schema standard has been through a number of revisions or “drafts”. The current version is Draft 2020-12, but some older drafts are still widely used as well.
The text is written to encourage the use of Draft 2020-12 and gives priority to the latest conventions and features, but where it differs from earlier drafts, those differences are highlighted in special call-outs. If you only wish to target Draft 2020-12, you can safely ignore those sections.
Examples
There are many examples throughout this book, and they all follow the same format. At the beginning of each example is a short JSON schema, illustrating a particular principle, followed by short JSON snippets that are either valid or invalid against that schema. Valid examples are in green, with a checkmark. Invalid examples are in red, with a cross. Often there are comments in between to explain why something is or isn’t valid.
Note:
These examples are tested automatically whenever the book is built, so hopefully they are not just helpful, but also correct!
For example, here’s a snippet illustrating how to use the number type:
{ "type": "number" }
Integers: 42, -1
Simple floating point number: 5.0
Exponential notation also works: 2.99792458e8
Numbers as strings are rejected: "42"
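To make the example concrete, here is a rough Python sketch of what the check for { "type": "number" } amounts to once the JSON has been parsed. The helper is_json_number is a hypothetical name of my own, not part of any library, and a real validator handles far more than this one keyword.

```python
import json

def is_json_number(value):
    """Hypothetical helper: True if a parsed JSON value is a JSON number."""
    # bool is a subclass of int in Python, so it has to be excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

# The same instances as in the example above, written as JSON text.
samples = ['42', '-1', '5.0', '2.99792458e8', '"42"']
results = {text: is_json_number(json.loads(text)) for text in samples}

for text, accepted in results.items():
    print(text, "valid" if accepted else "invalid")
```

Only the last sample, the string "42", is rejected.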
What is a schema?
Reference: https://json-schema.org/understanding-json-schema/about.html
If you’ve ever used XML Schema, RelaxNG or ASN.1 you probably already know what a schema is and you can happily skip along to the next section. If all that sounds like gobbledygook to you, you’ve come to the right place. To define what JSON Schema is, we should probably first define what JSON is.
JSON stands for “JavaScript Object Notation”, a simple data interchange format. It began as a notation for the World Wide Web. Since JavaScript exists in most web browsers, and JSON is based on JavaScript, it’s very easy to support there. However, it has proven useful enough and simple enough that it is now used in many other contexts that have nothing to do with web browsing.
At its heart, JSON is built on the following data structures:
- object:
{ "key1": "value1", "key2": "value2" }
- array:
[ "first", "second", "third" ]
- number:
42, 3.1415926
- string:
"This is a string"
- boolean:
true, false
- null:
null
These types have analogs in most programming languages, though they may go by different names.
Python
| JSON | Python |
|---|---|
| string | string [1] |
| number | int/float [2] |
| object | dict |
| array | list |
| boolean | bool |
| null | None |
Footnotes:
[1] Since JSON strings always support Unicode, they are analogous to unicode on Python 2.x and str on Python 3.x.
[2] JSON does not have separate types for integer and floating-point.
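The mapping can be checked directly in Python 3 by parsing one sample of each JSON type with the standard-library json module and looking at the resulting Python type. The labels in the dictionary below are my own and exist only for this sketch.

```python
import json

# One sample per JSON type, taken from the list of data structures above.
samples = {
    "object": '{ "key1": "value1", "key2": "value2" }',
    "array": '[ "first", "second", "third" ]',
    "number (integer)": '42',
    "number (fractional)": '3.1415926',
    "string": '"This is a string"',
    "boolean": 'true',
    "null": 'null',
}

# Parse each sample and record the name of the Python type that comes back.
python_types = {label: type(json.loads(text)).__name__ for label, text in samples.items()}

for label, type_name in python_types.items():
    print(f"{label:20} -> {type_name}")
```

Note that the single JSON number type comes back as either int or float, depending on how the number is written.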
With these simple data types, all kinds of structured data can be represented. With that great flexibility comes great responsibility, however, as the same concept could be represented in myriad ways. For example, you could imagine representing information about a person in JSON in different ways:
{
"name": "George Washington",
"birthday": "February 22, 1732",
"address": "Mount Vernon, Virginia, United States"
}
{
"first_name": "George",
"last_name": "Washington",
"birthday": "1732-02-22",
"address": {
"street_address": "3200 Mount Vernon Memorial Highway",
"city": "Mount Vernon",
"state": "Virginia",
"country": "United States"
}
}
Both representations are equally valid, though one is clearly more formal than the other. The design of a record will largely depend on its intended use within the application, so there’s no right or wrong answer here. However, when an application says “give me a JSON record for a person”, it’s important to know exactly how that record should be organized. For example, we need to know what fields are expected, and how the values are represented. That’s where JSON Schema comes in. The following JSON Schema fragment describes how the second example above is structured. Don’t worry too much about the details for now. They are explained in subsequent chapters.
{
"type": "object",
"properties": {
"first_name": { "type": "string" },
"last_name": { "type": "string" },
"birthday": { "type": "string", "format": "date" },
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" },
"country": { "type" : "string" }
}
}
}
}
“Validating” the first example against this schema shows that it fails:
{
"name": "George Washington",
"birthday": "February 22, 1732",
"address": "Mount Vernon, Virginia, United States"
}
However, the second example passes:
{
"first_name": "George",
"last_name": "Washington",
"birthday": "1732-02-22",
"address": {
"street_address": "3200 Mount Vernon Memorial Highway",
"city": "Mount Vernon",
"state": "Virginia",
"country": "United States"
}
}
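To give a feel for what a validator does with these two records, here is a deliberately tiny Python sketch. The function check is a hypothetical helper of my own that understands only "type" (for "object" and "string") and "properties"; it ignores "format" and everything else, so it is in no way a substitute for a real JSON Schema implementation.

```python
def check(instance, schema):
    """Hypothetical, heavily simplified checker, not a real validator."""
    expected = schema.get("type")
    if expected == "object" and not isinstance(instance, dict):
        return False
    if expected == "string" and not isinstance(instance, str):
        return False
    if isinstance(instance, dict):
        for key, subschema in schema.get("properties", {}).items():
            # Properties listed in the schema are optional here,
            # so only the ones actually present are checked.
            if key in instance and not check(instance[key], subschema):
                return False
    return True

schema = {
    "type": "object",
    "properties": {
        "first_name": {"type": "string"},
        "last_name": {"type": "string"},
        "birthday": {"type": "string", "format": "date"},
        "address": {
            "type": "object",
            "properties": {
                "street_address": {"type": "string"},
                "city": {"type": "string"},
                "state": {"type": "string"},
                "country": {"type": "string"},
            },
        },
    },
}

first = {
    "name": "George Washington",
    "birthday": "February 22, 1732",
    "address": "Mount Vernon, Virginia, United States",
}

second = {
    "first_name": "George",
    "last_name": "Washington",
    "birthday": "1732-02-22",
    "address": {
        "street_address": "3200 Mount Vernon Memorial Highway",
        "city": "Mount Vernon",
        "state": "Virginia",
        "country": "United States",
    },
}

print(check(first, schema), check(second, schema))
```

In this sketch the first record is rejected because its "address" is a string where the schema asks for an object, while the second record passes.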
You may have noticed that the JSON Schema itself is written in JSON. It is data itself, not a computer program. It’s just a declarative format for “describing the structure of other data”. This is both its strength and its weakness (which it shares with other similar schema languages). It is easy to concisely describe the surface structure of data, and automate validating data against it. However, since a JSON Schema can’t contain arbitrary code, there are certain constraints on the relationships between data elements that can’t be expressed. Any “validation tool” for a sufficiently complex data format, therefore, will likely have two phases of validation: one at the schema (or structural) level, and one at the semantic level. The latter check will likely need to be implemented using a more general-purpose programming language.
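The two phases might look something like the following Python sketch. The "death_date" field is hypothetical and does not appear in the examples above; I have added it only so there is a relationship between two data elements to check. The function names are likewise my own.

```python
from datetime import date

def structurally_valid(record):
    # Phase 1 (structural): both fields are present and are strings.
    # This is the kind of rule a schema can express.
    return all(isinstance(record.get(key), str) for key in ("birthday", "death_date"))

def semantically_valid(record):
    # Phase 2 (semantic): a relationship between two fields,
    # which needs ordinary code rather than a declarative schema.
    try:
        born = date.fromisoformat(record["birthday"])
        died = date.fromisoformat(record["death_date"])
    except ValueError:
        return False
    return born <= died

def validate(record):
    # Run the cheap structural check first, then the semantic one.
    return structurally_valid(record) and semantically_valid(record)

ok = validate({"birthday": "1732-02-22", "death_date": "1799-12-14"})
swapped = validate({"birthday": "1799-12-14", "death_date": "1732-02-22"})
wrong_shape = validate({"birthday": "1732-02-22", "death_date": 1799})

print(ok, swapped, wrong_shape)
```

The record with swapped dates has exactly the right shape, so only the second phase can catch it.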