JSONID
Description
Identification of JSON, YAML, and TOML document types.
Functionality
JSONID parses serialization/deserialization formats ("serde") such as JSON, YAML, and TOML to provide unambiguous identification. JSONID also introduces a declarative syntax for writing document type signatures to enable identification of specific serde document types. Key-value attributes can be shared across formats, and so signatures for JSON and YAML, for example, need only be written once.
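To illustrate why a signature only needs to be written once, here is a minimal Python sketch. It assumes a signature is a list of key-value markers applied to the parsed document rather than to the raw bytes. The `SIGNATURE` structure, its `key` and `is` fields, the `matches` helper, and the example keys are all hypothetical and are not JSONID's actual signature syntax.

```python
import json

# Hypothetical signature: every marker must hold for the parsed document.
# This is NOT JSONID's signature syntax, only an illustration of the idea.
SIGNATURE = [
    {"key": "schema", "is": "example/v1"},  # key must exist with this value
    {"key": "items"},                       # key must exist, any value
]


def matches(document, signature):
    """Return True if a parsed document satisfies every marker."""
    if not isinstance(document, dict):
        return False
    for marker in signature:
        key = marker["key"]
        if key not in document:
            return False
        if "is" in marker and document[key] != marker["is"]:
            return False
    return True


# A JSON document, parsed with the standard library.
from_json = json.loads('{"schema": "example/v1", "items": []}')

# The dict a YAML parser would be expected to return for the equivalent text:
#   schema: example/v1
#   items: []
from_yaml = {"schema": "example/v1", "items": []}

print(matches(from_json, SIGNATURE))            # True
print(matches(from_yaml, SIGNATURE))            # True
print(matches({"schema": "other"}, SIGNATURE))  # False
```

Because the markers are evaluated against key-value attributes after parsing, the same signature applies to whichever format carried the data.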
Registry
As a temporary measure, JSONID signatures are available in a registry. The long-term goal of this project is to enable other registries, e.g. PRONOM and Wikidata, to deliver JSONID-compatible signatures and so remove the need for a centralized resource like this.
Technical characteristics
JSONID explores potential technical characteristics that can be attributed to serde formats. An example for a basic JSON object might look as follows:
{
    "content_length": 82,
    "number_of_lines": 9,
    "line_warning": false,
    "top_level_keys_count": 1,
    "top_level_keys": [
        "key1"
    ],
    "top_level_types": [
        "map"
    ],
    "depth": 4,
    "heterogeneous_list_types": false,
    "fingerprint": {
        "unf": "UNF:6:YEKQWBGm75JsN6H+8SzYRg==",
        "cid": "bafkreidexnd3r76r5h3invwvu554573px5z4fg4uglw4pextmqc765kz64"
    },
    "doctype": "JSON",
    "encoding": "UTF-8",
    "agent": "jsonid/0.12.0 (ffdev-info)"
}
The use of these technical characteristics will be explored in the documentation and future writing.
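As a rough illustration of how some of these characteristics could be derived, the Python sketch below computes a subset of them for a JSON string. The definitions used here are assumptions and may differ from JSONID's own: content length is counted in UTF-8 bytes, depth counts one level per nested map or list, and the type labels other than `map` are invented for the example. The `analyse`, `depth`, and `type_name` helpers are hypothetical names.

```python
import json


def type_name(value):
    """Map a parsed JSON value to a coarse type label (labels are assumed)."""
    if isinstance(value, dict):
        return "map"
    if isinstance(value, list):
        return "list"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "bool"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    return "null"


def depth(value):
    """Nesting depth: scalars count 0, each map or list adds one level."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


def analyse(text):
    """Return a subset of characteristics for a JSON document."""
    data = json.loads(text)
    keys = list(data) if isinstance(data, dict) else []
    return {
        "content_length": len(text.encode("utf-8")),
        "number_of_lines": len(text.splitlines()),
        "top_level_keys_count": len(keys),
        "top_level_keys": keys,
        "top_level_types": [type_name(data[k]) for k in keys],
        "depth": depth(data),
    }


sample = '{"key1": {"key2": {"key3": [1, 2, 3]}}}'
print(json.dumps(analyse(sample), indent=2))
```

Under these assumed definitions the sample has one top-level key of type `map` and a depth of 4 (three nested maps plus one list).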
Universal fingerprint
JSONID exports two universal fingerprints enabling the assertion of equivalence between different data objects. Universal Numerical Fingerprint (UNF) is also used in the Dataverse project. Content Identifiers (CIDs) come from the IPFS project and enable content-addressed storage within that ecosystem and others.
The significance of these fingerprinting techniques is their application to identical data structures stored in different file formats.
The checksums of files in different formats will always differ, but when the files are analysed as data structures, we can begin to appraise the data beyond its presentation.
Fingerprinting example
JSON:
{
    "hello": "world",
    "goodbye": false,
    "values": [
        1,
        2,
        3.142
    ]
}
YAML:
goodbye: false
hello: world
values:
- 1
- 2
- 3.142
TOML:
hello = "world"
values = [1, 2, 3.142]
goodbye = false

| Type | JSON | YAML | TOML |
| Checksum (MD5) | bcd5a37f36ada2e4b72144d90a1427d5 | b5e75fdc100032f2744eff1e1bdf5b88 | 04b33e24e0cc208c6bd70fabaef3a9c5 |
| UNF | UNF:6:97EfAWBIQlObVCVwa7kc0g== | UNF:6:97EfAWBIQlObVCVwa7kc0g== | UNF:6:97EfAWBIQlObVCVwa7kc0g== |
| CID | bafkreiawsimwdn4blnb7scz2cfwtdksifrayccsl3z6gmxam6uxddctkoy | bafkreiawsimwdn4blnb7scz2cfwtdksifrayccsl3z6gmxam6uxddctkoy | bafkreiawsimwdn4blnb7scz2cfwtdksifrayccsl3z6gmxam6uxddctkoy |
PRONOM signature development
JSONID provides a high-level language for outputting PRONOM-compatible signatures. The feature set is still in its beta phase, but JSONID provides two distinct capabilities:
Registry output
JSONID's registry can be output using the `--pronom` flag. A signature file named `jsonid_pronom.xml` will be created, which can be imported into DROID to identify document types registered with JSONID.
JSONID's registry is output alongside a handful of baseline JSON signatures designed to capture "plain" JSON that is not yet encoded in the registry.
Signature development
A standalone `json2pronom` utility is provided for creating potentially robust DROID-compatible signatures.
Because it is a high-level language, signatures can be defined in an easy-to-understand syntax and then output consistently via the `json2pronom` utility. Signatures include sensible defaults for whitespace and other aspects that signature developers find difficult to anticipate consistently when writing JSON-based signatures.
See the JSONID docs for more information.
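To show why whitespace defaults matter for byte-level signatures, the Python sketch below builds a pattern for a single key-value pair that tolerates optional whitespace around the colon. It uses a regular expression purely as a stand-in: DROID signatures use their own byte-sequence syntax, and this is not the output or the method of `json2pronom`. The `key_value_pattern` helper is a hypothetical name.

```python
import json
import re


def key_value_pattern(key, value):
    """Compile a bytes regex for a JSON string key-value pair.

    Optional whitespace is allowed on both sides of the colon.
    """
    quoted_key = re.escape(json.dumps(key).encode("utf-8"))
    quoted_value = re.escape(json.dumps(value).encode("utf-8"))
    return re.compile(quoted_key + rb"\s*:\s*" + quoted_value)


pattern = key_value_pattern("hello", "world")

compact = b'{"hello":"world"}'
pretty = b'{\n    "hello" : "world"\n}'
other = b'{"hello": "there"}'

print(bool(pattern.search(compact)))  # True
print(bool(pattern.search(pretty)))   # True
print(bool(pattern.search(other)))    # False
```

A signature written against only the compact form would miss the pretty-printed file, which is the kind of inconsistency that built-in defaults are meant to avoid.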
User experiences
Development Activity
All development activity is visible on GitHub: https://github.com/ffdev-info/jsonid/commits
Release Feed
The last 3 releases:
- 2026-01-04 23:05:32: 0.12.2, by github-actions[bot]
- 2026-01-04 17:28:30: 0.12.1, by github-actions[bot]
- 2026-01-04 17:01:25: 0.12.0, by github-actions[bot]
Activity Feed
The last 5 commits:
- 2026-01-04 23:08:36: Convert from Alpha status to Beta (commit a689a1554e664f9ad66c6432814c58863c97083a), by ross-spencer https://github.com/ross-spencer
- 2026-01-04 23:00:35: Update analysis options (commit ab77f2ec8c63cda665d3ed13e2b0081f042cc13b), by ross-spencer https://github.com/ross-spencer
- 2026-01-04 17:29:52: Add agent to analysis output (commit 378a1c20249a48aff217c018843f64752fb8581d), by ross-spencer https://github.com/ross-spencer
- 2026-01-04 17:29:52: Provide analysis only entry-point (commit 32ee6c7cfa3a9d173b14d392ca486e5abb8dee2c), by ross-spencer https://github.com/ross-spencer
- 2026-01-04 17:29:52: Fix jsonid url (commit 28861bd148373edf94896e6918810e9757ae0cbd), by ross-spencer https://github.com/ross-spencer