Closer alignment between international and UK data structures

UK and international (US-based) teams have been discussing closer alignment of data structures so we can better share learning, documentation, tools etc.

This follows:

One proposed approach is to adopt all UK extensions and the recent international changes that switch from using enumerations to taxonomies. We would then use formally defined “application profiles” to represent: the Open Referral “classic” structure; the current Open Referral UK; and further application-specific profiles as needed over time.

Application profiles might also be used to make properties that are optional in the overall specification mandatory in particular situations.

We could also use the exercise to consider other minor, backwards-compatible enhancements that have been proposed.

Please give any comments you have in support of this approach or identifying downsides.

@Dominic has investigated techniques whereby different application profiles can be defined.

He is suggesting using a tool such as Jolt for JSON to JSON transformation of the tabular data package definition for the full data structure.

One Jolt transformation would exist for each application profile. Transformations would remove optional parts of the data structure that don’t form part of a profile and may change optional properties to being required.
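
To make the idea concrete, here is a minimal sketch of what one such per-profile transformation does, written in plain Python rather than as a Jolt spec. The `apply_profile` function, the profile format (table name mapped to field names and a "required" flag) and the toy package are all hypothetical, invented for illustration. The only thing assumed from the Frictionless specifications is the descriptor layout (`resources`, `schema`, `fields`, `constraints.required`).

```python
import copy


def apply_profile(package, profile):
    """Return a copy of `package` reduced to the tables and fields in `profile`.

    `profile` is a hypothetical format: {table name: {field name: must_be_required}}.
    Tables and fields not listed are dropped; fields flagged True become required.
    """
    result = copy.deepcopy(package)  # never mutate the full data package
    kept_resources = []
    for resource in result["resources"]:
        wanted = profile.get(resource["name"])
        if wanted is None:
            continue  # this table is not part of the profile
        kept_fields = []
        for field in resource["schema"]["fields"]:
            if field["name"] not in wanted:
                continue  # this field is not part of the profile
            if wanted[field["name"]]:
                # Tighten an optional property to required for this profile.
                field.setdefault("constraints", {})["required"] = True
            kept_fields.append(field)
        resource["schema"]["fields"] = kept_fields
        kept_resources.append(resource)
    result["resources"] = kept_resources
    return result


# Toy stand-in for the full tabular data package definition (not the real one).
full_package = {
    "name": "example-full",
    "resources": [
        {"name": "service", "schema": {"fields": [
            {"name": "id", "type": "string", "constraints": {"required": True}},
            {"name": "name", "type": "string"},
            {"name": "email", "type": "string"},
        ]}},
        {"name": "meta_table_description", "schema": {"fields": [
            {"name": "id", "type": "string"},
            {"name": "language", "type": "string"},
        ]}},
    ],
}

# Hypothetical profile: keep only `service`, with `id` and `name` both required.
example_profile = {"service": {"id": True, "name": True}}
profiled = apply_profile(full_package, example_profile)
```

This sketch ignores primary and foreign keys, so a real transformation would also need to check that a profile does not drop a field that another table refers to.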

In general, there are three core things required for any sort of application profile / extension / customisation of a standard:

  • Some way of describing the changes that the profile makes to the base schema
  • Some documentation of any constraints that can’t be encoded in schema, along with any guidance or useful information about how to use the profile.
  • Some way of bundling those together to make a coherent artefact that can be reasoned about, discussed, used, etc.

I hadn’t come across Jolt before - it looks interesting! It feels a bit heavy for this particular use, but that might just be my Java vs Python bias showing. Similar tools include json-merge-patch and python-json-patch. I’d prefer to use something built on one of the actual standards for diffing/patching JSON (JSON Patch and JSON Merge Patch). Some of the properties of JSON Table Schema make it quite hard to use those standards effectively, particularly around how you identify which object you’re working on at any given time. Ideally, a patch format would include some human-readable contextual information: I’d rather be talking about “removing the language column from the meta_table_description table” than “deleting /resources/21/schema/fields/3”.
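
As a rough illustration of the gap between those two ways of talking, the sketch below takes the human-readable form (a table name and a column name) and resolves it into a standard JSON Patch `remove` operation with a positional path. The `remove_field_op` helper and the toy package are hypothetical; only the `op`/`path` shape of the operation comes from the JSON Patch standard.

```python
def remove_field_op(package, table, field):
    """Build a JSON Patch 'remove' operation for a column identified by name.

    Hypothetical helper: it looks up the positions of the table and the field
    in the descriptor's arrays and encodes them as a JSON Pointer path.
    """
    for r_index, resource in enumerate(package["resources"]):
        if resource["name"] != table:
            continue
        for f_index, candidate in enumerate(resource["schema"]["fields"]):
            if candidate["name"] == field:
                return {
                    "op": "remove",
                    "path": f"/resources/{r_index}/schema/fields/{f_index}",
                }
    raise KeyError(f"field {field!r} not found in table {table!r}")


# Toy package with two tables; the real descriptor has many more.
package = {
    "resources": [
        {"name": "service", "schema": {"fields": [
            {"name": "id", "type": "string"},
            {"name": "name", "type": "string"},
        ]}},
        {"name": "meta_table_description", "schema": {"fields": [
            {"name": "id", "type": "string"},
            {"name": "language", "type": "string"},
        ]}},
    ],
}

op = remove_field_op(package, "meta_table_description", "language")
# In this toy package the path resolves to "/resources/1/schema/fields/1".
```

The indices in the path depend entirely on the order of the arrays, so the same named change yields a different patch whenever a table or column is added earlier in the descriptor. Keeping the names as the source of truth and generating the positional patch from them is one way around that.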

I think it’s worth us considering rolling our own tooling, and seeing if we can collaborate with the folk at Frictionless Data who have already created a Tabular diff format - there’s maybe something to build on there.

For some inspiration, OCDS have a template extension for their extension mechanism which bundles together all of the stuff I mention above. It uses JSON Merge Patch (JSON Merge Patch works much better with “regular” JSON Schema than JSON Table Schema) alongside some pre-defined containers for metadata and other components.
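
My understanding of why Merge Patch suits one and not the other is that it addresses object members by key but replaces arrays wholesale. The sketch below is a small stand-alone implementation of the merge algorithm, applied to two toy documents: one shaped like JSON Schema (properties keyed by name) and one shaped like a Table Schema (fields in an array). The documents are invented for illustration and are not taken from OCDS or Open Referral.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch to `target` and return the result."""
    if not isinstance(patch, dict):
        return patch  # scalars and arrays replace the target wholesale
    if not isinstance(target, dict):
        target = {}
    result = dict(target)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "delete this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# JSON Schema style: properties are keyed by name, so one can be removed directly.
json_schema_style = {
    "properties": {
        "id": {"type": "string"},
        "language": {"type": "string"},
    }
}
trimmed_schema = merge_patch(json_schema_style, {"properties": {"language": None}})

# Table Schema style: fields live in an array, so the patch has to restate
# every field that should survive in order to drop a single one.
table_schema_style = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "language", "type": "string"},
    ]
}
trimmed_table = merge_patch(
    table_schema_style,
    {"fields": [{"name": "id", "type": "string"}]},
)
```

In the first case the patch mentions only the property being removed. In the second, the patch is effectively a full copy of the field list, which gets harder to review and maintain as the base schema changes.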

Hi @robredpath, that’s a good, comprehensive solution you have suggested.

In the meantime, I have created a basic solution to this problem using Jolt, which I have illustrated on GitHub.

Using this approach, a “spec” would have to be created for each required view of the extended data package.

As requested by some, I’ve created a separate thread, “US and UK Alignment and version control”, for work just starting in this area.