Comments (7)
WIP.
To review:
- https://www.w3.org/wiki/HashVsSlash
- http://ukgovld.github.io/ukgovldwg/recommendations/uri-patterns.html
- https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/60975/designing-URI-sets-uk-public-sector.pdf
| Resource | Scope | URI pattern |
| --- | --- | --- |
| Data Cube | | gss-data.org.uk/data/{dataset-name} |
| Data Structure Definition | | gss-data.org.uk/data/{dataset-name}/structure |
| Component | | gss-data.org.uk/data/{dataset-name}/component/{component-name} |
| Dimension | Global | gss-data.org.uk/def/dimension/{dimension-name} |
| Dimension | Within-family | gss-data.org.uk/def/{family-name}/dimension/{dimension-name} |
| Dimension | Local | gss-data.org.uk/data/{dataset-name}/dimension/{dimension-name} |
| Measure | Global | gss-data.org.uk/def/measure/{measure-name} |
| Measure | Within-family | gss-data.org.uk/def/{family-name}/measure/{measure-name} |
| Measure | Local | gss-data.org.uk/data/{dataset-name}/measure/{measure-name} |
| Attribute | Global | gss-data.org.uk/def/attribute/{attribute-name} |
| Attribute | Within-family | gss-data.org.uk/def/{family-name}/attribute/{attribute-name} |
| Attribute | Local | gss-data.org.uk/data/{dataset-name}/attribute/{attribute-name} |
| Codelist | Global | gss-data.org.uk/def/codelist/{codelist-name} |
| Codelist | Within-family | gss-data.org.uk/def/{family-name}/codelist/{codelist-name} |
| Codelist | Local | gss-data.org.uk/data/{dataset-name}/codelist/{codelist-name} |
| Concept | Global | gss-data.org.uk/def/codelist/{codelist-name}/concept/{concept-name} |
| Concept | Within-family | gss-data.org.uk/def/{family-name}/codelist/{codelist-name}/concept/{concept-name} |
| Concept | Local | gss-data.org.uk/data/{dataset-name}/codelist/{codelist-name}/concept/{concept-name} |
| Observation | | gss-data.org.uk/data/{dataset-name}/obs/{concept-1}/{concept-2}/.../{concept-n} |
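
To make the proposed patterns easier to review, here is a minimal Python sketch that builds URIs for the three scopes (global, within-family, local). The function names and the example dataset, family and component names are hypothetical and are not part of csvcubed or csvwlib. The base is written without a scheme, as in the patterns above.

```python
from typing import Optional, Sequence

BASE = "gss-data.org.uk"  # written without a scheme, as in the proposed patterns

# Resource kinds that follow the global / within-family / local pattern.
COMPONENT_KINDS = {"dimension", "measure", "attribute", "codelist"}


def scope_prefix(dataset: Optional[str] = None, family: Optional[str] = None) -> str:
    """Return the URI prefix for a local, within-family or global resource."""
    if dataset is not None and family is not None:
        raise ValueError("a resource is either local to a dataset or within a family, not both")
    if dataset is not None:
        return f"{BASE}/data/{dataset}"  # local
    if family is not None:
        return f"{BASE}/def/{family}"  # within-family
    return f"{BASE}/def"  # global


def component_uri(kind: str, name: str, *, dataset: Optional[str] = None,
                  family: Optional[str] = None) -> str:
    """Build a dimension, measure, attribute or codelist URI."""
    if kind not in COMPONENT_KINDS:
        raise ValueError(f"unknown component kind: {kind}")
    return f"{scope_prefix(dataset, family)}/{kind}/{name}"


def concept_uri(codelist: str, concept: str, *, dataset: Optional[str] = None,
                family: Optional[str] = None) -> str:
    """Build a concept URI nested under its codelist."""
    base = component_uri("codelist", codelist, dataset=dataset, family=family)
    return f"{base}/concept/{concept}"


def observation_uri(dataset: str, concepts: Sequence[str]) -> str:
    """Build an observation URI from the ordered concept path segments."""
    return f"{BASE}/data/{dataset}/obs/" + "/".join(concepts)
```

For example, `component_uri("dimension", "period")` gives the global form, and passing `family="trade"` or `dataset="trade-in-goods"` switches to the within-family or local form.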
from csvcubed.
Will likely want API work to progress further before implementing any change here. csvwlib is progressing on the basis of the previous URI schemes.
Will move to icebox to revisit at a later date.
Also need to consider putting family and common/global identifiers under reference.data.gov.uk and how we can fit with what's already there.
For the purposes of csvwlib, is "family"-level components an accurate and suitably descriptive term? Our use of family descriptors might be confusing to people using our datasets. We've got contained (i.e. in Qube) components, externally controlled components (i.e. reference.data.gov.uk), and this third one where the publisher/user is assumed to know what they're doing.
Do we want to surface functionality to generate this third type of standardised URI in csvwlib? I know we need this internally, but I don't think that we should make it a part of the public API.
I dislike the notion of "family" being used for anything other than an internal ONS identifier to coincide with projects we're undertaking. I prefer the idea of theme-level URIs.
> third one where the publisher/user is assumed to know what they're doing.
I would like to be able to say to a department like BEIS or DfE that they should do some discovery on the sorts of dimensions/measures they're reporting on, and then advise them to coin some theme-level URIs like http://data.gov.uk/climate/dimension/greenhouse-gas for reuse across their publications. Essentially, teaching them to manage an ontology of components.
At the moment we do this outside of the info.json, using CSV files containing data about these components e.g. here. I don't know whether we would ask departments to do the same.
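
As a rough illustration of coining a theme-level URI like the one above from human-readable labels, here is a minimal sketch. The `slugify` rule (lowercase, with runs of non-alphanumeric characters collapsed to a hyphen) is an assumption made for this example and is not necessarily how csvcubed or csvwlib make identifiers URL-safe. Both function names are hypothetical.

```python
import re


def slugify(label: str) -> str:
    """Turn a label into a path segment (assumed rule: lowercase, hyphen-separated)."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower())
    return slug.strip("-")


def theme_component_uri(theme: str, kind: str, label: str,
                        base: str = "http://data.gov.uk") -> str:
    """Build a theme-level component URI such as {base}/{theme}/{kind}/{name}."""
    return f"{base}/{slugify(theme)}/{kind}/{slugify(label)}"
```

With these assumptions, `theme_component_uri("Climate", "dimension", "Greenhouse Gas")` reproduces the example URI in the comment above.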
I think that coining these "common" dimensions should be treated as a special service for government departments provided by us, and that effectively they should be treated as global. Eventually we could allow the departments to maintain the codelists themselves, but certainly not at first.
Whatever we call the dimensions and codelists that are used by more than one dataset, we do need some hierarchy and a way to group these things together, so we need to decide on URI components that reflect the hierarchical grouping.
Normally with URIs, we'd use the publishing organisation's DNS name as the top level of this hierarchy, but as we're expecting these particular dimensions and codelists to be shared and not necessarily owned or managed by a single organisation, but rather by a consortium of concerned orgs, we're proposing some top-levels of the hierarchy to reflect these groups/consortia.
The term "family" has been defined as "a group of datasets that share common dimensions & codelists", so it seems to fit quite well with this top level of the hierarchy. We could equally use another term with a similar definition that better reflects the notion that managing the dimensions & codelists under that grouping should be a shared responsibility.
If we're to use reference.data.gov.uk, we need some top level paths that don't clash with the existing time period paths (year, month, etc.).
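
A clash check along these lines could be a simple lookup on the first path segment. In the sketch below, the reserved set contains only the two paths named above (year and month). The real list on reference.data.gov.uk is longer and would need to be filled in from the existing service. The function name is hypothetical.

```python
# Only the paths named in this thread; the full reserved list is an open question.
RESERVED_TOP_LEVEL = frozenset({"year", "month"})


def clashes_with_reserved(path: str, reserved: frozenset = RESERVED_TOP_LEVEL) -> bool:
    """Return True if the first segment of a proposed path is already reserved."""
    first_segment = path.strip("/").split("/")[0].lower()
    return first_segment in reserved
```

Under these assumptions, a proposed top-level path such as `climate/dimension/greenhouse-gas` passes, while anything starting with `year/` or `month/` is flagged.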