Update data.json to comply with DCAT-US Metadata Schema #89
Comments
This is an upstream JKAN issue, although I haven't checked recently whether JKAN complies with the schema for its single data set. I believe some items just need to be output in a different field. Some items, like bureauCode IMU, can be ignored as that's specific to the US government. They used to have a different validator for non-federal orgs, but I can't find that option.
I reached out to the Data.gov folks. Many of the "mandatory" fields are only mandatory for federal agencies. They ran the validator against non-federal data sets and identified only one mandatory field that we don't have:
We would need to have a unique ID expressed in the "identifier" field.
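For anyone who wants to check this locally, here is a minimal sketch of a check for that field. It assumes the data.json has already been parsed into a dict with the DCAT-US top-level "dataset" array. The function name `identifier_problems` is hypothetical and this is not the Data.gov validator, just a quick sanity check for missing or repeated identifiers.

```python
from collections import Counter


def identifier_problems(catalog: dict) -> dict:
    """Report datasets with no "identifier" and identifiers used more than once.

    `catalog` is assumed to be a parsed data.json with a top-level
    "dataset" list, as in the DCAT-US schema.
    """
    datasets = catalog.get("dataset", [])

    # Titles of datasets whose identifier is absent or empty.
    missing = [d.get("title", "<untitled>") for d in datasets if not d.get("identifier")]

    # Identifiers that appear on more than one dataset.
    counts = Counter(d["identifier"] for d in datasets if d.get("identifier"))
    duplicates = sorted(ident for ident, n in counts.items() if n > 1)

    return {"missing": missing, "duplicates": duplicates}
```

Loading the file with `json.load` and passing the result to `identifier_problems` would list the entries that still need attention.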
It's a bit more complicated than I had hoped - but I understand why: "This field allows third parties to maintain a consistent record for datasets even if title or URLs are updated. Agencies may integrate an existing system for maintaining unique identifiers. Each identifier must be unique across the agency’s catalog and remain fixed. It is highly recommended that a URI (preferably an HTTP URL) be used to provide a globally unique identifier. Identifier URLs should be designed and maintained to persist indefinitely regardless of whether the URL of the resource itself changes." I see two options:
@BryanQuigley I received an annual check-in email from Data.gov asking for our harvest information, which brought me back to this. After editing the catalog over the past year, I have definitely seen several cases of a file name changing, as well as files being deleted. However, I still think it is the simplest approach, since the identifier could be generated in the data.json file without having to persist a unique ID and deal with all of the issues that come with that. So I propose we implement this using the file name as the unique ID.
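A rough sketch of what the file-name approach could look like, written in Python only for illustration (the real change would live in the data.json template). The base URL, the `/datasets/<slug>/` path pattern, and the function name `identifier_from_filename` are all assumptions, not something already in the repo.

```python
from pathlib import PurePosixPath

# Assumption: identifiers are built as HTTP URLs under the site root.
BASE_URL = "https://opendataphilly.org"


def identifier_from_filename(filename: str, base_url: str = BASE_URL) -> str:
    """Build an identifier URL from a dataset file name.

    The file name without its extension is used as the slug, so the
    identifier only stays stable as long as the file is not renamed.
    """
    slug = PurePosixPath(filename).stem
    # The "/datasets/<slug>/" pattern is a hypothetical URL layout.
    return f"{base_url.rstrip('/')}/datasets/{slug}/"
```

The trade-off is the one noted above: renaming or deleting a file changes or removes its identifier, but nothing extra has to be stored.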
The federal Data.gov catalog is able to harvest catalogs from local jurisdictions. They used to harvest data from OpenDataPhilly but have suspended this harvesting because our data.json file is not compliant with the current metadata schema.
Our data.json file says it's 1.1, but when we run the validator at https://catalog.data.gov/dcat-us/validator (using the data.json at https://opendataphilly.org/data.json), there are a lot of errors, including: