Refactor runCheck method in XMLDialect class to remove system metadata related arguments #463
Comments
runCheck method in MDQEngine class to remove system metadata related arguments
runCheck method in XMLDialect class to remove system metadata related arguments
@doulikecookiedough Take a look at the SOLR report that is stored for each run report -- you can tell that it includes some sysmeta fields with a command like this:
{
"originMemberNode": "urn:node:ARCTIC",
"rightsHolder": "http://orcid.org/0000-0003-1410-628X",
"groups": [],
"dateUploaded": "Jan 6, 2025, 9:04:29 PM",
"formatId": "https://eml.ecoinformatics.org/eml-2.2.0",
"obsoletes": "urn:uuid:28b531d7-61d2-412e-b3f4-8caf9c3d8ece",
"obsoletedBy": null,
"seriesId": null
}
In addition, sysmeta fields are also added directly into the SOLR record for each run, so that SOLR can be used to facet and group results. Here's an example SOLR query that shows the fields from sysmeta that are in the SOLR schema for the quality service (this only works if you first forward port 8983 to the cluster with
{
"responseHeader": {
"status": 0,
"QTime": 2,
"params": {
"q": "*:*",
"indent": "true",
"start": "0",
"q.op": "OR",
"wt-json": "",
"rows": "2"
}
},
"response": {
"numFound": 89543,
"start": 0,
"numFoundExact": true,
"docs": [
{
"metadataId": "doi:10.18739/A2KW57J9Q",
"formatId": "https://nceas.ucsb.edu/mdqe/v1",
"runId": "724bf66a-f5b3-425f-b968-25be0572733f",
"suiteId": "FAIR-suite-0.3.1",
"timestamp": "2022-01-11T01:14:37.701Z",
"checksPassed": 32,
"checksWarned": 9,
"checksFailed": 10,
"checksInfo": 0,
"checksErrored": 0,
"checkCount": 51,
"scoreOverall": 0.7619048,
"scoreByType_Interoperable_f": 0.78,
"scoreByType_Reusable_f": 0.64,
"scoreByType_Accessible_f": 0.62,
"scoreByType_Findable_f": 0.93,
"_version_": 1721618849384628224,
"rightsHolder": "CN=DBO,DC=dataone,DC=org",
"datasource": "urn:node:ARCTIC",
"dateUploaded": "2020-07-23T17:16:13Z",
"obsoletes": "doi:10.18739/A29S1KK5B",
"metadataFormatId": "https://eml.ecoinformatics.org/eml-2.2.0",
"group": [
"CN=DBO,DC=dataone,DC=org"
]
},
{
"metadataId": "doi:10.18739/A2KH0F05D",
"formatId": "https://nceas.ucsb.edu/mdqe/v1",
"runId": "fbae7d4b-9a10-4b6a-b265-229da29bd73c",
"suiteId": "FAIR-suite-0.3.1",
"timestamp": "2022-01-11T01:14:29.644Z",
"checksPassed": 20,
"checksWarned": 11,
"checksFailed": 20,
"checksInfo": 0,
"checksErrored": 0,
"checkCount": 51,
"scoreOverall": 0.5,
"scoreByType_Interoperable_f": 0.12,
"scoreByType_Reusable_f": 0.2,
"scoreByType_Accessible_f": 0.62,
"scoreByType_Findable_f": 0.86,
"_version_": 1721618843173912576,
"rightsHolder": "CN=DBO,DC=dataone,DC=org",
"datasource": "urn:node:ARCTIC",
"dateUploaded": "2020-07-17T22:06:05Z",
"obsoletes": "doi:10.18739/A2HH6C63Z",
"metadataFormatId": "eml://ecoinformatics.org/eml-2.1.1",
"group": [
"CN=DBO,DC=dataone,DC=org"
]
}
]
}
}
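To make it easier to see which parts of each SOLR document come from sysmeta, here is a minimal Python sketch that splits a document like the ones above into sysmeta-derived fields and run-result fields. The field list is taken from the example response; treating exactly those fields as the sysmeta subset is an assumption, and the helper names `split_sysmeta` and `sysmeta_from_response` are hypothetical, not part of the quality service.

```python
import json

# Fields that look sysmeta-derived in the example documents above.
# This list is an assumption, not the service's actual schema definition.
SYSMETA_FIELDS = (
    "rightsHolder",
    "datasource",
    "dateUploaded",
    "obsoletes",
    "metadataFormatId",
    "group",
)


def split_sysmeta(doc):
    """Return (sysmeta_fields, run_fields) for one SOLR document dict."""
    sysmeta = {k: doc[k] for k in SYSMETA_FIELDS if k in doc}
    run = {k: v for k, v in doc.items() if k not in SYSMETA_FIELDS}
    return sysmeta, run


def sysmeta_from_response(response_text):
    """Map each runId in a SOLR JSON response to its sysmeta-derived fields."""
    payload = json.loads(response_text)
    docs = payload.get("response", {}).get("docs", [])
    return {doc.get("runId"): split_sysmeta(doc)[0] for doc in docs}
```

Running `sysmeta_from_response` over a saved copy of the response shows, per run, which sysmeta values the index currently carries, which is the information needed to judge whether faceting and grouping depend on them.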
It appears that
Since the checks themselves moving forward will use (or be refactored to use)
To Do:
Check-in:
To Do:
Before runCheck is executed from runSuite, it appears that we also attach the system metadata to the XML results produced by the check. This should no longer be required unless the solr index uses it in some form.
Investigate, and then remove the system-metadata-related code if it is redundant.
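The shape of the proposed change can be sketched as follows. This is an illustrative Python sketch only: the project itself is Java, and every name here (`Result`, `Run`, `run_check`, `run_suite`, the status strings) is hypothetical and does not mirror the actual MDQEngine or XMLDialect signatures. It assumes sysmeta is still needed at the run level for the index, but not by individual checks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Result:
    """Outcome of a single check; carries no system metadata."""
    check_id: str
    status: str


@dataclass
class Run:
    """A suite run; sysmeta is attached once here, for indexing only."""
    results: List[Result] = field(default_factory=list)
    sysmeta: Optional[Dict[str, object]] = None


def run_check(check_id: str, check: Callable[[str], bool], document: str) -> Result:
    # Hypothetical refactored signature: no sysmeta-related arguments.
    return Result(check_id, "SUCCESS" if check(document) else "FAILURE")


def run_suite(
    checks: Dict[str, Callable[[str], bool]],
    document: str,
    sysmeta: Optional[Dict[str, object]] = None,
) -> Run:
    # Sysmeta is stored on the run, not passed down into each check.
    run = Run(sysmeta=sysmeta)
    for check_id, check in checks.items():
        run.results.append(run_check(check_id, check, document))
    return run
```

Keeping sysmeta on the run object rather than on each check result would let the index keep its facet fields while the check code no longer needs to know about them, provided the investigation confirms nothing else reads sysmeta from the per-check results.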