feat: Generate stores taxonomy from name-suggestion-index #9607
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff            @@
##            main    #9607    +/-   ##
========================================
  Coverage   49.54%   49.55%
========================================
  Files          67       67
  Lines       20650    20765   +115
  Branches     4980     4998    +18
========================================
+ Hits        10231    10290    +59
- Misses       9131     9185    +54
- Partials     1288     1290     +2
Quality Gate passed. Kudos, no new issues were introduced! 0 new issues.
@teolemon can you review this PR? I personally lack the context to understand it.
cc @raphodn
It looks really nice to me. I just requested a review from the two Raphaels from Open Prices, since this will potentially be really helpful for them.
What was your personal intent behind this PR?
Bigger picture, I want to link a particular food item to "where can I get it". Use cases:
The name-suggestion-index solves a lot of the hard work behind "what are the stores that sell food around the world?" and "what are their Wikidata IDs?". Given that it is used heavily by OpenStreetMap, you get a growing, maintained list of stores around the world. Other use cases:
Finally, there was a change of legal advice in the OSM community: scraping facts from websites is now considered fair game. If this project ever decided to go in a similar direction, having more Wikidata IDs against stores, plus using some of the many pre-built spiders in alltheplaces, could in some circumstances yield pricing and other product information as structured data.
Also: https://github.com/openfoodfacts/open-prices
Quality Gate passed.
@raphodn can you validate the PR? (If it fits Open Prices needs.)
We should be careful here about creating a taxonomy independent of OFF Prices. I would prefer to use Prices as the starting point.
Also integrate with the existing countries taxonomy.
I would move this to another taxonomy, store_brands, in order to distinguish it from actual individual stores.
We already have store data entered by the users. We should integrate that in some way.
I do not get where the data comes from.
So: OpenStreetMap and Wikipedia have found they have some intersecting interests. One of those is Wikidata, which gives stable identifiers for concepts like a brand or chain of stores. OpenStreetMap has a large number of contributors who survey things on the ground, but of course getting everyone to agree that a Carrefour is a Carrefour in exactly the same way is difficult. So, to help but not replace end-user judgement, the developers of a number of the editing tools made the Name Suggestion Index. Example of it in use by an editor: Most of the time, it's right, or an end user says it's not. So, what does it have to do with OpenFoodFacts?
So, we are left with:
Does that make sense?
That is a complex way of explaining it. Note that I am a contributor to OSM and Wikidata, and a taxonomy maintainer/developer for OFF, so I get all the details. I did not know this tool though, interesting.
I am all in favour of getting this taxonomy going. Check out the wiki link above. Integration with prices.openfoodfacts.org is key, however. We should introduce new taxonomies in small steps, one use case at a time. Indeed, using OSM and Wikidata will be central to this: OSM for actual shop locations with OSM identifiers, and Wikidata for shop brand information.
Fixes #7632 ?
What
This adds a snapshot of the BSD 3-Clause licensed name-suggestion-index records for shop/supermarket, transformed into a stores.txt taxonomy format.
There may be some mapping inconsistencies with countries.txt, as I used ISO 3166-1 two-letter codes and added 1 or 2 minor adjustments.
While I added this as a snapshot in git, it would be trivial to fetch the live data from upstream.
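For readers who want a feel for the transformation described above, here is a minimal sketch in Python. It is not the script from this PR. It assumes name-suggestion-index items carry `displayName`, `tags["brand:wikidata"]` and `locationSet.include`, and that a taxonomy entry is an `en:` line followed by property lines. The `country:en:` property name, the function names and the sample records (including the placeholder Wikidata IDs) are all hypothetical.

```python
# Hypothetical sample shaped like an NSI brands/shop/supermarket.json file.
# Store names and Wikidata IDs are placeholders, not real records.
SAMPLE = {
    "items": [
        {
            "displayName": "Sample Foods",
            "locationSet": {"include": ["001"]},  # "001" means worldwide in NSI
            "tags": {"brand:wikidata": "Q00000002", "shop": "supermarket"},
        },
        {
            "displayName": "Example Mart",
            "locationSet": {"include": ["fr", "es"]},
            "tags": {"brand:wikidata": "Q00000001", "shop": "supermarket"},
        },
    ]
}


def to_taxonomy_entry(item):
    """Render one NSI item as a block of taxonomy lines."""
    lines = [f"en: {item['displayName']}"]

    qid = item.get("tags", {}).get("brand:wikidata")
    if qid:
        lines.append(f"wikidata:en: {qid}")

    # Keep only plain two-letter country codes; skip regions such as "001"
    # and any non-string location entries.
    included = item.get("locationSet", {}).get("include", [])
    codes = sorted(
        c.upper()
        for c in included
        if isinstance(c, str) and len(c) == 2 and c.isalpha()
    )
    if codes:
        # Assumed property name; the real stores.txt may use another one.
        lines.append("country:en: " + ", ".join(codes))

    return "\n".join(lines)


def to_taxonomy(data):
    """Render all items, sorted by name, separated by blank lines."""
    items = sorted(data["items"], key=lambda i: i["displayName"].lower())
    return "\n\n".join(to_taxonomy_entry(i) for i in items) + "\n"


if __name__ == "__main__":
    print(to_taxonomy(SAMPLE), end="")
```

Fetching live data instead of a snapshot would only change where `data` comes from (for example, loading the upstream JSON with `json.load`); the rendering step would stay the same.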
Running/generating:
What should be done to make this better