Consider moving to a relational data model, like Postgres #8620
Comments
The POC repo is https://github.com/john-gom/openfoodfacts-data. It uses NestJS as the general framework, with Mikro-ORM for data modelling and migrations, and Postgraphile for GraphQL support. |
This issue has been open 90 days with no activity. Can you give it a little love by linking it to a parent issue, adding relevant labels and projects, creating a mockup if applicable, adding code pointers from https://github.com/openfoodfacts/openfoodfacts-server/blob/main/.github/labeler.yml, giving it a priority, editing the original issue to have a more comprehensive description… Thank you very much for your contribution to 🍊 Open Food Facts |
I've created a script to load all product "sto" files into Postgres. Branch is issues/8620-a. |
Some products contain \u0000 (the NUL character) in their data, which Postgres cannot store in text or jsonb values. The SQL to fix this was:
|
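The NUL problem described above can also be handled before the data reaches Postgres. The following is a minimal sketch, assuming each product is available as a nested Python structure of dicts, lists, and strings. `strip_nul` is a hypothetical helper, not part of the loader script, and this does not reproduce the SQL fix mentioned in the comment.

```python
from typing import Any

NUL = "\u0000"  # the character Postgres rejects in text and jsonb values


def strip_nul(value: Any) -> Any:
    """Recursively remove NUL characters from every string in a nested structure."""
    if isinstance(value, str):
        return value.replace(NUL, "")
    if isinstance(value, dict):
        # Keys are cleaned too, since they end up as JSON object keys.
        return {strip_nul(k): strip_nul(v) for k, v in value.items()}
    if isinstance(value, list):
        return [strip_nul(item) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value


# Hypothetical product record, for illustration only.
product = {"code": "123", "product_name": "Choco\u0000late", "tags": ["en:sna\u0000cks"]}
clean = strip_nul(product)
```

Cleaning at load time keeps the stored rows consistent, at the cost of silently dropping the character; whether that is acceptable depends on how the affected fields are used.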
An example database dump has been uploaded to https://static.openfoodfacts.org/data/pg/products.dmp and can be restored using pg_restore. |
That's a very interesting proposal. Although I still don't understand how OFF is organized, I definitely have the feeling that a regular relational model could bring many benefits, such as:
From what I heard during the March 2024 hackathon, there are some recursive relations within the data, but that's not a hindrance: most database management systems support recursive Common Table Expressions, which are the SQL way of expressing queries on recursive data. So thanks for your work, I'm eager to look at your Postgres data. |
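To illustrate the recursive CTE point, here is a small sketch using Python's built-in `sqlite3` module so it runs without a server; Postgres accepts the same `WITH RECURSIVE` syntax. The `taxonomy` table, its single `parent_id` column, and the sample entries are assumptions made for the example, not the actual OFF schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical parent/child table; real taxonomies may allow several parents.
    CREATE TABLE taxonomy (id TEXT PRIMARY KEY, parent_id TEXT REFERENCES taxonomy(id));
    INSERT INTO taxonomy VALUES
        ('en:foods', NULL),
        ('en:snacks', 'en:foods'),
        ('en:sweet-snacks', 'en:snacks'),
        ('en:beverages', NULL);
""")


def descendants(root: str) -> list[str]:
    """Return the root entry and everything below it, sorted by id."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM taxonomy WHERE id = ?          -- anchor: the starting entry
            UNION ALL
            SELECT t.id FROM taxonomy t
            JOIN subtree s ON t.parent_id = s.id          -- step: children of rows found so far
        )
        SELECT id FROM subtree ORDER BY id
        """,
        (root,),
    ).fetchall()
    return [row[0] for row in rows]
```

If an entry could have more than one parent, `UNION` rather than `UNION ALL` would avoid returning the same descendant twice.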
Problem
Currently the OFF data is in a lot of different places (taxonomy files, MongoDB, STO files) which makes it difficult to perform queries across the data sets.
Aggregated queries against MongoDB are also very slow, and the author feels these would be considerably faster against a relational model.
Proposed solution
Move to a relational model
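As a rough picture of what an aggregated query could look like in such a model, the sketch below counts products per category with a join and `GROUP BY`. It uses Python's `sqlite3` module as a stand-in for Postgres; the `product` and `product_category` tables and their contents are hypothetical, and the example shows only the query shape and says nothing about performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized tables: one row per product, one row per product/category pair.
    CREATE TABLE product (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE product_category (
        product_code TEXT REFERENCES product(code),
        category_id TEXT
    );
    INSERT INTO product VALUES ('001', 'Oat bar'), ('002', 'Cola'), ('003', 'Choc bar');
    INSERT INTO product_category VALUES
        ('001', 'en:snacks'), ('003', 'en:snacks'), ('002', 'en:beverages');
""")

# Count products per category, most populated category first.
counts = conn.execute("""
    SELECT pc.category_id, COUNT(*) AS product_count
    FROM product_category pc
    JOIN product p ON p.code = pc.product_code
    GROUP BY pc.category_id
    ORDER BY product_count DESC, pc.category_id
""").fetchall()
```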
Tasks
Part of