fix: improve parsing of 'category (type 1, type 2..)' ingredients (#10999)

PR to better handle ingredients like "vegetal oil (palm, rapeseed)":

- instead of turning "vegetal oil (palm, rapeseed)" into "palm vegetal
oil", "rapeseed vegetal oil", we now turn it into "vegetal oil (palm
vegetal oil, rapeseed vegetal oil)", because keeping the parent ingredient
gives better ingredient percent estimation
- improved the definition of all the variations of "huile et stéarine
végétales non hydrogénées (colza, palme)" to have better coverage
- added support for percentages like "huiles végétales 54% (colza,
palme)"
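
The rewrite described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's Perl implementation: the regular expression, the function name `expand_category_types` and the choice to keep the percentage on the parent are all assumptions made for illustration, and the "type + category" word order only suits English.

```python
import re

# Hypothetical pattern for "<category> [<percent>%] (<type 1>, <type 2>, ...)".
# This is an illustrative assumption, not the pattern used by the real parser.
CATEGORY_TYPES = re.compile(
    r"^\s*(?P<category>[^()%\d]+?)\s*"
    r"(?:(?P<percent>\d+(?:[.,]\d+)?)\s*%\s*)?"
    r"\(\s*(?P<types>[^()]+?)\s*\)\s*$"
)


def expand_category_types(text):
    """Rewrite 'category (type 1, type 2)' while keeping the parent category."""
    match = CATEGORY_TYPES.match(text)
    if match is None:
        return text  # not of the expected shape: leave the text unchanged

    category = match.group("category")
    types = [t.strip() for t in match.group("types").split(",") if t.strip()]

    # English word order only: "palm" + "vegetal oil" -> "palm vegetal oil".
    expanded = ", ".join(f"{t} {category}" for t in types)

    # Assumption: an optional percentage stays attached to the parent.
    percent = match.group("percent")
    head = f"{category} {percent}%" if percent else category
    return f"{head} ({expanded})"


print(expand_category_types("vegetal oil (palm, rapeseed)"))
# prints: vegetal oil (palm vegetal oil, rapeseed vegetal oil)
```

Because the parent "vegetal oil" stays in the output, a percent estimator can treat the typed oils as sub-ingredients of one parent instead of as unrelated top-level ingredients.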

Work in progress, some tests will need to be updated.

---------

Co-authored-by: Open Food Facts Bot <[email protected]>
Co-authored-by: Pierre Slamich <[email protected]>
Co-authored-by: Alex Garel <[email protected]>
Co-authored-by: Alex Garel <[email protected]>
5 people authored Dec 10, 2024
1 parent 30257c1 commit 42618ac
Showing 76 changed files with 666 additions and 412 deletions.
33 changes: 17 additions & 16 deletions .github/labeler.yml
@@ -505,21 +505,6 @@ Data import:
   - any-glob-to-any-file: 'cgi/generate_sample_import_file.pl'

-# https://openfoodfacts.github.io/openfoodfacts-server/dev/ref-perl-pod/ProductOpener/Ingredients.html
-🥗 Ingredients:
-- changed-files:
-  - any-glob-to-any-file: 'lib/ProductOpener/Ingredients.pm'
-  - any-glob-to-any-file: 'taxonomies/food/ingredients.txt'
-  - any-glob-to-any-file: 'tests/unit/ingredients.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_analysis.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_clean.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_nesting.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_parsing.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_parsing_todo.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_percent.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_processing.t'
-  - any-glob-to-any-file: 'tests/unit/ingredients_tags.t'
-  - any-glob-to-any-file: 'scripts/test_ingredient_parser.pl'

 # We want to improve the analysis of ingredient list to extract ingredients and their properties, across languages.
 # This is helpful to determine if a product is vegan, vegetarian, contains palm oil, is kosher/halal, the exact Nutri-Score, how much environmental impact it has…
 # https://wiki.openfoodfacts.org/Ingredients_Extraction_and_Analysis
@@ -538,7 +523,23 @@ Data import:
   - any-glob-to-any-file: 'scripts/extract_individual_ingredients.pl'
   - any-glob-to-any-file: 'scripts/aggregate_ingredients.pl'
   - any-glob-to-any-file: 'lib/ProductOpener/Ingredients.pm'
-
+  - any-glob-to-any-file: 'tests/unit/ingredients_parsing.t'
+  - any-glob-to-any-file: 'lib/ProductOpener/Ingredients.pm'
+  - any-glob-to-any-file: 'taxonomies/food/ingredients.txt'
+  - any-glob-to-any-file: 'tests/unit/ingredients.t'
+  - any-glob-to-any-file: 'tests/unit/ingredients_analysis.t'
+  - any-glob-to-any-file: 'tests/unit/ingredients_clean.t'
+  - any-glob-to-any-file: 'tests/unit/ingredients_nesting.t'
+  - any-glob-to-any-file: 'tests/unit/ingredients_parsing_todo.t'
+  - any-glob-to-any-file: 'tests/unit/ingredients_percent.t'
+  - any-glob-to-any-file: 'tests/unit/ingredients_processing.t'
+  - any-glob-to-any-file: 'tests/unit/ingredients_tags.t'
+  - any-glob-to-any-file: 'scripts/test_ingredient_parser.pl'
+  - any-glob-to-any-file: 'tests/unit/expected_test_results/ingredients/en-category-types.json'
+  - any-glob-to-any-file: 'tests/unit/expected_test_results/ingredients/fr-infinite-loop-allergens.json'
+  - any-glob-to-any-file: 'tests/unit/expected_test_results/ingredients/fr-marmelade.json'
+  - any-glob-to-any-file: 'tests/unit/expected_test_results/ingredients/fr-percents-origins-2.json'
+  - any-glob-to-any-file: 'tests/unit/expected_test_results/ingredients/ru-russian-oil.json'
 # Labels are all claims present on product packages.
 # https://wiki.openfoodfacts.org/Labels
 # Tracking issue: