Proposal: serve API product GET requests from an async server #10732
Labels: 🪶 Apache, API Refactor, infrastructure, 🚅 Performance, Perl
The performance issues we're currently experiencing led us to analyze which requests take the most processing time on the Apache server: https://docs.google.com/document/d/13rYXR0TxR2hUc0XEKzKcBT6ndcd5_L3yeP_L6UjZwzs/edit.
The analysis revealed that facet-related queries were the most costly.
We only have 50 Apache workers, so when most of them are busy waiting for MongoDB or off-query, we can't respond to basic `GET /api/v*/products/{code}` queries, which only require a disk access (to fetch the sto file) and a bit of RAM to get the translations. These requests account for 15% of all requests handled by Product Opener. This route is the API endpoint most used by our own mobile app and by reusers.
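To make the starvation effect concrete, here is a small simulation using only the Python standard library. It is not a model of Apache itself: the pool size, the sleep duration, and the function names are all illustrative assumptions. It shows that a request needing almost no work still waits as long as the slow requests that occupy every worker.

```python
import time
from concurrent.futures import ThreadPoolExecutor

WORKERS = 2         # stand-in for the 50 Apache workers (assumption: scaled down)
SLOW_SECONDS = 0.3  # stand-in for a facet query waiting on MongoDB (assumption)


def slow_facet_query() -> str:
    # The worker is blocked for the whole wait and cannot serve anything else.
    time.sleep(SLOW_SECONDS)
    return "facet"


def fast_product_read() -> str:
    # In practice this would be a quick disk read; here it returns immediately.
    return "product"


def product_latency_under_load() -> float:
    """Return how long a product read waits when every worker is busy."""
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        for _ in range(WORKERS):
            pool.submit(slow_facet_query)  # occupy every worker
        start = time.perf_counter()
        pool.submit(fast_product_read).result()  # queued behind the slow work
        return time.perf_counter() - start


if __name__ == "__main__":
    print(f"product read latency: {product_latency_under_load():.2f}s")
```

The product read does essentially nothing, yet its latency is roughly `SLOW_SECONDS`, because no worker is free to pick it up until a slow query finishes.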
My proposal would be to use a new asynchronous service (written in Python with FastAPI, for example) to handle read-only `GET /api/v*/products/{code}` requests. Having a distinct service that takes care of read-only API queries would make sure that our own app (or third-party apps) won't fail even if ProductOpener does. Asynchronicity means that:
The addition of knowledge panels could also be migrated to this new service later.
I think it's a better alternative than #8934, which, while faster (served directly by nginx), is more disk-hungry, won't be available for all products, and doesn't play nicely with the translation of taxonomized fields.
This could also be a first step to tackle #5170.
Write queries are not very common (0.25% of queries handled by Product Opener), and most of the complexity of the codebase comes from the data processing and score computation associated with write queries.
That's why I think it's better to keep POST queries out of the scope of this proposal for now.
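The read-only service described above could look roughly like the sketch below. It uses only the standard library's `asyncio` rather than FastAPI, to stay self-contained; with FastAPI, `get_product` would be the body of the route handler. Several things here are assumptions and not part of Product Opener: the product is read from a JSON file as a stand-in for the sto file, the one-file-per-code directory layout and the barcode are hypothetical, and the response shape is simplified.

```python
import asyncio
import json
import tempfile
from pathlib import Path


async def get_product(products_dir: Path, code: str) -> dict:
    """Read one product from disk without blocking the event loop."""
    if not code.isdigit():  # reject anything that is not a plain barcode
        return {"status": 0, "code": code}
    path = products_dir / f"{code}.json"  # hypothetical layout, JSON stand-in
    if not path.is_file():
        return {"status": 0, "code": code}
    # File I/O runs in a worker thread so other requests keep being served.
    text = await asyncio.to_thread(path.read_text, encoding="utf-8")
    return {"status": 1, "code": code, "product": json.loads(text)}


async def slow_facet_query(seconds: float) -> str:
    # Stand-in for a request that waits on MongoDB; awaiting yields control.
    await asyncio.sleep(seconds)
    return "facet"


async def demo() -> list[str]:
    """Run two slow queries and one product read concurrently."""
    order: list[str] = []
    with tempfile.TemporaryDirectory() as tmp:
        products_dir = Path(tmp)
        (products_dir / "1234567890123.json").write_text(
            json.dumps({"product_name": "Example"}), encoding="utf-8"
        )

        async def facet() -> None:
            await slow_facet_query(0.2)
            order.append("facet")

        async def product() -> None:
            await get_product(products_dir, "1234567890123")
            order.append("product")

        await asyncio.gather(facet(), facet(), product())
    return order


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Even though the two slow queries start first, the product read completes before them, since a pending wait does not hold a worker. This is the property the proposal relies on; it says nothing about how translations or knowledge panels would be handled.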
Limits
This service wouldn't account for the 53% of queries that are product HTML pages.
Serving these pages through this async service would be much more difficult, as it would mean migrating all the HTML logic there.