Measuring performance and latency is crucial to ensuring a timely, frictionless user experience: users should receive the expected benefit promptly and smoothly.
A request to the LLM may pass through several layers between the user and the model, such as a web frontend and network connections, so it is important to monitor and measure the delay at each stage. This evaluation, however, focuses specifically on the orchestrator.
Refer to the Enterprise RAG load test docs to learn more about load testing Enterprise RAG.
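
As a rough illustration of measuring delay per stage, the sketch below times a named stage with a context manager and summarizes the collected samples. It is a minimal example only, not the Enterprise RAG load test tooling: `StageTimer`, `fake_orchestrator`, and the `"orchestrator"` stage label are hypothetical names introduced here, and the sleep stands in for a real orchestrator call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import median


class StageTimer:
    """Collects wall-clock durations (in seconds) per named stage."""

    def __init__(self):
        self.samples = defaultdict(list)  # stage name -> list of durations

    @contextmanager
    def measure(self, stage):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the wrapped call raises.
            self.samples[stage].append(time.perf_counter() - start)

    def summary(self):
        return {
            stage: {
                "count": len(values),
                "median_s": median(values),
                "max_s": max(values),
            }
            for stage, values in self.samples.items()
        }


def fake_orchestrator(query):
    # Hypothetical stand-in for a real orchestrator request (assumption).
    time.sleep(0.01)
    return f"answer to: {query}"


timer = StageTimer()
for query in ["q1", "q2", "q3"]:
    with timer.measure("orchestrator"):
        fake_orchestrator(query)

print(timer.summary())
```

Wrapping other layers (for example, the frontend or a network hop) in their own `measure(...)` blocks would show how much each one contributes to the end-to-end delay.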