Latency and throughput in ML

What is the difference between latency and throughput in ML?

Latency is a common metric for comparing the performance of machine-learning models in a specific application. It refers to the time taken to process one unit of data, assuming only one unit is processed at a time.

We'll define latency as the time required to execute the pipeline once (e.g. 10 ms per item) and throughput as the number of items the pipeline processes per unit of time (e.g. 100 imgs/sec). In this particular example the two are reciprocals: a 10 ms latency yields a throughput of 100 imgs/sec when items are processed strictly one at a time. A machine-learning pipeline can be made faster along either axis: reducing latency speeds up each individual prediction, while techniques such as batching can raise throughput even when per-item latency stays the same or grows.
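The relationship between the two metrics can be sketched with a small timing experiment. The `predict` function below is a hypothetical stand-in for a real model's forward pass; the point is only to show how latency (seconds per item) and throughput (items per second) are measured, and why they are reciprocals under one-at-a-time processing.

```python
import time

def predict(x):
    # Hypothetical stand-in for a model's forward pass on one item.
    return x * x

def measure(items):
    start = time.perf_counter()
    results = [predict(x) for x in items]  # strictly one item at a time
    elapsed = time.perf_counter() - start
    latency = elapsed / len(items)         # seconds per item
    throughput = len(items) / elapsed      # items per second
    return latency, throughput

latency, throughput = measure(list(range(10_000)))
print(f"latency: {latency * 1e6:.2f} us/item, throughput: {throughput:,.0f} items/s")
```

Because each item is handled sequentially here, throughput is exactly 1/latency. A batched pipeline breaks that identity: processing 32 items together on a GPU might take only slightly longer than processing one, so throughput climbs while per-item latency barely moves.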
