Latency and throughput in ML
What is the difference between latency and throughput in ML?
Latency is a common metric for comparing the performance of machine learning models for a specific application. It refers to the time taken to process one unit of data, assuming only one unit is processed at a time.
We'll define latency as the time required to execute a pipeline once (e.g. 10 ms) and throughput as the number of items such a pipeline processes per unit time (e.g. 100 imgs/sec). In this particular example the two are reciprocals: a 10 ms latency corresponds to 100 imgs/sec. The two metrics can diverge, though: a pipeline can be made faster in throughput alone, for example by batching or processing items in parallel, without improving the latency of any single item.
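A minimal sketch of measuring both quantities, using a `time.sleep`-based stand-in for model inference (the 10 ms workload and the 4-worker pool are assumptions for illustration). It shows that sequential throughput is roughly the reciprocal of latency, while parallelism raises throughput without changing per-item latency:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def infer(x):
    """Placeholder for one inference step; the 10 ms sleep is an assumption."""
    time.sleep(0.01)
    return x

items = list(range(40))

# Latency: wall-clock time to process a single item.
t0 = time.perf_counter()
infer(items[0])
latency = time.perf_counter() - t0  # ~10 ms

# Sequential throughput: one item at a time, so throughput ~ 1 / latency.
t0 = time.perf_counter()
for x in items:
    infer(x)
seq_tput = len(items) / (time.perf_counter() - t0)  # ~100 items/sec

# Parallel throughput: 4 workers raise throughput roughly 4x while the
# latency of each individual item stays the same.
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(infer, items))
par_tput = len(items) / (time.perf_counter() - t0)

print(f"latency {latency*1e3:.1f} ms, "
      f"sequential {seq_tput:.0f} items/s, parallel {par_tput:.0f} items/s")
```

The gap between `seq_tput` and `par_tput` is the reason latency alone is not enough to characterize a serving pipeline.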