Examples of Precision, Recall, and F1 Score in Machine Learning
Precision measures the accuracy of positive predictions: the proportion of predicted positives that are actually positive, i.e. TP / (TP + FP). Recall gauges the model's ability to find all positive instances: the proportion of actual positives that are correctly identified, i.e. TP / (TP + FN). The F1 score combines precision and recall into a single metric by taking their harmonic mean, 2 × (precision × recall) / (precision + recall). Because the harmonic mean penalizes large gaps between the two, the F1 score gives a more balanced assessment of a binary classifier's performance, especially when the positive and negative classes are imbalanced.
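To make the definitions concrete, here is a minimal, self-contained Python sketch of the three formulas. The count values (`tp`, `fp`, `fn`) are hypothetical and chosen only for illustration:

```python
# A minimal sketch of all three metrics computed from raw counts.
# The counts (tp, fp, fn) are hypothetical, chosen only for illustration.

def precision(tp: int, fp: int) -> float:
    # Of everything predicted positive, the fraction that is truly positive.
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    # Of everything actually positive, the fraction the model found.
    return tp / (tp + fn)

def f1_score(p: float, r: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r)

tp, fp, fn = 90, 10, 10  # hypothetical confusion-matrix counts
p, r = precision(tp, fp), recall(tp, fn)
print(f"precision={p:.2f} recall={r:.2f} f1={f1_score(p, r):.2f}")
# precision=0.90 recall=0.90 f1=0.90
```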
A detailed article: https://towardsdatascience.com/a-look-at-precision-recall-and-f1-score-36b5fd0dd3ec
Here are three examples each for precision, recall, and F1 score:
**Precision Examples:**
1. **Spam Email Detection:** In email filtering, precision measures how many of the emails flagged as spam really are spam. If a filter flags 100 emails as spam, and 95 of them are genuinely spam (true positives) while the other 5 are legitimate messages marked by mistake (false positives), the precision is 95% (95/100).
2. **Medical Diagnosis:** In medical testing, precision assesses how many positive test results are correct. If a diagnostic test returns 100 positive results, of which 90 come from patients who actually have the disease (true positives) and 10 from healthy individuals (false positives), the precision is 90% (90/100).
3. **Search Engine Results:** When a search engine returns results, precision measures the relevance of the displayed links. If, out of 20 search results, 15 are genuinely relevant to the query (true positives) and 5 are unrelated (false positives), the precision is 75% (15/20). The sketch below reproduces all three calculations.
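A short Python sketch can verify the three precision figures above (the dictionary keys are just labels for the scenarios described in this list):

```python
# Recomputing the three precision figures from the examples above.
# Each entry is (true positives, false positives).
examples = {
    "spam filter": (95, 5),
    "medical test": (90, 10),
    "search results": (15, 5),
}
for name, (tp, fp) in examples.items():
    print(f"{name}: precision = {tp / (tp + fp):.0%}")
# spam filter: precision = 95%
# medical test: precision = 90%
# search results: precision = 75%
```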
**Recall Examples:**
1. **Information Retrieval:** In a search engine, recall assesses the ability to retrieve all relevant documents. If there are 100 relevant documents, and the search engine returns 90 of them (true positives) but misses 10 (false negatives), the recall is 90% (90/100).
2. **Fault Detection:** In quality control, recall measures the ability to identify all faulty products. If a factory produces 100 faulty items, and a detection system correctly identifies 80 of them (true positives) but misses 20 (false negatives), the recall is 80% (80/100).
3. **Document Classification:** In text categorization, recall evaluates the capacity to find all instances of a specific category. If there are 50 articles related to technology, and a classification model correctly labels 45 of them (true positives) but misses 5 (false negatives), the recall is 90% (45/50). The sketch below checks all three of these numbers.
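The same pattern checks the recall figures; the counts below are taken directly from the three examples above:

```python
# Recomputing the three recall figures from the examples above.
# Each entry is (true positives, false negatives).
examples = {
    "search engine": (90, 10),
    "fault detection": (80, 20),
    "document classification": (45, 5),
}
for name, (tp, fn) in examples.items():
    print(f"{name}: recall = {tp / (tp + fn):.0%}")
# search engine: recall = 90%
# fault detection: recall = 80%
# document classification: recall = 90%
```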
**F1 Score Examples:**
1. **Spam Filter Evaluation:** For a spam filter, if precision is 98% and recall is 90%, the F1 score combines both metrics, resulting in a balanced score of approximately 94%.
2. **Medical Test Assessment:** In a medical test, if precision is 92% and recall is 85%, the F1 score, which considers both metrics, yields an overall performance score of around 88%.
3. **Content Recommendation:** When recommending products to users, if precision is 80% (80% of recommended products are relevant) and recall is 70% (70% of all relevant products are recommended), the F1 score provides a balanced assessment: 2 × 0.80 × 0.70 / (0.80 + 0.70) ≈ 0.747, or approximately 75%. The sketch below recomputes all three F1 scores.
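A final sketch recomputes the three F1 scores from the stated precision and recall values, confirming the approximate figures above:

```python
# Recomputing the three F1 scores from the stated precision/recall pairs.
def f1(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

for name, p, r in [("spam filter", 0.98, 0.90),
                   ("medical test", 0.92, 0.85),
                   ("recommender", 0.80, 0.70)]:
    print(f"{name}: F1 = {f1(p, r):.1%}")
# spam filter: F1 = 93.8%
# medical test: F1 = 88.4%
# recommender: F1 = 74.7%
```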