---
title: Multi Label Precision Recall Accuracy Fscore
tags:
- evaluate
- metric
description: >-
  Implementation of example based evaluation metrics for multi-label
  classification presented in Zhang and Zhou (2014).
sdk: gradio
sdk_version: 4.4.0
app_file: app.py
pinned: false
---

# Metric Card for Multi Label Precision Recall Accuracy Fscore
Implementation of example based evaluation metrics for multi-label classification presented in Zhang and Zhou (2014).

## How to Use

    >>> multi_label_precision_recall_accuracy_fscore = evaluate.load("mdocekal/multi_label_precision_recall_accuracy_fscore")
    >>> results = multi_label_precision_recall_accuracy_fscore.compute(
                predictions=[
                    ["0", "1"],
                    ["1", "2"],
                    ["0", "1", "2"],
                ],
                references=[
                    ["0", "1"],
                    ["1", "2"],
                    ["0", "1", "2"],
                ]
            )
    >>> print(results)
    {
        "precision": 1.0,
        "recall": 1.0,
        "accuracy": 1.0,
        "fscore": 1.0
    }
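The default configuration follows the example-based definitions from Zhang and Zhou (2014): for each sample with predicted label set P and reference label set R, precision is |P ∩ R| / |P|, recall is |P ∩ R| / |R|, accuracy is |P ∩ R| / |P ∪ R|, and F-score is 2|P ∩ R| / (|P| + |R|); the per-sample values are then averaged. The sketch below illustrates these definitions; it is not the metric's actual source code, and it only handles the both-empty edge case described later in this card.

```python
def example_based_scores(predictions, references):
    """Illustrative sketch of example-based multi-label metrics
    (Zhang and Zhou, 2014), averaged over samples."""
    n = len(predictions)
    totals = {"precision": 0.0, "recall": 0.0, "accuracy": 0.0, "fscore": 0.0}
    for pred, ref in zip(predictions, references):
        p, r = set(pred), set(ref)
        if not p and not r:
            # Convention from this metric card: an empty prediction
            # paired with an empty reference scores 1.0 everywhere.
            for key in totals:
                totals[key] += 1.0
            continue
        inter = len(p & r)
        totals["precision"] += inter / len(p)
        totals["recall"] += inter / len(r)
        totals["accuracy"] += inter / len(p | r)
        totals["fscore"] += 2 * inter / (len(p) + len(r))
    return {k: v / n for k, v in totals.items()}
```

Running this sketch on the example above (identical predictions and references) yields 1.0 for all four metrics, matching the output shown.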

There is also a multiset configuration available, which computes the metrics for multi-label classification with repeated labels. It uses the same definitions as the previous case but operates on multisets of labels, so multiset intersection, union, and cardinality are used instead.
    
    >>> multi_label_precision_recall_accuracy_fscore = evaluate.load("mdocekal/multi_label_precision_recall_accuracy_fscore", config_name="multiset")
    >>> results = multi_label_precision_recall_accuracy_fscore.compute(
                predictions=[
                    [0, 1, 1]
                ],
                references=[
                    [1, 0, 1, 1, 0, 0],
                ]
            )
    >>> print(results)
    {
        "precision": 1.0,
        "recall": 0.5,
        "accuracy": 0.5,
        "fscore": 0.6666666666666666
    }
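In the multiset case, repeated labels count with their multiplicities. A hedged sketch (again, not the metric's actual source) using `collections.Counter`, whose `&` and `|` operators implement multiset intersection and union:

```python
from collections import Counter


def multiset_scores(predictions, references):
    """Illustrative sketch of the multiset variant: intersection,
    union, and cardinality are taken over multisets of labels."""
    n = len(predictions)
    totals = {"precision": 0.0, "recall": 0.0, "accuracy": 0.0, "fscore": 0.0}
    for pred, ref in zip(predictions, references):
        p, r = Counter(pred), Counter(ref)
        inter = sum((p & r).values())  # multiset intersection cardinality
        union = sum((p | r).values())  # multiset union cardinality
        totals["precision"] += inter / sum(p.values())
        totals["recall"] += inter / sum(r.values())
        totals["accuracy"] += inter / union
        totals["fscore"] += 2 * inter / (sum(p.values()) + sum(r.values()))
    return {k: v / n for k, v in totals.items()}
```

For the example above, the prediction `[0, 1, 1]` and reference `[1, 0, 1, 1, 0, 0]` intersect in the multiset `{0: 1, 1: 2}` (cardinality 3), giving precision 3/3 = 1.0, recall 3/6 = 0.5, accuracy 3/6 = 0.5, and F-score 6/9 ≈ 0.667, which matches the output shown.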

### Inputs
- **predictions** *(list[list[Union[int, str]]])*: list of predictions to score. Each prediction should be a list of predicted labels.
- **references** *(list[list[Union[int, str]]])*: list of references, one per prediction. Each reference should be a list of reference labels.


### Output Values

This metric outputs a dictionary containing:
- precision
- recall
- accuracy
- fscore


If both the prediction and the reference are empty lists, the evaluation for the given sample will be:
```python
{
    "precision": 1.0,
    "recall": 1.0,
    "accuracy": 1.0,
    "fscore": 1.0
}
```

## Citation

```bibtex
@article{Zhang2014ARO,
  title={A Review on Multi-Label Learning Algorithms},
  author={Min-Ling Zhang and Zhi-Hua Zhou},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2014},
  volume={26},
  pages={1819-1837},
  url={https://api.semanticscholar.org/CorpusID:1008003}
}
```