Evaluation metrics are quantitative measures of a machine learning model's performance. They are essential for determining whether our model performs well or poorly on a specific task.

**METEOR** **(Metric for Evaluation of Translation with Explicit Ordering)** is a metric used to measure the quality of candidate text based on its alignment with a reference text. It matches unigrams between the candidate and the reference, and it rewards candidates that preserve the reference's word order.

Following are the steps to calculate the METEOR score:

1. Calculate the unigram precision and recall.

2. Compute the weighted F-score.

3. Compute the chunk penalty.

4. Calculate the METEOR score.

We calculate the unigram precision as the ratio between the overlapping unigrams between the candidate and reference summary and the total number of unigrams in the candidate summary.

The unigram recall is calculated as the ratio between the overlapping unigrams between the candidate and reference summary and the total number of unigrams in the reference summary.
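As a minimal sketch of these two definitions (using hypothetical token lists and exact word matches only — real METEOR also matches stems and synonyms), the precision and recall can be computed as:

```python
# Hypothetical tokenized summaries, for illustration only
candidate = ["the", "cat", "sat", "on", "a", "mat"]
reference = ["the", "cat", "sat", "on", "the", "mat"]

# Overlapping unigrams: candidate tokens that also appear in the reference
overlap = sum(1 for token in candidate if token in reference)

precision = overlap / len(candidate)  # overlap / total candidate unigrams
recall = overlap / len(reference)     # overlap / total reference unigrams

print(precision, recall)
```

Here five of the six candidate tokens appear in the reference, so both precision and recall come out to 5/6.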

After calculating the unigram precision and recall, we compute the weighted F-score by taking their weighted harmonic mean, with recall weighted higher than precision:

$F = \dfrac{P \cdot R}{\alpha \cdot P + (1 - \alpha) \cdot R}$

where,

P: Unigram precision

R: Unigram recall

$\alpha$ : The relative weight assigned to precision and recall.

Note: Recall is weighted higher than precision so that the candidate summary is rewarded for covering the meaning of the reference rather than merely maximizing word-to-word matches.
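Continuing the sketch, the weighted F-score for the hypothetical precision and recall values above can be computed as follows (assuming $\alpha = 0.9$, which is NLTK's default and weights recall higher):

```python
# Weighted harmonic mean of precision and recall
alpha = 0.9                        # NLTK's default weighting (favors recall)
precision, recall = 5 / 6, 5 / 6   # hypothetical values, carried over

f_score = (precision * recall) / (alpha * precision + (1 - alpha) * recall)
print(f_score)
```

When precision and recall are equal, the weighted F-score equals them both, so here it is again 5/6.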

A **chunk** is a set of consecutive matched words that appear in the same order in both the candidate and the reference. The fewer the chunks, the closer the candidate's word order is to the reference. The chunk penalty is computed as:

$\text{Penalty} = \gamma \cdot \left(\dfrac{c}{u_m}\right)^{\beta}$

Where,

$c$: Number of chunks

$u_m$: Number of matched unigrams

$\gamma, \beta$: Parameters controlling the strength of the penalty (commonly $\gamma = 0.5$ and $\beta = 3$)
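One rough way to count chunks is sketched below. This is a greedy, exact-match simplification and not METEOR's full alignment algorithm; the parameter values are common defaults, assumed here for illustration:

```python
def count_chunks(candidate, reference):
    """Greedily count maximal runs of candidate tokens that also appear
    consecutively in the reference (a simplification of METEOR's alignment)."""
    chunks = 0
    i = 0
    while i < len(candidate):
        if candidate[i] in reference:
            j = reference.index(candidate[i])
            chunks += 1
            # Extend the current run while the next tokens match consecutively
            while (i + 1 < len(candidate) and j + 1 < len(reference)
                   and candidate[i + 1] == reference[j + 1]):
                i += 1
                j += 1
        i += 1
    return chunks

# An identical candidate aligns as one single chunk
c = count_chunks(["a", "b", "c"], ["a", "b", "c"])  # 1
m = 3  # matched unigrams

# Chunk penalty with assumed defaults gamma = 0.5, beta = 3
penalty = 0.5 * (c / m) ** 3
print(c, penalty)
```

With a single chunk over three matched unigrams, the penalty is small (0.5 / 27 ≈ 0.0185); a candidate with scrambled word order would split into more chunks and be penalized more heavily.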

**Question:** What would be the number of chunks if the candidate summary is exactly identical to the reference summary?

After computing the F-score and chunk penalty, we are now ready to calculate the METEOR score:

$\text{METEOR} = F \cdot (1 - \text{Penalty})$

METEOR scores are given on a scale of 0 to 1, with higher values indicating greater similarity between the candidate and the reference summary.

Now, let’s see how to calculate the METEOR score using Python.

```python
import nltk
nltk.download('wordnet')

reference_summary = [['Machine', 'learning', 'is', 'a', 'subset', 'of', 'artificial', 'intelligence']]
candidate_summary = ['Machine', 'learning', 'is', 'seen', 'as', 'a', 'subset', 'of', 'artificial', 'intelligence']

METEORscore = nltk.translate.meteor_score.meteor_score(reference_summary, candidate_summary)
print(METEORscore)
```

Calculating METEOR score

Let’s walk through the code above.

**Line 1:** We import the `nltk` library, which is widely used in the field of NLP.

**Line 2:** We download the `wordnet` corpus from the `nltk` library.

**Line 4:** We define a list named `reference_summary` and set “Machine learning is a subset of artificial intelligence” as the reference summary. Note that it is a list of token lists, since `meteor_score()` accepts multiple references.

**Line 5:** We define a `candidate_summary` variable and set its value to “Machine learning is seen as a subset of artificial intelligence.”

**Line 7:** We use the `meteor_score()` function from the `nltk.translate.meteor_score` module to calculate the METEOR score.

**Line 8:** We print the METEOR score for the provided candidate summary.

Copyright ©2024 Educative, Inc. All rights reserved
