The BERTScore metric.
BERTScore is a metric for evaluating generated text. It measures the similarity between embeddings of the generated text and of the reference texts, as computed by the BERT model.
Configuration
Configuration parameters of the BERTScore metric:
title: "BertScoreConfig"
type: "object"
properties:
  variety:
    type: "string"
    description: "What variety of score to calculate (e.g. f_measure)"
    enum:
      - "f_measure"
      - "precision"
      - "recall"
  language:
    type: "string"
    pattern: "^[a-z]{3}$"
    description: >
      Three-letter ISO 639-3 language code
      (format: https://en.wikipedia.org/wiki/ISO_639-3).
      For example, English is 'eng'.
  model:
    type: "string"
    pattern: "^[a-z0-9-/]+$"
    description: "Model name."
  num_layers:
    type: "integer"
    minimum: 1
    description: >
      Use the Nth layer in the model (e.g. 8).
      Must be between 1 and the number of layers in the model.
  all_layers:
    type: "boolean"
    description: "Use all layers, not just the selected one."
For the model parameter, you can use the following:
bert-base-uncased
More models will be coming soon! Please get in contact if you’re interested in using a different model.
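As an illustration of the configuration schema, here is a minimal Python sketch that builds a configuration and checks it against the constraints listed above. The function name validate_config and the constant names are hypothetical and not part of any published client. The sketch treats every field as optional, since the schema lists no required fields, and it does not check the upper bound on num_layers because that depends on the chosen model.

```python
import re

# Allowed values copied from the schema and model list above.
VARIETIES = {"f_measure", "precision", "recall"}
SUPPORTED_MODELS = {"bert-base-uncased"}


def validate_config(config: dict) -> list[str]:
    """Return a list of problems found in a BertScoreConfig-style dict.

    Hypothetical helper: an empty list means no problem was detected.
    """
    problems = []
    if "variety" in config and config["variety"] not in VARIETIES:
        problems.append(f"variety must be one of {sorted(VARIETIES)}")
    if "language" in config:
        lang = config["language"]
        # Same pattern as the schema: exactly three lowercase letters.
        if not isinstance(lang, str) or not re.fullmatch(r"[a-z]{3}", lang):
            problems.append("language must be a three-letter ISO 639-3 code")
    if "model" in config and config["model"] not in SUPPORTED_MODELS:
        problems.append(f"model must be one of {sorted(SUPPORTED_MODELS)}")
    if "num_layers" in config:
        n = config["num_layers"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(n, int) or isinstance(n, bool) or n < 1:
            problems.append("num_layers must be an integer >= 1")
    if "all_layers" in config and not isinstance(config["all_layers"], bool):
        problems.append("all_layers must be a boolean")
    return problems


config = {
    "variety": "f_measure",
    "language": "eng",
    "model": "bert-base-uncased",
    "num_layers": 8,
    "all_layers": False,
}
print(validate_config(config))
```

Checking the configuration locally like this is only a convenience for catching typos such as a two-letter language code before a query is sent.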
Data
Data format accepted by the BERTScore metric. Note that each query is limited to 2000 examples. If you want to submit more examples, split them across multiple queries.
title: "BertScoreData"
type: "object"
properties:
  target:
    type: "string"
    description: "Target text to evaluate."
  references:
    type: "array"
    description: "The references to evaluate the target against."
    items:
      type: "string"
required:
  - "target"
  - "references"
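To stay within the limit of 2000 examples per query, a larger dataset can be split into consecutive batches before submission. The following Python sketch shows one way to do that. It assumes that a query carries a list of BertScoreData objects, which the text above does not spell out, and the helper names make_example and batched are hypothetical.

```python
from typing import Iterator

MAX_EXAMPLES_PER_QUERY = 2000  # size limit stated in the Data section


def make_example(target: str, references: list[str]) -> dict:
    """Build one BertScoreData-style object; both fields are required."""
    return {"target": target, "references": list(references)}


def batched(examples: list[dict],
            size: int = MAX_EXAMPLES_PER_QUERY) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` examples."""
    for start in range(0, len(examples), size):
        yield examples[start:start + size]


# Placeholder texts, for illustration only.
examples = [
    make_example(f"generated text {i}", [f"reference text {i}"])
    for i in range(4500)
]
batches = list(batched(examples))
print([len(b) for b in batches])  # [2000, 2000, 500]
```

Each batch would then be sent as its own query, and the per-example results can be concatenated in the same order.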
Results
Format of the results of the BERTScore metric:
title: "BertScoreResult"
type: "object"
$defs:
  BertScoreStats:
    type: "object"
    properties:
      value:
        type: "number"
        description: "The main BERTScore value."
      precision:
        type: "number"
        description: "Precision score."
      recall:
        type: "number"
        description: "Recall score."
      f_measure:
        type: "number"
        description: "F-measure score."
    required:
      - "value"
properties:
  overall:
    $ref: "#/$defs/BertScoreStats"
  examples:
    type: "array"
    items:
      $ref: "#/$defs/BertScoreStats"
required:
  - "overall"
  - "examples"
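As a rough sketch of how a result in this format might be consumed, the Python snippet below parses a JSON body, checks the required fields, and finds the lowest-scoring example. The response body is invented for illustration and its numbers are not real metric output. The helper name parse_result is hypothetical.

```python
import json

# Made-up response shaped like BertScoreResult; values are placeholders.
raw = """
{
  "overall": {"value": 0.5, "precision": 0.5, "recall": 0.5, "f_measure": 0.5},
  "examples": [
    {"value": 0.75, "precision": 0.75, "recall": 0.75, "f_measure": 0.75},
    {"value": 0.25, "precision": 0.25, "recall": 0.25, "f_measure": 0.25}
  ]
}
"""


def parse_result(text: str) -> dict:
    """Parse a BertScoreResult-style JSON body and check required fields."""
    result = json.loads(text)
    for key in ("overall", "examples"):
        if key not in result:
            raise ValueError(f"missing required field: {key}")
    # Only "value" is required in each stats object; the rest is optional.
    for stats in [result["overall"], *result["examples"]]:
        if "value" not in stats:
            raise ValueError("every stats object must contain 'value'")
    return result


result = parse_result(raw)
examples = result["examples"]
# Index of the example with the lowest main score.
worst = min(range(len(examples)), key=lambda i: examples[i]["value"])
print(result["overall"]["value"], worst)
```

Because precision, recall, and f_measure are not required by the schema, code that reads them should be prepared for them to be absent, for instance by using dict.get.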