UniEval is a metric that uses a large language model to predict the probability that a particular sentence is fluent, relevant, coherent, consistent, natural, engaging, grounded, understandable, factual, or informative.
UniEval is parameterized by both a “task” and an “evaluation aspect”. The task determines the type of input data that is expected, and the evaluation aspect determines the type of evaluation that is performed. Each evaluation aspect requires different fields in your dataset. The following combinations are permissible:
Task | Evaluation Aspect | Required Inputs |
---|---|---|
summarization | coherence | target, source |
summarization | consistency | target, source |
summarization | fluency | target |
summarization | relevance | target, references |
Evaluators for dialog and factual consistency are coming soon! Please get in contact if you’d be interested in using them.
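As a rough illustration of the table above, the permitted combinations can be written as a lookup that reports which required fields an example is missing. This is only a sketch: `REQUIRED_INPUTS` and `missing_inputs` are hypothetical names, not part of any UniEval client, and the relevance row is assumed to refer to the `references` field of the data schema.

```python
# Required inputs per (task, evaluation aspect), copied from the table above.
# Assumption: the relevance row maps to the `references` field of the data schema.
REQUIRED_INPUTS = {
    ("summarization", "coherence"): ("target", "source"),
    ("summarization", "consistency"): ("target", "source"),
    ("summarization", "fluency"): ("target",),
    ("summarization", "relevance"): ("target", "references"),
}


def missing_inputs(task, evaluation_aspect, example):
    """Return the required field names that are absent or empty in `example`.

    Hypothetical helper for illustration; raises ValueError for a
    (task, aspect) pair that is not listed in the table.
    """
    key = (task, evaluation_aspect)
    if key not in REQUIRED_INPUTS:
        raise ValueError(f"unsupported combination: {key}")
    return [name for name in REQUIRED_INPUTS[key] if not example.get(name)]


# A coherence example without a source text is incomplete.
print(missing_inputs("summarization", "coherence", {"target": "A short summary."}))
```

Running a check like this before submitting data surfaces incomplete examples early, rather than relying on the service to reject them.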
Configuration
Configuration for the parameters of the UniEval metric:
title: "UniEvalConfig"
type: "object"
properties:
  task:
    type: "string"
    description: "Name of the task to be used."
    enum:
      - "summarization"
  evaluation_aspect:
    type: "string"
    description: "Name of the evaluation aspect to be used."
    enum:
      - "fluency"
      - "relevance"
      - "coherence"
      - "consistency"
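A configuration that satisfies this schema might be built as follows. This is a minimal sketch: `make_config` is a hypothetical helper rather than part of any UniEval client, and it checks only the two enums listed in the schema.

```python
import json

# Allowed values, copied from the enums in the UniEvalConfig schema above.
ALLOWED_TASKS = ("summarization",)
ALLOWED_ASPECTS = ("fluency", "relevance", "coherence", "consistency")


def make_config(task, evaluation_aspect):
    """Build a config dict, rejecting values outside the schema's enums.

    `make_config` is a hypothetical helper for illustration only.
    """
    if task not in ALLOWED_TASKS:
        raise ValueError(f"task must be one of {ALLOWED_TASKS}, got {task!r}")
    if evaluation_aspect not in ALLOWED_ASPECTS:
        raise ValueError(
            f"evaluation_aspect must be one of {ALLOWED_ASPECTS}, "
            f"got {evaluation_aspect!r}"
        )
    return {"task": task, "evaluation_aspect": evaluation_aspect}


config = make_config("summarization", "coherence")
print(json.dumps(config, indent=2))  # JSON form of the configuration
```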
Data
Accepted data format of the UniEval metric. Note that there is a size limit of 250 examples per query. If you want to submit more examples, you can use multiple queries.
title: "UniEvalData"
type: "object"
properties:
  target:
    type: "string"
    description: "Input text to evaluate."
  source:
    type: "string"
    description: "Source text."
  references:
    type: "array"
    description: "Gold reference texts"
    items:
      type: "string"
  context:
    type: "string"
    description: "Context for the input text."
required:
  - "target"
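Because each query is limited to 250 examples, a larger dataset has to be split before submission. The sketch below shows one way to do that. `batch_examples` is a hypothetical helper, the example records are placeholders, and how each batch is actually submitted is left out because it depends on the client you use.

```python
MAX_EXAMPLES_PER_QUERY = 250  # size limit stated in the Data section


def batch_examples(examples, batch_size=MAX_EXAMPLES_PER_QUERY):
    """Yield consecutive slices of `examples`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


# Placeholder records shaped like UniEvalData (only `target` is required).
dataset = [
    {"target": f"summary {i}", "source": f"document {i}"} for i in range(600)
]

batches = list(batch_examples(dataset))
print([len(batch) for batch in batches])  # one query per batch
```

Slicing preserves the original order, so per-example results from successive queries can be concatenated back in the same order as the input.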
Results
Format of the results of the UniEval metric:
title: "UniEvalResult"
type: "object"
$defs:
  UniEvalStats:
    type: "object"
    properties:
      value:
        type: "number"
        description: "The main UniEval value."
    required:
      - "value"
properties:
  overall:
    $ref: "#/$defs/UniEvalStats"
  examples:
    type: "array"
    items:
      $ref: "#/$defs/UniEvalStats"
required:
  - "overall"
  - "examples"
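A result with this shape could be unpacked as sketched below. `example_values` is a hypothetical helper, and the numbers in `result` are made-up placeholders rather than real UniEval output. The schema does not say how `overall` is derived from the per-example values, so the sketch makes no assumption about that.

```python
def example_values(result):
    """Return the per-example values from a dict shaped like UniEvalResult.

    Hypothetical helper: checks the two required top-level keys, then
    reads the required `value` field of each per-example entry.
    """
    for key in ("overall", "examples"):
        if key not in result:
            raise KeyError(f"result is missing required key: {key}")
    return [stats["value"] for stats in result["examples"]]


# Placeholder result for illustration only; not real metric output.
result = {
    "overall": {"value": 0.5},
    "examples": [{"value": 0.4}, {"value": 0.6}],
}

print("overall:", result["overall"]["value"])
print("per example:", example_values(result))
```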