This page describes how you can use Critique to assess the quality of summaries, where a longer text is condensed into a shorter one.
What is Summary Quality?
Summary quality defines how well a summary represents the main points and the key information of the original text. A high-quality summary should accurately capture the essence of the original text with concise language.
Suppose we have the following long text:
Artificial intelligence (AI) is transforming the way we live and work. It has already been integrated into many industries,
from healthcare to finance, to improve efficiency, accuracy and decision-making. The technology is also being used to
develop new products and services that were previously impossible. Despite the benefits of AI, there are also concerns
about its potential impact on employment and privacy. It is important that AI is developed and used ethically, so that
its benefits can be fully realized while minimizing any negative consequences.
Two summaries of the text are shown below:
AI is improving industries, but its ethical use is important to avoid negative consequences on employment and privacy.
AI is transforming the way we live and work by improving efficiency and decision-making in industries.
In this case, the first summary is better, as it accurately highlights the key points discussed in the document regarding both the benefits and the potential impact of AI.
Critique provides a set of metrics that can be used to assess the quality of summaries, which can help you:
- Filter out low-quality summaries so that they are not displayed to users, or are post-edited by human editors
- Monitor the quality of summarization systems
- Improve the quality of summaries by identifying the areas that need improvement
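The filtering use case above could be sketched as follows, assuming you have already obtained one numeric score per summary and that a higher score means a better summary. The function name, the scores, and the threshold are hypothetical and chosen only for illustration; they are not part of Critique.

```python
def split_by_quality(summaries, scores, threshold):
    """Split summaries into (kept, flagged) lists using a per-summary score."""
    if len(summaries) != len(scores):
        raise ValueError("summaries and scores must have the same length")
    kept, flagged = [], []
    for summary, score in zip(summaries, scores):
        # Summaries at or above the threshold are kept; the rest are flagged
        # for hiding or for post-editing by a human editor.
        (kept if score >= threshold else flagged).append(summary)
    return kept, flagged


summaries = [
    "AI is improving industries, but its ethical use is important to avoid negative consequences on employment and privacy.",
    "AI is transforming the way we live and work by improving efficiency and decision-making in industries.",
]
# Hypothetical scores, made up for illustration only.
scores = [0.8, 0.4]
kept, flagged = split_by_quality(summaries, scores, threshold=0.5)
```

A suitable threshold depends on the metric and on your data, so it is usually worth checking a sample of flagged summaries by hand before relying on it.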
There are two main ways to estimate the quality of summaries:
- Reference-based: the summary is compared to a reference summary, which is usually written by human editors.
- Reference-free: the summary is assessed against the original text, or on its own, without a reference summary.
The reference-free approach is more suitable for summarization systems that do not have access to reference summaries, while the reference-based approach can be chosen when collecting reference summaries is feasible.
Critique API for Summary Quality Evaluation
Reference-based Evaluation
In this scenario, you need to provide a reference summary for each target (the summary to be evaluated). For example,
dataset = [
    {
        "target": "AI is improving industries, but its ethical use is important to avoid negative consequences on employment and privacy.",
        "references": [
            "AI is transforming the way we live and work by improving efficiency and decision-making in industries."
        ],
    },
    {
        "target": "AI is transforming the way we live and work by improving efficiency and decision-making in industries.",
        "references": [
            "AI is transforming the way we live and work by improving efficiency and decision-making in industries."
        ],
    },
]
Critique provides a variety of metrics to evaluate the quality of summaries and different configurations can be used for each metric.
For example, you can use the rouge metric with the rouge-1 configuration, which scores a summary by how many individual words (unigrams) it shares with the reference:
metric = "rouge"
config = {
    "variety": "rouge_1"
}
You can then call the API to evaluate the quality of the summaries in the dataset:
import os
from inspiredco import critique
# Read the API key from the INSPIREDCO_API_KEY environment variable.
client = critique.Critique(api_key=os.environ["INSPIREDCO_API_KEY"])
result = client.evaluate(metric=metric, config=config, dataset=dataset)
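ROUGE-1 compares the individual words (unigrams) of the target with those of the reference. The following is a simplified, self-contained sketch of that idea, intended only to build intuition. It is not Critique's implementation: tokenize and rouge_1 are hypothetical helper names, and the lowercase alphanumeric tokenization is an assumption, so the numbers may differ from what the API returns.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge_1(target, reference):
    """Return unigram-overlap precision, recall, and F1 of target against reference."""
    target_counts = Counter(tokenize(target))
    reference_counts = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((target_counts & reference_counts).values())
    precision = overlap / max(sum(target_counts.values()), 1)
    recall = overlap / max(sum(reference_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


reference = "AI is transforming the way we live and work by improving efficiency and decision-making in industries."
target = "AI is improving industries, but its ethical use is important to avoid negative consequences on employment and privacy."

partial = rouge_1(target, reference)       # some shared words, so a score between 0 and 1
identical = rouge_1(reference, reference)  # identical texts give an F1 of 1.0
```

Note that in the example dataset the second target is identical to its reference, so an overlap-based metric gives it a perfect score regardless of how well it covers the source. This is one reason the choice of reference summaries matters for reference-based evaluation.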
Reference-free Evaluation
In this scenario, you do not need to provide a reference summary for each target, but you typically need to provide the source text. For example,
dataset = [
    {
        "target": "AI is improving industries, but its ethical use is important to avoid negative consequences on employment and privacy.",
        # Adjacent string literals are concatenated, so the source can span several lines.
        "source": (
            "Artificial intelligence (AI) is transforming the way we live and work. It has already been integrated into many industries, "
            "from healthcare to finance, to improve efficiency, accuracy and decision-making. The technology is also being used to develop new products "
            "and services that were previously impossible. Despite the benefits of AI, there are also concerns about its potential impact on employment and "
            "privacy. It is important that AI is developed and used ethically, so that its benefits can be fully realized while minimizing any negative consequences."
        ),
    },
    {
        "target": "AI is transforming the way we live and work by improving efficiency and decision-making in industries.",
        "source": (
            "Artificial intelligence (AI) is transforming the way we live and work. It has already been integrated into many "
            "industries, from healthcare to finance, to improve efficiency, accuracy and decision-making. The technology is also being used to "
            "develop new products and services that were previously impossible. Despite the benefits of AI, there are also concerns about its "
            "potential impact on employment and privacy. It is important that AI is developed and used ethically, so that its benefits can be "
            "fully realized while minimizing any negative consequences."
        ),
    },
]
In this case, you may want to use a different metric or configuration that can take advantage of the source text.
metric = "bart_score"
config = {
    "variety": "source_to_target",
    "model": "facebook/bart-large-cnn",
    "language": "eng"
}
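When several summaries of the same document are evaluated, every dataset entry repeats the same source text, so it can be convenient to build the entries programmatically. The following is a minimal sketch; build_reference_free_dataset is a hypothetical helper written for this page, not a function provided by Critique.

```python
def build_reference_free_dataset(source, targets):
    """Pair each candidate summary (target) with the shared source text."""
    return [{"target": target, "source": source} for target in targets]


# Adjacent string literals are concatenated into one string.
source = (
    "Artificial intelligence (AI) is transforming the way we live and work. "
    "It has already been integrated into many industries, from healthcare to finance, "
    "to improve efficiency, accuracy and decision-making. The technology is also being "
    "used to develop new products and services that were previously impossible. "
    "Despite the benefits of AI, there are also concerns about its potential impact on "
    "employment and privacy. It is important that AI is developed and used ethically, "
    "so that its benefits can be fully realized while minimizing any negative consequences."
)
targets = [
    "AI is improving industries, but its ethical use is important to avoid negative consequences on employment and privacy.",
    "AI is transforming the way we live and work by improving efficiency and decision-making in industries.",
]
dataset = build_reference_free_dataset(source, targets)
```

The resulting dataset, together with the metric and config above, can be passed to client.evaluate in the same way as in the reference-based example.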
Various Metrics/Configurations for Summarization Quality
See details by clicking the links above.