Selection of Content Measures for Evaluation of Text Summaries using Genetic Algorithms
Abstract
Automatic Text Summarization (ATS) plays an essential role in condensing textual information from digital documents. Since 2001, ATS has advanced significantly toward the goal of producing human-like summaries. Most methods and approaches are evaluated with ROUGE; however, ROUGE cannot be applied when human references are not available. For this reason, the Evaluation of Text Summaries (ETS) without human references has been proposed. In this setting, ROUGE-C, LSA, and SIMetrix have been presented as methods that compare the content of summaries against their source documents. Although previous studies have demonstrated that combining these methods improves automatic evaluation, the results are still far from manual assessment. We assume this gap is due to the presence of different complexity levels in evaluation measures and source documents. Under this assumption, the performance of automatic evaluation varies according to the complexity level of each evaluation measure. In this paper, we propose selecting content evaluation measures through a Genetic Algorithm (GA) to determine the most suitable measures for each summary. Calculating complexity levels in source documents and content measures may help to select the best measures for evaluating summaries without human references. Experiments on the DUC01 and DUC02 datasets demonstrate that the proposed selection improves the performance of this task.
Keywords
Evaluation of text summaries, Content evaluation measures, Genetic algorithm, Text complexity indexes