IMPARA: Impact-Based Metric for GEC Using Parallel Data

Koki Maeda, Masahiro Kaneko, Naoaki Okazaki

Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022) · October 2022

Proposes a new impact-based metric for evaluating grammatical error correction (GEC) using parallel data.

BibTeX

@inproceedings{maeda2022impara,
  title = {IMPARA: Impact-Based Metric for GEC Using Parallel Data},
  author = {Koki Maeda and Masahiro Kaneko and Naoaki Okazaki},
  booktitle = {Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022)},
  pages = {3578--3588},
  year = {2022},
  address = {Gyeongju, Republic of Korea},
  publisher = {International Committee on Computational Linguistics}
}

Abstract

The paper introduces IMPARA (Impact-based Metric for GEC using PARAllel data), a reference-less metric for the automatic evaluation of grammatical error correction (GEC) systems. Whereas existing methods rely on manual assessments or multiple reference sentences, IMPARA uses only parallel data (pairs of ungrammatical and grammatical sentences) to compute the impact of individual corrections. This lowers the cost of creating data for the metric and lets it adapt to different domains and correction styles.
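The abstract does not spell out how the impact of a correction is computed, so the following is only a toy illustration of the general idea, not the paper's formulation. It assumes that edits are taken from a token-level diff of one parallel pair and that the impact of an edit is one minus the cosine similarity between bag-of-words vectors of the source sentence and the source with only that edit applied. The function names `cosine` and `edit_impacts` and the similarity measure are hypothetical stand-ins chosen for this sketch.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def edit_impacts(source, corrected):
    """Return (original span, corrected span, impact) for each edit.

    ASSUMPTION: impact = 1 - cosine(source, source with one edit applied),
    using bag-of-words vectors as a stand-in sentence representation.
    """
    src, tgt = source.split(), corrected.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    impacts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged span, not an edit
        # Apply only this edit to the source sentence.
        edited = src[:i1] + tgt[j1:j2] + src[i2:]
        impact = 1.0 - cosine(Counter(src), Counter(edited))
        impacts.append((" ".join(src[i1:i2]), " ".join(tgt[j1:j2]), impact))
    return impacts


if __name__ == "__main__":
    pair = ("She go to school yesterday", "She went to the school yesterday")
    for before, after, impact in edit_impacts(*pair):
        print(f"{before!r} -> {after!r}: impact {impact:.3f}")
```

Under these assumptions, replacing a word changes the representation more than inserting one, so the substitution receives the larger impact. A stronger sentence representation could be swapped in for the bag-of-words vectors without changing the rest of the sketch.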