The paper introduces IMPARA (Impact-based Metric for GEC using PARAllel data), a reference-less metric for the automatic evaluation of grammatical error correction (GEC) systems. Unlike existing metrics that rely on manual quality assessments or multiple reference sentences, IMPARA is trained only on parallel data (pairs of ungrammatical sentences and their grammatical corrections), from which it computes the impact of each individual correction. This reduces the cost of creating training data and allows the metric to be adapted to different domains and correction styles.
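To make "the impact of individual corrections" concrete, the sketch below scores each edit by how much the fully corrected sentence changes when that one edit is left out. This is an illustrative reading, not the paper's implementation: the scoring rule (one minus cosine similarity), the `(start, end, replacement)` edit format, and all function names are assumptions, and the bag-of-words `embed` is a toy stand-in for whatever sentence encoder the metric actually uses.

```python
import math
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence encoder (assumption): bag-of-words counts."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def apply_edits(tokens, edits):
    """Apply non-overlapping (start, end, replacement_tokens) edits to source tokens."""
    out = list(tokens)
    # Apply from right to left so earlier token indices stay valid.
    for start, end, repl in sorted(edits, key=lambda e: e[0], reverse=True):
        out[start:end] = repl
    return out


def edit_impacts(source, edits):
    """Hypothetical impact score: 1 - similarity(corrected, corrected minus one edit)."""
    tokens = source.split()
    full = " ".join(apply_edits(tokens, edits))
    impacts = []
    for i in range(len(edits)):
        without = edits[:i] + edits[i + 1:]
        partial = " ".join(apply_edits(tokens, without))
        impacts.append(1.0 - cosine(embed(full), embed(partial)))
    return impacts


if __name__ == "__main__":
    source = "She go to school yesterday"
    edits = [(1, 2, ["went"])]  # replace "go" with "went"
    print(edit_impacts(source, edits))
```

Under this reading, an edit that leaves the sentence unchanged scores (near) zero, while an edit that alters more of the sentence scores higher. With a real encoder, the same loop would apply with `embed` and `cosine` swapped for dense-vector equivalents.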
Architecture:
Quality Estimator (QE):
Similarity Estimator (SE):
Impact Calculation:
Evaluation:
Datasets:
The paper proposes IMPARA as a practical and effective evaluation method. Future directions include enhancing interpretability and integrating synthetic parallel data to further reduce data creation costs and improve evaluation quality.