Benchmarks

The following benchmarks are available.

  • MateCat post-edits: documents including source text, post-edited suggestions, and final post-edits

  • BinQE: collection of multilingual source texts and MT outputs with binary quality estimation labels

  • BitterCorpus: collection of parallel English-Italian documents in the IT domain, with domain-specific terms manually marked and aligned

  • Word-alignment Gold Reference: collection of human-checked word alignments of English-Italian sentence pairs in the Legal domain

A detailed description of each resource, together with download information, is available through the corresponding link.