Chapter A: Lexical processing
Task A.1: Cross-lingual word embeddings
Artetxe, Mikel, Gorka Labaka, and Eneko Agirre. 2018. “A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 789–798.
http://aclweb.org/anthology/P18-1073
Major reproduction comparables: Accuracy scores (tables 1 to 4).
Task A.2: Named entity embeddings
Newman-Griffis, Denis, Albert M. Lai, and Eric Fosler-Lussier. 2018. “Jointly Embedding Entities and Text with Distant Supervision”. In Proceedings of the Third Workshop on Representation Learning for NLP, pp. 195–206.
http://aclweb.org/anthology/W18-3026
Major reproduction comparables: Spearman’s ρ scores for semantic similarity predictions (tables 3 and 4), and accuracy scores (table 6).
Chapter B: Sentence processing
Task B.1: POS tagging
Bohnet, Bernd, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily Pitler, and Joshua Maynez. 2018. “Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2642–2652.
http://aclweb.org/anthology/P18-1246
Major reproduction comparables: f-score values (tables 2 to 8).
Task B.2: Sentence semantic relatedness
Gupta, Amulya, and Zhu Zhang. 2018. “To Attend or not to Attend: A Case Study on Syntactic Structures for Semantic Relatedness”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2116–2125.
http://aclweb.org/anthology/P18-1197
Major reproduction comparables: Pearson’s r and Spearman’s ρ scores for semantic relatedness (table 1), and f-score values for paraphrase detection (table 2).
Chapter C: Text processing
Task C.1: Relation extraction and classification
Rotsztejn, Jonathan, Nora Hollenstein, and Ce Zhang. 2018. “ETH-DS3Lab at SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation Classification and Extraction”. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval 2018), pp. 689–696.
http://aclweb.org/anthology/S18-1112
Major reproduction comparables: precision, recall and f-score values (tables 3 and 4).
Task C.2: Privacy preserving representation
Li, Yitong, Timothy Baldwin, and Trevor Cohn. 2018. “Towards Robust and Privacy-preserving Text Representations”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 25–30.
http://aclweb.org/anthology/P18-2005
Major reproduction comparables: POS accuracy scores (tables 1 and 2), and sentiment analysis f-score values (table 3).
Task C.3: Language modelling
Howard, Jeremy, and Sebastian Ruder. 2018. “Universal Language Model Fine-tuning for Text Classification”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 328–339.
http://aclweb.org/anthology/P18-1031
Major reproduction comparables: Error rates (%) on sentiment analysis and question classification tasks (tables 2 and 3).
Chapter D: Applications
Task D.1: Text simplification
Nisioi, Sergiu, Sanja Stajner, Simone Paolo Ponzetto, and Liviu P. Dinu. 2017. “Exploring Neural Text Simplification Models”. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), pp. 85–91.
http://aclweb.org/anthology/P/P17/P17-2014.pdf
Major reproduction comparables: Human evaluation scores averaged over 3 evaluators, on 1 to 5 and -2 to +2 scales (table 2).
Task D.2: Language proficiency scoring
Vajjala, Sowmya, and Taraka Rama. 2018. “Experiments with Universal CEFR Classification”. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 147–153.
http://aclweb.org/anthology/W18-0515
Major reproduction comparables: f-score values (tables 2, 3 and 4).
Task D.3: Neural machine translation
Vanmassenhove, Eva, and Andy Way. 2018. “SuperNMT: Neural Machine Translation with Semantic Supersenses and Syntactic Supertags”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 67–73.
http://aclweb.org/anthology/P18-3010
Major reproduction comparables: BLEU scores (tables 1 and 2; plots in figures 2, 3 and 4).
Chapter E: Language resources
Task E.1: Parallel corpus construction
Brunato, Dominique, Andrea Cimino, Felice Dell’Orletta, and Giulia Venturi. 2016. “PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification”. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), pp. 351–361.
https://aclweb.org/anthology/D16-1034
Major reproduction comparables: data set.
Participants are expected to obtain the data and tools for the reproduction from the information provided in the paper. Working from the paper’s description of the experiment is part of the reproduction exercise.