|
|
|
|
Thursday, 23 June, 2022
|
09:30 - 10:50
|
Session O29: Opinion Mining, Sentiment Analysis
- Auditorium
Chair: Palmer, Martha
Co-Chair: Dragos, Valentina
|
|
|
|
|
09:30 - 09:50
|
Investigating User Radicalization: A Novel Dataset for Identifying Fine-Grained Temporal Shifts in Opinion
Flora Sakketou1, Allison Lahnala2, Liane Vogel3, Lucie Flek1
1Philipps-Marburg University, 2University of Marburg, 3Technical University of Darmstadt
|
|
|
|
|
09:50 - 10:10
|
APPReddit: a Corpus of Reddit Posts Annotated for Appraisal
Marco Antonio Stranisci1, Simona Frenda2, Eleonora Ceccaldi3, Valerio Basile1, Rossana Damiano1, Viviana Patti4
1University of Turin, 2Università degli Studi di Torino and Universitat Politècnica de València, 3University of Genoa, 4University of Turin, Dipartimento di Informatica
|
|
|
|
|
10:10 - 10:30
|
Evaluating Methods for Extraction of Aspect Terms in Opinion Texts in Portuguese - the Challenges of Implicit Aspects
Mateus Machado and Thiago Pardo
University of São Paulo
|
|
|
|
|
10:30 - 10:50
|
SenticNet 7: A Commonsense-based Neurosymbolic AI Framework for Explainable Sentiment Analysis
Erik Cambria1, Qian Liu1, Sergio Decherchi2, Frank Xing3, Kenneth Kwok4
1Nanyang Technological University, 2Fondazione Istituto Italiano di Tecnologia, 3National University of Singapore, 4A*STAR
|
|
|
|
|
09:30 - 10:50
|
Session O30: Less-Resourced Languages (2)
- La Major
Chair: Ramisch, Carlos
Co-Chair: Ruiz Fabo, Pablo
|
|
|
|
|
09:30 - 09:50
|
Building an Endangered Language Resource in the Classroom: Universal Dependencies for Kakataibo
Roberto Zariquiey1, Claudia Alvarado1, Ximena Echevarría1, Luisa Gomez1, Rosa Gonzales1, Mariana Illescas1, Sabina Oporto1, Frederic Blum2, Arturo Oncevay3, Javier Vera4
1Pontificia Universidad Católica del Perú, 2Humboldt-Universität zu Berlin, 3The University of Edinburgh, 4Pontificia Universidad Católica de Valparaíso
|
|
|
|
|
09:50 - 10:10
|
The Norwegian Colossal Corpus: A Text Corpus for Training Large Norwegian Language Models
Per Kummervold, Freddy Wetjen, Javier de la Rosa
The National Library of Norway
|
|
|
|
|
10:10 - 10:30
|
Embeddings models for Buddhist Sanskrit
Ligeia Lugli1, Matej Martinc2, Andraž Pelicon3, Senja Pollak3
1Mangalam Research Center for Buddhist Languages, 2Jozef Stefan Institute, 3Jožef Stefan Institute
|
|
|
|
|
10:30 - 10:50
|
Development of Automatic Speech Recognition for the Documentation of Cook Islands Māori
Rolando Coto-Solano1, Sally Akevai Nicholas2, Samiha Datta1, Victoria Quint1, Piripi Wills3, Emma Powell4, Liam Koka'ua3, Syed Tanveer1, Isaac Feldman1
1Dartmouth College, 2Massey University, 3Kо̄rero Rororuia, 4University of Otago
|
|
|
|
|
09:30 - 10:50
|
Session O31: Document Classification, Text Categorisation
- Salle 120
Chair: Volk, Martin
Co-Chair: Zhang, Mike
|
|
|
|
|
09:30 - 09:50
|
A Generalized Approach to Protest Event Detection in German Local News
Gregor Wiedemann1, Jan Dollbaum2, Sebastian Haunss3, Priska Daphi4, Larissa Meier4
1Leibniz Institute for Media Research | Hans-Bredow-Institute, 2University of Bremen, 3University of Bremen, SOCIUM, 4University of Bielefeld
|
|
|
|
|
09:50 - 10:10
|
Evaluation of Transfer Learning and Domain Adaptation for Analyzing German-Speaking Job Advertisements
Ann-Sophie Gnehm, Eva Bühlmann, Simon Clematide
University of Zurich
|
|
|
|
|
10:10 - 10:30
|
Pre-Training Language Models for Identifying Patronizing and Condescending Language: An Analysis
Carla Perez Almendros, Luis Espinosa Anke, Steven Schockaert
Cardiff University
|
|
|
|
|
10:30 - 10:50
|
HeLI-OTS, Off-the-shelf Language Identifier for Text
Tommi Jauhiainen, Heidi Jauhiainen, Krister Lindén
University of Helsinki
|
|
|
|
|
09:30 - 10:50
|
Session O32: Lexicon and WordNet
- Salle 92
Chair: Vossen, Piek
Co-Chair: Frontini, Francesca
|
|
|
|
|
09:30 - 09:50
|
Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages
Silvia Severini1, Ayyoob ImaniGooghari2, Philipp Dufter3, Hinrich Schütze4
1Ludwig-Maximilians-Universität, 2Center for Information and Language Processing, LMU Munich, 3Apple, 4Center for Information and Language Processing, University of Munich
|
|
|
|
|
09:50 - 10:10
|
Towards the Construction of a WordNet for Old English
Fahad Khan1, Francisco Minaya Gómez2, Rafael Cruz González2, Harry Diakoff3, Javier Diaz Vera2, John P. McCrae4, Ciara O'Loughlin5, William Short6, Sander Stolk7
1Istituto di Linguistica Computazionale "Antonio Zampolli", CNR, 2University of Castilla-La Mancha, 3alpheios project, 4Insight Center for Data Analytics, National University of Ireland Galway, 5Data Science Institute, National University of Ireland Galway, 6University of Exeter, 7Leiden University
|
|
|
|
|
10:10 - 10:30
|
A Framenet and Frame Annotator for German Social Media
Eckhard Bick
University of Southern Denmark
|
|
|
|
|
10:30 - 10:50
|
The Robotic Surgery Procedural Framebank
Marco Bombieri1, Marco Rospocher2, Simone Paolo Ponzetto3, Paolo Fiorini1
1University of Verona, 2Università degli Studi di Verona, 3University of Mannheim
|
|
|
|
|
09:30 - 10:50
|
Session: P30 Knowledge Discovery
- Poster Area 2
Chair: Yamada, Hiroaki
|
|
|
|
|
|
Representing the Toddler Lexicon: Do the Corpus and Semantics Matter?
Jennifer Weber and Eliana Colunga
University of Colorado, Boulder
|
|
|
|
|
|
Organizing and Improving a Database of French Word Formation Using Formal Concept Analysis
Nyoman Juniarta1, Olivier Bonami2, Nabil Hathout3, Fiammetta Namer4, Yannick Toussaint5
1CNRS, 2Université de Paris, CNRS, Laboratoire de linguistique formelle, 3CLLE/ERSS, CNRS & Université de Toulouse, 4UMR 7118 ATILF & University of Lorraine, 5LORIA, Université de Lorraine
|
|
|
|
|
|
Towards a new Ontology for Sign Languages
Thierry Declerck
DFKI GmbH
|
|
|
|
|
|
Towards the Detection of a Semantic Gap in the Chain of Commonsense Knowledge Triples
Yoshihiko Hayashi
Waseda University
|
|
|
|
|
|
COPA-SSE: Semi-structured Explanations for Commonsense Reasoning
Ana Brassard1, Benjamin Heinzerling2, Pride Kavumba3, Kentaro Inui4
1RIKEN AIP / Tohoku University, 2RIKEN AIP & Tohoku University, 3Tohoku University, 4Tohoku University / Riken
|
|
|
|
|
|
GRhOOT: Ontology of Rhetorical Figures in German
Ramona Kühn, Jelena Mitrović, Michael Granitzer
University of Passau
|
|
|
|
|
|
Querying a Dozen Corpora and a Thousand Years with Fintan
Christian Chiarcos1, Christian Fäth2, Maxim Ionov3
1Goethe-Universität Frankfurt am Main, 2Universität Frankfurt, 3Goethe-Universität Frankfurt
|
|
|
|
|
|
The Index Thomisticus Treebank as Linked Data in the LiLa Knowledge Base
Francesco Mambrini, Marco Passarotti, Giovanni Moretti, Matteo Pellegrini
Università Cattolica del Sacro Cuore
|
|
|
|
|
|
Building a Multilingual Taxonomy of Olfactory Terms with Timestamps
Stefano Menini1, Teresa Paccosi1, Serra Sinem Tekiroğlu1, Sara Tonelli2
1Fondazione Bruno Kessler, 2FBK
|
|
|
|
|
|
Attention Understands Semantic Relations
Anastasia Chizhikova1, Sanzhar Murzakhmetov2, Oleg Serikov3, Tatiana Shavrina4, Mikhail Burtsev5
1MIPT, HSE University, 2MIPT, 3DeepPavlov, AIR Institute, HSE University, 4AIRI, HSE University, 5Artificial Intelligence Research Institute, Moscow Institute of Physics and Technology
|
|
|
|
|
09:30 - 10:50
|
Session: P31 Dialogue and Conversational Systems (3)
- Poster Area 2
Chair: Damnati, Géraldine
|
|
|
|
|
|
Analysis of Dialogue in Human-Human Collaboration in Minecraft
Takuma Ichikawa1 and Ryuichiro Higashinaka2
1Graduate School of Informatics, Nagoya University, 2Nagoya University/NTT
|
|
|
|
|
|
Data Collection for Empirically Determining the Necessary Information for Smooth Handover in Dialogue
Sanae Yamashita1 and Ryuichiro Higashinaka2
1Nagoya University, 2Nagoya University/NTT
|
|
|
|
|
|
The slurk Interaction Server Framework: Better Data for Better Dialog Models
Jana Götze, Maike Paetzel-Prüsmann, Wencke Liermann, Tim Diekmann, David Schlangen
University of Potsdam
|
|
|
|
|
|
Corpus Design for Studying Linguistic Nudges in Human-Computer Spoken Interactions
Natalia Kalashnikova1, Serge Pajak2, Fabrice Le Guel2, Ioana Vasilescu3, Gemma Serrano4, Laurence Devillers5
1LISN, University Paris - Saclay, 2RITM, 3Limsi-CNRS, 4Collège des Bernardins, 5LIMSI-CNRS/Paris-Sorbonne
|
|
|
|
|
|
Dialogue Corpus Construction Considering Modality and Social Relationships in Building Common Ground
Yuki Furuya1, Koki Saito1, Kosuke Ogura1, Koh Mitsuda2, Ryuichiro Higashinaka3, Kazunori Takashio1
1Keio University, 2NTT, 3Nagoya University/NTT
|
|
|
|
|
|
EmoWOZ: A Large-Scale Corpus and Labelling Scheme for Emotion Recognition in Task-Oriented Dialogue Systems
Shutong Feng1, Nurul Lubis2, Christian Geishauser3, Hsien-chin Lin2, Michael Heck2, Carel van Niekerk2, Milica Gasic3
1Heinrich-Heine-Universität Düsseldorf, 2Heinrich Heine University, 3Heinrich Heine University Duesseldorf
|
|
|
|
|
|
Data Augmentation with Paraphrase Generation and Entity Extraction for Multimodal Dialogue System
Eda Okur, Saurav Sahay, Lama Nachman
Intel Labs
|
|
|
|
|
|
Towards Modelling Self-imposed Filter Bubbles in Argumentative Dialogue Systems
Annalena Aicher1, Wolfgang Minker1, Stefan Ultes2
1Ulm University, 2Mercedes-Benz AG
|
|
|
|
|
09:30 - 10:50
|
Session: P32 Social Media Processing (2)
- Poster Area 2
Chair: Mubarak, Hamdy
|
|
|
|
|
|
Telling a Lie: Analyzing the Language of Information and Misinformation during Global Health Events
Ankit Aich and Natalie Parde
University of Illinois at Chicago
|
|
|
|
|
|
Misogyny and Aggressiveness Tend to Come Together and Together We Address Them
Arianna Muti1, Francesco Fernicola2, Alberto Barrón-Cedeño2
1University of Bologna, 2Università di Bologna
|
|
|
|
|
|
The ComMA Dataset V0.2: Annotating Aggression and Bias in Multilingual Social Media Discourse
Ritesh Kumar1, Shyam Ratan2, Siddharth Singh3, Enakshi Nandi4, Laishram Niranjana Devi5, Akash Bhagat2, Yogesh Dawer6, Bornini Lahiri7, Akanksha Bansal8, Atul Kr. Ojha9
1Dept. of Linguistics, Dr. Bhimrao Ambedkar University, Agra, 2Dr. Bhimrao Ambedkar University, 3Dr. Bhimrao Ambedkar University, Agra, 4Panlingua Language Processing LLP, 5Pan Lingua, 6K.M. Institute of Hindi and Linguistics, Dr. Bhimrao Ambedkar University, 7Indian Institute of Technology Kharagpur, 8Jawaharlal Nehru University, 9Data Science Institute, Unit for Linguistic Data, National University of Ireland Galway
|
|
|
|
|
|
TUSC: Emotion Word Usage in Tweets from US and Canada
Krishnapriya Vishnubhotla1 and Saif M. Mohammad2
1University of Toronto, 2National Research Council Canada
|
|
|
|
|
|
A Turkish Hate Speech Dataset and Detection System
Fatih Beyhan1, Buse Çarık2, İnanç Arın1, Ayşecan Terzioğlu1, Berrin Yanikoglu2, Reyyan Yeniterzi2
1Sabancı University, 2Sabanci University
|
|
|
|
|
|
Life is not Always Depressing: Exploring the Happy Moments of People Diagnosed with Depression
Ana-Maria Bucur1, Adrian Cosma2, Liviu P. Dinu3
1Interdisciplinary School of Doctoral Studies, 2University Politehnica of Bucharest, 3University of Bucharest
|
|
|
|
|
09:30 - 10:50
|
Session: P33 Evaluation and Validation Methodologies (3)
- Poster Area 2
Chair: Castilho, Sheila
|
|
|
|
|
|
Evaluating Tokenizers Impact on OOVs Representation with Transformers Models
Alexandra Benamar1, Cyril Grouin2, Meryl Bothua3, Anne Vilnat4
1Université Paris-Saclay, LIMSI-CNRS, 2LIMSI-CNRS, 3EDF R&D, 4LIMSI et Université Paris-Saclay
|
|
|
|
|
|
Assessing the Quality of an Italian Crowdsourced Idiom Corpus: the Dodiom Experiment
Giuseppina Morza, Raffaele Manna, Johanna Monti
University of Naples, "L'Orientale"
|
|
|
|
|
|
Medical Crossing: a Cross-lingual Evaluation of Clinical Entity Linking
Anton Alekseev1, Zulfat Miftahutdinov2, Elena Tutubalina3, Artem Shelmanov4, Vladimir Ivanov5, Vladimir Kokh6, Alexander Nesterov7, Manvel Avetisian7, Andrei Chertok8, Sergey Nikolenko9
1St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences, 2Kazan Federal University, 3HSE University, Russia and Kazan Federal University, Russia and Sber AI, Russia, 4Lomonosov Moscow State University, 5Innopolis University, 6Sberbank, 7Sber AI Lab, 8Sber, AIR Institute, 9Steklov Institute of Mathematics at St. Petersburg
|
|
|
|
|
|
MTLens: Machine Translation Output Debugging
Shreyas Sharma1, Kareem Darwish2, Lucas Pavanelli3, Thiago Castro Ferreira4, Mohamed Al-Badrashiny5, Kamer Yuksel6, Hassan Sawaf7
1aiXplain, 2aiXplain Inc., 3Pontifical Catholic University of Rio de Janeiro, 4Federal University of Minas Gerais, 5Columbia University - Department of Computer Science, 6aiXplain Inc, 7aixplain, inc.
|
|
|
|
|
|
IceBATS: An Icelandic Adaptation of the Bigger Analogy Test Set
Steinunn Friðriksdóttir1, Hjalti Daníelsson1, Steinþór Steingrímsson2, Einar Sigurdsson3
1The Árni Magnússon Institute for Icelandic Studies, 2Reykjavik University, 3University of Pennsylvania
|
|
|
|
|
|
Transfer Learning Methods for Domain Adaptation in Technical Logbook Datasets
Farhad Akhbardeh, Marcos Zampieri, Cecilia Ovesdotter Alm, Travis Desell
Rochester Institute of Technology
|
|
|
|
|
09:30 - 10:50
|
Session: P34 Statistical Methods and Machine Learning (2)
- Poster Area 2
Chair: Estève, Yannick
|
|
|
|
|
|
Downstream Task Performance of BERT Models Pre-Trained Using Automatically De-Identified Clinical Data
Thomas Vakili1, Anastasios Lamproudis2, Aron Henriksson3, Hercules Dalianis2
1Department of Computer and Systems Sciences, Stockholm University, 2DSV/Stockholm University, 3Department of Computer and Systems Sciences (DSV), Stockholm University
|
|
|
|
|
|
Dilated Convolutional Neural Networks for Lightweight Diacritics Restoration
Bálint Csanády and András Lukács
Eötvös Loránd University
|
|
|
|
|
|
Generating Artificial Texts as Substitution or Complement of Training Data
Vincent Claveau1, Antoine Chaffin2, Ewa Kijak3
1CNRS - IRISA, 2IMATAG - IRISA - CNRS, 3Université de Rennes 1-IRISA
|
|
|
|
|
|
From Pattern to Interpretation. Using Colibri Core to Detect Translation Patterns in the Peshitta.
Mathias Coeckelbergs
Université libre de Bruxelles (ULB)
|
|
|
|
|
|
PAGnol: An Extra-Large French Generative Model
Julien Launay1, E.L. Tommasone1, Baptiste Pannier1, François Boniface2, Amélie Chatelain1, Alessandro Cappelli1, Iacopo Poli1, Djamé Seddah3
1LightOn, 2Unaffiliated, 3Inria
|
|
|
|
|
|
CEPOC: The Cambridge Exams Publishing Open Cloze dataset
Mariano Felice1, Shiva Taslimipoor1, Øistein E. Andersen2, Paula Buttery1
1University of Cambridge, 2iLexIR
|
|
|
|
|
|
ALBETO and DistilBETO: Lightweight Spanish Language Models
José Cañete1, Sebastian Donoso1, Felipe Bravo-Marquez2, Andrés Carvallo3, Vladimir Araujo3
1Universidad de Chile, 2University of Chile, 3Pontificia Universidad Católica de Chile
|
|
|
|
|
|
On the Robustness of Cognate Generation Models
Winston Wu1 and David Yarowsky2
1University of Michigan, 2Johns Hopkins University
|
|
|
|
|
10:50 - 11:10
|
Coffee Break
|
|
|
|
|
11:10 - 12:30
|
Session O33: Semantics and Paraphrasing
- Salle 120
Chair: Sérasset, Gilles
Co-Chair: Mullick, Ankan
|
|
|
|
|
11:10 - 11:30
|
CLISTER: A Corpus for Semantic Textual Similarity in French Clinical Narratives
Nicolas Hiebel1, Olivier Ferret2, Karën Fort3, Aurélie Névéol1
1Université Paris Saclay, CNRS, LISN, 2CEA List, 3Sorbonne Université and LORIA
|
|
|
|
|
11:30 - 11:50
|
The Chinese Causative-Passive Homonymy Disambiguation: an adversarial Dataset for NLI and a Probing Task
Shanshan Xu1 and Katja Markert2
1TU Munich, 2Heidelberg University
|
|
|
|
|
11:50 - 12:10
|
Modeling Noise in Paraphrase Detection
Teemu Vahtola, Eetu Sjöblom, Jörg Tiedemann, Mathias Creutz
University of Helsinki
|
|
|
|
|
12:10 - 12:30
|
Give me your Intentions, I’ll Predict our Actions: A Two-level Classification of Speech Acts for Crisis Management in Social Media
Enzo Laurenti1, Nils Bourgon2, Farah Benamara3, Alda Mari1, Véronique Moriceau4, Camille Courgeon1
1IJN, CNRS/ENS/EHESS/PSL University, 2IRIT-CNRS, 3University of Toulouse, 4IRIT, Université Toulouse 3
|
|
|
|
|
11:10 - 12:30
|
Session O34: Statistical Methods and Machine Learning (2)
- Auditorium
Chair: Popović, Maja
Co-Chair: Igamberdiev, Timour
|
|
|
|
|
11:10 - 11:30
|
Towards a Cleaner Document-Oriented Multilingual Crawled Corpus
Julien Abadji1, Pedro Ortiz Suarez2, Laurent Romary1, Benoît Sagot1
1Inria, 2Data and Web Science Group, University of Mannheim
|
|
|
|
|
11:30 - 11:50
|
A Warm Start and a Clean Crawled Corpus - A Recipe for Good Language Models
Vésteinn Snæbjarnarson1, Haukur Símonarson1, Pétur Ragnarsson1, Svanhvít Ingólfsdóttir1, Haukur Jónsson1, Vilhjalmur Thorsteinsson2, Hafsteinn Einarsson3
1Miðeind, 2Mideind ehf, 3University of Iceland
|
|
|
|
|
11:50 - 12:10
|
Adapting Language Models When Training on Privacy-Transformed Data
Tugtekin Turan1, Dietrich Klakow2, Emmanuel Vincent3, Denis Jouvet4
1Fraunhofer IAIS, 2Saarland University, 3Inria, 4LORIA - INRIA
|
|
|
|
|
11:10 - 12:30
|
Session O35: Evaluation and Validation Methodologies (2)
- La Major
Chair: Zweigenbaum, Pierre
Co-Chair: Jauhiainen, Tommi
|
|
|
|
|
11:10 - 11:30
|
Evaluation of Transfer Learning for Polish with a Text-to-Text Model
Aleksandra Chrabrowa1, Łukasz Dragan1, Karol Grzegorczyk1, Dariusz Kajtoch1, Mikołaj Koszowski1, Robert Mroczkowski1, Piotr Rybak2
1Allegro SP. Z O.O., 2ML Research, Allegro.pl
|
|
|
|
|
11:30 - 11:50
|
Evaluation of HTR models without Ground Truth Material
Phillip Benjamin Ströbel1, Martin Volk1, Simon Clematide1, Raphael Schwitter1, Tobias Hodel2, David Schoch2
1University of Zurich, 2University of Bern
|
|
|
|
|
11:50 - 12:10
|
A Semi-Automated Live Interlingual Communication Workflow Featuring Intralingual Respeaking: Evaluation and Benchmarking
Tomasz Korybski, Elena Davitti, Constantin Orasan, Sabine Braun
University of Surrey
|
|
|
|
|
12:10 - 12:30
|
Are Embedding Spaces Interpretable? Results of an Intrusion Detection Evaluation on a Large French Corpus
Thibault Prouteau1, Nicolas Dugué1, Nathalie Camelin2, Sylvain Meignier1
1LIUM, 2LIUM - University of Le Mans
|
|
|
|
|
11:10 - 12:30
|
Session O36: Corpus Creation, Use and Evaluation (2)
- Salle 92
Chair: Xia, Fei
Co-Chair: Chandran Nair, Nandu
|
|
|
|
|
11:10 - 11:30
|
Corpus for Automatic Structuring of Legal Documents
Prathamesh Kalamkar1, Aman Tiwari1, Astha Agarwal1, Saurabh Karn2, Smita Gupta2, Vivek Raghavan3, Ashutosh Modi4
1Thoughtworks Technologies, 2Agami, 3EkStep Foundation, 4Indian Institute of Technology Kanpur
|
|
|
|
|
11:30 - 11:50
|
The Search for Agreement on Logical Fallacy Annotation of an Infodemic
Claire Bonial1, Austin Blodgett2, Taylor Hudson3, Stephanie M. Lukin4, Jeffrey Micher5, Douglas Summers-Stay4, Peter Sutor6, Clare Voss7
1US Army Research Lab, 2Georgetown University, 3ORAU, 4U.S. Army Research Laboratory, 5Army Research Lab, 6University of Maryland, 7Army Research Laboratory
|
|
|
|
|
11:50 - 12:10
|
Recovering Patient Journeys: A Corpus of Biomedical Entities and Relations on Twitter (BEAR)
Amelie Wührl and Roman Klinger
University of Stuttgart
|
|
|
|
|
11:10 - 12:30
|
Session: P35 Information Extraction (3)
- Poster Area 1
Chair: Rospocher, Marco
|
|
|
|
|
|
Improving Event Duration Question Answering by Leveraging Existing Temporal Information Extraction Data
Felix Virgo, Fei Cheng, Sadao Kurohashi
Kyoto University
|
|
|
|
|
|
Entity Linking over Nested Named Entities for Russian
Natalia Loukachevitch1, Pavel Braslavski2, Vladimir Ivanov3, Tatiana Batura4, Suresh Manandhar5, Artem Shelmanov1, Elena Tutubalina6
1Lomonosov Moscow State University, 2Ural Federal University and HSE University, 3Innopolis University, 4Novosibirsk State University and Lomonosov Moscow State University, 5Wiseyak, 6HSE University, Russia and Kazan Federal University, Russia and Sber AI, Russia
|
|
|
|
|
|
HiNER: A large Hindi Named Entity Recognition Dataset
Rudra Murthy1, Pallab Bhattacharjee2, Rahul Sharnagat3, Jyotsana Khatri4, Diptesh Kanojia5, Pushpak Bhattacharyya6
1IBM India Research Limited, 2CFILT Lab, IIT Bombay, 3IIT Bombay, 4Indian Institute of Technology Bombay, Mumbai, 5University of Surrey, 6Indian Institute of Technology Bombay and Patna
|
|
|
|
|
|
Bootstrapping Text Anonymization Models with Distant Supervision
Anthi Papadopoulou1, Pierre Lison2, Lilja Øvrelid3, Ildikó Pilán4
1Language Technology Group, University of Oslo, 2Norwegian Computing Centre, 3Dept of Informatics, University of Oslo, 4Norwegian Computing Center
|
|
|
|
|
|
Natural Questions in Icelandic
Vésteinn Snæbjarnarson1 and Hafsteinn Einarsson2
1Miðeind, 2University of Iceland
|
|
|
|
|
|
QA4IE: A Quality Assurance Tool for Information Extraction
Rafael Silva1, Kaushik Gedela1, Alex Marr1, Bart Desmet1, Carolyn Rose2, Chunxiao Zhou3
1National Institutes of Health Clinical Center, 2Language Technologies Institute, Carnegie Mellon University, 3National Institutes of Health
|
|
|
|
|
|
A New Dataset for Topic-Based Paragraph Classification in Genocide-Related Court Transcripts
Miriam Schirmer, Udo Kruschwitz, Gregor Donabauer
University of Regensburg
|
|
|
|
|
|
DeepREF: A Framework for Optimized Deep Learning-based Relation Classification
Igor Nascimento1, Rinaldo Lima1, Adrian-Gabriel Chifu2, Bernard Espinasse3, Sébastien Fournier2
1Federal Rural University of Pernambuco, 2Aix-Marseille Université, Université de Toulon, CNRS, LIS, Marseille, France, 3Aix-Marseille Université
|
|
|
|
|
|
Exploring Data Augmentation Strategies for Hate Speech Detection in Roman Urdu
Ubaid Azam1, Hammad Rizwan2, Asim Karim3
1Lahore University of Management Sciences, 2Lahore University of Management Sciences (LUMS), 3Lahore University of Management Sciences (LUMS)
|
|
|
|
|
|
Incorporating LIWC in Neural Networks to Improve Human Trait and Behavior Analysis in Low Resource Scenarios
Isil Yakut Kilic1 and Shimei Pan2
1University of Maryland, Baltimore County, 2UMBC
|
|
|
|
|
|
Using Sentence-level Classification Helps Entity Extraction from Material Science Literature
Ankan Mullick1, Shubhraneel Pal2, Tapas Nayak2, Seung-Cheol Lee3, Satadeep Bhattacharjee3, Pawan Goyal2
1Indian Institute of Technology, Kharagpur, 2IIT Kharagpur, 3Indo-Korea Science and Technology
|
|
|
|
|
|
A Twitter Corpus for Named Entity Recognition in Turkish
Buse Çarık and Reyyan Yeniterzi
Sabanci University
|
|
|
|
|
|
A STEP towards Interpretable Multi-Hop Reasoning: Bridge Phrase Identification and Query Expansion
Fan Luo and Mihai Surdeanu
University of Arizona
|
|
|
|
|
|
Question Generation and Answering for exploring Digital Humanities collections
Frederic Bechet1, Elie Antoine1, Jérémy Auguste1, Géraldine Damnati2
1Aix Marseille Universite - LIS/CNRS, 2Orange Labs
|
|
|
|
|
|
Evaluating Retrieval for Multi-domain Scientific Publications
Nancy Ide1, Keith Suderman2, Jingxuan Tu3, Marc Verhagen3, Shanan Peters4, Ian Ross4, John Lawson3, Andrew Borg3, James Pustejovsky3
1Vassar College, 2Johns Hopkins University, 3Brandeis University, 4University of Wisconsin-Madison
|
|
|
|
|
11:10 - 12:30
|
Session: P36 Applications involving LRs and Evaluation
- Poster Area 1
Chair: Zinn, Claus
|
|
|
|
|
|
Modeling Dutch Medical Texts for Detecting Functional Categories and Levels of COVID-19 Patients
Jenia Kim1, Stella Verkijk1, Edwin Geleijn2, Marieke Leeden2, Carel Meskers2, Caroline Meskers2, Sabina Veen3, Piek Vossen4, Guy Widdershoven3
1Vrije Universiteit Amsterdam, 2Department of Rehabilitation Medicine, Amsterdam University Medical Centers, 3Department of Ethics, Law and Humanities, Amsterdam University Medical Centers, 4VU University Amsterdam
|
|
|
|
|
|
Hierarchical Aggregation of Dialectal Data for Arabic Dialect Identification
Nurpeiis Baimukan1, Houda Bouamor2, Nizar Habash1
1New York University Abu Dhabi, 2Carnegie Mellon University in Qatar
|
|
|
|
|
|
Investigating Active Learning Sampling Strategies for Extreme Multi Label Text Classification
Lukas Wertz1, Katsiaryna Mirylenka2, Jonas Kuhn3, Jasmina Bogojeska2
1Universität Stuttgart, 2IBM Research - Zurich, 3University of Stuttgart
|
|
|
|
|
|
German Light Verb Constructions in Business Process Models
Kristin Kutzner and Ralf Laue
University of Applied Sciences Zwickau
|
|
|
|
|
|
PhysNLU: A Language Resource for Evaluating Natural Language Understanding and Explanation Coherence in Physics
Jordan Meadows, Zili Zhou, André Freitas
University of Manchester
|
|
|
|
|
|
HECTOR: A Hybrid TExt SimplifiCation TOol for Raw Texts in French
Amalia Todirascu1, Rodrigo Wilkens2, Eva Rolin3, Thomas François3, Delphine Bernhard4, Núria Gala5
1University of Strasbourg, 2Université catholique de Louvain, 3UCLouvain, CENTAL, 4Lilpa, Université de Strasbourg, 5LPL-CNRS, Aix Marseille Université
|
|
|
|
|
|
AiRO - an Interactive Learning Tool for Children at Risk of Dyslexia
Peter Juel Henrichsen1 and Stine Fuglsang Engmose2
1senior researcher, 2University College Absalon
|
|
|
|
|
|
Creating a Basic Language Resource Kit for Faroese
Annika Simonsen1, Sandra Lamhauge2, Iben Debess2, Peter Henrichsen3
1Grunnurin Talutøkni, 2University of the Faroe Islands, 3Dansk Sprognævn
|
|
|
|
|
|
Developing a Spell and Grammar Checker for Icelandic using an Error Corpus
Hulda Óladóttir1, Þórunn Arnardóttir2, Anton Ingason2, Vilhjálmur Þorsteinsson1
1Miðeind, 2University of Iceland
|
|
|
|
|
|
The TalkMoves Dataset: K-12 Mathematics Lesson Transcripts Annotated for Teacher and Student Discursive Moves
Abhijit Suresh1, Jennifer Jacobs2, Charis Harty3, Margaret Perkoff1, James H. Martin4, Tamara Sumner4
1Graduate Student, 2Associate Research Professor, 3Research Associate, 4Professor
|
|
|
|
|
|
Automating Idea Unit Segmentation and Alignment for Assessing Reading Comprehension via Summary Protocol Analysis
Marcello Gecchele1, Hiroaki Yamada1, Takenobu Tokunaga1, Yasuyo Sawaki2, Mika Ishizuka3
1Tokyo Institute of Technology, 2Waseda University, 3Tokyo University of Technology
|
|
|
|
|
|
IRAC: A Domain-Specific Annotated Corpus of Implicit Reasoning in Arguments
Keshav Singh1, Naoya Inoue2, Farjana Sultana Mim1, Shoichi Naito1, Kentaro Inui3
1Tohoku University, 2Japan Advanced Institute of Science and Technology, 3Tohoku University / Riken
|
|
|
|
|
|
Conversational Speech Recognition Needs Data? Experiments with Austrian German
Julian Linke1, Philip N. Garner2, Gernot Kubin3, Barbara Schuppler4
1SPSC, 2Idiap Research Institute, 3Graz University of Technology, 4SPSC Laboratory, Graz University of Technology
|
|
|
|
|
|
A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications
Vijini Liyanage1, Davide Buscaldi2, Adeline Nazarenko3
1University Sorbonne Paris Nord, 2LIPN, Université Paris 13, 3Université Sorbonne Paris Nord
|
|
|
|
|
|
Building a Dataset for Automatically Learning to Detect Questions Requiring Clarification
Ivano Lauriola1, Kevin Small2, Alessandro Moschitti2
1Amazon Alexa AI, 2Amazon Alexa AI Web Information
|
|
|
|
|
|
The ALPIN Sentiment Dictionary: Austrian Language Polarity in Newspapers
Thomas Kolb1, Sekanina Katharina2, Bettina Kern3, Julia Neidhardt1, Tanja Wissik4, Andreas Baumann3
1TU Wien, 2Austrian Academy of Sciences, 3University of Vienna, 4Academy of Sciences
|
|
|
|
|
|
Text Classification and Prediction in the Legal Domain
Minh-Quoc Nghiem1, Paul Baylis2, André Freitas3, Sophia Ananiadou3
1The University of Manchester, 2Bott and Co Solicitors, 3University of Manchester
|
|
|
|
|
|
I still have Time(s): Extending HeidelTime for German Texts
Andy Luecking1, Manuel Stoeckel2, Giuseppe Abrami2, Alexander Mehler3
1Université Paris Cité and Goethe University Frankfurt, 2Goethe University Frankfurt, 3Goethe-University Frankfurt am Main
|
|
|
|
|
|
Morphological Complexity of Children Narratives in Eight Languages
Gordana Hržica1, Chaya Liebeskind2, Kristina Despot3, Olga Dontcheva-Navratilova4, Laura Kamandulytė-Merfeldienė5, Sara Košutar1, Matea Kramarić1, Giedrė Valūnaitė Oleškevičienė6
1University of Zagreb, 2Jerusalem College of Technology, Lev Academic Center, 3Institute for the Croatian Language and Linguistics, 4Masaryk University, 5Vytautas Magnus University, 6Mykolas Romeris University
|
|
|
|
|
|
EXPRES Corpus for A Field-specific Automated Exploratory Study of L2 English Expert Scientific Writing
Ana-Maria Bucur1, Madalina Chitez2, Valentina Muresan2, Andreea Dinca2, Roxana Rogobete2
1Interdisciplinary School of Doctoral Studies, 2West University of Timișoara
|
|
|
|
|
|
An Evaluation Framework for Legal Document Summarization
Ankan Mullick1, Abhilash Nandy1, Manav Kapadnis1, Sohan Patnaik2, Raghav R1, Roshni Kar3
1Indian Institute of Technology, Kharagpur, 2Undergraduate at Indian Institute of Technology Kharagpur, 3Indian Institute of Technology Kharagpur
|
|
|
|
|
|
Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings
Thibault Charmet1, Inès Cherichi2, Matthieu Allain2, Urszula Czerwinska2, Amaury Fouret2, Benoît Sagot1, Rachel Bawden1
1Inria, 2Cour de Cassation
|
|
|
|
|
|
Cyrillic-MNIST: a Cyrillic Version of the MNIST Dataset
Bolat Tleubayev, Zhanel Zhexenova, Kenessary Koishybay, Anara Sandygulova
Nazarbayev University
|
|
|
|
|
|
gaBERT — an Irish Language Model
James Barry1, Joachim Wagner2, Lauren Cassidy3, Alan Cowap3, Teresa Lynn3, Abigail Walsh4, Mícheál Ó Meachair5, Jennifer Foster3
1ADAPT Centre DCU, 2ADAPT Centre, Dublin City University, 3Dublin City University, 4ADAPT Centre / Dublin City University, 5Fiontar and Scoil na Gaeilge
|
|
|
|
|
11:10 - 12:30
|
Session: P37 Parsing and Tagging
- Poster Area 1
Chair: Osenova, Petya
|
|
|
|
|
|
PoS Tagging, Lemmatization and Dependency Parsing of West Frisian
Wilbert Heeringa1, Gosse Bouma2, Martha Hofman1, Jelle Brouwer2, Eduard Drenth1, Jan Wijffels3, Hans Van de Velde1
1Fryske Akademy, 2University of Groningen, 3BNOSAC
|
|
|
|
|
|
A Dataset of Offensive German Language Tweets Annotated for Speech Acts
Melina Plakidis and Georg Rehm
DFKI
|
|
|
|
|
|
Tracing Syntactic Change in the Scientific Genre: Two Universal Dependency-parsed Diachronic Corpora of Scientific English and German
Marie-Pauline Krielke, Luigi Talamo, Mahmoud Fawzi, Jörg Knappen
Saarland University
|
|
|
|
|
|
The Tembusu Treebank: An English Learner Treebank
Luís Morgado da Costa1, Francis Bond1, Roger Winder2
1Palacký University, 2Nanyang Technological University
|
|
|
|
|
|
The Norwegian Dialect Corpus Treebank
Andre Kåsen1, Kristin Hagen2, Anders Nøklestad2, Joel Priestly2, Per Erik Solberg1, Dag Haug2
1National Library of Norway, 2Department of Linguistics and Scandinavian Studies
|
|
|
|
|
|
RRGparbank: A Parallel Role and Reference Grammar Treebank
Tatiana Bladier, Kilian Evang, Valeria Generalova, Zahra Ghane, Laura Kallmeyer, Robin Möllemann, Natalia Moors, Rainer Osswald, Simon Petitjean
Heinrich Heine University Düsseldorf
|
|
|
|
|
|
Unifying Morphology Resources with OntoLex-Morph. A Case Study in German
Christian Chiarcos1, Christian Fäth2, Maxim Ionov3
1Goethe-Universität Frankfurt am Main, 2Universität Frankfurt, 3Goethe-Universität Frankfurt
|
|
|
|
|
12:30 - 13:00
|
Invited Local Talk - José Deulofeu
- Auditorium
Chair: Bechet, Frédéric
|
|
|
|
|
13:00 - 14:30
|
Lunch Break
|
|
|
|
|
14:30 - 15:10
|
Antonio Zampolli Prize Talk
- Auditorium
Chair: (TBA)
|
|
|
|
|
15:10 - 15:15
|
Short Break (5 min)
|
|
|
|
|
15:15 - 16:35
|
Session O37: Anaphora and Coreference
- Salle 120
Chair: Magnini, Bernardo
Co-Chair: De Bruyne, Luna
|
|
|
|
|
15:15 - 15:35
|
Building Dataset for Grounding of Formulae — Annotating Coreference Relations Among Math Identifiers
Takuto Asakura1, Yusuke Miyao1, Akiko Aizawa2
1The University of Tokyo, 2National Institute of Informatics
|
|
|
|
|
15:35 - 15:55
|
CorefUD 1.0: Coreference Meets Universal Dependencies
Anna Nedoluzhko1, Michal Novák2, Martin Popel3, Zdeněk Žabokrtský4, Amir Zeldes5, Daniel Zeman2
1Charles University in Prague, 2Charles University, Faculty of Mathematics and Physics, 3Charles University, Faculty of Mathematics and Physics, UFAL, 4Charles University, 5Georgetown University
|
|
|
|
|
15:55 - 16:15
|
The Universal Anaphora Scorer
Juntao Yu1, Sopan Khosla2, Nafise Sadat Moosavi3, Silviu Paun4, Sameer Pradhan5, Massimo Poesio4
1University of Essex, 2Amazon Web Services, Amazon Inc, 3Department of Computer Science, The University of Sheffield, 4Queen Mary University of London, 5University of Pennsylvania and cemantix.org
|
|
|
|
|
16:15 - 16:35
|
Towards Evaluation of Cross-document Coreference Resolution Models Using Datasets with Diverse Annotation Schemes
Anastasia Zhukova1, Felix Hamborg2, Bela Gipp1
1University of Wuppertal, 2University of Konstanz
|
|
|
|
|
15:15 - 16:35
|
Session O38: Information Extraction and Information Retrieval
- La Major
Chair: Strassel, Stephanie
Co-Chair: Hsu, Yu-Yin
|
|
|
|
|
15:15 - 15:35
|
Explainable Tsetlin Machine Framework for Fake News Detection with Credibility Score Assessment
Bimal Bhattarai1, Ole-Christoffer Granmo2, Lei Jiao1
1University of Agder, 2Centre for Artificial Intelligence Research
|
|
|
|
|
15:35 - 15:55
|
Enhancing Deep Learning with Embedded Features for Arabic Named Entity Recognition
Ali Hatab1, Caroline Sabty2, Slim Abdennadher1
1German University in Cairo, 2German International University
|
|
|
|
|
15:55 - 16:15
|
SCAI-QReCC Shared Task on Conversational Question Answering
Svitlana Vakulenko1, Johannes Kiesel2, Maik Fröbe3
1Amazon, 2Bauhaus-Universität Weimar, 3Martin-Luther-Universität Halle-Wittenberg
|
|
|
|
|
16:15 - 16:35
|
Semantic Relations between Text Segments for Semantic Storytelling: Annotation Tool - Dataset - Evaluation
Michael Raring1, Malte Ostendorff1, Georg Rehm2
1German Research Center for Artificial Intelligence, 2DFKI
|
|
|
|
|
15:15 - 16:35
|
Session O39: Multilinguality and Multimodality
- Auditorium
Chair: Declerck, Thierry
Co-Chair: Götze, Jana
|
|
|
|
|
15:15 - 15:35
|
Evaluating Pre-training Objectives for Low-Resource Translation into Morphologically Rich Languages
Prajit Dhar, Arianna Bisazza, Gertjan van Noord
University of Groningen
|
|
|
|
|
15:35 - 15:55
|
Aligning Images and Text with Semantic Role Labels for Fine-Grained Cross-Modal Understanding
Abhidip Bhattacharyya1, Cecilia Mauceri1, Martha Palmer2, Christoffer Heckman1
1University of Colorado Boulder, 2University of Colorado
|
|
|
|
|
15:55 - 16:15
|
Rosetta-LSF: an Aligned Corpus of French Sign Language and French for Text-to-Sign Translation
Elise Bertin-Lemée1, Annelies Braffort2, Camille Challant3, Claire Danet2, Boris Dauriac4, Michael Filhol2, Emmanuella Martinod2, Jérémie Segouat5
1SYSTRAN, 2LISN, CNRS, Université Paris-Saclay, 3Université Paris-Saclay, CNRS, LISN, 4MocapLab, 5CLLE, Université Jean Jaurès, Toulouse
|
|
|
|
|
16:15 - 16:35
|
MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset
Marina Fomicheva1, Shuo Sun2, Erick Fonseca3, Chrysoula Zerva4, Frédéric Blain5, Vishrav Chaudhary6, Francisco Guzmán7, Nina Lopatina8, Lucia Specia9, André F. T. Martins10
1University of Sheffield, 2Johns Hopkins University, 3Instituto de Telecomunicações, 4Instituto de Telecomunicações, Instituto Superior Técnico, University of Lisbon, 5University of Wolverhampton, 6Facebook AI, 7Facebook, 8IQT Labs, 9Imperial College London, 10Unbabel, Instituto de Telecomunicações
|
|
|
|
|
15:15 - 16:35
|
Session O40:
- Salle 92
Chair: (TBA)
|
|
|
|
|
15:15 - 16:35
|
Session: P38 Less-Resourced Languages (2)
- Poster Area 2
Chair: Soroa, Aitor
|
|
|
|
|
|
OpenKorPOS: Democratizing Korean Tokenization with Voting-Based Open Corpus Annotation
Sangwhan Moon1, Won Ik Cho2, Hye Joo Han3, Naoaki Okazaki1, Nam Soo Kim4
1Tokyo Institute of Technology, 2Department of Electrical and Computer Engineering and INMC, Seoul National University, 3Odd Concepts Inc., 4Seoul National University
|
|
|
|
|
|
Enriching Grammatical Error Correction Resources for Modern Greek
Katerina Korre1 and John Pavlopoulos2
1University of Bologna, 2Stockholm University
|
|
|
|
|
|
A Hmong Corpus with Elaborate Expression Annotations
David R. Mortensen1, Xinyu Zhang2, Chenxuan Cui2, Katherine Zhang2
1Language Technologies Institute, Carnegie Mellon University, 2Carnegie Mellon University
|
|
|
|
|
|
ELAL: An Emotion Lexicon for the Analysis of Alsatian Theatre Plays
Delphine Bernhard and Pablo Ruiz Fabo
Lilpa, Université de Strasbourg
|
|
|
|
|
|
Universal Dependencies for Western Sierra Puebla Nahuatl
Robert Pugh1, Marivel Huerta Mendez2, Mitsuya Sasaki2, Francis Tyers1
1Indiana University, 2Independent
|
|
|
|
|
|
The Construction and Evaluation of the LEAFTOP Dataset of Automatically Extracted Nouns in 1480 Languages
Gregory Baker and Diego Molla
Macquarie University
|
|
|
|
|
|
Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition
Rodolfo Zevallos1, Luis Camacho2, Nelsi Melgarejo3
1Pompeu Fabra University, 2PUCP, 3Pontifical Catholic University of Peru
|
|
|
|
|
|
Writing System and Speaker Metadata for 2,800+ Language Varieties
Daan van Esch, Tamar Lucassen, Sebastian Ruder, Isaac Caswell, Clara Rivera
Google Research
|
|
|
|
|
|
The PALMA Corpora of African Varieties of Portuguese
Tjerk Hagemeijer1, Amália Mendes2, Rita Gonçalves3, Catarina Cornejo3, Raquel Madureira3, Michel Généreux4
1Universidade de Lisboa, 2Centre of Linguistics, School of Arts and Humanities, University of Lisbon, 3University of Lisbon, School of Arts and Humanities, Center of Linguistics, 4McMaster University
|
|
|
|
|
|
A Learning-Based Dependency to Constituency Conversion Algorithm for the Turkish Language
Büşra Marşan1, Oğuz Yıldız2, Aslı Kuzgun3, Neslihan Cesur4, Arife Yenice4, Ezgi Sanıyar4, Oğuzhan Kuyrukçu4, Bilge Arıcan4, Olcay Taner Yıldız5
1Boğaziçi University, 2Ahmet Keleşoğlu High School, 3Starlang Yazılım, 4Starlang Yazılım Danışmanlık, 5Department of Computer Engineering, Ozyegin University
|
|
|
|
|
|
Standard German Subtitling of Swiss German TV content: the PASSAGE Project
Jonathan David Mutal1, Pierrette Bouillon2, Johanna Gerlach3, Veronika Haberkorn3
1UNIGE, 2UNIGE FTI, 3Université de Genève FTI/TIM
|
|
|
|
|
|
A Survey of Multilingual Models for Automatic Speech Recognition
Hemant Yadav1 and Sunayana Sitaram2
1MIDAS, IIITD, 2Microsoft Research India
|
|
|
|
|
|
LuxemBERT: Simple and Practical Data Augmentation in Language Model Pre-Training for Luxembourgish
Cedric Lothritz1, Bertrand Lebichot1, Kevin Allix2, Lisa Veiber1, Tegawende Bissyande3, Jacques Klein1, Andrey Boytsov4, Clément Lefebvre4, Anne Goujon4
1University of Luxembourg, 2SnT / University of Luxembourg, 3SnT, University of Luxembourg, 4BGL BNP Paribas
|
|
|
|
|
|
PerPaDa: A Persian Paraphrase Dataset based on Implicit Crowdsourcing Data Collection
Salar Mohtaj1, Fatemeh Tavakkoli2, Habibollah Asghari3
1Technische Universität Berlin, 2Freie Universität Berlin, 3Department of Electrical and Computer Engineering, University of Tehran
|
|
|
|
|
|
Introducing the Welsh Text Summarisation Dataset and Baseline Systems
Ignatius Ezeani1, Mahmoud El-Haj1, Jonathan Morris2, Dawn Knight3
1Lancaster University, 2Cardiff University, Wales, 3Cardiff University
|
|
|
|
|
|
A Systematic Approach to Derive a Refined Speech Corpus for Sinhala
Disura Warusawithana, Nilmani Kulaweera, Lakshan Weerasinghe, Buddhika Karunarathne
Department of Computer Science and Engineering, University of Moratuwa
|
|
|
|
|
|
IgboBERT Models: Building and Training Transformer Models for the Igbo Language
Chiamaka Chukwuneke, Ignatius Ezeani, Paul Rayson, Mahmoud El-Haj
Lancaster University
|
|
|
|
|
|
Latvian National Corpora Collection – Korpuss.lv
Baiba Saulite1, Roberts Darģis2, Normunds Gruzitis3, Ilze Auzina4, Kristīne Levāne-Petrova5, Lauma Pretkalniņa2, Laura Rituma2, Peteris Paikens6, Arturs Znotins2, Laine Strankale5, Kristīne Pokratniece5, Ilmārs Poikāns5, Guntis Barzdins3, Inguna Skadiņa7, Anda Baklāne8, Valdis Saulespurēns8, Jānis Ziediņš9
1IMCS, University of Latvia, 2Institute of Mathematics and Computer Science, University of Latvia, 3University of Latvia, 4Institute of Mathematics and Computer Science, University of Latvia, 5IMCS UL, 6University of Latvia, IMCS, 7Tilde / Institute of Mathematics and Computer Science, University of Latvia, 8LNB, 9CISC
|
|
|
|
|
|
Investigating the Relationship Between Romanian Financial News and Closing Prices from the Bucharest Stock Exchange
Ioan-Bogdan Iordache1, Ana Sabina Uban2, Catalin Stoean3, Liviu P. Dinu1
1University of Bucharest, 2Universitat Politecnica de Valencia, University of Bucharest, 3University of Craiova
|
|
|
|
|
|
A Free/Open-Source Morphological Analyser and Generator for Sakha
Sardana Ivanova1, Jonathan Washington2, Francis Tyers3
1University of Helsinki, 2Swarthmore College, 3Indiana University
|
|
|
|
|
|
An Expanded Finite-State Transducer for Tsuut’ina Verbs
Joshua Holden1, Christopher Cox2, Antti Arppe1
1University of Alberta, 2Carleton University
|
|
|
|
|
|
BD-SHS: A Benchmark Dataset for Learning to Detect Online Bangla Hate Speech in Different Social Contexts
Nauros Romim1, Mosahed Ahmed2, Md Saiful Islam3, Arnab Sen Sharma4, Hriteshwar Talukder1, Mohammad Ruhul Amin5
1Shahjalal University of Science and Technology, 2Shahjalal University of Science and Technology, 3University of Alberta, 4Shahjalal University of Science and Technology, 5Fordham University
|
|
|
|
|
15:15 - 16:35
|
Session: P39 Language Resources and Evaluation for Psycho-linguistics, Cognitive Linguistics and Linguistic Theories
- Poster Area 2
Chair: Frontini, Francesca
|
|
|
|
|
|
Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction
Mehdi Mirzapour1, Waleed Ragheb2, Mohammad Javad Saeedizade3, Kevin Cousot4, Helene Jacquenet5, Lawrence Carbon5, Mathieu Lafourcade2
1ContentSide, R&D, 2LIRMM, Univ Montpellier, 3IUST, 4Emvista, 5ContentSide
|
|
|
|
|
|
The Badalona Corpus - An Audio, Video and Neuro-Physiological Conversational Dataset
Philippe Blache1, Salomé Antoine2, Dorina De Jong3, Lena-Marie Huttner4, Emilia Kerr5, Thierry Legou6, Eliot Maës7, Clément François6
1LPL CNRS, 2Freie Universität Berlin, 3Università di Ferrara, 4Laboratoire Parole & Langage, Aix-Marseille Université, 5Laboratoire Parole & Langage (CNRS-Aix-Marseille Université), 6CNRS / Laboratoire Parole et Langage, 7Laboratoire Informatique & Systèmes (CNRS-Aix-Marseille Université)
|
|
|
|
|
|
Reading Time and Vocabulary Rating in the Japanese Language: Large-Scale Japanese Reading Time Data Collection Using Crowdsourcing
Masayuki Asahara
National Institute for Japanese Language and Linguistics
|
|
|
|
|
|
Thematic Fit Bits: Annotation Quality and Quantity Interplay for Event Participant Representation
Yuval Marton1 and Asad Sayeed2
1University of Washington, 2University of Gothenburg
|
|
|
|
|
|
ChiSense-12: An English Sense-Annotated Child-Directed Speech Corpus
Francesco Cabiddu1, Lewis Bott1, Gary Jones2, Chiara Gambi1
1Cardiff University, 2Nottingham Trent University
|
|
|
|
|
|
Making People Laugh like a Pro: Analysing Humor Through Stand-Up Comedy
Beatrice Turano1 and Carlo Strapparava2
1University of Trento, 2FBK-irst
|
|
|
|
|
|
Testing Focus and Non-at-issue Frameworks with a Question-under-Discussion-Annotated Corpus
Christoph Hesse1, Maurice Langner2, Ralf Klabunde3, Anton Benz1
1Leibniz-Zentrum Allgemeine Sprachwissenschaft, 2Sprachwissenschaftliches Institut, Ruhr-Universität Bochum, 3Ruhr-University Bochum
|
|
|
|
|
|
Development of a Multilingual CCG Treebank via Universal Dependencies Conversion
Tu-Anh Tran1 and Yusuke Miyao2
1The University of Tokyo, 2University of Tokyo
|
|
|
|
|
|
The Automatic Extraction of Linguistic Biomarkers as a Viable Solution for the Early Diagnosis of Mental Disorders
Gloria Gagliardi and Fabio Tamburini
FICLIT - University of Bologna
|
|
|
|
|
|
Singlish Where Got Rules One? Constructing a Computational Grammar for Singlish
Siew Yeng Chow1 and Francis Bond2
1Nanyang Technological University, 2Palacký University
|
|
|
|
|
|
COSMOS: Experimental and Comparative Studies of Concept Representations in Schoolchildren
Jeanne Villaneau1 and Farida Said2
1IRISA Université de Bretagne Sud, 2LMBA, Université de Bretagne Sud
|
|
|
|
|
|
Features of Perceived Metaphoricity on the Discourse Level: Abstractness and Emotionality
Prisca Piccirilli and Sabine Schulte im Walde
University of Stuttgart
|
|
|
|
|
|
Hollywood Identity Bias Dataset: A Context Oriented Bias Analysis of Movie Dialogues
Sandhya Singh1, Prapti Roy1, Nihar Sahoo2, Niteesh Mallela1, Himanshu Gupta3, Pushpak Bhattacharyya4, Milind Savagaonkar5, Nidhi Sultan5, Roshni Ramnani6, Anutosh Maitra7, Shubhashis Sengupta6
1Indian Institute of Technology Bombay, 2Indian Institute of Technology, Bombay, 3Indian Institute of Technology, 4Indian Institute of Technology Bombay and Patna, 5Accenture Labs, 6Accenture Technology labs, 7Accenture
|
|
|
|
|
|
VoxCommunis: A Corpus for Cross-linguistic Phonetic Analysis
Emily Ahn1 and Eleanor Chodroff2
1University of Washington, 2University of York
|
|
|
|
|
15:15 - 16:35
|
Session: P40 Digital Humanities (2)
- Poster Area 2
Chair: Passarotti, Marco Carlo
|
|
|
|
|
|
Tracking Textual Similarities in Neo-Latin Drama Networks
Andrea Peverelli1, Marieke van Erp2, Jan Bloemendal1
1The Royal Netherlands Academy of Arts and Sciences, 2KNAW Humanities Cluster
|
|
|
|
|
|
Named Entity Recognition in Estonian 19th Century Parish Court Records
Siim Orasmaa1, Kadri Muischnek2, Kristjan Poska1, Anna Edela1
1University of Tartu, 2associate professor
|
|
|
|
|
|
Investigating Independence vs. Control: Agenda-Setting in Russian News Coverage on Social Media
Annerose Eichel1, Gabriella Lapesa2, Sabine Schulte im Walde1
1University of Stuttgart, 2Universität Stuttgart, Institut für Maschinelle Sprachverarbeitung
|
|
|
|
|
|
SLäNDa version 2.0: Improved and Extended Annotation of Narrative and Dialogue in Swedish Literature
Sara Stymne and Carin Östman
Uppsala University
|
|
|
|
|
|
AGILe: The First Lemmatizer for Ancient Greek Inscriptions
Evelien de Graaf, Silvia Stopponi, Jasper Bos, Saskia Peels-Matthey, Malvina Nissim
University of Groningen
|
|
|
|
|
|
»textklang« – Towards a Multi-Modal Exploration Platform for German Poetry
Nadja Schauffler1, Toni Bernhart1, Andre Blessing1, Gunilla Eschenbach2, Markus Gärtner1, Kerstin Jung1, Anna Kinder2, Julia Koch1, Sandra Richter2, Gabriel Viehhauser1, Ngoc Thang Vu1, Lorenz Wesemann2, Jonas Kuhn1
1University of Stuttgart, 2German Literature Archive
|
|
|
|
|
16:35 - 16:55
|
Coffee Break
|
|
|
|
|
16:55 - 18:00
|
LREC 2022 Closing Ceremony
- Auditorium
Chair: (TBA)
|
|
|
|
|
|
|
|
|
20:00
|
LREC 2022 GALA Dinner Ceremony
- Pharo
|
|
|
|
|
|
|
|
|