LREC Industry track proceedings:
The European Language Resources Association (ELRA, www.elra.info) is glad to announce the 12th edition of LREC, organized with the support of international associations and a number of industrial partners and supporters.
Since the first LREC, held in Granada in 1998, LREC has become the major event on Language Resources (LRs) and Evaluation for Language Technologies (LT), with over 1200 attendees from all over the world. LREC provides a unique forum for researchers, industry practitioners and funding agencies from across a wide spectrum of areas to discuss problems and opportunities, find new synergies and promote initiatives for international cooperation. These exchanges support investigations in language sciences, progress and innovation in language technologies, and the development of corresponding products, services, applications and standards.
For the second time and as a hot LREC 2020 topic, an industry track will take place during the main conference (May 13-15, 2020).
Human language technologies have become increasingly important parts of our lives. These technologies have emerged from decades of collaborations between academic and industrial research organizations often with financial and research support from the public sector; collaborations are made possible by the unique strengths of both communities and a set of shared practices (algorithms, evaluation methods, datasets, and the like). But despite this, there are substantial differences between research in academic and industrial settings.
In contrast to academic research: industrial speech and language technologies may pose unique challenges of scale; language resources from industry may demand different algorithms or evaluation methodologies than in academic settings; the practices of academic and industrial settings may converge on distinct methods for the same problem; and industrial systems and practices may pose ethical challenges not necessarily present in academic settings.
Topics of Interest
Topics include, but are not limited to:
1. Industrial systems
For this topic, we welcome submissions which discuss industrial systems. They may describe technical innovations which are enabled by the industrial setting, or they may describe the implementation of a deployed industrial system. We also welcome submissions which discuss failures to replicate "state-of-the-art" performance when provided with the affordances of an industrial setting. Finally, we also welcome opinion papers which discuss similarities and differences between academic and industrial practices for system development and evaluation, or which consider ethical issues specific to systems deployed at industry scale.
2. Tools and platforms for data collection
Data collected in an industry setting may pose specific technical, legal, and ethical challenges not normally encountered in academic settings. The infrastructure within which developers in industry operate can provide tremendous advantages, but also unique challenges. There can be significant differences in the context of a tool's operator or a data platform's customer in industry vs. academic applications. Platforms may be globally distributed, and the scale itself of the data and of the deployment of industry technologies can add significant complexity, which may demand innovative approaches. Industry developers may also face special problems in defining users, their orientation to their tasks, and what constitutes a successful interaction from the standpoint of the user and of data acquisition efforts. We welcome submissions which discuss industrial tools and platforms used to collect data.
3. Human computation in industry
Industrial language technologies depend on machine learning methods, which in turn require large, diverse collections of labeled data collected from humans for rapid iterative development and refinement. We welcome submissions which discuss issues in experimental design for human computation, the challenges of quality, diversity, and representation in crowdsourcing, and ethical issues posed by data collection via crowdsourcing and outsourcing.
4. Less-resourced languages
One goal for this year's LREC is to strengthen connections with the Mediterranean speech and language communities, in particular for the less-resourced languages. These cover a large number of languages, with associated varieties (European languages, varieties of Semitic languages, indigenous languages, spoken-only languages, etc.). We therefore welcome submissions which discuss industrial resources and technologies specific to the challenges posed by such languages.
5. Spoken languages and dialects
We are particularly interested in work which describes industrial resources and technologies for spoken languages and non-standard dialects. We therefore welcome submissions which focus on these topics, especially those which contrast resources and technologies for spoken and written language, or for standard and non-standard language.
We encourage submissions of papers for oral or poster presentation. Papers should follow the. The working language of the track is English. Submitted papers must be written and delivered in English, be up to 4 pages in length and in PDF format.
Identify, Describe and Share your LRs!
Describing your language resources (LRs) in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). This LREC feature is available to submissions within this track and highly recommended. To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.
Scientific work requires accurate citations of referenced work, so that the community can understand the whole context and replicate the experiments conducted by other researchers. LREC 2020 therefore endorses the need to uniquely Identify LRs through the use of the International Standard Language Resource Number (ISLRN), a Persistent Unique Identifier to be assigned to each LR. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.
● Paper submission: 28 February 2020, extended to 31 March 2020
● Notification of acceptance: 13 March 2020, extended to 30 April 2020
● Camera-ready paper: 03 April 2020
● Track Date: to be defined (13-15 May 2020)
- Khalid Choukri (ELRA/ELDA, France),
- Bente Maegaard (University of Copenhagen, Denmark),
- Nicoletta Calzolari (Institute for Computational Linguistics «A. Zampolli», CNR, Italy; ELRA, France)