ELSE-IF workshop @ SEMANTiCS 2019

European Language SErvices — Industry Forum

Published in Semantic Tech Hotspot, Sep 25, 2019

Introduction

The workshop took place in Karlsruhe on September 9, one day before the main program of SEMANTiCS. It aimed to bring together, on the one hand, researchers from across disciplines concerned with the use of language data, the development of language methods, and the implementation of language services and, on the other hand, the industries that consume those data and services.

Language technologies increasingly rely on large amounts of data. Better access to and use of language resources will enable multilingual solutions that support the emerging Digital Single Market in Europe. However, data is rarely ‘ready-to-use’, and language technology specialists spend over 80% of their time collecting, cleaning, and organizing datasets. Reducing this effort promises huge cost savings in all sectors where language technologies are required.

The workshop was sponsored by the Pret-a-LLOD project, funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No 825182.

Around 15 participants attended in total, among them representatives of commercial organizations.

Invited speakers:

Luc Meertens, Crosslang
Georg Rehm, DFKI

Overview

The two invited talks, one with a market focus and one with a research focus, complemented each other. First, Luc Meertens gave an overview of the language technology (LT) market. The market appears segmented and small, but it is growing. The key recommendations stress the lack of venture capital and the need for stronger infrastructure. In the second invited talk, Georg Rehm focused on the strategy of the European Language Grid (ELG) project for addressing the identified challenges: the ELG project will prepare a single infrastructure for a single digital language market. Pret-a-LLOD and other similar projects benefit from such a marketplace and will use it to make their results visible, discoverable, and sustainable.

Slides by Luc Meertens

Slides by Georg Rehm — the link will follow in ~2 weeks

Luc Meertens “An analysis of the Language Technology market” | Georg Rehm “What’s next for Multilingual Europe?”

The talks by Thierry Declerck, Matthias Orlikowski, and Artem Revenko demonstrated different real-life use cases for the Pret-a-LLOD resources and methodologies: the extension of XBRL, and the Semalytix and Semantic Web Company pilots from the Pret-a-LLOD project. All three use cases use linguistic data to extend their analytic capabilities.

In the stand-up session, two workshop participants described their related developments:

Extraction of lexical information from Wikidata (https://tools.wmflabs.org/ordia/).
Domain-specific entity recognition.

Matthias Orlikowski “Multilingual Text Analytics for Extracting Pharma Real-World Evidence”

Plan for the next event

The workshop concluded with a round-table discussion in which the participants suggested ways to improve next year's event and increase its impact.

For the next event, it makes sense to focus on a tutorial that explains to industrial participants how language resources can benefit industrial use cases. A simpler format for involving the participants also appears useful, for example a stand-up session with strict time limits but without prior submission. A demo session would probably be worthwhile as well.

The pilots of the Pret-a-LLOD project should be the driving force in clarifying how exactly language resources can benefit industry.

Most related events next year

META-FORUM

Significant industry participation, around 30–40%. Many European projects are disseminated at the event, which offers exhibition slots for projects.

Language Resources and Evaluation Conference

More academic; however, the event attracts significant attention, and the organizers are eager to attract more industry participants.

LT Industry Summit

A stronger industrial focus, hence more of the target audience and possibly relevant use cases. The schedule is tight.