“Very Short Extracts”, Quotes and the Seven Word Question in European #copyright

For those who don’t know, a copyright reform is happening in the EU. It is in many ways peculiar: some good things are happening, some not so good things, and also some suggestions that, in my opinion, don’t make very much sense. One of them is the new term “very short extract”, meant to describe the potentially lawful length of a snippet of the kind commonly shown by service platforms such as Google, Reddit, or Facebook. Here is a picture to explain the terminology:

A comprehensive overview of “Headline”, “7 words” and a “snippet”.

The proposal is that press publishers can claim copyright for a service platform’s use of their content, namely the headline and the “snippet”, which is a short extract of an article, normally its first few words. This is derived from German copyright law, specifically the ancillary right for press publishers. Well, because copyright is all about being ‘balanced’ and ‘fair’ as we know, the German courts have tried to settle the issue of how many words from a snippet a service platform can use without paying. Their compromise was 7 words, which was promptly rejected by both sides. However, the 7 words exception is being used as the starting point at EU level when discussing the EU-wide copyright reform.

The Test

To demonstrate how much 7 words cover in different languages, I have decided to count the words and translate their meaning directly. The control sentence I will use is the title of the InfoSoc directive: “on the harmonisation of certain aspects of copyright and related rights in the information society”. The sentence is 15 words in English, so the score will be out of 15. The control language for the purpose of this research will be English.

English: on the harmonisation of certain aspects of

Score: 7 out of 15

Let’s see how much meaning other languages can convey in 7 words.

Spanish: relativa a la armonización de determinados aspectos
Translation: on the harmonisation of certain aspects. 6 out of 15

German: zur Harmonisierung bestimmter Aspekte des Urheberrechts und
Translation: On the harmonisation of certain aspects of copyright. 8 out of 15

Estonian: autoriõiguse ja sellega kaasnevate õiguste teatavate aspektide
Translation: on certain aspects of copyright and related rights. 8 out of 15

French: sur l’harmonisation de certains aspects du droit d’auteur
Translation: on the harmonisation of certain aspects of copyright. 8 out of 15

Croatian: o usklađivanju određenih aspekata autorskog i srodnih
Translation: on the harmonisation of certain aspects of copyright and related. 10 out of 15

Finnish: tekijänoikeuden ja lähioikeuksien tiettyjen piirteiden yhdenmukaistamisesta tietoyhteiskunnassa
Translation: “on the harmonisation of certain aspects of copyright and related rights in the information society”, the whole sentence! 15 out of 15!

The winner is Finnish with a full score, 15 out of 15, followed by Croatian with 10 out of 15! At the bottom is the Spanish translation, with 6 words out of 15.
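
For anyone who wants to repeat the test, here is a minimal Python sketch of the counting step. It assumes that a “word” is simply a whitespace-separated token, which is my simplification and not a legal definition. The names `very_short_extract` and `describe` are made up for this illustration. The extracts are copied from the comparison above.

```python
def very_short_extract(text: str, max_words: int = 7) -> str:
    """Keep only the first `max_words` whitespace-separated words."""
    return " ".join(text.split()[:max_words])


# English title of the InfoSoc directive, as quoted in the test above.
TITLE_EN = ("on the harmonisation of certain aspects of copyright "
            "and related rights in the information society")

# Seven-word extracts in three of the languages compared above.
EXTRACTS = {
    "English": very_short_extract(TITLE_EN),
    "Croatian": "o usklađivanju određenih aspekata autorskog i srodnih",
    "Finnish": ("tekijänoikeuden ja lähioikeuksien tiettyjen piirteiden "
                "yhdenmukaistamisesta tietoyhteiskunnassa"),
}


def describe(extracts: dict[str, str]) -> dict[str, tuple[int, int]]:
    """Map each language to (word count, character count) of its extract."""
    return {lang: (len(text.split()), len(text))
            for lang, text in extracts.items()}


if __name__ == "__main__":
    for lang, (words, chars) in describe(EXTRACTS).items():
        print(f"{lang}: {words} words, {chars} characters")
```

Every extract comes out at the same 7 words, yet the amount of text, and as the translations show, the amount of meaning, differs widely. A word count measures very little once you cross a language border.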


What I find interesting is the complete lack of understanding of how much meaning 7 words can convey in different European languages. There are 24 working languages in the EU, and that is not counting languages that are used every day by large populations, such as Russian, Arabic and Turkish. European languages differ in structure, in syntax, in their use of prefixes and suffixes, and in how they express a possessive, and they come from different roots.

Measuring an extract at a European level based on a fixed number of words is a ridiculous way to regulate anything. Besides the obvious fact that an “extract” or a “very short extract” is not a legal term that has traditionally been used in copyright to describe an exception or limitation, the approach is simply legally incoherent. Will every single European country try to find what the optimal “very short extract” is supposed to be? One word in Finnish but 4 words in Spanish? And how will that work out in practice?

The word “quote”, in turn, has a legally coherent meaning, as a quote retains its context across different languages, regardless of how many words you need to convey the meaning. But by redefining quotes as extracts, publishers can more or less put a price on words, taking copyright yet another step away from its objective: to protect authors and their creations in a context.