Data Privacy vs. Database Rights: Balancing The Scales

Photo by Ima Miroshnichenko in Pexels

Global economic trends and harsher business environments have pointed companies to the vast potential that lies in data, datasets, and databases. As businesses merge and exit, these form a strategic part of their asset base from which they can create more value.

With this comes the need to protect the underlying IP in these databases, especially as part of a wider strategy to promote data gathering and innovation. But there are complexities. Copyright laws, along with the TRIPS Agreement¹, notably confer copyright status on databases² and other compilations, in so far as they represent an intellectual creation through the selection or arrangement of their contents. This would apply even if some or all of the individual contents are not by themselves eligible for copyright protection.

At the same time, where these databases contain PII (Personally Identifiable Information), data protection laws empower a data subject to participate in or control the manner or extent of the processing of their data in the database. Naturally, exercising these rights may interfere with the protection these databases enjoy under copyright law.

A database owner looking to pre-empt such conflicts would thus have to reconcile privacy considerations with their copyright in the database. The question this raises, of course, is to what extent the law balances these competing interests. This article examines the legal landscape, exploring how current laws protect database rights in light of the need to respect privacy rights.

Copyright protection of databases: An Overview

To start with, it’s worth going over the different modalities created under copyright law to protect the creation of databases:

1. Database copyrights

The same motives for the protection of literary works in Article 2(5) of the Berne Convention² underlie database copyrights. Traditionally, the focus of copyright was on protecting the resources, skill, and labor put into recording and compiling a database.

This principle is the “Sweat of the Brow” or “Industrious Collection” doctrine in copyright protection. Under this doctrine, an author ‘earns’ copyrights through the diligent effort they have expended towards creating a work, such as a database or directory, without the need for substantial creativity or originality.

Accordingly, the author of a work—even if it lacks originality—is entitled to protection for the skill, judgment, labor, and other resources invested. Third parties are precluded from using such work without permission and must instead replicate it through their independent research or efforts.

The “Sweat of the Brow” rule was later rejected by the U.S. Supreme Court in the widely referenced case of Feist Publications v. Rural Telephone. The case concerned a defendant (Feist) who copied information from the claimant’s (Rural’s) telephone listings after Rural had refused to license the data for inclusion in Feist’s own directory. Alleging copyright infringement, the claimant sued. The court found no infringement because the listings lacked originality, while accepting that:

“Factual compilations… may possess the requisite originality. The compilation author typically chooses which facts to include, in what order to place them, and how to arrange the collected data so that they may be used effectively by readers. These choices as to selection and arrangement, so long as they are made independently by the compiler and entail a minimal degree of creativity, are sufficiently original that Congress may protect such compilations through the copyright laws.”³

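Delivering the opinion of the court, Justice Sandra Day O’Connor went on to trace the contours of the originality requirement. Having noted two well-established propositions, that facts are not copyrightable but compilations of facts generally are, she continued: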

“There is an undeniable tension between these two propositions. … Many compilations consist of nothing but raw data, i.e., wholly factual information not accompanied by any original written expression. On what basis may one claim a copyright in such a work? Common sense tells us that 100 uncopyrightable facts do not magically change their status when gathered together in one place. … The key to resolving the tension lies in understanding why facts are not copyrightable. The sine qua non of copyright is originality.”

From this, it is clear that facts residing in a database and not touched by some creative input, however subtle, may not have copyright status. The judgment further faulted “Sweat of the Brow” courts for disregarding a core principle of copyright law: that facts and ideas cannot be copyrighted.

Now, for a database to merit copyright protection, the facts must be selected, coordinated, or arranged in such a manner as to make the work original.

2. Sui generis database rights (“database rights”)

The problem with the originality requirement arises where a creation is constrained by rules or technical considerations that leave no room for creative freedom. The gaps in traditional copyright standards meant only the structure of a database could be protected — not its content.

Take an author who made significant investments in compiling a database of financial market data, such as stock prices or currency exchange rates.

Such an investment may be unprotected either because the database was ‘non-original’, or because its content did not enjoy copyright protection. This would offer little incentive for database building. The hardship this created led the EU, through the Database Directive⁴ (Directive 96/9/EC), to adopt an investment-based approach to protection that:

Ensures a consistent level of copyright protection for “original” databases by harmonizing the requirements across the EU; and
Creates a sui generis right that focuses on protecting the investments of database authors, namely the content of the database and its economic value.

For database rights to accrue, Article 7 requires “a substantial investment in either the obtaining, verification or presentation of the contents” of the database. To benefit, the owner(s) of such a database must also have a sufficient connection with an EEA state (for instance, as a national or habitual resident of a Member State). The right lasts for a period of 15 years from the date of the database’s completion or publication.

“The Database Problem”: How data privacy may interfere with database rights

An underlying aim of protecting database rights, or any copyright in data, is to shield the database from unauthorized commercial exploitation by third parties.

In turn, the hope is that this protection encourages the creation of more databases and fosters innovation, but not in a way that creates tension with the privacy rights of data subjects. Accordingly, where Personally Identifiable Information (“PII”) forms part of a dataset or database, data privacy law considerations come into play.

As if to pre-empt this situation, privacy laws such as the GDPR provide for a special set of rights called “data subject participation rights”. These rights allow a data subject to exercise their prima facie right of control over the use of any PII.

They include the right to rectification (Article 16), the right to erasure, otherwise known as the right to be forgotten (Article 17), the right to restriction of processing (Article 18), and the right to data portability (Article 20).

As an example, a data subject may, in the exercise of their right to be forgotten, require that the owner of a database delete or delist their data. In essence, such a data subject would be piercing the metaphorical veil of copyright protection that resides with the database owner to exercise their privacy rights, thereby modifying the database owner’s proprietary interest in the database.

The 2014 Google Spain Case⁵, also known as the Costeja Case, brought this legal hypothesis to real life. The case involved a Spaniard, Mario Costeja González, who wanted links to an old newspaper notice about the forced auction of his property over social security debts removed from search results. The Court of Justice of the European Union (CJEU) ruled in his favor.

This decision recognized a “right to be delisted,” an iteration of the “right to be forgotten.” In effect, the right compels search engines like Google, in their role as data controllers, to honor individuals’ requests to remove links to outdated, irrelevant, or excessive personal information.

From all indications, these kinds of occurrences may prove disruptive to business. While it is true that the GDPR does not expressly provide for the protection of copyrights, it implicitly provides for certain restrictions that arguably extend to IP rights.

According to Article 15(4) of the GDPR, the data subject’s right to obtain a copy of their personal data “shall not adversely affect the rights and freedoms of others.” In the same way, Recital 63 states that the right of access should not adversely affect the rights or freedoms of others, including trade secrets or intellectual property.

A real-life application of this restriction can be seen in an exchange that followed a journalist’s data subject access request (DSAR) to Tinder⁶.

The request returned 800 pages of data collected from the journalist during her use of the app. Interestingly, the returned data did not disclose how Tinder utilized this data to personalize her experience and match her with potential partners. When asked, Tinder responded that their matching tools are proprietary technology and intellectual property — sensitive company information that they are not compelled to disclose.

The legal validity of this response is yet to be tested in court. But it underscores the problem of distinguishing personal data from the related proprietary data that businesses own rights to. The right to data portability under Article 20, which entitles data subjects to receive their personal data in a structured, commonly used, and machine-readable format, poses a similar issue.

Where a data subject elects to exercise their data portability rights, Article 20 limits the right to the personal data they provided. It does not extend to the (potentially more valuable) derived data or to any proprietary know-how used by the controller.
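The distinction can be illustrated with a minimal Python sketch of a portability export. It is only an illustration under stated assumptions: the `PROVIDED` and `DERIVED` labels, the `DataField` class, the `portability_export` function, and the sample record are all hypothetical, and deciding which label a given field deserves is a legal judgment that the code does not make.

```python
import json
from dataclasses import dataclass
from typing import Any, List

# Hypothetical origin labels for illustration; they are not terms defined by the GDPR.
PROVIDED = "provided"  # supplied by the data subject
DERIVED = "derived"    # inferred or computed by the controller


@dataclass
class DataField:
    name: str
    value: Any
    origin: str


def portability_export(record: List[DataField]) -> str:
    """Return a JSON export containing only the fields the data subject provided."""
    exportable = {f.name: f.value for f in record if f.origin == PROVIDED}
    return json.dumps(exportable, indent=2, sort_keys=True)


# Sample record with made-up values.
record = [
    DataField("email", "user@example.com", PROVIDED),
    DataField("date_of_birth", "1990-01-01", PROVIDED),
    DataField("match_score", 0.87, DERIVED),
]

export = portability_export(record)
print(export)
```

JSON is used here simply as one example of a structured, commonly used, and machine-readable format. The derived `match_score` field stays out of the export, mirroring the limit described above.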

Who owns data?

Photo by Muhammed Ensar in Pexels

The question of “who owns what data?” is crucial to determining whether or not a database owner is obliged under data privacy law to honor data subject rights. The assessment, however, is a subjective one that turns on the nature or origin of the data involved (data can be produced by a user or by a machine). Data can be grouped into four broad categories, namely:

Observable Information

Information created by a user about themselves, or seen firsthand by others, is observable information. Examples are personal correspondence like letters or emails written by the user, or recorded media such as CCTV surveillance footage, personal photos, or audio recordings made by a third party.

Observed Information

Observed information is data based on a third party’s observation or provided by a user to the exclusion of a third party.

This includes basic details like place and date of birth, physical traits, personal preferences, social traits, family information, employment details, biological conditions, and geolocation information.

Computed Information

Computed PII is a derivative of observable or observed information. It is created when existing data is used to produce fresh, inferred data about an individual.

Examples: Online advertising profiles created from automated profiling and biometric data (derived from observable information about the physical characteristics of an individual).

Associated Information

Data that a third party may link with an individual, but that by itself is not sufficient to identify them.

Examples include Social Security Numbers, Driver’s License numbers, IP addresses, device identifiers, and financial information (think bank account numbers).

In the case of observable or observed information, the issue of ownership is straightforward — the party who reduces the data into copyrightable material should own the rights.

Concerning computed information, as seen in the Tinder case, if a database owner is not in breach of any data privacy laws, they should own rights to computed information — since they expended significant labor, skill, and judgment in arriving at the computed dataset.

Finally, associated information should, by default, vest in the database owner. However, wherever it clearly identifies a data subject, they should be equally entitled to usage rights.

The “Ownership” debate

Globally, the idea of attributing ownership rights of personal data to a data subject is gathering pace. Pro-ownership commentators argue that unless data subjects are incentivized with ownership rights, they are more likely to be passive in exercising their rights.

This notion echoes some valid user frustration about the power imbalance in today’s digital marketplace: users are left in the dark about the data they share, while companies profit massively from extracting marketable insights from that data.

Giving users a fair shake would therefore mean rights of ownership, which would also imply that they are entitled to sell their data. But while this thinking looks good on paper, it may not translate seamlessly in practice.

If such rights are granted, what universally accepted framework would underpin data transactions? What determines the pricing? If we were to adopt a flexible pricing model, one may imagine data subjects being forced into data negotiations before each web visit or service use. The granularity involved may well lead to ‘negotiation fatigue’, a frustration users already know all too well in the form of ‘consent fatigue’.

Moreover, data is not exactly a commodity. Data transactions are an entirely different value exchange. For one, data is non-rivalrous: multiple companies can collect the same set of data from a data subject without diminishing the supply of that data, as it still resides with the data subject.

Reducing data to a commodity to be exchanged or licensed for money will do users no favors. In the best case, users may sell their data for a pittance. At worst, free data flow as we currently know it may no longer exist, which may hinder the innovation of cutting-edge technologies some currently enjoy and couldn’t otherwise afford.

Ultimately, a better policy approach would be to strengthen the current privacy framework to ensure a balance of individual privacy interests with commercial use.

Navigating the privacy-utility tradeoff

Provided a database owner obtains data lawfully and observes privacy rights, a data subject may be construed as an implied licensor of data. Authorization to collect data may imply a license, which the data subject may revoke by exercising a data subject participation right, for instance, their right to be forgotten.

The question this then raises is: Can it be said that companies truly own rights to a database before the privacy right is exercised? Ownership in this context depends on legal and technical interpretations that only the courts can weigh in on.

While the answer remains unclear, the rise of the data economy will certainly continue to spark debates about data rights, touching on both intellectual property rights and privacy rights.

The current landscape for rights represents more of an unsettled road than a clearly defined path. Societal advancement will only spell more creation and volume of data, making data-protection laws more relevant to safeguard the interests of data subjects.

In any event, policymakers must take care not to suppress database rights by overestimating privacy concerns in data sharing. A lot of innovation will depend on how well stakeholders can perform this delicate balancing act. Failure will inevitably disincentivize data sharing, leading to a data shortage that may well prevent several groundbreaking technological innovations from seeing the light of day.

Endnotes

The Trade-Related Aspects of Intellectual Property Rights (TRIPS) is an international agreement within the World Trade Organization (WTO) that sets minimum standards for the protection and enforcement of intellectual property rights across various categories like patents, trademarks, copyrights, and trade secrets. It aims to harmonize intellectual property laws globally and facilitate international trade in knowledge-based goods and services.
Article 10(2) of the TRIPS Agreement provides that “compilations of data or other material, whether in machine-readable or other form, which by reason of the selection or arrangement of their contents constitute intellectual creations, shall be protected as such.”
https://www.wipo.int/wipolex/en/text/283698
Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991)
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A31996L0009#ntr1-L_1996077EN.01002001-E0001
Google Spain SL, Google Inc. v Agencia Española de Protección de Datos, Mario Costeja González, Case C-131/12 (2014)
https://www.theguardian.com/technology/2017/sep/26/tinder-personal-data-dating-app-messages-hacked-sold

Post by

Aramide O.

An IP & Tech lawyer, Aramide’s interest spans Intellectual Property, Data Privacy, Media Law, Fintech Law, and the regulation of emerging technologies. His understanding of the pivotal role IP plays in digitalization projects drives his commitment to advance awareness of IP rights across local and global contexts.
