My Lords, I am beginning to feel like the noble Lord, Lord Clement-Jones, but I reassure everyone that this is the last day of Committee.
I shall speak to the amendments in this group in my name and that of the noble Lords, Lord Stevenson—he is very sorry not to be in his place today—and Lord Clement-Jones, and my noble friend Lord Freyberg. I thank the News Media Association for its briefing and support. I also thank, for their wonderful and unlikely support, Sir Paul McCartney, Kate Mosse, Margaret Drabble and Richard Osman, alongside the many creative artists who have spoken, written and tweeted and are among the 37,000 people who signed a petition calling for swift action to protect their livelihoods.
I have already declared my interests for the Committee but I add, to be clear, that my husband is a writer of film, theatre and opera; and that, before I came to your Lordships’ House, I spent 30 years as a movie director. As such, I come from and live alongside a community for whom the unlicensed and illegal use of copyrighted content by generative AI developers is an existential issue. I am therefore proud to move and speak to amendments that would protect one of our most financially significant economic sectors, which contributes £126 billion in gross value added to UK GDP; employs 2.4 million people; and brings so much joy and understanding to the world.
Text and data mining without licence or permission is illegal in the UK, unless it is done specifically for non-commercial research. This means that what we have witnessed over the past few years is intellectual property theft on a vast scale. Like many of the issues we have discussed in Committee, this wrongdoing has happened in plain sight of regulators and successive Governments. I am afraid that yesterday’s announcement of a consultation did not bring the relief the industry needs. As Saturday’s Times said,
“senior figures in the creative sector are scathing about the government plans”,
suggesting that the Secretary of State has drunk Silicon Valley’s “Kool-Aid” and that rights reservation is nonsense. An official at the technical briefing for the consultation said that
“rights reservation is a synonym for opt out”.
Should shopkeepers have to opt out of shoplifters? Should victims of violence have to opt out of attacks? Should those who use the internet for banking have to opt out of fraud? I could go on. I struggle to think of another situation where someone protected by law must proactively wrap it around themselves on an individual basis.
The value of our creative industries is not in question; nor is the devastation that they are experiencing as a result of the unpaid use of their IP. A recent report from the International Confederation of Societies of Authors and Composers, which represents more than 5 million creators worldwide, said that AI developers and providers anticipate the market for generative AI music and audiovisual content increasing from €3 billion to €64 billion by 2028, much of it derived from the unlicensed reproduction of creators’ works, representing a transfer of economic value from creators to AI companies. Let there be no misunderstanding of the scale of the theft: we already know that the entire internet has been downloaded several times without the consent or financial participation of millions of copyright holders.
This transfer of economic value from writers, visual artists and composers across all formats and all genres to AI companies is not theoretical. It is straightforward: if you cannot get properly paid for your work, you cannot pay the rent or build a career. Nor should we be taken in by the “manufactured uncertainty” that Silicon Valley-funded gen AI firms and think tanks have sought to create around UK copyright law. Lobbyists and their mouthpieces, such as TechUK, speak of a lack of clarity—a narrative that may have led to Minister Chris Bryant claiming that the Government’s consultation was a “win-win”. However, I would like the Minister to explain where the uncertainty on who owns these copyrighted works lies. Also, where is the win for the creative industries in the government proposal, which in one fell swoop deprives artists of control and payment for their work—unless they actively wrap the law around them and say “no”—leaving them at the mercy of pirates and scrapers?
Last week, at a meeting in this House attended by a wide range of people, from individual artists to companies representing some of the biggest creative brands in the world, a judge from the copyright court said categorically that copyright lies with the creator. AI does not create alone; it depends on data and material in order to create something else. A technological system that uses that material without permission is theft. The call for a new copyright law is a tactic that delays the application of existing law while the stealing continues. Unlike the physical world, where the pursuit of a stolen masterpiece may eventually result in something of value being returned to its owner, in the digital world, once your IP is stolen, the value is absorbed and fragmented, hidden amid an infinite number of other data points and onward uses. If we continue to delay, much of the value of the creative industries’ rich dataset will already have been absorbed.
The government consultation has been greeted with glee overnight by the CCIA, which lobbies for the biggest tech firms. After congratulating the Government at some length, it says that
“it will be critical to ensure that the transparency requirements are realistic and do not ask AI developers to compromise their work by giving away trade secrets and highly sensitive information that could jeopardise the safety and security of their models”.
In plain English, that means: “We have persuaded the Government to give up creatives’ copyright, and now the campaign begins to protect our own ‘sensitive business information’”. If that is not sufficiently clear to the Committee, it means that they are claiming IP protection for themselves while stealing everyone else’s, and simultaneously pushing back against transparency, because they do not want an effective opt-out.
The government consultation does not even contain an option of retaining the current copyright framework and making it workable with transparency provisions—the provisions of the amendments in front of us. The Government have sold the creative industries down the river. Neither these amendments nor the creative community are anti-tech; on the contrary, they simply secure a path by which creatives participate in the world that they create. They ensure the continuous sustainable production of human-generated content into the future, for today’s artists and those of tomorrow. The amendments do not extend the fundamentals of the Copyright, Designs and Patents Act 1988, but they ensure that the law can be enforced on both AI developers and third parties that scrape on their behalf. They force transparency into the clandestine black box.
Amendment 204 requires the Secretary of State to set out the steps by which copyright law must be observed by web crawlers and others, making it clear that the law applies during the entire lifecycle, from pretraining onwards, regardless of jurisdiction, and that text and data mining must take place only with a licence or express permission.
Amendment 205 requires the Secretary of State to set out the steps by which web crawlers and general-purpose AI models must be transparent. This includes, but is not limited to, providing a name for a crawler, identifying the legal entity responsible for it, listing the purposes for which it is engaged and disclosing the data it has passed on. It creates a transparent supply chain. Crucially, it requires operators of crawlers to disclose the businesses to which they sell the data they have scraped, making it more difficult for AI developers that purchase illegally scraped content to avoid compliance with UK copyright law. It overturns current practice, in which the operators of crawlers can obscure their own identity or ownership, making it difficult and time-consuming, and potentially impossible, to combat illegal scraping.
Amendment 206 requires the Secretary of State to set out by regulation what information web crawlers and general-purpose models must disclose regarding copyrighted works—information such as URL, time and type of data collected and a requirement to inform the copyright holder. This level of granularity, which the tech companies are already pushing against, provides a route by which IP holders can choose or contest the ways in which their work is used, as well as provide a route for payment.
In sum, the amendments create a clear and simple process for identifying which copyright works are scraped, by whom, for what purpose and from which datasets. They provide a process by which existing law can be implemented.
I shall just mention a few more points before I finish. First, there is widespread concern that mashing up huge downloads of the internet, including toxic material, falsehoods and an increasing proportion of artificially generated or synthetic data, will cause models to degenerate or collapse, putting a block on the innovation that the Government and all of us want to see, as well as raising serious safety concerns about the information ecosystem. A dynamic licensing market would provide a continuous flow of identified human-created content from which AI can learn.
Secondly, the concept of a voluntary opt-out regime, or, as the Government prefer, rights reservation, is already dead. In the DPDI Bill, I and others put forward an amendment to make the robots.txt mechanism of the robots exclusion protocol opt-in. In plain English, the voluntary scheme, under which any rights holder can put a note on their digital door saying “Don’t scrape”, would have been reversed: scrapers would have needed permission, rather than rights holders needing to object. Over the last few months, we have seen scrapers ignoring the agreed protocol even when it is activated. I hope the Minister will explain why he thinks that creators should bear the burden while the scrapers reap the benefit, and whether the Government have done an impact assessment on how many rights holders would manage to opt out versus how many would opt in, given the choice.
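For noble Lords unfamiliar with the mechanism, the protocol works in a very simple way: a website places a plain-text file called robots.txt at its root, and well-behaved crawlers are expected to read it before scraping. A hypothetical example, with illustrative site and crawler names, might read:

```
# robots.txt at https://example-news-site.co.uk/robots.txt
# (hypothetical names, for illustration only)

# Ask a named AI-training crawler not to scrape any page
User-agent: ExampleAIBot
Disallow: /

# Permit all other crawlers, such as ordinary search indexing
User-agent: *
Disallow:
```

Nothing in the protocol enforces these directives; compliance is entirely voluntary on the crawler’s part, which is precisely the weakness that the opt-in amendment sought to cure.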