Tag: copyright

  • Access as a Service

    Tim O’Reilly popularized the term “Web 2.0” to explain the network effects of the participatory web enabled by dynamic web pages tied to personalization. He is excellent at summarizing large technical trends in a way that not only makes them relatable but also provides a useful framework when I need to explain these concepts to others.

    So it was with great anticipation that I saw that O’Reilly had posted his thoughts on the intersection of copyright and AI.

    The Risk

    If the long-term health of AI requires the ongoing production of carefully written and edited content—as the currency of AI knowledge certainly does—only the most short-term of business advantage can be found by drying up the river AI companies drink from. Facts are not copyrightable, but AI model developers standing on the letter of the law will find cold comfort in that if news and other sources of curated content are driven out of business.

    How to Fix “AI’s Original Sin”

    The Opportunity

    While large licensing deals are being cut by publishers that have the leverage and lawyers to negotiate massive, one-time deals, these are ultimately short-lived and only serve to build up the large AI providers that can afford to subsidize premium materials for their users. These deals just make the rich even richer.

    The longer term, sustainable opportunity he proposes is in allowing the internet-of-many to share in the revenues enabled by the output from these large AI systems.

    But what is missing is a more generalized infrastructure for detecting content ownership and providing compensation in a general purpose way. This is one of the great business opportunities of the next few years, awaiting the kind of breakthrough that pay-per-click search advertising brought to the World Wide Web.

    How to Fix “AI’s Original Sin”

    The Challenge

    Build a shared provenance and attribution service that keeps track of all documents available to AI systems and the permissions and royalty payment requirements around those documents.

    O’Reilly alludes to the UNIX/Linux filesystem architecture of files, with permissions set at the global, group, and user levels, as a potential model for controlling what publishers allow AI vendors seeking material for their training sets.

    If we expand this analogy to internet scale, could we apply the architecture of hosts tables and the modern Domain Name System to provide a dynamic infrastructure: a public “lookup” service any AI could use to locate the origin of any attributable fact, quote, or yet-to-be-determined “knowledge unit,” along with the license fee, should it wish to leverage that data?
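    To make the DNS analogy concrete, here is a minimal sketch of what such a lookup service might look like. Everything here is hypothetical: the registry, the function names, and the fee figures are invented for illustration, and a real system would be a distributed service rather than an in-memory dictionary.

    ```python
    # Hypothetical sketch of a provenance "lookup" service, analogous to DNS:
    # publishers register attributable content, and an AI system resolves a
    # quote or fact to its origin and license fee before using it.
    import hashlib

    # Stand-in for a distributed registry (think DNS zone files at scale).
    REGISTRY = {}

    def register(text, origin, fee_per_use):
        """Publisher registers a 'knowledge unit', keyed by a content hash."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        REGISTRY[key] = {"origin": origin, "fee_per_use": fee_per_use}
        return key

    def resolve(text):
        """AI system looks up the origin and fee; None means no registered owner."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return REGISTRY.get(key)

    quote = "Facts are not copyrightable."
    register(quote, origin="example-publisher.com", fee_per_use=0.002)
    print(resolve(quote))
    ```

    The hard problems, of course, live outside this sketch: deciding what counts as a “knowledge unit,” matching paraphrases rather than exact strings, and operating the registry at internet scale.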

    In UNIX, the chmod command is used to change permissions. Could setting copyright permissions via a specialized version of “chmod” be the key to a new way to control access and compensate publishers at scale?
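    The chmod analogy can also be sketched in a few lines. This is purely illustrative: the rights (train/quote/derive) and levels (vendor/group/everyone) are invented stand-ins for the real UNIX rwx bits at owner/group/other, and no such copyright scheme exists today.

    ```python
    # Hypothetical "chmod for copyright": three rights, encoded like UNIX
    # permission bits (r=4, w=2, x=1), set at three levels that mirror
    # owner/group/other in an octal mode such as 0o760.
    TRAIN, QUOTE, DERIVE = 4, 2, 1

    def allows(mode, level, right):
        """Check one right at one level of an octal-style mode."""
        shift = {"vendor": 6, "group": 3, "everyone": 0}[level]
        return bool((mode >> shift) & right)

    # 0o760: a named AI vendor may train, quote, and derive (7); a licensed
    # group may train and quote (6); the general public gets nothing (0).
    mode = 0o760
    print(allows(mode, "vendor", TRAIN))    # True
    print(allows(mode, "group", QUOTE))     # True
    print(allows(mode, "everyone", TRAIN))  # False
    ```

    Just as `chmod 760 file` is terse but expressive, a compact permission vocabulary like this could let publishers declare machine-readable terms once, rather than negotiating each deal by hand.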

    Food for thought.

  • The Sharing Economy

    Joi Ito gave a brief preamble to a gathering at Digital Garage in San Francisco today, Unlocking the Power of Japanese Content in Worldwide Markets.

    He spoke about the history of remix culture and the differences between Japanese and Western commercial attitudes towards fan fiction and derivative works. Historically, the act of reproducing someone’s work meant manifesting that copy in a physical artifact, which took an investment. If you made a mixtape, you had to purchase the cassette and source the original music. If you shared a film, you copied it onto a VHS tape along with the scary FBI warning.

    Scary FBI warning found on old VHS tapes

    The act of copying something in the pre-digital days was a “trigger event” that could be codified into law and enforced.

    When you had to pass around physical media, it cost money. Because of this cost, you needed to create a business to support duplication and distribution. When works became digital, the distribution costs went to zero and the need for a business to manage duplication and distribution went away.

    In this new world, the enforcement of the trigger event became ludicrous. Joi reminded us that every time you browse a web site, your browser is making a copy of everything on that page. The act of copying something is fundamental to the infrastructure of the internet. Ownership became hard to enforce and, when it was enforced, threatened the very relevance of the institutions that tried.
    The software that runs a majority of the internet was developed through commons-based peer production, where no one institution owns the software.

    Linux is not on anyone’s balance sheet. It is not part of anyone’s GDP.

    Joi’s work with Creative Commons was an attempt to resolve this incongruence by simplifying the licensing contracts and baking them into the structure of the internet. What was interesting to him, and the topic of the afternoon, was the difference in corporate attitudes towards ownership of content between Japan and the United States. While Japanese manga culture welcomed derivative fan fiction works produced by its otaku followers, traditional US comic houses heavy-handedly squelched any adoration, killing off their core audience.

    Walt Disney is famous for enforcing its copyrights. I knew someone in Hong Kong who spent all their time touring the markets in Hong Kong and China, serving notices to violators who sold unlicensed reproductions of Disney characters. Marvel and DC Comics are no different. Copyright law is written in such a way that lack of enforcement can lead to forfeiture of rights, and a publicly listed company has a fiduciary responsibility to its shareholders to enforce its copyrights.

    But the times have changed. Otaku and fan fiction culture is winning. Publishers that allow their fans to remix their brands enjoy a halo effect that helps market them on social media channels. A stiffly isolated character is less likely to go viral than one that has been remixed to blend into the context of its surroundings (more on that tomorrow).

    Japanese law is written very strictly to enforce copyright, but the honne of Japanese society looks the other way when a publishing house slips rough cuts to its fans to allow them to extend the storyline. The rest of the world is catching on and understanding the commercial benefits of a rabid fanbase.

    This week’s news of Taylor Swift’s removal of her entire back catalog from Spotify is significant in that she is probably one of the last acts that will be able to do this and drive record commercial sales based on artificial scarcity. In the digital age it is not the artifact of the recording which drives revenue, it is the performance, either in live concert or in the contextual packaging of that recording with other works.

    The mp3 is just meta-data to the (live music) event.

    The Japanese have known the value of the otaku community for a long time. Pokémon started as a simple video game but grew into a media franchise through a community it embraced. Western creators such as J.J. Abrams with the Lost TV series have benefited from large fan bases that carry their narrative far beyond the original creation. Wikia (whose CEO, Craig Palmer, was also a speaker at the conference) is positioning itself as a platform for these communities.

    While original media from Japan may be limited to a fringe collection of weird game shows and cosplay conventions, appreciated only by those outside Japan who can break through the difficult language and cultural barriers, the concept of remix communities as an important part of a brand’s success is taking hold in the West and driving commercial success for those who embrace it.

  • Getty Images Opens Up

    Getty Images added embed icons to 35 million photos in their collection. Not all images are available for embedding (look for the icon). Images are for non-commercial use only, and you need to use their embed code, which adds the frames you see below.


    Unable to close the barn door, Getty watched its material find its way online into Google Image search, which crawled sites that had properly licensed the images. The explosion of social media accelerated secondary use via “right-click/save,” so this was largely Getty reading the writing on the wall.

    Getty Images is smart to do this. By providing a superior image search to Google and a simple way for people to use their images, they regain control of their assets and wrap some marketing around their use, taking advantage of free distribution that was already happening. All the embeds point back to gettyimages.com, so it’s great for SEO and exposure for the vast selection they have available. Getty Images has also said it will collect data on where the photos are used to improve its service, which adds an important crowdsourcing signal to its ranking algorithms. Finally, buried in the Terms of Service is “the right to place advertisements in the Embedded Viewer or otherwise monetize its use without any compensation to you.” The Nieman Journalism Lab ponders where Getty is going with this:

    Aha! The data collected could have internal use (measuring what kinds of images are popular enough to invest in more stock photos, for instance). But they could also help with those ads. Imagine a day, five years from now, with Getty photo embeds all over the web, when they flip the switch — ads everywhere. Maybe there’s a photo equivalent of a preroll video ad and you now have to click to view the underlying image. Or a small banner on the bottom 90px of the photo.

    And imagine your website has used a lot of Getty embeds over the years — enough that Getty can actually sell ads specifically targeting your website, using all that data it’s gathered. Or imagine there are enough Getty embeds that it could sell ads only on photos of Barack Obama, or only photos about Cajun music, or only photos about restaurants in Kansas City. You can start to see the potential there. Think of how many YouTube videos were embedded on other websites before Google ever started putting ads on them.

    Embedded widgets used to be all the rage but they fell out of fashion as social networks became the place to share social objects. Getty is late to the game unless everybody gets sick of Facebook and fires up their own WordPress site. Notice how the sharing icons for the Getty Images are only for Twitter and Tumblr, the most open of all social networks.

    Finally, what about all the folks who check the box on Flickr allowing their photos to be licensed by Getty Images, like Phoenix Wang, who took the photo above? Their works will now be used freely to help market the Getty service. On the plus side, clicking through on the image brings up options to license hi-res versions for a fee, so it’s not a total loss for the Flickr crowd. I wonder if the inclusion of the Tumblr share icon was a condition of including the Flickr photos in this deal?

  • BBC sources photos from Flickr

    Steve Rubel over at Micro Persuasion notices that a recent article about podcasting on the BBC website included photos taken by a Flickr user. It is not clear whether the BBC got permission from the user, but the photographer clearly was not credited as a source.

    This is the first case I can think of where an established media site included material without attribution. It also raises an interesting point about using RSS feeds and search engines as an alternative to the traditional newswires (AP, Reuters, PR Newswire) as sources of republished material.

    The article did not credit the photographer, camoby, so it’s unclear whether the BBC purchased these images, whether he works on staff, or whether he simply let them use the photos gratis. His web site, however, does feature a BBC ticker. What’s known is that these images were not published under a Creative Commons license.

    As a friend of mine noted when the news of Flickr’s purchase by Yahoo was announced, “Who owns the photos on Flickr, and is Yahoo or anyone else going to profit from the sale of images of my family without my knowledge?” Ads around Flickr images on either Flickr or Yahoo are one thing, but distribution to other sites is another thing entirely. For that matter, did Adam Curry release these images as well?