Category: Featured

  • Preserving Publisher Rights in the Era of AI Chatbots

    Preserving Publisher Rights in the Era of AI Chatbots

    Last September, I gave a talk at the Media Party conference in New York to propose a method to track the origin of text as it travels through a Large Language Model (LLM). Tracking provenance is important because it makes it possible to evaluate reputation and assign credit, so that licensing revenues can be properly allocated to the publishers that provide source material to an LLM.

    What follows are the slides from the talk with some annotations to help explain.

    The rough outline of the proposal is a simple type of HTML markup that allows the publisher or author of a page to mark unique phrases, facts, quotes, or figures for which they would like to retain credit. This markup, if retained along with the indexed text, would allow the LLMs to store and trace the origin of these unique phrases back to the originating URL or domain, tracking the “knowledge” as it travels from the originating website to an LLM and then back out via a genAI chatbot in the form of an “answer.”

    Setting some historical context, I explained how incentives can shape ecosystems. The pageview & advertising economy of online publishing incentivizes publishers to seek out traffic and has given rise to an ecosystem that put Google and their “ten blue links” at the center. A link drives traffic and traffic drives ad impressions, which equal revenue in this ecosystem.

    This well-established ecosystem is being upended by AI chatbots which efficiently extract knowledge from a page and serve it back to the user without generating a pageview. This cuts out an important way for publishers to make money, grow audience, and promote their brand.

    To get a jump on this new ecosystem, large publishers are cutting deals with the AI companies but only the biggest will have the resources to benefit from such arrangements. Smaller publishers will be left out.

    SimpleFeed (where I work) released a simple WordPress plugin that monitors your site to see who is crawling it and allows the site admin to block selected bots. The idea is to show smaller site owners how much indexing is going on and build awareness of how the LLMs are interacting with their content.
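
The monitoring idea can be sketched in a few lines. This is not the plugin's code (the plugin runs inside WordPress); it is a minimal Python illustration that assumes you already have a list of User-Agent strings from your access log. The token list is illustrative and incomplete, and the function names are made up for this example.

```python
from collections import Counter
from typing import Iterable, Optional, Set

# Illustrative, incomplete list of substrings found in AI crawler user agents.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot")


def crawler_name(user_agent: str) -> Optional[str]:
    """Return the matching crawler token, or None for ordinary traffic."""
    ua = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


def count_crawler_hits(user_agents: Iterable[str]) -> Counter:
    """Tally requests per recognized crawler."""
    hits: Counter = Counter()
    for ua in user_agents:
        name = crawler_name(ua)
        if name is not None:
            hits[name] += 1
    return hits


def should_block(user_agent: str, blocked: Set[str]) -> bool:
    """True when the request comes from a crawler the admin chose to block."""
    return crawler_name(user_agent) in blocked
```

Matching on the User-Agent header only catches crawlers that identify themselves honestly, so a tally like this is a floor on bot traffic rather than a full count.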

    According to Cloudflare, bots make up 30% of a site’s traffic and this figure will surely increase.

    Referrals from social networks are falling. This puts pressure on site owners who wish to control who comes to crawl their site. Who do you let in, and who do you block? The act of publishing something is to distribute your information far and wide but, right now, many are defending their sites from aggressive crawlers strip-mining their sites without compensation.

    If we plot this situation to its conclusion, the largest publishers will survive off of whatever licensing terms they can secure while the smaller sites get starved of traffic and miss out on any significant licensing revenue. The result is that we lose the diversity of the web. This leads to the gentrification of anything going into and coming out of the LLMs. This is what is called an ecosystem collapse.

    Tim O’Reilly is my North Star when it comes to understanding technological tectonic shifts. Much of my thinking here is inspired by an O’Reilly piece, How to Fix “AI’s Original Sin,” in which he writes about how incentives can influence ecosystem design and how the pageview incentives of the past result in the block & tackle behavior of publishers towards the LLM platforms today.

    The challenge for the LLMs to break out of this cycle is to create a system for “detecting content ownership and providing compensation” so that LLMs can share the enormous, untapped potential everyone anticipates for the LLM platforms. In O’Reilly’s words,

    This is one of the great business opportunities of the next few years, awaiting the kind of breakthrough that pay-per-click search advertising brought to the World Wide Web.

    In the world of digital art (audio, photos, videos), the people and companies behind Content Credentials are already hard at work creating this system.

    If a picture is worth 1,000 words, there must be value assigned to text. If something has a value, it’s worth tracking. I propose a few elements worth tracking: quotes, statistics, and even unique phrases.

    The next few slides told the story of how, when blogs and blogging were just getting started, there was a huge problem with comment spam. This was largely the result of incentives to get a high reputation site to link back to the commenter’s website to help improve their ranking in Google’s search results.

    Over the course of a few days (the internet was a smaller place back then), engineers at Google and Six Apart (where I worked at the time) agreed to negate the relevance of the link back to the commenter’s site on a comment and dealt a blow to the comment spam problem. A small group of engineers extended the web and, in a very simple way, removed the incentives that rewarded bad behavior.

    I told this story because I see the rel= link qualifier as something that could be used to mark up text and prove provenance. I proposed something called a “knowledge unit” or KU for short.

    The markup syntax works alongside HTML: just bracket anything you want to track in the rel=ku markup and, as long as the consuming LLM keeps that markup intact, that text will be tagged as originating from the URL cited in the markup.
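
To make this concrete, here is a rough sketch of how an indexer might pull knowledge units out of a page. The exact syntax is an assumption on my part for illustration: the phrase is wrapped in an element carrying rel="ku" and an href pointing at the originating URL. The class and function names are hypothetical, and only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class KnowledgeUnitParser(HTMLParser):
    """Collects (text, source_url) pairs from elements marked rel="ku".

    Assumed syntax: <a rel="ku" href="https://example.com/post">phrase</a>
    """

    def __init__(self) -> None:
        super().__init__()
        self.units: List[Tuple[str, str]] = []
        self._source: Optional[str] = None  # href of the open KU element
        self._tag: Optional[str] = None     # tag name of the open KU element
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").split()
        if self._source is None and "ku" in rel_values and attr.get("href"):
            self._source = attr["href"]
            self._tag = tag
            self._chunks = []

    def handle_data(self, data):
        # Only keep text while we are inside a KU element.
        if self._source is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if self._source is not None and tag == self._tag:
            text = " ".join("".join(self._chunks).split())
            self.units.append((text, self._source))
            self._source = None
            self._tag = None


def extract_knowledge_units(html: str) -> List[Tuple[str, str]]:
    parser = KnowledgeUnitParser()
    parser.feed(html)
    parser.close()
    return parser.units
```

Inline formatting inside a unit (an em or strong tag, say) is flattened into the unit's text, so the stored phrase is plain text paired with its claimed origin.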

    This provenance can be used to track the number of times a particular knowledge unit is mentioned in an LLM’s response. This enables a fundamentally different ecosystem from that of pageviews in that there is no need to constantly re-post something you wrote years ago to keep it fresh, relevant, and trending in Google’s search results. Hard work to produce durable knowledge should pay dividends on into the future.
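
As a sketch of the counting step, assume the platform holds a list of (phrase, source URL) pairs and a batch of chatbot responses. The version below uses plain case-insensitive substring matching, which is a simplification I am choosing for illustration; the proposal itself relies on the markup being carried through the model rather than on text matching. All names here are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(text.lower().split())


def count_mentions(
    responses: Iterable[str], units: Iterable[Tuple[str, str]]
) -> Counter:
    """Count how often each source's knowledge units appear in responses.

    units is an iterable of (phrase, source_url) pairs.
    """
    prepared = [(normalize(text), source) for text, source in units]
    counts: Counter = Counter()
    for response in responses:
        haystack = normalize(response)
        for needle, source in prepared:
            if not needle:
                continue  # skip empty phrases
            found = haystack.count(needle)
            if found:
                counts[source] += found
    return counts
```

Exact matching will miss paraphrases, which is one reason retaining the markup end to end matters more than anything a matcher can recover after the fact.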

    More akin to the Wikipedia reputation model, a good, unique fact can continue to be cited over time. Revenues should flow towards durable knowledge units and will hopefully reward those that gather and present unique knowledge rather than the hot takes and re-writes that are rewarded in today’s pageview economy.
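
How revenue would actually be divided is an open question. Purely as an illustration, the sketch below assumes the simplest possible rule: a fixed licensing pool split in proportion to citation counts. The pro-rata rule, the use of whole cents, and the function name are all my assumptions, not part of the talk.

```python
from typing import Dict


def allocate_revenue(pool_cents: int, citations: Dict[str, int]) -> Dict[str, int]:
    """Split pool_cents across sources in proportion to their citation counts."""
    total = sum(citations.values())
    if total == 0:
        return {source: 0 for source in citations}

    # Integer division rounds each share down to a whole cent.
    shares = {s: pool_cents * n // total for s, n in citations.items()}

    # Hand out the cents lost to rounding, most-cited sources first
    # (ties broken alphabetically so the result is deterministic).
    leftover = pool_cents - sum(shares.values())
    ranked = sorted(citations, key=lambda s: (-citations[s], s))
    for source in ranked[:leftover]:
        shares[source] += 1
    return shares
```

Because shares follow citations rather than publication date, a fact published years ago keeps earning for as long as it keeps being cited.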

    Taken a step further, we will then return to a web before ad targeting and enragement metrics to a world where we reward those that teach us something new.

    This new internet no longer drives you to “acquire” a “user” to package up and sell to an advertiser. Publishers no longer need to lock their stories behind a paywall to prevent non-monetized access. In this new ecosystem, the incentive is to share knowledge, getting paid directly for the broad distribution and citation of your work.

    This is just the germ of an idea that may well be totally naive. While I do like the bottom-up simplicity of the markup approach, it requires everyone to adopt it and trust each other to collectively make it work.

    What is to keep bad actors from hijacking Knowledge Units and claiming something as their own? Page index timestamps will need to be the arbiter of provenance, I suppose, but how do you guarantee delivery of your post over others?

    Also, why would LLMs adopt such a system that would fundamentally make their indexes more complex and expensive? My hope is that the LLMs eventually see that strip-mining the web is unsustainable. Just as in agriculture, an ecosystem that does not replenish its resources, both large and small, is not a diverse, healthy, and long-lasting ecosystem.
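
The timestamp rule itself is easy to state in code. The sketch below assumes the index records, for each claim, the phrase, the claiming URL, and the time the page was first indexed; the earliest claim wins. The record shape and function name are hypothetical, and this does nothing to answer the harder question of making sure the true author is indexed first.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple

Claim = Tuple[str, str, datetime]  # (phrase, source_url, first_indexed)


def resolve_origin(claims: Iterable[Claim]) -> Dict[str, Tuple[str, datetime]]:
    """For each phrase, keep the claim with the earliest first-indexed time."""
    winners: Dict[str, Tuple[str, datetime]] = {}
    for phrase, source, first_indexed in claims:
        current = winners.get(phrase)
        if current is None or first_indexed < current[1]:
            winners[phrase] = (source, first_indexed)
    return winners
```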

    If you’ve made it this far, I’m super-interested in your thoughts and encourage you to get in touch.

  • A very New York story

    A very New York story

    Last week a drama unfolded in public that can only be described as one of those uniquely New York moments. Someone lifted someone else’s magazine from the pile of mail on the ground floor of a walk-up apartment. This is something that I am sure happens all the time in cities around the world yet, due to the concentration here of people unafraid to speak their mind and media professionals willing to pay attention, a tenement-level spat exploded into an event followed by thousands. All thanks to the performative platform that is social media.

    It all started with a note from Kareem in apartment 2L.

    Which escalated after a response was scrawled under the note.

    Flames had been fanned. 326 comments and 5,508 likes (as of this writing) on the post above and already large swaths of the city are now tuning in, looking on. Debates take place in hallways all over New York, private chat groups between friends and family and company Slack threads buzz with side conversations on who was in the right and who was in the wrong. What to do if your neighbor plays their stereo too loudly. We’ve all been there. Was this action justified to get their attention? But is stealing the latest issue of New York Magazine warranted? An audience formed, people took sides. We all tuned in to the Instagram account famous for curating signs found around New York City to get the updates.

    Then we get this:

    A very New York message

    Right off the bat, an “alright buddy” set the tone inviting the perp to have a face-to-face conversation rather than “holding my magazine hostage.” New Yorkers hate the passive aggressive. If you have a beef with someone, just come out and say it. Talk it out. Things take a dark turn though when the demand is coldly set to return the magazine by 7pm “or else the deal is off.”

    The response:

    The light blue marker and cursive handwriting style are disarming but this response is quite literally a throw down, pushing back on the ownership of the magazine and setting the power dynamic squarely back to the person literally holding the final word. “I will return your magazine when I finish reading it.” acknowledges that they do not own the magazine but they are going to hang on to it, regardless.

    “I’m on the edge of my seat” comments @charmpants on Instagram.

    Working through his feelings, Kareem posts something on a neighborhood blog: TAKEN: HOW A MISSING MAGAZINE TURNED ME INTO LIAM NEESON … AND AN INSTAGRAM ANTI-HERO

    Someone in the apartment tears down all the signs and tapes them up on Kareem’s door and scrawls “Enough with the signs you morons!”

    To which Kareem responds apologetically:

    $20 for an issue of the New York Magazine is a generous offer as the newsstand price is $6.99. This post is clearly performative. Kareem is playing to the crowd, trying to get people in the apartment (and greater NY) on his side so that the whole thing can be over and done with. But to those of us following the debate online, we are all curious as to why the note looks so huge as it appears to cover half the door. Others point out it must be posted on a mirror across the hall, otherwise the $20 bill would be the size of a dish towel.

    The backstory indicates Kareem has asked the landlord to review the video tapes to see who might have stolen the magazine.

    More updates from Williamsburg. Now “Manegment (sic)” is posting signs of fines about signs. We recognize Kareem’s writing asking for clarification. You can see he is calculating the potential risks and cost of his very public appeal to his neighbors. More importantly to the Instagram public, this post clarifies the relationship of the door and mirror question raised earlier.

    In the comments, the madding crowd screams, “Release the tapes!”

    Blue Marker is upset that the notes are getting so much attention on the internet and asks Kareem to stop then drops the bomb, “I am sorry but I do not have your magazine.”

    The crowd collectively loses it. Mayhem.

    The city came out in full-throated support of Kareem in his time of loss. The cover story of the New York Magazine was an excerpt from an upcoming biography of New York’s very own Alexandria Ocasio-Cortez. We all want to read that story – we are proud of our own.

    Casa Magazines offers a free issue if Kareem wants to stop by their West Village store. Someone from New York Magazine who lives nearby takes a copy over and drops it off. We are all Kareem, just looking for some good reading material.

    The final installment was posted yesterday. Unless, of course, Netflix picks it up and turns it into a three-part mini-series. Turns out that upon review of the video tape, the landlord discovers that the magazine was not taken by anyone who lived in the building after all but by a previous resident.

    Kareem apologized.

    The city took a collective breath. All is well that ends well. We all learned to drop our suspicious gaze and give our neighbors the benefit of the doubt. Kareem now has two issues of this month’s New York Magazine and offers one up to anyone who might want to read it. New York Magazine’s real estate section even reached out and published an interview with Kareem and did their own Instagram bit.

    “This sort of petty neighborly drama is what keeps New York alive,”

    But note Kareem’s final message and invitation.

    It did make me realize I’ve lived here for 6 years and don’t really know any of you . . . so if anyone would like to have coffee with me, just knock or leave a sign on my door.

  • It can happen again

    It can happen again

    A couple of weeks ago, I took the family to see Then They Came for Me, an exhibit about the incarceration of Japanese-Americans on the West Coast during the Second World War. The exhibit, at San Francisco’s Presidio, has been extended through August and I highly recommend it. The use of the courts to remove civil liberties and justify racism (let’s call it what it was) is an ugly chapter in American history. Lessons learned then are more relevant than ever in today’s political environment of bombastic pronouncements and unnecessary walls.

    Most know about the forced removal of 120,000 Americans from California, Arizona, Oregon, and Washington during World War II but did you also know,

    • Most families were given only a few days to clear out or give away everything they owned. Lifelong businesses were shut down and sold off for pennies on the dollar. Houses were sold off, basically repossessed. You were only allowed a single suitcase and it wasn’t clear where you were going.
    • Until the actual camps were built, families had to make do in the horse stalls at local racetracks. Of course it stunk, was cold, and there was no privacy.
    • “Internment camps” was a nice way of putting it. They were basically concentration camps, surrounded by razor wire and machine gun towers. The shacks were simple tar-paper sheds which provided almost no insulation from the freezing temperatures in the winter and baked in the desert sun during the summer.
    • There were many acts of passive resistance in the face of extreme institutional injustice. This was 20 years before the civil rights movement.
    • Award-winning photographers Ansel Adams and Dorothea Lange were hired by the War Department to document the round-up and show it in a favorable light. Photos that depicted machine gun towers or protests were censored. It didn’t go as planned and we have them to thank for their record of this time.

    We were lucky to have a guide the day we visited. Not just any guide but Donald Tamaki, one of the lawyers who worked on the team that cleared Fred Korematsu’s name in the landmark Korematsu v. United States case.

    Our guide, Don Tamaki

    In the video clip above, Don talks about how his team uncovered evidence of a cover-up. There was no evidence of any shore-to-ship radio messages; the threat of Japanese spies was unfounded, made up. 120,000 people were ripped out of their communities for no reason. Farms, businesses, and homes were sold off and people were told to suspect their neighbors for no reason.

    In the end, the Supreme Court took the military & intelligence at their word and went along with their demand for an exclusion zone and incarceration of all those of Japanese descent within it. Once the courts stop questioning the other branches of government, in this case Congress and the President, the balance that keeps dictators and tyrants in check is lost.

    While the current Chief Justice Roberts has said Korematsu v. United States ‘has no place in law under the Constitution,’ the law used to send Japanese to the camps was never overturned. The Supreme Court has not reversed its original decision so the law that gives the president power to round up people based on race in times of national security is still on the books. As the dissenting justice in the original ruling writes, such a flawed law “lies about like a loaded weapon.”

    A military order, however unconstitutional, is not apt to last longer than the military emergency. Even during that period, a succeeding commander may revoke it all. But once a judicial opinion rationalizes such an order to show that it conforms to the Constitution, or rather rationalizes the Constitution to show that the Constitution sanctions such an order, the Court for all time has validated the principle of racial discrimination in criminal procedure and of transplanting American citizens. The principle then lies about like a loaded weapon, ready for the hand of any authority that can bring forward a plausible claim of an urgent need. Every repetition imbeds that principle more deeply in our law and thinking and expands it to new purposes.

    Korematsu v. United States, Dissent, Justice Jackson

    It can happen again.

    Related: California and Sanctuary Cities

  • Finnish Ingenuity

    Finnish Ingenuity

    We had some house guests over last night who shared some observations about the Finnish people and their incredible spirit and creativity, especially when their backs are up against the wall.

    World War II was a time of extreme struggle for the Finns, who found themselves up against the full wrath of Stalin’s Army. Out-numbered and out-gunned, the Finnish people were left with their wits. Here are a couple of highlights:

    Winter War – In late 1939, the Finns faced a full-scale invasion of their homeland. As the Russian Army advanced on Finland in the winter of 1939-40, they ran into sub-zero temperatures and long nights of darkness. Using this to their advantage, small squads of Finnish troops would infiltrate enemy lines between larger divisions and set up machine gun lines pointing outward, towards each division. After short bursts to the left and right, the guerrilla squads would retreat and leave the two recently alerted divisions to open fire on each other, each thinking it was firing on the enemy.

    The Bombing of Helsinki – In February of 1944, Stalin ordered bombers to flatten the city in order to break the spirit of the Finnish people. In preparation for the bombing which they knew was coming, the civil defense forces laid out a grid of signal fires out on the frozen bay and surrounding islands which roughly matched the layout of Helsinki at night. When the bombers flew towards the city, the civilians doused the lights and the bombers, thinking the lights they saw out on the bay were the city, dropped a majority of their bombs harmlessly into the water, sparing most of the city.

    Don’t mess with the Finns, they’ll mess with you.