Month: August 2023

  • Video: Simon Willison on AI

    Video: Simon Willison on AI

    Simon Willison has been hacking on technology for years and blogging about it in his excellent blog where he posts on how to recreate his innovations and follow along on his adventures. He was a speaker at this year’s WordPress WordCamp US 2023 conference and gave a talk that I would highly recommend to anyone who wants to spend an hour to catch up on all the latest developments in the world of gen AI and LLM.

    Posting this here today because I expect that I’ll be sending this link to people for weeks to come.

    Large Language Models are the technology behind ChatGPT, Google’s Bard and more. They are weird and somewhat intimidating pieces of technology: we’re still trying to figure out how they work and what they can do, in a field that changes radically on an almost weekly basis.

    In this talk I’ll break down how they work, what they’re useful for, what you can build with them and how to dodge their many pitfalls.

    Making Large Language Models work for you
    Simon Willison on AI at WordCamp 2023

    If you’d rather just read Simon’s talk, he’s created an annotated version of his talk here.

  • Sitemaps for AI

    Sitemaps for AI

    Last week, I was double-booked in conferences. Wednesday & Thursday I was in Philadelphia for the beginning of the Online News Association conference, a gathering of journalists who work with words online. Friday & Saturday, I was in Washington DC for WordCamp, a gathering of people who work with WordPress, the CMS software that powers many of the websites journalists use to publish their news online.

    Hopping between these two worlds, the editorial and technical, gave me a unique perspective of the change sweeping online media. Everyone agrees that AI Chat Bots, specifically generative AI from Large Language Models (LLMs), will have an enormous impact on what we read online. But, depending on who you’re talking to, it’s going to result in either the horror or something wonderful.

    It’s still very early but a long Amtrak train ride home gave me some time to project out where we’re headed and ponder what we might need to make it work in a way that both publishers and AI Chat Bot companies feel comfortable.

    Those that fear AI view it as something that will strip mine websites of their facts and process them into the bland, robotic responses that power AI Chat Bots. This characterization echoes the publishing industry’s initial reactions to Google search. In 2006, French and Belgium newspapers demanded to be removed from Google News only to come back begging for inclusion in 2011 after they experienced a precipitous drop in traffic.

    Are we seeing the same thing play out with AI? Isn’t an AI chat bot just the conversational form of the Google SERP? Microsoft Bing Chat and Google Bard are crawling the web for tidbits to power their conversational engine. Concern about Bing and Bard abstracting facts without sending users back to a publisher’s site exposes a flaw in the publishing business model where a website is compensated by readers looking for answers on a page in adorned with advertisements designed to distract and harvest attention.

    Dare Obasanjo on threads.net

    It’s time to upgrade this business model. Instead of asking people to browse a bunch of search links, AI Chat Bots bring information to the reader, aggregated, summarized in a conversational tone. To a certain extent, this is an evolution of what has been happening for years.

    Google Knowledge Graph

    When Google Knowledge Graph launched in 2012, many publishers felt the Knowledge Panel (as it came to be known) did not provide enough attribution. Sound familiar?

    If the reader no longer goes to the publisher’s site, they will end up spending time with the product providing the answers, not the source. Back then it was Google, today, it’s the AI Chat Bot.

    The AI Chat Bot is the latest step in a journey that was started a long time ago. Bringing answers into a conversational UI is just improving on user experience for those in search of quick answers to their question.

    Bing Chat AI

    This new conversational UI is under rapid development. I’m not even sure a conversational is where we’ll end up. Microsoft is leading the way with Bing Chat AI results sprinkled with attributions that give credit and links back to the source material. From what I can tell, Microsoft is also is paying for this attribution in an early experiment in what I would call “licensing of facts.” Google’s Bard is following Microsoft’s lead and is also starting to add attribution to its SGE results, something that was missing at launch. I’d be curious to know if they are paying publishers for these links.

    Microsoft is embedding Bing AI in not only into their Edge Browser but has also announced extensions for Chrome and Firefox. Bing Chat is also available as an Enterprise service as well as on their mobile app and Skype.

    The pressure is on and Google is responding in kind with their version of generative AI chat, SGE, which is running in Google Labs.

    If generative AI is the next generation of search, I can think of a number of things that are needed to build a relationship between the publisher and AI vendor that is transparent, trustworthy and thus, sustainable. Allow me to riff a bit.

    Honor Robots.txt
    Open AI already announced that they would honor robots.txt and not crawl sites that declare themselves off limits. This is now extended to optimize which sections of your site you want to make available to the AI Chat Bots. The New York Times, CNN, and others are already adopting this method to control what they make available.

    This is a step in the right direction as it builds trust but more granular control over what is made available for the crawl is necessary. Within a restaurant review, maybe the address & phone number will be valued one way while the reviewer’s opinion valued another way.

    Sitemaps for AI
    A sitemap is a file that instructs a web crawler where to look for new pages. A sitemap for AI could be an intentional declaration by a site owner of what specific facts and information you want to make available and what link you want to serve up for the attribution. Addresses can be fielded and formatted one way, quotes another way so that they travel along with the name of the person quoted.

    Ads.txt was developed to make programmatic advertising more transparent. What I’m thinking of is something in between a sitemaps.xml and ads.txt, a lightweight, machine-readable way for publishers to declare what they make available to the Chat Bot crawlers.

    Real time Fact Exchange
    The technology that enables the real time auction for ad impressions on sites in milliseconds is some the most impressive technology developed for the internet in the past couple of decades. The incredible revenue machines of the ad industry have fueled the advancements in this technology.

    It’s time for a similar exchange for the facts which will be the new commodity. When looking for answers via a chat bot that has access to everything, maybe the deciding factor is the quality of the information or the party that is making it available. If every fact is distinct in the aforementioned Sitemap for AI, why not also attach a value to that fact that can inform the AI chat bot which information it can afford to share. If it’s a high value reader then more expensive information from higher quality sites might be presented. We are already headed down this path as both search results and social media links that go to paywalled sites attempt to capture subscriber budget.

    Is it finally time to create a marketplace of micro-transactions brokered by the Chat Bot UI? Instead of subscribing to a bunch of subscription sites, maybe the AI is where “pay” for tidbits of information with either advertising or payment tiers and that revenue is shared by the Chat Bot companies with the companies providing the information?

    In order for the Chat Bot AI ecosystem to grow, the publishers need to be fairly compensated and the Chat Bot vendors need a marketplace for the content they need to provide a quality experiences. Maybe the Real Time Fact Exchange is a far-fetched but I would have never thought the simple banner ad would have evolved into the complex ecosystem we have today.

    Further Reading:
  • Urban Downhill Racing

    Urban Downhill Racing

    How do you get a First Person View of Tomas Slavik as he races the Red Bull Valparaiso Cerro Abajo? Shot in 2022, Red Bull hired the best FPV drone pilots and set up some hovering signal relay drones to put together this amazing video all done in a single take.

    Insane.

  • Vampires in Waterloo

    Vampires in Waterloo

    There’s a scene in the BlackBerry movie (I rented it of Amazon) where Jim Balsillie, played by “It’s Always Sunny in Philadelphia” star Glenn Howerton, goes off on the NHL board over a decision he does not like.

    It’s a full-blown meltdown with a bizarre line thrown in there that sticks in the back of your mind because it’s so strange but not too out of character because it almost fits Howerton’s character to go off like that.

    “I’m from Waterloo, where the vampires hang out!!!”

    Tyler picked up on this line immediately as it was an easter egg referring to a video that went viral 11 years ago.

    The director was born and raised in Toronto, where this interview took place, so it was probably an in-joke as a wink to his fellow Canadians. So now you know.

  • Phish and Guy Forget

    Phish and Guy Forget

    The more I learn about this crazy band the deeper I sink into the mythology. I caught the last two nights of Phish’s seven day “residency” at Madison Square Garden last week. The usual superlatives apply about how the band is playing better than ever. The guy who does the lights even had a profile piece done on him in the New York Times. Phish has now played the Garden 88 times, surpassing Sir Elton John and only second to Billy Joel (who probably sleeps in the tunnels below).

    Phish’s 13-night run at the Garden in 2017, what is known as The Baker’s Dozen, was the stuff of legend and, like all feats remarkable that take place in this city, endeared them to New Yorkers, fans or not.

    Across 13 nights — from July 21 to August 6, 2017 — the Vermont jam band performed 237 distinct songs without repetition, with each night taking on a specific donut-flavored theme. With a run time of around 34-and-a-half hours across 26 sets, the group served up 176 originals, 61 covers, 19 debuts and 23 bust outs (with a gap of 50 shows or more). To fans, Baker’s Dozen marked one of the band’s finest runs since the late-1990s. However, the run earned recognition from the outside world as well, with the quartet earning a banner in the rafters of MSG for their record-breaking residency.

    A Look Back At The Phish Baker’s Dozen Residency

    This most recent run did not disappoint. There was outstanding improvisation from all five members of the band (I’m including Kuroda, the lighting guy) performing at the peak of their talent. But let me talk about Guy Forget (pronounced, in French as “gee-forjet”)

    On the seventh and final night, in the middle of a long jam out of a song called Tweezer, Trey starts singing the line,

    “I’ve never met a man that I could not forget except for Guy Forget”

    I had no idea what I was hearing and I had to tug on this thread later to find out what that was all about. As I suspected, there was a story there and I sure wasn’t disappointed.

    This is a band delights playing with words. They turned the phrase “moment ends” into Moma Dance and swapped the word “mangled” for mangos to nonsensical effect. So who was Guy Forget? A Moroccan-born pro tennis player who played in France during the 80s. Why is he the subject of a Phish song? Why not?

    Of course there was much controversy within the Phish nerd-verse over exactly where “Guy Forget” ended and the Tweezer jam begin. People wanted to make sure to set the historical record straight.

    I just love the fact this band can drop a one-liner, inside joke into the middle of a song and their fanbase knows exactly what is going on. It’s a privilege to have such a dedicated following and this band has certainly earned it.