Keywords and Meaning

TechCrunch asks if Twitter search gets us closer to being able to mine the world’s collective thoughts. We may be getting there as millions text their latest thoughts into their cellphones. With a simple text message, the hive mind has the potential for 4 billion nodes out in the real world (for comparison, the human brain has 100 billion neurons).

News junkies of the world turn to Twitter as the latest source of raw, unfiltered information. Peering over the shoulder of various members of the House and Senate who twitter offers a unique view into our government. What you see is a more intimate, human view of the people who make the news. Yet how do you harness that noise and turn its output into information?

Twitter follows a long line of services that break through editorial filters and get at the source of a story so you can make your own judgments. Blogs occupied this space just a few years ago, and real-time indexes such as Technorati rose to prominence as a way to get a jump on the news.

Sidenote: Alacra, acknowledging that important news about companies breaks on the web, is launching Pulse, which applies its analytics engine to extract company names from its hand-picked collection of 2,000 RSS feeds.

The need for speed is nothing new. Former Wall Street Journal newsman Craig Forman draws an arc that extends from the pigeons Baron Reuter used to deliver news of Napoleon’s defeat at Waterloo to the real-time newswires used in the financial world today. If there’s a way for someone to profit from knowing something before anyone else, there are always going to be people looking for a way to get at a scoop and others looking for a way to deliver.

We want to look to Twitter for the scoops, but we are doomed to learn the same lessons as we have in the past about authenticity. What we gain in speed and convenience, we lose in validation and measured fact-checking. Google’s PageRank, while valuable in sorting out the reputation and tossing the hucksters, is no good when applied to real-time news which is too fresh to build up a linkmap.

While working for Dow Jones in Tokyo, I worked with bankers and reporters who used digital newswires to get the latest news from around the world. As a systems engineer setting up their workstations, I would often be asked to set up their news filters to narrow the feeds down to something reasonable (the typical newswire delivers hundreds of stories an hour, and most people subscribe to several newswires). In the late ’90s the tools were crude, and after users got frustrated throwing in a few keywords, I would get called in to refine things using additional tools such as company ticker symbols or a few undocumented codes from a taxonomy of subjects that varied from newswire to newswire.

Today the problem of information overload has spread to the greater population trying to derive value from the rushing torrent of updates coming out of Twitter and Facebook. How do I manage all this stuff and figure out what’s important? We use the tools we have, but if you think about it, Google Trends and Twitter search are just keyword searches with very crude resolution. We have a long way to go before such tools will let us tap into the collective mind.

Perhaps it’s time for a crude taxonomy for social networks to help sort out the types of messages flowing back and forth? Imagine if all your tweets, Facebook messages, and FriendFeed streams came pre-tagged with the following tags or categories:

  • look at me, I’m doing something cool
  • check this out, it’s funny
  • books, movies, music, food, or sports
  • this is touching and will change your life
  • gadgets and meta: technology posts about using technology
  • weather and the natural world
  • babies and kittens
  • my obscure hobby
  • breaking news, OMG!
  • make money now!

What other categories would you add? Librarians of the world, what keywords would you put into your search filters to help grep out what goes where? Categorization is the first step towards ranking and with ranking you get useful filters.
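
As a rough sketch of what such a keyword filter might look like, here are a few lines of Python that tag a message with categories from the list above based on which keywords it contains. Everything here is an assumption made for illustration: the keyword lists are invented, and the names `CATEGORY_KEYWORDS`, `tokenize`, and `categorize` are hypothetical rather than part of any Twitter or Facebook API. A real filter would need far richer vocabularies than this.

```python
import re
from collections import Counter

# Hypothetical keyword lists, invented purely for illustration.
CATEGORY_KEYWORDS = {
    "breaking news": {"breaking", "omg", "earthquake"},
    "make money now": {"earn", "cash", "profit"},
    "babies and kittens": {"baby", "kitten", "kittens"},
    "weather and the natural world": {"rain", "snow", "sunset", "storm"},
}


def tokenize(text):
    """Lowercase the message and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def categorize(text):
    """Return matching categories, the one with the most keyword hits first."""
    words = tokenize(text)
    hits = Counter()
    for category, keywords in CATEGORY_KEYWORDS.items():
        count = sum(1 for word in words if word in keywords)
        if count:
            hits[category] = count
    # Fall back to a catch-all tag when no keyword matches.
    return [category for category, _ in hits.most_common()] or ["uncategorized"]


print(categorize("OMG breaking: snow in Tokyo"))
```

Counting hits per category is already a very crude ranking: a message that trips two "breaking news" keywords sorts ahead of one that merely mentions the weather. It also shows the limits of plain keyword matching, since anything outside the predefined lists falls straight into "uncategorized".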






5 responses to “Keywords and Meaning”

  1. Todd

    Keyword extraction from Twitter could be cool, but may kill off serendipitous discovery, my favorite aspect of Twitter. If keywords or meta-categories are predetermined, will truly unique hawtness, the unprecedented new things (a Twitter specialty), just get deleted? That would be FAIL. I wonder if more of a "people with attributes" approach is really what’s needed. For example, I do want to know what’s going on with the latest developments for the Symbian operating system, particularly activity streams and address book stuff. Rather than rely on keyword extraction, I could just assign an attribute to your tweets… twitteruser:iankennedy=novi …I can be fairly assured news filtered by real humans, THEN assigned an attribute of my choosing will bring me some good results. A tag cloud of all tweets containing “symbian, activity stream, address book” would be noisy (polluted with people asking each other for tech support?) and difficult to pull meaning from while drinking beer at my favorite bar. Speaking of which, I urge you to reconsider your absence from South by Southwest this year. Please come, if just for Sunday evening and Monday, and bring this post’s topic to us all in person. Noise reduction from social network activity streams will be a hotly contested subject this year and your opinion would be of immense value.

  2. jhstrauss

    The TechCrunch post you cite was inspired by John Borthwick’s very interesting essay on how Google’s approach to content filtering breaks in the realm of what he calls the “Now Web.” Like you say above: “Google’s PageRank, while valuable in sorting out the reputation and tossing the hucksters, is no good when applied to real-time news which is too fresh to build up a linkmap.” In the (relatively) static web, the network nodes are pages and the endorsement actions are the links between them which are effectively permanent as well as public, and thus crawlable. In the Now Web, the network nodes are people and the endorsements are ephemeral share actions, the majority of which are not public or crawlable (i.e. email, IM, Facebook — what I call the “Deep Now Web”). And so, authority also takes on a different form from the aggregate view that PageRank provides to the personal measure of how much influence an individual has with her social network on a particular topic at a given moment. I agree that we need to have a means of systematically capturing the newly important metadata of share actions and that it needs to be done at the point of sharing (see Jeff Jonas). But, I believe the more easily adopted (and thus ultimately more useful) taxonomy will be one of contextual metadata (i.e. who/what/when/where/why/how) rather than the more personal folksonomy/tagging approach you suggest.

  3. Davide DIncau

    not sure whether classifying tweets by type would help much.

  4. Do Social Gestures a Business Model Make?

    […] went wrong with the Intense Debate comments on last night’s post on Keywords and Meaning. It’s unfortunate because there were some really thoughtful responses to the post which […]

  5. data recovery

    I love using site
