BrowseRank – Microsoft’s Answer to PageRank

Microsoft announced today that they’ve discovered a better way to rank web pages. While Google’s PageRank sorts roughly on the number of incoming links that point to a page, a vote of confidence by bloggers and website editors, Microsoft’s BrowseRank looks at browsing behavior to see which links get more clicks.

Sounds good on the surface. More democratic because it looks at the entire browsing population, right?

“The more visits of the page made by the users and the longer time periods spent by the users on the page, the more likely the page is important. We can leverage hundreds of millions of users’ implicit voting on page importance.”

Not so fast. Andy Beal points out the obvious shortcomings:

“More visits?” – sure, spammers will have no idea how to inflate that metric.

“Longer time periods?” – couldn’t that also mean that your web site usability and navigation just sucks?

I would add a third. For this to work, Microsoft needs to know each and every link that you visit. I don’t know about you, but there has to be a pretty good personal benefit for me to let Microsoft peer over my shoulder and take notes on every site I visit. Maybe they’ll just pay people. But as with Live Search cashback, that’s just going to attract the wrong audience and skew the results.
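To make the contrast concrete, here is a toy sketch in Python. The first function is a plain power-iteration PageRank over a tiny link graph. The second is only a stand-in for a behavior-based score (share of total time spent per page); it is not Microsoft’s actual BrowseRank algorithm, and the page names and numbers are made up for illustration.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank. links maps page -> list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outbound links spreads its rank over everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def browse_score(visits):
    """Toy behavior score: each page's share of total time spent.

    visits is a list of (page, seconds_on_page). This is an assumption made
    for illustration, not the published BrowseRank model.
    """
    seconds = defaultdict(float)
    for page, spent in visits:
        seconds[page] += spent
    total = sum(seconds.values())
    return {page: spent / total for page, spent in seconds.items()}


# Hypothetical data: "c" has the most inbound links, "b" gets the most attention.
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
visits = [("a", 30), ("b", 300), ("b", 240), ("c", 30)]

link_winner = max(pagerank(links).items(), key=lambda kv: kv[1])[0]
browse_winner = max(browse_score(visits).items(), key=lambda kv: kv[1])[0]

# Andy Beal's first objection: a bot that fakes long visits wins easily.
spammed = visits + [("spam", 600)] * 10
spam_winner = max(browse_score(spammed).items(), key=lambda kv: kv[1])[0]
```

The two winners differ on the same three pages, and a handful of faked visits is enough to put the hypothetical "spam" page on top of the behavior score.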

Categories
Current Events

Google’s Flash-Eating Spider

This announcement is definitely cool and will open up whole new areas of the web to search. But truthfully I just wanted to post this because it lends itself to a great headline.

From the FAQ posted on the Google Webmaster Blog:

Q: What content can Google better index from these Flash files?
All of the text that users can see as they interact with your Flash file. If your website contains Flash, the textual content in your Flash files can be used when Google generates a snippet for your website. Also, the words that appear in your Flash files can be used to match query terms in Google searches.

In addition to finding and indexing the textual content in Flash files, we’re also discovering URLs that appear in Flash files, and feeding them into our crawling pipeline—just like we do with URLs that appear in non-Flash webpages. For example, if your Flash application contains links to pages inside your website, Google may now be better able to discover and crawl more of your website.

Categories
Office

The Lifestream Filter Will be the Next Great Algorithm War

I’m paraphrasing the title of this post from David Recordon, who threw this line out following a chat I had with him a couple of weeks back. It’s a very insightful observation that predicts opportunities in the real-time world in which lifestream services operate.

It’s now easier than ever to pull together an aggregated feed of content from across the web. Facebook and FriendFeed organize this content around your friends and contacts. MyBlogLog also presents a New in My Neighborhood view which shows a mixed feed of all your contacts’ lifestream content. Yet, once you get more than a handful of friends on these systems, the number of updates (especially if any of them are using Twitter) quickly spins out beyond what you can handle.

Twitter is often used to announce new blog posts, and the new broadcast service from Six Apart, Blog It, only exacerbates the problem by spawning multiple posts from a single Facebook entry. We live in a world where finding out what your friends are doing is not a problem. The difficulty is in filtering through the hundreds of updates that stream by each day to those events that are most relevant without losing the sense of serendipitous discovery that we experience today.

So here we are today. It’s like we’re all discovering search engines all over again. In a matter of weeks we’ve gone from “Wow! I can find everything here!” to, “Crap! Over 600,000 results for the phrase Serendipitous Discovery? How can I find the one reference I’m looking for?”

The huge opportunity ahead is a filter to bubble up the things you need to know without missing anything you want to know.

A couple of posts point to this being a trend

We’re trying a few things out at MyBlogLog that vector results based on how you have tagged yourself on your profile. Right now, a user’s New in My World feed is a straight, chronological feed of items that match your tags. Also, because it’s based on meta-data, we can only present you with items that are tagged, which leaves out plain text updates such as Twitter posts, but we’re just getting started.
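For the curious, the tag-matching idea can be sketched in a few lines of Python. This is not the MyBlogLog code; the `Update` class, its fields, and the sample data are all hypothetical, and the only logic taken from the description above is "keep items that share a tag with the profile, newest first."

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Update:
    # Hypothetical shape of a lifestream item.
    author: str
    text: str
    posted: datetime
    tags: set = field(default_factory=set)


def tag_filtered_feed(updates, profile_tags):
    """Keep updates sharing at least one tag with the profile, newest first."""
    wanted = {t.lower() for t in profile_tags}
    matches = [u for u in updates if wanted & {t.lower() for t in u.tags}]
    return sorted(matches, key=lambda u: u.posted, reverse=True)


updates = [
    Update("ann", "Photos from the search conference", datetime(2008, 7, 1, 9), {"Search", "photos"}),
    Update("bob", "eating a sandwich", datetime(2008, 7, 1, 10)),  # untagged tweet
    Update("cat", "New post on OpenID", datetime(2008, 7, 1, 11), {"openid", "identity"}),
    Update("dan", "Bookmark: ranking algorithms", datetime(2008, 7, 1, 12), {"search"}),
]

feed = tag_filtered_feed(updates, ["search", "OpenID"])
authors = [u.author for u in feed]
```

The untagged update from "bob" never makes it into the feed, which is exactly the plain text limitation mentioned above.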

As David’s quote indicates, this is a huge opportunity and something I look forward to working on. I look forward to a robust debate on different approaches in the coming weeks!


Go on, cheat a little

NY Times Crossword Promotion

Yahoo has joined up with the folks at the New York Times crosswords to promote the new Search Assist feature with a contest. The idea is that you fill the puzzle out successfully and you too can be entered into a drawing for one of five trips to Hawaii. Thing is, this thing is a gimme. Next to each clue is a link to a “Hint” which runs a search in the pane below against Yahoo’s Search Assist which will serve things up for you right there and then. It’s a great way to show off the new Search Assist and may give you a new reason to work on your crosswords with the browser handy.

I found out about this via a new group on Facebook. Join Yahoo! Pilot if you want to find out about the latest stuff going on at Yahoo! I can’t believe I found something not written up by the folks over at Yahoo! Cool thing of the Day, my usual source for tweaks and trivia about Yahoo – must have caught them asleep at the switch!


Yes, but ours go to “11”

Yahoo goes to 11

If you haven’t checked out the new Yahoo Search Assist, by all means do. Someone’s finally got the clustered search and suggestive results thing right. Type something into search.yahoo.com and hesitate just a bit and the pane will come rushing out with suggestions.

On a lighter note, Ryan Grove, one of the engineers who worked on the enhancements, points out that our search results now go to “11”.


Mining the NY Times Archives


Dave Winer looks to the recently released New York Times archives as a rich loam of fertile content upon which many applications can be built. In another life, as a product manager for factiva.com, I came to appreciate the meta-data the Times would attach to their content as something Factiva would leverage for its clients. Factiva provided investment banks and corporate libraries with content feeds from major news outlets and used meta-data on their sources (often adding additional meta-data of its own) so their clients would get precisely the content they were interested in and avoid having to wade through the irrelevant results that blunt keyword searches often produce.

If the global PR officers of Ford or Sharp were looking for breaking news stories, keyword searches on the internet would be nearly useless as they would pull in stories of used Ford cars for sale or someone’s “sharp” looking suit. These clients would pay for the meta-data and Factiva’s taxonomy consultants would offer numerous tips & tricks to hone down their filters to find exactly what was required.

With this in mind, I took a quick look at the source on the New York Times stories and found that they contain much of the meta-data that I remember.

Today’s story on Iranian President Ahmadinejad’s speech at the UN contains the following meta tags:

  • byl=Warren Hoge
  • des=International Relations;Embargoes and Economic Sanctions;Atomic Weapons
  • per=Ahmadinejad, Mahmoud
  • org=United Nations;Security Counci
  • geo=Iran

A business article on the arrival of the Microsoft game Halo 3 has the following:

  • byl=Seth Schiesel
  • des=Computer and Video Games;Computers and the Internet
  • per=Gates, Bill
  • org=Microsoft Corp;Sony Corp;Nintendo Company Limited
  • ticker=Microsoft Corp|MSFT|NASDAQ;Best Buy Company Incorporated|BBY|NYSE;Sony Corp|SNE|NYSE;Nintendo Company Limited|NTDOY|other-OTC;GameStop Corporation|GME|NYSE;Circuit City Stores Inc|CC|NYSE

From this we can see elements of the nytimes.com taxonomy poke through.

  • byl – is the byline of the author of the story
  • des – the description and how this story is classified by the New York Times
  • per – nodes for individuals
  • org – company or organizational nodes
  • ticker – public company stock symbols and their listing exchange
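As a rough sketch of how an application might consume these fields, here is some Python that splits the values apart. The separators are my reading of the samples above (semicolons between values, pipes between name, symbol, and exchange inside ticker), not a documented format, and the `parse_meta` and `mentions_org` helpers are hypothetical names. The ticker value is shortened from the Halo 3 example to keep things readable.

```python
def parse_meta(fields):
    """Split raw meta tag values into lists.

    Assumes ';' separates multiple values and '|' separates the parts of a
    ticker entry, as the sample values appear to do.
    """
    parsed = {}
    for name, raw in fields.items():
        values = [v.strip() for v in raw.split(";") if v.strip()]
        if name == "ticker":
            parsed[name] = [tuple(part.strip() for part in v.split("|")) for v in values]
        else:
            parsed[name] = values
    return parsed


def mentions_org(parsed, org_name):
    """True if the story is classified under exactly this organization node."""
    return org_name in parsed.get("org", [])


# Values copied from the Halo 3 story above (ticker list shortened).
halo3 = parse_meta({
    "byl": "Seth Schiesel",
    "des": "Computer and Video Games;Computers and the Internet",
    "per": "Gates, Bill",
    "org": "Microsoft Corp;Sony Corp;Nintendo Company Limited",
    "ticker": "Microsoft Corp|MSFT|NASDAQ;Best Buy Company Incorporated|BBY|NYSE",
})

symbols = [symbol for _company, symbol, _exchange in halo3["ticker"]]
```

Matching on the org node rather than on a keyword is what keeps a story about a “sharp” looking suit out of Sharp’s news feed.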

I’ve only just started playing around with this but using text from the meta-data fields and your favorite search engine you can already start to sort results in interesting ways.

  1. Articles about Mahmoud Ahmadinejad
  2. Articles about Gates, Bill
  3. News about Nintendo

It’s still early days: it appears that the search engines have not crawled the archives completely, and a quick check shows that older articles lack most of this meta-data. It will be interesting to see what insights skillful use of the meta-data fields will yield over the next few weeks and what applications can be built on top of them.


The Wall Comes Down

Everyone wondered if the New York Times would be able to pull off their Times Select premium news experiment. Despite projections of up to $10 million in annual subscription revenues, as of Wednesday morning most areas of nytimes.com will be free of charge. This is excellent news for bloggers, who will now be able to point to articles on the site and know their readers will be able to follow their references without having to pay a subscription fee.

Back when Times Select launched almost two years ago there was talk of driving subscriptions via an affiliate program. I guess that never really took off, and now Vivian L. Schiller, senior vice president and general manager of nytimes.com, admits that “What wasn’t anticipated was the explosion in how much of our traffic would be generated by Google and Yahoo.”

It’s widely known that more traffic comes into the site via search engine links and blog referrals than via the front door and if you’re not converting successfully via these entry points then you’re better off monetizing the traffic via advertising.

It’ll be interesting to see if this puts pressure on wsj.com to open up as Rupert Murdoch, their new owner, has hinted.

I still think that the optimal combination of free vs. premium is the one that I outlined two years ago when Times Select launched.

Restricting access during the period when these pieces are the most valuable will drive subscriptions to TimesSelect. It makes less sense to keep these pieces under lock and key throughout the time when people are mildly curious to see what all the fuss is about and have the time to sample a frequently referenced article without having to commit to an annual subscription. I would prefer to see the program re-jigged so that TimesSelect members get first dibs on grokking the perspective of the day but after 48 hours the doors are open for any and all up until the 3 month mark when they drop back to a view which restricts non-subscribers to only the first few paragraphs.

Open access to popular pieces for a three month period would help move low cost advertising inventory and allow for the fence-sitters to properly experience the quality of the Times’ news stream should they later decide they want to get access to this stuff prior to the 48 hour embargo for non-subscribers.

Call it Kennedy’s Rolling Window of news & perspective. The cheap seats only let you see what’s directly in front of the window while subscribers get to see not only what’s coming down the pike but also dig back and review what’s gone by.
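The rolling window is simple enough to write down as a rule. Here is a small Python sketch of it; the function name and the three access labels are mine, the three month mark is approximated as 90 days, and what a non-subscriber sees during the first 48 hours ("locked" here) is an assumption since I never spelled that part out.

```python
from datetime import timedelta

EMBARGO = timedelta(hours=48)     # subscribers get first dibs
OPEN_WINDOW = timedelta(days=90)  # "3 month mark", approximated as 90 days


def access_level(article_age, is_subscriber):
    """Return what a reader may see for an article of the given age."""
    if is_subscriber:
        return "full"      # subscribers see what's coming and what's gone by
    if article_age < EMBARGO:
        return "locked"    # assumption: non-subscribers wait out the embargo
    if article_age < OPEN_WINDOW:
        return "full"      # the doors are open for any and all
    return "excerpt"       # only the first few paragraphs


fresh = access_level(timedelta(hours=6), is_subscriber=False)
week_old = access_level(timedelta(days=7), is_subscriber=False)
archived = access_level(timedelta(days=200), is_subscriber=False)
```

A non-subscriber following a blog link a week after publication lands on the full piece, which is the point of the window.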


Climbing back up the rankings

Climbing out of the hatch

Photo by Todd Sampson

One of the most frustrating things about moving your blog to a new domain is watching your various rankings drop off a cliff and the associated loss in all the things that come with it. Despite all the attention to detail (301 redirects, revisions on all your various social networking profiles, re-writing URLs) you basically cease to exist as far as the search engines are concerned and here we are, now a month later and I’m still crawling my way back to relevance.
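For anyone planning the same move, the 301 piece boils down to mapping every old URL to the same path on the new domain. The Python below is only a sketch of that mapping using the two domains from this post; the `redirect_for` helper and the sample path are hypothetical, and in practice the rule would normally live in the web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "cavitate.net"
NEW_HOST = "everwas.com"


def redirect_for(url):
    """Return (301, new_url) for URLs on the old domain, else None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in (OLD_HOST, "www." + OLD_HOST):
        # Keep scheme, path, query, and fragment; swap only the host.
        new_url = urlunsplit((parts.scheme, NEW_HOST, parts.path, parts.query, parts.fragment))
        return 301, new_url
    return None


moved = redirect_for("http://cavitate.net/2007/01/some-post/?p=1")
untouched = redirect_for("http://example.com/")
```

Preserving the path and query string matters because inbound links point at individual posts, not just the front page.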

Reputation and influence are not portable.

SEO is just a passing hobby of mine. Feeling inspired after the last Webmaster World conference in Las Vegas, I experimented a bit on my old domain and tried to see if I could get myself ranked for “social media advertising” and was pleasantly surprised when it only took me a few weeks to reach the #1 spot for the phrase on all four major search engines. I later realized that the term was not as popular as “social media marketing” so I shifted to focus on that term. I soon ranked highly for that term as well. That was back in January.

I later lost interest and didn’t really think of it until I moved everything over to this new blog on this new domain on August 5th. Right before I pulled the plug on the old blog, I took a snapshot of my rankings on various services and have been tracking my comeback and it’s been pretty slow going.

Here’s a summary of the highest point I had reached on cavitate.net and where I am now on everwas.com:

Rankings for “social media advertising”

  • cavitate.net – either 1 or 2 for Google, Yahoo, and MSN. #6 for Ask
  • everwas.com – not even in the first 50 results

Rankings for “social media marketing”

  • cavitate.net – in the top 30 for Google, Yahoo and MSN, #5 for Ask
  • everwas.com – #31 for Google, nowhere on Yahoo, MSN. Ask still has my old domain listed at #17

Technorati Authority. Stowe Boyd had a series of posts where he tracked his rise back up the rankings after he moved his blog which is interesting for comparison except that that was back before Technorati calculated an authority value.

  • cavitate.net – 56
  • everwas.com – 12

Google Pagerank

  • cavitate.net – 4
  • everwas.com – not even rated yet, must still be in the “sandbox”

Yahoo! Site Explorer Inlinks – I didn’t measure the inlinks on cavitate.net but I’ve been watching the inlinks climb up and am now at 866.

Google Webmaster Tools Inlinks – I also never got this on the old site but just saw it jump from just a handful to 4,593 on 9/4.

MyBlogLog

  • cavitate.net – 126 members in its heyday
  • everwas.com – only 5 members have discovered this new site

It’ll be interesting to see if these numbers change much over time. If you feel like giving me a little boost, feel free to link to everwas.com using the phrase “social media marketing” or “social media advertising” and see if that will change things.


Sitemaps.org

Yahoo, Microsoft, and Google now all support the same sitemap protocol. If you are concerned about the way the search engines crawl and index your site, create a sitemap and make it available. More information at sitemaps.org.
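If you have never built one, a sitemap is just an XML file listing your URLs. Here is a minimal Python sketch that writes one using the standard library. The namespace is the one published at sitemaps.org; the `build_sitemap` helper and the example.com URLs are placeholders, and the protocol has more optional fields than the single lastmod shown here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod  # e.g. YYYY-MM-DD
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    ("http://example.com/", "2007-09-04"),
    ("http://example.com/about/", None),
])
```

Save the result as sitemap.xml at the root of your site so the crawlers can find it.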

Also be sure to check out Yahoo! Site Explorer for more tools on how to manage your site including when Yahoo last crawled your site.

 

Blog Search Shootin’ Match

Bit of a blog search shooting match going on between old standby Technorati and new kid on the block, Sphere.

Technorati’s signed on ap.org and will work with Edelman on international expansion. Meanwhile, Sphere has embedded their bookmarklet into Time.com.

Technorati looks at link structure while Sphere casts the net a bit wider, looking at the text on the page to get more contextually matched pages that may not be linked via an explicit URL.

I have both Sphere It! and Technorati This! (sheeesh, it was hard to find that link) bookmarklets side-by-side and find the two actually complement each other. I use "S" when I am looking for conversation around a very broad topic and am looking for the fuzzy cloud of buzz around it and I use "T" when it’s a specific topic (or flash/video site with little or no textual info such as the new nikeplus.com site) and I want to home in on just the people linking-to-that-very-URL.

Two things I like about Sphere are the narrow time windows (last hour!) and the cool flash-based measuremap slider thingy that comes up when you select custom date range. By the way, this widget is available under a Creative Commons license from the kind folks over at Adaptive Path who worked on both Measure Map and Sphere.

sphere slider widget