Rich Jaroslovsky on the Future of News

There’s an ongoing series of video interviews with journalists on the site. Two recent interviews were with Rich Jaroslovsky, my boss at SmartNews. Rich and I crossed paths years ago. He not only has a good instinct for what works for media online but also a history in both the print and online journalistic worlds and a deep memory of how things were put together and came to be the way they are today.

It is a huge vote of confidence that he’s working for SmartNews and, as you can see from the clips below, he’s here for all the right reasons. Some key quotes to call out:

Excessive personalization is a rabbit hole. It at some point becomes an active negative, because what ends up happening is that you never discover anything new, you never discover anything that you didn’t know ahead of time you would be interested in, and instead your worldview gets narrower and narrower.

. . .

When we launched, one of my conclusions was, serendipity is very hard to do in a digital environment. One of the great charms of SmartNews is that it has reintroduced that concept of serendipity, of finding things that you didn’t know you’d be interested in, and they turn out to be very interesting.

. . .

I’ve had many epiphanies over the years about digital journalism and how it’s different than print journalism, and one of them is that there is a craving in the audience for authenticity, for hearing things as close to the original source as possible. There are people who want to be able to access content that is from international sources, even when they are reading about stories that are being heavily covered by US media because it provides a different viewpoint.

. . .

In some ways news has been disintermediated the same way that music was. When I was in my record buying heyday and CD buying heyday, if there was a song I really liked, I had to buy the record. I had to buy the CD. And the fundamental unit was that CD, that package. I had to buy the whole package to get that one song. Now if there’s a song I like, I can buy that one song. That’s a very different model, as the music industry has learned somewhat to its despair but is adapting to. In news the same thing has happened.

The brand is no longer a destination, a place that people go to to get news. The brand is a mark of quality on that story. This is a USA Today story, I know what USA Today standards are, therefore the fact that it says USA Today, which is one of our valued partners, on top of that story—that’s a brand of quality. I know what I’m getting here. Or an NBC story, or a Huffington Post story, or a Fox News story. So it’s a very different environment, and the brand is still extremely important, but the meaning has changed quite fundamentally.


My greatest hope is the flip side of that coin: that as journalism evolves, as new forms of journalism evolve, as new delivery mechanisms evolve, that the end product is a more informed person and a more informed populace. Because I think that an informed populace is the critical element to a successful, thriving democracy. So my great hope is that as journalism works through this period of turmoil and uncertainty, that we come out the other end with models that keep citizens informed, where people can always get the information they need to make informed decisions.

You can see the entire text of the interview on the site. I’ve also embedded both video clips below.

Part One

Part Two

Popping filter bubbles at SmartNews

It’s now just over a month since I joined SmartNews and I am digging into what’s under the hood and the mad science that drives the deceptively simple interface of the SmartNews product.


On the surface, SmartNews is a news aggregator. Our server pulls in URLs from a variety of feeds and custom crawls, but the magic happens when we try to make sense of what we index and refine the 10 million+ stories down to the several hundred most important stories of the day. That’s the technical challenge.

The BHAG is to address the increased polarization of society. The filter bubble that results from getting your news from social networks is caused by the echo chamber effect of a news feed optimized to show you more of what you engage with and less of what you do not. Personalization is excellent for increasing relevance in things like search, where you need to narrow results to find what you’re looking for. But personalization is dangerously limiting for a news product, where a narrowly personalized experience has what Filter Bubble author Eli Pariser called “negative implications for civic discourse.”

So how do you crawl 10 million URLs daily and figure out which stories are important enough for everyone to know? Enter Machine Learning.

I’m still a newbie to this but am beginning to appreciate the promise of applying machine learning to the problem above. New to machine learning too? Here’s a compelling example of what you can do, illustrated in a recent presentation by Samiur Rahman, an engineer at Mattermark, which uses machine learning to match news to its company profiles.

Samiur Rahman on Machine Learning

The word relationship map above was the result of a machine learning algorithm being set loose on a corpus of 100,000 documents overnight. By scanning all the sentences in the documents, looking at which words appeared in those sentences, and noting the frequency and proximity of those words, the algorithm was able to learn that Japan : sushi as USA : pizza, and that Einstein : scientist as Picasso : painter.

Those of you paying close attention will notice that some of the relationships are slightly off – France : tapas? Google : Yahoo? Spotting those is the power of the human mind at work; we’re great at pattern matching. Machine learning algorithms are just that, algorithms, and they need continual tuning. Koizumi : Japan? Well, that shows you the limitations of working with a dated corpus of documents.

But take a step back and think about it. In 24 hours, a well-written algorithm can take a blob of text and parse it for meaning and use that to teach itself something about the world in which those documents were created.
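To make the word-relationship idea concrete, here is a minimal sketch under stated assumptions. It is not the method from the presentation: it swaps in raw sentence-level co-occurrence counts for a trained model, and the six-sentence corpus, the function names, and the vector arithmetic `b - a + c` used to answer the analogy are all illustrative choices of mine.

```python
from collections import defaultdict
from itertools import permutations
from math import sqrt

# Toy corpus invented for illustration; each string is one "sentence".
CORPUS = [
    "japan eats sushi",
    "usa eats pizza",
    "japan country",
    "usa country",
    "sushi food",
    "pizza food",
]


def cooccurrence_vectors(sentences):
    """Map each word to counts of the other words that share a sentence with it."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.split()
        for word, other in permutations(words, 2):
            counts[word][other] += 1
    vocab = sorted(counts)
    return {w: [counts[w][v] for v in vocab] for w in vocab}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the word nearest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


vectors = cooccurrence_vectors(CORPUS)
print(analogy(vectors, "japan", "sushi", "usa"))  # prints "pizza"
```

On a corpus this small the answer is easy to check by hand. On 100,000 real documents the same kind of arithmetic is what produces both the good pairs and the odd ones.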

Now jump over to SmartNews and understand that our algorithms are processing 10 million news stories each day and figuring out the most important news of the moment. Not only are we looking for what’s important, we’re also determining which section to feature the story in, how prominently, where to cut the headline, and how best to crop the thumbnail photo.

The algorithm is continually being trained and the questions that it kicks back are just as interesting as the choices it makes.

Discovery, diversity, and relevance push and pull against one another, and all are inputs into the ever-evolving algorithm. Today I learned about “exploration vs. exploitation”. How do we tell our users the most important stories of the day in a way that covers the bases but also teaches them something new?
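For fellow newbies, one textbook way to frame “exploration vs. exploitation” is the epsilon-greedy strategy: most of the time show what has performed best so far (exploit), and a small fraction of the time show something at random (explore). The sketch below is a generic illustration, not a description of how SmartNews works; the class name, the section labels, and the click-rate scoring are all assumptions made up for the example.

```python
import random


class EpsilonGreedy:
    """Pick among options, exploring at random with probability epsilon."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {arm: 0 for arm in arms}   # times each option was shown
        self.clicks = {arm: 0 for arm in arms}  # times each option was clicked

    def rate(self, arm):
        """Observed click rate so far (0.0 for options never shown)."""
        return self.clicks[arm] / self.shows[arm] if self.shows[arm] else 0.0

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))  # explore: random option
        return max(self.shows, key=self.rate)         # exploit: best so far

    def record(self, arm, clicked):
        """Feed the outcome of one impression back into the estimates."""
        self.shows[arm] += 1
        self.clicks[arm] += int(clicked)


# Hypothetical usage: three sections, one of which has earned a click.
picker = EpsilonGreedy(["politics", "science", "sports"], epsilon=0.2, seed=1)
picker.record("science", clicked=True)
picker.record("politics", clicked=False)
print(picker.choose())
```

With `epsilon=0` the picker only ever repeats what already worked, which is the filter bubble in miniature. Raising it trades some short-term relevance for a chance at discovery.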

This is a developing story, stay tuned!

circa 2012

“I believe in the human element of news,” says Matt Galligan, the co-founder of circa, the hot news curation app released for the iPhone a few weeks ago. Matt was speaking about circa with Founding Editor Dave Cohn, at a Hacks/Hackers meetup in San Francisco, where circa is based.

The service is quite extraordinary in how hand-crafted it feels. Optimized specifically for the mobile device, the app aggregates news into a series of snippets and photos organized by topic. Aggregate is really the wrong term because each topic is manually curated by an editor who scans their online news sources and pulls apart source material to separate fact from “fluff.”

It is the presentation that makes this app shine. Everything from the carefully cropped and manipulated images to the clever imitation ruffled edges that emulate newsprint gives the app an artful feel.

Users of the app are presented with a series of topics to read, all editorially chosen. You can dive into a topic and scan through what Matt describes as a series of cards (flash cards were an important design inspiration) which highlight key facts, quotes, images, or other information that adds to the topic. Each fact that is brought in is referenced so that the reader can drill down to the originating piece should they want to read further.

The entire experience is designed for mobile. The snippets of a story are presented like snacks that you can nibble throughout the day, while waiting for the bus, between meetings, or over coffee. It is your news, broken down.

Additionally, you can Follow a topic much as you would subscribe to an RSS feed or follow a hashtag on Twitter. Big news stories such as Hurricane Sandy have “arcs” that develop over time, and a key feature of circa is that you can pick up and follow any thread you want to watch closely and, if you opt in, get push notifications whenever a new fact is added to any thread you are following. Following developing stories is a key use case around which circa was designed. In the mainstream media, follow-on stories often repeat prior material for the uninitiated. Circa gathers only the new material and tacks it on to its thread, dropping readers where they left off.
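One way to picture the “pick up where you left off” behavior is a thread that only ever appends facts and remembers how far each reader has read. This is purely my own sketch of the idea, not circa’s implementation; `StoryThread`, its fields, and the sample facts and source names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoryThread:
    """An append-only list of sourced facts plus a read position per reader."""

    topic: str
    facts: list = field(default_factory=list)     # (text, source) tuples
    followers: set = field(default_factory=set)   # readers who opted in
    seen: dict = field(default_factory=dict)      # reader -> facts already read

    def follow(self, reader):
        self.followers.add(reader)

    def add_fact(self, text, source):
        """Append one new fact and return the followers to notify."""
        self.facts.append((text, source))
        return sorted(self.followers)

    def catch_up(self, reader):
        """Return only the facts added since this reader's last visit."""
        start = self.seen.get(reader, 0)
        self.seen[reader] = len(self.facts)
        return self.facts[start:]


# Hypothetical usage with made-up facts and sources.
thread = StoryThread("Hurricane Sandy")
thread.follow("alice")
thread.add_fact("Storm makes landfall", "example wire report")
print(thread.catch_up("alice"))   # the first fact
thread.add_fact("Power outages reported", "example utility statement")
print(thread.catch_up("alice"))   # only the second fact
```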

This Follow signal is particularly interesting. A Google search or a Facebook like is a signal of intent, but both are fleeting. When someone opts in to getting push notifications for a story, that is a very strong signal of interest that will help the circa team understand what stories to follow in the future. This symbiotic loop will help improve the service over time, and the strong early usage and engagement figures bode well for the service and its future.

John Herrman wrote that “Twitter is a fact-processing machine on a grand scale” when he was describing how news about Hurricane Sandy, both real and fake, was quickly shared, distributed, and verified via tweets. Circa takes a different approach. Positioning itself as a meta-source that can be trusted, its editors take care to pull in the most important stories of the day and update only when something is significant. It’s a human-powered approach that limits their scale, but they seem OK with that.

The app was a germ of an idea in December 2011. Back then it was called Circuit and was about the “closed loop” relationship between news that is read driving an algorithm that gave you more of what interested you. They started building the app in March and launched six months later as circa, a name which reflected the evolution of the product into something which was approximately around or about the news but not necessarily in it. The formal name of the company is circa 1605, the year of the first newspaper.

The team is small. Eleven editors cover 22 of the 24 hours in a day (they have an editor based in China). There are three engineers: an iOS developer, their CTO, who built their custom CMS, and a third engineer working on the APIs and data. The investor team is a hotshot lineup of media wizards, so circa has a strong wind behind it, with lots of experience from companies such as Tumblr, Facebook, and Al Jazeera.

But, as Matt shared, launch day was difficult. Just 15 minutes after their app hit the iTunes store, a rogue server took down their service for the next four hours. Those who had the app couldn’t use it, while others couldn’t download it. The team answered over 1,000 tweets and were greeted by over 100 one-star reviews once they recovered. But they did recover, and within four hours of coming back online they shot up to be the #1 news app in the iTunes app store, ahead of the NYT and CNN.

The app is only a few weeks old but going strong. The Follow feature has a magnetic pull that brings you back into the app as stories develop over time. The team has purposefully chosen to go after areas where their approach to curation will have maximum benefit. Big news stories such as the election or the hurricane are perfect for circa, as you can follow along without drinking from a firehose. Next they are going after the tech news vertical, where the typical earnings announcement or product launch story varies little from outlet to outlet.

Monetization options remain to be seen, but Matt did share that their engagement numbers are very strong and that, with the Follow feature, it’ll be easy to target readers with advertising or premium subscription topics. It is iPhone only for now, but both the circa app and the team show promise.