SmartNews Pro Tip: Save it for Later

SmartNews is focused on today’s news. Because of this, the app is optimized for showing you the most important stories of the moment. The idea is to get you up to speed on what’s going on and then on with your day. If we do our job well there, the thinking goes, you’ll be back.

That said, there are times when you’re glancing at the latest headlines and you run across a meaty profile in Vanity Fair or a lengthy speech transcript on Medium. I’ve seen comments in the App Store where people are looking for a way to save articles for later. There are a couple of options that I’d like to share.

Read it later with Pocket

SmartNews is integrated with Pocket. Create an account at Pocket or log in with your existing one. When you share from the article page on SmartNews (another pro tip: a long press on any headline will go directly to the save menu), you have the option to Save to Pocket. Once you’ve saved it there, you can go back to Pocket on the Web and read the full text of the article later. If you upgrade to Pocket Premium, they will even download, index, and archive the full text of anything you save to Pocket, making later retrieval easier.

Hear it later with Pocket

Pocket recently added Text-to-Speech to their mobile app. I ride my bike to work, so sometimes it’s better to have a long article read to me. This afternoon I listened to the transcript of Jennifer Granick’s excellent keynote at Black Hat 2015, The End of the Internet Dream, which was posted on Medium.

It somehow seemed appropriate to have the same voice that speaks to me as Siri explaining how important it is to keep the internet open and decentralized.

Show more, is that an archive?

Well, kinda. While we try as much as possible to keep things lightweight in the SmartNews app, we recognize that you might sometimes go more than several hours in between SmartNews fixes. We hear you. But if you’re hearing about that great story in the morning and it’s no longer there, we’ve got your back!

Scroll to the bottom of any tab other than Top and you’ll see a “Show more” link that will show you more articles in the channel. We can’t store everything but it’ll at least extend your horizon a few more hours if you want to dig in a little further.

Popping filter bubbles at SmartNews

It’s now just over a month since I joined SmartNews and I am digging into what’s under the hood and the mad science that drives the deceptively simple interface of the SmartNews product.

On the surface, SmartNews is a news aggregator. Our server pulls in URLs from a variety of feeds and custom crawls, but the magic happens when we try to make sense of what we index and refine the 10 million+ stories down to the several hundred most important stories of the day. That’s the technical challenge.

The BHAG (big hairy audacious goal) is to address the increased polarization of society. The filter bubble that results from getting your news from social networks is caused by the echo chamber effect of a news feed optimized to show you more of what you engage with and less of what you do not. Personalization is excellent for increasing relevance in things like search, where you need to narrow results to find what you’re looking for, but it is dangerously limiting for a news product, where a narrowly personalized experience has what Filter Bubble author Eli Pariser called “negative implications for civic discourse.”

So how do you crawl 10 million URLs daily and figure out which stories are important enough for everyone to know? Enter Machine Learning.

I’m still a newbie to this but am beginning to appreciate the promise of applying machine learning to the problem above. New to machine learning too? Here’s a compelling example of what you can do, illustrated in a recent presentation by Samiur Rahman, an engineer at Mattermark, which uses machine learning to match news to their company profiles.

Samiur Rahman on Machine Learning

The word relationship map above was the result of a machine learning algorithm being set loose on a corpus of 100,000 documents overnight. By scanning all the sentences in the documents, looking at which words appeared in those sentences, and noting the frequency and proximity of those words, the algo was able to learn that Japan : sushi as USA : pizza, and that Einstein : scientist as Picasso : painter.
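To make the idea concrete, here is a minimal sketch of the "A is to B as C is to ?" trick. It is not the model from the presentation, which I have not seen the code for. It is a much cruder stand-in that only counts which words share a sentence, and the four-sentence corpus and the function names are my own inventions for illustration.

```python
from collections import defaultdict
from itertools import permutations
from math import sqrt

# Hypothetical toy corpus; the presentation used 100,000 real documents.
SENTENCES = [
    "japan loves sushi",
    "usa loves pizza",
    "einstein was a scientist",
    "picasso was a painter",
]


def build_vectors(sentences):
    """Map each word to counts of the words it shares a sentence with."""
    vectors = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.split()
        # Every ordered pair of positions in the sentence counts once.
        for left, right in permutations(words, 2):
            vectors[left][right] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0) for key, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' using the vector b - a + c."""
    target = defaultdict(int)
    for word, sign in ((b, 1), (a, -1), (c, 1)):
        for key, weight in vectors[word].items():
            target[key] += sign * weight
    # The three query words are excluded from the possible answers.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


VECTORS = build_vectors(SENTENCES)
print(analogy(VECTORS, "japan", "sushi", "usa"))
print(analogy(VECTORS, "einstein", "scientist", "picasso"))
```

On this tiny corpus the two calls come back with pizza and painter. With so little text the margins are thin, which is a small-scale version of why a real system needs a large corpus and continual tuning.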

Those of you paying close attention will notice that some of the relationships are off slightly – France : tapas? Google : Yahoo? Spotting those is the power of the human mind at work. We’re great with pattern matches. Machine learning algorithms are just that, algorithms, and they need continual tuning. Koizumi : Japan? Well, that shows you the limitations of working with a dated corpus of documents.

But take a step back and think about it. In 24 hours, a well-written algorithm can take a blob of text and parse it for meaning and use that to teach itself something about the world in which those documents were created.

Now jump over to SmartNews and understand that our algorithms are processing 10 million news stories each day and figuring out the most important news of the moment. Not only are we looking for what’s important, we’re also determining which section to feature the story in, how prominently, where to cut the headline, and how best to crop the thumbnail photo.

The algorithm is continually being trained and the questions that it kicks back are just as interesting as the choices it makes.

The push and pull between discovery, diversity, and relevance are all inputs into the ever-evolving algorithm. Today I learned about “exploration vs. exploitation”. How do we tell our users the most important stories of the day in a way that covers the bases but also teaches you something new?
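For fellow newbies, here is a minimal sketch of one textbook answer to that trade-off, usually called epsilon-greedy. I am not describing how SmartNews actually works. The story names, the click rates, and the pick_story function are all made up for illustration.

```python
import random


def pick_story(click_rates, epsilon, rng):
    """Epsilon-greedy choice over stories.

    With probability epsilon, explore: pick any story at random.
    Otherwise, exploit: pick the story with the best known click rate.
    """
    if rng.random() < epsilon:
        return rng.choice(sorted(click_rates))
    return max(click_rates, key=click_rates.get)


# Hypothetical click rates observed so far for three stories.
CLICK_RATES = {"election": 0.12, "science": 0.05, "local": 0.03}

rng = random.Random(2015)  # seeded so repeated runs behave the same
picks = [pick_story(CLICK_RATES, epsilon=0.2, rng=rng) for _ in range(1000)]
for story in CLICK_RATES:
    print(story, picks.count(story))
```

With epsilon at 0.2, the best-known story still gets most of the slots, but the other stories keep getting a chance to prove themselves. Setting epsilon to 0 would show only the proven winner, which is the filter bubble in miniature.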

This is a developing story, stay tuned!

Getting the Band Together Again at SmartNews

Following a month off after my unexpected liberation from Gigaom, I started this week as Director of Media & Technology Partnerships at SmartNews. I feel very fortunate to have discovered this company at a time when I believe I have a lot to offer.

First, some recent coverage,

While researching the company, I was delighted to learn they had hired Rich Jaroslovsky. Rich and I crossed paths a few times when I was working at Dow Jones as he was getting wsj.com off the ground. We both have a fascination with technology’s impact on media and I shared his mission to bring The Wall Street Journal online. We had since gone our separate ways but I always admired his love and respect for good journalism as a writer, editor, and business guy.

Rich explained to me that SmartNews thinks of itself as a machine learning company with a news front-end, which is right in the nexus of what makes me tick. The co-founders, Ken Suzuki and Kaisei Hamamoto, are super-sharp engineers who see news discovery as an interesting problem to solve and hugely important for society to get right. To give you a sense of how they think: as they looked for real estate for their San Francisco office, Ken and Kaisei each created their own interactive maps showing the locations of high tech startups and compared notes to determine that the area of 2nd and Howard was the ideal spot to focus their search.

I made my pitch (excerpted below) and here I am!

Two of the hardest challenges for the publishing industry are distribution and advertising. When publishers moved online, they had to reinvent their traditional distribution channels and navigate a new landscape.

Initially it was the portals such as Yahoo and AOL that would curate the best of the web. Advertising was also sold this way, manually curated and matched to broad channels of interest maintained by the portals.

As technology improved, search engines such as Google automated discovery, matching a reader’s interests to a publisher’s content. Advertising was automated and optimized via keyword matching and auction systems to extract maximum value. Distributed widgets allowed publishers to embed advertising into their sites, and a combination of publisher tags and indexing allowed them to take advantage of an ad network’s inventory.

Social media platforms have recently taken over as a source of traffic for publishers and content snippets shared via these networks represent the fastest growing segment of inbound readers for a publisher.

A common thread to success across all these channels is attractive representation of a publisher’s content within each distribution channel. Whether it’s meta-data, SEO, or “social media optimization,” each new distribution channel has spawned a new method of representing your content to the service which is doing the crawling and aggregation.

For a new distribution channel, both the crawling and aggregation algorithms are key to successful presentation of content and relevant advertising to the reader.

Technology has enabled effortless distribution of news, so the looming challenge is not so much the distribution of content as its discovery and presentation. Social media burnout is a factor, and personalization algorithms are still very basic: they often push more and more similar content to the reader, resulting in a “filter bubble” that shows the reader only what they want to see or, worse, what they already know.

Working with publishers to find them new sources of readership, and with readers to teach them something they didn’t know, is an important goal that aligns with my interests. The fact that the team is based in Japan, a country with a strong culture of news readership, is attractive to me as I am a big fan of introducing Japan to the rest of the world.