SmartNews (where I work) is running a series of TV commercials in Japan featuring the Japanese celebrity Tamori. The tagline for the campaign is “禁断のニュースアプリ” which roughly translates as “The forbidden news application,” as in it’s so addictive that you binge on it when you’ve got time alone.
There’s an ongoing series of video interviews with journalists on the futureof.news site. Two recent interviews were with Rich Jaroslovsky, my boss at SmartNews. Rich and I crossed paths years ago. He has not only a good instinct for what works for media online but also a history in both the print and online journalistic worlds and a deep memory for how things are put together and came to be the way they are today.
It is a huge vote of confidence that he’s working for SmartNews and, as you can see from the clips below, he’s here for all the right reasons. Some key quotes to call out:
excessive personalization is a rabbit hole. It at some point becomes an active negative, because what ends up happening is that you never discover anything new, you never discover anything that you didn’t know ahead of time you would be interested in, and instead your worldview gets narrower and narrower.
. . .
When we launched WSJ.com, one of my conclusions was, serendipity is very hard to do in a digital environment. One of the great charms of SmartNews is that it has reintroduced that concept of serendipity, of finding things that you didn’t know you’d be interested in, and they turn out to be very interesting.
. . .
I’ve had many epiphanies over the years about digital journalism and how it’s different than print journalism, and one of them is that there is a craving in the audience for authenticity, for hearing things as close to the original source as possible. There are people who want to be able to access content that is from international sources, even when they are reading about stories that are being heavily covered by US media because it provides a different viewpoint.
. . .
In some ways news has been disintermediated the same way that music was. When I was in my record buying heyday and CD buying heyday, if there was a song I really liked, I had to buy the record. I had to buy the CD. And the fundamental unit was that CD, that package. I had to buy the whole package to get that one song. Now if there’s a song I like, I can buy that one song. That’s a very different model, as the music industry has learned somewhat to its despair but is adapting to. In news the same thing has happened.
The brand is no longer a destination, a place that people go to to get news. The brand is a mark of quality on that story. This is a USA Today story, I know what USA Today standards are, therefore the fact that it says USA Today, which is one of our valued partners, on top of that story—that’s a brand of quality. I know what I’m getting here. Or an NBC story, or a Huffington Post story, or a Fox News story. So it’s a very different environment, and the brand is still extremely important, but the meaning has changed quite fundamentally.
My greatest hope is the flip side of that coin: that as journalism evolves, as new forms of journalism evolve, as new delivery mechanisms evolve, that the end product is a more informed person and a more informed populace. Because I think that an informed populace is the critical element to a successful, thriving democracy. So my great hope is that as journalism works through this period of turmoil and uncertainty, that we come out the other end with models that keep citizens informed, where people can always get the information they need to make informed decisions.
You can see the entire text of the interview on the futureof.news site. I’ve also embedded both video clips below.
SmartNews SF had over 50 Japanese university students visit the office to learn about doing business at a US startup and learn about how to start their career. This trip is part of the Japanese Ministry of Foreign Affairs-funded Kakehashi Project to promote greater understanding and opportunities between the US and Japan.
I’m always looking for an opportunity to practice my Japanese so I jumped at the chance to speak to them. I tried to give as much of my talk in Japanese as I could but, as you can see, the slides are in English.
The main topics were:
SmartNews, how it works, why it’s interesting and why it’s a cool company.
My career, how I ended up at SmartNews, and what I learned along the way.
How to get a job at a US company, what tools to use, and how to use them.
Thank you Dennis, Jessica, Naoki, Chika, and Shunan for helping set up and handling the crowd and thank you Ken Funabashi from the Japan Consulate, Stacy Hughes, and Shimizu-san for giving SmartNews the opportunity.
We all remember the biggest stories of 2015: El Chapo’s escape, Ronda Rousey’s KO, and who can forget The Dress? In the spirit of discovery, we at SmartNews would like to highlight the stories that you might have missed. Following on the hidden gems theme, I took a look at each of the SmartNews categories and looked for the outliers. My somewhat unscientific methodology looked for stories from sources that would not normally appear in a category but were picked up and featured there based on a topic analysis, hopefully introducing each source to a new audience that would not normally be exposed to that publication.
GQ describes itself as a men’s fashion and style magazine. When Marshall Sella tests the Bitcoin waters, SmartNews puts his piece in front of the Business readers. Marshall describes his time with Charlie Shrem, an early Bitcoin entrepreneur (bitrepreneur?) whose LinkedIn profile now shows him cooling his heels at Lewisburg Federal Prison.
It’s not often that Scientific American shows up in the Entertainment section. Cindi May’s The Problem with Female Superheroes took a look at how characters such as Storm and Dazzler in the recent X-Men films may be adversely affecting the young audiences who watch them. “Saving the world in spiked heels” may not be giving young girls a realistic expectation of their abilities. We hope the upcoming Dawn of Justice does a better job.
We all cringed when we saw the video of the 12-year-old boy who tripped and punched a hole in the 350-year-old painting valued at $1.5 million. Oliver Holmes of The Guardian covers the restoration effort (thankfully it was insured) and points to other mishaps in which a pair of Qing dynasty Chinese vases and a Picasso did not fare as well. SmartNews placed this one in the Lifestyle section, which is where our Art & Culture stories are featured.
“I thought it was a CIA surveillance device,” said Brett McBay in Modesto, California after instructing his son to shoot his neighbor’s drone out of the sky with a 12-gauge shotgun. Cyrus Farivar at Ars Technica brought up a number of issues including the right to privacy (the skies around your home), respect for private property (Eric Joe’s homemade hexacopter drone), and of course the right to fire off buckshot into the sky. SmartNews placed this story into the US category where many of our gun violence stories have been running. Inquiring minds want to know if this Brett McBay of Modesto is the same Brett McBay whose Twitter profile states he is the District Representative for a California State Assemblymember.
SB Nation covers sports and, yes, there is a basketball in this bit but it’s used to explain the Magnus Effect from physics and, for that reason, this article showed up in Science.
The Nation likes to dig (and sometimes poke) which usually lands them in the US section for political coverage. Back in May, Dave Zirin asked why mainstream sports sites were not covering the case of NBA player Thabo Sefolosha, who was tackled by NYPD outside a nightclub, injured, and subsequently missed the playoffs with his team. This story introduced Sports readers to The Nation style of media inquiry. Seven months later ESPN published an in-depth investigative piece on this same story.
SmartNews has built a sophisticated duplicate-content filter so that when the latest press release from a presidential candidate, disaster, crime, or culinary sensation hits the proverbial viral loop, the breaking news from multiple outlets does not overwhelm the app and crowd out other stories. SmartNews strives to promote only the best, most distinctive stories to our readers.
But there are times when you want to dig deeper on an issue or read an alternative take. Introducing the Recommended widget.
You can find the Recommended widget at the bottom of the SmartView of any article in the SmartNews app. Swipe left on any article to get to the simplified SmartView of that article. Scroll down to the bottom of the article to get to the Recommended widget.
The first three headlines (in purple) are from the publisher of the original piece (Rolling Stone). If the publisher has continued coverage of the story, you’ll see past stories about the topic giving you deeper context around what you just read. In this example, there is a link to an earlier story about the auction followed by an interview with Ringo Starr and then a piece about The Beatles and their album Rubber Soul.
The bottom two headlines (in green) are culled from our daily crawl of 10 million+ headlines and matched entirely based on a custom SmartNews algorithm. Here we see two other stories about the Ringo Starr auction, one from The Guardian and the second from NME.
How do we do it? That sophisticated de-dupe filter we built to reduce articles that are too much alike? Turn it around and it makes a fantastic related-articles algorithm!
Each article is automatically “read” and key terms, companies, people, and other entities are extracted along with data around the author, publisher, length of the piece and many other factors that are used to make a data representation of the article. When two representations overlap significantly we give them a similarity score. The higher the score, the more similar the two articles are for the purposes of filtering or recommending.
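To make the idea of overlapping representations concrete, here is a minimal sketch. It is not the actual SmartNews algorithm: the Article class, the use of Jaccard overlap on extracted entities, and the sample data are all assumptions chosen for illustration, and the real representation uses many more factors (author, length and so on).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Article:
    """Hypothetical, heavily simplified representation of an article."""
    title: str
    publisher: str
    entities: frozenset = field(default_factory=frozenset)


def similarity(a: Article, b: Article) -> float:
    """Jaccard overlap of extracted entities: 0.0 (disjoint) to 1.0 (identical)."""
    union = a.entities | b.entities
    if not union:
        return 0.0
    return len(a.entities & b.entities) / len(union)


def recommend(seed, candidates, limit=2):
    """Rank candidates from other publishers by similarity to the seed."""
    scored = [
        (similarity(seed, c), c)
        for c in candidates
        if c.publisher != seed.publisher
    ]
    # Drop articles with no overlap at all, then sort best match first.
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:limit]]


# Made-up sample data loosely based on the Ringo Starr example above.
seed = Article("Ringo Starr auction", "Rolling Stone",
               frozenset({"Ringo Starr", "auction", "The Beatles"}))
guardian = Article("Starr sale", "The Guardian",
                   frozenset({"Ringo Starr", "auction", "drum kit"}))
nme = Article("Beatles memorabilia", "NME",
              frozenset({"Ringo Starr", "auction", "The Beatles", "White Album"}))
recipe = Article("Peas in guacamole", "Food Daily",
                 frozenset({"guacamole", "peas"}))

picks = recommend(seed, [guardian, nme, recipe])
```

The same score serves both purposes described above: a very high score between two articles is a hint to filter one of them out as a duplicate, while a moderately high score makes a candidate for the Recommended widget.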
I like to think of the Recommended widget as a jumping off point for further exploration. Headlines 1-3 go deeper into the past with a specific source while headlines 4 & 5 go broader along the same topic but across different publications. Choose your adventure.
The similar articles feature is not new. I use a WordPress plugin on this blog to power the Related box you see below each post. Most news sites have something similar, usually driven by keyword or tag matching, against a limited content set. SmartNews has a more sophisticated matching algorithm across a much broader universe of articles and I think you’ll notice the difference right away.
There are at least two sides to every story. The Planned Parenthood videos were a polarizing topic that monopolized the news cycle several weeks ago. How do you teach an algorithm a point of view? How do you optimize for discovery and strike the right balance for diversity while avoiding duplication?
SmartNews is a news aggregation app driven by machine learning algorithms. The platform is tuned for discovery (as opposed to personalization). After using it regularly, I began collecting screenshots of my favorite examples when the app taught me something new or showed me two items side-by-side that suggested a subtle intelligence.
The science and application of artificial intelligence to personalization is well understood. From Amazon’s people-that-bought-this-also-bought-that to Pandora’s Music Genome Project, software has for years been recommending what you’ll like next based on what you’ve liked so far.
The new frontier in artificial intelligence is machine learning. Companies such as Spotify and Netflix are hard at work trying to predict future tastes based on an evolving understanding of collective tastes. Sure, learning assumes knowledge of the past, but projecting that learning into the future is much harder as you build a model based on an understanding of something that does not exist. Rather than showing you something we know you’ll like based on what you liked in the past, machine learning discovers things you didn’t know you would like.
First a little context. SmartNews, while deceptively simple, has a lot going on under the hood. At any time, the SmartNews app shows around 250 headlines across 8 categories. These headlines are selected from millions of stories that are scanned each day. In order to ensure that the stories featured in the app are the most important and interesting, a number of things must take place.
After harvesting URLs, the text of each article is run through a classifier that examines things such as the headline, author byline, publication date, images and video embeds. These pieces are analyzed by a semantic engine that extracts data so the algorithm can map the article to a topic cluster and place it into the appropriate subject category. (I wrote about how this is done in an earlier post)
Importance estimation is where we rank an article and determine where it will go in the app relative to other articles. Does it go towards the top of a section or towards the bottom? If the top, does it deserve featured treatment? Maybe it’s so topical it needs to be pushed to the Top page, which is reserved for only the most important stories of the moment.
Finally, diversification ensures there is a good mix of stories in each category. If there are 40 stories about guacamole and peas, here’s where we determine which to show and which to push to the background. If there’s a new development on a story, the update will push its way in and take prominence over an older story.
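The diversification step can be sketched in a few lines. This is only a toy version under stated assumptions: the Story fields, the topic labels, the importance scores, and the per-topic cap are all hypothetical, and it ignores the part where a fresh update displaces an older story.

```python
from collections import defaultdict
from typing import NamedTuple


class Story(NamedTuple):
    headline: str
    topic: str         # hypothetical topic-cluster label
    importance: float  # hypothetical score; higher means more important


def diversify(stories, per_topic=2):
    """Keep at most `per_topic` stories per topic, most important first."""
    shown = []
    seen = defaultdict(int)
    for story in sorted(stories, key=lambda s: s.importance, reverse=True):
        if seen[story.topic] < per_topic:
            shown.append(story)
            seen[story.topic] += 1
    return shown


# Made-up sample data.
stories = [
    Story("Peas in guacamole?", "guacamole", 0.9),
    Story("Guacamole purists respond", "guacamole", 0.8),
    Story("A third guacamole take", "guacamole", 0.7),
    Story("Drone shot down in Modesto", "drones", 0.6),
]

front_page = diversify(stories, per_topic=2)
```

With a cap of two per topic, the third guacamole story is pushed to the background and the drone story makes the cut even though its score is lower.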
These are just details to give you context. The most amazing thing to me is when the app surfaces a “hidden gem” that I would not normally run across if I were using an RSS reader hard-coded to a collection of feeds, or a social network that is limited to news shared by my friends.
The best way to appreciate SmartNews as a discovery engine is to use it daily, but if you haven’t had a chance, here are a few more of my favorite Gems below:
While the Center for Medical Progress’ undercover video interviews with Planned Parenthood staffers may have been shocking, the representation of two points of view helped me see both sides of the issue. What was interesting was the Cosmopolitan article (a source I normally do not read) had the best measured rebuttal.
Much of the climate change news ends up in the Science category. As that story grows in relevance to us all, more publications dig into it. If you haven’t read this terrifying Rolling Stone piece, read it now.
Here’s an example of a developing story getting an update. ESPN reports that WWE is cutting its relationship with Hulk Hogan over his offensive comments. People Magazine follows up with the story of his apology. Oh, also notice that the algorithm put both stories into the Entertainment section.
As news of the killing of Cecil the Lion went viral, the algorithm was smart enough to surface a side of the story from a local Minnesota paper.
The screenshot above, more than any of the others, shows the freaky intelligence working behind the scenes. Like those times when an algorithmically generated playlist just nails the transition of one song into the next, drawing the connection between gun violence in the US and how such an environment might have prepared an off-duty soldier to do the right thing shows how a well-designed system can be greater than just the sum of its component parts.
Do you use SmartNews? Have you had the same experience? Send along some of your own Hidden Gems and I’ll add them to the gallery.
SmartNews is focused on today’s news. Because of this the app is optimized for showing you the most important stories of the moment. The idea is to get you up to speed on what’s going on and then on with your day. If we do our job well there, the thinking goes, you’ll be back.
That said, there are times when you’re glancing at the latest headlines and you run across a meaty profile in Vanity Fair or a lengthy speech transcript in Medium. I’ve seen comments in the App Store where people are looking for a way to save articles for later. There are a couple of options that I’d like to share.
Read it later with Pocket
SmartNews is integrated with Pocket. Create an account at Pocket or log in with your existing one. When you share from the article page on SmartNews (another pro tip: a long press on any headline will go directly to the save menu), you have the option to Save to Pocket. Once you’ve saved an article there, you can go back to Pocket on the Web and read the full text of the article later. If you upgrade to Pocket Premium, they will even download, index, and archive the full text of anything you save to Pocket, making later retrieval easier.
Hear it later with Pocket
Pocket recently added Text-to-Speech to their mobile app. I ride my bike to work so sometimes it’s better to have a long article read to me. This afternoon I listened to the transcript of Jennifer Granick’s excellent keynote at Black Hat 2015, The End of the Internet Dream, which was posted on Medium.
It somehow seemed appropriate to have the same voice that speaks to me as Siri explaining how important it is to keep the internet open and decentralized.
Show more, is that an archive?
Well, kinda. While we try as much as possible to keep things lightweight in the SmartNews app, we recognize that you might sometimes go more than several hours in between SmartNews fixes. We hear you. But if you’re hearing about that great story in the morning and it’s no longer there, we’ve got your back!
Scroll to the bottom of any tab other than Top and you’ll see a “Show more” link that will show you more articles in the channel. We can’t store everything but it’ll at least extend your horizon a few more hours if you want to dig in a little further.
It’s now just over a month since I joined SmartNews and I am digging into what’s under the hood and the mad science that drives the deceptively simple interface of the SmartNews product.
On the surface, SmartNews is a news aggregator. Our server pulls in URLs from a variety of feeds and custom crawls but the magic happens when we try to make sense of what we index to refine the 10 million+ stories down to the several hundred most important stories of the day. That’s the technical challenge.
The BHAG is to address the increased polarization of society. The filter bubble that results from getting your news from social networks is caused by the echo chamber effect of a news feed optimized to show you more of what you engage with and less of what you do not. Personalization is excellent for increasing relevance in things like search, where you need to narrow results to find what you’re looking for, but personalization is dangerously limiting for a news product, where a narrowly personalized experience has what Filter Bubble author Eli Pariser called “negative implications for civic discourse.”
So how do you crawl 10 million URLs daily and figure out which stories are important enough for everyone to know? Enter Machine Learning.
I’m still a newbie to this but am beginning to appreciate the promise of applying machine learning to the problem above. New to machine learning too? Here’s a compelling example of what you can do, illustrated in a recent presentation by Samiur Rahman, an engineer at Mattermark, which uses machine learning to match news to its company profiles.
The word relationship map above was the result of a machine learning algorithm being set loose on a corpus of 100,000 documents overnight. By scanning all the sentences in the documents, looking at which words appeared in those sentences, and noting the frequency and proximity of those words, the algo was able to learn that Japan : sushi as USA : pizza, and that Einstein : scientist as Picasso : painter.
Those of you paying close attention will notice that some of the relationships are off slightly – France : tapas? Google : Yahoo? This is the power of the human mind at work. We’re great with pattern matches. Machine learning algorithms are just that, algorithms, and they need continual tuning. Koizumi : Japan? Well, that shows you the limitations of working with a dated corpus of documents.
But take a step back and think about it. In 24 hours, a well-written algorithm can take a blob of text and parse it for meaning and use that to teach itself something about the world in which those documents were created.
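For the curious, here is a toy sketch of how analogies like the ones above can fall out of simple arithmetic on word vectors. I am assuming the map came from something like word embeddings, which the presentation may or may not have used. The three-dimensional vectors below are hand-written for illustration, not learned from any corpus, and the function names are my own.

```python
import math

# Hand-made toy vectors: (country-ness, food-ness, Japan-vs-USA axis).
# A real system would learn hundreds of dimensions from text.
vectors = {
    "japan": (1.0, 0.0, 1.0),
    "sushi": (0.0, 1.0, 1.0),
    "usa":   (1.0, 0.0, -1.0),
    "pizza": (0.0, 1.0, -1.0),
    "golf":  (0.5, 0.5, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Return the word d that best completes 'a : b as c : d'."""
    # Move from c in the same direction that leads from a to b.
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


answer = analogy("japan", "sushi", "usa", vectors)
```

The odd pairings in the map are what you get when the learned vectors are a little off: the nearest word to the target point is simply not the one a human would pick.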
Now jump over to SmartNews and understand that our algorithms are processing 10 million news stories each day and figuring out the most important news of the moment. Not only are we looking for what’s important, we’re also determining in which section to feature the story, how prominently, where to cut the headline and how to best crop the thumbnail photo.
The algorithm is continually being trained and the questions that it kicks back are just as interesting as the choices it makes.
A story about President Obama playing a round of golf. Is it a sports story or is it a political story?
Discovery, diversity, and relevance push and pull against one another, and all three are inputs into the ever-evolving algorithm. Today I learned about “exploration vs. exploitation”. How do we tell our users the most important stories of the day in a way that covers the bases but also teaches them something new?
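As a newbie’s illustration of exploration vs. exploitation, here is the textbook “epsilon-greedy” strategy. I am not claiming this is what SmartNews does. The click rates, the story names, and the pick_story function are all made up for the sketch.

```python
import random


def pick_story(click_rates, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over stories.

    With probability `epsilon`, explore by picking any story at random;
    otherwise exploit by picking the story with the best known click rate.
    """
    if rng.random() < epsilon:
        return rng.choice(list(click_rates))
    return max(click_rates, key=click_rates.get)


# Hypothetical observed click rates per story.
click_rates = {
    "election update": 0.12,
    "hidden gem from GQ": 0.04,
    "local Minnesota paper": 0.02,
}

always_exploit = pick_story(click_rates, epsilon=0.0)
sometimes_explore = pick_story(click_rates, epsilon=0.3, rng=random.Random(0))
```

Setting epsilon to zero always shows the proven winner, which is the filter bubble in miniature. Raising it gives lesser-known stories a chance to be seen and to prove themselves.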