How Netflix Is Using Your Taste in Movies

For all the attention that privacy (or the lack of it) received in 2013, there are some forms of snooping that you might actually appreciate.

If you use sites like Gmail or Facebook, you probably know that they mine your data and usage in order to show you ads that are better suited to your interests. That may not sound so great, but it is an improvement on getting ads that are completely irrelevant to your life. Sites like Amazon and Netflix are also mining your data, not to show you ads but to show you more relevant recommendations. Their systems have become more sophisticated and more granular at judging your preferences. Netflix looks at the genres that you watch. In the early days, these would have been broad categories like drama, comedy, action, romance, etc. But its genres have gotten very specific, sometimes to a humorous degree, as in Fight-the-System Documentaries, Period Pieces About Royalty Based on Real Life, and Foreign Satanic Stories from the 1980s.

Alexis Madrigal is a senior editor at The Atlantic, where he oversees the Technology Channel. He's the author of Powering the Dream: The History and Promise of Green Technology. He wondered how Netflix, with its 40 million users (more than HBO now), decides which genres a film fits into in order to quantify your personal tastes.

We sometimes call this kind of classification a taxonomy, or a folksonomy when it is done by "the crowd."

His interest turned into a bit of an obsession and then he discovered that he could scrape (capture) each and every microgenre that Netflix's algorithm has ever created. He discovered that Netflix possesses not several hundred genres, or even several thousand, but 76,897 unique ways to describe types of movies.

You may not be a movie fan, a Netflix subscriber or even very interested in Big Data, but organizations (companies and colleges) are very interested in knowing about you. Therefore, you should have some interest in, and understanding of, what is being done with your data.

Madrigal realized that there was no way he could go through all those genres by hand, so he wrote a script, using a piece of software called UBot Studio, to incrementally go through each of the Netflix genres and copy them to a file. He then spent several weeks understanding, analyzing, and reverse-engineering how Netflix's vocabulary and grammar work.
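The idea of stepping through the genres one at a time can be sketched in a few lines of Python. This is only an illustration, not Madrigal's actual script (he used UBot Studio). It assumes that genres are reachable by a numeric ID, and the names `collect_genres`, `fetch_genre_name`, `FAKE_SITE` and `fake_fetch` are all made up for the example. The lookup function is passed in so the sketch runs offline against a stand-in instead of a real site.

```python
from typing import Callable, Dict, Optional


def collect_genres(
    fetch_genre_name: Callable[[int], Optional[str]],
    first_id: int,
    last_id: int,
) -> Dict[int, str]:
    """Walk genre IDs in order and keep every ID that resolves to a name."""
    genres: Dict[int, str] = {}
    for genre_id in range(first_id, last_id + 1):
        name = fetch_genre_name(genre_id)
        if name:  # skip IDs that have no genre behind them
            genres[genre_id] = name.strip()
    return genres


# Hypothetical stand-in for a real page lookup (assumption: IDs are integers).
FAKE_SITE = {
    1: "Cult Evil Kid Horror Movies",
    3: "Romantic Chinese Crime Movies",
}


def fake_fetch(genre_id: int) -> Optional[str]:
    return FAKE_SITE.get(genre_id)


found = collect_genres(fake_fetch, 1, 5)
for genre_id, name in found.items():
    print(genre_id, name)
```

In a real run, the stand-in would be replaced by something that loads a page and reads the genre title from it, and the results would be written to a file rather than printed.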

He discovered many very specific genres in the system, such as:

Emotional Independent Sports Movies

Spy Action & Adventure from the 1930s

Cult Evil Kid Horror Movies

Sentimental set in Europe Dramas from the 1970s

Romantic Chinese Crime Movies

Mind-bending Cult Horror Movies from the 1980s

Time Travel Movies starring William Hartnell

Visually-striking Goofy Action & Adventure

British set in Europe Sci-Fi & Fantasy from the 1960s

Critically-acclaimed Emotional Underdog Movies
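Once the genre names are in a file, one simple way to start looking at their vocabulary is to count recurring words and pull out repeated patterns such as the "from the 1980s" ending. The Python sketch below does that for the ten genres listed above. It is a rough illustration under that assumption, not a reconstruction of Madrigal's analysis, and the names `GENRES`, `DECADE` and `decade_of` are invented for the example.

```python
import re
from collections import Counter

# The example genres listed above.
GENRES = [
    "Emotional Independent Sports Movies",
    "Spy Action & Adventure from the 1930s",
    "Cult Evil Kid Horror Movies",
    "Sentimental set in Europe Dramas from the 1970s",
    "Romantic Chinese Crime Movies",
    "Mind-bending Cult Horror Movies from the 1980s",
    "Time Travel Movies starring William Hartnell",
    "Visually-striking Goofy Action & Adventure",
    "British set in Europe Sci-Fi & Fantasy from the 1960s",
    "Critically-acclaimed Emotional Underdog Movies",
]

# Matches a trailing decade such as "from the 1980s".
DECADE = re.compile(r"from the (\d{4}s)$")


def decade_of(genre):
    """Return the decade a genre name ends with, or None if it has none."""
    match = DECADE.search(genre)
    return match.group(1) if match else None


# How often each decade and each word appears across the genre names.
decades = Counter(d for d in map(decade_of, GENRES) if d)
words = Counter(word for genre in GENRES for word in genre.split())

print(decades.most_common())
print(words.most_common(5))
```

Even on ten names, the counts hint at building blocks that get reused: a handful of adjectives, a noun like "Movies", and optional endings for time period or star.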

In the article he wrote for The Atlantic, there is a generator which will give you many of the genres. It is an imperfect system. He found an oddly large number of genres for the actor Raymond Burr (best known for the old TV show Perry Mason). Why?

He explains, quoting Netflix executive Todd Yellin: "The vexing, remarkable conclusion is that when companies combine human intelligence and machine intelligence, some things happen that we cannot understand. 'Let me get philosophical for a minute. In a human world, life is made interesting by serendipity,' Yellin told me. 'The more complexity you add to a machine world, you're adding serendipity that you couldn't imagine. Perry Mason is going to happen. These ghosts in the machine are always going to be a by-product of the complexity. And sometimes we call it a bug and sometimes we call it a feature.' Perry Mason episodes were famous for the reveal, the pivotal moment in a trial when Mason would reveal the crucial piece of evidence that makes it all make sense and wins the day. Now, reality gets coded into data for the machines, and then decoded back into descriptions for humans. Along the way, humans' ability to understand what's happening gets thinned out. When we go looking for answers and causes, we rarely find that aha! evidence or have the Perry Mason moment. Because it all doesn't actually make sense. Netflix may have solved the mystery of what to watch next, but that generated its own smaller mysteries. Sometimes we call that a bug, and sometimes we call it a feature."

