How Netflix Is Using Your Taste in Movies

Despite all the attention that privacy (or the lack of it) received in 2013, there are some forms of snooping that you might actually appreciate.

If you use sites like Gmail or Facebook, you probably know that they are mining your data and usage in order to give you ads that are better suited to your interests. I know that may not sound so great, but it is an improvement on getting ads that are completely irrelevant to your life. But sites like Amazon and Netflix are also mining your data, not to show you ads but to show you more relevant recommendations. Their systems have become more sophisticated and more granular at judging your preferences. Netflix looks at the genres that you watch. In the early days, these would have been broad categories like drama, comedy, action, romance, etc. But its genres have gotten very specific, sometimes to a humorous degree - as in Fight-the-System Documentaries, Period Pieces About Royalty Based on Real Life, and Foreign Satanic Stories from the 1980s.

Alexis Madrigal is a senior editor at The Atlantic, where he oversees the Technology Channel. He's the author of Powering the Dream: The History and Promise of Green Technology. He wondered how Netflix, with its 40 million users (more than HBO now), decides on the genres that a film fits into in order to quantify your personal tastes.

We sometimes call this kind of classification a taxonomy, or a folksonomy when it is done by "the crowd."

His interest turned into a bit of an obsession and then he discovered that he could scrape (capture) each and every microgenre that Netflix's algorithm has ever created. He discovered that Netflix possesses not several hundred genres, or even several thousand, but 76,897 unique ways to describe types of movies.

You may not be a movie fan, Netflix subscriber or even very interested in Big Data - but organizations (companies and colleges) are very interested in knowing about you. Therefore, you should have some interest in and understanding of what is being done to you.

Madrigal wrote a script to pull that data and then spent several weeks understanding, analyzing, and reverse-engineering how Netflix's vocabulary and grammar work. He realized that there was no way he could go through all those genres by hand, so he used a piece of software called UBot Studio to incrementally go through each of the Netflix genres and copy them to a file. 

He discovered many very specific genres in the system, such as:

Emotional Independent Sports Movies

Spy Action & Adventure from the 1930s

Cult Evil Kid Horror Movies

Sentimental set in Europe Dramas from the 1970s

Romantic Chinese Crime Movies

Mind-bending Cult Horror Movies from the 1980s

Time Travel Movies starring William Hartnell

Visually-striking Goofy Action & Adventure

British set in Europe Sci-Fi & Fantasy from the 1960s

Critically-acclaimed Emotional Underdog Movies
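To see how a modest vocabulary can balloon into tens of thousands of labels, here is a minimal sketch in Python. It is not Netflix's actual grammar: the slot names, their order, and the word lists are assumptions, with the words borrowed from the sample genres above. It simply fills each slot with every available option and counts the results.

```python
from itertools import product

# Illustrative vocabulary only. The slots and their order are assumptions,
# not Netflix's real schema. An empty string means "leave this slot out".
ADJECTIVES = ["", "Emotional", "Romantic", "Mind-bending"]
REGIONS = ["", "British", "Chinese"]
NOUNS = ["Crime Movies", "Horror Movies", "Sci-Fi & Fantasy"]
DECADES = ["", "from the 1960s", "from the 1980s"]


def build_genre(adjective, region, noun, decade):
    """Join the non-empty slots into one genre label."""
    return " ".join(part for part in (adjective, region, noun, decade) if part)


def all_genres():
    """Generate every label the toy grammar can produce."""
    combos = product(ADJECTIVES, REGIONS, NOUNS, DECADES)
    return [build_genre(*combo) for combo in combos]


genres = all_genres()
print(len(genres))   # 4 * 3 * 3 * 3 = 108 labels from only 13 vocabulary entries
print(genres[:5])
```

Even this toy version turns a dozen words into more than a hundred genres, and each added word or slot multiplies the total again.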

In the article he wrote for The Atlantic, there is a generator which will give you many of the genres. It is an imperfect system. He found an oddly large number of genres for the actor Raymond Burr (best known for the old TV show Perry Mason). Why?

He explains: "The vexing, remarkable conclusion is that when companies combine human intelligence and machine intelligence, some things happen that we cannot understand. 'Let me get philosophical for a minute. In a human world, life is made interesting by serendipity,' Yellin told me. 'The more complexity you add to a machine world, you're adding serendipity that you couldn't imagine. Perry Mason is going to happen. These ghosts in the machine are always going to be a by-product of the complexity. And sometimes we call it a bug and sometimes we call it a feature.' Perry Mason episodes were famous for the reveal, the pivotal moment in a trial when Mason would reveal the crucial piece of evidence that makes it all make sense and wins the day. Now, reality gets coded into data for the machines, and then decoded back into descriptions for humans. Along the way, humans' ability to understand what's happening gets thinned out. When we go looking for answers and causes, we rarely find that aha! evidence or have the Perry Mason moment. Because it all doesn't actually make sense. Netflix may have solved the mystery of what to watch next, but that generated its own smaller mysteries. Sometimes we call that a bug, and sometimes we call it a feature."


Gartner's Trends List for 2012

Now that we are midway through 2012, the Gartner Symposium gives us the IT research firm's tech trends for the year. As usual, it's a mix of emerging and existing technologies.

According to campustechnology.com, the top ten includes:

    The use of media tablets and other small-form-factor computing devices;

    The continuing explosion of mobile-centric applications and interfaces;

    The growth of app stores and marketplaces;

    Contextual and social user experience;

    The "internet of things";

    Next-generation analytics;

    The proliferation of big data;

    The smarter use of in-memory computing;

    The recognition of the value of extreme low-energy servers; and

    The continued acceptance of cloud computing.


Technologies on the Horizon That Will Impact Higher Education

Every year I read the NMC Horizon Report to see what they predict will be the technologies that will have an impact in higher education. The 2012 report was released jointly by NMC and the EDUCAUSE Learning Initiative.

The report looks at technologies that will have an impact in the next five years (near-term, mid-term, and longer-term). It also examines "critical challenges" facing education.

The near-term technologies are mobile apps and tablet computing, which are changing the nature of computing for end users and developers. The larger suites of integrated software are being replaced by free and cheap apps that focus on doing one or a few things well and integrate with other apps easily. And, though I agree that mobile computing and tablets (iPads, etc.) are influencing teaching and learning, I don't see any clear impact yet.

Two technologies that are more mid-term (2 or 3 years from having a major impact) are game-based learning and learning analytics.

"Learning analytics" may have more of an impact on the administration and decision-making levels than directly in the classroom. The term usually refers to both traditional strategies used in student retention, and newer methods of aggregating data from many sources to get a picture of how learning is happening and what is working best. If you have read previous reports, you know that long-term items, like learning analytics, often move closer in time if they grab traction in schools. Learning analytics, for example, seems to have benefited from some funded initiatives in the past few years.

Game-based learning has been on the list for a few years, but I don't feel like it has gotten any closer to making an impact. At one time, virtual worlds were a somewhat related technology, and they have almost dropped off the educational planet in the past two years. Both technologies offer the possibility of using collaboration, problem solving, communication, critical thinking, and digital literacy. But the results have not been all that impressive. Online social games have certainly been big the past few years, but their application or any transference to learning is still lacking.

If you're writing your proposals for grants and conferences, you might want to get a jump on those technologies that are still four or five years out. Two to look into are gesture-based computing and the "Internet of Things."

Gesture-based computing fits right into gaming and mobile devices. Think of Wii games and swiping that smartphone or tablet. The idea driving its use in education is that it can transcend linguistic and cultural limitations. Watch a two-year-old play with an iPad and you realize that not relying on language or any specific language might be a major plus. These devices also encourage interaction and just plain old play as a way to explore and learn. That is certainly true with younger students, but not lost on older and adult learners. Android and Apple smartphones and tablets, the Microsoft Surface, ActivPanel, Nintendo Wii and Microsoft Kinect systems are all playing with these ideas.



Internet of Things

The "Internet of Things" is further out there in years and in my ability to explain exactly what it means, or might one day mean, to education. It is about the evolution of smart objects: items that are interconnected in ways that make the line between the physical object and digital information very blurry or invisible.

You should look into IPv6 and how it is used in small devices with unique identifiers. You probably know a bit about RFID devices that are used in stores to track products, purchases and inventory. They store data and they can send that information to external devices via the Internet. We can already use them in schools to do similar things like tracking attendance, research subjects, and equipment. But how they might be used for learning is about as blurry as the line they are erasing.
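The attendance use is simple enough to sketch. The Python below is a hedged illustration only: the tag IDs, student names, timestamps, and the take_attendance function are all made up for the example, and a real RFID reader would deliver its reads through its own vendor software. The sketch assumes we already have a list of (tag, time) reads from a classroom door and turns it into a present/late/absent roster.

```python
from datetime import datetime

# Hypothetical mapping from RFID tag IDs to students (invented for this sketch).
TAG_TO_STUDENT = {"04A1": "Ana", "04B2": "Ben", "04C3": "Chen"}

# Hypothetical reads from a door reader: (tag ID, ISO timestamp).
reads = [
    ("04A1", "2012-09-10T08:55:00"),
    ("04B2", "2012-09-10T09:12:00"),
    ("04A1", "2012-09-10T08:56:30"),  # the same tag read a second time
    ("FFFF", "2012-09-10T08:58:00"),  # a tag that is not on the roster
]


def take_attendance(reads, class_start, roster=TAG_TO_STUDENT):
    """Turn raw tag reads into a present/late/absent status per student."""
    start = datetime.fromisoformat(class_start)
    first_seen = {}
    for tag, stamp in reads:
        student = roster.get(tag)
        if student is None:
            continue  # ignore tags we do not recognize
        when = datetime.fromisoformat(stamp)
        # Keep only the earliest read for each student.
        if student not in first_seen or when < first_seen[student]:
            first_seen[student] = when

    status = {}
    for student in roster.values():
        if student not in first_seen:
            status[student] = "absent"
        elif first_seen[student] <= start:
            status[student] = "present"
        else:
            status[student] = "late"
    return status


attendance = take_attendance(reads, "2012-09-10T09:00:00")
print(attendance)
```

Nothing here goes beyond the supermarket-style tracking described above, which is the point: the bookkeeping is easy, and the learning use is the part that is still unclear.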

Which brings us to challenges. In brief, these are the five technology-oriented challenges facing higher education according to the report.

1) Economic pressures from new education models, forcing traditional institutions to control costs while maintaining services;

2) The need for new forms of scholarly corroboration as traditional peer review and approval become more and more difficult to apply in light of new methods of dissemination;

3) The growing importance of digital literacy and lack of digital literacy preparation among faculty;

4) Traditional institutional barriers to the adoption of new technologies; and

5) Technological upheavals that are putting libraries "under tremendous pressure to evolve new ways of supporting and curating scholarship."

In my educational world, economics is very important, but the barriers of 3 and 4 are much tougher to overcome.


The Internet of Things




It has been five years since Tim O'Reilly pitched the idea of Web 2.0. The term caught on in a big way. In fact, almost everything seems to be labeled 2.0 these days.



Recently, O'Reilly and John Battelle (they run the Web 2.0 conference - now a Web 2.0 Summit - together along with TechWeb) released a white paper called "Web Squared: Web 2.0 Five Years On."



One thing the paper examines is how the social web might intersect with the Internet of Things. Not familiar with that? Don't feel bad. It hasn't caught popular fire yet, so there's still time for you to read up and be able to chat about it at the first faculty meeting.



The Internet of Things is concerned with real-world objects that are connected to the Internet. The concept is associated with the Auto-ID Labs. The real-world objects that are connected to the Net might be household appliances, cars, books or any electronic, "smart," or RFID-enabled object.



If this sounds like something from The Jetsons, you're on the right track. Here's what was said recently on ReadWriteWeb about one smart appliance.



The Internet fridge is probably the most oft-quoted example of what the Internet of Things - when everyday objects are connected to the Internet - will enable. Imagine a refrigerator (so the story goes) that monitors the food inside it and notifies you when you're low on, for example, milk. It also perhaps monitors all of the best food websites, gathering recipes for your dinners and adding the ingredients automatically to your shopping list. This fridge knows what kinds of foods you like to eat, based on the ratings you have given to your dinners. Indeed the fridge helps you take care of your health, because it knows which foods are good for you and which clash with medical conditions you have. And that's just part of the sci-fi story of the Internet fridge.


Okay, so my home gets smarter and more connected. What about schools?



The O'Reilly paper defines "web squared" as "web meets world." (Also the idea that the web is growing exponentially.) Something that still holds over from that Web 2.0 concept from 2004 is the belief that this new web would be harnessing collective intelligence. In 2009, that now includes mobile and internet-connected objects.



Smartphones with a microphone, camera, motion sensor, proximity sensor, and location sensor (GPS) are powerful "things" that can be used in the classroom but can also be taken home and into the field.



Sure, RFID tags can keep track of books in the bookstore or storeroom, but, hopefully, educators and students will come up with more than a supermarket approach to the Internet of Things.



Aren't classrooms supposed to be about collective intelligence? Isn’t intelligence, at least partially, the characteristic that allows an organism to learn from and respond to its environment?



I doubt that I will have the money to make it out to the Summit in San Francisco this October (it's by invitation), but I hope there are some educators attending (and presenting?). Schools can't afford to be left on the beach just watching another technology wave crest.



By the way, the Internet of Things is not the Web of Things. It's a very tangled web we are weaving...



Download the Web Squared White Paper (PDF, 1.3MB)



Watch the Web Squared Webcast