Your Data Is Big, But Is It Thick?

Big data is a big topic in business and is moving into education more and more. At the New Jersey Institute of Technology, where I work, there is a certificate program in this area.

I knew this field a decade ago as "data mining," and recently I have been seeing the term "thick data" used. As far as I can tell (the term isn't even in Wikipedia yet), it is borrowed from other fields, including anthropology. A "thick" description of a human behavior is one that explains not just the behavior but its context as well, so that the behavior becomes meaningful to an outsider. Thick data, then, is big data given context.

Big Data embraces technology, decision-making and public policy. Supplying the technology is a fast-growing market, increasing at more than 30% a year and likely to reach $24 billion by 2016, according to a forecast by IDC, a research firm.

The NJIT certificate focuses on managing and mining Big Data analytics to understand business customers, develop new products and cut operational costs. Most of the jobs emerging in Big Data require knowledge of programming and the ability to develop applications, as well as an understanding of how to meet business needs. I can see people currently working in computing as candidates for this program.

What about in education? The skills most often mentioned in connection with Big Data jobs include math, statistics, data analysis, business analytics and natural language processing. Those are not skills I associate with most educators. Who will put the Big Data into that Thick Data context for education?

Google Removes Ads from Apps for Education

More than 30 million students, teachers and administrators globally rely on Google Apps for Education. About 40 percent of nonprofit colleges use Google for institutional email. Under pressure from privacy advocates, Google announced that it had permanently removed all ads from its Apps for Education. That includes the Gmail service. That means that the company can no longer harvest students’ information for advertising purposes.

Google had previously given college administrators the option of allowing the company to scan student Gmail accounts for key words and to deliver targeted advertisements to those students. Apparently, few administrators opted to allow the ads, so many users won’t actually see a change.

Google said that Gmail collects data on all incoming and outgoing messages for several reasons. The practice allows the company to identify certain messages as spam, and makes it possible for users to unearth old emails with key-word searches. Scanning for potential advertising key words was part of that larger process, but the company has isolated and eliminated that part of the scanning process for Google Apps clients.

In California, two college students joined a recent attempt to bring a class-action lawsuit against Google for violating state and federal privacy laws in its data-collection techniques, according to Education Week. There has been discussion about whether Google’s data collection might violate the Family Educational Rights and Privacy Act (FERPA).

There Is Open and Then There Is Closed

Going back all the way to the early days of MOOCs (less than a decade, of course), the Open part of Massive Open Online Courses was a very important part of the equation. OPEN meant a number of things, including:

Access – open to all, regardless of age, location or previous experience and education

Free – without cost

Open Tools – using free and open tools like Moodle, blogs, etc.

Reuse – the right to reuse the content in its unaltered / verbatim form

Revise – the right to adapt, adjust, modify, or alter the content itself

Remix – the right to combine the original or revised content with other content to create something new

Redistribute – the right to make and share copies of the original content, your revisions, or your remixes with others

That kind of openness is not true of many of the big MOOC providers. Another blow against the Open Everything Empire comes with the announcement that Udacity will no longer give learners the opportunity to earn free, “non-identity-verified” certificates. People will still be able to view Udacity’s online course materials without paying, but those who want a credential will have to pay. Udacity feels its courses are worth something and plans to charge students accordingly. Udacity had earlier pulled back from the belief that MOOCs are best suited for academic pursuits, deciding they are better applied to training and lifelong learning. That is what many universities consider to be "non-credit" courses.

How long before the courses are not even open to those who aren't willing to pay to learn?

The big MOOC providers already tend not to use open source platforms and most don't allow the courses to be remixed, reused or redistributed.

The openness is eroding.