Tuesday, February 19, 2013

Adventures in MOOC Part 2

I am now in the homestretch of a Coursera MOOC that lasts eight weeks. At the five-week milestone of a Data Analysis course using the R statistical programming language, I have survived four weekly quizzes and the first written assignment. See my original posting for the back story.

Seriously, I almost didn't take the very first quiz. It was that intimidating. But after passing three of them, I started to feel pretty good. Perhaps I was even ready for a sexy Data Scientist swagger.

But then a major setback shook my confidence: I saw the Data Analysis written assignment. You had to download data, sift through it for associations, generate graphs, and write a paper. Some of the other students were complaining in the discussion forum that completing it took them over twenty hours.

I fumbled with the data analysis for quite a while. I considered admitting defeat and just watching the lectures from then on, marking this Johns Hopkins class as an "audit" instead of earning a graded certificate.

I went to bed on Friday night thinking, "This is crazy! I don't have the time for this stuff. This course isn't that important."

Luckily for my scholastic endeavors, by 6:30am the next morning my attitude had improved and I knocked out the assignment within four hours. What had stopped me the previous night was dirty data. The higher-ed bums had deliberately dirtied the data so that your first step had to be converting formats and cleaning things up.

In the morning, I had the presence of mind to look through the course discussion forums and read the comments of other disgruntled students who had already pointed out the dirty-data trick.

You see, in real life, you have to deal with dirty data. Whether you are working with your internal ERP structured data, files from a third-party external source, or Big Data from around the globe, some of it will be dirty and need up-front handling before jumping into the data analysis. Through the use of dirty data in the assignment, the teacher was making a real-life point.
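To make that up-front handling concrete, here is a minimal sketch in Python (the course itself used R). The column names, sample values, and missing-value markers are all hypothetical and are not the actual assignment data; the point is only that numbers arriving as text with percent signs, unit suffixes, and blanks have to be converted before any analysis can start.

```python
import csv
import io

# Hypothetical raw export (not the course data): percent signs,
# unit suffixes, stray whitespace, and missing-value markers.
RAW = """rate,term,income
8.90%, 36 months ,5000
n/a,60 months,
12.1%,36 months,3250.5
"""

# Assumed set of markers that mean "no value".
MISSING = {"", "n/a", "na", "null"}


def to_number(text, suffix=""):
    """Return the cell as a float, or None if it is a missing-value marker."""
    cleaned = text.strip()
    if cleaned.lower() in MISSING:
        return None
    if suffix and cleaned.endswith(suffix):
        cleaned = cleaned[: -len(suffix)]
    return float(cleaned.strip())


def clean_rows(raw):
    """Parse the CSV text and convert every column to a numeric type."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw)):
        rows.append(
            {
                "rate": to_number(row["rate"], "%"),
                "term_months": to_number(row["term"], "months"),
                "income": to_number(row["income"]),
            }
        )
    return rows


rows = clean_rows(RAW)
```

Keeping missing cells as None, rather than quietly turning them into zero, leaves the decision about how to treat them to the analysis step.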

Whoever thinks participating in an online virtual college course is easier than sitting in a physical classroom is very mistaken. This is serious stuff.

Students in online education deserve our respect.

However, one major issue with the assignment was its timing. The teacher--Jeff Leek of Simply Statistics fame--assigned it during Week 3, but I wouldn't understand how to do it until after Week 4's lectures and quiz. I fumbled through the assignment before I knew what I was doing.

Because there are thousands of students, the teacher cannot grade the assignments. Instead, he needs the students to grade each other according to a standard Yes/No rubric where each evaluation question is worth zero to five points. Each student has to evaluate the work of four peers or receive a 20% deduction in their assignment grade.

After I submitted my assignment and started on the peer reviews, I immediately got that "oh crap" feeling. Looking at the work of others, I started to gain a better insight into what the teacher might have wanted. Compared to theirs, my data analysis seemed pretty simple. At the time, I didn't understand how to use R to program a linear model with confounding, interacting variables. Confound it!
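For readers wondering what a linear model with an interacting variable looks like, here is a rough sketch in plain Python rather than R (in R, a formula of the form y ~ x * g passed to lm expresses roughly the same model). The data are synthetic, the variable names x, g, and y are made up for illustration, and the hand-rolled least-squares solver is a stand-in, not what the course taught.

```python
def design_row(x, g):
    # Columns: intercept, x, group indicator g, and the x*g interaction.
    return [1.0, x, g, x * g]


def solve(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - known) / m[r][r]
    return coef


def fit(xs, gs, ys):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    rows = [design_row(x, g) for x, g in zip(xs, gs)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic, noise-free data: the slope of y on x differs between the two groups.
xs = [0, 1, 2, 3, 0, 1, 2, 3]
gs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [2.0 + 1.5 * x + 0.5 * g + 1.0 * x * g for x, g in zip(xs, gs)]

coef = fit(xs, gs, ys)  # [intercept, x, g, x*g]
```

The last coefficient is the interaction: it measures how much the slope on x changes when g switches from 0 to 1. Leaving g out of the model entirely is how a confounder ends up distorting the estimated effect of x.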

Oh well, my data analysis report was formatted nicely and the R graphs looked pretty; I can only hope my peer graders will be gracious and overlook my simple statistics.

To get the positive karma flowing, I gave all of my peers perfect scores. 


About Me


I am a project-based software consultant, specializing in automating transitions from legacy reporting applications into modern BI/Analytics to leverage Social, Cloud, Mobile, Big Data, Visualizations, and Predictive Analytics using Information Builders' WebFOCUS. Based on scores of successful engagements, I have assembled proven Best Practice methodologies, software tools, and templates.

I have been blessed to work with innovators from firms such as: Ford, FedEx, Procter & Gamble, Nationwide, The Wendy's Company, The Kroger Co., JPMorgan Chase, MasterCard, Bank of America Merrill Lynch, Siemens, American Express, and others.

I was educated at Valparaiso University and the University of Cincinnati, where I graduated summa cum laude. In 1990, I joined Information Builders and for over a dozen years served in regional pre- and post-sales technical leadership roles. Also, for several years I led the US technical services teams within Cincom Systems' ERP software product group and the Midwest custom software services arm of Xerox.

Since 2007, I have provided enterprise BI services such as: strategic advice; architecture, design, and software application development of intelligence systems (interactive dashboards and mobile); data warehousing; and automated modernization of legacy reporting. My experience with BI products includes WebFOCUS (vendor certified expert), R, SAP Business Objects (WebI, Crystal Reports), Tableau, and others.