Seriously, I almost didn't take the very first quiz. It was that intimidating. But after passing three of them, I started to feel pretty good. Perhaps I was even ready for a sexy Data Scientist swagger.
But then a major setback shook my confidence: I saw the Data Analysis written assignment. You had to download data, sift through it for associations, generate graphs, and write a paper. Some of the other students were complaining in the discussion forum that completing it took them over twenty hours.
I fumbled with the data analysis for quite a while. I considered admitting defeat and just watching the lectures from then on, marking this Johns Hopkins class as an "audit" instead of earning a graded certificate.
I went to bed on Friday night thinking, "This is crazy! I don't have the time for this stuff. This course isn't that important."
Luckily for my scholastic endeavors, by 6:30am the next morning my attitude had improved and I knocked out the assignment within four hours. What had stopped me the previous night was dirty data. The higher-ed bums had deliberately dirtied the data so that your first step had to be converting formats and cleaning things up.
In the morning, I had the presence of mind to look through the course discussion forums and read the posts of other disgruntled students who had already run into the dirty-data trick.
You see, in real life, you have to deal with dirty data. Whether you are working with your internal ERP structured data, files from a third-party external source, or Big Data from around the globe, some of it will be dirty and will need up-front handling before you jump into the data analysis. By using dirty data in the assignment, the teacher was making a real-life point.
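To make that up-front handling concrete, here is a minimal sketch in Python of the kind of cleanup I mean. Everything in it is made up for illustration: the columns, the values, and the helper names (`to_number`, `clean_rows`) are hypothetical and are not the course's actual data or code. It just shows the usual chores: stripping stray whitespace, turning text like "8.90%" into a number, and flagging missing values.

```python
import csv
import io

# Hypothetical raw export. None of these columns or values come from the course data.
RAW = """id,interest_rate,amount,state
1,8.90%,"20,000",ca
2, 12.12% ,"5,500",NY
3,n/a,"12,000",tx
4,7.5%,,WA
"""

# Spellings that this sketch treats as "no value recorded" (an assumption).
MISSING = {"", "n/a", "na", "null"}


def to_number(text):
    """Strip whitespace, a trailing %, and thousands separators; return a float or None."""
    cleaned = text.strip().lower()
    if cleaned in MISSING:
        return None
    return float(cleaned.rstrip("%").replace(",", ""))


def clean_rows(raw_text):
    """Parse the CSV text and convert each field to a consistent type."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        rows.append({
            "id": int(record["id"]),
            "interest_rate": to_number(record["interest_rate"]),
            "amount": to_number(record["amount"]),
            "state": record["state"].strip().upper(),
        })
    return rows


rows = clean_rows(RAW)

# Keep only the rows with no missing values for the analysis itself.
complete = [r for r in rows if None not in r.values()]
```

Dropping incomplete rows is only one option. Depending on the question you are asking, you might keep them and handle the gaps later instead.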
Anyone who thinks an online college course is easier than sitting in a physical classroom is badly mistaken. This is serious stuff.
Students in online education deserve our respect.
However, one major issue with the assignment was its timing. The teacher--Jeff Leek of Simply Statistics fame--assigned it during Week 3, but I wouldn't understand how to do it until after Week 4's lectures and quiz. I fumbled through the assignment before I knew what I was doing.
Because there are thousands of students, the teacher cannot grade the assignments. Instead, he needs the students to grade each other according to a standard Yes/No rubric where each evaluation question is worth zero to five points. Each student has to evaluate the work of four peers or receive a 20% deduction in their assignment grade.
After I submitted my assignment and started on the peer reviews, I immediately got that "oh crap" feeling. Looking at the work of others, I started to gain a better insight into what the teacher might have wanted. Compared to theirs, my data analysis seemed pretty simple. At the time, I didn't understand how to use R to program a linear model with confounding, interacting variables. Confound it!
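For the curious, here is a rough sketch of what a linear model with a confounder and an interaction term looks like. It is not the assignment's solution. The numbers are invented, the helper names (`solve`, `ols`) are my own, and I use plain Python rather than R so it runs without any packages. In R the same model would typically be a one-liner along the lines of `lm(y ~ x * z)`. The idea: the group `z` is related to both `x` and `y`, so a naive fit of `y` on `x` alone gets the slope wrong, while a model with `x`, `z`, and their product `x*z` recovers the pattern the data were built from.

```python
def solve(a, b):
    """Solve the square system a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented data: the z = 1 group tends to have larger x, so z confounds x.
x = [1, 2, 3, 4, 3, 4, 5, 6]
z = [0, 0, 0, 0, 1, 1, 1, 1]
# Built with no noise as y = 1 + 2*x + 3*z + 0.5*x*z, so the true answer is known.
y = [1 + 2 * xi + 3 * zi + 0.5 * xi * zi for xi, zi in zip(x, z)]

# Naive model: intercept and x only, ignoring z.
naive = ols([[1, xi] for xi in x], y)

# Full model: intercept, x, z, and the interaction x*z.
full = ols([[1, xi, zi, xi * zi] for xi, zi in zip(x, z)], y)

print("naive slope for x:", round(naive[1], 3))
print("full model coefficients:", [round(c, 3) for c in full])
```

The naive slope comes out noticeably steeper than the slope of 2 used to build the z = 0 group, because it soaks up the effect of `z`. The full model separates the two and also lets the slope differ between groups, which is what the interaction term is for.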
Oh well, my data analysis report was formatted nicely and the R graphs looked pretty; I can only hope my peer graders will be gracious and overlook my simple statistics.
To get the positive karma flowing, I gave all of my peers perfect scores.