Saturday, February 23, 2013

Adventures in MOOC Part 3

When I was in first grade, I won an art contest among my young peers. On a white poster board, I had drawn a puppy playing with a ball and titled the work, "I Love And Feed My Dog."

I had added streak marks to indicate a rapid movement of the ball and stretched out the puppy's body to give the impression he was moving toward the viewer. 

In an award ceremony in front of the entire school, my teacher escorted me and my puppy picture onto the gymnasium floor to meet the principal for a ribbon. The six-foot-tall suit greeted us and, as he started to hand me the bright red ribbon, took a glance at my winning art.

Rumored to have only one working eye, he probably could not appreciate the intended perspective of the art. 

The principal immediately withdrew the ribbon and turned his attention to my teacher: "Are you sure this kid drew this? Did one of his parents do it for him?"

This glass-eyed Ruler of Children evaluated the work for five seconds and immediately called a six-year-old a liar and a cheat.

My teacher came to my defense and convinced him that I had actually drawn the picture. Had she withered under the imposing Mr. Authority, it is hard to say what would have happened. With a finger pointed at the gym exit, Principal Meany probably would have banished me from the school forever.

I may not remember this event exactly, but when he handed down the ribbon to the three-foot-tall version of me, he growled.

Which brings me to the sixth week of my online Coursera MOOC. You can read my earlier adventures here.

After the first Data Analysis written assignment, the professor posted a notice about a concern over plagiarism. More than likely, a handful of students are using the web to share their work.

Here's the challenge for the Coursera professor. He cannot police thousands of students and evaluate their work. He has to allow peers to review others' work and make a decision. But here is the rub: he also does not have time to evaluate whether the peers did a good job with grading. 

If a peer marks down your work claiming that he or she suspects you did not do this work on your own, you have no mechanism for appealing that grade. There are just too many students. Here are his comments:
A charge of plagiarism is a very serious accusation and should only be made on the basis of strong evidence. It is currently very difficult to prove or disprove a charge of plagiarism in the MOOC peer assessment setting. As a result, I am not expecting you to police your classmates’ work for plagiarism. You should evaluate the work of your classmates on the merits of what they have submitted. You should only mark them down if you are absolutely 100% confident that their submission constitutes an act of egregious plagiarism. I am not in a position to evaluate whether or not a submission actually constitutes plagiarism, and I will not be able to entertain appeals or to alter any grades that have been assigned through the peer evaluation system. 

So imagine a peer reviews your work and thinks, "You could not have done this alone! I'm giving you little cheat a piece of my mind and failing you on this assignment." Your evaluator points a bony finger to the door and sends you out into the street, dragging your poster board behind you and crushing your dreams for a red ribbon.

In a MOOC, your teacher is not there to defend you. The Coursera professor is too busy; there are too many students. You have no recourse on Coursera.  

Or maybe not. It would seem that if Coursera is to be taken seriously, they would need a fair grading process and formal mechanism for appealing decisions. After all, not all of the students are above average. The same people who are getting bad grades are also assigning grades.

Coursera announced there are now 2.7 million people enrolled in their courses. Would you trust your college grade to be crowd-sourced by a random sampling of millions of people around the world? 

Just as my one-eyed principal was not a fair art judge, not all of your peers in a MOOC will be fair graders.

Wednesday, February 20, 2013

Gartner Slaps BI Vendor Birst for its Giddiness

Here is a copy of the e-mail that BI software vendor Birst sent out today, apologizing for an earlier communication that was not properly approved by industry analyst Gartner:


Dear Doug,
On Wednesday, February 13, 2013, you received an email from Wynn White on behalf of Birst promoting our inclusion in Gartner's 2013 Magic Quadrant for Business Intelligence and Analytics Platforms. That communication was not authorized by Gartner, nor was it in compliance with Gartner's Copyright and Quote Policy. Birst apologizes for this error. The following has been approved by Gartner.
Birst has been positioned by Gartner, Inc. in the Challengers quadrant of the 2013 Magic Quadrant for Business Intelligence and Analytics Platforms. The report presents a global view of Gartner's opinion of software vendors that should be considered by organizations seeking to use business intelligence (BI) platforms. After evaluating 38 different BI and analytics software vendors on 15 functional areas, Gartner analysts placed Birst in the Challengers quadrant based primarily on completeness of vision and ability to execute.
What does this all mean? We're stoked to be named a Challenger.
Get your complimentary copy of the Gartner 2013 Business Intelligence and Analytics Magic Quadrant.
Regards,
Wynn White
Vice President, Marketing
Birst

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Birst. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.



Who can blame Birst for its giddiness on being in the BI and Analytics Magic Quadrant? Congratulations to them on their success.  

Tuesday, February 19, 2013

Adventures in MOOC Part 2

I am now in the homestretch of a Coursera MOOC that lasts eight weeks. At the five-week milestone of a Data Analysis course using the R statistical programming language, I have survived four weekly quizzes and the first written assignment. See my original posting for the back story.

Seriously, I almost didn't take the very first quiz. It was that intimidating. But after passing three of them, I started to feel pretty good. Perhaps I was even ready for a sexy Data Scientist swagger.

But then a major setback shook my confidence: I saw the Data Analysis written assignment. You had to download data, sift through it for associations, generate graphs, and write a paper. Some of the other students were complaining in the discussion forum that completing it took them over twenty hours.

I fumbled with the data analysis for quite a while. I considered admitting defeat and just watching the lectures in the future, marking this Johns Hopkins class as an "audit" instead of earning a graded certificate.

I went to bed on Friday night thinking, "This is crazy! I don't have the time for this stuff. This course isn't that important."

Luckily for my scholastic endeavors, by 6:30am the next morning my attitude had improved and I knocked out the assignment within four hours. What had stopped me the previous night was dirty data. The higher-ed bums had made the data bad so that your first step needed to be to convert formats and clean things up.

In the morning, I had the presence of mind to look through the course discussion forums and read the comments of other disgruntled students who had already commented on the dirty-data-trick.

You see, in real life, you have to deal with dirty data. Whether you are working with your internal ERP structured data, files from a third-party external source, or Big Data from around the globe, some of it will be dirty and need up-front handling before jumping into the data analysis. Through the use of dirty data in the assignment, the teacher was making a real-life point.
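To give a feel for that up-front handling, here is a minimal sketch of a cleaning step. It is written in Python rather than the R used in the course, and the sample values and the `clean_amount` helper are made up for illustration; the actual assignment data and its specific problems are not shown here.

```python
def clean_amount(raw):
    """Turn strings like ' $1,250.00 ' or '7.5%' into floats.

    Returns None for blanks and 'NA'-style placeholders.
    """
    text = raw.strip()
    if text == "" or text.upper() in {"NA", "N/A"}:
        return None
    # Drop formatting characters that keep the value from parsing as a number.
    for junk in ("$", ",", "%"):
        text = text.replace(junk, "")
    return float(text)


# Hypothetical dirty column: mixed formats, padding, and missing values.
dirty = [" $1,250.00 ", "7.5%", "NA", "", "300"]
cleaned = [clean_amount(value) for value in dirty]
print(cleaned)  # [1250.0, 7.5, None, None, 300.0]
```

Only after a pass like this does it make sense to start looking for associations or drawing graphs.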

Whoever thinks participating in an online virtual college course is easier than sitting in a physical classroom is very mistaken. This is serious stuff.

Students in online education deserve our respect.

However, one major issue with the assignment was its timing. The teacher--Jeff Leek of Simply Statistics fame--assigned it during Week 3, but I wouldn't understand how to do it until after Week 4's lectures and quiz. I fumbled through the assignment before I knew what I was doing.

Because there are thousands of students, the teacher cannot grade the assignments. Instead, he needs the students to grade each other according to a standard Yes/No rubric where each evaluation question is worth zero to five points. Each student has to evaluate the work of four peers or receive a 20% deduction in their assignment grade.

After I submitted my assignment and started on the peer reviews, I immediately got that "oh crap" feeling. Looking at the work of others, I started to gain a better insight into what the teacher might have wanted. Compared to theirs, my data analysis seemed pretty simple. At the time, I didn't understand how to use R to program a linear model with confounding, interacting variables. Confound it!

Oh well, my data analysis report was formatted nicely and the R graphs looked pretty; I can only hope my peer graders will be gracious and overlook my simple statistics.

To get the positive karma flowing, I gave all of my peers perfect scores. 

Monday, February 11, 2013

2012 Top 1% of LinkedIn Profile Views

LinkedIn just notified me that I was in the top 1% of all LinkedIn profile views in 2012. With over 200 million users, that means I am just in the top two million.

If you and I are not yet connected on LinkedIn, be sure to send me an invitation.


Wednesday, February 6, 2013

Adventures in MOOC

I am now in my third week of a MOOC adventure. In case you are not familiar with the term, MOOC stands for massive open online course.

In particular, I am enrolled in an eight-week Data Analysis class from the Johns Hopkins Bloomberg School of Public Health, offered through the Coursera platform.

Professor Jeff Leek describes the R statistical programming course as:

"This course is an applied statistics course focusing on data analysis. The course will begin with an overview of how to organize, perform, and write-up data analyses. Then we will cover some of the most popular and widely used statistical methods like linear regression, principal components analysis, cross-validation, and p-values. Instead of focusing on mathematical details, the lectures will be designed to help you apply these techniques to real data using the R statistical programming language, interpret the results, and diagnose potential problems in your analysis. You will also have the opportunity to critique and assist your fellow classmates with their data analyses. Here is a post where I describe how data analysis fits in with other quantitative subjects: http://simplystatistics.org/2013/01/10/the-landscape-of-data-analysis/"

Being "massively open" means that there can be lots of students. I cannot find a good indicator of how many students there really are, but one discussion board forum has almost 3,000 page views. One student also put together a Google Map of where around the world people are located. While probably not everybody participated in the map, I guess there are at least hundreds if not thousands of students in class with me.

From NY Times, Published: January 26, 2013
In a recent article, Thomas Friedman wrote about a Coursera "revolution" after learning there were almost two and a half million people taking Coursera courses. That is up from just 300,000 students less than one year ago. 

With thousands of people in each course, a teacher is not going to have time to get to know the students, grade their quizzes, answer their questions, or perform other typical teacher activities. Instead, the MOOC solution is to have the students do things themselves. 

To enable students, Coursera courses consist of many self-service components:
  • Online videos
  • Downloadable lecture notes
  • Self-scoring online quizzes
  • Lots of auxiliary reference material
  • Course Wiki 
  • Course discussion forums 
  • Course Meetups 
  • Assignments which are peer graded 

My first surprise was the weekly quiz. The quizzes are along the lines of "Using the knife and bottle of whiskey under your seat, perform a self-appendectomy..."

Back in my college days, you had to have all of the quiz answers safely stored within your head. That is no longer true in today's world of MOOCs. 

Instead, the quiz question might ask you to download a comma-separated file from a website and load it into a "data frame" using the R programming language. From there, you need to follow a list of instructions such as create a subset of data, perform data "munging"--I have been doing IT for over thirty years and that was a new term to me--and then answer a final question like: "On rows 124, 246, and 368, what are the values of ABCVAR?" 
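For readers who have not seen this kind of task, here is a rough sketch of the steps. The course uses R; this version is in Python with only the standard library, and the tiny inline data set, the `GROUP` column, and the helper names are all invented for illustration (only `ABCVAR` comes from my example question above).

```python
import csv
import io

# Hypothetical stand-in for the comma-separated file you would download.
SAMPLE_CSV = """ID,GROUP,ABCVAR
1,a,10
2,b,20
3,a,30
4,b,40
5,a,50
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def subset(rows, column, value):
    """Keep only the rows whose `column` equals `value`."""
    return [row for row in rows if row[column] == value]


def values_at(rows, column, row_numbers):
    """Return `column` for the given 1-based row numbers."""
    return [rows[n - 1][column] for n in row_numbers]


rows = load_rows(SAMPLE_CSV)
group_a = subset(rows, "GROUP", "a")
print(values_at(group_a, "ABCVAR", [1, 3]))  # ['10', '50']
```

The real quiz files are far larger, which is why you end up reading off values at rows like 124, 246, and 368 instead of eyeballing the answer.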

Once I realized the quiz expects you to search for answers, I was fine. 

Like work at my Christian college, there is an honor code. At Valparaiso University, I always wrote at the bottom of each test: "I have neither given nor received, nor will I tolerate, the use of unauthorized aid." Coursera has a check-box version of this oath. 

If MOOC students get past the initial fear--I closed the first quiz for an entire day before venturing back for a second look--I think most will pass with flying colors. The reason is that you can take each quiz up to four times, with your recorded grade being the very last attempt.

Now, if the Coursera system gave you another set of random questions each time, it would be tough.

However, it graciously just gives you the same questions again and even shows you which ones you got right and wrong. They might put the multiple-choice answers in a different sequence and rewrite the question a little, but the quiz basically stays the same. 

So if you were given a question with four multiple-choice answers along with four attempts to solve it, I do believe most people would be able to pass the test (if not get 100%). These tests are half of the total score, so pretty much a given. 
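The arithmetic behind that claim can be checked with a small simulation. This is a Python sketch under my own simplifying assumptions: the question stays the same between attempts, you are told when an answer is wrong, and you never repeat a wrong answer. The function name and the seed are arbitrary.

```python
import random


def attempts_needed(num_choices, rng):
    """Guess without repeating a wrong answer; return the attempt that succeeds."""
    correct = rng.randrange(num_choices)
    untried = list(range(num_choices))
    rng.shuffle(untried)
    for attempt, guess in enumerate(untried, start=1):
        if guess == correct:
            return attempt
    raise AssertionError("the correct answer is always among the choices")


rng = random.Random(2013)
results = [attempts_needed(4, rng) for _ in range(10_000)]
print(max(results))                 # never more than 4
print(sum(results) / len(results))  # average attempts, close to 2.5
```

With four choices and four attempts, pure elimination always lands on the right answer by the last try, even for a student who knows nothing about the material.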

The two assignments that make up the other half of the score might be a different issue. Those are to be graded by your peers according to a published rubric. I apologize to my peers, but I am concerned about their ability to grade.

But that will be another blog posting. 

Sunday, February 3, 2013

Goodreads: The Millionaire Fastlane by DeMarco


The Millionaire Fastlane: Crack the Code to Wealth and Live Rich for a Lifetime! by M.J. DeMarco

My rating: 3 of 5 stars





Throughout his book, DeMarco yelled at me for being an idiot and not a millionaire like him. By the end, I felt like a disgusting loser for having a well-paying job instead of being self-made, living in a fancy mansion, and driving an expensive sports car.

If you too want to be ashamed of your lack of millions of dollars, by all means read DeMarco's book.

For much of the book, I kept thinking, "OK, DeMarco, stop rambling and just tell me your secret." After he disclosed his secret to millions, he then rambled on for way too long.

The last few chapters actually seem to be completely unneeded, only making (some) sense when DeMarco finally disclosed he had organized the book according to a big acrostic of FASTLANE SUPERCHARGED. Cute, but totally unnecessary.

One of his millionaire secrets is to never-ever-ever trade the hours in your days for money. If somebody is paying you an annual salary or by the hour, you are a fool, a big loser. To be a winner, you must be an independent producer of things that sell themselves while you snorkel in the Caribbean.

DeMarco's often crude style of writing implies he is a young kid who made it rich quickly and is now using book sales to keep his money rolling in.

Still, I liked DeMarco's book. He shares his Fastlane secrets to quick riches, makes some good points, and provides the reader with a fairly enjoyable book.



About Me


I am a project-based consultant, helping data-intensive firms use agile methods and automation tools to replace legacy reporting and bring in modern BI/Analytics to leverage Social, Cloud, Mobile, Big Data, Visualizations, and Predictive Analytics. For several world-class vendors, I led services teams specializing in providing software implementation and custom application development. Based on scores of successful engagements, I have assembled proven methodologies and automated software tools.

During twenty years of technical consulting, I have been blessed to work with smart people from some of the world's most respected organizations, including: FedEx, Procter & Gamble, Nationwide, The Wendy's Company, The Kroger Co., JPMorgan Chase, MasterCard, Bank of America Merrill Lynch, Siemens, American Express, and others.

I was educated at Valparaiso University and the University of Cincinnati, graduating summa cum laude. In 1990, I joined Information Builders, the vendor of WebFOCUS BI and iWay enterprise integration products, and for over a dozen years served in branch leadership roles. For several years, I also led technical teams within Cincom Systems' ERP software product group and the custom software services arm of Xerox.

Since 2007, I have provided enterprise BI services such as: strategic advice; architecture, design, and software application development of intelligence systems (interactive dashboards and mobile); data warehousing; and automated modernization of legacy reporting.