Saturday, June 15, 2013

The Best Big Data Software

In the June 2013 issue of Software Development Times, the editors have once again selected the industry's leading firms. This is the 11th annual SD Times 100. You can download the entire issue here.

One of the categories covers software vendors in the hot Big Data and Business Intelligence market.

The magazine--one of my favorites--introduced these innovative companies with this quote:
"In this new category, the editors of SD Times recognize that with exponential data growth comes exponential problem growth. It also creates a storage problem, a retrieval problem, and a problem in understanding it all, so organizations 'doing' Big Data get actionable information that keeps them a step or two ahead of the competition. These vendors are the ones who've tackled the giant data problem with aplomb."

And the winners are:

10gen
10gen developed the open-source MongoDB NoSQL Big Data database. In a recent announcement, IBM and 10gen said they will jointly craft a new standard for enterprise databases specifically for the mobile market. See this article about the shake-up happening in the enterprise database market.

Apache Hadoop
The Apache Software Foundation is a community of developers working on open-source software. It is the poster child for successful open-source software.

If you are a software professional, you almost certainly use at least one of their projects: the Apache web server, the Ant build tool, ActiveMQ message queuing, the Derby relational database, the Flex rich web application framework, the Lucene search engine, the OpenOffice productivity suite, the Struts web application development framework, the Subversion source code management system, the Tomcat Java app server, and others.

Within the Big Data space, Apache has some of the leading technologies: the Cassandra distributed database, the HBase database for random read/write access to very large tables, the Mahout machine learning library, the Hadoop distributed computing platform, and the Pig analytics tool. If you are not familiar with Hadoop, it was initially based on papers that Google published in 2003 and 2004 describing how it handled massive amounts of data (the Google File System and MapReduce).

Cloudera
Some of the original Big Data software developers from the Apache Software Foundation and companies such as Yahoo and Google quickly formed their own firms to take advantage of this emerging market (smart guys!). Cloudera is one such vendor specializing in Apache Hadoop.

DataStax
DataStax was formed to specialize in the Apache Cassandra platform. You can read about them here.

FatCloud
FatCloud makes the FatDB NoSQL database for the Windows .NET platform.

Hortonworks
Hortonworks is another leading firm specializing in the Apache Hadoop software.

Objectivity
Objectivity is the maker of a graph database and Big Data database tools.

Pentaho
Pentaho offers an open-source BI software platform and is actively pursuing the Big Data analytics space. As part of that offering, the company recently announced its Instaview data visualization product.

Splunk
Splunk is targeting the Big Data niche of machine-generated data. Websites, servers, mobile phones, and other devices are constantly spitting out huge amounts of data. Splunk wants to help organizations unlock the hidden value of this potentially actionable information.



As you read this list, the term "open-source" should have repeatedly jumped out at you. Either the editors of SD Times are in love with the OSS concept or there is a real revolution going on within the software industry (the real answer may be both, I suppose).

It is also interesting to see the Big Data power cluster of firms founded by individuals originally associated with Apache projects: Cloudera, Hortonworks, and DataStax.

Along with mobile application development, today's hot space for software professionals is Big Data. 

1 comment:

DataH said...

Doug, one other open source technology to mention is HPCC Systems from LexisNexis, a data-intensive supercomputing platform for processing and solving big data analytical problems. Their open source Machine Learning Library and Matrix processing algorithms assist data scientists and developers with business intelligence and predictive analytics. Its integration with Hadoop, R, and Pentaho extends its capabilities, providing a complete solution for data ingestion, processing, and delivery. In fact, a webhdfs implementation (the web-based API provided by Hadoop) was recently released.

About Me

I am a project-based software consultant, specializing in automating transitions from legacy reporting applications into modern BI/Analytics to leverage Social, Cloud, Mobile, Big Data, Visualizations, and Predictive Analytics using Information Builders' WebFOCUS. Based on scores of successful engagements, I have assembled proven Best Practice methodologies, software tools, and templates.

I have been blessed to work with innovators from firms such as: Ford, FedEx, Procter & Gamble, Nationwide, The Wendy's Company, The Kroger Co., JPMorgan Chase, MasterCard, Bank of America Merrill Lynch, Siemens, American Express, and others.

I was educated at Valparaiso University and the University of Cincinnati, where I graduated summa cum laude. In 1990, I joined Information Builders and for over a dozen years served in regional pre- and post-sales technical leadership roles. Also, for several years I led the US technical services teams within Cincom Systems' ERP software product group and the Midwest custom software services arm of Xerox.

Since 2007, I have provided enterprise BI services such as: strategic advice; architecture, design, and software application development of intelligence systems (interactive dashboards and mobile); data warehousing; and automated modernization of legacy reporting. My experience with BI products includes WebFOCUS (vendor certified expert), R, SAP Business Objects (WebI, Crystal Reports), Tableau, and others.