Finding the needle in the haystack is the goal, so making sure the data you analyze has been properly modeled and filtered is of the utmost importance.
The idea that the combination of predictive algorithms and Big Data will change the world is a tempting one, and it may end up being true. But for now, the industry is facing a reality check when it comes to Big Data analytics.
Your Big Data success depends less on which algorithms you use and more on how well you clean, integrate and transform your data.
More important than algorithms
The dirty little secret of Big Data analytics is all the work that goes into prepping and cleaning the data before it can be analyzed. You may have the sharpest data scientists on the planet writing the most advanced algorithms the universe has ever seen, but it won't amount to a hill of beans if your data set is dirty, incomplete or flawed. That's why up to 80 percent of the time and effort in Big Data analytics projects is spent on cleaning, integrating and transforming the data.
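To make the cleaning work concrete, here is a minimal sketch of the kind of prep step the article describes: dropping incomplete records, normalizing values and removing duplicates. The records and field names are illustrative assumptions, not a real dataset.

```python
# Hypothetical "dirty" records: a duplicate, an incomplete row,
# and inconsistent formatting. Field names are assumptions.
raw = [
    {"id": 1, "name": " Alice ", "amount": "100.5"},
    {"id": 1, "name": " Alice ", "amount": "100.5"},   # exact duplicate
    {"id": 2, "name": "Bob", "amount": None},          # incomplete record
    {"id": 3, "name": "carol", "amount": "42"},        # inconsistent casing
]

def clean(records):
    """Drop incomplete rows, normalize fields, and de-duplicate."""
    seen, out = set(), []
    for r in records:
        if r["amount"] is None:          # drop incomplete rows
            continue
        key = (r["id"], r["name"].strip().title(), float(r["amount"]))
        if key in seen:                  # drop exact duplicates
            continue
        seen.add(key)
        out.append({"id": key[0], "name": key[1], "amount": key[2]})
    return out

cleaned = clean(raw)
```

In a real project this step would also handle schema drift, encoding issues and cross-source integration, which is where most of that 80 percent of effort goes.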
Validate the data
Instead of focusing on algorithms, teams should focus on validating the data; everybody basically has the same algorithms. I would suggest that Big Data teams get a good grasp of the role ontology plays in their Big Data initiative, especially when dealing with unstructured data.
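A basic form of the validation step above can be sketched as a rule-based schema check. The schema and field names here are illustrative assumptions, not part of any standard.

```python
# Hypothetical schema: each field mapped to its expected type.
SCHEMA = {"id": int, "name": str, "amount": float}

def validate(record, schema=SCHEMA):
    """Return a list of validation errors; empty means the record passes."""
    errors = []
    for field, ftype in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], ftype):
            errors.append(f"{field}: expected {ftype.__name__}")
    return errors

ok_errors = validate({"id": 1, "name": "Alice", "amount": 100.5})
bad_errors = validate({"id": "1", "name": "Alice"})
```

Ontology-driven validation goes further, checking not just types but the relationships and meaning of terms across sources, which matters most with unstructured data.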
In the case of financial markets, the new FIBO (Financial Industry Business Ontology) standard should be well understood, and Dodd-Frank brings a whole slew of new challenges of its own.