In most organisations, data are distributed across multiple operational and analytical systems, held in both big data and traditional data stores, including Apache Hadoop, relational databases and NoSQL stores such as MongoDB. As social media, cloud applications and syndicated data services expand the volume, variety and velocity of data, many organisations are realisin