23 January, 2017
Really Big Data At Walmart: Real-Time Insights From Their 40+ Petabyte Data Cloud
Walmart – the world’s biggest retailer, with over 20,000 stores in 28 countries – is in the process of building the world’s biggest private cloud, designed to process 2.5 petabytes of data every hour.
To make sense of all this information and put it to work solving problems, the company has created what it calls its Data Café – a state-of-the-art analytics hub located within its Bentonville, Arkansas, headquarters.
Here, over 200 streams of internal and external data, including 40 petabytes of recent transactional data, can be modelled, manipulated and visualized. Teams from any part of the business are invited to bring their problems to the analytics experts and then see a solution appear before their eyes on the nerve centre’s touch screen “smart boards”.
This facility has cut the time it takes to solve complex business questions – those reliant on multiple internal and external variables – from weeks to minutes.
Senior Statistical Analyst Naveen Peddamail – who won his job with the company through a contest on the crowd-sourced data science competition website Kaggle – spoke to me about the project.
He said: “If you can’t get insights until you’ve analyzed your sales for a week or a month, then you’ve lost sales within that time.
“If you can cut that time from two or three weeks to 20 or 30 minutes, that saves Walmart a lot of money and stops us losing sales. That’s the real value of what we have built with the Data Café.”
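Peddamail’s point is essentially about latency: a weekly batch report surfaces a sales problem only after the revenue is gone, while a streaming check can raise a flag within minutes. As a minimal sketch of that idea – assuming a hypothetical per-minute feed of store sales totals, with a window size and drop threshold chosen purely for illustration, and not reflecting Walmart’s actual Data Café pipeline – a trailing-average alert might look like this:

```python
import random
from collections import defaultdict, deque

# Illustrative sketch only: a hypothetical streaming sales check,
# not Walmart's actual system.
WINDOW = 30           # trailing window of per-minute sales totals to keep
DROP_THRESHOLD = 0.5  # flag if this minute falls below 50% of the average

# One bounded window per store; old minutes fall off automatically.
history = defaultdict(lambda: deque(maxlen=WINDOW))

def ingest_minute(store_id, minute_total):
    """Compare this minute's sales against the store's trailing average
    and return an alert message if sales have dropped sharply."""
    window = history[store_id]
    alert = None
    if len(window) == WINDOW:  # only alert once a full baseline exists
        baseline = sum(window) / len(window)
        if baseline > 0 and minute_total < DROP_THRESHOLD * baseline:
            alert = (f"store {store_id}: sales {minute_total:.0f} vs "
                     f"trailing avg {baseline:.0f} -- investigate now")
    window.append(minute_total)
    return alert

# Simulated feed: steady sales, then a sudden drop (e.g. a checkout outage).
for minute in range(60):
    total = random.gauss(1000, 50)
    if minute >= 45:
        total *= 0.3
    if (msg := ingest_minute("store_4108", total)) is not None:
        print(f"minute {minute}: {msg}")
```

The design choice worth noting is the fixed-length window: memory stays bounded per store no matter how long the stream runs, which is what makes this style of check plausible across thousands of locations.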
When decision-makers are supplied with large volumes of verified, quantifiable data at high speed, problems caused by human error or miscalculation at the planning or execution stage of a business activity will often simply melt away.