Power BI to Query 1 Trillion Row Data Sets – What’s Next?
It is an exciting time to be in the field of analytics. Data has never been easier to access, manipulate, merge, analyze or share. Making this possible is the highly competitive market for self-serve analytics platforms. There are dozens of vendors to choose from, and among them Microsoft has made a few strategic moves to secure a competitive foothold.
Microsoft launched Power BI for Office 365 in 2013. The initial release offered functionality similar to the Excel add-ins Power Query, Power View and Power Pivot. By offering it at a freemium price point and as part of the Office 365 experience, Microsoft gave business users access to a powerful tool at a low cost of entry. The current Power BI release supports 10 billion rows, and in early 2018 Microsoft will release a version that supports querying 1 trillion rows.
From a technical perspective, Microsoft has set the stage for any organization to use analytics to become data-driven. People in almost any position can now act as big data analysts with little technical knowledge.
So … what’s next?
The IT group should pivot toward the important role of data steward. With more people creating and sharing ‘information’, IT should ensure that the right people have the right access to the right information. IT has the difficult job of balancing organizational innovation against data security, which requires constant negotiation between the business and IT.
Now that data is easier to manipulate and share, each department may create its own data definitions, causing confusion across the organization. The need will arise for a data dictionary that defines every calculation and every data field that users can manipulate. For example: when does someone become a customer? Is it when they sign a contract, when they give their credit card information or when revenue is booked? Depending on the department, the answer will likely be different. As a result, even a simple calculation such as customer retention will vary between marketing and accounting, which is why a data dictionary is needed.
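To make the point concrete, here is a minimal Python sketch of how the same retention calculation can give two different answers. Everything in it is an assumption made for illustration: the Account fields, the retention_rate helper, the four sample accounts and the choice of calendar year 2017 as the reporting period are all hypothetical, and real departments may define retention differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional


@dataclass
class Account:
    # Hypothetical fields, invented for this illustration.
    name: str
    contract_signed: Optional[date]
    first_revenue_booked: Optional[date]
    churned: Optional[date]


def retention_rate(
    accounts: List[Account],
    became_customer: Callable[[Account], Optional[date]],
    period_start: date,
    period_end: date,
) -> Optional[float]:
    """Share of customers active at period_start who are still active at period_end."""
    starting = []
    for account in accounts:
        customer_since = became_customer(account)
        if customer_since is None or customer_since > period_start:
            continue  # not yet a customer under this definition
        if account.churned is not None and account.churned <= period_start:
            continue  # already gone before the period began
        starting.append(account)
    if not starting:
        return None  # no customers to measure under this definition
    retained = [a for a in starting if a.churned is None or a.churned > period_end]
    return len(retained) / len(starting)


# Made-up sample data.
accounts = [
    Account("A", date(2016, 11, 1), date(2016, 12, 1), None),
    Account("B", date(2016, 12, 15), date(2017, 2, 1), date(2017, 6, 1)),
    Account("C", date(2016, 10, 1), date(2016, 11, 1), None),
    Account("D", date(2016, 12, 20), None, date(2017, 3, 1)),
]

start, end = date(2017, 1, 1), date(2017, 12, 31)

# One department counts a customer from the signed contract...
by_contract = retention_rate(accounts, lambda a: a.contract_signed, start, end)
# ...another counts a customer only once revenue is booked.
by_revenue = retention_rate(accounts, lambda a: a.first_revenue_booked, start, end)

print(f"Retention by contract date: {by_contract:.0%}")
print(f"Retention by booked revenue: {by_revenue:.0%}")
```

With this sample data, the contract-based definition counts all four accounts as customers at the start of the year and reports 50% retention, while the revenue-based definition counts only two and reports 100%. Neither number is wrong; they simply answer different questions, and a data dictionary records which question each one answers.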
Once data definitions are created, there will likely be different theories on causation. For example: why did a customer leave? Several theories about client attrition may exist within an organization. Price, quality of service, market factors and competition each suggest a different explanation, and each can be tested through analytics. To do that well, organizations must create a culture around the scientific method.
Too often in business, we come to overarching conclusions without testing the hypothesis. Peer review and healthy debate should be accepted as part of a data analyst’s methodology.
Openly testing and debating observations will only have a positive impact if the results are tied directly back to strategy and a change management plan. Change is extremely difficult. According to a recent study by Towers Watson, only 25% of change initiatives are ultimately considered successful. For an analytics program to show long-term value, the change initiative must be successful. After all, how can an organization realize the value of its data if it is unwilling to change its behavior?