Disclaimer: Many points made in this post have been derived from discussions with various parties, but do not represent any individuals or organisations.

Defining clear roles, responsibilities and ways of working is very important. Although my other post has already described the Engine and the Driver, it is useful to understand which capabilities should remain centralised and which should be decentralised for an organisation to become more effective in its data analytics journey.

Let’s start by looking at the essential functions required to facilitate a data-driven organisation.

  • Infrastructure - with a few exceptions in highly regulated sectors, data infrastructure has been moving towards the cloud. Thanks to serverless architecture and container technology, the shift not only reduces operational complexity but also allows for higher reliability, availability and scalability, which are essential attributes of a data platform.
  • Data Pipelines - robust movement and processing of data from one point to another is required to make it suitable for consumption. A data pipeline can be either a simple ELT/ETL process or a complex orchestration that includes real-time streaming and modelling. The emergence of streaming engines such as Storm, Flink and Spark also makes real-time analysis easier.
  • Reporting and Analysis - the ultimate goal of getting into the data space is to gain additional value. This can be a simple process that turns data into informational summaries, or a complex analysis that extracts meaningful insights in a descriptive, predictive or prescriptive way. The product of such reporting or analysis can be presented in different ways, subject to the usability and functionality requirements.
  • Other Functions - security and governance are considered intrinsic functions of the data platform. Access controls and appropriate policies must be in place to safeguard against attacks and unintended usage of sensitive data. Suitable capabilities for monitoring, auditing and billing are also essential, depending on the operational requirements of each organisation.
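
To make the Data Pipelines and Reporting functions a little more concrete, here is a minimal sketch of a simple ETL process followed by an informational summary. It is only an illustration under stated assumptions: the sample data, the `orders` table and the function names (`extract`, `transform`, `load`, `report`, `run_pipeline`) are hypothetical, and an in-memory SQLite database stands in for whatever storage a real data platform would use.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = """order_id,region,amount
1,north,120.50
2,south,80.00
3,north,
4,south,19.50
"""


def extract(raw_text):
    """Extract: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast column types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["region"], float(row["amount"]))
        )
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into the (assumed) orders table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def report(conn):
    """Reporting: turn the loaded data into a summary of totals per region."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM orders "
        "GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


def run_pipeline(raw_text=RAW_CSV):
    """Run extract, transform, load and report against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, transform(extract(raw_text)))
        return report(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total in run_pipeline():
        print(region, total)
```

Calling `run_pipeline()` on the sample data returns one total per region, with the record that lacks an amount dropped during the transform step. A production pipeline would add the orchestration, access controls and monitoring described in the list above.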

Before considering which capabilities should be decentralised or remain centralised, it is worth understanding what can happen in different contexts.


How to Define Data Analytics Capabilities | Hacker Noon