Madilyn Kihn


Data Quality — You’re Measuring It Wrong

Introducing a better way: data downtime

#data #data-quality #data-team #data-science #data-engineering #programming

Siphiwe Nair


Your Data Architecture: Simple Best Practices for Your Data Strategy

If your organization bases its decision-making on the data it accumulates, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining as customer-centric as possible, and streamlining processes to achieve precise outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition

Gerhard Brink


Getting Started With Data Lakes

Frameworks for Efficient Enterprise Analytics

The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.

This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.


As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).

This is a preview of the Getting Started With Data Lakes Refcard. To read the entire Refcard, please download the PDF from the link above.

#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management

How to Measure Data Quality

You need reliable information to make decisions about risk and business outcomes. Often this information is so valuable that you may seek to purchase it directly through third-party data providers to augment your internal data.

Yet, how often do you consider the impact of data quality on your decisions? Poor information can have detrimental effects on business decisions, and consequently, your business performance, innovation, and competitiveness. In fact, according to Gartner, 40% of all business initiatives fail to achieve their targeted benefits as a result of poor data quality.

Organizations that treat information as a corporate asset should engage in the same quality assessment discipline as with their traditional assets. This means monitoring and improving the quality and value of their information continuously.

In this post, we’re going to explore a practical framework for assessing and comparing data quality. You can apply this to data produced internally, as well as data purchased from third-party vendors.

#data-science #data-driven #data-management #data #data-quality

How to Fix Your Data Quality Problem

Introducing a better way to prevent bad data.

Data quality is top of mind for every data professional, and for good reason. Bad data costs companies valuable time, resources, and most of all, revenue. So why are so many of us struggling with trusting our data? Isn’t there a better way?

The data landscape is constantly evolving, creating new opportunities for richer insights at every turn. Data sources old and new mingle in the same data lakes and warehouses, and there are vendors to serve your every need, from helping you build better data catalogs to generating mouthwatering visualizations (leave it to the NYT to make mortgages look sexy).

Not surprisingly, one of the most common questions customers ask me is “What data tools do you recommend?”

More data means more insight into your business. At the same time, more data introduces a heightened risk of errors and uncertainty. It’s no wonder data leaders are scrambling to purchase solutions and build teams that both empower smarter decision making and manage data’s inherent complexities.

But I think it’s worth asking ourselves a slightly different question. Instead, consider: **“what is required for our organization to make the best use of — and trust — our data?”**

Data quality does not always solve for bad data

It’s a scary prospect to make decisions with data you can’t trust, and yet it’s an all-too-common practice of even the most competent and experienced data teams. Many teams first look to data quality as an antidote to unhealthy, unreliable data. We like to say “garbage in, garbage out.” It’s a true statement, but in today’s world, is that sufficient?

Businesses spend time, money, and resources buying solutions and building teams to manage all this infrastructure with the pipe(line) dream of one day being a well-oiled, data-driven machine — but data issues can occur at any stage of the pipeline, from ingestion to insights. And simple row counts, ad hoc scripts, and even standard data quality conventions at ingestion just won’t cut it.
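To make that last point concrete, here is a minimal sketch of a batch that sails through a simple row count while still being stale and mostly empty. It is an illustration only, not a method from this article: the field names (`amount`, `loaded_at`), the thresholds, and the helper functions are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def row_count_ok(rows, minimum=1):
    """The 'simple row count' check: passes if enough rows arrived."""
    return len(rows) >= minimum


def null_rate(rows, field):
    """Fraction of rows where `field` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(field) is None)
    return missing / len(rows)


def is_fresh(rows, ts_field, now, max_age):
    """True if the newest timestamp in the batch is within `max_age` of `now`."""
    timestamps = [r[ts_field] for r in rows if r.get(ts_field) is not None]
    if not timestamps:
        return False
    return now - max(timestamps) <= max_age


def check_batch(rows, now):
    """Run all three checks; field names and thresholds are assumptions."""
    return {
        "row_count_ok": row_count_ok(rows),
        "null_rate_ok": null_rate(rows, "amount") <= 0.1,
        "fresh": is_fresh(rows, "loaded_at", now, timedelta(hours=24)),
    }


# Hypothetical batch: three rows arrived, but two have no amount
# and every row was loaded three days ago.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
stale = now - timedelta(days=3)
batch = [
    {"order_id": 1, "amount": None, "loaded_at": stale},
    {"order_id": 2, "amount": None, "loaded_at": stale},
    {"order_id": 3, "amount": 25.0, "loaded_at": stale},
]

report = check_batch(batch, now)
print(report)
```

The row count passes, yet the null-rate and freshness checks both fail. A check that runs only at ingestion would report this batch as healthy, even though nothing downstream can safely rely on it.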

#data-science #data-analysis #data-quality #towards-data-science #data #data analysis