Welcome to the data team! Please solve everything.

Let’s begin with a common situation. Somebody with at least an ounce of authority, whether a plucky MBA marketing intern or the CEO, is faced with a difficult problem. They schedule a project kickoff meeting with various folks to brainstorm a solution, knowing that it will need to be supported by data to have any chance of being approved by their superiors. Although they make the meeting invite optional for a few folks, they ensure that somebody from the data analytics or data science team (henceforth referred to as the “data team”) will be there.

A quick note: in my opinion, there still isn’t a consensus about the difference between data analytics and data science. The tasks that data analysts perform at one company would fit the job description of data scientists at another. Regardless, know that when I refer to the “data team” in this article and future ones, I’m describing end users of data who do reporting, analysis, and/or modeling to help an organization accomplish its goals. They’re not data engineers, tasked with building, maintaining, and improving data pipelines and other infrastructure. They’re also not machine learning engineers, who build models that are integrated into the company’s product, such as recommender or algorithmic pricing systems.

Let’s imagine you’re on this highly demanded data team. On the one hand, you’re directly involved in solving some of the most important problems your company faces. On the other, it’s very possible that some of the problems you’re faced with fall into one of the following unfortunate categories:

1. The problem is not informed by data.

Although progress is being made in natural language processing, creative writing is best left to humans. Brand messaging in particular comes to mind.

We didn’t get the successful slogans “Where’s the Beef?”, “Just Do It”, or “Got Milk?” from big data piped into fancy modeling frameworks and dynamic dashboards. In fact, if we imagine that a data and modeling team with access to historical brand advertising campaign performance had been heavily involved at the ad agencies of the latter two campaigns, we might have ended up with “Where’s the Air?” and “Where’s the Milk?” The goal of these advertisers was to come up with a punchy, original slogan that _differentiates_ their brand from others. The algorithmically generated slogans I shared are clearly ridiculous, but they drive home the fact that historical data and typical modeling techniques aren’t appropriate for this problem.

import creativity as crtv. Photo by Pixabay from Pexels.

At best, the data folks who are involved in the project have their time wasted and politely excuse themselves as the project progresses. At worst, they dazzle stakeholders with impressive-looking but irrelevant analyses, crowding out the actual work that needs to be done by other project contributors.

As more data, and in particular new kinds of datasets, are generated over time, reliance on data for decision making will increase, and we’ll find ourselves in this situation less frequently than in the subsequent ones.

#data-science #leadership #decision-making #advice #analytics #data analysis

Siphiwe Nair

1620466520

Your Data Architecture: Simple Best Practices for Your Data Strategy

If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition

Gerhard Brink

1620629020

Getting Started With Data Lakes

Frameworks for Efficient Enterprise Analytics

The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.

This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.

Introduction

As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).


This is a preview of the Getting Started With Data Lakes Refcard. To read the entire Refcard, please download the PDF from the link above.

#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management

Gerhard Brink

1624699032

Introduction to Data Libraries for Small Data Science Teams

At smaller companies, access to and control of data is one of the biggest challenges faced by data analysts and data scientists. The same is true at larger companies when an analytics team is forced to navigate bureaucracy, cybersecurity, and over-taxed IT, rather than benefit from a team of data engineers dedicated to collecting and making good data available.

Creative, persistent analysts find ways to get access to at least some of this data. Through a combination of daily processes to save email attachments, run database queries, and copy and paste from internal web pages, one might build up a mighty collection of data sets on a personal computer, in a team shared drive, or even in a database.

But this solution does not scale well, and is rarely documented and understood by others who could take it over if a particular analyst moves on to a different role or company. In addition, it is a nightmare to maintain. One may spend a significant part of each day executing these processes and troubleshooting failures; there may be little time to actually use this data!

I lived this for years at different companies. We found ways to be effective but data management took up way too much of our time and energy. Often, we did not have the data we needed to answer a question. I continued to learn from the ingenuity of others and my own trial and error, which led me to the theoretical framework that I will present in this blog series: building a self-managed data library.

A data library is _not_ a data warehouse, data lake, or any other formal BI architecture. It does not require any particular technology or skill set (coding will not be required, but it will greatly increase the speed at which you can build and the degree of automation possible). So what is a data library, and how can a small data analytics team use it to overcome the challenges I’ve described?

#big data #cloud & devops #data libraries #small data science teams #introduction to data libraries for small data science teams #data science

Analyzing Data From U.S. Road Accidents With Data Visualization

Every 24 seconds, a life is lost on the road, and it costs countries around 3% of their gross domestic product - World Health Organization.

With a fatality rate of 12.3 per 100,000 inhabitants, traffic accidents are a leading cause of death in the United States. In 2019, 36,096 lives were reported lost on U.S. roads, and according to the National Highway Traffic Safety Administration (NHTSA), crashes cost the U.S. economy about $871 billion annually.

In this article, we will analyze data related to U.S. road accidents, which can be used to study accident-prone locations and to understand the factors that influence road fatalities in the United States.

“Having access to accurate and updated information about the current road situation enables drivers, pedestrians, and passengers to make informed road safety decisions.”

- Association For Safe International Road Travel.

#data-science #big-data-analytics #data-integration #solving-data-integration #data #data-analysis

Chet Lubowitz

1595429220

How to Install Microsoft Teams on Ubuntu 20.04

Microsoft Teams is a communication platform used for chat, calling, meetings, and collaboration. It is generally used by companies and individuals working on projects. Microsoft Teams is now available for macOS, Windows, and Linux operating systems.

In this tutorial, we will show you how to install Microsoft Teams on an Ubuntu 20.04 machine. By default, the Microsoft Teams package is not available in the Ubuntu default repository. We will show you two methods to install Teams: downloading the Debian package from the official website, or adding the Microsoft repository.

Install Microsoft Teams on Ubuntu 20.04

1./ Install Microsoft Teams using Debian installer file

01- First, navigate to the Teams app downloads page and grab the Debian binary installer. You can simply copy the URL and download the package using wget:

$ VERSION=1.3.00.5153
$ wget https://packages.microsoft.com/repos/ms-teams/pool/main/t/teams/teams_${VERSION}_amd64.deb
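
As a rough sketch of where this first method is heading, the snippet below builds the package file name and download URL from the VERSION shown above and prints the download and install commands instead of running them. The variable names DEB_FILE and DEB_URL are illustrative, and the `sudo apt install ./<file>` step is an assumption about the usual way to install a local .deb package on Ubuntu; it is not taken from the tutorial text.

```shell
#!/bin/sh
# Sketch only: VERSION is the value used in the tutorial above.
# DEB_FILE and DEB_URL are hypothetical helper names introduced for illustration.
VERSION=1.3.00.5153
DEB_FILE="teams_${VERSION}_amd64.deb"
DEB_URL="https://packages.microsoft.com/repos/ms-teams/pool/main/t/teams/${DEB_FILE}"

# Print the commands rather than executing them, so they can be reviewed first.
echo "wget ${DEB_URL}"
# Assumed install step: apt can install a local package when given a path to it.
echo "sudo apt install ./${DEB_FILE}"
```

Once the printed commands look right for your system, they can be run by hand. Keeping the version in one variable means only a single line changes when a newer package is published.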

#linux #ubuntu #install microsoft teams on ubuntu #install teams ubuntu #microsoft teams #teams #teams download ubuntu #teams install ubuntu #ubuntu install microsoft teams #uninstall teams ubuntu