Ruthie Bugala

Analyze Azure Cosmos DB data using Azure Synapse Analytics

This article will help you understand how to analyze Azure Cosmos DB data using Azure Synapse Analytics.

Introduction

Azure Cosmos DB is a multi-model NoSQL database designed to host various types of transactional data. OLTP systems use such transactional databases for their operational data, but when it comes to analyzing large volumes of that data, operational row-oriented databases do not scale or perform well for large-scale analytics. Columnar data warehouses are one of the preferred, effective, and proven means of aggregating and analyzing large volumes of data at big data scale, and Azure Synapse is the data warehouse offering in the Microsoft Azure technology stack. The traditional challenge with analyzing transactional data in a columnar warehouse is that the data has to be replicated and/or relocated from operational repositories into analytical repositories. Hybrid transactional/analytical processing (HTAP) is an approach in which transactionally hosted data is automatically organized into a columnar format, largely eliminating the need to replicate or relocate it. Azure offers this capability, through Azure Synapse Link, for analyzing data hosted in Cosmos DB using Azure Synapse. In this article, we will learn how to implement it.

Prerequisites

This article assumes that you already host data in a Cosmos DB instance. To simulate that setup, you need an Azure Cosmos DB account created with the Core (SQL) API and with the relevant preview features turned on. Once the account is created, it appears in the list of Cosmos DB accounts in the Azure portal.

#azure #sql azure #azure synapse analytics #azure

Christa Stehr

Support for Synapse SQL serverless in Azure Synapse Link for Azure Cosmos DB

Co-authored by Rodrigo Souza, Ramnandan Krishnamurthy, Anitha Adusumilli and Jovan Popovic (Azure Cosmos DB and Azure Synapse Analytics teams)

Azure Synapse Link now supports querying Azure Cosmos DB data using Synapse SQL serverless. This capability, available in public preview, allows you to use familiar analytical T-SQL queries and build powerful near real-time BI dashboards on Azure Cosmos DB data.

As announced at Ignite 2020, you can now also query Azure Cosmos DB API for MongoDB data using Azure Synapse Link, enabling analytics with Synapse Spark and Synapse SQL serverless.

Support for T-SQL queries and building near real-time BI dashboards

Azure Synapse SQL serverless (previously known as SQL on-demand) is a serverless, distributed data processing service offering built-in query execution fault-tolerance and a consumption-based pricing model. It enables you to analyze your data in Cosmos DB analytical store within seconds, without any performance or RU impact on your transactional workloads.

Using OPENROWSET syntax and automatic schema inference, data and business analysts can use familiar T-SQL query language to quickly explore and reason about the contents in Azure Cosmos DB analytical store. You can query this data in place without the need to copy or load the data into a specialized store.
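
As a minimal sketch of what such an exploration query can look like, the T-SQL below runs in a Synapse SQL serverless pool against the Cosmos DB analytical store. The account name, database, key, and the Orders container are placeholders for this illustration; substitute your own values, or use a server-level credential instead of an inline key.

    -- Explore a (hypothetical) Orders container's analytical store
    -- using automatic schema inference; no WITH clause is required.
    SELECT TOP 10 *
    FROM OPENROWSET(
        'CosmosDB',
        'Account=your-cosmos-account;Database=your-database;Key=your-account-key',
        Orders
    ) AS documents;

Because the query reads from the analytical store, it does not consume request units from your transactional containers.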

You can also create SQL views to join data in the analytical stores across multiple Azure Cosmos DB containers, to better organize your data in a semantic layer that accelerates your data exploration and reporting workloads. BI professionals can quickly create Power BI reports on top of these SQL views in DirectQuery mode.
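
A sketch of such a view is shown below, assuming a hypothetical Orders container and hypothetical JSON properties; the WITH clause maps selected properties to typed columns so that reports see a clean relational shape. The view is assumed to be created in a user database on the serverless SQL endpoint rather than in master.

    -- Hypothetical view over an Orders container's analytical store.
    CREATE VIEW dbo.Orders
    AS
    SELECT *
    FROM OPENROWSET(
        'CosmosDB',
        'Account=your-cosmos-account;Database=your-database;Key=your-account-key',
        Orders
    ) WITH (
        orderId    VARCHAR(50) '$.id',
        customerId VARCHAR(50) '$.customerId',
        orderTotal FLOAT       '$.total',
        orderDate  VARCHAR(30) '$.orderDate'
    ) AS orders;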

You can further extend this by building a logical data warehouse to create and analyze unified views of data across Azure Cosmos DB, Azure Data Lake Storage and Azure Blob Storage.
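
As a rough illustration of that logical data warehouse pattern, the query below joins the Cosmos DB-backed view from the previous sketch with reference data stored as Parquet files in Azure Data Lake Storage. The storage path and the customer columns are hypothetical and only stand in for your own curated data.

    -- Join near real-time operational data (Cosmos DB analytical store)
    -- with curated reference data in the data lake (Parquet files).
    SELECT c.segment,
           COUNT(*)          AS orders,
           SUM(o.orderTotal) AS revenue
    FROM dbo.Orders AS o
    JOIN OPENROWSET(
        BULK 'https://yourdatalake.dfs.core.windows.net/curated/customers/*.parquet',
        FORMAT = 'PARQUET'
    ) AS c
        ON o.customerId = c.customerId
    GROUP BY c.segment;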

 

#analytics #announcements #api for mongodb #core (sql) api #data architecture #query #azure cosmos db #azure synapse analytics #serverless sql pools #sql on-demand #synapse link #synapse sql serverless

iOS App Dev

Your Data Architecture: Simple Best Practices for Your Data Strategy

If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition

Eric Bukenya

Learn NoSQL in Azure: Diving Deeper into Azure Cosmos DB

This article is a part of the series – Learn NoSQL in Azure – where we explore Azure Cosmos DB as a part of the non-relational database systems used widely for a variety of applications. Azure Cosmos DB is one of Microsoft's serverless databases on Azure; it is highly scalable and can be distributed across any of the regions Azure runs in. It is offered as a platform as a service (PaaS), and you can build databases with very high throughput and very low latency. Using Azure Cosmos DB, customers can replicate their data across multiple regions around the globe and also across multiple locations within the same region. This makes Cosmos DB a highly available database service, with up to 99.999% availability for reads and writes in multi-region mode and 99.99% availability in single-region mode.

In this article, we will focus more on how Azure Cosmos DB works behind the scenes and how can you get started with it using the Azure Portal. We will also explore how Cosmos DB is priced and understand the pricing model in detail.

How Azure Cosmos DB works

As already mentioned, Azure Cosmos DB is a multi-model NoSQL database service that is geographically distributed across multiple Azure regions. This helps customers deploy their databases across multiple locations around the globe, which in turn reduces read latency for the users of the application.

With Azure Cosmos DB distributed across the globe in this way, let's suppose you have a web application that is hosted in India. In that case, the NoSQL database in India is considered the master database for writes, and all the other databases can be considered read replicas. Whenever new data is generated, it is written to the database in India first and then synchronized with the other databases.

Consistency Levels

When data is maintained across multiple regions, the most common challenge is the latency with which the data becomes available in the other regions. For example, when data is written to the database in India, users in India will be able to see that data sooner than users in the US, because of the synchronization latency between the two regions. To deal with this, customers can choose from a few modes that define how often, or how soon, they want their data to be made available in the other regions. Azure Cosmos DB offers five levels of consistency, which are as follows:

  • Strong
  • Bounded staleness
  • Session
  • Consistent prefix
  • Eventual

In most common NoSQL databases, there are only two levels – Strong and Eventual – with Strong being the most consistent level and Eventual the least. However, as we move from Strong to Eventual, consistency decreases while availability and throughput increase. This is a trade-off that customers need to decide on based on the criticality of their applications. If you want to read about the consistency levels in more detail, the official guide from Microsoft is the easiest to understand. You can refer to it here.

Azure Cosmos DB Pricing Model

Now that we have some idea of how to work with the NoSQL database – Azure Cosmos DB – on Azure, let us try to understand how the database is priced. In order to work with any cloud-based service, it is essential that you have a sound knowledge of how the service is charged; otherwise, you might end up paying much more than you expected.

If you browse to the pricing page of Azure Cosmos DB, you can see that there are two modes in which the database services are billed.

  • Database Operations – Whenever you execute or run queries against your NoSQL database, some resources are consumed. Azure measures this usage in Request Units (RU). The number of RUs consumed per second is aggregated and billed.
  • Consumed Storage – As you start storing data in your database, it takes up space. This storage is billed at the standard SSD-based storage rate, which applies across all Azure locations globally.

Let’s learn about this in more detail.

#azure #azure cosmos db #nosql #azure #nosql in azure #azure cosmos db

Gerhard Brink

Getting Started With Data Lakes

Frameworks for Efficient Enterprise Analytics

The opportunities big data offers also come with very real challenges that many organizations are facing today. Often, it’s finding the most cost-effective, scalable way to store and process boundless volumes of data in multiple formats that come from a growing number of sources. Then organizations need the analytical capabilities and flexibility to turn this data into insights that can meet their specific business objectives.

This Refcard dives into how a data lake helps tackle these challenges at both ends — from its enhanced architecture that’s designed for efficient data ingestion, storage, and management to its advanced analytics functionality and performance flexibility. You’ll also explore key benefits and common use cases.

Introduction

As technology continues to evolve with new data sources, such as IoT sensors and social media churning out large volumes of data, there has never been a better time to discuss the possibilities and challenges of managing such data for varying analytical insights. In this Refcard, we dig deep into how data lakes solve the problem of storing and processing enormous amounts of data. While doing so, we also explore the benefits of data lakes, their use cases, and how they differ from data warehouses (DWHs).


This is a preview of the Getting Started With Data Lakes Refcard.

#big data #data analytics #data analysis #business analytics #data warehouse #data storage #data lake #data lake architecture #data lake governance #data lake management