Monty Boehm

2022-12-19

What is Snowflake Database - Guide

Snowflake databases are all the rage these days. But what are they, exactly? And what makes them so special?

In this guide, we’ll answer those questions and more. We’ll explain what snowflake databases are, how they work, and why you might want to use one in your business. Plus, we’ll give you a few tips on getting started with snowflake databases if you’re not quite sure where to begin.

1. What is a snowflake database?

Snowflake databases are a more modern type of database architecture. Their rise in popularity reflects the way companies use data today and what platforms they prefer to access their information.

Unlike their predecessor, the star schema, snowflake schemas allow greater flexibility when designing and managing databases. They’re also easier to maintain and update than traditional relational databases (RDB). And that’s not all: many find them easier to query as well.

2. What is a relational database? 

A relational database is one of the most common ways to store large amounts of data in an organized manner.

It works by associating pieces of related data with each other, forming what’s called a table. A table can contain many different types of data (strings, numbers, dates, etc.), organized into rows and columns.
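
For example, a minimal relational table might look like this (the table and column names here are hypothetical):

-- A simple relational table: each row is one customer,
-- each column holds one type of data.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    full_name   VARCHAR(100),
    signup_date DATE
);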

3. What is a star schema? 

The star schema is one of the most common ways to organize databases in relational database management systems (RDBMS). It’s an older method that was popular before snowflake schemas came into use.

The data in the RDBMS is organized into fact tables, which hold measured values, and dimension tables, which hold descriptive metadata used for querying the database. Together, these two types of tables form the star schema model.
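
As a rough sketch (all names hypothetical), a star schema pairs a central fact table with flat dimension tables:

-- Star schema: the dimension keeps all attributes inline.
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name VARCHAR(100),
    category     VARCHAR(50)   -- stored directly in the dimension
);

-- The fact table holds the measured values and points at the dimension.
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     NUMBER(10,2),
    sale_date  DATE
);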

4. How does a snowflake database work? 

Snowflake databases help companies solve problems that come with traditional relational designs such as the star schema: data fragmentation, maintenance overhead, and computing cost. This is what makes them so popular.

A snowflake database organizes the same kinds of information found in relational databases into dimensional models. The most significant difference between a snowflake model and a star schema is that the dimensions in a snowflake database don’t depend on each other for storage or querying purposes. This gives you greater flexibility when deciding which tables to build and which columns to put in them.
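
Continuing the hypothetical sketch above, a snowflake schema normalizes the dimension into smaller, related tables:

-- Snowflake schema: the category attribute moves to its own table.
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name VARCHAR(50)
);

-- fact_sales would now reference dim_product_sf instead of dim_product.
CREATE TABLE dim_product_sf (
    product_id   INTEGER PRIMARY KEY,
    product_name VARCHAR(100),
    category_id  INTEGER REFERENCES dim_category(category_id)
);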

5. What are some advantages of using snowflake schemas over traditional relational databases? 

Many organizations, especially those dealing with large amounts of structured data, opt to use snowflake databases instead of RDBMS. Here are some of the advantages:

Flexible schema design: Snowflake databases allow you to design schemas that reflect how business users think about data, not what the database engine needs to store the data effectively. This helps reduce complexity and boost performance.

Simplified management: Snowflake schemas make it easier for companies to spot problems arising from changes in their organization’s data model. They’re also easier to maintain because they don’t require complex ETL processes the way RDBMSs do. And lastly, there’s less computing overhead than with other types of database structures, because snowflake structures distribute individual tables across multiple servers.

Enhanced querying capabilities: Since dimensions in a snowflake database aren’t dependent on each other, there’s usually little data duplication. This allows companies to query the entire snowflake more efficiently than an RDBMS.
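
To make this concrete, here is a hedged example using the hypothetical tables sketched earlier, walking from the fact table through the normalized dimensions:

-- Total sales per category: fact -> product -> category.
SELECT c.category_name,
       SUM(f.amount) AS total_sales
FROM   fact_sales f
JOIN   dim_product_sf p ON f.product_id = p.product_id
JOIN   dim_category   c ON p.category_id = c.category_id
GROUP BY c.category_name;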

6. What are some alternatives to snowflake databases? 

Snowflake models aren’t the only way for businesses to store their data. There are a couple of alternatives: normalized and denormalized data storage. They’re helpful in certain situations, but they don’t offer the same advantages snowflakes do over star schemas, which is why those who need those advantages tend to prefer snowflakes over other types of database structures.

Normalized Data Storage: This method splits data across several small, related tables. It helps resolve duplication issues because each piece of information is stored only once. However, it can get really complicated to maintain and query because of all the necessary joins between tables.

De-Normalized Data Storage (Denormalized): This method removes normalization, combining what would be separate tables and storing the same information together instead. Unlike normalized storage, it introduces data redundancy, but it cuts down on joins. Still, it has its own set of problems, including making updates more complex and storage more costly than on a snowflake model or another alternative data structure.

7. Why choose a snowflake database for your business?

Snowflake databases are what are known as dimensional models. They’re typically used for online analytical processing (OLAP), which means they’re great at handling large volumes of data. This makes them perfect for what businesses now need to do, such as analyzing large amounts of structured and unstructured data, pulling insights from machine learning systems, and making real-time decisions based on what the data shows.

What sets snowflakes apart is how they organize information: companies can store what matters most while extending that storage across multiple servers. With less computing overhead than RDBMSs, they help improve performance by efficiently filtering out information that isn’t relevant to business users’ particular tasks.

8. Tips on getting started with the snowflake database 

What should your first steps be if you’re just starting out with snowflake databases? We’ve got a few tips for you. A good way to get started is by building what’s known as a dimensional model. This is what data architects and business intelligence (BI) analysts use to map how data is connected and where it can be found within the snowflake. Data modeling is an integral part of this process, so you need to know what kind of design best suits your company’s needs - whether you want to go hybrid or fully dimensional.

As far as networking goes, the next step is to install grids on each server that your snowflake will run on. These bring all the servers together so they have what they need to work with the snowflake model. Once these grids are installed, import what’s known as a knowledge module. This ensures that all of your servers communicate effectively and can handle the load, so you get excellent performance even from large amounts of data.

Learning the snowflake database can be a daunting task, but with the right resources it’s easy to get up to speed.

If you need help setting up or managing your snowflake database, don’t hesitate to contact us. Our experts are more than happy to help you get started and make sure your database is running smoothly.

Original article source at: https://www.blog.duomly.com/

#snowflake #database 


Desmond Gerber

2022-12-10

Model Snowflake Materialized Views in Vertabelo

Cloud technologies are becoming more and more popular. Recently, Vertabelo added support for the Snowflake database. An additional feature, much awaited by our users, was support for materialized views in Snowflake. We are happy to announce that you can now model materialized views in a Snowflake database using Vertabelo.

What Is a Materialized View?

Materialized views are different from simple views. While simple views allow us to save complicated queries for future use, materialized views store a copy of the query results. They are not always perfectly up to date, but they can be very useful when the results of complicated queries must be obtained quickly.
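
In Snowflake SQL, a materialized view is created much like a regular view; here is a minimal sketch (the table and column names are hypothetical):

-- Precompute daily totals so dashboards don't re-aggregate raw sales.
CREATE MATERIALIZED VIEW daily_sales_mv AS
SELECT sale_date,
       SUM(amount) AS total_amount
FROM   sales
GROUP BY sale_date;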

In this article, we will show you how to model Snowflake materialized views in Vertabelo.

Modeling Snowflake Materialized Views in Vertabelo

1. Add a Materialized View

To add a view, click the Add materialized view icon.

Alternatively, in Model Structure (in the left panel), right-click on Materialized views and choose Add materialized view.

2. Change the View Name

If you don’t change the default view name, the following warning will pop up:

To fix that, click on the materialized view. In the right panel, under General, type the name of your choice.

You can also add a comment to the view if you’d like.

3. Add the Query

The next step is to add a query. Use the SQL query field in the Materialized View Properties panel.

Next, scroll down and click Update columns.

A new window will appear. Verify the columns that will be generated based on the SQL query you provided. Then, click Update columns.

The view’s columns should be updated:

Note that the materialized view must have at least one column. Otherwise, the following error will appear:

How to Edit Columns

If the automatically generated columns are incorrect or if they need to be changed, you can always modify them manually in the Columns section of the right panel. Here you can change column names and types, add new columns, or delete existing columns.

Materialized View Options

Additional SQL Scripts

To configure additional SQL scripts, select the materialized view. In the right panel, scroll down to the Additional SQL scripts section. You can add scripts that will be run before and after the materialized view is created. They can perform actions that cannot be modeled directly in Vertabelo, such as defining functions or stored procedures, adding users, or setting permissions for objects like views or tables.

Additional Properties

To see additional view properties, select the view. In the right panel, scroll down to the Additional properties section.

Click Set to add a property.

Click Unset to remove it.

Let’s discuss the role of each materialized view property in Snowflake:

Schema – This is the name of the schema in which the materialized view will be placed.

Secure – If this is set to yes, the view is marked as secure. This means that the underlying tables, internal structural details, and the data in the view’s base tables are hidden and can only be accessed by authorized users.

Cluster by – This property is a comma-separated list of columns or expressions that determine how to cluster the materialized view. When the materialized view is clustered, the data inserted is sorted based on the clustering columns. This can improve performance when the data is queried, as not all the rows will be scanned.
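
Taken together, these properties map onto the generated Snowflake DDL roughly like this (a hypothetical sketch, not Vertabelo’s exact output):

-- A secure materialized view placed in the analytics schema,
-- clustered by sale_date.
CREATE SECURE MATERIALIZED VIEW analytics.daily_sales_mv
  CLUSTER BY (sale_date)
AS
SELECT sale_date, SUM(amount) AS total_amount
FROM   sales
GROUP BY sale_date;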

Format

To change the view’s appearance, select the view and scroll down to the Format section in the right panel. To change the table background color, click on the color field under Fill color and select the desired color in the color picker. You can also choose the table’s line color and set its size to fixed.

Learn More About Materialized Views in Vertabelo

In this article, we discussed what materialized views are and how to model them in Vertabelo. You can learn more about materialized views here. To see what new features were introduced in Vertabelo this year, check out 2020 Was Vertabelo Database Modeler’s Year. It’ll Get Even Better in 2021.

Original article source at: https://www.vertabelo.com/

#views #model #snowflake 


Gordon Murray

2022-12-10

Model Snowflake External Tables in Vertabelo

There is a lot more data out there than what is stored in databases. This raises the question of how to access all that external data from the database. External tables come to the rescue! Read along to learn more about external tables in Snowflake and how to model them in Vertabelo.

This article focuses on external tables in the Snowflake database. We will first introduce the Snowflake database and the concept of external tables. We’ll then see how to model Snowflake external tables in Vertabelo.

To find out more about Vertabelo’s support for the Snowflake database, check out this article.

Let’s get started!

External Tables in Snowflake

We will start with the basics. Let’s find out more about Snowflake and the concept of external tables.

What Is Snowflake?

Snowflake provides various services. It is a cloud computing data warehouse where you can store and analyze data. It is also an SQL database that lets you store, access, and retrieve data.

Source: https://www.snowflake.com/blog/managing-snowflakes-compute-resources/

Snowflake runs on Amazon Web Services, Microsoft Azure, and Google Cloud. It introduces a convenient separation of data storage from computing, thus allowing for seamless scaling of both the storage and the CPU independently.

What Are External Tables?

We are used to working with database tables whose data resides internally in a database. But we can also create tables with data external to the database. These are the so-called external tables.

What else should you know about external tables?

  • While the data in external tables resides outside the database, external tables make it feel as if it is inside.
  • External tables let you use their data for querying and joining operations.
  • External tables are read-only. So, you cannot perform any DML operations on them.
  • Querying data external to a database can negatively impact query performance. You can mitigate this by creating materialized views based on external tables and running queries on those views.

Let’s visualize external tables in a database.

External tables do what the name indicates. They take data from external files and make it available in a database. Standard tables, on the other hand, contain data that resides internally in a database.

Modeling Snowflake External Tables in Vertabelo

Now, let’s get started with Vertabelo. We’ll learn how to create external tables in Snowflake and modify their properties.

But first, let’s create a new physical data model for our Snowflake database. Then we can design external tables in ER diagrams.

Now, we are ready to start!

There are many ways you can draw ER diagrams. Learn more by visiting our articles on How to Draw an ER Diagram Online and Top 7 Entity Relationship (ER) Diagram Online Tools.

Creating External Tables in Snowflake

A Snowflake ER diagram would not be complete without external tables. Vertabelo provides you with convenient ways of creating them.

Via the Toolbar

Just click the button, and you’ll get the external table ready.

Via the Left Panel

You can also do it via the left-hand panel, like this:

Now, our external table is ready.

Let’s fix the errors and warnings by modifying the properties in the right-hand panel.

Modifying Properties of External Tables

Now that we have created an external table in Snowflake, we can fill in the properties.

Changing the Name

When an external table is created, there is a warning that says we should change its default name. We can do that in the General section of the right-hand panel, like this:

We got rid of the warning! Let’s move on.

Adding Columns

We can add columns to an external table in the same way we do for standard tables. In the Columns section of the right-hand panel, there is an Add column button. By clicking it, we can add some columns.


But there are still two errors. Let’s resolve those.

Adding the File and its Location

The remaining errors tell us the file location and file format fields are missing. We can easily add them in the Additional properties section of the right-hand panel. First, click the Set button next to the Location and File format fields. Then, you can type in the location of the external file and its format.

We have created an external table and got rid of all the errors. Now, we can explore other properties.

Other Properties for External Tables

There are many other properties in the Additional properties section of the right-hand panel. Let’s go through them one by one.

The available properties are, from the top:

  • The Schema property asks you for the database schema name where the external table exists.
  • The Location property requires the location of the external file (as mentioned in the previous section). The trace of this property in the generated SQL script is WITH LOCATION = <value of the Location field>.
  • The Partition by property asks you how to partition the table. The trace of this property in the generated SQL script is PARTITION BY (<value of the Partition by field>).
  • The File format property requires the format of the external file. The trace of this property in the generated SQL script is FILE_FORMAT = <value of the File format field>.
  • The AWS SNS topic property is optional. The trace of this property in the generated SQL script is AWS_SNS_TOPIC = <value of the AWS SNS topic field>.
  • The Pattern property lets you filter the data that matches the given pattern. The trace of this property in the generated SQL script is PATTERN = '<value of the Pattern field>'.
  • The Auto refresh property is set to Yes by default (i.e., after clicking the Set button). This property ensures periodic synchronization of the data in the external table with the data in the file unless you set it to No. The trace of this property in the generated SQL script is AUTO_REFRESH = TRUE|FALSE.
  • The Refresh on create property is set to Yes by default (i.e., after clicking the Set button). This property ensures the data of the external table is synchronized with the data in the file at table creation time unless you set it to No. The trace of this property in the generated SQL script is REFRESH_ON_CREATE = TRUE|FALSE.
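
Putting several of these properties together, the generated DDL might look roughly like this (a hedged sketch; the stage, file, and column names are hypothetical):

-- An external table over CSV files in a stage, partitioned by a date
-- parsed from the file path.
CREATE EXTERNAL TABLE sales_ext (
  file_date DATE AS TO_DATE(SPLIT_PART(METADATA$FILENAME, '/', 3)),
  amount NUMBER AS (VALUE:c2::NUMBER)
)
PARTITION BY (file_date)
WITH LOCATION = @my_stage/sales/
FILE_FORMAT = (TYPE = 'CSV')
PATTERN = '.*[.]csv'
AUTO_REFRESH = TRUE
REFRESH_ON_CREATE = TRUE;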

Other Properties for Columns of External Tables

In the Columns section of the right-hand panel, you can expand each column by clicking the arrow next to it.

There are two properties: Expression and Constraint. The Expression property defines the alias name for a column. The trace of this property in the generated SQL script is columnName AS columnAlias. The Constraint property defines the constraint set on a column. It may be NOT NULL, DEFAULT, PRIMARY KEY, etc.

These are all the properties you can use when creating ER diagrams that contain external tables in Vertabelo.

What’s Next?

External tables in Snowflake can be considered an interface that lets you view data external to your database. Not only can you view external data, but you can also query it and join it with other tables. And if you create a materialized view based on an external table, you get benefits such as improved query performance. External tables are a powerful tool that it pays to know.

Now, it’s time for you to create ER diagrams with external tables in Vertabelo on your own. Be sure to check out the current developments in Vertabelo by following this article. Good luck!

Original article source at: https://www.vertabelo.com

#snowflake #table #database 


Leaf: Distributed ID Generation Service Written in Java

Introduction

Leaf draws on some common ID generation schemes in the industry, including Redis, UUID, snowflake, etc. Since each of the above approaches has its own problems, we decided to implement a set of distributed ID generation services to meet the requirements. At present, Leaf covers many business lines inside Meituan-Dianping, including internal finance, catering, takeaway, hotel and travel, Maoyan Movie, and more. On a 4C8G VM, through the company RPC method, QPS pressure test results are nearly 50,000/s, with TP999 at 1 ms.

You can use it to encapsulate a distributed unique-ID distribution center in a service-oriented SOA architecture, acting as the ID distribution provider for all applications.

Quick Start

Leaf Server

Leaf provides an HTTP service based on Spring Boot to get the ID.

run Leaf Server

build

git clone git@github.com:Meituan-Dianping/Leaf.git
cd leaf
mvn clean install -DskipTests
cd leaf-server

run

maven

mvn spring-boot:run

or

shell command

sh deploy/run.sh

test

#segment
curl http://localhost:8080/api/segment/get/leaf-segment-test
#snowflake
curl http://localhost:8080/api/snowflake/get/test

Configuration

Leaf provides two ways to generate IDs (segment mode and snowflake mode). You can turn both on at the same time or enable just one of them (both are off by default).

Leaf Server configuration is in the leaf-server/src/main/resources/leaf.properties

configuration              meaning                                          default
leaf.name                  leaf service name
leaf.segment.enable        whether segment mode is enabled                  false
leaf.jdbc.url              mysql url
leaf.jdbc.username         mysql username
leaf.jdbc.password         mysql password
leaf.snowflake.enable      whether snowflake mode is enabled                false
leaf.snowflake.zk.address  zk address under snowflake mode
leaf.snowflake.port        service registration port under snowflake mode

Segment mode

In order to use segment mode, you need to create the DB table first and configure leaf.jdbc.url, leaf.jdbc.username, and leaf.jdbc.password.

If you do not want to use it, just configure leaf.segment.enable=false to disable it.

CREATE DATABASE leaf;
CREATE TABLE `leaf_alloc` (
  `biz_tag` varchar(128)  NOT NULL DEFAULT '', -- your biz unique name
  `max_id` bigint(20) NOT NULL DEFAULT '1',
  `step` int(11) NOT NULL,
  `description` varchar(256)  DEFAULT NULL,
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`biz_tag`)
) ENGINE=InnoDB;

insert into leaf_alloc(biz_tag, max_id, step, description) values('leaf-segment-test', 1, 2000, 'Test leaf Segment Mode Get Id');
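
Under the hood, segment mode hands out ranges of IDs by bumping max_id by step in the database, roughly like this (a sketch of the idea, not Leaf's exact SQL):

-- Reserve the next segment [max_id, max_id + step) for a biz_tag.
BEGIN;
UPDATE leaf_alloc SET max_id = max_id + step WHERE biz_tag = 'leaf-segment-test';
SELECT biz_tag, max_id, step FROM leaf_alloc WHERE biz_tag = 'leaf-segment-test';
COMMIT;

Each such round trip reserves a whole segment of IDs that the server can then hand out from memory, which is what keeps the QPS high.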

Snowflake mode

The algorithm is taken from Twitter's open-source snowflake algorithm.

If you do not want to use it, just configure leaf.snowflake.enable=false to disable it.

Configure the zookeeper address

leaf.snowflake.zk.address=${address}
leaf.snowflake.enable=true
leaf.snowflake.port=${port}

Configure leaf.snowflake.zk.address in leaf.properties, and configure the leaf service listen port leaf.snowflake.port.

monitor page

segment mode: http://localhost:8080/cache

Leaf Core

Of course, to pursue higher performance, you can deploy the Leaf service through an RPC server. You only need to include the leaf-core package and wrap the ID-generation API in your chosen RPC framework.

Attention

Note that in snowflake mode, Leaf's current IP acquisition logic takes the first network card's IP directly. Pay special attention to services whose IP changes, to avoid wasting workIds.

Download Details:
Author: Meituan-Dianping
Source Code: https://github.com/Meituan-Dianping/Leaf
License: Apache-2.0 license

#database  #java 


What is Snowflake Database - Guide for 2022

https://www.blog.duomly.com/what-is-snowflake-database/

#database #databases #snowflake #cloud 


Franz Becker

2021-10-08

Airbytes Deployment and Usage Guide

In this post we will use Airbyte, one of the most exciting open-source ELT tools in modern data engineering, to create near real-time replication by fetching data from Snowflake into a 2-node Oracle RAC database on OCI. Please note that Airbyte is an Extract-Load-Transform (ELT) tool, not an ETL tool like Airflow, which can run DAG (Directed Acyclic Graph) executions. There is a significant difference between ETL and ELT.
#airbyte  #oracle #snowflake 


Top 7 Free Resources To Learn Snowflake

In less than a decade, Snowflake has emerged as one of the most reliable platforms for data warehousing. Benoit Dageville, Thierry Cruanes and Marcin Żukowski founded the company in July 2012. Snowflake offers data-warehouse-as-a-service, or cloud-based data storage and analytics services.

 

Read more: https://analyticsindiamag.com/top-7-free-resources-to-learn-snowflake/

#snowflake 


Darian Hyatt

2021-06-26

Designing Snowflake for Scalable Data Applications

Building applications that scale as your customer base grows can be challenging. It’s important to have an efficient tenancy approach that meets compliance needs for both you and your customers. In addition, as the load increases or becomes variable, you need to deliver on performance SLAs consistently.

Watch this webinar to learn how to implement scalable data applications powered by Snowflake that deliver a great experience to your customers, and get advice on tenancy options that meet the needs of your application.

#developer #snowflake


Ingest Tweets on Snowflake with Apache NiFi and Informatica Intelligent Cloud Services

Context

In this article, we will see how semi-structured information such as JSON can be managed, transformed, and stored within a relational database in the cloud, taking advantage of the best of both worlds: the flexibility of JSON and the robustness of an RDBMS.

We will also see how to “download” this information to a classic data analysis and exploitation environment with tools such as Oracle and Tableau.

Prerequisites

To follow this lab, you need Apache NiFi installed on your computer. This can be done in two ways: by installing it on-premises,

https://nifi.apache.org/download.html

or by downloading the “sandbox” provided by Cloudera, also free of charge.

https://www.cloudera.com/downloads/hortonworks-sandbox/hdf.html

Hortonworks Sandbox HDF 3.1 running on Virtual Box

Also, you need a Snowflake account; as of the date of publication of this article, Snowflake offers a free 30-day trial.

https://signup.snowflake.com

Once you have a Snowflake account, you can connect to Informatica’s Intelligent Cloud Services (IICS from now on) and enjoy free processing up to a certain number of rows. (This connection will be covered later in the article.)

#snowflake #informatica #cloud #nifi #etl #oracle


Moving from Pandas to Spark

When your datasets start getting large, a move to Spark can increase speed and save time.

Most data science workflows start with Pandas. Pandas is an awesome library that lets you do a variety of transformations and can handle different kinds of data such as CSVs or JSONs. I love Pandas — I made a podcast on it called “Why Pandas is the New Excel”. I still think Pandas is an awesome library in a data scientist’s arsenal. However, there comes a point where the datasets you are working on get too big and Pandas starts running out of memory. It is here that Spark comes into the picture.

I am writing this blog post in a Q and A format with questions you might have and I also had when I was getting started.

#pandas #big-data #snowflake #spark #machine-learning #moving from pandas to spark


Grace Lesch

2021-06-11

SQL and AdventOfCode 2020, on Snowflake

2020 has been a year full of surprises — one of them being that I left Google and joined Snowflake. Last year I did my first run for #AdventOfCode with SQL, and this year the challenge is helping me get acquainted with a different SQL syntax than the one I was used to with BigQuery.

I’ll leave some notes here on my major discoveries and learnings — while developing my new Snowflake expertise.

#sql #database #snowflake


Aisu Joesph

2021-06-08

Car buying is fun with Snowflake Part 3

In part 2, the entire pipeline to ingest data into Snowflake was automated using Azure Logic Apps and Snowpipe. JSON data was loaded into a Snowflake landing table with a single column called JSON_DATA.

Ideally, there should also be a datetime column containing the date and time when data was loaded into the landing table. However, due to a limitation in Snowpipe, it will not allow any additional columns when using the JSON format. If you try, you will get an error.

Snowflake’s Streams and Tasks features can be leveraged to move this data into a 2nd landing table with additional columns such as load_dttm (load date time).

Snowflake Streams help with CDC (change data capture). A stream works somewhat like a Kafka topic: it contains one row per change in its base table, in this case VMS_Azure_Blob_LZ1 (landing zone 1).

//Create a stream on VMS_Azure_Blob_LZ1 table
CREATE OR REPLACE STREAM VMS_AZURE_BLOB_LZ1_STREAM ON TABLE "VMS"."PUBLIC"."VMS_AZURE_BLOB_LZ1";

//Verify using
SHOW STREAMS;
//Verify that the stream works by invoking the REST API to load some sample data into LZ1, then run a select on the stream
SELECT * FROM VMS_AZURE_BLOB_LZ1_STREAM;

The next step is to insert the data present in the stream into the Landing Zone 2 table. It will be a simple SQL insert like this:

//Create a 2nd Landing table (a sequence is used to generate auto-incremented ids)

CREATE OR REPLACE SEQUENCE VMS_AZURE_BLOB_LZ2_SEQ; //the sequence referenced below

create or replace TABLE VMS_AZURE_BLOB_LZ2 (
 SEQ_ID NUMBER(38,0) NOT NULL DEFAULT VMS.PUBLIC.VMS_AZURE_BLOB_LZ2_SEQ.NEXTVAL,
 LOAD_DTTM TIMESTAMP_NTZ(9) NOT NULL DEFAULT CURRENT_TIMESTAMP(),
 JSON_DATA VARIANT NOT NULL
) COMMENT='Data will be inserted from stream and task';
//Test and verify that the select from the stream works as intended before using it in the task
INSERT INTO "VMS"."PUBLIC"."VMS_AZURE_BLOB_LZ2" (JSON_DATA) (
  SELECT 
    JSON_DATA
  FROM
    VMS_AZURE_BLOB_LZ1_STREAM
  WHERE
    "METADATA$ACTION" = 'INSERT'
);

Remember, the whole idea is to automate, so the insert needs to run automatically. This can be done using a Snowflake Task, which basically works as a task scheduler using the cron time format. In this case it is set to 30 20 * * * to run at 8:30 PM PT, after all the dealerships close.

CREATE OR REPLACE TASK VMS_AZURE_BLOB_MOVE_LZ1_TO_LZ2_TASK
  WAREHOUSE = TASKS_WH //A specific WH created to be used for Tasks only to show up on bill as separate line item
  SCHEDULE  = 'USING CRON 30 20 * * * America/Vancouver' //Process new records every night at 20:30HRS
WHEN
  SYSTEM$STREAM_HAS_DATA('VMS_AZURE_BLOB_LZ1_STREAM')
AS
  INSERT INTO "VMS"."PUBLIC"."VMS_AZURE_BLOB_LZ2" (JSON_DATA) (
  SELECT 
    JSON_DATA
  FROM
    VMS_AZURE_BLOB_LZ1_STREAM
  WHERE
    "METADATA$ACTION" = 'INSERT'
);

Notice that in the above DDL the warehouse specified is TASKS_WH. It was created specifically to run tasks, so it shows up as a separate line item in the monthly bill that Snowflake generates. This makes it easier to monitor and track the cost of tasks; aligning their schedules so that the compute can run a few of them together also results in cost savings.

#snowflake #blob #tasks #azure 


Grace Lesch

2021-06-01

Loading JSON data into Snowflake

Have you ever faced a use case or scenario where you had to load JSON data into Snowflake? JSON is one of the most common formats for storing and exchanging information between systems, and it is a relatively concise format. If we are implementing a database solution, it is very common to come across a system that provides data in JSON format. Snowflake has a very straightforward approach to loading JSON data. In this blog, we will walk through this approach step by step.
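
As a taste of that approach, here is a minimal sketch (the stage and file names are hypothetical):

-- Land raw JSON into a VARIANT column, then query it with path notation.
CREATE OR REPLACE TABLE raw_json (v VARIANT);

COPY INTO raw_json
FROM @my_stage/events.json
FILE_FORMAT = (TYPE = 'JSON');

SELECT v:user.name::STRING AS user_name
FROM raw_json;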

#database #snowflake #json #data


Ian Robinson

2021-05-30

KSnow: SNOWSQL - II

In our previous blog, SnowSQL – I, we introduced you to SnowSQL, a modern command-line tool. Today we are going to talk about how SnowSQL helps to query data interactively.

SnowSQL Customization

SnowSQL works well right after installation, but we can make customizations to take full advantage of it.

For example, by defining connections in the config file, we can preset our environment (account, warehouse, database, etc.) and don’t have to worry about connection strings or exposed passwords.

To keep track of which environment we are in, we can customize the prompt to show only the information we need.

#big data and fast data #snowflake #big data #cloud data warehouse #snowflake #snowsql


Ian Robinson

2021-05-21

Snowflake Data Encryption and Data Masking Policies

Introduction

Snowflake architecture has been built with security in mind from the very beginning. A wide range of security features is made available, from data encryption to advanced key management to versatile security policies to role-based data access and more, at no additional cost. This post describes the data encryption and data masking functionalities.

Snowflake Security Reference Architecture

Data Encryption in Snowflake

Data Masking - Column Level Security

Data Masking Policy for Row Level Security
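
To give a flavor of the column-level feature, here is a minimal masking-policy sketch in Snowflake SQL (the role, table, and column names are hypothetical):

-- Mask email addresses for everyone except the ANALYST role.
CREATE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('ANALYST') THEN val
    ELSE '*** MASKED ***'
  END;

ALTER TABLE customers MODIFY COLUMN email
  SET MASKING POLICY email_mask;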

#big data #data security #snowflake #data masking #data encryption
