This article will help you learn to develop U-SQL jobs locally which, once ready, can be deployed to the Azure Data Lake Analytics service on the Azure cloud.

Introduction

In the previous article, Developing U-SQL jobs on Azure Data Lake Analytics, we learned to develop an Azure Data Lake Analytics job that can read data from files stored in a data lake storage account, process it, and write the output to a file. We also learned how to optimize the performance of the job. Now that we understand the basic concepts of working with these jobs, let’s say we are considering using this service for a project in which multiple developers would be developing these jobs on their local workstations. In that case, we need to equip the development team with tools they can use to develop these jobs. They can also develop these jobs using the web console, but that is often not the most efficient approach, because the web console lacks the full-fledged features that locally installed IDEs offer to support large-scale code development.

Setting up sample data in Azure Data Lake Storage Account

While performing development locally, one may need test data on the cloud as well as on the local machine. We will explore both options. In this section, let’s look at how to set up some sample data that can be used with U-SQL jobs.

Navigate to the dashboard page of the Azure Data Lake Analytics account. On the menu bar, you would find an option named Sample Scripts as shown below.

Click on the Sample Scripts menu item, and a screen would appear as shown below. There are two options: the first installs sample data, and the second installs the U-SQL advanced analytics extensions that allow us to use languages like R and Python.

Click on the sample data warning icon, which will start copying sample data to the data lake storage account. Once done, you would be able to see a confirmation message as shown below. This completes the setup of sample data on the data lake storage account.
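
To check that the sample data is usable, a small U-SQL script can read one of the installed files and write it back out. The following is a minimal sketch, not a definitive job: it assumes the installer placed a tab-separated search log at /Samples/Data/SearchLog.tsv with the seven columns listed below, and the output path is a hypothetical name chosen for illustration. Adjust both if your account differs.

```usql
// Read the sample search log (path and schema are assumptions about
// what the sample data installer copies to the storage account).
@searchlog =
    EXTRACT UserId      int,
            Start       DateTime,
            Region      string,
            Query       string,
            Duration    int?,
            Urls        string,
            ClickedUrls string
    FROM "/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

// Write the rows unchanged to a CSV file.
// The output path is hypothetical; any writable path in the account works.
OUTPUT @searchlog
    TO "/output/SearchLog-copy.csv"
    USING Outputters.Csv();
```

If the job completes and the output file appears in the storage account, the sample data is in place and can be used as test input for the jobs developed later.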

Setting up a local development environment

Visual Studio Data Tools provides the development environment and the project constructs needed to develop U-SQL jobs and other projects related to Azure Data Lake Analytics. It is assumed that you have Visual Studio installed on your local machine. If not, consider installing at least the Community edition of Visual Studio, which is freely available for development purposes. Once Visual Studio is installed, we can configure which components are installed. Open the component configuration page and you would be able to see the different components that you can optionally install on your local machine.

Select the Data storage and processing toolset as shown below. On the right-hand side, if you check the details, you would find that this toolset contains the Azure Data Lake and Stream Analytics Tools, which is the set of tools and extensions that we need for developing projects related to Azure Data Lake Analytics.

