Improve Row Count Estimates for Table Variables without Changing Code

Table variables can cause performance issues with joins when they contain a large number of rows. In SQL Server 2019, Microsoft has improved how the optimizer handles table variables, which can improve performance without requiring any changes to your code. In this article, Greg Larsen explains how this feature works and whether it really makes a difference.

You have probably heard that table variables work fine when they contain only a small number of records, but that they don't perform all that well once they contain a large number of records. A solution for this problem arrived in version 15.x of SQL Server (Azure SQL Database and SQL Server 2019) with the rollout of a feature called Table Variable Deferred Compilation.

_Table Variable Deferred Compilation_ is one of many new performance features introduced in Azure SQL Database and SQL Server 2019. It is part of the _Intelligent Query Processing_ (IQP) family. See Figure 1 for a diagram that shows all the IQP features introduced in Azure SQL Database and SQL Server 2019, as well as the features that were originally part of Adaptive Query Processing, included in the previous generation of Azure SQL Database and in SQL Server 2017.

Figure 1: Intelligent Query Processing

In releases of SQL Server prior to 15.x, the database engine made an incorrect assumption about the number of rows in a table variable. Because of this bad assumption, the execution plan that was generated didn't work well when a table variable contained lots of rows. With SQL Server 2019, the database engine now defers the compilation of a query that uses a table variable until the table variable is used for the first time. By doing this, the database engine can identify cardinality estimates for table variables more accurately. With more accurate cardinality numbers, queries that have large numbers of rows in a table variable perform better. Those queries need to run against a database with its compatibility level set to 150 (version 15.x of SQL Server) to take advantage of this feature. To better understand how deferred compilation improves the performance of table variables that contain a large number of rows, I'll run through an example, but first I'll discuss what the problem is with table variables in versions of SQL Server prior to 15.x.

What is the Problem with Table Variables?

A table variable is defined using a DECLARE statement in a batch or stored procedure. Table variables don't have distribution statistics and don't trigger recompiles. Because of this, SQL Server cannot estimate the number of rows in a table variable the way it does for normal tables. When the optimizer compiles code that contains a table variable, prior to 15.x, it assumes the table variable is empty. This assumption causes the optimizer to compile the query using an expected row count of 1 as the cardinality estimate for the table variable. Because the optimizer thinks a table variable contains only a single row, it picks operators for the execution plan that work well with a small set of records, such as the NESTED LOOPS operator for a JOIN operation. The operators that work well on a small number of records do not always scale well when a table variable contains a large number of rows. Microsoft documented this problem and recommends that temp tables might be a better choice when a table variable would hold more than 100 rows. Additionally, Microsoft recommends that if you are joining a table variable with other tables, you consider using the query hint RECOMPILE to make sure the table variable gets the correct cardinality estimate. Without proper cardinality estimates, queries with large table variables are known to perform poorly.
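As a rough sketch of that pre-15.x workaround (the dbo.Orders table and its columns here are hypothetical, not from the article), the RECOMPILE hint can be attached to the join so the plan is built at execution time, after the table variable has been populated:

DECLARE @LargeOrders TABLE (OrderID INT PRIMARY KEY);

INSERT INTO @LargeOrders (OrderID)
SELECT OrderID
FROM dbo.Orders              -- hypothetical source table
WHERE OrderTotal > 1000;

-- Without a hint, the join below would be compiled with a row estimate of 1
-- for @LargeOrders. OPTION (RECOMPILE) recompiles the statement at execution
-- time, when the actual row count is known.
SELECT o.OrderID, o.CustomerID
FROM dbo.Orders AS o
INNER JOIN @LargeOrders AS lo ON lo.OrderID = o.OrderID
OPTION (RECOMPILE);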

With the introduction of version 15.x and the _Table Variable Deferred Compilation_ feature, the optimizer delays the compilation of a query that uses a table variable until just before the table variable is used for the first time. This allows the optimizer to know the correct cardinality estimate for the table variable. When the optimizer has an accurate cardinality estimate, it has a good chance of picking execution plan operators that perform well for the number of rows actually in the table variable. In order for the optimizer to defer the compilation, the database must have its compatibility level set to 150. To show how deferred compilation of table variables works, I'll walk through an example of this new feature in action.
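Switching the feature on is a matter of the database's compatibility level rather than any code change. A quick sketch of checking and setting it (substitute your own database name for the placeholder):

-- Check the current compatibility level
SELECT name, compatibility_level
FROM sys.databases
WHERE name = 'YourDatabase';   -- hypothetical database name

-- Enable SQL Server 2019 behavior, which includes Table Variable Deferred Compilation
ALTER DATABASE YourDatabase SET COMPATIBILITY_LEVEL = 150;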

#homepage #performance #performance monitor

Tyrique Littel

Static Code Analysis: What It Is? How to Use It?

Static code analysis refers to the technique of approximating the runtime behavior of a program. In other words, it is the process of predicting the output of a program without actually executing it.

Lately, however, the term "Static Code Analysis" is more commonly used to refer to one of the applications of this technique rather than the technique itself: program comprehension, that is, understanding the program and detecting issues in it (anything from syntax errors to type mismatches, performance hogs, likely bugs, security loopholes, etc.). This is the usage we'll be referring to throughout this post.

“The refinement of techniques for the prompt discovery of error serves as well as any other as a hallmark of what we mean by science.”

  • J. Robert Oppenheimer

Outline

We cover a lot of ground in this post. The aim is to build an understanding of static code analysis and to equip you with the basic theory and the right tools so that you can write analyzers on your own.

We start our journey by laying down the essential parts of the pipeline that a compiler follows to understand what a piece of code does. We learn where to tap into this pipeline to plug in our analyzers and extract meaningful information. In the latter half, we get our feet wet and write four such static analyzers, completely from scratch, in Python.

Note that although the ideas here are discussed in light of Python, static code analyzers across all programming languages are carved out along similar lines. We chose Python because of its easy-to-use ast module and the wide adoption of the language itself.
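As a quick, illustrative taste of that module (this tiny snippet is mine, not one of the four analyzers built later in the post), ast parses source text into a tree that an analyzer can then walk:

Python

import ast

# Parse a small snippet into an abstract syntax tree.
source = "def greet(name):\n    return 'Hello, ' + name\n"
tree = ast.parse(source)

# Walk the tree and count the function definitions it contains.
function_count = sum(isinstance(node, ast.FunctionDef) for node in ast.walk(tree))
print(function_count)  # prints: 1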

How does it all work?

Before a computer can finally “understand” and execute a piece of code, it goes through a series of complicated transformations:

Figure: Static analysis workflow

As you can see in the diagram (go ahead, zoom it!), the static analyzers feed on the output of these stages. To be able to better understand the static analysis techniques, let’s look at each of these steps in some more detail:

Scanning

The first thing that a compiler does when trying to understand a piece of code is to break it down into smaller chunks, also known as tokens. Tokens are akin to what words are in a language.

A token might consist of a single character, like (, a literal (such as integers or strings, e.g., 7 or 'Bob'), or a reserved keyword of the language (e.g., def in Python). Characters that do not contribute to the semantics of a program, like trailing whitespace and comments, are often discarded by the scanner.

Python provides the tokenize module in its standard library to let you play around with tokens:

Python

import io
import tokenize

code = b"color = input('Enter your favourite color: ')"

for token in tokenize.tokenize(io.BytesIO(code).readline):
    print(token)

Output

TokenInfo(type=62 (ENCODING),  string='utf-8')
TokenInfo(type=1  (NAME),      string='color')
TokenInfo(type=54 (OP),        string='=')
TokenInfo(type=1  (NAME),      string='input')
TokenInfo(type=54 (OP),        string='(')
TokenInfo(type=3  (STRING),    string="'Enter your favourite color: '")
TokenInfo(type=54 (OP),        string=')')
TokenInfo(type=4  (NEWLINE),   string='')
TokenInfo(type=0  (ENDMARKER), string='')

(Note that for the sake of readability, I’ve omitted a few columns from the result above — metadata like starting index, ending index, a copy of the line on which a token occurs, etc.)
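To hint at how an analyzer might consume this stream, here is a small check of my own (purely illustrative, not one of the analyzers built later in the post) that flags TODO comments. Note that, unlike many scanners, Python's tokenize module does keep comments around as COMMENT tokens:

Python

import io
import tokenize

code = b"x = compute()  # TODO: handle the error case\nprint(x)\n"

# Report every comment token that contains a TODO marker, with its line number.
for token in tokenize.tokenize(io.BytesIO(code).readline):
    if token.type == tokenize.COMMENT and "TODO" in token.string:
        line, _column = token.start
        print(f"line {line}: unfinished work -> {token.string}")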

#code quality #code review #static analysis #static code analysis #code analysis #static analysis tools #code review tips #static code analyzer #static code analysis tool #static analyzer


Samanta Moore

Guidelines for Java Code Reviews

Get a jump-start on your next code review session with this list.

Having another pair of eyes scan your code is always useful and helps you spot mistakes before you break production. You need not be an expert to review someone’s code. Some experience with the programming language and a review checklist should help you get started. We’ve put together a list of things you should keep in mind when you’re reviewing Java code. Read on!

1. Follow Java Code Conventions

2. Replace Imperative Code With Lambdas and Streams

3. Beware of the NullPointerException

4. Directly Assigning References From Client Code to a Field

5. Handle Exceptions With Care

#java #code quality #java tutorial #code analysis #code reviews #code review tips #code analysis tools #java tutorial for beginners #java code review

Houston Sipes

How to Find the Stinky Parts of Your Code (Part II)

There are more code smells. Let’s keep changing the aromas. We see several symptoms and situations that make us doubt the quality of our development. Let’s look at some possible solutions.

Most of these smells are just hints of something that might be wrong. They are not rigid rules.

This is part II. Part I can be found here.

Code Smell 06 - Too Clever Programmer

The code is difficult to read; it uses tricky names without semantics and sometimes leans on the language's accidental complexity.

Image Source: NeONBRAND on Unsplash

Problems

  • Readability
  • Maintainability
  • Code Quality
  • Premature Optimization

Solutions

  1. Refactor the code
  2. Use better names

Examples

  • Optimized loops

Exceptions

  • Optimized code for low-level operations.

Sample Code

Wrong

function primeFactors(n){
  var f = [],  i = 0, d = 2;

  for (i = 0; n >= 2; ) {
    if(n % d == 0){
      f[i++]=(d);
      n /= d;
    }
    else{
      d++;
    }
  }
  return f;
}

Right

function primeFactors(numberToFactor){
  var factors = [],
      divisor = 2,
      remainder = numberToFactor;

  while(remainder >= 2){
    if(remainder % divisor === 0){
      factors.push(divisor);
      remainder = remainder / divisor;
    }
    else{
      divisor++;
    }
  }
  return factors;
}

Detection

Automatic detection is possible in some languages. Watch for warnings related to complexity, bad names, post-increment variables, etc.
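For JavaScript code like the examples above, one way to surface such warnings (an illustrative configuration of mine, not something prescribed by this article) is to enable a few ESLint rules in an .eslintrc.json:

{
  "rules": {
    "complexity": ["warn", 5],
    "id-length": ["warn", { "min": 2 }],
    "no-plusplus": "warn"
  }
}

Here complexity caps cyclomatic complexity, id-length catches one-letter names like f and d, and no-plusplus flags post-increment use.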

#pixel-face #code-smells #clean-code #stinky-code-parts #refactor-legacy-code #refactoring #stinky-code #common-code-smells

Fannie Zemlak

Softagram - Making Code Reviews Humane

The story of Softagram is a long one and has many twists. Everything started in a small company a long time ago, in the area of static analysis tools development. After many phases, Softagram is focusing on helping developers get visual feedback on a code change: how the software design is evolving in the pull request under review.

Benefits of code change visualization and dependency checks

While it is trivial to write 20 KLOC apps without the help of tooling, things usually start getting complicated when the system grows beyond 100 KLOC.

The risk of the god-class anti-pattern, and the risk of mixing up responsibilities, increase exponentially as the software grows larger.

To help with that, software evolution can be tracked safely with explicit dependency change reports provided automatically for each pull request. Blocking a bad PR becomes easy, and having visual reports also has a democratizing effect on code review.

Example visualization

Basic building blocks of Softagram

  • Architectural analysis of the code, identifying how the delta impacts the code base. Language-specific analyzers are able to extract the essential internal and external dependency structures from each of the mainstream programming languages.

  • Checking for rule violations or anomalies in the delta, e.g. finding cyclical dependencies. Graph theory is a big help in finding unwanted or weird dependencies (a minimal sketch of such a cycle check follows this list).

  • Building visualization for humans. Complex structures such as software are not easy to represent without the help of graph visualization. Here comes the vital role of the change graph visualization technology developed within the last few years.
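As a rough illustration of that cycle check (the module names and the depth-first search below are my own sketch, not Softagram's implementation), a dependency graph can be scanned for a cycle like this:

# Illustrative only: detect a cyclic dependency in a module graph with DFS.
def find_cycle(graph):
    visiting, visited = set(), set()

    def dfs(node, path):
        visiting.add(node)
        for neighbor in graph.get(node, []):
            if neighbor in visiting:                 # back edge => cycle found
                return path + [neighbor]
            if neighbor not in visited:
                cycle = dfs(neighbor, path + [neighbor])
                if cycle:
                    return cycle
        visiting.discard(node)
        visited.add(node)
        return None

    for node in graph:
        if node not in visited:
            cycle = dfs(node, [node])
            if cycle:
                return cycle
    return None

# Hypothetical module dependencies: ui -> core -> db -> ui forms a cycle.
dependencies = {"ui": ["core"], "core": ["db"], "db": ["ui"], "utils": []}
print(find_cycle(dependencies))  # ['ui', 'core', 'db', 'ui']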

#automated-code-review #code-review-automation #code-reviews #devsecops #software-development #code-review #coding #good-company