TABLESAMPLE Clause in SQL Server 2005

A few weeks ago I sent a post about OUTPUT Clause in SQL Server 2005.  One of other enhancements of T-SQL in SQL Server 2005 is TABLESAMPLE clause which is the topic of this post.

First of all what this new clause does?  Sometimes you want to deal with large number of rows in a table where you need to retrieve an approximate value of them.  For example you need to know the average of a column in a table approximately.  Doing a calculation on this column when it's not necessary to use all rows isn't a good job.  TABLESAMPLE clause helps you to select a specified number of rows randomly and perform your calculations on this column.

TABLESAMPLE is a clause that appears immediately after the name of a table in SELECT commands.  There are different algorithms to select a sample table that are listed in SQL:1999 but SQL Server 2005 only supports one of them which is SYSTEM and you have to specify this right after the TABLESAMPLE.  TABLESAMPLE also gets a parameter which is a number.  This number specifies the probability of having a specific SQL page in result.  For example 5 specifies that with probability of 5% a given SQL page will be used in our sample table.

As an example suppose that we have a simple table with 10000 rows.  This table has an identifier column and a Value column which keeps an integer number.  This table is filled with random numbers in Value column between 1 and 10000.  Now we want to calculate the average of these numbers approximately.  One solution is to select all values and calculate their average.  But the other solution is to take a part of this table and calculate its average.  Mathematically this is possible because our randomly generated numbers have a normal model and all final averages will be in a given interval (as you will see later).

Ok, knowing this first we use TABLESAMPLE clause to select a part of our table.

SELECT id, Value

FROM MyTable TABLESAMPLE SYSTEM(20 PERCENT)

 

Now we can calculate the average of Value column for our sample table.

SELECT AVG(Value) AS Average

FROM MyTable TABLESAMPLE SYSTEM(20 PERCENT)

Running this query twice we get following results.

Now playing: Paco De Lucia - Quizas, Quizas, Quizas

[advertisement] Axosoft OnTime 2008 is four developer tools in one: bug tracking, project wiki, feature management, and help desk. It manages your development process so developers can focus on coding. Installed or Hosted – Free Single-user license -- Free 30-day team trial.

3 Comments : 04.25.07

Feedbacks

 avatar
#1
DotNetKicks.com
04.25.2007 @ 10:42 PM
You've been kicked (a good thing) - Trackback from DotNetKicks.com
 avatar
#2
Keyvan Nayyeri
05.02.2007 @ 7:40 PM
For me .NET ReportViewer control (Windows Forms control or ASP.NET server control) has been a useless
 avatar
#3
vadivu
03.26.2008 @ 7:04 AM

Good

Leave a Comment