Geek City: What's Worse Than a Table Scan?

[This post has been updated from a post on my old blog. The original is here.]

I have frequently heard SQL Server developers and DBAs gasp when a query plan indicates that SQL Server is performing a table scan, thinking that is the worst thing that could ever happen to a query. The truth is, it's far from the worst thing, and in addition, not all table scans are created equal.

One thing that is far worse than a table scan is a query that uses a nonclustered index and then looks up every single row in the table! Although that is a horrible thing to behold, it is not the topic of this post.

Today, I'm going to show you that two different table scans on the same data in a heap can give very different performance.

The behavior has to do with a technique that SQL Server uses when a row in a heap is increased in size so that it no longer fits on its original page. This usually occurs when a variable-length column is updated to take more space. If SQL Server just moved the row to another page, any nonclustered indexes would have to be updated to indicate the new page address. (Remember, if the underlying table is a heap, nonclustered indexes point to the data row using an actual address.) Since there can be up to 999 nonclustered indexes on a single table, that could potentially be a LOT of work. So instead, when a row in a heap has to move, SQL Server leaves behind a forwarding pointer in place of the row that has moved. The nonclustered indexes continue to point to the old location, and SQL Server just needs one more page lookup to find the new location. If just a few rows need to be looked up, this expense is minimal and more than made up for by the savings of not having to update all the nonclustered indexes every time a row moves.

However, what happens when there are LOTS of forwarding pointers?

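The metadata function sys.dm_db_index_physical_stats has a column, forwarded_record_count, that indicates how many forwarded records are in a table. Tables with clustered indexes never have forwarded records; the column is only populated for the in-row data of a heap and is reported as NULL otherwise.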
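To get a quick picture of which heaps in a database have forwarded records, a query along these lines can help. It is only a sketch: it assumes you run it in the database you care about, and it uses DETAILED mode because forwarded_record_count is not populated in the default LIMITED mode. DETAILED mode reads every page, so it can be slow on a large database.

```sql
-- Sketch: list heaps in the current database that contain forwarded records.
-- The column aliases are just labels chosen for this example.
SELECT OBJECT_SCHEMA_NAME(ps.object_id) AS table_schema,
       OBJECT_NAME(ps.object_id)        AS table_name,
       ps.page_count,
       ps.forwarded_record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
WHERE ps.index_type_desc = 'HEAP'
  AND ps.alloc_unit_type_desc = 'IN_ROW_DATA'
  AND ps.forwarded_record_count > 0
ORDER BY ps.forwarded_record_count DESC;
```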

Let's look at an example. I'll make a copy of the Person.Address table in the AdventureWorks2014 database, and add a new varchar column to it. Initially, the column holds just a short string, so it takes very little space.

USE AdventureWorks2014;
GO
IF EXISTS (SELECT * FROM sys.tables
            WHERE name = 'Address2' AND schema_id =1)
        DROP TABLE dbo.Address2;
GO
SELECT *, convert (varchar(500), 'comments') AS comments
   INTO Address2
FROM Person.Address;

Now let's look at the pages and forwarded records using sys.dm_db_index_physical_stats:

SELECT index_type_desc, page_count, avg_page_space_used_in_percent,
       forwarded_record_count
FROM sys.dm_db_index_physical_stats(db_id('AdventureWorks2014'), 
       object_id('Address2'),null, null, 'detailed');

Here are my results:


Note that the pages are almost full (over 98%) and there are no forwarded records.

Now I'll increase the length of the new column in every row and check the physical stats again:

UPDATE Address2
SET comments = replicate('a', 500);
SELECT index_type_desc, page_count, avg_page_space_used_in_percent,
       forwarded_record_count
FROM sys.dm_db_index_physical_stats(db_id('AdventureWorks2014'), 
       object_id('Address2'),null, null, 'detailed');

Here are my results from the same query showing the physical stats:


The output shows me I have 1670 pages in the table and 15236 forwarded records.

Let's see what happens when we read every row in the table:

SET STATISTICS IO ON;
SELECT * FROM Address2;


The logical I/O value returned by STATISTICS IO tells us that instead of just reading through every page, for a total of 1670 reads, SQL Server jumps out of sequence and follows the forwarding pointer for every forwarded record. So the number of logical reads is the sum of the number of pages plus the number of forwarded records:

1670 + 15236 = 16906
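The same arithmetic can be pulled straight from the DMV, which is handy for estimating the cost of a full scan before running it. This is a minimal sketch that assumes the Address2 heap from this example lives in the dbo schema; the expected_scan_reads alias is just a label for this illustration, and the real STATISTICS IO figure on your system may differ slightly.

```sql
-- Sketch: estimate logical reads for a full scan of the heap as
-- data pages plus one extra read per forwarded record.
USE AdventureWorks2014;
GO
SELECT page_count,
       forwarded_record_count,
       page_count + forwarded_record_count AS expected_scan_reads
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Address2'),
       0, NULL, 'DETAILED')          -- index_id 0 = the heap itself
WHERE alloc_unit_type_desc = 'IN_ROW_DATA';
```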

I was discussing this behavior with my friend and colleague Tibor Karaszi, and he proposed an explanation for it. He related it to the behavior that Itzik Ben-Gan has described for why SQL Server will always follow page pointers when scanning a clustered index if consistent reads are desired. The alternative would be to just read the pages in disk order, or page number order, which can be determined by examining the IAM structures for the object. For clustered tables, we need to follow the page pointers instead of the IAMs to make sure that if a row is moved due to an update while the scan is occurring, we don't read the same row twice (if the row is moved to a higher page number) or skip the row altogether (if the row is moved to a lower page number).

But what about a heap? Are there potential problems scanning a heap while updates are occurring? Could we potentially read the same row twice or skip a row, since there is no 'ordered list' to read? Tibor suggested the following:

I believe that forwarding pointers take care of just that. Because of forwarding pointers, the "root" location for a row is stable. So, even if the row moves during a scan, the "root location"(forwarding stub) is at the same position. We have concluded that the scan uses the forwarding pointers when reading the rows. This means that a scan is not sensitive to row movements during the scan. It cannot "skip" rows that are there, or read the same row twice.

So a few forwarding pointers are not a bad thing, but having lots of them can increase the work done during scans or partial scans by a considerable amount.

So how do you get rid of forwarding pointers? There are 4 ways:

1. If the row is updated, so that its size decreases, AND if there is still room on the page where the row came from, it will be moved back. This is not dependable, so it isn't really recommended as a solution.  When I updated my Address2 table, most of the forwarded records were moved, but not all:

UPDATE Address2
SET comments = '';
SELECT index_type_desc, page_count, avg_page_space_used_in_percent,
       forwarded_record_count
FROM sys.dm_db_index_physical_stats(db_id('AdventureWorks2014'), 
       object_id('Address2'),null, null, 'detailed');

My results showed that I am still left with 175 forwarded records. This is a great improvement over 15236, but it's still a lot. In my situation, I got such a high number of rows being moved back because I didn't do any further inserts into the original table so all the empty space from the original rows was still available. 

2. Forwarded records will be cleaned up when you shrink the data file. This is definitely NOT recommended as a solution; I am only mentioning it for completeness. SQL Server does so much moving of data and updating nonclustered index pointers when shrinking a file, that updating the forwarded records is not very much extra work at all.

3. Since forwarded records only exist in heaps, you can make the table not a heap. Build a clustered index, and all the forwarded records will go away. Some people will say this is the best solution. 
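For the Address2 example, that could look something like the following. It is only a sketch: it assumes AddressID (copied over from Person.Address) is an acceptable clustering key, and the index name is made up for this illustration.

```sql
-- Sketch: turn the heap into a clustered table, which removes all forwarded records.
-- CIX_Address2_AddressID is a hypothetical name; pick a key that suits your workload.
CREATE CLUSTERED INDEX CIX_Address2_AddressID
    ON dbo.Address2 (AddressID);
```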

4. If you really don't want the clustered index, you can simply ALTER the table with the REBUILD option. 
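For the Address2 example, the statement would look something like this, assuming the table is in the dbo schema. ALTER TABLE ... REBUILD is available in SQL Server 2008 and later.

```sql
-- Rebuild the heap: rows are written to new pages, so no forwarding stubs remain.
-- Any nonclustered indexes on the table are rebuilt too, because the row addresses change.
ALTER TABLE dbo.Address2 REBUILD;
```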


After the REBUILD, my physical stats look like this:



All the forwarded records are gone as the data has been moved to all new pages.

Hopefully, this information will be useful to you.