Index Fragmentation - Understanding Page Count

Hi experts,
I'm collecting index fragmentation info, including the number of page counts.

  1. does the number of pages relate to the number of rows in the table?
  2. Will rebuilding indexes (or any technique which greatly reduced fragmentation) result in fewer pages?
  3. Does fewer pages result in faster reads, querying?

Generally, yes to all 3. Specifically:

  1. Yes, more rows will need more pages. It also depends on any fillfactor that you specify for non-clustered or clustered indexes. If it's less than 100% then you'll use more pages to store the same number of rows.

  2. Fragmentation just means that pages are physically out-of-order based on the key of the index. If you took a book apart and rearranged the pages randomly, that would be fragmented, but would take up the same amount of space. Rebuilding the index would be analogous to not only rearranging the pages in the correct order, but also closing up any blank space on the pages themselves. Reorganizing the index would just put the pages in the correct order, and is usually faster and less work than rebuilding.

  3. Most often yes. There are some cases where the storage layer, like a large SAN with a lot of cache, would perform better if the pages were spread out. But that's not a function of data fragmentation, but rather of placing data on multiple drives. Very large databases (at least 10 TB) would benefit from that, anything less than 1 TB would be better served by defragmenting and minimizing the number of pages used.

2 Likes

Assuming the table is 200 pages or more (for trivial table sizes, answers are less noticeable):

  1. Yes.
  2. Yes. After initial INSERT, if pages become fragmented, those pages will, by definition, be less full, and thus have some empty space.
  3. For a single, keyed lookup, no. Total pages and even fragmentation do not affect those. For any read that must read across multiple rows, yes, fewer pages are faster.

Above all that, the single biggest factor in overall performance is determining the best clustered index for every table. The next is to keep the statistics up to data. Index fragmentation / rebuilds are only third in priority for overall performance.

2 Likes

Thanks for your detailed reply, Scott. it's very informative.

Thanks for your detailed reply, Robert. it's very informative.