Exchange 2007 Online Maintenance Database Scanning (Part 1)

by Neil Hobson [Published on 16 Sept. 2008 / Last Updated on 16 Sept. 2008]

A look at the online maintenance database scanning features introduced in Exchange 2007 SP1.

If you would like to be notified when Neil Hobson releases the next part of this article series please sign up to the MSExchange.org Real time article update newsletter.

Introduction

Online maintenance database scanning comprises two main processes: checksumming and page zeroing of the database pages.  Both processes are available in the Release To Manufacturing (RTM) version of Exchange 2007 and remain available in Service Pack 1 (SP1), but SP1 makes several changes to them, which I will cover in the two parts of this article.  As well as describing the changes, we’ll look at how you enable these features and how you monitor their progress via the event log entries that they create.  There are also some performance implications that you will need to be aware of.

In part one of this article we’ll look at the database checksumming process whilst in part two we’ll look at the database page zeroing process.  Let’s get going and check out the checksumming.

Database Checksumming

Checksumming checks the integrity of a database by reading its pages and verifying the checksum stored on each one.  If you’ve been using previous versions of Exchange, you may well remember that two key things occur during an online streaming backup of the databases:

  • The database transaction logs are truncated.
  • Integrity checks are performed.

One of the classic errors that has put fear into Exchange administrators over the years is the dreaded -1018 error, which the integrity checks performed during a streaming backup write to the application event log.  Streaming backups, and the integrity checks that go with them, still work in the Exchange 2007 Release To Manufacturing (RTM) version.  However, with the growing use of Volume Shadow Copy Service (VSS) backups in Exchange 2007, coupled with the de-emphasizing of the streaming backup interface, getting the databases checksummed became an issue.

Why is this?  The answer lies in Exchange 2007 high availability technologies such as Clustered Continuous Replication (CCR).  Many organizations have implemented CCR as part of a high availability strategy, and it provides data redundancy in the form of passive copies of the databases.  One of the key advantages of CCR is that VSS backups can be taken from the passive copy of the database.  This is good for performance, since the backup doesn’t touch the active copy and therefore doesn’t directly affect the users connected to it.  The problem, though, is that in such a scenario the active copy of the database never has the integrity checks run against it.

To overcome this, organizations had to either take the databases offline and manually run the ESEUTIL utility, or move the Clustered Mailbox Server (CMS) between cluster nodes on a regular basis so that the backup was taken from different nodes.  Clearly, neither of these is an ideal solution.

To solve this particular problem, Microsoft made the checksumming process available during online maintenance in Exchange 2007 SP1.  In case you didn’t know, online maintenance performs a series of important tasks to make sure that your databases are operating correctly and efficiently.  These tasks include clearing items from the deleted items dumpster, cleaning up deleted mailboxes and performing an online defragmentation.  We’ll cover changing the online maintenance time window in just a moment.

Enabling Database Checksumming

First, though, you need to know how to enable the online database checksumming process, because it’s not enabled by default; Microsoft has made this an opt-in feature.  Enabling it requires a change to the registry.  The registry value listed below will not exist by default, so you must create it.  The change to make is:

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\ParametersSystem
Name: Online Maintenance Checksum
Type: DWORD
Value: 1

A screenshot of this registry value is shown in Figure 1.  Once you’ve made the registry change, you will need to restart the Microsoft Exchange Information Store service for the change to take effect.


Figure 1: Checksumming Registry Key
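
The same change can be scripted from the Exchange Management Shell.  What follows is a minimal sketch rather than a supported procedure.  It assumes the key path and value name given above, a shell running with administrative rights, and that the Information Store service uses its usual short name, MSExchangeIS.  Restarting the service dismounts the databases on that server, so it is best done in a maintenance window.

```powershell
# Registry location and value name as listed in this article.
$key  = 'HKLM:\SYSTEM\CurrentControlSet\Services\MSExchangeIS\ParametersSystem'
$name = 'Online Maintenance Checksum'

# Create the key only if it is missing (assumption: it may already exist).
if (-not (Test-Path $key)) {
    New-Item -Path $key | Out-Null
}

# Create the DWORD value; -Force overwrites it if it is already present.
New-ItemProperty -Path $key -Name $name -PropertyType DWord -Value 1 -Force | Out-Null

# The change takes effect only after the Information Store service restarts.
Restart-Service -Name MSExchangeIS
```

To switch the feature off again, the same sketch can be reused with a value of 0, followed by another service restart.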

With this change in place, the checksumming feature will be invoked the next time that online maintenance runs.  You may remember that online database maintenance is scheduled via a setting on the properties of a mailbox database in the Exchange Management Console.  To find this setting, follow these steps:

  1. Run the Exchange Management Console.
  2. Navigate to the Server Configuration node in the Console tree, then locate and select the Mailbox node beneath it.
  3. Make sure the correct Exchange server is highlighted in the Result pane and you’ll then see the storage groups and databases on this server within the Work pane.
  4. Right-click the relevant database and choose Properties from the context menu.
  5. From the resulting mailbox database property window, notice the Maintenance schedule: option on the General tab as shown in Figure 2.  This allows you to configure when online database maintenance will be run.  You have a choice of several pre-defined time periods or the option of a custom schedule.


Figure 2: Online Maintenance Schedule

Of course you can also use the Exchange Management Shell to set the online maintenance window.  The cmdlet to use is Set-MailboxDatabase with the -MaintenanceSchedule parameter.  If you’re not quite sure how to format a parameter value in the Exchange Management Shell, one useful way to find out is to run the corresponding Get cmdlet, Get-MailboxDatabase in this example, and pipe the results to the Format-List cmdlet (fl for short), asking for just the property you need.  For example, you could use this command:

Get-MailboxDatabase | fl MaintenanceSchedule

An example of how this looks is shown in Figure 3.


Figure 3: Get-MailboxDatabase cmdlet
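
Setting the window from the shell might then look like the sketch below.  The database identity is hypothetical, and the schedule strings assume the day-and-time format that Get-MailboxDatabase displays; compare them with the output on your own server before relying on them.

```powershell
# Hypothetical identity: replace with a name returned by Get-MailboxDatabase.
$database = 'EXCH01\First Storage Group\Mailbox Database'

# Assumed format: '<Day>.<h:mm AM/PM>-<Day>.<h:mm AM/PM>', one string per window.
Set-MailboxDatabase -Identity $database -MaintenanceSchedule 'Sat.1:00 AM-Sat.5:00 AM', 'Sun.1:00 AM-Sun.5:00 AM'

# Read the value back to confirm what was stored.
Get-MailboxDatabase -Identity $database | Format-List Name, MaintenanceSchedule
```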

Event Log Entries

There are new event log entries that will help you examine the progress of the database checksumming feature.  First, when the process starts, the following event is logged:

Source: ESE
Category: Online Defragmentation
Event ID: 717
Description: Online maintenance is starting the database checksumming background task for database <database name>.

An example of this event is shown in Figure 4.


Figure 4: Event 717

Once the process has completed, the following event is logged:

Source: ESE
Category: Online Defragmentation
Event ID: 721
Description: Online maintenance database checksumming background task is completed for database <database name>.  This pass started on <date> and ran for a total of <n> seconds, requiring <n> invocations over <n> days.

The description field then proceeds to give you an operation summary which includes key information such as the number of pages seen and bad checksums encountered.  An example of this event is shown in Figure 5.  Obviously this is a small test database that I’m using since the process only took 12 seconds to complete; production databases will take much longer of course.


Figure 5: Event 721
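
If you want to track how long each pass takes, the timing figures can be pulled out of the event 721 description with a regular expression.  The sketch below is illustrative only: the function name is made up, the pattern follows the description template quoted above, and the sample string is invented rather than copied from a real log, so check the exact wording on your own server.

```powershell
# Extracts the timing figures from an event 721 description.
# The pattern follows the description template quoted in this article;
# the wording on a real server may differ, so treat it as an assumption.
function Get-ChecksumPassSummary {
    param([string]$Description)

    $pattern = 'ran for a total of (\d+) seconds, requiring (\d+) invocations over (\d+) days'
    if ($Description -match $pattern) {
        # $Matches is filled in by the -match operator.
        return @{
            Seconds     = [int]$Matches[1]
            Invocations = [int]$Matches[2]
            Days        = [int]$Matches[3]
        }
    }
    return $null
}

# Made-up sample text in the shape of the template, not a real log entry.
$sample  = 'This pass started on 9/16/2008 and ran for a total of 12 seconds, requiring 1 invocations over 1 days.'
$summary = Get-ChecksumPassSummary -Description $sample
'{0} seconds over {1} invocation(s)' -f $summary.Seconds, $summary.Invocations
```

The function returns $null when the text does not match, which makes it safe to run against descriptions from other events.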

The other event relating to database checksumming that you may see is event 723 which occurs if the process encounters an error.
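
Rather than browsing Event Viewer, you can list these three events from the shell.  This is a rough sketch that assumes the events are written to the Application log with a source of ESE, as shown above; the limit of 2000 entries is arbitrary and may need raising on a busy server.

```powershell
# Event IDs described in this article: 717 (start), 721 (complete), 723 (error).
$ids = 717, 721, 723

# Scan the most recent Application log entries; 2000 is an arbitrary limit.
Get-EventLog -LogName Application -Newest 2000 |
    Where-Object { $_.Source -eq 'ESE' -and $ids -contains $_.EventID } |
    Sort-Object TimeGenerated |
    Format-Table TimeGenerated, EventID, Message -AutoSize -Wrap
```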

Performance Impact

Before you dive in and enable the database checksumming feature, it’s good practice to understand the likely performance impact.  Fortunately, there doesn’t appear to be any large impact on users.  Information that I saw presented by Microsoft at TechEd IT Forum in November 2007 showed that database checksumming caused a small increase in the % Processor Time counter, whilst the RPC Averaged Latency counter increased by only approximately 10ms.

Having said that, it is still prudent to monitor the process should you implement it within your environment.  To allow you to do this, Microsoft provides additional performance counters that tell you how many database pages per second are being read:

  • MSExchange Database\Online Maintenance (DB Scan) Pages Read/sec
  • MSExchange Database ==> Instances\Online Maintenance (DB Scan) Pages Read/sec

However, to see these counters in the Performance tool you need to enable the extended Extensible Storage Engine (ESE) performance counters.  This requires another registry modification, as follows:

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ESE\Performance
Name: Show Advanced Counters
Type: DWORD
Value: 1

This registry configuration is shown in Figure 6.


Figure 6: Advanced Counter Registry Configuration
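
This value can also be set from the shell.  The sketch below assumes the key path and value name given above and that the Performance key already exists on a server with Exchange installed; it stops with an error rather than creating the key if that assumption is wrong.

```powershell
# Registry location as listed in this article.
$key = 'HKLM:\SYSTEM\CurrentControlSet\Services\ESE\Performance'

# Assumption: the key is present on an Exchange server; stop if it is not.
if (-not (Test-Path $key)) { throw "Registry key not found: $key" }

# Create the DWORD value; -Force overwrites it if it is already present.
New-ItemProperty -Path $key -Name 'Show Advanced Counters' -PropertyType DWord -Value 1 -Force | Out-Null

# Read the value back to confirm the change.
Get-ItemProperty -Path $key | Format-List 'Show Advanced Counters'
```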

Once you’ve set this registry value, there is no need to restart the server or any particular service; simply restart the Performance tool and you will see the additional counters as shown in Figure 7.


Figure 7: Additional Performance Counters
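
The counters can also be sampled from the shell through the .NET System.Diagnostics classes.  The sketch below is an assumption-laden illustration: it searches for the Exchange database counter categories with a wildcard instead of hard-coding their names, and it prints nothing if the advanced counters are not enabled or the categories have no instances.

```powershell
# Find the Exchange database counter categories.  The wildcard is an assumption
# that tolerates differences in how the object name is spaced.
$categories = @([System.Diagnostics.PerformanceCounterCategory]::GetCategories() |
    Where-Object { $_.CategoryName -like 'MSExchange*Database*' })

foreach ($category in $categories) {
    # These categories are expected to be multi-instance, so read counters per instance.
    foreach ($instance in $category.GetInstanceNames()) {
        $category.GetCounters($instance) |
            Where-Object { $_.CounterName -like '*DB Scan*' } |
            ForEach-Object {
                # A rate counter needs two samples; the first call only primes it.
                $null = $_.NextValue()
                Start-Sleep -Seconds 1
                '{0}({1})\{2} = {3}' -f $_.CategoryName, $_.InstanceName, $_.CounterName, $_.NextValue()
            }
    }
}
```

Expect a value of zero outside the online maintenance window, since the scan only runs while maintenance is active.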

Summary

Database checksumming is an important process to run within an Exchange 2007 environment, since it alerts you to bad checksums in the active copy of your database that a backup of the passive copy would never find.  This part of the article has covered how to enable the process and what to look for once it is enabled.  In part two, we’ll look at the other process that you can enable if you require it, namely database page zeroing.
