Understanding the Exchange Information Store

by Rodney Buike [Published on 8 Dec. 2005 / Last Updated on 8 Dec. 2005]

We would like to welcome Rodney Buike to our team of authors as he presents his first article to MSExchange.org readers. The Information Store is the heart and soul of Exchange Server 2000 and 2003. Understanding the fundamentals of the Information Store is important for anyone managing an Exchange server.

If you don’t believe me, stop the Microsoft Exchange Information Store service and count the seconds before your phone starts ringing!

The Information Store is made up of a number of components. Figure 1 shows a graphical layout of a typical Exchange server.


Figure 1

Exchange 2000 and 2003 use the same Information Store but there are some differences depending on the version. Table 1 describes these differences.

| Store Features | Exchange 2000* or Exchange 2003 Standard pre-SP2 | Exchange 2003 Standard with SP2 | Exchange 2000 or 2003 Enterprise |
|---|---|---|---|
| # of Storage Groups | 1 + 1 RSG** | 1 + 1 RSG** | 4 + 1 RSG** |
| # of Stores | 1 Mailbox Store and 1 Public Folder Store per Storage Group | 1 Mailbox Store and 1 Public Folder Store per Storage Group | 5 per Storage Group |
| Store Size Limit | 16GB per Store | 75GB per Store | 16TB per Store |

Table 1

* Any Exchange 2000 service pack level
** RSG = Recovery Storage Group
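
The size limits in Table 1 can be captured in a small lookup, for example to flag a store that is approaching its limit. The Python sketch below is purely illustrative: the edition keys, the function name and the 90% warning threshold are assumptions made for this example and are not part of any Exchange tool.

```python
# Store size limits from Table 1, in GB (16TB expressed as 16 * 1024 GB).
# The dictionary keys are hypothetical labels chosen for this sketch.
STORE_SIZE_LIMIT_GB = {
    "standard_pre_sp2": 16,     # Exchange 2000 Standard, or 2003 Standard before SP2
    "standard_2003_sp2": 75,    # Exchange 2003 Standard with SP2
    "enterprise": 16 * 1024,    # Exchange 2000 or 2003 Enterprise
}


def store_status(edition, store_size_gb, warn_ratio=0.9):
    """Return 'ok', 'warning' or 'over_limit' for a store of the given size."""
    limit = STORE_SIZE_LIMIT_GB[edition]
    if store_size_gb >= limit:
        return "over_limit"
    if store_size_gb >= limit * warn_ratio:
        return "warning"
    return "ok"


# A 15GB store is close to the 16GB limit but comfortable under the 75GB limit.
print(store_status("standard_pre_sp2", 15))
print(store_status("standard_2003_sp2", 15))
```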

Storage Groups and Databases

A Storage Group contains one or more Mailbox and Public Folder Stores, depending on the version and the needs of the organization. Mailbox Stores contain the user and system mailboxes, and the Public Folder Store contains the Public Folders and their contents. For most organizations, a single Storage Group with one Mailbox Store and one Public Folder Store is more than enough. However, as the database grows in size, splitting one large database into multiple smaller databases can ease the management of backups.

A default Exchange installation will create a Storage Group that contains a Mailbox Store and a Public Folder Store.  Each Mailbox Store is made up of a database set that contains two files:

  • Priv1.edb is a rich-text database file that contains the messages, text attachments and headers for the users' e-mail messages.
  • Priv1.stm is a streaming file that contains multi-media data that is formatted as MIME data.

Similarly, each Public Folder Store is made up of a database set that also contains two files:

  • Pub1.edb is a rich-text database file that contains the messages, text attachments and headers for files stored in the Public Folder tree.
  • Pub1.stm is a streaming file that contains multi-media data that is formatted as MIME data.

For every EDB file there will be an associated STM file.

Exchange utilizes what Microsoft terms a single-instance message store. This single-instance message store works on a per database basis. What does this mean? If an e-mail message is sent to multiple mailboxes that are all in the same database, the message is stored once and each mailbox has a pointer to the message. The transaction is also logged in the transaction logs for the Storage Group that contains the database. However, if the e-mail message is sent to multiple mailboxes that are located in different databases, the message is copied to each database and written to the transaction logs for each Storage Group that contains the database with a copy of the message. 

For example, if I send 10 users a 1MB e-mail message and all the mailboxes are located in the same database, one copy of the message is written to the database and each mailbox points to it, consuming 1MB of disk space in total. If the 10 recipients are spread across two different databases, each database gets its own copy of the message, consuming 2MB of disk space. Either way, this is a much more efficient use of space than the alternative of ten 1MB copies using up 10MB of disk space.
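
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the per-database rule described above; the function names and the database labels "DB1" and "DB2" are made up for the example.

```python
def single_instance_usage_mb(message_mb, recipient_databases):
    """Disk space used when each database stores a single copy of the message.

    recipient_databases lists, for each recipient, the database that
    holds that recipient's mailbox.
    """
    distinct_databases = set(recipient_databases)
    return message_mb * len(distinct_databases)


def per_mailbox_usage_mb(message_mb, recipient_databases):
    """Disk space used if every mailbox kept its own copy of the message."""
    return message_mb * len(recipient_databases)


same_db = ["DB1"] * 10                 # all 10 mailboxes in one database
two_dbs = ["DB1"] * 5 + ["DB2"] * 5    # mailboxes split across two databases

print(single_instance_usage_mb(1, same_db))   # 1
print(single_instance_usage_mb(1, two_dbs))   # 2
print(per_mailbox_usage_mb(1, two_dbs))       # 10
```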

Aside from the database files, Storage Groups also contain system files and transaction logs. There are two system files: Tmp.edb, which is a temporary database where transactions are processed, and E##.chk, which maintains the checkpoint for the Storage Group. The ## represents the Storage Group number, so the checkpoint file for the First Storage Group is called E00.chk. This checkpoint file keeps track of the last committed transaction. If you are ever forced to perform a recovery, this file contains the point at which the replaying of transaction logs starts.

Transaction Logs

The transaction logs are some of the most crucial files when it comes to a working Exchange server. Microsoft Exchange Server uses transaction logs as a disaster recovery method that can bring an Exchange database back to a consistent state after a crash. Before anything is written to the EDB file, it is first written to a transaction log. Once the transaction has been logged, the data is written to the database when convenient.

Until a transaction is committed to the database, it is available from memory and recorded in the transaction logs. This is why you will see store.exe use up to 1GB of memory after the Exchange server has been in use for a while. When an Exchange server is brought back up after a crash, the checkpoint file identifies the last committed transaction in the transaction logs, which are then replayed from that point on. This technique is known as write-ahead logging.
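
The write-ahead pattern can be illustrated with a toy model. The Python sketch below is not how the Exchange database engine is implemented; the ToyStore class and its methods are hypothetical and exist only to show the order of operations: log first, write to the database later, and replay from the checkpoint after a crash.

```python
class ToyStore:
    """A toy model of write-ahead logging with a checkpoint."""

    def __init__(self):
        self.log = []         # transaction log: every change is appended here first
        self.database = {}    # contents of the database file on disk
        self.checkpoint = 0   # index of the first log entry not yet in the database

    def write(self, key, value):
        """Record a transaction in the log; the database is updated later."""
        self.log.append((key, value))

    def flush(self, count=None):
        """Apply logged transactions to the database and advance the checkpoint."""
        if count is None:
            end = len(self.log)
        else:
            end = min(len(self.log), self.checkpoint + count)
        for key, value in self.log[self.checkpoint:end]:
            self.database[key] = value
        self.checkpoint = end

    def recover(self):
        """After a crash, replay the log from the checkpoint onward."""
        self.flush()


store = ToyStore()
store.write("msg1", "hello")
store.write("msg2", "status report")
store.write("msg3", "lunch?")
store.flush(count=1)   # only msg1 reaches the database before the "crash"
store.recover()        # msg2 and msg3 are replayed from the log
print(sorted(store.database))
```

Because every change reaches the log before the database, nothing that was logged is lost even though only one of the three messages had been written to the database when the simulated crash occurred.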

There are four types of transaction logs:

  • E##.log is the current transaction log for the Storage Group. Once the log file reaches 5MB in size it is renamed E#######.log and a new E##.log is created. As with the checkpoint file, the ## represents the Storage Group identifier. While the new E##.log file is being created you will see a file called Edbtmp.log, which is a template for Exchange server log files.
  • E#######.log are the secondary transaction logs. They are numbered sequentially in hexadecimal, starting with E0000001.log, and are 5MB in size.
  • Res1.log is a reserved log file that is limited to 5MB in size. When the disk has run out of space, transactions are written to this log file while you work on clearing up space on the disk.
  • Res2.log is another reserved log with the same function as Res1.log.
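
The naming scheme described above can be sketched in Python. The helper names are hypothetical, and the sketch assumes a two-digit Storage Group number followed by a five-digit hexadecimal sequence number, which matches the E00.chk and E0000001.log examples given in this article.

```python
import math

LOG_FILE_MB = 5  # size of each transaction log file, as described above


def checkpoint_file_name(storage_group):
    """Checkpoint file for a Storage Group, e.g. E00.chk for the first one."""
    return f"E{storage_group:02d}.chk"


def current_log_name(storage_group):
    """Current transaction log for a Storage Group, e.g. E00.log."""
    return f"E{storage_group:02d}.log"


def secondary_log_name(storage_group, sequence):
    """Secondary log name: Storage Group prefix plus a hexadecimal sequence number."""
    return f"E{storage_group:02d}{sequence:05X}.log"


def logs_generated(data_mb):
    """Number of 5MB log files needed to record data_mb of transactions."""
    return math.ceil(data_mb / LOG_FILE_MB)


print(checkpoint_file_name(0))      # E00.chk
print(secondary_log_name(0, 1))     # E0000001.log
print(secondary_log_name(0, 26))    # E000001A.log (26 is 1A in hexadecimal)
print(logs_generated(1024))         # 205
```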

Transaction logs can grow at a fast pace as each and every transaction is recorded to the log files. There are two ways to manage this growth. The recommended method is a regular full backup of the Information Store: upon successful backup, the log files whose transactions have been committed to the database are purged.

The other method is to enable circular logging. Circular logging is disabled by default because it only allows you to recover Exchange data up to the last full backup. With circular logging enabled, the transaction logs are purged as the transactions are committed to the database. If you have to restore from backup, the transaction logs will not be replayed and all transactions since that backup will be lost.
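
The difference at restore time can be shown with a toy example. The Python sketch below is a simplified illustration, not a model of the actual restore process; the restore function and the sample data are hypothetical.

```python
def restore(backup, logs_since_backup, circular_logging):
    """Return the database contents after restoring from the last full backup."""
    database = dict(backup)
    if not circular_logging:
        # The logs written since the backup are still on disk,
        # so they can be replayed on top of the restored database.
        for key, value in logs_since_backup:
            database[key] = value
    # With circular logging the logs were purged as they were committed,
    # so nothing is left to replay and the restore stops at the backup.
    return database


backup = {"msg1": "received before the backup"}
logs = [("msg2", "received after the backup")]

print(sorted(restore(backup, logs, circular_logging=False)))   # ['msg1', 'msg2']
print(sorted(restore(backup, logs, circular_logging=True)))    # ['msg1']
```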

The two reserved log files, Res1.log and Res2.log, are used to “save” 10MB of space on the disk in case there is no more free space. When the disk runs out of free space, the transactions are logged to the reserve logs as the Information Store shuts down gracefully. You will not be able to restart the Information Store service until you clear up some disk space.

Best Practices

As with anything there are some best practices you can follow in order to maintain a healthy Information Store.

  • Locating the Exchange program files, SMTP queues, transaction logs and database files on separate disk arrays is ideal. If budget constraints will not allow for this, locating the program files, transaction logs and SMTP queues on separate partitions on one disk array and the database files on a separate disk array will still offer some performance increases at a reduced cost.
  • All files should be located on redundant disk arrays. RAID 1 is the minimum recommended level, with RAID 5 offering an increase in performance and RAID 10 offering the best performance but at an increased cost.
  • Perform regular, full backups of the Information Store to commit the transactions and flush the log files. This can be done with the native Windows backup tool, NTBackup, or a third party solution. Even if you live on the wild side and do not keep backups of your data, it is important to do this to prevent the disk from filling up with log files and running out of space.
  • Do not use circular logging. As mentioned, circular logging will not allow you to replay the transaction logs, limiting you to recovering only the data from the latest full backup set.
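
To get a feel for how quickly unpurged log files can fill a disk, a rough estimate can be worked out from the 5MB log file size. The Python sketch below is a back-of-the-envelope calculation only; the function name and the sample figures (20GB free, 400 logs per day) are assumptions chosen for illustration.

```python
LOG_FILE_MB = 5  # size of each transaction log file


def days_until_log_disk_full(free_mb, logs_per_day):
    """Estimate how many days of logging fit in the free space if logs are never purged."""
    if logs_per_day <= 0:
        raise ValueError("logs_per_day must be positive")
    return free_mb / (logs_per_day * LOG_FILE_MB)


# 20GB of free space and 400 log files (2000MB) per day
# leave roughly ten days before the disk is full.
print(days_until_log_disk_full(20 * 1024, 400))
```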

The Information Store is the most critical component of Exchange Server 2000/2003, and a proper understanding of its structure is important for anyone tasked with managing and maintaining an Exchange server.

For more information see:

Backup and Restore Exchange with NTBackup
http://www.msexchange.org/tutorials/Exchange-2003-Backup-Restore-NTBACKUP.html

Moving Exchange Database and Log Files
http://msexchange.org/tutorials/MF001.html

Moving SMTP Queues
http://www.msexchange.org/tutorials/SMTP_Virtual_Server_Uncovered.html
