SQL Server 2008 CDC IO测试

前两天测试了一下SQL Server 2008的CDC功能,今天又花了一些时间对CDC功能对IO的影响进行了一些评估。

整个评估的思路是这样的:

  1. 创建两个数据库
  2. 在两个数据库中分别创建一张结构完全相同的表,一个数据库启用CDC功能,而另外一个禁用CDC功能
  3. 向两张表中写入相同行数的数据
  4. 利用动态管理视图sys.dm_io_virtual_file_stats来获得两个数据库文件上的IO
  5. 利用sysindexes来获得两个数据库中数据表的存储消耗情况

因为是在虚拟机中进行的测试,所以选取了比较小的数据表(AdventureWorks数据库中的SalesOrderDetails),大约有12万行数据。

评估的结果如下:

从上面两张表中可以看到,CDC激活后日志文件的读会显著增加。原因是CDC在写更新跟踪表时,需要读取日志。

从上面两张表中可以看到激活CDC后数据文件的写入和日志文件写入都会显著增加,不过需要考虑到CDC激活后会需要多写一张表,在本例中就是dbo_SalesOrderDetails_CT,所以这种增加是可以理解的。当然在生产环境中并不会对数据表的所有列进行CDC监控,所以激活CDC对IO写入的影响还需要针对不同情况进行分析。

 

从上面这张图可以看出,CDC激活后会生产数据表不会消耗更多的存储空间,但是更新跟踪表会需要俄外的存储空间。另外可以发现的一点是,在本例中dbo_SalesOrderDetail_CT表消耗的空间比SalesOrderDetail表多,这是因为在dbo_SalesOrderDetail_CT表中加入了一些额外的字段,例如_$start_lsn和_$end_lsn,同时注意观察dbo_SalesOrderDetail_CT表会发现,SQL Server在这张表上使用_$start_lsn、_$end_lsn和_$seqval三个字段作为聚簇索引,而SalesOrderDetail表上原来的聚簇索引(SalesOrderID,SalesOrderDetailID)再加上_$start_lsn、_$end_lsn和_$seqval三个字段则被创建为一个非聚簇索引,所以这就导致了dbo_SalesOrderDetail_CT表需要消耗比原始表更多的空间,不过原始数据表上的非聚簇索引不会在CDC跟踪表上被创建,这也就说明了原始数据表聚簇索引的大小也会对CDC引发的IO产生影响。

经过以上测试,我们可以发现以下情况:

  • CDC激活会显著增加日志文件的读操作
  • CDC激活后更新跟踪表会产生额外的写入,并消耗存储空间
  • CDC激活后,原数据表的聚簇索引尺寸会影响到CDC产生的IO数据量,而原始数据表上的非聚簇索引则不会

因此建议CDC激活的环境下,应该将更新跟踪表写入与原始表不同的文件组并存放在不同的存储设备上,同时应该注意为日志文件选取可提高读取性能的存储硬件上,比如RAID10。

最后,附上是用于测试的脚本:

–Step 1: Test IO under CDC enabled enviroment
USE master
PRINT getdate()
PRINT ‘Create test database’
GO
CREATE DATABASE TestIO_CDC
GO

PRINT getdate()
PRINT ‘Enable CDC feature on the test database’
GO
USE TestIO_CDC
EXEC sp_cdc_enable_db_change_data_capture;
GO
PRINT getdate()
PRINT ‘Create test table’
GO
CREATE TABLE dbo.SalesOrderDetail(
    SalesOrderID int NOT NULL,
    SalesOrderDetailID int NOT NULL,
    CarrierTrackingNumber nvarchar(25) NULL,
    OrderQty smallint NOT NULL,
    ProductID int NOT NULL,
    SpecialOfferID int NOT NULL,
    UnitPrice money NOT NULL,
    UnitPriceDiscount money NOT NULL,
    rowguid uniqueidentifier ROWGUIDCOL  NOT NULL,
    ModifiedDate datetime NOT NULL,
CONSTRAINT PK_SalesOrderDetail PRIMARY KEY CLUSTERED
    (SalesOrderID ASC, SalesOrderDetailID ASC)
)
GO
PRINT getdate()
PRINT ‘Enable CDC feature on the test table’
GO
EXEC sp_cdc_enable_table_change_data_capture ‘dbo’, ‘SalesOrderDetail’, @role_name= NULL,
    @supports_net_changes = 1;
GO
PRINT getdate()
PRINT ‘Start inserting data’
GO
INSERT INTO dbo.SalesOrderDetail
    (SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber,
    OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount,
    rowguid, ModifiedDate)
    SELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber,
    OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount,
    rowguid, ModifiedDate FROM AdventureWorks.Sales.SalesOrderDetail
GO

–Step 2: Test IO under CDC disabled enviroment
PRINT getdate()
PRINT ‘Create test database’
GO
USE master
CREATE DATABASE TestIO_Non_CDC
GO
USE TestIO_Non_CDC
GO
PRINT getdate()
PRINT ‘Create test table’
GO
CREATE TABLE dbo.SalesOrderDetail(
    SalesOrderID int NOT NULL,
    SalesOrderDetailID int NOT NULL,
    CarrierTrackingNumber nvarchar(25) NULL,
    OrderQty smallint NOT NULL,
    ProductID int NOT NULL,
    SpecialOfferID int NOT NULL,
    UnitPrice money NOT NULL,
    UnitPriceDiscount money NOT NULL,
    rowguid uniqueidentifier ROWGUIDCOL  NOT NULL,
    ModifiedDate datetime NOT NULL,
CONSTRAINT PK_SalesOrderDetail PRIMARY KEY CLUSTERED
    (SalesOrderID ASC, SalesOrderDetailID ASC)
)
GO
PRINT getdate()
PRINT ‘Start inserting table’
GO
INSERT INTO dbo.SalesOrderDetail
    (SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber,
    OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount,
    rowguid, ModifiedDate)
    SELECT SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber,
    OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount,
    rowguid, ModifiedDate FROM AdventureWorks.Sales.SalesOrderDetail
GO

PRINT getdate()
PRINT ‘Wait 30 seconds for IO finished’
GO
WAITFOR DELAY ‘0:0:30’
GO
PRINT getdate()
PRINT ‘Start getting the IO statistics data’
GO
USE master
GO
–IO Statistics
SELECT DB_NAME(database_id) AS Database_Name,
    FILE_NAME(file_id) AS File_Name, num_of_reads, num_of_bytes_read,
    num_of_writes, num_of_bytes_written
    FROM sys.dm_io_virtual_file_stats(DB_ID(N’TestIO_CDC’), NULL)
UNION
SELECT DB_NAME(database_id) AS Database_Name,
    FILE_NAME(file_id) AS File_Name, num_of_reads, num_of_bytes_read,
    num_of_writes, num_of_bytes_written
    FROM sys.dm_io_virtual_file_stats(DB_ID(N’TestIO_Non_CDC’), NULL);

–Storage Concumed
USE TestIO_CDC
GO
SELECT ‘Enabled’ AS ‘CDC Feature’, OBJECT_NAME(id) AS TableName, dpages * 8 AS ‘Data (KB)’, reserved * 8 AS ‘Reserved (KB)’
    FROM TestIO_CDC.sys.sysindexes
    WHERE id IN(OBJECT_ID(‘dbo.SalesOrderDetail’), OBJECT_ID(‘cdc.dbo_SalesOrderDetail_CT’))
        AND indid = 1
GO
USE TestIO_Non_CDC
GO
SELECT ‘Disabled’ AS ‘CDC Feature’, OBJECT_NAME(id) AS TableName, dpages * 8 AS ‘Data (KB)’, reserved * 8 AS ‘Reserved (KB)’
    FROM TestIO_Non_CDC.sys.sysindexes
    WHERE id IN(OBJECT_ID(‘dbo.SalesOrderDetail’))
GO

Advertisements
Post a comment or leave a trackback: Trackback URL.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: