This is a repost of a post I wrote for www.rocksolidsql.net
In RockSolid, for every feature we have added, we have tried to take a step back and re-evaluate what is truly best practice rather than simply accepting current industry standards. One such area is database index maintenance, and in this article I will explain our approach to index management.
First, let me describe the most common method of maintaining indexes for a SQL Server database (it applies to almost all SQL Server databases). DBAs know that index maintenance is an important process for improving performance because it:
- reduces fragmentation which improves scan performance
- adds free space (fill factor) to the index, which avoids page splits and improves insert/update performance
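To illustrate the second point, here is a toy Python model of why rebuilding with free space reduces page splits on later inserts. This is purely illustrative (the page capacity, row counts and split logic are invented for the sketch), not RockSolid code or SQL Server's actual storage engine:

```python
# Toy model: pages hold up to PAGE_CAPACITY rows; rebuilding with a fill
# factor below 100 leaves headroom so later inserts don't split pages.
PAGE_CAPACITY = 100  # rows per page in this toy model

def rebuild(rows, fill_factor):
    """Repack rows into pages filled only to fill_factor percent."""
    per_page = max(1, PAGE_CAPACITY * fill_factor // 100)
    return [rows[i:i + per_page] for i in range(0, len(rows), per_page)]

def insert(pages, page_index, row):
    """Insert a row into a page; if the page is full, split it (the costly event)."""
    page = pages[page_index]
    if len(page) < PAGE_CAPACITY:
        page.append(row)
        return 0  # absorbed by free space, no split
    mid = len(page) // 2
    pages[page_index] = page[:mid]
    pages.insert(page_index + 1, page[mid:] + [row])
    return 1  # one page split

def splits_after_rebuild(fill_factor, inserts_per_page=20):
    """Count splits caused by spreading inserts round-robin over the pages."""
    pages = rebuild(list(range(10_000)), fill_factor)
    total_inserts = len(pages) * inserts_per_page
    return sum(insert(pages, i % len(pages), -1) for i in range(total_inserts))
```

In this model, rebuilding at a fill factor of 80 leaves enough headroom to absorb the inserts without any splits, while rebuilding at 100 forces splits immediately.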
The most common way of running index maintenance is to schedule a chunk of time on the weekend and effectively take an outage while the maintenance process works through every index on the instance sequentially, carrying out a “rebuild” (which is effectively a re-creation of the index).
This is common, but is it best practice? A number of issues start to appear, especially as database maintenance windows shrink and the requirement to have the database online and operational at specific times grows.
The first issue is how long the maintenance window should be. How long can you tell the application users, the web site manager and the developers that the database will be “busy” while maintenance runs? When can they start their batch process, their invoice run, their data entry? Sure, a DBA can calculate the time required to maintain a few databases, and recalculate those times routinely as the databases grow, but many enterprise sites exceed 1,000 databases, and at that scale it is not practical to calculate exact window requirements organization-wide. So what happens is that a catch-all timeframe is allocated (half a day, all day Sunday, 10 hours, etc.) that is large enough to cover all requirements. Regardless of whether a database is actually being maintained for 10 minutes, 30 minutes or 2 hours, it is considered under maintenance and unavailable for the duration of the generic window.
The second issue is how effective and efficient the index maintenance actually is. Half a day on a Sunday to rebuild all database indexes is useful, but do all the indexes in a database need to be rebuilt weekly? Or are there indexes that should be rebuilt much more frequently? Again, this is something that can be determined easily for 1, 2 or 10 databases, but not accurately for hundreds or thousands.
And the third issue is which options should be applied when rebuilding a given index. What is the optimal fill factor? Should sorting take place in memory or in tempdb? Should rebuilds be online or offline? When we are talking about hundreds or thousands of databases, we are usually talking about tens or hundreds of thousands of indexes. Again, not a practical question for a DBA to answer using traditional methods.
So what does RockSolid do to address these issues and evolve a new best practice?
Firstly, with RockSolid you do not allocate large, arbitrary chunks of time for rebuilding indexes and then bend all business and application requirements around that maintenance. Instead, you tell RockSolid what time you have available for index maintenance, however short. Perhaps you have an hour on a Tuesday, 30 minutes between batch jobs on a Friday, and 15 minutes between processes on a Sunday. Tell RockSolid what time you have available, and also whether that time is considered an outage (no application processes are expected to run) or whether the database should remain online during the window (online maintenance).
Again, though, with such a large number of databases under their control, a DBA may find it almost impossible to know accurately which windows are actually available. RockSolid helps overcome this with a workload analysis tool, which analyzes your historical use of the database and finds times when little or no activity has been occurring. Using this tool, the DBA receives recommendations for when maintenance can be scheduled (typically confirming these windows with the application users before implementing them).
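The core idea behind such a workload analysis can be sketched in a few lines of Python. The threshold, minimum window length and activity numbers below are my own illustrative assumptions, not RockSolid's actual algorithm:

```python
def quiet_windows(hourly_activity, threshold, min_hours=1):
    """Return (start_hour, end_hour) spans where activity stayed below threshold."""
    windows, start = [], None
    for hour, activity in enumerate(hourly_activity):
        if activity < threshold:
            # Inside a quiet stretch; remember where it began.
            start = hour if start is None else start
        else:
            # Quiet stretch ended; keep it if it was long enough.
            if start is not None and hour - start >= min_hours:
                windows.append((start, hour))
            start = None
    if start is not None and len(hourly_activity) - start >= min_hours:
        windows.append((start, len(hourly_activity)))
    return windows

# Hypothetical history: e.g. batch requests per hour for a 10-hour stretch.
activity = [5, 3, 2, 40, 55, 60, 48, 4, 2, 50]
print(quiet_windows(activity, threshold=10, min_hours=2))  # → [(0, 3), (7, 9)]
```

In practice the DBA would still confirm any recommended window with the application owners, as the article notes.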
Next, when RockSolid is given a maintenance window to use, at run time it carries out an analysis of a number of factors to determine which indexes will provide the greatest performance return within the time allocated. RockSolid evaluates fragmentation, index usage, index size and, of course, the time available to determine the best “bang for your buck”. The time allocated matters because RockSolid will never exceed the boundaries of a maintenance window, for any type of maintenance process it manages.
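One simple way to picture this kind of “bang for your buck” selection is a greedy ranking by benefit per minute of rebuild time. The field names, scoring formula and example numbers here are assumptions for illustration only, not RockSolid's real model:

```python
from dataclasses import dataclass

@dataclass
class IndexStats:
    name: str
    fragmentation_pct: float  # how badly the index needs a rebuild
    reads_per_day: int        # how often the index is actually used
    rebuild_minutes: float    # estimated rebuild cost (size, history)

def plan_rebuilds(candidates, window_minutes):
    """Greedily pick indexes with the best benefit density that fit the window."""
    def density(ix):
        # Hypothetical score: fragmented, heavily-read indexes that are
        # cheap to rebuild give the most return per minute spent.
        return ix.fragmentation_pct * ix.reads_per_day / ix.rebuild_minutes
    chosen, remaining = [], window_minutes
    for ix in sorted(candidates, key=density, reverse=True):
        if ix.rebuild_minutes <= remaining:  # never exceed the window
            chosen.append(ix.name)
            remaining -= ix.rebuild_minutes
    return chosen

candidates = [
    IndexStats("ix_orders_date", 80, 10_000, 20),   # hot and fragmented
    IndexStats("ix_audit_log", 90, 10, 45),         # fragmented but rarely read
    IndexStats("ix_customer_name", 40, 5_000, 10),  # mildly fragmented, cheap
]
print(plan_rebuilds(candidates, window_minutes=30))
```

With a 30-minute window, the rarely-read audit index is skipped even though it is the most fragmented, because rebuilding it would buy little performance for its cost.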
Thirdly, RockSolid looks at all the factors needed to decide the most appropriate fill factor (index usage, read/write ratios, time to the next maintenance window, etc.) and rebuild options (sort in tempdb, online/offline based on the window type) to ensure every index is rebuilt using an optimal strategy, not only for the rebuild itself but for the performance of that index in its intended purpose afterwards.
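The shape of such a per-index decision can be sketched as a small rule function. The cut-off values below are invented purely to illustrate the trade-offs (write-heavy indexes get more free space; online rebuilds when no outage is allowed), not RockSolid's actual rules:

```python
def rebuild_options(write_ratio, index_size_mb, window_is_outage):
    """Pick illustrative rebuild options for one index.

    write_ratio: fraction of operations that are inserts/updates (0.0-1.0).
    """
    if write_ratio > 0.5:
        fill_factor = 80   # write-heavy: leave headroom to avoid page splits
    elif write_ratio > 0.1:
        fill_factor = 90
    else:
        fill_factor = 100  # read-mostly: pack pages densely for scan speed
    return {
        "fill_factor": fill_factor,
        "online": not window_is_outage,         # keep the index available if needed
        "sort_in_tempdb": index_size_mb > 512,  # push large sorts out of the user db
    }

print(rebuild_options(write_ratio=0.6, index_size_mb=2048, window_is_outage=False))
```

The point is that these choices are made per index from measured behavior, rather than applying one global setting to tens of thousands of indexes.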
Lastly, RockSolid provides feedback on the windows you have allocated for maintenance. Are the windows too large or too many? Are there too few? A recommendation might be, for example, that you would see an estimated n% improvement in performance by adding a 30-minute daily maintenance window as opposed to a single weekly window. RockSolid proactively provides this analysis and feedback, constantly re-evaluating the maintenance requirements over the lifetime of a database.
This approach is redefining how database maintenance is performed, improving not only the availability of the database but also its performance through more effective index management.