RockSolid Eventing
Eventing is one of those killer features in RockSolid whose benefit is instantly obvious to anyone who has managed a production SQL Server environment.
RockSolid monitors SQL Server instances and picks up on everything that occurs from a performance, security, recoverability, maintenance or configuration perspective. Anything detected as an “issue” of some kind raises a service request, and these service requests go through the resolution processes (including automated resolution). However, anything that occurs that is not necessarily a problem at that point in time (such as a backup completing, a new user being created or a stored procedure being altered) is logged in RockSolid as an event.
Eventing is therefore very useful for inspecting an audit trail of routine and ad-hoc events that have occurred in a SQL Server environment. Reviewing change history, security audits, recovery audits and so on all play useful roles in system management, but one of the most interesting uses of eventing is in performance management.
To put some context around this, many SQL Server performance tools will show you data about system performance, and most good ones will show when negative situations have occurred, for example by graphing data.
Take an example graph. Maybe it shows the response time of a particular query, maybe your average CPU utilization, or your log growth rate. It doesn’t matter; the point is that the graph shows there has been a significant change from whatever the norm was to what the norm is now. This is typically the point where the tool stops providing information, and the DBA goes off to investigate manually to try to unearth the cause of the change in performance levels.
RockSolid, on the other hand, makes its event analysis capabilities available. Events are categorized by area of impact (performance, security, recoverability, availability) and by the level of impact in those areas. Events also have context: for example, an event may have a performance impact on a specific stored procedure, table, database or an entire instance.
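To make the categorization concrete, here is a minimal Python sketch of how an event with impact areas, impact levels and a scope could be modelled. This is not RockSolid's actual schema; the names `Area`, `Event`, `events_for` and the 0 to 3 impact scale are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Area(Enum):
    """Impact areas named in the post."""
    PERFORMANCE = "performance"
    SECURITY = "security"
    RECOVERABILITY = "recoverability"
    AVAILABILITY = "availability"


@dataclass
class Event:
    """Hypothetical event record; field names are illustrative only."""
    occurred_at: datetime
    description: str
    impact: dict      # maps Area -> impact level (assumed scale: 0 = none .. 3 = high)
    scope_type: str   # "procedure", "table", "database" or "instance"
    scope_name: str   # the object the event relates to


def events_for(events, area, scope_name, min_impact=1):
    """Return events touching one object with at least min_impact in one area."""
    return [
        e for e in events
        if e.scope_name == scope_name and e.impact.get(area, 0) >= min_impact
    ]
```

With a structure like this, asking "which events had a performance impact on this stored procedure?" becomes a simple filter on area and scope.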
As a DBA investigating a performance issue, you can start with the analysis graphs in RockSolid, which show you that a change in some performance aspect has occurred. You can then request event analysis, and RockSolid will present the most relevant events that occurred in the lead-up to the observed change, highlighting factors that could be responsible for it.
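The idea of surfacing the most relevant events in the lead-up to a change can be sketched roughly as follows. The post does not describe how RockSolid actually decides relevance, so the scoring here (impact level weighted by how close the event was to the change) is an assumption, as are the function name `rank_events`, the dictionary keys and the seven-day lookback.

```python
from datetime import datetime, timedelta


def rank_events(events, change_at, lookback=timedelta(days=7), top_n=5):
    """Rank events that occurred in the window leading up to change_at.

    Each event is a dict with hypothetical keys "occurred_at" (datetime),
    "impact" (number, higher means more impact) and "description" (str).
    """
    window_start = change_at - lookback
    scored = []
    for event in events:
        when = event["occurred_at"]
        # Ignore anything outside the lead-up window, including later events.
        if not (window_start <= when <= change_at):
            continue
        # 0.0 at the start of the window, 1.0 at the moment of the change.
        recency = 1.0 - (change_at - when) / lookback
        # Assumed scoring: impact counts most, recency breaks it up further.
        score = event["impact"] * (0.5 + 0.5 * recency)
        scored.append((score, event))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [event for _, event in scored[:top_n]]
```

A real implementation would presumably also filter by impact area and by the object being investigated, but the window-then-rank shape is the core of it.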
For example, you may review the performance of a given stored procedure and notice that it is running 20% slower than it did last week. Requesting event analysis shows that a few days ago that particular stored procedure was altered and a query within that stored procedure had some additional predicates added. Clicking on that query would show that the indexing that was in place prior to the change is no longer relevant to this query following the change, so currently a number of scans are being used when previously seeks were in use. As a DBA you may choose to alter or add indexing to mitigate this impact.
Another example: on reviewing a monthly performance summary report, you notice that daily I/O levels for a given instance increased significantly mid-month. Requesting an event analysis from RockSolid could show that a new recurring query was executed for the first time at that point. Drilling into this query, you see that it has run frequently since its first execution mid-month. This was due to a new application report being deployed at that time, and as a DBA you can now take steps to optimize that report query if necessary, or speak with the application manager about the impact of the new report.
In summary, the purpose of RockSolid eventing is to explain why changes in system dynamics have occurred. The examples in this post relate to performance, but event analysis covers all key areas of database management, including security, recoverability and availability.


