Introduction.
This series of blog posts explores a new topic, one that arose in the course of working on a recent project: a collection agent that gathers performance counters from several key SQL Server Dynamic Management Views (DMVs). The new project, which we plan to make available as Open Source, connects to an instance of SQL Server, queries key DMVs at regular intervals, and then turns that data into Windows performance counters. Re-formatting the DMV counter data as Windows performance counters allows me to use the full set of tools my company has developed over the years for gathering, analyzing and reporting on these counters.
Being primarily a developer and not a Database Administrator (DBA), I knew the DMV performance data existed, but I wasn't familiar with its uses. As I have begun to understand its value, I have come to see the potential to use this counter data to help DBAs address issues in physical database design that play a pivotal role in application performance, issues that are too often ignored, neglected, or treated in an ad hoc manner.
The Dynamic Management Views that bear on physical database design issues are the ones that report on database Table Index usage. The SQL Server Index usage statistics enable DBAs to perform a cost/benefit analysis on their current Table Index configuration and see if it needs revising. They can use the data to understand the patterns their database applications use to access the database, which, in turn, leads to more informed physical database design decisions. Without analyzing the SQL Server Index usage statistics that are available from DMVs, DBAs are forced to make decisions about which new Indexes to define and which current Indexes to drop in a perfunctory, unsatisfactory fashion. This is the important use case I intend to focus on here. Unfortunately, I have found little in the way of published guidance for DBAs who want to understand how to use these statistics.
It will also prove helpful to use this Index access pattern data in conjunction with Microsoft's Query Store facility, which became available beginning in SQL Server 2016. The Query Store is a repository of Query execution time measurements, persisted in a new set of SQL Server system views, along with the execution Plan text (what other database systems call EXPLAIN output) for those queries. When the SQL Optimizer generates an execution Plan for a Query, it will make note of any "missing" table Indexes that would improve execution time. The Query Store dutifully passes on those "missing" Index recommendations. There are costs associated with maintaining Indexes that benefit some queries, and it is important to understand those cost/benefit trade-offs before adopting any recommendation from the Optimizer to assist some particular Query. Because of those costs, additional table Indexes should never be created without considering the overall pattern of Index usage.
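Outside the Query Store, the Optimizer's suggestions also surface directly through the missing-index DMVs. Here is a minimal illustrative sketch, not the collection agent's actual query, that lists those suggestions alongside the Optimizer's own estimate of their value:

```sql
-- Illustrative sketch: the Optimizer's "missing index" suggestions, with
-- avg_user_impact as its estimated percentage improvement for the queries
-- that would have used the suggested index.
SELECT d.statement          AS full_table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       gs.user_seeks,
       gs.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS gs
  ON gs.group_handle = g.index_group_handle
ORDER BY gs.user_seeks * gs.avg_user_impact DESC;
```

Any Index created from such a suggestion should still be weighed against the maintenance costs discussed below.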
While I intend to tackle difficult issues of physical database design here, specifically in the area of Index selection, I want to be clear that I am not an experienced SQL Server DBA. I have a healthy respect for DBAs because they perform a very complicated job, but I am not one of them. My perspective on Microsoft SQL Server is based on long acquaintance with database performance issues as an application performance analyst and performance tools developer. As an application developer, I have worked extensively with Microsoft SQL Server technology, including designing a SQL Server-backed repository for storing and reporting on the Windows performance counter data that is gathered by my company's Performance Sentry collection service. Still, it is only in the past year that I have begun to explore using some of the data from SQL Server Dynamic Management Views for performance analysis.
Dynamic Management Views.
For over ten years now, SQL Server has made available a wide variety of internal performance counters that it calls Dynamic Management Views. These are memory-resident tables in which SQL Server keeps track of and reports on what is going on inside the engine as it executes. These counters are represented as Views, which means they can be readily queried using the SQL language. At this point, literally hundreds of DMVs are available inside SQL Server, but I am mainly focusing on the following three, at least initially:
The sys.dm_os_wait_stats DMV contains cumulative counts of SQL Server thread waits and wait time by wait reason. It is considered one of the most important data sources available about SQL Server performance.
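Because the DMVs are ordinary views, sampling this data is a one-line query. A minimal sketch, with TOP (10) as an arbitrary cutoff for the example:

```sql
-- Top wait types by cumulative wait time since the counters were last
-- cleared (instance restart or a manual reset).
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms  -- time spent waiting for a CPU after the resource was ready
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
```

Note that these counters are cumulative, so turning them into interval performance counters means sampling the view periodically and computing deltas between samples.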
The other two DMVs are used to characterize Table Index usage. In general, deciding which data columns of a database Table to index is one of the most important physical database design decisions that DBAs face. Indexes are hierarchical tree structures (specifically, self-balancing B-trees) that are created and maintained in order to speed up access to individual data records (or rows, in relational database terminology). When one or more Indexes on a Table are available, accessing specific rows is much faster, because an Index can be searched to find a result set of applicable data rows. Compare that to sequentially scanning some or all of the rows of a large Table to satisfy a particular Query, which is how SQL Server is forced to execute a Query if the proper Index is not available. The sys.dm_db_index_usage_stats DMV provides basic, essential data on Index usage, counting each time a particular Index is used in a Seek, Lookup, Update or Scan operation.
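To make that concrete, here is a minimal sketch of a query against this DMV, joining to sys.indexes to recover the Index names. The seek, scan and lookup columns approximate the benefit side of the ledger, while user_updates approximates the maintenance cost side:

```sql
-- Read vs. update activity per index in the current database.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       i.name                   AS index_name,
       s.user_seeks,            -- index used to find specific rows
       s.user_scans,            -- index scanned end to end
       s.user_lookups,          -- lookups resolved through the index
       s.user_updates           -- times the index had to be maintained
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id
WHERE s.database_id = DB_ID()   -- restrict to the current database
ORDER BY s.user_updates DESC;
```

An Index with high user_updates and few seeks, scans or lookups is a candidate for dropping; the reverse pattern suggests the Index is earning its keep.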
So, there are obvious benefits whenever search keys are directed at indexed columns in database search and retrieval operations. On the other hand, there are also costs associated with maintaining each Index. These costs include the extra data storage required for the index data, of course. Potentially more important than space on disk and in resident buffer memory are the operational costs associated with maintaining Indexes, which are incurred whenever Update, Insert and Delete actions require updating the tree structure and potentially re-balancing it. These costs can be substantial: re-balancing, for example, often forces SQL Server to lock the index while it occurs, which blocks all other concurrent requests for data that is obtained using that index. In SQL Server, the sys.dm_db_index_operational_stats DMV details these run-time operational costs.
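Unlike the usage stats view, sys.dm_db_index_operational_stats is a table-valued function that takes database, object, index and partition filters. A minimal sketch of pulling the maintenance and locking costs per Index:

```sql
-- Per-index maintenance activity and lock waits in the current database.
-- NULL arguments act as wildcards for object, index and partition.
SELECT OBJECT_NAME(os.object_id) AS table_name,
       i.name                    AS index_name,
       os.leaf_insert_count,     -- rows added at the leaf level
       os.leaf_update_count,
       os.leaf_delete_count,
       os.row_lock_wait_in_ms,   -- time other requests waited on row locks
       os.page_lock_wait_in_ms
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS os
JOIN sys.indexes AS i
  ON i.object_id = os.object_id
 AND i.index_id  = os.index_id
ORDER BY os.page_lock_wait_in_ms DESC;
```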
Together, these two Indexing DMVs allow the DBA to
understand both the costs and the benefits of existing Table indexes.
To gain perspective about how DBAs currently approach SQL
Server performance and tuning, I recently attended the SQL Server
PASS user group annual convention (known as the PASS Summit) here in
Seattle, where querying some of these performance-oriented DMVs was a popular
topic discussed by many of the presenters. For example, Brent Ozar, a well-known SQL performance expert, devoted a day-long pre-conference workshop to the use of the free SQL scripts, known as the "Blitz" scripts, that he and his associates have developed over the years to gather SQL Server performance data from several of the DMVs. The Blitz approach is to query the DMVs and then store the Query results back inside SQL Server so that there is an historical record. Blitz is designed to gather a baseline set of measurements that can then be compared against a current measurement interval whenever there are signs of problems. It is a free set of tools with an excellent set of features, but also some obvious limitations. For example, once the DMV counter data is stored permanently inside SQL Server, there is not much capability provided to manage the historical data.
The Brent Ozar Blitz scripts support gathering data from the three important DMVs listed above, although they also venture into reporting on individual Query execution time using other DMVs, which historically has not been straightforward, due to the transient nature of some of the key execution Plan data. Beginning in SQL Server 2016, there is a better way to look at Query execution time directly from Microsoft: a new facility built into SQL Server called the Query Store, available at no additional cost. (There is a good introduction to the new Query Store here.) With the Query Store, Microsoft is attempting to address the legacy problems associated with gathering and reporting the execution time of Queries, along with the associated performance statistics gathered while each Query was executing.
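One practical note: in SQL Server 2016 the Query Store is off by default and must be enabled per database. A minimal sketch, where [MyDatabase] is a placeholder name:

```sql
-- Enable the Query Store for one database ([MyDatabase] is a placeholder).
ALTER DATABASE [MyDatabase] SET QUERY_STORE = ON;

-- Run in the context of that database to confirm it is now collecting.
SELECT actual_state_desc
FROM sys.database_query_store_options;
```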
The new MS SQL Server Query Store is not perfect, but it is an important set of new tools for the DBA that focuses on identifying long-running SQL Queries and improving their performance. Since many DBAs are responsible for constructing the stored procedures written in T-SQL that are then used by the applications, these DBAs are also responsible for the performance of those stored procedures. (Stored procedures are collections of SQL statements stored in the database that are pre-compiled into execution Plans and executed on demand.) Frequently, these stored procedures encapsulate Business Rules, security, and other best practices that the application developers must use to access data in the database consistent with those rules. Understanding the performance of these often complex SQL procedures is far from a trivial task. DBAs need to know how long the SQL statements in these stored procedures take to execute, the resources that they use, and the frequency with which they are executed.
This is precisely the information that Microsoft gathers, aggregates and stores in a set of Query Store tables, which can themselves be queried using reports built into SQL Server Management Studio (SSMS), the GUI-based app that Microsoft supplies for DBAs to use to manage SQL Server installations. One particularly significant aspect of the Query Store is that it addresses the major limitations associated with capturing SQL Query performance statistics using the approach that SQL Server performance experts like Ozar were forced to use in the past. The Query Store not only tracks the Optimizer's "missing index" recommendations, it is able to associate those recommendations with the actual SQL statements that were executed, the detailed execution Plan that was generated, and the cumulative run-time statistics, updated each time the Plan is executed.
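The underlying Query Store views can also be queried directly. A minimal sketch that pulls execution frequency and a rough average duration per query (avg_duration is reported in microseconds, and averaging the per-interval averages is deliberately crude here):

```sql
-- Execution counts and rough average duration per query from the
-- Query Store catalog views (SQL Server 2016 and later).
SELECT TOP (20)
       q.query_id,
       MAX(qt.query_sql_text)        AS query_sql_text,
       SUM(rs.count_executions)      AS total_executions,
       AVG(rs.avg_duration) / 1000.0 AS approx_avg_duration_ms
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
  ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
  ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
  ON rs.plan_id = p.plan_id
GROUP BY q.query_id
ORDER BY total_executions DESC;
```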