Finding SQL Server Queries with Large Memory Grants


If you are ever asked to figure out what queries are consuming all the memory from your SQL Server instance, you can use the data from the DMV to tell you which queries have been granted what amount of RAM. The query is basic:

SELECT r.session_id 
                         statement_start_offset / 2 + 1,
                         (CASE WHEN statement_end_offset = - 1 
                            THEN LEN(
                                     ) * 2 
                         ELSE statement_end_offset 
                         END - statement_start_offset 
                    ) / 2) 
        FROM sys.dm_exec_sql_text(r.sql_handle) 
       ) AS query_text 
FROM sys.dm_exec_query_memory_grants AS mg 
     INNER JOIN sys.dm_exec_requests r 
     ON mg.session_id = r.session_id 
     CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) AS qp 
ORDER BY mg.required_memory_kb DESC;

The real task is to understand the subject of Memory Grants, and that is made much easier to understand by this article by Jay Choe.

Query memory grant (a.k.a. query work buffer) is a part of server memory used to store temporary row data while sorting and joining rows. It is called “grant” because the server requires those queries to “reserve” before actually using memory. This reservation improves query reliability under server load, because a query with reserved memory is less likely to hit out-of-memory while running, and the server prevents one query from dominating entire server memory. In the following sections, I will describe how SQL server estimates query memory requirement, and how it throttles memory grants when multiple queries are competing.

When SQL server receives a user query, it follows well-defined steps to produce a result for user. First, it creates a compiled plan, which is a set of logical instructions such as how to join rows. Next, it creates an execution plan based on the compiled plan. This execution plan contains instructions with all logical references translated to actual objects, and tracking mechanisms for query execution. Finally, the server starts execution from the top of instruction tree. Creating a compiled plan is expensive because the server needs to find the optimal plan out of hundreds of potential candidates. Distinguishing between compile and execution helps overall server performance because each compiled plan can be cached and shared among multiple execution plans. The memory grant estimate follows the same overall sequence. It has parameters saved in compiled plan, and a mechanism to calculate actual grant size at execution time.



Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s