SQL High CPU troubleshooting checklist

 

1. From Windows task manager check the overall CPU utilization. Collect the details of number of logical processors present on the box.

2.  Select * from sys.dm_os_sys_info;     (DMV)

3. SELECT * FROM sys.dm_os_performance_counters WHERE counter_name = 'CPU%'

    

4. If the High CPU is causing by process other than SQL Server process (sqlservr.exe) engage the team which takes care of that process.

5. Open Perfmon and add the below counters:

Process (sqlservr):

% Privileged Time

% Processor Time

% User Time

 

https://msdnshared.blob.core.windows.net/media/2017/07/236.png

 

5.  Run the SQL Standard report to get the list of Top CPU queries:

Right Click on the instance, go to reports> Standard reports

https://msdnshared.blob.core.windows.net/media/2017/07/422.png

 

6. Check if SQL System threads are consuming high CPU:

select * from sys.sysprocesses where cmd like 'LAZY WRITER' or cmd like '%Ghost%' or cmd like 'RESOURCE MONITOR'

Ghost cleanup thread >>>> Check if the user deleted large number of rows

Lazy Writer thread >>>>>>> Check if any memory pressure on the server

Resource Monitor thread >> Check if any memory pressure on the server

 

Step-1: Identify SQL Instance

Determine if high CPU condition is caused by SQL or Windows. If it is SQL and your server is hosting multiple SQL Instances identify the instance responsible for high CPU.

Open task manager –> goto Performance tab –> click on Open Resource Monitor –> click on CPU tab and click on CPU column to arrange in descending order.

Now check the box against sqlservr.exe process and click on Services below to identify the SQL instance

 

http://dba-datascience.com/wp-content/uploads/2018/03/High_CPU2-300x228.jpg

Step-2: Identify SQL Kernel or User

Open Performance Monitor and add the below counters:

Step-3: Caused by SQL Kernel

·         Divide SQL Server %Privileged Time by number of logical processors to understand if the usage is due to SQL Kernal. If this value is more than 30% involve the Windows Admins. (Process (sqlservr): % Privileged Time / No of logical cpus)

·         Potential causes could be antivirus faulty drivers or encryption services.

 

Step-4: Caused by User

SQL Background Threads:

SELECT * FROM sys.sysprocesses 
WHERE cmd like 'LAZY WRITER' or cmd like '%Ghost%' or cmd like 'RESOURCE MONITOR'
Ghost clean up thread  –  Check if the user deleted large number of rows
Lazy Writer thread – Check if any memory pressure on the server
Resource Monitor thread – Check if any memory pressure on the server

 

Query Execution:

SELECT GETDATE() AS "RunTime", st.text AS batch, SUBSTRING(st.text,statement_start_offset / 2+1
,((CASE WHEN a.statement_end_offset = -1 
THEN (LEN(CONVERT(nvarchar(max),st.text)) * 2)
ELSE a.statement_end_offset END)  - a.statement_start_offset) / 2+1)  AS current_statement
, qp.query_plan, a.* 
FROM sys.dm_exec_requests a CROSS APPLY sys.dm_exec_sql_text(a.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(a.plan_handle) AS qp
ORDER BY CPU_time DESC

http://dba-datascience.com/wp-content/uploads/2018/03/High_CPU3-300x241.jpg

 

Incorrect Configurations:

Check the following configurations are according to Microsoft Best Practice recommendations:

https://msdnshared.blob.core.windows.net/media/2017/07/319.png

 

 

 

 

 

 I am going to share some of the important Performance Counters for monitoring the SQL Server Performance.

Memory Information Counters: Using this script, you can find the total amount of physical memory on the computer.

SELECT *FROM sys.dm_os_performance_counters

WHERE counter_name = 'Total Server Memory (KB)'

GO

 

Page Information Counters:

Using this script, you can find that how long pages stay in the buffer cache in seconds. If the value is more significant, SQL Server does not read from disk, and it serves the request from the buffer only.

SELECT *FROM sys.dm_os_performance_counters

WHERE counter_name = 'Page life expectancy'

GO

 

It shows the value of the number of physical database page reads that are issued per second.
Around 80 per second is normal, but if it is above, you may get a timeout in your application.

 

SELECT *FROM sys.dm_os_performance_counters

WHERE counter_name = 'page reads/sec'

GO

 

It shows the number of requests to find a page in the buffer pool.

 



SELECT *FROM sys.dm_os_performance_counters

WHERE counter_name = 'Page lookups/sec'

GO

 

 

 

 

User Connection Counter:

It shows the number of different users that are connected to the SQL Server.



SELECT *FROM sys.dm_os_performance_counters

WHERE counter_name = 'User Connections'

GO

 

Buffer Information Counters:

A greater value indicates that more significant number of requests are satisfied with the data cache and SQL Server is getting queries data from the memory instead of disk.

 

SELECT *FROM sys.dm_os_performance_counters

WHERE counter_name = 'Buffer Cache Hit Ratio'

GO

 

CPU Counters:

Using this script, you can find the usage of CPU in percentage.

 

SELECT *FROM sys.dm_os_performance_counters

WHERE counter_name = 'CPU usage %'

GO