Wednesday, December 11, 2013

Force a Windows Reboot When the OS Says No

I know how to reboot the box from the CLI, you say.  I can use shutdown.exe, psshutdown.exe, or the PowerShell cmdlet Restart-Computer with the -Force parameter.

Those methods normally suffice, but there are times when a server won’t cooperate.  Maybe it was a hotfix install, or a pending service shutdown.  Or just a lack of patience.  Forcing a reboot of a machine is akin to yanking the power cord out of the wall and plugging it back in afterward, pushing the physical reset button on a desktop computer, or holding down the power button for five seconds to power down, then pushing it again to power up.

Those physical methods of shutting down a system are very dangerous.  The methods proposed below are also quite unsafe.

The file system might be damaged and unbootable, your company’s data might be lost, unicorns and leprechauns might cry, or some other type of unplanned horribleness could ensue from a less-than-graceful restart.

With all that being said, here’s how you bend a machine to your restart will:

Scenario A:  The machine already has a pending reboot or shutdown, but can’t be restarted.

Solution: Kill the winlogon process.  The logon session will end and the machine will restart.

Here’s an example of what that might look like:

[image: shutdown.exe /a returning error 1115]

shutdown.exe /a typically aborts a pending shutdown.  I typed it here knowing it would display the error 1115 message for the screenshot.  I had already tried running shutdown.exe /r without success.

Winlogon can be killed with your tool of choice (pskill, for example).  Two PowerShell examples follow:

Get-Process winlogon | Stop-Process -Force

Get-Process | where Name -match winlogon | Stop-Process -Force

Note:  In the absence of a pending reboot, killing the winlogon process may simply end the session and log off users instead of restarting the machine.
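Before killing anything, it can be worth previewing exactly which processes the pipeline would hit.  A minimal sketch, using the standard -WhatIf switch on Stop-Process; the -ErrorAction setting and the $targets variable are my additions for illustration, not part of the commands above:

```powershell
# Collect the winlogon processes; yields an empty array if none are found
# (for example, when run on a non-Windows machine).
$targets = @(Get-Process -Name winlogon -ErrorAction SilentlyContinue)

# -WhatIf only reports what Stop-Process would do; nothing is actually killed.
$targets | Stop-Process -Force -WhatIf

"Processes that would be stopped: $($targets.Count)"
```

Dropping -WhatIf turns the preview into the real thing, with all the risks described above.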

 

Scenario B:  The machine does not have a pending reboot or shutdown, but for some reason you want to force a hard reboot immediately in a very ugly way, potentially causing a bluescreen in the process.  I successfully tested the following method against Windows Server 2012, forcing a bluescreen reboot.

Solution: Kill the csrss process.  The machine will then restart.

Get-Process csrss | Stop-Process -Force

image

Note: On Windows 8.1, killing csrss failed.

When you can, reboot responsibly.  It’s not the law.  It’s just a good idea.


Credit for the winlogon idea in Scenario A goes to oasysadmin.  Killing csrss is an idea I got from Chris B (aka Otis).

Monday, July 22, 2013

Finding SQL Server Cluster Failover Events using PowerShell v3

Here’s a two-node cluster running Windows Server 2008 R2 with a single instance of SQL Server.  If the name of the clustered service/application is known, querying the event log remotely with Invoke-Command (aliased to icm) makes finding cluster failover events pretty quick.  In this case I match against the string MSSQLSERVER, as the full name of the clustered service/app is 'SQL Server (MSSQLSERVER)'.

What’s the cluster look like now?

7/22/2013 2:01:35 PM :: user@deadair :: D:\Dropbox\bin
[6307] #   icm SQL00 {Get-ClusterGroup} | ft Name, OwnerNode, State, PSComputerName -a

Name                     OwnerNode  State  PSComputerName
----                     ---------  -----  --------------
PORTAL_DTC               sql01      Online SQL00
SQL Server (MSSQLSERVER) sql01      Online SQL00
Cluster Group            sql01      Online SQL00
Available Storage        sql01      Online SQL00


Event ID 1201 is logged when resource groups are brought online within the cluster, so we’re limiting the results to that Event ID.  Running the Get-WinEvent cmdlet against all cluster nodes (just two nodes in this case, SQL00 and SQL01) and assigning the results to a variable allows sorting of entries from all nodes together.  Otherwise results would be sorted only within the context of each remote target (PSComputerName) node.

7/22/2013 2:14:19 PM :: user@deadair :: D:\Dropbox\bin
[6316] #   $FailoverEvents = icm SQL00, SQL01 {Get-WinEvent -FilterHashtable @{LogName='Microsoft-Windows-FailoverClustering/Operational';ID = 1201} | where Message -match MSSQLSERVER }  


Now to sort the event log entries…

7/22/2013 2:15:04 PM :: user@deadair :: D:\Dropbox\bin
[6317] #   $FailoverEvents | sort TimeCreated -desc | ft -a


   ProviderName: Microsoft-Windows-FailoverClustering

TimeCreated              Id LevelDisplayName Message                                                                                                          PSComputerName
-----------              -- ---------------- -------                                                                                                          --------------
7/19/2013 10:26:24 PM  1201 Information      The Cluster service successfully brought the clustered service or application 'SQL Server (MSSQLSERVER)' online. SQL01
7/13/2013 10:33:05 PM  1201 Information      The Cluster service successfully brought the clustered service or application 'SQL Server (MSSQLSERVER)' online. SQL00
7/13/2013 10:09:08 PM  1201 Information      The Cluster service successfully brought the clustered service or application 'SQL Server (MSSQLSERVER)' online. SQL01
7/6/2013 10:57:32 PM   1201 Information      The Cluster service successfully brought the clustered service or application 'SQL Server (MSSQLSERVER)' online. SQL00
7/6/2013 10:45:58 PM   1201 Information      The Cluster service successfully brought the clustered service or application 'SQL Server (MSSQLSERVER)' online. SQL01
6/1/2013 10:40:18 PM   1201 Information      The Cluster service successfully brought the clustered service or application 'SQL Server (MSSQLSERVER)' online. SQL00
6/1/2013 10:18:50 PM   1201 Information      The Cluster service successfully brought the clustered service or application 'SQL Server (MSSQLSERVER)' online. SQL01

Summary: Combining Get-WinEvent with Remoting allows for a very quick recon of events in a cluster.

Prefer a one-liner instead of using a variable? Just perform the sort outside the Get-WinEvent statement block.  Let’s add the day of the week, too! 

  <# Cluster Failover Events #> icm SQL00, SQL01 `
{ Get-WinEvent -FilterHashtable @{LogName='Microsoft-Windows-FailoverClustering/Operational';ID = 1201} `
| where Message -match MSSQLSERVER } `
| sort TimeCreated -desc `
| ft @{N='DayofWeek';E={($_.TimeCreated).DayofWeek}} , TimeCreated, ID, Message -a  

image
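To see why the sort has to happen after the results from both nodes are merged, here is a small self-contained sketch.  The objects are hand-made stand-ins for event records (using four of the timestamps shown earlier), not real Get-WinEvent output, and the variable names are illustrative only:

```powershell
# Mock results as they might arrive from Invoke-Command: grouped by node.
$mockEvents = @(
    [pscustomobject]@{ TimeCreated = [datetime]'2013-07-13T22:33:05'; PSComputerName = 'SQL00' }
    [pscustomobject]@{ TimeCreated = [datetime]'2013-07-06T22:57:32'; PSComputerName = 'SQL00' }
    [pscustomobject]@{ TimeCreated = [datetime]'2013-07-19T22:26:24'; PSComputerName = 'SQL01' }
    [pscustomobject]@{ TimeCreated = [datetime]'2013-07-13T22:09:08'; PSComputerName = 'SQL01' }
)

# Sort the merged set, then add the day of the week as a calculated property.
$sorted = $mockEvents |
    Sort-Object TimeCreated -Descending |
    Select-Object @{ Name = 'DayOfWeek'; Expression = { $_.TimeCreated.DayOfWeek } },
                  TimeCreated, PSComputerName

$sorted | Format-Table -AutoSize
```

Select-Object is used here instead of ft so the result stays a set of objects that can be inspected or piped further; formatting is left to the last step.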

Monday, June 24, 2013

Search SQL Server Error Logs Using PowerShell (Get-SqlErrorLog)

Scenario 1:
A new email just arrived – an alert indicating a deadlock occurred right at the beginning of the work day.

Argh.

But I did previously add a couple of trace flags (1204 and 1222) on all our prod instances that give more deadlock info in the error log.  That’ll help.  Someday I’ll get around to monitoring deadlocks with extended events…

Let me connect to that server in SSMS and drill down through Management -> SQL Server Error Logs… wait a minute.  I’ve searched through Windows event logs before using PowerShell cmdlets, so why am I _clicking_ through a GUI?  This just feels wrong.

There’s surely a cmdlet for this… found it!

image
Not sure why I have multiple copies of commands there; I probably loaded multiple modules or something.

Get-SqlErrorLog from SQL Server PowerShell Extensions (SQLPSX) was exactly what I wanted: a function to return the SQL Server error log.  I know the time of the deadlock because of the error message in my inbox: it was 09:08:55 this morning.

image

Let’s give it a one-minute window before and after that time:

Get-SqlErrorLog  -sqlserver "SQLTACOPS" | `
Where {$_.LogDate -GT '2013-06-24 09:07:55' -AND $_.LogDate -LT '2013-06-24 09:09:55' } | ft -a

image
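The two boundary times above were typed by hand.  As a rough sketch, they could instead be computed from the alert time.  The $mockLog entries below are invented stand-ins for Get-SqlErrorLog output (only the LogDate, ProcessInfo, and Text property names come from the examples in this post), and the variable names are my own:

```powershell
# Time of the deadlock alert, and how far to look on either side of it.
$alertTime = [datetime]'2013-06-24 09:08:55'
$window    = [timespan]::FromMinutes(1)
$start     = $alertTime - $window
$end       = $alertTime + $window

# Mock log entries standing in for Get-SqlErrorLog output.
$mockLog = @(
    [pscustomobject]@{ LogDate = [datetime]'2013-06-24 09:05:00'; ProcessInfo = 'Backup'; Text = 'mock entry before the window' }
    [pscustomobject]@{ LogDate = [datetime]'2013-06-24 09:08:55'; ProcessInfo = 'spid5s'; Text = 'mock entry inside the window' }
    [pscustomobject]@{ LogDate = [datetime]'2013-06-24 09:12:00'; ProcessInfo = 'Logon';  Text = 'mock entry after the window' }
)

# Keep only the entries strictly between the two boundaries.
$inWindow = @($mockLog | Where-Object { $_.LogDate -gt $start -and $_.LogDate -lt $end })
$inWindow | Format-Table -AutoSize
```

Against a real server, the same Where-Object filter would sit after Get-SqlErrorLog in place of $mockLog.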

Using additional conditions can exclude events such as backups and logins.  Piping output to the Export-CSV cmdlet creates a file that can be quickly saved and shared.

Get-SqlErrorLog  -sqlserver "SQLTACOPS" | `
Where {$_.LogDate -GT '2013-06-24 09:07:55' -AND `
       $_.LogDate -LT '2013-06-24 09:09:55' -AND `
       $_.ProcessInfo -NE 'Logon' -AND `
       $_.ProcessInfo -NE 'Backup'} | `
       Export-CSV -NoType 2013-06-24_090855_log_deadlock.csv

image

Scenario 2:
Two databases filled an entire volume with transaction log entries during index maintenance.  The recovery model of the databases was temporarily changed to SIMPLE, logs were shrunk, and the databases were reverted to FULL recovery model.  Even though full backups of the databases were created before and after changing the recovery model, this breaks the backup chain for those databases and my boss requested I log this in the server redbook.  We cycle the error log each night at midnight, and the recovery models were changed 2 logs (days) ago.  Get-SqlErrorLog has a lognumber parameter, where 0 is the index of the current log.

Get-SqlErrorLog -lognumber 2 -sqlserver "SQLTACOPS" | `
    Where {$_.LogDate -GT '2013-06-22 13:00' -AND `
    $_.LogDate -LT '2013-06-23 18:00' -AND `
    $_.ProcessInfo -NE 'Logon' -AND `
    $_.ProcessInfo -NE 'Backup'} | `
    Where {$_.text -LIKE 'Setting database option RECOVERY*' } | ft -a

image
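The "2 logs ago" arithmetic can be spelled out.  This is only a sketch under a stated assumption: the log is cycled exactly once per night and nothing else cycled it in between (a SQL Server service restart also starts a new error log, which would shift the numbering).  The dates are example values and the variable names are mine:

```powershell
# Assumption: the error log is cycled exactly once per day at midnight,
# with no extra cycles in between.
$today     = [datetime]'2013-06-24'          # day the search is run (example value)
$eventDate = [datetime]'2013-06-22 13:00'    # when the change happened (example value)

# Whole days between the two dates = number of log cycles since the event.
$logNumber = ($today.Date - $eventDate.Date).Days

"Look in log number $logNumber"
```

If the guess turns out wrong, checking the neighboring log numbers is the obvious fallback.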

The output (including the censored database names) was then copied and pasted into the redbook for that server.  PowerShell can be used as a tool for quick, accurate documentation.

The effort of clicking through a UI to troubleshoot has very little reuse value.  A persisted PowerShell session command line history, transcript, or saved script file can be shared, and used again and again.

A question to ask yourself: If I have to do this two or more times, should I consider investing some time to learn how to access the information/automate the process in PowerShell?

Wednesday, January 09, 2013

Identifying i/o Bottlenecks in SQL Server

In November of 2012 I presented to the Utah Valley SQL Server User Group on finding performance issues within SQL Server instances by focusing on storage and i/o for data and log files.

The Identifying i/o Bottlenecks slide deck is now available.
image

I used some slides from Wes Brown’s May 2011 SQL Rally presentation on Understanding Storage Systems and SQL Server, and included some PowerShell snippets.

The deck references a PowerShell script, mentioned in a previous post, that I frequently use to view the storage used by a particular server.