I’ve been playing around a bit recently with Powershell and XML. It’s one of those expirements where I’m not sure what the immediate benefit is, but it certainly is interesting seeing what kind of functionality we have available to us as data folks. I’m going to see what more I can coax out of it, but I wanted to share with you what I’ve learned so far.
First off, understand that I’m not that strong when it comes to XML. I get what it is, I understand the basic structure, but wrangling it isn’t something I’ve had to do a lot of. As a result, I’m still very much a newbie with XPath and XQuery. I understand nodes and properties, but then it starts to get muddy. Just a disclaimer before we get to far into this.
.NET, and by extension Powershell, has an XML data type. This is useful because query plans are XML documents, whether we save them off or stored in the plan cache. So it’s a fairly simply matter to suck a query plan into an XML variable:
[xml]$plan=(gc SomeSQLQuery.sqlplan)
From here, we can start browsing through our plan using the dot notation to parse the plan. The query plan itself is going to be found under the ShowPlanXML node. Under that, there’s a fairly complex layout that you can really dig into by looking at the full schema documentation. Suffice to say, if we want to see the SQL text from the query, we’d need to look at:
$plan.ShowPlanXML.BatchSequence.batch.Statements.StmtSimple.StatementText
That’s a lot of drilldown! What’s worse is, if we wanted to started finding specific operators, we would quickly get lost in a recursive arrangement of RelOp nodes and actual operators. So if we want to extract something useful out of the XML, we need to leverage XPath using the .SelectNodes() method of the XML data. The only problem here is that, for some reason, the namespace that is contained within the plan confuses any calls using .SelectNodes(). To get around this, I basically ripped out the namespace so that the defaults can be used:
[xml]$plan=(gc SomeSQLQuery.sqlplan) -replace 'xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan"'
At this point, I can now start using XPath to analyze my query plan. So if I wanted to pull up all my table scans:
$plan.SelectNodes('//TableScan/Object') | ft
Or, if I wanted to get all my table or index scans:
$plan.SelectNodes('//*[contains(name(),"Scan")]/Object') | ft
And so and so forth.
Now, what does this get me? At this point, I’m not sure. I started down this road after seeing Jason Strate’s(@StrateSQL) presentation on shredding the plan cache with T-SQL. My thought process was that this might be an easier way to dissect the plan cache for useful information. In a way, I was right, because it was a little easier to grasp, but it also seems like it’s the long way around the horn to get at that information. I’ll continue to poke at it and see what I can coax out of it.
Mike, was digging into some SSISDB 2012 stuff and came across this.
One thing that this approach would be really handy for – SSISDB 2012. Pull out the blob from SSISDB, unzip the DTS file via powershell into a variable, then shred to pull out the details. Obviously a lot of it is exposed inside SQL via job details, but this might allow you to view the package directly, which is hard to do otherwise.
We’ll see if that’s a worthwhile approach.