PowerShell and XML and SQL Server

PowerShell offers support for XML data directly though its [xml] datatype and this article is going to look at that.

First of all to make use of the builtin [xml] datatype just prefix a variable name with [xml] when you assign something to it.

PS C:\demos> [xml]$order = "<order customer='joe'>
>>         <line price='3.00' qty='4'>nut</line>
>>         <line price='1.00' qty='3'>hammer</line>
>>         </order>"
>>

At first glance this appears to be an ordinary assignment of a string to a variable. But it is doing more than that, it is checking the string to make sure that it is well-formed XML, i.e. text that conforms to the XML specification. For example the following will produce an error

PS C:\demos> [xml]$xdata = "<order ><line>stuff</line></Order>"
Cannot convert value "<order ><line>stuff</line><Order>" to type "System.Xml.Xm
lDocument". Error: "Unexpected end of file has occurred. The following elements
 are not closed: Order, order. Line 1, position 34."
At line:1 char:12
+ [xml]$xdata  <<<< = "<order ><line>stuff</line><Order>"

because of a mis-match between the opening <order> tag and the closing <\Order> tag. So when you use an [xml] variable in PowerShell you can be sure that what it contains is XML.

So $order is an [xml] variable, what can we do with it? One of the nice things about PowerShell is that if you are not sure what you can do with a varable you can always ask by piping the variable to the get-member function.

PS C:\demos> $order | get-member

   TypeName: System.Xml.XmlDocument
Name                        MemberType            Definition
----                        ----------            ----------
ToString                    CodeMethod            static System.String XmlNo...
add_NodeChanged             Method                System.Void add_NodeChange...
add_NodeChanging            Method                System.Void add_NodeChangi...
add_NodeInserted            Method                System.Void add_NodeInsert...
add_NodeInserting           Method                System.Void add_NodeInsert...
add_NodeRemoved             Method                System.Void add_NodeRemove...
add_NodeRemoving            Method                System.Void add_NodeRemovi...
AppendChild                 Method                System.Xml.XmlNode AppendC...
Clone                       Method                System.Xml.XmlNode Clone()
CloneNode                   Method                System.Xml.XmlNode CloneNo...
CreateAttribute             Method                System.Xml.XmlAttribute Cr...
CreateCDataSection          Method                System.Xml.XmlCDataSection...
CreateComment               Method                System.Xml.XmlComment Crea...
CreateDocumentFragment      Method                System.Xml.XmlDocumentFrag...
CreateDocumentType          Method                System.Xml.XmlDocumentType...
CreateElement               Method                System.Xml.XmlElement Crea...
CreateEntityReference       Method                System.Xml.XmlEntityRefere...
CreateNavigator             Method                System.Xml.XPath.XPathNavi...
CreateNode                  Method                System.Xml.XmlNode CreateN...
CreateProcessingInstruction Method                System.Xml.XmlProcessingIn...
CreateSignificantWhitespace Method                System.Xml.XmlSignificantW...
CreateTextNode              Method                System.Xml.XmlText CreateT...
CreateWhitespace            Method                System.Xml.XmlWhitespace C...
CreateXmlDeclaration        Method                System.Xml.XmlDeclaration ...
Equals                      Method                System.Boolean Equals(Obje...
GetElementById              Method                System.Xml.XmlElement GetE...
GetElementsByTagName        Method                System.Xml.XmlNodeList Get...
GetEnumerator               Method                System.Collections.IEnumer...
GetHashCode                 Method                System.Int32 GetHashCode()
GetNamespaceOfPrefix        Method                System.String GetNamespace...
GetPrefixOfNamespace        Method                System.String GetPrefixOfN...
GetType                     Method                System.Type GetType()
get_Attributes              Method                System.Xml.XmlAttributeCol...
get_BaseURI                 Method                System.String get_BaseURI()
get_ChildNodes              Method                System.Xml.XmlNodeList get...
get_DocumentElement         Method                System.Xml.XmlElement get_...
get_DocumentType            Method                System.Xml.XmlDocumentType...
get_FirstChild              Method                System.Xml.XmlNode get_Fir...
get_HasChildNodes           Method                System.Boolean get_HasChil...
get_Implementation          Method                System.Xml.XmlImplementati...
get_InnerText               Method                System.String get_InnerText()
get_InnerXml                Method                System.String get_InnerXml()
get_IsReadOnly              Method                System.Boolean get_IsReadO...
get_Item                    Method                System.Xml.XmlElement get_...
get_LastChild               Method                System.Xml.XmlNode get_Las...
get_LocalName               Method                System.String get_LocalName()
get_Name                    Method                System.String get_Name()
get_NamespaceURI            Method                System.String get_Namespac...
get_NameTable               Method                System.Xml.XmlNameTable ge...
get_NextSibling             Method                System.Xml.XmlNode get_Nex...
get_NodeType                Method                System.Xml.XmlNodeType get...
get_OuterXml                Method                System.String get_OuterXml()
get_OwnerDocument           Method                System.Xml.XmlDocument get...
get_ParentNode              Method                System.Xml.XmlNode get_Par...
get_Prefix                  Method                System.String get_Prefix()
get_PreserveWhitespace      Method                System.Boolean get_Preserv...
get_PreviousSibling         Method                System.Xml.XmlNode get_Pre...
get_SchemaInfo              Method                System.Xml.Schema.IXmlSche...
get_Schemas                 Method                System.Xml.Schema.XmlSchem...
get_Value                   Method                System.String get_Value()
ImportNode                  Method                System.Xml.XmlNode ImportN...
InsertAfter                 Method                System.Xml.XmlNode InsertA...
InsertBefore                Method                System.Xml.XmlNode InsertB...
Load                        Method                System.Void Load(Stream in...
LoadXml                     Method                System.Void LoadXml(String...
Normalize                   Method                System.Void Normalize()
PrependChild                Method                System.Xml.XmlNode Prepend...
ReadNode                    Method                System.Xml.XmlNode ReadNod...
RemoveAll                   Method                System.Void RemoveAll()
RemoveChild                 Method                System.Xml.XmlNode RemoveC...
remove_NodeChanged          Method                System.Void remove_NodeCha...
remove_NodeChanging         Method                System.Void remove_NodeCha...
remove_NodeInserted         Method                System.Void remove_NodeIns...
remove_NodeInserting        Method                System.Void remove_NodeIns...
remove_NodeRemoved          Method                System.Void remove_NodeRem...
remove_NodeRemoving         Method                System.Void remove_NodeRem...
ReplaceChild                Method                System.Xml.XmlNode Replace...
Save                        Method                System.Void Save(String fi...
SelectNodes                 Method                System.Xml.XmlNodeList Sel...
SelectSingleNode            Method                System.Xml.XmlNode SelectS...
set_InnerText               Method                System.Void set_InnerText(...
set_InnerXml                Method                System.Void set_InnerXml(S...
set_Prefix                  Method                System.Void set_Prefix(Str...
set_PreserveWhitespace      Method                System.Void set_PreserveWh...
set_Schemas                 Method                System.Void set_Schemas(Xm...
set_Value                   Method                System.Void set_Value(Stri...
set_XmlResolver             Method                System.Void set_XmlResolve...
Supports                    Method                System.Boolean Supports(St...
Validate                    Method                System.Void Validate(Valid...
WriteContentTo              Method                System.Void WriteContentTo...
WriteTo                     Method                System.Void WriteTo(XmlWri...
Item                        ParameterizedProperty System.Xml.XmlElement Item...
order                       Property              System.Xml.XmlElement orde...

PS C:\demos>

Notice that an [xml] PowerShell variable is an .NET XmlDocument. So everything you know about an XmlDocument can be applied to a PowerShell [xml] variable. In a later article we will look at how you can leverage this fact, but for now we will stick to the XML support that is directly part of PowerShell.

Look closely at the information returned by get-member… notice there is a property named “order” of type XmlElement. If you lookup XmlDocument in MSDN you will not find an order property for it, this is something that PowerShell has added.

The order property is in fact the root, or document, element for the $order document. PowerShell has a simple path language for XML that makes use of the additional properties that PowerShell adds to XML elements. This language is not XPath but still very useful. The simplest thing you can do with an XML variable is ask what it contains:

PS C:\demos> $order
order
-----
order

PS C:\demos>

It produces a table that shows the children of the $order variable. It shows that the $order variable has a single child named “order” whose value is “order”. Since we saw that order is a property of the $order variable we can append it with a “.” to that variable to see what that contains:.

PS C:\demos> $order.order
customer                                line
--------                                ----
joe                                     {line, line}

PS C:\demos>

When PowerShell displays the children of an element in at table, or list, it show both the elements and attributes of that element.

If you prefer to see the children of an [xml] variable in list form you can pipe the results to the format-list function:

PS C:\demos> $order.order | format-list

customer : joe
line     : {line, line}

Here we see that line has a number of lines as children.

We can extend this dotted property syntax further and find the value for customer:

PS C:\demos> $order.order.customer joe PS C:\demos>

We can tell that this is just a plain old value because the telltale “——-” used to separate the names of properties from their values is not there. Likewise if we had used a list form the name : is not there.

Now let’s use the dotted property syntax to see what the value of line is.

PS C:\demos> $order.order.line
price                      qty                        #text
-----                      ---                        -----
3.00                       4                          nut
4.00                       2                          hammer

Here we can see that PowerShell has output a two lines, one for each line element in the order element. price and qty are attributes of the line and the special name #text indicates that the line element contains some text.

Since there are a number of line elements we can refer to them individually using an array syntax.

PS C:\demos> $order.order.line[0]
price                      qty                        #text
-----                      ---                        -----
3.00                       4                          nut

PS C:\demos> $order.order.line[1]
price                      qty                        #text
-----                      ---                        -----
4.00                       2                          hammer

We can even ask how many line elements there are.

PS C:\demos> $order.order.line.Length
2

Under the covers $order.order.line is a .NET array and we can do with it anything we might do with a .NET array. If we want the price of the second line element…

PS C:\demos> $order.order.line[1].price
4.00

The dotted property syntax is actually a simple path language and is very similar to that used in data binding in ADO.NET, Windows Workflow, and Windows Presentation Framework. Now that we see the basics of this syntax lets see if we can do some calculations. Let’s calculate the value of the order.

PS C:\demos> $value = 0
PS C:\demos> $order.order.line | %{$value += [double]$_.qty * $_.price}
PS C:\demos> $value
20

Note that we had to cast the qty attribute to a double to force a numeric computation.

First we create a $value variable and initialize it to 0. Then we use the dotted property syntax to select the line elements in the $order variable. $order.order.line is actually a .NET array. When a .NET array is passed into a segment of a pipeline, as indicated by the | symbol, each item in the array is passed into that segment one at a time for processing.

The %{} syntax in the second segment of the pipeline says to execute what is between the braces once for each item that flows into the pipe. The $_ refers to the individual item that is flowing through that segment, in this case a line element. So as the line elements flow through the pipe the script between the braces calculates the extended price, i.e. the price times the qty, for the line element and accumulates that product into the $value element. Last we get the value of the $value element.

Next let’s do this same calculation again, but be more “pipeline” oriented.

PS C:\demos> $order.order.line | %{[double]$_.qty * $_.price} | &{begin{$v = 0}process{$v += $_}end{$v}}
20

This is making use of an in-line function, the part inside of &{}. It starts the same way the previous solution did, by passing the line elements into the pipeline. The second segment of the pipeline calculates the extended price for the line element and passes it onto the last segment.

The last segement is the in-line function and consists of three parts. The first part is inside of begin{} and is executed just once no matter how many elements pass through the pipeline and it initializes the $v variable to 0.

The second part of the function, the part inside of process{}, is executed once on each item that passes through the pipeline. It just adds each of the extended prices it receives to the $v variable.

The last part of the function, the part inside of the end{}, is also executed just once, but only after the process part is finished processing all of the items passing through the pipeline. All it does is refer to the $v so it is returned.

We can carry this even further. If we know we will be just summing up some numbers that pass though the pipeline we can change our in-line function into a named function and reuse it over and over.

PS C:\demos> function sum {
>> begin{$v = 0}
>> process{$v += $_}
>> end{$v}
>> }
>>
PS C:\demos> $order.order.line | %{[double]$_.qty * $_.price} | sum
20

So far our example has used literal XML, that is we have literally typed in the XML document we want to process. In real life you will probably have some XML files to process. There are a number of ways to read a file with PowerShell but the easiest it to use for our purposes is the ${} syntax. The file c:\demos\order.xml contains the same XML document we used to initialize $order in the previous examples.

PS C:\demos> [xml]$orders2 = ${c:\demos\order.xml}
PS C:\demos> $orders2
order
-----
order

PS C:\demos>

Note that the full path was specified.

Another source of XML data is SQL Server. It uses a FOR XML clause in a SELECT statement to return the results of a query in the form of XML. For example the following query is made against a table that contains information about last names there were filled out on United States census forms.

SELECT name, rank FROM LastNames FOR XML AUTO, ROOT('Names')

The LastNames table contains the names from the census forms along with their rank, ranked by how many forms contained that name. Now we have to somehow use PowerShell to execute a query against SQL Server. Besides its builtin functionality PowerShell can use any functionality provided by a .NET class. So we will use ADO.NET to execute the query. The tricky part, as we shall see in shortly, is how to stick the results into an [xml] PowerShell variable.

First we need a connection to SQL Server.

PS C:\demos> $conn = new-object System.Data.SqlClient.SqlConnection
PS C:\demos> $conn.ConnectionString = "server=.;integrated security=true;database=scratch"
PS C:\demos> $conn.Open()

To make the connection we use the PowerShell new-object function to create an instance of SqlConnection then fill out the connection string. Now we need a command.

PS C:\demos> $cmd = new-object System.Data.SqlClient.SqlCommand
PS C:\demos> $cmd.CommandText="SELECT name, rank FROM LastNames FOR XML AUTO,ROOT('Names')"
PS C:\demos> $cmd.Connection=$conn

Similar to making the connection we make the command by creating an instance of a SqlCommand and fill out the CommandText and Connection.

When ever a SqlCommand uses FOR XML it must be executed using ExecuteXmlReader, which returns its result in the form of an XmlReader.

PS C:\demos> $xrdr = $cmd.ExecuteXmlReader()

If were were writing a program in C# or VB.NET then at this point we would use the XmlReader to Load an XmlDocument. Wait! Remember what we found out when we used get-member to find out what we could do with an [xml] PowerShell variable? We found out that is was an XmlDocument. So we can just use the Load method on an [xml] varible load it with the results of our query. These is one problem though…

PS C:\demos> [xml]$names.Load($xrdr)
You cannot call a method on a null-valued expression.
At line:1 char:18
+ [xml]$names.Load( <<<< $xrdr)

A freshly created [xml] variable has not yet been initialized so we cannot use it to load anything. There are a couple of ways to intialize it. One is to just assign an XmlDocument to it.

PS C:\demos> [xml]$names = New-Object System.Xml.XmlDocument
PS C:\demos> [xml]$names.Load($xrdr)
PS C:\demos> $names
Names
-----
Names

Here we have use new-object to create an XmlDocument and then assign it to an [xml] variable. After we have done that the Load method can be used to read the results of our FOR XML query into the $order3 variable. The we have PowerShell to output the $name variable in tablular form. Here we can see it contains the expected ‘Names” root element.

Another way to initialize the [xml] varible is to just put some XML into it, then write over it when the Load method is used.

PS C:\demos> [xml]$names = '<a/>'
PS C:\demos> $names.Load($xrdr)
PS C:\demos> $names
Names
-----
Names

Now that we have our Names XML document let’s take a quick look inside of it.

PS C:\demos> $names.Names
LastNames
---------
{SMITH, JOHNSON, WILLIAMS, JONES...}

PS C:\demos> $names.Names.LastNames.Length
88799
PS C:\demos>

Here we can see that there are 88799 names the document and the first few are SMITH,JOHNSON and so on.

One last thing… there is a bug in the RC2 PowerShell that hopefully will be fixed by the time the RTM version is released. Elements whose name is “item” will cause problems when the table or list formatter processes them.

PS C:\demos> [xml]$stuff = '<stuff><item/><item/></stuf
PS C:\demos> $stuff
stuff
-----
stuff

PS C:\demos> $stuff.stuff
format-default : The member "Item" is already present.

You will still be able to process “item” element otherwise, it is just an issue when format-table or format-list is used.

We have seen the PowerShell has some pretty straightforward support for XML in its simple dotted path expressions. You can work with literal XML, XML from files and XML as it is returned from SQL Server in FOR XML queries. The PowerShell builtin path language that is very similar to that used in data binding which you may have already used. I didn’t cover it in this article, but PowerShell also allows you to make use of all the XML capabilities built into .NET… but that leaves me something to work on for the next article.

Dan

dan@pluralsight.com


Posted Oct 28 2006, 02:16 PM by dan-sullivan
Filed under: ,

Comments

Jeffrey P. Snover wrote re: PowerShell and XML and SQL Server
on 10-28-2006 5:48 PM
Very impressive blog entry. You've either been using PowerShell a lot or are a very quick study. I can't wait to read the next one. I see that your previous logs have a lot of SQL in them. I would love your opinion about our DataTable adapter (get a datatable and pipe it to Get-Member to see what I mean).

Jeffrey Snover [MSFT]
Windows PowerShell/MMC Architect
Visit the Windows PowerShell Team blog at: http://blogs.msdn.com/PowerShell
Visit the Windows PowerShell ScriptCenter at: http://www.microsoft.com/technet/scriptcenter/hubs/msh.mspx
Dan wrote re: PowerShell and XML and SQL Server
on 10-29-2006 8:32 AM
Hi Jeffery,

Thanks for the comments.

Making the columns as named properties of the DataTable is useful, but it seems to have introduced a number of other issues that actually limit the utility of a DataTable in PowerShell. I working on clarifying what these are... if you are interested please drop me a note at my pluralsight address, dan@pluralsight.com

Dan
Brian Fernandes wrote Very Cool
on 11-29-2006 5:47 AM
Great article Dan, I took my first look at PowerShell today and these examples helped me do exactly what I wanted with my XML files in no time.

I see you covered the "Item" bug as well, sadly it hasn't been fixed in the RTM version either; good to know it's a bug and not something I'm doing incorrectly.

Add a Comment

(required)  
(optional)
(required)  
Remember Me?