An Xml Tidy in PowerShell or Formatting Xml with Indenting with PowerShell
I like my XML pretty. There's no format-xml cmdlet or tidy-xml in PowerShell, so here's my first try:
#Name me tidy-xml.ps1
# - this crap written by Scott Hanselman
[System.Reflection.Assembly]::LoadWithPartialName("System.Xml") > $null
$PRIVATE:tempString = ""
if ($args[0].GetType().Name -eq "XmlDocument")
{
    $PRIVATE:tempString = $args[0].get_outerXml()
}
if ($args[0].GetType().Name -eq "String")
{
    $PRIVATE:tempString = $args[0]
}
$r = new-object System.Xml.XmlTextReader(new-object System.IO.StringReader($PRIVATE:tempString))
$sw = new-object System.IO.StringWriter
$w = new-object System.Xml.XmlTextWriter($sw)
$w.Formatting = [System.Xml.Formatting]::Indented
do { $w.WriteNode($r, $false) } while ($r.Read())
$w.Close()
$r.Close()
$sw.ToString()
Sometimes XML is treated as a string and sometimes as [xml] in PowerShell. This script accepts either a string or an [xml] object but always returns a string. (That is, it's on you to do the final [xml] cast, although if you do, the tidying is moot.) For example:
PS> $a = "<foo><bar>asdasd</bar></foo>"
PS> ./tidy-xml $a
<foo>
  <bar>asdasd</bar>
</foo>
PS> $b = [xml]"<foo><bar>asdasd</bar></foo>"
PS> ./tidy-xml $b
<foo>
  <bar>asdasd</bar>
</foo>
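To see why the final [xml] cast undoes the tidying, here is a minimal sketch. Format-XmlText is a hypothetical helper made up for this illustration, and it indents through XmlDocument.WriteTo instead of the reader loop in the script above, so treat it as a stand-in rather than the script itself.

```powershell
# Format-XmlText is a hypothetical helper used only for this illustration.
# It indents via XmlDocument.WriteTo rather than the XmlTextReader loop above.
function Format-XmlText([string] $xmlText) {
    $doc = New-Object System.Xml.XmlDocument
    $doc.LoadXml($xmlText)
    $sw = New-Object System.IO.StringWriter
    $w = New-Object System.Xml.XmlTextWriter($sw)
    $w.Formatting = [System.Xml.Formatting]::Indented
    $doc.WriteTo($w)
    $w.Close()
    $sw.ToString()
}

$tidy = Format-XmlText "<foo><bar>asdasd</bar></foo>"
$tidy.GetType().Name          # String: the indentation lives in the text

$roundTrip = [xml]$tidy       # the cast drops whitespace-only nodes
$roundTrip.get_OuterXml()     # <foo><bar>asdasd</bar></foo>
```

The indentation only exists in the string form, which is why the script hands back a string and leaves any cast to the caller.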
I also wanted to support the following scenarios. Thoughts? Remember that I need to normalize the input to a single string for the StringReader constructor.
#couldn't because it returned an Object[] of strings and it got sloppy fast
get-content foo.xml | tidy-xml
#couldn't because it (oddly) returned an ArrayList of strings and it got sloppy fast
get-content foo.xml -ov c
tidy-xml $c
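One possible way to normalize both shapes into a single string is to join the lines before handing them to the StringReader. This is only a sketch: the $lines array below is a stand-in for what Get-Content would return, and the variable names are made up for the example.

```powershell
# Stand-in for (Get-Content foo.xml): one string per line of the file.
$lines = "<foo>", "  <bar>asdasd</bar>", "</foo>"

# Case 1: an Object[] of strings, as Get-Content emits.
$fromArray = [string]::Join([Environment]::NewLine, $lines)

# Case 2: an ArrayList, as -OutVariable (-ov) collects.
$list = New-Object System.Collections.ArrayList
foreach ($line in $lines) { [void]$list.Add($line) }
$fromList = [string]::Join([Environment]::NewLine, [string[]]$list)

# Either way the result is one string that StringReader or an [xml] cast accepts.
$fromArray -eq $fromList      # True
([xml]$fromArray).foo.bar     # asdasd
```

Joining with a newline keeps the original line structure intact, which matters if the text is shown to a person before it is parsed.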
Enjoy (or improve!)
UPDATE: Here's a better version that includes a number of best-practice changes as well as support for taking in objects from the pipeline (like I wanted originally):
#The following cases work
#
#PS>$a
#<foo><bar>this is A</bar></foo>
#PS>$b.get_OuterXml()
#<foo><bar>this is B</bar></foo>
#PS>Get-Content foo.xml
#<foo>
# <bar>this is C</bar>
#</foo>
#
#Now try the following.
#PS>sal ti tidy-xml
#PS>$a | ti
#PS>$b | ti
#PS>$c | ti
#PS>ti $a
#PS>ti $b
#PS>ti $c
#PS>$a, $b | ti
#PS>$a, $c | ti
#PS>$c, $b | ti
#PS>$a, $b, $c | ti
#
#What doesn't work here is passing multiple inputs as a single argument, as follows:
#tidy-xml $a, $b # doesn't work
#
#Uhm, I think I would have to change my logic "completely" to actually get that to work...
#(after refactoring "process" block...)
#
#Name me tidy-xml.ps1
# - some of this crap written by Scott Hanselman
function Tidy-Xml {
    begin {
        $private:str = ""
        # recursively concatenate strings from passed-in arrays of schmutz
        # not sure how to improve this...
        function ConcatString ([object[]] $szArray) {
            # return string
            $private:rStr = ""
            # Recursively call itself, if a string is also of array or a collection type
            foreach ($private:sz in $szArray) {
                if (($private:sz.GetType().IsArray) -or `
                    ($private:sz -is [System.Collections.IList])) {
                    $private:rStr += ConcatString($private:sz)
                }
                elseif ($private:sz -is [xml]) {
                    $private:rStr += $private:sz.Get_OuterXml()
                }
                else {
                    $private:rStr += $private:sz
                }
            }
            return $private:rStr;
        }
        # Original "Tidy-Xml" portion
        function FormatXmlString ($arg) {
            # ignore parse errors
            trap { continue; }
            # out-null hides output of the assembly load
            [System.Reflection.Assembly]::LoadWithPartialName("System.Xml") | out-null
            $PRIVATE:tempString = ""
            if ($arg -is [xml]){
                $PRIVATE:tempString = $arg.get_outerXml()
            }
            if ($arg -is [string]){
                $PRIVATE:tempString = $arg
            }
            # the ` tick mark is a line-continuation char
            $r = new-object System.Xml.XmlTextReader(`
                new-object System.IO.StringReader($PRIVATE:tempString))
            $sw = new-object System.IO.StringWriter
            $w = new-object System.Xml.XmlTextWriter($sw)
            $w.Formatting = [System.Xml.Formatting]::Indented
            do { $w.WriteNode($r, $false) } while ($r.Read())
            $w.Close()
            $r.Close()
            $sw.ToString()
        }
    }
    process {
        # For non-xml strings or types, they will be buffered and will be
        # taken care of in "end" block
        # this checks for objects that have been "pipe'd" in.
        if ($_) {
            # check if whatever we have appended is a valid XML or not
            $private:xmlStr = ($private:str + $_) -as [xml]
            if ($private:xmlStr -ne $null) {
                FormatXmlString([xml]$private:xmlStr)
                # clear the string not to be handled in "end" block
                $private:str = $null
            } else {
                if ($_ -is [string]) {
                    $private:str += $_
                } elseif ($_ -is [xml]) {
                    FormatXmlString($_)
                }
                # for an array or a collection type,
                elseif ($_.Count) {
                    # iterate each item in the collection and append
                    foreach ($i in $_) {
                        $private:line += $i
                    }
                    $private:str += $private:line
                }
            }
        }
    }
    end {
        if ([string]::IsNullOrEmpty($private:str)) {
            $private:szXml = $(ConcatString($args)) -as [xml]
            if (! [string]::IsNullOrEmpty($private:szXml)) {
                FormatXmlString([xml]$private:szXml)
            }
        } else {
            FormatXmlString([xml]$private:str)
        }
    }
}
Thanks to MonadBlog for the updates! There's definitely some room for refactoring of the begin/end/process, but it's more functional this way.
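As one possible direction for that refactoring, here is a rough sketch that keeps only the pipeline path. Format-XmlSketch and Write-Indented are hypothetical names, the sketch indents with XmlDocument.WriteTo instead of the XmlTextReader loop, and it deliberately ignores $args, so it is not a drop-in replacement for the function above.

```powershell
# Format-XmlSketch and Write-Indented are hypothetical names for this sketch.
# Assumption: input arrives via the pipeline only; $args is not handled.
function Format-XmlSketch {
    begin {
        $buffer = ""
        function Write-Indented([xml] $doc) {
            $sw = New-Object System.IO.StringWriter
            $w = New-Object System.Xml.XmlTextWriter($sw)
            $w.Formatting = [System.Xml.Formatting]::Indented
            $doc.WriteTo($w)
            $w.Close()
            $sw.ToString()
        }
    }
    process {
        if ($null -eq $_) { return }
        # An [xml] object is already a complete document.
        if ($_ -is [xml]) { Write-Indented $_; return }
        # Buffer text until it parses as a complete document.
        $buffer += [string]$_ + "`n"
        $doc = $buffer -as [xml]
        if ($null -ne $doc) {
            Write-Indented $doc
            $buffer = ""
        }
    }
}

$a = "<foo><bar>this is A</bar></foo>"
$b = [xml]"<foo><bar>this is B</bar></foo>"
$c = "<foo>", "<bar>this is C</bar>", "</foo>"

$a, $b | Format-XmlSketch     # two indented documents
$c | Format-XmlSketch         # lines are buffered into one document
```

Like the version above, it emits a document as soon as the buffered text parses, so a line that happens to be well-formed on its own is treated as a complete document.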
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
Btw, I have also posted the "COMPLETE" tabExpansion function on "http://pastehere.com/?armhfr", but I have broken one piece of functionality ($host.ui.rawUi.[tab] doesn't work, meaning nested properties aren't expanded properly...)
I have modified your source a bit and pasted it on http://pastehere.com/?dcuqpi as well as some comments
The code now looks like complete crap thanks to my modification... almost unreadable... sorry about botching your source.