Scott Hanselman

XSLT Stylesheet Performance on Big Ass Documents

January 24, 2006. Posted in XML | Tools

Like it or not, when it comes time to start transforming XML data, folks turn to stylesheets. Sure, it'd be nice if we could write XmlReader/XmlWriter transforms or if one of these Streaming XML Transformation languages would really take off. But for now, you know it, and I know it - folks love their XSLT.

Anyway, we had a large XML document that was on the order of 250 megs, sometimes larger. It was running in a batch process using MSXSL.exe, a command-line tool that invokes the "newest" version of MSXML that's on your system, starting with MSXML4, then moving backwards to MSXML3, then 2.6. It was running out of memory, sometimes using as much as a gig. It was also taking 15 minutes or more. The stylesheet was written three years ago in a very procedural way. XSLT is meant to be written in a more declarative way, with templates that match on the input elements as they find them.

  • Original XSLT with MSXSL using MSXML4 – crashes with a memory exception
  • Original XSLT with NXSLT 1.6 (.NET 1.1) – Private bytes level out around 1G
      Source document load time: 16059.870 milliseconds
      Stylesheet load/compile time: 204.672 milliseconds
      Stylesheet execution time: 683552.000 milliseconds

This stylesheet wasn't very optimized and was kinda:

<xsl:template match="/">
 <xsl:for-each select="$x">
  <xsl:variable name="Something">
   <xsl:call-template name="CreateSomething">
    <xsl:with-param name="Row" select="."/>
   </xsl:call-template>
  </xsl:variable>
  <xsl:value-of select="$Something"/>
 </xsl:for-each>
</xsl:template>

...which is sub-optimal: it drives everything from one big loop instead of letting templates match the input. Not only that, but the variable Something is holding the results of the template rather than allowing them to "flow" out as data is transformed. This transform actually had two input files: the main one, and another small one that contained configuration and some other details, which was selected into a variable:

<xsl:variable name="Foo" select="document('foo.xml')"/>

The stylesheet was rewritten to be more template-focused ala:

<xsl:template match="Row" >
   <xsl:apply-templates select="$x[@SomeID = $someID]"/>
</xsl:template>
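
To make the contrast concrete, here is a minimal, self-contained sketch of the template-driven shape. It is not the real stylesheet: the element names (Rows, Row, Name), the SomeID attribute, and the comma-separated output are all made up for illustration. The only point is that each Row writes its output as it is matched, with no intermediate xsl:variable capturing the result first.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes input shaped like
     <Rows><Row SomeID="1"><Name>a</Name></Row>...</Rows>
     (hypothetical names, not the real document). -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Start at the root and let the processor dispatch on what it finds. -->
  <xsl:template match="/">
    <xsl:apply-templates select="Rows/Row"/>
  </xsl:template>

  <!-- Each Row emits its output immediately; nothing is buffered
       in a variable and copied out afterwards. -->
  <xsl:template match="Row">
    <xsl:value-of select="@SomeID"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="Name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against the one-row input shown in the comment, this should write 1,a followed by a newline. How much memory or time a change like this saves depends on the processor, so measure it with the one you actually use.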

After this change/re-write, the stylesheet was sped up by about 66% and didn't run out of memory. However, it was still using MSXSL and we wanted to try a few other processors. I did try Saxon and a few Java/C++ parsers but they ran out of memory, so don't pick on me for not including their numbers, as this post is primarily a test of the various Microsoft XSL/T options. All these timings are generated with the -t option that all these utilities support.

  • Improved XSLT with MSXSL using MSXML4 – private bytes level out around 300M
      Source document load time: 41920 milliseconds
      Stylesheet document load time: 18.37 milliseconds
      Stylesheet compile time: 3.692 milliseconds
      Stylesheet execution time: 174327 milliseconds
  • Improved XSLT with NXSLT 1.6 (.NET 1.1) – private bytes level out around 550M
      Source document load time:     17893.370 milliseconds
      Stylesheet load/compile time:    462.974 milliseconds
      Stylesheet execution time:     629697.700 milliseconds

Interestingly, but not unexpectedly, the .NET 1.1 XSLT transformations used by NXSLT are slower than the original unmanaged code in MSXML. A lot of XSLT wonks have apparently said, after the release of .NET 1.1, that when you have to do some hard-core (large) XSLT you should still use MSXSL.

We had two questions at this point: What if we used MSXML6? What if we used .NET 2.0 (whose XSLT engine was greatly improved)?

However, MSXSL.exe hasn't been updated to support MSXML6 yet (the site says coming soon), and while I could go to a VBScript or whatever, I figured why not just add the support to the source of MSXSL (which is available here). I couldn't find the updated SDK header files for MSXML.H so I just hacked it together from the registry. The general gist is at the bottom of this post.

Anyway, I made a version of MSXSL that tries for MSXML6, and falls back to 4, etc. Then I got Oleg's NXSLT2 friendly command-line 2.0 stuff.

You may ask why I'm using this command-line stuff. Well, Oleg has kindly seen fit to maintain "command-line compatibility" with MSXSL.exe which makes swapping out command-line processors within our batch process very easy.

  • Improved XSLT with NXSLT2 (.NET 2.0) - private bytes level out around 500M
      Stylesheet load/compile time:   4596.000 milliseconds  
      Transformation time:           53248.000 milliseconds  
      Total execution time:          59064.000 milliseconds
  • Improved XSLT with (custom) MSXSL using MSXML6 - private bytes level out around 300M
      Source document load time:     33677 milliseconds    
      Stylesheet document load time: 4.685 milliseconds    
      Stylesheet compile time:       3.774 milliseconds    
      Stylesheet execution time:     200952 milliseconds  

Nutshell: .NET 2.0 was about 10x faster than .NET 1.1 (roughly 59 seconds total versus roughly 648). MSXML6 was about 15% slower than MSXML4 on stylesheet execution time (roughly 201 seconds versus 174). This, of course, was with one specific funky stylesheet and one rather big ass file. Either way, we are sticking with the MSXML4 stuff for now, but looking forward to .NET 2.0's support for this particular style (pun intended) of madness.

Updating MSXSL to choose MSXML6: I cracked open the source for MSXSL.  I couldn't find the new MSXML6.H so I added this to msxmlinf.hxx:

typedef class XSLTemplate60 XSLTemplate60;
#ifdef __cplusplus
class DECLSPEC_UUID("88d96a08-f192-11d4-a65f-0040963251e5")
XSLTemplate60;
#endif
typedef class DOMDocument60 DOMDocument60;
typedef class FreeThreadedDOMDocument60 FreeThreadedDOMDocument60;
#ifdef __cplusplus
class DECLSPEC_UUID("88d96a05-f192-11d4-a65f-0040963251e5")
DOMDocument60;
class DECLSPEC_UUID("88d96a06-f192-11d4-a65f-0040963251e5")
FreeThreadedDOMDocument60;
#endif

Then I updated the static array and factory in msxmlinf.cxx to check for the version specific ProgID:

const MSXMLInfo::StaticInfo MSXMLInfo::s_staticInfo60 =
{
    VERSION_60,
    L"6.0",

    "{88d96a08-f192-11d4-a65f-0040963251e5}",
    &__uuidof(XSLTemplate60),
    L"MSXML2.XSLTemplate.6.0",

    &__uuidof(DOMDocument60),
    L"MSXML2.DOMDocument.6.0",

    &__uuidof(FreeThreadedDOMDocument60),
    L"MSXML2.FreeThreadedDOMDocument.6.0",
};

...along with a few other things. Email me if you want the source; I don't think I'm allowed to redistribute this. Anyway, when I ran it the first time I got an "Access Denied 0x80004005" and stared at it for a while. Andy Phenix said, "Didn't they tighten security and break some stuff in MSXML6?" The fix involved using IXMLDOMDocument2 and explicitly allowing the document() function to load our 'foo.xml':

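// MSXML6 turns document() off by default; re-enable it on the stylesheet DOM.
VARIANT FreakingTrue;
VariantInit(&FreakingTrue);
FreakingTrue.vt = VT_BOOL;
FreakingTrue.boolVal = VARIANT_TRUE;
BSTR propName = SysAllocString(L"AllowDocumentFunction"); // setProperty takes a BSTR
pStylesheet->setProperty(propName, FreakingTrue);
SysFreeString(propName);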

Once we turned on the document() feature, everything worked great. However, I wasn't sure if MSXML4 or MSXML6 was doing the work. (I did filemon.exe and regmon.exe as well as procexp.exe and it WAS in fact loading msxml6.dll) I noticed some cleverness, again from Oleg that allows the XSLT stylesheet to actually detect what vendor and (if MSFT) version of the XSLT engine was being used. I'd reprint it, but you should go visit his site anyway.
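
For a quick check without leaving this page, the standard system-property() function gets you most of the way. The sketch below is not Oleg's stylesheet and may well differ from his technique. xsl:vendor and xsl:version are defined by XSLT 1.0; msxsl:version is a Microsoft extension property documented for MSXML, and the assumption here is that a processor that doesn't recognize it returns an empty string, as the spec says it should.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="msxsl">
  <xsl:output method="text"/>

  <!-- Report which engine is running this stylesheet. -->
  <xsl:template match="/">
    <xsl:text>Vendor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>&#10;XSLT version: </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <!-- Microsoft-specific; expected to be empty on other processors. -->
    <xsl:text>&#10;MSXML version: </xsl:text>
    <xsl:value-of select="system-property('msxsl:version')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Any well-formed XML file works as input, since the template only looks at the processor and never at the document.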

Thanks to Krishnan and Andy for their hard work on the new stylesheet and performance testing.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

January 24, 2006 7:52
The reasons .NET 2.0 is so fast are that (1) Helena Kupkova rewrote the XML parsers to be much much faster than the original implementation and (2) Andy Kimball rewrote the XSLT engine from scratch, using a little thing called QIL that I came up with. http://www.tkachenko.com/blog/archives/000474.html

QIL gave us two things: the ability to reason about (and then optimize) the XSLT at an algebraic level, and the ability to dynamically generate assembly code from that optimized form instead of interpreting the original XSLT instructions as the original implementation had done.

I'm glad to see that it had such dramatic effects :-)
January 24, 2006 8:07
Geez Michael...you were ALREADY my Hero from the Xbox backward compatibility. Now this?

I officially have a man-crush on you.
January 24, 2006 10:25
LOL! Well, both were team efforts, so you've really got a crush on Microsoft. :-)

I can't take full credit even for QIL. I did the details but Chris Suver (now at Amazon) helped get the ball rolling and was very supportive in a difficult team environment. None of it would have happened without him, even though he never wrote a line of code for it.

Andy and I spent a lot of time figuring out ways to optimize XSLT. One of the things we learned is that minor syntactic differences can make major differences in performance. For example, you might write

<xsl:variable name="x" select="some expression"/>

and then later in your XSLT only use string($x) or number($x) or test $x for existence. In such cases, it's usually better to write

<xsl:variable name="x_as_string" select="string(some expression)"/>
<xsl:variable name="x_as_number" select="number(some expression)"/>
<xsl:variable name="x_exists" select="boolean(some expression)"/>

because the implementation can (internally) assign x to a simple atomic value instead of having to keep around a large nodeset.
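
As a hedged illustration of that advice in a complete stylesheet: Orders, Order and the Count attribute are invented names, and whether this helps at all depends on the processor, so treat it as something to measure rather than a rule.

```xml
<?xml version="1.0"?>
<!-- Sketch only: Orders, Order and @Count are hypothetical names. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Later code needs only a yes/no answer and a number, so bind
         atomic values rather than the node-sets they came from. -->
    <xsl:variable name="has_orders" select="boolean(Orders/Order)"/>
    <xsl:variable name="order_count" select="number(Orders/@Count)"/>

    <xsl:if test="$has_orders">
      <xsl:text>Orders: </xsl:text>
      <xsl:value-of select="$order_count"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```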

As you found, assigning a variable to a document() is horribly inefficient. That's because -- in general -- implementations have to load the entire document into memory. We did some work to optimize such cases, but it's still much better if you can process the document in a "streaming" fashion.

Union (|) and anything else that requires removing duplicate nodes from a nodeset (such as certain // queries) are bad for performance because they end up requiring a set comparison to eliminate the potentially duplicate nodes -- even if you know that won't ever happen, the XSLT engine often has no way of knowing.

Using position() is generally killer for optimizations, but unfortunately is so easy to write. Counterintuitively, sometimes when you "know" a nodeset will have at most 1 node in it, adding a [1] to a path can help optimizations even when it has no other effect on the XSLT. Similarly, when navigating through a document, node() is faster than * which is faster than a name test, because the name test has to check both node type and name, while * just has to check node type (element), and node() doesn't have to check anything at all. These things are minor, but can make a measurable difference with large documents or complex XSLTs.

But probably the most difficult part of XSLT is that there are no hard-and-fast performance rules. For any rule you can come up with, I can construct an XSLT that performs horribly when you apply that rule. This makes it really hard to offer people generic advice on improving XSLT performance. But I guess it's built-in job security for consultants :-)

I can't emphasize enough the importance of testing the effects of each optimization you make, and as you did, testing it with the processor you're going to use. Even minor implementation details (such as storing nodesets internally using forward or backward pointers) can have major effects on what queries are most efficient (e.g., following::* or preceding::*). Consequently, no two XSLT implementations have the same performance characteristics -- optimizing for one may hurt performance on another.
January 24, 2006 14:25
Wow, nice numbers!

And btw, even though msxsl comes with source and no license at all, they don't want us to fix and redistribute it. I've been told that msxsl.exe with MSXML6 support is done already and was to be released in December. Well, I hope they didn't mean December 2006!
January 25, 2006 12:08
Hi Scott,

Really interesting article. We do a great amount of XSLT work at our company in the UK, not with such big XML data I might add!! Nice to know that the .NET 2.0 transforms are running that much faster than 1.1, although for us performance has never been a problematic area since, as I mentioned, our XML data sources are never more than a few K, 10-20K max. Another good reason for me to press the guys at work to move to the .NET 2.0 platform!

Cheers

Dave Mc
January 26, 2006 20:21
Scott,
Very cool benchmark. I wonder if you could post the .NET code that you used to run the transform? I'm wondering if the approach you used to process the documents (XmlDocument vs. XPathDocument, for example) might have had any impact on the transformation or loading performance?

Jordan
January 26, 2006 21:53
Jordan, The code used is NXSLT (source available, links in my post) and it uses XPathDocuments, etc. It's a walking best practice, written by Oleg.

Scott
January 31, 2006 19:23
Hi

Interesting though this is, I think the results could have been presented better, it is very confusing as it stands.

For each tested setup, your results list differing measured parameters.

For example this heading:

Improved XSLT with NXSLT2 (.NET 2.0) - private bytes level out around 500M

has no entry for "Stylesheet Execution Time" and this is just hard to follow.

Why not present the data in a table that is easy to understand?

Hugh



Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.