Moving ViewState to the Bottom of the Page
I was working on some ASP.NET hacks and wanted to move the ViewState to the bottom of the page in order to get Google to pay more attention to my page and less to the wad of Base64'ed ViewState.
First I tried this because it's the closest to the way my mind works:
static readonly Regex viewStateRegex = new Regex(
    // Keep the pattern on one line: a verbatim string would match a line break literally.
    @"(<input type=""hidden"" name=""__VIEWSTATE"" value=""[a-zA-Z0-9\+=\\/]+"" />)",
    RegexOptions.Multiline | RegexOptions.Compiled);
static readonly Regex endFormRegex = new Regex(@"</form>",
    RegexOptions.Multiline | RegexOptions.Compiled);

protected override void Render(HtmlTextWriter writer)
{
    // Defensive coding checks removed for speed and simplicity.
    // If these don't work out, you've likely got bigger problems.
    System.IO.StringWriter stringWriter = new System.IO.StringWriter();
    HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter);
    base.Render(htmlWriter);
    string html = stringWriter.ToString();
    Match viewStateMatch = viewStateRegex.Match(html);
    string viewStateString = viewStateMatch.Captures[0].Value;
    html = html.Remove(viewStateMatch.Index, viewStateMatch.Length);
    // This will only work if you have only one </form> on the page
    Match endFormMatch = endFormRegex.Match(html, viewStateMatch.Index);
    html = html.Insert(endFormMatch.Index, viewStateString);
    writer.Write(html);
}
However, it was taking about a thousandth of a second (~0.001230s) to do the work and that didn't feel right. Of course, by taking over the HtmlTextWriter and spitting it out as a string I've boogered up all the benefits of buffering and the whole streaming thing, but it still felt wrong.
So, against my better judgement, I did it again like this:
protected override void Render(System.Web.UI.HtmlTextWriter writer)
{
    System.IO.StringWriter stringWriter = new System.IO.StringWriter();
    HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter);
    base.Render(htmlWriter);
    string html = stringWriter.ToString();
    int StartPoint = html.IndexOf("<input type=\"hidden\" name=\"__VIEWSTATE\"");
    if (StartPoint >= 0)
    {
        // Cut out the whole <input ... /> element.
        int EndPoint = html.IndexOf("/>", StartPoint) + 2;
        string viewstateInput = html.Substring(StartPoint, EndPoint - StartPoint);
        html = html.Remove(StartPoint, EndPoint - StartPoint);
        // Insert directly before the first </form> on the page.
        int FormEndStart = html.IndexOf("</form>");
        if (FormEndStart >= 0)
        {
            html = html.Insert(FormEndStart, viewstateInput);
        }
    }
    writer.Write(html);
}
I always assumed (mistake #1) that IndexOf was pretty expensive, particularly on larger strings. However, this method averaged out at 0.000995s. It consistently beat the Regex version, even though the Regexes were precompiled and (I think) simple.
Now, to be clear, I'm just playing here, and I know it's microperf and premature optimization. The really interesting thing would be to do a matrix of page size vs. viewstate size. You know, large page, small viewstate against small page, large viewstate and all points in between, then try it with both techniques and see which is better for these different scenarios. But, I'm tired and have other things to do, so if you like, there's some homework for you. What does this data set look like: viewstate size vs. page size vs. technique?
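One way to start on that homework is a small console harness like the sketch below. It is only a sketch under stated assumptions: it runs outside ASP.NET against synthetic pages, uses System.Diagnostics.Stopwatch for timing, and borrows the simplified `[^"]+` pattern suggested in the comments. The names ViewStateBench, BuildPage, MoveWithRegex, MoveWithIndexOf and AverageMs are made up for illustration, and the sizes and iteration count are arbitrary. It prints timings for each page size / viewstate size combination; the numbers will vary by machine, so none are shown here.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical harness, not part of the original page class.
public static class ViewStateBench
{
    // Simplified pattern (assumption: the field is rendered on a single line).
    static readonly Regex ViewStateRegex = new Regex(
        @"<input type=""hidden"" name=""__VIEWSTATE"" value=""[^""]+"" />",
        RegexOptions.Compiled);

    // Builds a fake page: a form holding a viewstate field of viewStateChars
    // characters, followed by bodyChars characters of filler.
    public static string BuildPage(int bodyChars, int viewStateChars)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("<html><body><form method=\"post\">\n");
        sb.Append("<input type=\"hidden\" name=\"__VIEWSTATE\" value=\"");
        sb.Append('A', viewStateChars);
        sb.Append("\" />\n");
        sb.Append('x', bodyChars);
        sb.Append("\n</form></body></html>");
        return sb.ToString();
    }

    // Regex technique: find the field, remove it, reinsert before </form>.
    public static string MoveWithRegex(string html)
    {
        Match m = ViewStateRegex.Match(html);
        if (!m.Success) return html;
        string stripped = html.Remove(m.Index, m.Length);
        int end = stripped.IndexOf("</form>", m.Index, StringComparison.Ordinal);
        return end < 0 ? html : stripped.Insert(end, m.Value);
    }

    // IndexOf technique: same result using plain string searches.
    public static string MoveWithIndexOf(string html)
    {
        int start = html.IndexOf("<input type=\"hidden\" name=\"__VIEWSTATE\"",
            StringComparison.Ordinal);
        if (start < 0) return html;
        int close = html.IndexOf("/>", start, StringComparison.Ordinal);
        if (close < 0) return html;
        int length = close + 2 - start;
        string field = html.Substring(start, length);
        string stripped = html.Remove(start, length);
        int end = stripped.IndexOf("</form>", start, StringComparison.Ordinal);
        return end < 0 ? html : stripped.Insert(end, field);
    }

    // Average milliseconds per call, after one untimed warm-up call.
    public static double AverageMs(Func<string, string> move, string html, int iterations)
    {
        move(html);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) move(html);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / iterations;
    }

    public static void Main()
    {
        int[] sizes = { 1000, 10000, 100000 };   // arbitrary sample sizes
        foreach (int body in sizes)
        {
            foreach (int vs in sizes)
            {
                string html = BuildPage(body, vs);
                Console.WriteLine(
                    "page={0,7} viewstate={1,7} regex={2:F4}ms indexOf={3:F4}ms",
                    body, vs,
                    AverageMs(MoveWithRegex, html, 200),
                    AverageMs(MoveWithIndexOf, html, 200));
            }
        }
    }
}
```

Both methods return the original string untouched if the field or the closing form tag is missing, so a failed match never drops the viewstate. Timing inside a real Render override would also include the StringWriter capture, which this harness deliberately leaves out.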
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
Ya, zipping also
http://www.hanselman.com/blog/ZippingCompressingViewStateInASPNET.aspx
That RegEx is much better, doh! I was trying to hold it to allowed chars but you're right. I'll test it.
The overhead of the RegEx creation is not a problem AFAIK because the creation happens only once (static readonly): the RegEx is compiled to IL and after that it's just running. Plus, I only use the one instance, as Regex instances are threadsafe.
1) Just to humor me, how much worse are the results if you instantiate the regex inside the Render method? I'm curious.
2) You could try increasing the size of the strings used in the 2nd method, because it probably uses Boyer-Moore string matching and that gets faster as the string to be matched gets larger, e.g.
"<input type=\"hidden\" name=\"__VIEWSTATE\" value=\""
"\" />"
I am making a couple of assumptions here, but I have to think that if google can transform pdf/powerpoint files into text and index them correctly, that they have to be ignoring something like hidden form inputs (which the spider doesn't need at all).
I think you would have to have a pretty significant amount of viewstate to cause problems. Compression greatly increases the amount of bytes that the googlebots will grab as well.
Their implementation works very well.
I have modified their existing code to allow for storing the viewstate either in the session or the cache.
I assume you got the viewstate move code from here?
http://scottonwriting.net/sowblog/posts/3536.aspx
I expect a simple IndexOf to be faster than any Regex 100% of the time. However, you might try simplifying the regex a little:
<input type="hidden" name="__VIEWSTATE" value="[^"]+" />
Not sure if you need multiline here. Anyway, I think the overhead of creating the regex once per page is what's hurting you most. Can it be put in the app domain or otherwise removed from the page class?
Also, have you considered viewstate compression in addition to moving it? I can't remember the last results I saw on this.