Most Common ASP.NET Support issues - Reporting from deep inside Microsoft Developer Support
Microsoft Developer Support, also known as "CSS" (Customer Support Services), is where you're sent within Microsoft when you've got problems. They see the most interesting bugs, thousands of issues and edge cases, and collect piles of data. They report this data back to the ASP.NET team (and other teams) for product planning. Dwaine Gilmer, Principal Escalation Engineer, and I thought it would be interesting to get some of that good internal information out to you, Dear Reader. With all those cases and all the projects, there are basically two top things that cause trouble in production ASP.NET web sites. Long story short: Debug Mode and Anti-Virus software.
Thanks to Dwaine Gilmer, Doug Stewart and Finbar Ryan for their help on this post! It's all them!
#1 Issue - Configuration
Seems the #1 issue in support for problems with ASP.NET 2.x and 3.x is configuration.
Symptoms | Notes |
| There are more debug=true cases than there should be. |
People continue to deploy debug versions of their sites to production. I talked about how to automatically transform your web.config and change it to a release version in my Mix talk on Web Deployment Made Awesome. If you want to save yourself a headache, release with debug=false.
Additionally, if you leave debug=true on individual pages, note that this will override the application level setting.
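As a minimal sketch of the transform approach, assuming Visual Studio 2010-style web.config transformations and a transform file named Web.Release.config (the file name and publish setup are assumptions, not details from this post), a release transform can strip the debug attribute so the published site falls back to the default of debug="false":

```xml
<?xml version="1.0"?>
<!-- Web.Release.config: applied on top of web.config when publishing a Release build -->
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <system.web>
    <!-- Remove the debug attribute from <compilation>; without it, debug defaults to false -->
    <compilation xdt:Transform="RemoveAttributes(debug)" />
  </system.web>
</configuration>
```

This only covers the application-level setting. A Debug="true" attribute left on an individual page directive would still need to be removed by hand.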
Here's why debug="true" is bad. Seriously, we're not kidding.
- Overrides request execution timeout making it effectively infinite
- Disables both page and JIT compiler optimizations
- In 1.1, leads to excessive memory usage by the CLR for debug information tracking
- In 1.1, turns off batch compilation of dynamic pages, leading to 1 assembly per page.
- For VB.NET code, leads to excessive usage of WeakReferences (used for edit and continue support).
An important note: Contrary to what is sometimes believed, setting retail="true" in a <deployment/> element is not a direct antidote to having debug="true"!
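For reference, here is roughly where the two settings live, as a sketch rather than a recommendation. The deployment element is only valid in machine.config, while compilation is normally set per application in web.config. The comments are illustrative.

```xml
<!-- machine.config (machine-wide): the retail switch -->
<configuration>
  <system.web>
    <deployment retail="true" />
  </system.web>
</configuration>
```

```xml
<!-- web.config (per application): still set this explicitly -->
<configuration>
  <system.web>
    <compilation debug="false" />
  </system.web>
</configuration>
```

Treating the first as a safety net and the second as the actual fix is consistent with the note above.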
#2 Issue - Problems with an External (non-ASP.NET) Root Cause
Sometimes when you're having trouble with an ASP.NET site, the problem turns out not to be ASP.NET itself. Here are the top three issues and their causes. This category is for cases that were concluded for external reasons and are outside the direct control of support. The subcategories are 3rd party software, anti-virus software, hardware, virus attacks, DoS attacks, etc.
If you've ever run a production website you know there's always that argument about whether to run anti-virus software in production. It's not like anyone's emailing viruses and saving them to production web servers, but you want to be careful. Sometimes IT or security insists on it. However, this means you'll have software that is not your website software trying to access files at the same time your site is trying to access them.
Here's the essence as a bulleted list:
- Concurrency while under pressure: This causes problems in big software. Make sure your anti-virus software is configured appropriately and that you're aware of which processes are accessing which files, as well as how, why and when.
- Profile your applications: .NET and the Web are not black boxes. You can see what's happening if you look. Know what bytes are going out the wire. Know who is accessing the disk. Measure twice, cut once, they say? I say measure a dozen times. You'd be surprised how often folks put an app in production and they've never once profiled it.
- Anti-Virus Software: It can't be emphasized enough that site owners should ensure they are running the latest AV engine and definitions from their chosen anti-malware vendor. Support has seen folks hitting hangs due to flaky AV drivers that are over two years out of date. Another point about AV software is that it is not just about old-school AV scanning of file access. Many products now do low-level monitoring of port activity, script activity within processes, and memory allocation activity, and do not always do these things 100% correctly. Stay up to date!
- Know where you're calling out to: Watch your connections to remote endpoints too: calling web services, accessing file systems, etc. All of this can slow you down if you're not paying attention. Is your DNS correct? Did you add your external hosts to a hosts file to remove DNS latency?
- processModel autoconfig=true: This is in machine.config and folks always mess with it. Don't assume that you know better than the defaults. Everyone wants to change the defaults, add threads, remove threads, change the way the pool works because they think their textboxes-over-data application is special. Chances are it's not, and you'd be surprised how often people will spend days on the phone with support and discover that the defaults were fine and they had changed them long ago and forgotten. Know what you've changed away from the defaults, and know why. Don't program by coincidence.
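To know what you're starting from, the stock machine.config for ASP.NET 2.0 and later ships with a processModel entry along these lines. This is a sketch of the default as I understand it, so check your own machine.config rather than trusting this snippet:

```xml
<!-- machine.config: default process model setting; note the attribute is spelled autoConfig -->
<configuration>
  <system.web>
    <processModel autoConfig="true" />
  </system.web>
</configuration>
```

If your file shows explicit thread or connection limits instead, someone changed the defaults at some point, and it's worth finding out why.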
...and here's the table of details:
Issue | Product | Description | Symptoms | Notes |
Anti-virus software | All | Anti-virus software is installed onto servers and causes all kinds of problems. | | This consists of all AV software reported by our customers. Not all cases report the AV software that is being used, so the manufacturer is not always known. |
3rd party vendors | All | This is a category of cases where the failure was due to a 3rd party manufacturer. | | The top culprits are 3rd party database systems and 3rd party internet access management systems. |
Microsoft component | All | Microsoft software | | Design issues that cause performance issues like sprocs, deadlocks, etc. Profile your applications and the database! (Pro tip: select * from authors doesn't scale.) Pair up DBAs and programmers and profile from end to end. |
Spread the word! What kinds of common issues do YOU run into when running production sites, Dear Reader?
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
One thing that really, really bothers me about IIS6 and 7 is just how much work it is to speed up your website. IIS7 needlessly moved things all over the place, when in reality all most people really want is an "Optimize" tab. Throw "prevent debug=true", all Caching, HTTP Header editing and Gzipping/Compression in one place so it's a one stop shop.
Call it "Uber Retail" setting if you want, but just don't make me go to the Machine.Config to do it, let me do it on a site-by-site basis, or if I choose apply it to the entire server.
Having a checkmark in [x]Passive Mode
Removing that removes a lot of FTP connection issues with certain routers.
This is one case where "Works on my computer" may not be working on the others. ;-)
We have a need to resize images on the fly (we also need to apply some other effects as well) but the .NET documentation specifically says the System.Drawing namespace shouldn't be used in ASP.NET, and it doesn't mention what should be used instead.
We currently do image processing in classic ASP (we bought a 3rd party component to do it) and occasionally get OOM errors (plenty of RAM, not sure what the error actually refers to). We're in the process of converting our application to ASP.NET (MVC2) and I'm concerned that we will have the same problems or, considering the warning, possibly worse problems when we do.
Consider using WPF.
Bertrand Le Roy has some posts on this:
Resizing images from the server using WPF/WIC instead of GDI+
Server-side resizing with WPF: now with JPG
Raj
We always end up deploying debug versions of our platform libraries because we need to reference debug versions during development, and don't have an easy way to switch all the references to release versions and back for builds. And no, we can't change them all to project references.
Do you know a way to resolve this?
For some reason, when browsing an ASP.NET site with a large view state (Not out of control, just large) it would get corrupted every time. The fix was simple in the end, but it took a lot of banging of my head against the wall to figure it out.
All I did was break up the view state by setting maxPageStateFieldLength="5120" and voilà, everything was all better.
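For anyone looking for where that setting goes, it is an attribute on the pages element in web.config. A minimal sketch, reusing the 5120 value from the comment above (whether that chunk size suits your site is something you'd need to test):

```xml
<configuration>
  <system.web>
    <!-- Split view state into multiple hidden fields, each holding at most 5120 characters -->
    <pages maxPageStateFieldLength="5120" />
  </system.web>
</configuration>
```

Chunking like this can help when a proxy or firewall in the path truncates very large form fields, which is one plausible cause of the corruption described.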
We handle different build configurations using Nant scripts and CruiseControl to call the scripts. Since you can pass the configuration to msbuild we can call it with an execute command and change the build configuration per project, like this:
<exec program="C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319\msbuild.exe" commandline='/nologo /p:Configuration=Release "<project path here>"' />
What exactly do you mean by this line:
> An important note: Contrary to what is sometimes believed, setting retail="true" in a <deployment/> element is not a direct antidote to having debug="true"!
Setting deployment retail to "true" will disable debugging at all levels in ASP.NET Runtime 2.0 and higher. So would you care to explain what you mean by this statement?
Cheers
Yeah, we have a build process that changes the build configuration, but that doesn't address the problem I described. If you add an assembly reference to a library there is no way, as far as I can tell, to say "use the debug version of this library in development, then switch to the release version when you produce a live build".
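One approach that may help here, sketched under assumptions: if the referenced libraries are built into per-configuration folders, the HintPath in the .csproj can use the $(Configuration) MSBuild property, so the same reference resolves to the Debug or Release binary depending on the build. The assembly name Platform.Core and the ..\lib\ folder layout are hypothetical.

```xml
<!-- Fragment of a .csproj file; Platform.Core and the lib folder layout are hypothetical -->
<ItemGroup>
  <Reference Include="Platform.Core">
    <!-- $(Configuration) expands to the active build configuration, e.g. Debug or Release -->
    <HintPath>..\lib\$(Configuration)\Platform.Core.dll</HintPath>
    <SpecificVersion>False</SpecificVersion>
  </Reference>
</ItemGroup>
```

This has to be edited into the project file by hand, since the Add Reference dialog writes a fixed path.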
"For a normal debug=”false” page or site, request execution timeout is 110s or whatever alternative timeout you have specified. If you deploy with debug=”true” the timeout is effectively disabled (it is actually set to about 5 hours). The intention of this is that when you are debugging you don’t want the request timing out while you are debugging it.
Setting retail=”true” does reverse some of the debug compilation behaviour that results from having debug=”true” but it will not revert the ASP.NET runtime to enforcing the correct timeout. So if you have debug=”true” in production and your ASP.NET page happens to call something that blocks indefinitely (such as a wayward web service or database stored procedure) the ASP.NET request is not going to timeout for a very, very long time. You would be dependent on the thing you are calling timing out."
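To make the timeout point concrete, here is a sketch of the two web.config settings involved. executionTimeout is expressed in seconds and is only enforced when debug is false; the value of 110 simply restates the default quoted above.

```xml
<configuration>
  <system.web>
    <!-- With debug="true" the request timeout below is effectively ignored -->
    <compilation debug="false" />
    <!-- Maximum seconds a request may run before ASP.NET shuts it down -->
    <httpRuntime executionTimeout="110" />
  </system.web>
</configuration>
```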
At the same time, some parts of the MSDN documentation need to be changed, because on some pages the technical writer explicitly refers to the debug attribute in the web.config, but in many cases the technical writer just mentions "this is only true when ASP.NET debugging is disabled". So when reading this, we basically don't know anymore whether the technical writer means "this is only true when the debug attribute is set to false". So I vote for removing the possibility to use <deployment retail="true"/> and never look back again...
Can you expand on "profile your applications". What is this and how do you do it?
I would also like to know how to profile one's web application. Any tips or hints? Some pointers would be deeply appreciated!
Can you please explain more on how to profile web applications and databases. Thanks!
Here is another reason why debug=true is evil on production sites :
Scripts and images downloaded from the WebResources.axd handler are not cached
I got it from Guthrie's post : http://weblogs.asp.net/scottgu/archive/2006/04/11/442448.aspx
(The Gu also advocated retail=true without warning that it does NOT enable the timeout again.)
A Fan, Tom
These are good reads:
http://msdn.microsoft.com/en-us/library/ff649309.aspx
http://msdn.microsoft.com/en-us/library/ms178699.aspx