Back to Basics - Keep it Simple and Develop Your Sense of Smell - From Linq To CSV
I was working with a friend recently on a side thing they were doing. They wanted to create an "Export" function for some small bit of data and start it from their website.
- You'd hit a URL after logging in
- Some data would come out of a database
- You'd get a .CSV file downloaded
- You could open it in Excel or whatever.
I spoke to my friend and they said it was cool to share their code for this post. This post isn't meant to be a WTF or OMG look at that code; it's meant to talk about some of the underlying issues. There are a few things going on here, and it's not all their fault, but it smells.
- They are using a Page when an IHttpHandler will do.
- Not a huge deal, but there's overhead in making a Page, and they're not using any of the things that make a Page a Page. The call to ClearContents is an attempt at telling the Page to back off. It's easier to just not have a page.
- They're "thrashing" a bit in the Page_Load, basically "programming by coincidence."
- They're not 100% sure of what needs to be done, so they're doing everything until it works. Even setting defaults many times, or calling methods that set properties and then setting those properties again.
- This means a few things. First, HTTP is subtle. Second, the Response APIs are confusing (less so in .NET 4) and it's easy to do the same thing in two ways.
- They're not using using() or IDisposable
- They're cleaning up MemoryStreams and StreamWriters, but if an exception happens, things'll get cleaned up whenever. It's not tight in a "cleaning up after yourself" deterministic (when it needs to be) kind of way.
- They're calling Flush() when it's not really needed, again programming by coincidence. "I think this needs to be done and it doesn't break..."
- Old school data access
- Not bad, per se, but it could be easier to write and easier to read. DataAdapters and DataSets are a hard way to do data access once you've used Linq to SQL, EF or NHibernate.
- Re-Throwing an Exception via "throw ex"
- When you want to re-throw an exception, ALWAYS use a bare throw; or you'll lose your original call stack and it'll be hard to debug your code.
- Not reusable at all
- Now, reuse isn't the end-all, but it's nice. If this programmer wants different kinds of exports, they'll need to extract a lot from the spaghetti.
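The re-throw point is worth a concrete sketch. The class and method names here are made up for illustration:

```csharp
using System;

// Illustrates the difference between "throw ex" and a bare "throw".
public class RethrowDemo
{
    private static void DoWork()
    {
        throw new InvalidOperationException("boom");
    }

    // BAD: "throw ex" resets the stack trace; the DoWork frame
    // that actually failed is lost from the exception.
    public static void RethrowWrong()
    {
        try { DoWork(); }
        catch (Exception ex) { throw ex; }
    }

    // GOOD: a bare "throw" re-throws the same exception with its
    // original stack trace intact.
    public static void RethrowRight()
    {
        try { DoWork(); }
        catch { throw; }
    }
}
```

With the bare throw, a debugger or log still shows DoWork as the origin of the failure; with "throw ex", the trace starts over at the catch block.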
Here's what they came up with first.
Note that this code is not ideal. This is the before. Don't use it.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using System.Configuration;
using System.Data.SqlClient;
using System.Data;
namespace FooFoo
{
public partial class Download : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
System.IO.MemoryStream ms = CreateMemoryFile();
byte[] byteArray = ms.ToArray();
ms.Flush();
ms.Close();
Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.Cookies.Clear();
Response.Cache.SetCacheability(HttpCacheability.Private);
Response.CacheControl = "private";
Response.Charset = System.Text.UTF8Encoding.UTF8.WebName;
Response.ContentEncoding = System.Text.UTF8Encoding.UTF8;
Response.AppendHeader("Pragma", "cache");
Response.AppendHeader("Expires", "60");
Response.ContentType = "text/comma-separated-values";
Response.AddHeader("Content-Disposition", "attachment; filename=FooFoo.csv");
Response.AddHeader("Content-Length", byteArray.Length.ToString());
Response.BinaryWrite(byteArray);
}
public System.IO.MemoryStream CreateMemoryFile()
{
MemoryStream ReturnStream = new MemoryStream();
try
{
string strConn = ConfigurationManager.ConnectionStrings["FooFooConnectionString"].ToString();
SqlConnection conn = new SqlConnection(strConn);
SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM [FooFoo] ORDER BY id ASC", conn);
DataSet ds = new DataSet();
da.Fill(ds, "FooFoo");
DataTable dt = ds.Tables["FooFoo"];
//Create a streamwriter to write to the memory stream
StreamWriter sw = new StreamWriter(ReturnStream);
int iColCount = dt.Columns.Count;
for (int i = 0; i < iColCount; i++)
{
sw.Write(dt.Columns[i]);
if (i < iColCount - 1)
{
sw.Write(",");
}
}
sw.WriteLine();
int intRows = dt.Rows.Count;
// Now write all the rows.
foreach (DataRow dr in dt.Rows)
{
for (int i = 0; i < iColCount; i++)
{
if (!Convert.IsDBNull(dr[i]))
{
string str = String.Format("\"{0:c}\"", dr[i].ToString()).Replace("\r\n", " ");
sw.Write(str);
}
else
{
sw.Write("");
}
if (i < iColCount - 1)
{
sw.Write(",");
}
}
sw.WriteLine();
}
sw.Flush();
sw.Close();
}
catch (Exception Ex)
{
throw Ex;
}
return ReturnStream;
}
}
}
I don't claim to be a good programmer, but I do OK. I went over my concerns with my friend, and suggested first an HttpHandler. I started with Phil's basic abstract HttpHandler (based on my super basic HttpHandler boilerplate). I could certainly have done this by just implementing IHttpHandler, but I like this way. They're about the same number of lines of code. The important part is in HandleRequest (or ProcessRequest).
namespace FooFoo
{
public class ExportHandler : BaseHttpHandler
{
public override void HandleRequest(HttpContext context)
{
context.Response.AddHeader("Content-Disposition", "attachment; filename=FooFoo.csv");
context.Response.Cache.SetCacheability(HttpCacheability.Private);
var dc = new FooFooDataContext();
string result = dc.FooFooTable.ToCsv();
context.Response.Write(result);
}
public override bool RequiresAuthentication
{
get { return true; }
}
public override string ContentMimeType
{
get { return "text/comma-separated-values"; }
}
public override bool ValidateParameters(HttpContext context)
{
return true;
}
}
}
At the time I wrote this, I was writing how I wished the code would look. I didn't have a "ToCsv()" method, but I was showing my friend how I thought the code should be separated. Even better if there was a clean DAL (Data Access Layer) and Business Layer along with a service for turning things into CSV, but this isn't too bad. ToCsv() in this example is a theoretical extension method to take an IEnumerable of something and output it as CSV. I started writing it, but then decided to Google with Bing, and found a decent direction to start with at Mike Hadlow's blog. He didn't include all the code in the post, but it saved me some typing, so thanks Mike!
namespace FooFoo
{
public static class LinqToCSV
{
public static string ToCsv<T>(this IEnumerable<T> items)
where T : class
{
var csvBuilder = new StringBuilder();
var properties = typeof(T).GetProperties();
foreach (T item in items)
{
string line = string.Join(",",properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
csvBuilder.AppendLine(line);
}
return csvBuilder.ToString();
}
private static string ToCsvValue<T>(this T item)
{
if(item == null) return "\"\"";
if (item is string)
{
return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\""));
}
double dummy;
if (double.TryParse(item.ToString(), out dummy))
{
return string.Format("{0}", item);
}
return string.Format("\"{0}\"", item);
}
}
}
This creates an extension method that lets me call something.ToCsv() on anything IEnumerable. It'll spin through the properties (yes, I know that could have been a LINQ statement also, but I like a nice ForEach sometimes. Feel free to fix this up in the comments! ;) ) and build up the Comma Separated Values.
At some point, it really should format Dates as {0:u}, but as of now it works identically to the before code and attempts to rectify most of the issues brought up. Of course, one could take something like this as far as they like and make it as robust as needed.
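In practice, calling it looks something like this. Person is a made-up type, and the extension methods are condensed from the listing above so the snippet stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace FooFoo
{
    public class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static class LinqToCsvDemo
    {
        // Condensed from the ToCsv()/ToCsvValue() listing above.
        public static string ToCsv<T>(this IEnumerable<T> items) where T : class
        {
            var csvBuilder = new StringBuilder();
            var properties = typeof(T).GetProperties();
            foreach (T item in items)
            {
                csvBuilder.AppendLine(string.Join(",",
                    properties.Select(p => ToCsvValue(p.GetValue(item, null))).ToArray()));
            }
            return csvBuilder.ToString();
        }

        private static string ToCsvValue(object item)
        {
            if (item == null) return "\"\"";
            double dummy;
            if (!(item is string) && double.TryParse(item.ToString(), out dummy))
                return item.ToString(); // numbers: left unquoted
            return "\"" + item.ToString().Replace("\"", "\\\"") + "\""; // everything else: quoted
        }

        public static void Main()
        {
            var people = new List<Person>
            {
                new Person { Id = 1, Name = "Alice" },
                new Person { Id = 2, Name = "Bob, Jr." }
            };
            // Strings are quoted, so the comma in "Bob, Jr." stays in one field.
            Console.Write(people.ToCsv());
        }
    }
}
```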
As Mike points out, you can also do little projections to control the output:
string csv = things.Select(t => new { Id = t.Id, Name = t.Name, Date = t.Date, Child = t.Child.Name }).AsCsv();
Your thoughts, Dear Reader?
Related Links
- Good Exception Management Rules of Thumb
- Cargo-cult programming
- LINQ to CSV using DynamicObject
- Mike Hadlow's LINQ ToCsv()
- Recommended: Open Source "FileHelpers" for importing and exporting Fixed and Delimited files with .NET from Marco Meli
- Eric White on Linq to Text and Linq to CSV
- Reading data FROM CSV files using LINQ
- Think Linq
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
Someone implement a comment voting system so I can get voted down, as this opinion will likely be very unpopular with Scott's readership :-)
String.Format...
vs
string line = string.Join(",",properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
(PS Yes, I know LINQ and No, it's not always appropriate in real life)
Coincidentally, this morning I was just about to write a csv export in a WPF app but decided to check my feeds before starting work. Saved me an hour or so.
Now I have some serious ammunition if anyone has an issue with me browsing the web ;-)
Instead of LINQ to CSV do IDataReader to CSV
-Mike
Agreed
o Not a huge deal, but there's overhead in making a Page, and they're not using any of the things that make a Page a Page. The call to ClearContents is an attempt at telling the Page to back off. It's easier to just not have a page.
Agreed
* They're "thrashing" a bit in the Page_Load, basically "programming by coincidence."
Agreed
o They're not 100% sure of what needs to be done, so they're doing everything until it works. Even setting defaults many times, or calling methods that set properties and then setting those properties again.
Agreed
o This means a few things. First, HTTP is subtle. Second, the Response APIs are confusing (less so in .NET 4) and it's easy to do the same thing in two ways.
Agreed
* They're not using using() or IDisposable
Agreed
o They're cleaning up MemoryStreams and StreamWriters, but if an exception happens, things'll get cleaned up whenever. It's not tight in a "cleaning up after yourself" deterministic (when it needs to be) kind of way.
Agreed
o They're calling Flush() when it's not really needed, again programming by coincidence. "I think this needs to be done and it doesn't break..."
I always do this because I've been burned before with a file being closed without its full contents. Supposedly can't happen, but I've seen it in the past.
* Old school data access
DataSets are bad enough, because are you really sure that the contents of what you are selecting actually fits in the web server's memory (for all simultaneous accessors)? Of course you don't know that, so when you get busy your server is going to thrash like crazy and you won't know why.
LINQ is worse, because the objects are heavier and there is tons of Reflection.
Use a DataReader for performance.
o Not bad, per se, but it could be easier to write and easier to read. DataAdapters and DataSets are a hard way to do data access once you've used Linq to SQL, EF or NHibernate.
Easier to write, yes, but LINQ is NEVER easier to read. LINQ ensures that 50% of programmers will NEVER be able to read it. I'm not saying that LINQ doesn't have its place, but you better be sure that all of your programmers will be LINQ-aware. If you aren't 100% certain that that will always be the case, you probably shouldn't use LINQ, especially chains of 8-10 functions in a row.
LINQ is totally not debuggable. It either completely works or gets completely rewritten.
* Re-Throwing an Exception via "throw ex"
o When you want to re-throw an exception, ALWAYS just throw; or you'll lose your current call stack and it'll be hard to debug your code.
Agreed
* Not reusable at all
o Now, reuse isn't the end-all, but it's nice. If this programmer wants different kinds of exports, they'll need to extract a lot from the spaghetti.
Agreed
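Since a couple of commenters suggest a DataReader instead of LINQ, here is a hedged sketch of that approach. The quoting is simplified RFC 4180 style (doubling embedded quotes), and the helper names are made up. The point is that it streams one row at a time instead of materializing objects:

```csharp
using System;
using System.Data;
using System.IO;

// Streams rows straight from an IDataReader to a TextWriter,
// so only one row is ever held in memory.
public static class DataReaderCsv
{
    public static void WriteCsv(IDataReader reader, TextWriter output)
    {
        // Header row from the reader's schema.
        for (int i = 0; i < reader.FieldCount; i++)
        {
            if (i > 0) output.Write(',');
            output.Write(Quote(reader.GetName(i)));
        }
        output.WriteLine();

        // One row at a time; nothing is buffered beyond the current row.
        while (reader.Read())
        {
            for (int i = 0; i < reader.FieldCount; i++)
            {
                if (i > 0) output.Write(',');
                output.Write(reader.IsDBNull(i) ? "" : Quote(reader.GetValue(i).ToString()));
            }
            output.WriteLine();
        }
    }

    private static string Quote(string s)
    {
        // RFC 4180 style: double any embedded quotes.
        return "\"" + s.Replace("\"", "\"\"") + "\"";
    }
}
```

In a handler you would pass the reader from an ExecuteReader() call and context.Response.Output as the writer, so the response streams without ever holding the full result set in RAM.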
if (item is string)
{
return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\""));
}
Seriously? You are using string.Format to add quotes in a tight loop?
This adds StringBuilder creation overhead to EVERY CSV VALUE! That's a lot of StringBuilders! Use + instead. That makes only 2 strings for each. The original and the final.
if (double.TryParse(item.ToString(), out dummy))
{
return string.Format("{0}", item);
}
Again with the StringBuilders. Store item.ToString() in a variable first, say string itemAsString, and then do the double.TryParse. If it parses, just return itemAsString. If not, return "\"" + itemAsString + "\"".
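That suggestion might look like the following. This is a sketch of the commenter's idea, not a drop-in tested replacement:

```csharp
using System;

public static class CsvValueExtensions
{
    // The commenter's suggestion: call ToString() exactly once and
    // use concatenation instead of string.Format in the hot path.
    public static string ToCsvValue<T>(this T item)
    {
        if (item == null) return "\"\"";

        string itemAsString = item.ToString();

        if (item is string)
            return "\"" + itemAsString.Replace("\"", "\\\"") + "\"";

        double dummy;
        if (double.TryParse(itemAsString, out dummy))
            return itemAsString; // numeric: no quotes needed

        return "\"" + itemAsString + "\"";
    }
}
```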
PRMan - Hm...I F11 into LINQ statements all the time....you can set debug points even on segments of a LINQ statement...why do you say it's not debuggable?
Typically because of the function chaining, which seems to be all the rage.
And again, not trying to be a jerk here, you did a great job of pointing out things to look for when evaluating code, but I'm surprised at the lack of attention to performance aspects of something that could be very problematic depending on the size of the result set.
Now that I'm the one responsible for utilizing resources that roll in and out of projects my opinion has changed.
LINQ is like Regular Expressions. Half the coders get it, the other half is left scratching their heads at the voodoo magic.
In Scott's example, did he really make the code something easier for people to understand - or did he just demonstrate how he could accomplish something with the least amount of code?
Based on the comments from other readers, you've at least given folks something to ponder when choosing LINQ for an otherwise simple solution.
I've decided to treat it as a last resort option. This will make porting the code to Java much easier - Just kidding ;-)
However, my class inherits from CustomTypeDescriptor, where I create dozens of dynamic virtual properties (effectively pivoting my row based data). This is an interesting topic in itself but I won't explain further.
You can't get the virtual properties via reflection, so I wrote a few other similar methods that might help someone :
public static string CustomTypeDescriptorToCsvColumns(CustomTypeDescriptor instance)
{
return string.Join(",", GetTypeDescriptorProperties(instance).Select(p => p.Name.ToCsvValue()).ToArray());
}
public static List<PropertyDescriptor> GetTypeDescriptorProperties(CustomTypeDescriptor instance)
{
List<PropertyDescriptor> properties = new List<PropertyDescriptor>();
foreach (var p in instance.GetProperties())
{
properties.Add(p as PropertyDescriptor);
}
return properties;
}
public static string CustomTypeDescriptorToCsv<T>(this IEnumerable<T> items,string columnHeaderProperty) where T : CustomTypeDescriptor
{
string ret = string.Empty;
if (items.Count() > 0)
{
var csvBuilder = new StringBuilder();
List<PropertyDescriptor> properties = GetTypeDescriptorProperties(items.First());
foreach (T item in items)
{
string line = typeof(T).GetProperties().Where(p => p.Name.Equals(columnHeaderProperty)).First().GetValue(item, null).ToString();
line += string.Join(",", properties.Select(p => p.GetValue(item).ToCsvValue()).ToArray());
csvBuilder.AppendLine(line);
}
ret = csvBuilder.ToString();
}
return ret;
}
1. Any reason for using var vs StringBuilder when specifying csvBuilder's type? I assume it's just to save a few keystrokes since they are functionally equivalent in this use case.
2. The mixing of business logic/data access code is another source of code smell. Do you agree?
As for LINQ -- frankly, if a programmer is unable to read and debug a simple LINQ chain -- whether it includes built in predicates or custom delegates -- he/she should consider a new career path (and so should any developer who's chaining together 8-10 functions in one call). Seriously folks. This is what we do. And, it doesn't get much easier than LINQ. By its very nature it's succinct and concise, as opposed to the linguine-like code that's often written to achieve the same result.
And, regarding LINQ's performance, if one is coding for a high performance back end server application, obviously LINQ is not going to be the tool of choice. However, for the vast majority of business tier web applications, LINQ's performance is more than acceptable. Shoot, the performance of LINQ to XML blows the doors off the XmlDocument xpath calls (3x - 5x faster) that we've used up until now.
You have a collection called 'items'.
1. filter the items that meet a predicate (Where)
2. map the items to something else (Select)
3. ......
4. PROFIT!
BTW, this:
string line = string.Join(",",properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
is kind of hard to read, for my taste anyway.
But all in all, a big improvement!
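For what it's worth, that line can be split into named steps without changing behavior. A sketch, where Person and the simplified ToCsvValue helper are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class ReadableJoin
{
    public class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static string BuildLine(object item)
    {
        PropertyInfo[] properties = item.GetType().GetProperties();

        // Step 1: project each property value to its CSV text.
        IEnumerable<string> values =
            properties.Select(p => ToCsvValue(p.GetValue(item, null)));

        // Step 2: join the values with commas.
        return string.Join(",", values.ToArray());
    }

    // Simplified stand-in for the post's ToCsvValue(): quote strings only.
    private static string ToCsvValue(object v)
    {
        return v is string ? "\"" + v + "\"" : v.ToString();
    }
}
```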
To those who objected to using LINQ above due to reflection concerns, using a custom DynamicObject implementation allows you to specify the call dispatching and avoid reflection. Also since it is streamed, you don't have to worry with the performance overhead of loading the thing into memory.
For those who object to the code thinking it is harder to read, The plumbing code may be harder, but once that is written, it should make the end implementation code (in LINQ?) easier to consume the plumbing. The same objection can be said for much of the Reactive Framework code out there now. Much of the nasty code is in setting up the event plumbing. Once that is set, consuming it using LINQ becomes much easier.
And for those of you who haven't learned LINQ yet, I could recommend a good book...
using System;
class F {
public static void Main () {
try {
int i = 0;
int j = 3 / i;
} catch {
throw;
}
}
}
compile this with '/debug'
using System;
class F {
public static void Main () {
try {
int i = 0;
int j = 3 / i; // line 6
} catch {
throw; // line 8
}
}
}
quiz: which line does the 1st frame of the the exception reference?
var product = customers
.Select(c => c.Requirements)
.Design()
.Implement()
.Verify()
.Maintain()
.First();
(Um, except of course that the execution of the above statements is actually iterative under the covers.)
Scott, I would prefer not to build the whole output at once in the StringBuilder. I would consider modifying ToCsvValue() to return IEnumerable<string> so that each row can be written immediately to the output stream. The idea here is that we're not building every bit of the results up in memory but getting rid of it as we build each row. Of course this depends on the stream to not do the same dastardly thing, but since we can't see what the stream is doing it must not matter, right? ;-)
I've renamed the method to AsCsvRows, it's called something like this:
foreach (string row in LinqToCSV.AsCsvRows(input))
{
context.Response.WriteLine(row);
}
And while my goal is reduced memory load, not brevity, the code is a little tighter:
public static IEnumerable<string> AsCsvRows<T>(this IEnumerable<T> items)
where T : class
{
var properties = typeof(T).GetProperties();
foreach (T item in items)
{
yield return string.Join(",", properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
}
}
That's my two cents, anyway. I've left the string manipulation unchanged for comparison.
I started writing it, but then decided to Google with Bing, and found a decent direction to start with at Mike Hadlow's blog.
Is Google with Bing a phrase all MS employees use?
@Scott: Sure, in theory you're correct and it's better and all that, but come on, a DAL for a simple CSV export? Talk about overhead. Sometimes you just need something fast (both in terms of build time and performance) and that usually means deliberately throwing conventions out the window. As long as you realize you'll have to throw it away at some point in the near future, it's not a problem.
On the perf of Linq ... these kinds of things always bring up my rule of thumb that "programmer time is much more expensive than CPU time". That rule needs to be applied with care, but Linq saves so much programming time that I almost always find it worth the runtime cost, except in the most critical loops.
On the other hand - this code does seem to be a bit lazy about using too many individual string allocations in inner loops in ways that are easy to fix.
And this line here is just an outright crime - I am shocked to see someone like Scott use such a thing. Must have been a lapse.
> return string.Format("{0}", item);
Blech.
Otherwise very interesting post - I've always felt like CSV should be supported by the framework as natively as ToList() or ToArray() ...
1. You need to implement some kind of voting for comments since after Stackoverflow I have a +1 (or -1) reflex.
2. So +1 to Mike for mentioning this too :))
3. I agree to use just streams (a DbDataReader and the Response stream so you don't bloat the RAM (leave the buffering worries to the ASP.NET runtime)
4. Maybe the catch(Exception ex) { throw ex; } was a means to obfuscate debugging info :))))))
5. @PRMan : WTF is with the flush? I can't believe it till I see it.. Maybe it was something else..
6. I agree on the string.Format hate. I do hate it and rarely see its benefits
7. Regarding googling it with Bing... Can't we bing it with Bing? Or maybe binging it with Google?! :))))
Now shoot me :P
For better Excel (or SQL Server) support we could add column headers and use ';' as the separator.
Maybe a new isExcelMode parameter on the ToCsv method.
C# CODE:
public static string ToCsv<T>(this IEnumerable<T> items, bool isExcelMode)
where T : class
{
var csvBuilder = new StringBuilder();
var properties = typeof(T).GetProperties();
var separator = isExcelMode ? ";" : ",";
if (isExcelMode) //HEADER COLUMNS
csvBuilder.AppendLine(string.Join(separator, properties.Select(p => p.Name.ToCsvValue(isExcelMode)).ToArray()));
foreach (T item in items)
{
string line = string.Join(separator, properties.Select(p => p.GetValue(item, null).ToCsvValue(isExcelMode)).ToArray());
csvBuilder.AppendLine(line);
}
return csvBuilder.ToString();
}
private static string ToCsvValue<T>(this T item, bool isExcelMode)
{
var columnDelimiter = String.Empty;
if (!isExcelMode)
columnDelimiter = "\"";
if (item == null)
return String.Format("{0}{0}", columnDelimiter);
if (item is string)
{
return string.Format("{0}{1}{0}", columnDelimiter,
!isExcelMode ? item.ToString().Replace("\"", "\\\"") : item.ToString());
}
double dummy;
if (double.TryParse(item.ToString(), out dummy))
{
return string.Format("{0}", item);
}
return string.Format("{0}{1}{0}", columnDelimiter, item);
}
Tested with Excel 2003.
Bye
For an example, see http://codeplex.com/exporter
public static void ExportToCSV<T>(this IQueryable<T> list, string filename, string[] exclude )
You can see the full example here: http://www.codecapers.com/post/export-to-csv-using-reflection-and.aspx
Peter: Agreed. That's sloppy. You may have noticed I kind of slapped it together. ;)
Ignat: I'll check it out! As well as CSVHelper
Michael Ceranski: Nice...
OO was invented in the 70s. List comprehensions were invented in... the 70s.
Though I suppose OO is seven-odd years its senior, so we should all be programming in OO and completely ignoring everything else that has come out of Computer Science in the last four decades.
Just because Microsoft was late to the party doesn't mean the latest technology doesn't have a sound theoretical base or the benefit of decades of research and experience to draw on. Quite the contrary.
I really like the .vssettings of the editor you are using. Can you please share it, or let me know how to create interleaved colors in the editor?
Regards,
Ravi.
But Google with Bing seems to be too much an attempt to deteriorate the "Google Brand" on your very visible blog. Not trying to be too much a groupie, but while maybe? all in fun, it seems a bit malicious.
Such an approach would concentrate on the data rather than procedural code, provide reusable components, provide a flexible pipeline architecture, validate against strong types with XML schema (for XML output), be easy to debug with a viewable intermediate data format, and use a declarative model that leaves the underlying code to optimize performance.
It could also avoid the hardcoded values in the examples, as you would parameterize the stored procedure to make it more reusable and responsive to user choice.
I have generated Word 2003 XML documents in a similar way, and that schema is more difficult than the Excel 2003 schema.
Even better would be SQL Server exposing the final page directly through a RESTful interface, though :)
string line = string.Join(",",properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
properties.Select(....) iterates the properties and creates an array of their values in which the call to string.Join(...) iterates the newly created array and joins the values. Wouldn't the following be better or is what I have below one of those, as the saying goes, "premature optimization is the root of all evil"?
... snip ...
foreach(T item in items) {
bool first = true;
foreach (var p in properties) {
if (!first) csvBuilder.Append(",");
else first = false;
csvBuilder.Append(p.GetValue(item, null).ToCsvValue());
}
csvBuilder.AppendLine();
}
.... snip ....
In effect you replace two inner loops with one; granted it is a little uglier to look at.
http://csharptutorial.com/blog/how-export-a-datatable-to-a-text-file/
Wouldn't it be much easier and, above all, flawless? Or did I miss something?
Here is an example: http://support.microsoft.com/kb/271572