LINQ with DataReaders

Over the past couple of days, one of the MVP lists has had some activity about LINQ (which is pretty common these days), and one of the threads meandered into using LINQ on DataReaders. The problem is that LINQ works on IEnumerable sources, but DataReaders don't implement that. Fortunately, it's easy to do, and several people out there have come up with various solutions, including Jim Wooley in this article of his. The basic gyst is that you can you create an extension method on IDataReader that serves as an Enumerator.

However, Jim encountered two problems. He is returning the data one row at a time (which is the point) but as an array of object. So first, he's worried about performance. This is a valid concern considering the boxing/unboxing implications for all value type columns as well as the fact that a LINQ statement might try to grab those values several times (for sorting, as an example). Secondly, because all you have is an array of object that represents a row, you reference each column of the source as "row[0]" and "row[1]", etc. - basically, you don't get to use nice column names in the SQL.

The way to fix both these problems is to have an object that represents the row, complete with specific and type-safe fields (properties if you want to get technical and PC). Then, you need a way to convert each DataReader row to an instance of the row class. Expanding on Jim's example, here's what I came up with:

C#
public interface IReaderRow
{
    IDataReader Reader { get; set; }
    object GetRowData();
}

public class TimeZoneRowReader : IReaderRow
{
    public IDataReader Reader { get; set; }

    public object GetRowData()
    {
        return new TimeZoneRow() 
        {
            ID = Reader.GetByte(0),
            Name = Reader.GetString(1),
            Description = Reader.GetString(2)
        };
    }
}

public class TimeZoneRow
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}

public static class DataReaderExtension
{
    public static IEnumerable<T> DataReaderEnumerator<T, TReader>(this IDataReader source) where TReader : IReaderRow, new()
    {
        if (source == null)
            throw new ArgumentNullException("source");

        IReaderRow rowReader = new TReader() { Reader = source };

        while (source.Read())
        {
            yield return (T)rowReader.GetRowData();
        }
    }
}

As you can see, the LINQ statement is now typed and has intuitive and descriptive names for the columns. The downside, of course, is that you need a class representing each type of possible result, and a way to translate the DataReader data to that row instance. You can combine the row class with the specific row reader class, but I simply chose to keep them separate. Strangely enough, this starts to feel like LINQ to SQL or LINQ to Entities at this point, but it's a way to solve the two other problems I mentioned earlier. In the end, the LINQ will now look something like this:

C#
listBox1.Items.Clear();

using (SqlConnection cn = new SqlConnection(connectionString))
{
    using (SqlCommand cmd = new SqlCommand("Select * from TimeZone"))
    {
        cmd.CommandType = CommandType.Text;
        cmd.Connection = cn;
        cn.Open();
        using (SqlDataReader dr = cmd.ExecuteReader())
        {
            var queryResults = from timeZone 
                in dr.DataReaderEnumerator<TimeZoneRow, TimeZoneRowReader>)
                where timeZone.ID > 0
                orderby timeZone.Name descending
                select timeZone;

            listBox1.DisplayMember = "Description";
            listBox1.DataSource = queryResults.ToList<TimeZoneRow>();
        }
    }
}

If you search around, you'll see quite a few other solutions (some of them in community projects) for using DataReaders as LINQ sources, although they all seem to have some of these aspects in common. The really interesting stuff is when you start to think along the lines of using combinations of IL Emit and anonymous classes to create all these row-related classes automatically (the schema is available on the data reader, after all). However, without getting too tricky, it's tough to find a way for C# or VB to interpret the results in design time so you can still maintain the intellisense and column names without resorting to something like code generation.

posted @ Friday, April 11, 2008 1:27 PM

Print
«September»
SunMonTueWedThuFriSat
2930311234
567891011
12131415161718
19202122232425
262728293012
3456789