The Weekly Source Code 51 - Asynchronous Database Access and LINQ to SQL Fun
You can learn a lot by reading other people's source code. That's the idea behind this series, "The Weekly Source Code." You can certainly become a better programmer by writing code but I think good writers become better by reading as much as they can.
I was poking around in the WebFormsMVP project's code and noticed an interesting pattern.
You've seen code to get data from a database and retrieve it as an object, like this:
public Widget Find(int id)
{
Widget widget = null;
widget = (from w in _db.Widgets
where w.Id == id
select w).SingleOrDefault();
return widget;
}
This code is synchronous, meaning basically that it'll happen on the same thread and we'll wait around until it's finished. Now, here's an asynchronous version of the same code. It's a nice combination of the the new (LINQ, in this case, LINQ to SQL) and the older (DataReaders, etc). The LINQ (to SQL) query is in query, then they call GetCommand to get the underlying SqlCommand for that query. Then, they call BeginExecuteReader on the SqlCommand which starts asynchronous execution of that command.
SqlCommand _beginFindCmd = null;
public IAsyncResult BeginFind(int id, AsyncCallback callback, Object asyncState)
{
var query = from w in _db.Widgets
where w.Id == id
select w;
_beginFindCmd = _db.GetCommand(query) as SqlCommand;
_db.Connection.Open();
return _beginFindCmd.BeginExecuteReader(callback, asyncState, System.Data.CommandBehavior.CloseConnection);
}
public Widget EndFind(IAsyncResult result)
{
var rdr = _beginFindCmd.EndExecuteReader(result);
var widget = (from w in _db.Translate<Widget>(rdr)
select w).SingleOrDefault();
rdr.Close();
return widget;
}
When it's done, in this example, EndFind gets called and they call DataContext.Translate<T> passing in the type they want (Widget) and the source, the DataReader retrieved from EndExecuteReader. It's an asynchronous LINQ to SQL call.
I found it clever so I emailed my parallelism friend and expert Stephen Toub and asked him if this was any or all of the following:
a. clever
b. necessary
c. better done with PFX/TPL (Parallel Extensions to the .NET Framework/Task Parallel Library)
Stephen said, in his own get-down-to-business fashion:
a) It's a standard approach to converting a LINQ query to a command to be executed with more control over how it's executed. That said, I don't see it done all that much, so in that capacity it's clever.
b) It's necessary to run the query asynchronously; otherwise, the call to MoveNext on the enumerator will block. And if ADO.NET's MARS support is used (multiple asynchronous result sets), you could have multiple outstanding operations in play.
c) TPL can't improve upon the interactions with SQL Server, i.e. BeginExecuteReader will still need to be called. However, TPL can be used to wrap the call such that you get a Task<Widget> back, which might be a nicer API to consume. Once you have it as a Task, you can do useful things like wait for it, schedule work for when its done, wait for multiple operations or schedule work when multiple operations are done, etc.
One other thing that's interesting, is the WebFormsMVP project's PageAsyncTaskManagerWrapper:
namespace WebFormsMvp.Web
{
/// <summary>
/// Represents a class that wraps the page's async task methods
/// </summary>
public class PageAsyncTaskManagerWrapper : IAsyncTaskManager
{
readonly Page page;
/// <summary />
public PageAsyncTaskManagerWrapper(Page page)
{
this.page = page;
}
/// <summary>
/// Starts the execution of an asynchronous task.
/// </summary>
public void ExecuteRegisteredAsyncTasks()
{
page.ExecuteRegisteredAsyncTasks();
}
/// <summary>
/// Registers a new asynchronous task with the page.
/// </summary>
/// <param name="beginHandler">The handler to call when beginning an asynchronous task.</param>
/// <param name="endHandler">The handler to call when the task is completed successfully within the time-out period.</param>
/// <param name="timeout">The handler to call when the task is not completed successfully within the time-out period.</param>
/// <param name="state">The object that represents the state of the task.</param>
/// <param name="executeInParallel">The vlaue that indicates whether the task can be executed in parallel with other tasks.</param>
public void RegisterAsyncTask(BeginEventHandler beginHandler, EndEventHandler endHandler, EndEventHandler timeout, object state, bool executeInParallel)
{
page.RegisterAsyncTask(new PageAsyncTask(beginHandler, endHandler, timeout, state, executeInParallel));
}
}
}
They made a nice wrapper for these existing System.Web.UI.Page methods and they use it like this, combined with the asynchronous LINQ to SQL from earlier:
AsyncManager.RegisterAsyncTask(
(asyncSender, ea, callback, state) => // Begin
{
return widgetRepository.BeginFindByName(e.Name, callback, state);
},
result => // End
{
var widget = widgetRepository.EndFindByName(result);
if (widget != null)
{
View.Model.Widgets.Add(widget);
}
},
result => { } // Timeout
, null, false);
AsyncManager.ExecuteRegisteredAsyncTasks();
They fire off their task, which then does its database work asynchronously, and then it all comes together.
I'll leave (for now) the wrapping of the APIs to return a Task<TResult> as an exercise for the reader, but it'd be nice to see if this pattern can benefit from the Task Parallel Library or not.
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
About Newsletter
That said, if you still want to go asynchronous then the Rx framework might be very useful in this scenario.
The TPL is more about computation work than IO, so I'm not sure to what extent it would make this pattern easier to use - but it's a perfect fit for F#'s asyncronous computation expressions, and Async.FromBeginEnd.
public Widget Find(int id) {
return _db.Widgets.SingleOrDefault(w => w.Id == id);
}
The replacement could probably be simpler too - will take a look when I get a few minutes.
[)amien
Absolutely. I wanted to mention the same thing you mentioned :-) but I saw your comment.
I really hate to use the From ... In keywords in LINQ these days.
I would rather go for something easier using the new Task class:
System.Threading.Tasks.Task<int> t1 = new System.Threading.Tasks.Task<int>(DoSomething);
t1.Start();
System.Threading.Tasks.Task<int> t2 = new System.Threading.Tasks.Task<int>(DoSomethingElse);
t2.Start();
System.Threading.Tasks.Task.WaitAll(new[] { t1, t2 });
t1.Result & t2.Result ready to use
It would be great if you could do this more often! This was very interesting.
Example : I have an Action that gets all the Todo list and renders them in a view. Will I gain any benefit if I do it in parallel ?
P.S : Idea is definetly clever, but I am debating in which scenario should we use it in web environment ?
void CustomersAsync()
{
Data.NwindDb db = new Data.NWindDb();
db.QueryAsync( db.Customers.Where(c=>c.PostalCode = "98053") )
.OnCompleted( Customers_Completed )
.Begin();
}
void Customers_Completed(object sender, AsyncQuery e)
{
if( e.Error != null ) return;
Data.Customer cust = e.Read<Data.Customer>().FirstOrDefault();
Debug.Print("Retrieved customer Id {0}", cust.Id);
}
It also adds extension methods to DataContext like MultiQueryAsync where you can pass an SqlCommand or a string to execute a stored procedure that returns multiple readers that get translated to Linq objects with the call to Read<T>(). In the completed method, you just called Read<T>() multiple times to read the multiple recordsets. The AsyncQuery disposes of the DataContext for you after it's Completed event is handled (if the AsyncQuery.AutoDispose property is left to it's default of true).
Nice example. I extracted the first part of the sample and created an ExecuteAsync<T>(query, Action<IEnumerable<T>> onReady) extension method to the DataContext class to make life a little bit easier. You can find a sample and the source code over here.
Cheers,
Wesley
I think the point here is that it's very useful when you want to load up a lot of data without blocking the executing thread. In a web application, this has great potential when your application first starts up - you could load up all your lookup/cached data using this technique, without having to worry about a massive startup time.
Excellent post, thanks Scott - I actually didn't know you could do this!
I wonder if the same works with Entity Framework instead of L2S ?
Thank you,
For more information read this great article and watch this great session from Tech.Ed US a few years back.
I've made a simple IQueryable<T> extension method Async() that easily allows you to run any query async on the threadpool. It is described here.
Second, Damian - Thanks for your articulate response on the benefits/reasons for performing async tasks in the web scenario. I totally get the benefit of releasing the current thread back to the pool for other client request processing.
However, where I work, we're using web services as app servers behind the DMZ (4-tiered solution), and it seems to me that ws web methods would benefit from the same idea of a PageAsyncTask framework. However, since it's only a web "method," the synchronization is really left up to us, the web method developer, right? It looks like we just have to block the thread and wait for all tasks, parallel or not, to finish before continuing with returning the result to the caller. This doesn't really release any threads back to the pool on the ws side, does it? It just seems there would be benefit to releasing threads back to the pool in the ws tier, too, if we're going to be waiting on IO bound ops. After all, my ws's need to remain scalable to handle "client" requests from my asp.net web app, right?
Any thoughts on this?
ASP.NET supports this too, just uses a slightly different pattern: http://msdn.microsoft.com/en-us/library/aa480516.aspx.
Basically if you turn your single [WebMethod] declaration into separate BeginXXX and EndXXX [WebMethod] declarations, it all magically works. Of course to get the real benefit you need to ensure that the IAsyncResult you’re returning from the BeginXXX method is waiting on an IO thread (DB or web service call) and not some arbitrary worker thread you created yourself, otherwise you’re just stealing from the same worker pool of threads that ASP.NET uses to service requests.
Let's say you have several database calls that must be executed on PageLoad.
Call1()
Call2()
Call3()
This is going to execute Call1, wait till return, then Call2, wait for return, then Call3 wait for return.
If these are called Asynchronously then all 3 can be called at the same time.
Keep in mind, for the webpage, it still wait for all three to be returned in order to continue. (This is why above it's mentioned that you wouldn't do this for just one call)
Comments are closed.