DataReaders and Parallelism

posted Mar 26, 2014, 8:25 AM by Eric Patrick
There are several use cases that require some task to be executed for each row output by a data reader, including:
  • DecisionStep.StartMethodLoop
  • Import.BatchEngine (processing Excel rows)
  • AbstractObject.Listen  
The typical approach is something like this:

using (IDataReader reader = myQboObject.InvokeDataReader(operation, parameters)) {
    while (reader.Read()) {
        // Create a dictionary from a single row in the data reader
        IDictionary<string, object> data = Functions.ToProperties(reader);
        AbstractObject myOtherQboObject = AbstractObject.Create(someString, User);
        myOtherQboObject.Invoke(someOperation, data);
    }
}

Assume for a moment that our data reader is returning a lot of data; perhaps 50K rows. While data readers are very memory-efficient, this approach is single-threaded, which means we are calling myOtherQboObject.Invoke(someOperation, data) in series, 50K times.  Slow.

Fortunately, the .NET Task Parallel Library (which is wicked cool) allows us to use an enumerator and leverage parallelism, like this:

MethodSignature signature = new MethodSignature("MyQboObject/operation?{Parameters}");
Parallel.ForEach(signature.EnumerateReader(), data => {
    AbstractObject myOtherQboObject = AbstractObject.Create(someString, User);
    myOtherQboObject.Invoke(someOperation, data);
});
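
Under the hood, an enumerator like EnumerateReader presumably wraps the data reader in a lazy IEnumerable via yield return, so rows are materialized one at a time as Parallel.ForEach's partitioners pull them. Here is a minimal sketch of that pattern; the extension method name EnumerateRows is illustrative, not the actual QBO implementation:

using System.Collections.Generic;
using System.Data;

public static class ReaderExtensions
{
    // Hypothetical sketch: lazily stream one dictionary per row, so
    // Parallel.ForEach can hand rows to worker threads without
    // buffering the entire result set in memory.
    public static IEnumerable<IDictionary<string, object>> EnumerateRows(this IDataReader reader)
    {
        using (reader)
        {
            while (reader.Read())
            {
                var row = new Dictionary<string, object>(reader.FieldCount);
                for (int i = 0; i < reader.FieldCount; i++)
                    row[reader.GetName(i)] = reader.GetValue(i);
                yield return row; // each row is materialized on demand
            }
        }
    }
}

Note that the reader itself is still consumed on one thread at a time (Parallel.ForEach synchronizes access to a plain IEnumerable source), so it is the per-row Invoke work that runs in parallel. If that work hammers a database, you can throttle it by passing new ParallelOptions { MaxDegreeOfParallelism = n } to Parallel.ForEach.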