I don't have a lot to say, but this is my little bit.

Wednesday, December 23, 2009

Concatenate Aggregate Field Values Using LINQ

Problem: SQL Server and other databases do not have an aggregate function for concatenation. Aggregate functions are those such as min, max, sum, count, and avg, which work on database fields from SQL queries containing a GROUP BY clause, but not grouped under that clause. A concatenation function would be very helpful when you need to quickly summarize string data from a child object when viewing the parent object.

In my example, I have a web app with a table showing Products, which have zero or more associated Events, and each Event has one Event Code. I wanted the table to display a Product with a concatenated summary of the Event Codes for each of its Events. Yesterday I blogged about how to use a LinqDataSource while also joining the Product and Event tables; this post will show how to extend that solution to concatenating one of the data fields.

There are many solutions that I didn't like or which didn't work for me:

  • The internet has many people requesting help with similar problems, and the proffered solutions usually include writing a database function, often a highly tailored one. I would have used this solution if I could have found a good, reliable, generalized function written by somebody who knew how to make tight T-SQL. I could not find such a function; I could only find functions that didn't work, wouldn't compile, or which were not generalized to my satisfaction. I did not want to tackle writing my own aggregate function in T-SQL, nor did I want to write a special CLR function.
  • I attempted many different ways of frobbing my LINQ query in such a way as to build the required concatenated list without success. This approach may be impossible.
  • I attempted to write a loop which would go over my LINQ query results, build the concatenated string, and splice it into the result set. This didn't work because I was trying to assign to a read-only member.
That last failed approach was tantalizingly close to working, and I managed to placate the compiler by moving my loop to a second function and calling that from within my LINQ query. This works because LINQ is flexible in building its lazy queries.

Here is the method which builds the concatenated string. This method will look different for different applications, but they will all select values from a database, build them into a string, and return the string.

private object GetConcatenatedEventCodes(int p)

{
var events =
from sourceEvent in Db.Events
where sourceEvent.product_id == p
select sourceEvent.event_code;

Hashtable eventCodeTable = new Hashtable();
StringBuilder eventString = new StringBuilder();

foreach (var sourceEventCode in events)
{
if(! eventCodeTable.ContainsKey(sourceEventCode))
eventCodeTable.Add(sourceEventCode, string.Empty);
}

foreach (var sourceEventCode in eventCodeTable.Keys)
{
eventString.Append(sourceEventCode);
eventString.Append(" ");
}

return eventString.ToString();
}


In this example I use a Hashtable in order to remove duplicates from my list, which was a feature I wanted.

Then, the LINQ query includes a direct call to the concatenation method. I have removed some nonessential parts of this query, but it shows how to assign the results of a method call to a named field within an anonymous class. This feature of LINQ will probably be familiar to an experienced ASP.NET programmer, but using it for string concatenation is useful and non-obvious.

var result = (

from sourceProduct in Db.Products
join sourceEvent in Db.Events
on sourceProduct.product_id equals sourceEvent.product_id

select new {
sourceProduct.product_id,
sourceProduct.xxx,
event_code = GetConcatenatedEventCodes(sourceEvent.product_id),
issue_time = sourceProduct.update_time
}
);


Because LINQ is lazy, it doesn't call GetConcatenatedEventCodes() until it actually attempts to fetch the query results.

No comments:

Post a Comment