SQL Patterns

posted Aug 25, 2011, 7:08 AM by Eric Patrick   [ updated Nov 21, 2014, 7:23 AM ]

Best of Both Worlds: Dynamic SQL Statements vs. Stored Procedures

Stored procedures have been the bread and butter of QBO 2 installations. They are modular, reusable, easily maintainable, and perhaps most importantly, they can be tweaked / tuned without the need to recompile the application.  However, stored procedures are limited in their ability to solve problems like:
  • Searching: building a dynamic WHERE clause is cumbersome
  • Member lists: displaying "parent" columns on a Collection Member List or Smart Worklist Member list requires building a custom stored procedure per parent
  • Adding columns: if adding a new foreign key column to a table, all related stored procedures need to be updated accordingly, increasing development time
  • Updating only a few columns: dynamic statements need not reference unaffected columns, potentially reducing the impact on index maintenance and trigger execution
With QBO 3, we allow for the continued use of stored procedures, but also introduce the ability to build dynamic queries.  For example:

Contact.ashx/Search?LastName=Patr%&ContactTemplateID=&Status=Valid&Status=Invalid

will dynamically build something like:

SELECT * FROM Contact WHERE LastName LIKE 'Patr%' AND Status IN ('Valid', 'Invalid')

You can play with these statements by navigating to /{Project}/{Table}.ashx/Describe; e.g. /Skiptrace/Skiptrace.ashx/Describe.

SQL Formatting Patterns

Reviewing the QBO 2 stored procedures, there are common patterns used that are now "formatting patterns" in QBO 3. For example, given the Contact table's foreign keys include ParentContactID (FK to Contact.ContactID) and ContactTemplateID (FK to ContactTemplate.ContactTemplateID), we often do the following:

SELECT
    Contact.ContactID, ...
    ParentContact.Contact AS ParentContact, ...
    ContactTemplate.ContactTemplate AS ContactTemplate, ...
FROM Contact
    LEFT OUTER JOIN Contact AS ParentContact WITH (NOLOCK) ...
    LEFT OUTER JOIN ContactTemplate AS ContactTemplate WITH (NOLOCK) ...


Abstractly, this pattern includes:
  • From clause: left outer join to all FK tables, aliasing the table as the FK column name (without the 'ID')
  • Select clause: include the FK table's "main column" aliased as the FK column name (without the 'ID')
QBO 3 allows you to construct a query like this:

SELECT {Select.ForeignKey} FROM {From.ForeignKey}

and the QBO 3 code will replace {Select.ForeignKey} with the list of columns following the select pattern above, and {From.ForeignKey} with the FROM statement following the pattern above.

However, this particular pattern is not always followed.  For example:
  • we don't want the foreign key columns or tables in our statements, or
  • we might want all the foreign key columns, or
  • we want just ContactTemplateID and ContactTemplate to populate a dropdown list, or
  • we want something completely customized that QBO 3 does not have a pattern form
Most SQL statements in QBO 3 are built by the qbo.Application.ObjectConfiguration class, which handle a finite number of patterns.  See the code in qbo.Application > Configuration > SqlPattern.cs, or play with them using the /Descibe method for any HttpHandler.

Parameter Substitution and Literals

A nice feature of the SqlPattern class is automatic creation of IN clauses for dimensions. For example, we may want to display all contacts in New England:

... WHERE State IN ('ME', 'MA', 'NH', 'VT', 'CT', 'RI') ...

This can be done over the wire with:

Contact.ashx/Search?State=ME,MA,NH,VT,CT,RI

However, there are use cases where we want similar string substituion but are not using dimensions. An example is leveraging SQL's OPENROWSET() function to read Excel files; it does not allow use of variables to identify a file path.

-- This is not valid TSQL
DECLARE @FileName nvarchar(50) = 'MyTestFile.xlsx'
SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0', 'Excel 12.0;Database=' + @FileName + ';HDR=Yes', 'SELECT * FROM [Sheet1$]'

In such cases, we can use literal substitution with {Param.{ParamName}}.  For example:

SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0', 'Excel 12.0;Database={Param.FileName};HDR=Yes', 'SELECT * FROM [Sheet1$]')

Do not allow direct user data entry of such parameters; users could enter data that may result in SQL injection.

Xml Columns

For tables containing XML columns (such as ImportForm.XmlData), the {Where.Basic} clause support search and order by using SQL XML expressions.

ImportForm.ashx/Search?ImportFormTemplateID=22&XmlData.MyCustomField=SomeValue&OrderBy=XmlData.AnotherCustomField

This would generate a SQL clause like this:

SELECT ...
FROM ImportForm WITH (NOLOCK)
...
WHERE 
[ImportForm].[XmlData].value('(//MyCustomField)[1]', 'nvarchar(255)') IS NOT NULL 
...
ORDER BY 
[ImportForm].XmlData.value('(//AnotherCustomField)[1]', 'nvarchar(255)') DESC

Note that the expression XmlData.MyCustomField translates to //MyCustomField.  You can specify a more precise XPath expression using a 'dot' notation:

Search?XmlData.ImportFormXml.MyGroup.MyCustomField=X generates ImportFormXml/MyGroup/MyCustomField
Search?XmlData...MyInnerNode.MyCustomField=X generates //MyInnerNode/MyCustomField


Tricky Stuff

In designing these patterns, some tricky issues with inherited classes arose. For example, any Process-derived class (SkipTrace, Foreclosure, Bankruptcy, etc.) needs to address four issues:
  1. When inserting or updating, one must insert/update into the base table and derived table (e.g. INSERT INTO Process and INSERT INTO SkipTrace),
  2. One must ensure that the correct parameters are mapped into the correct table's insert or update statement,
  3. If there are no specified parameters for a table, the entire statement for the table must be omitted, and
  4. The where clause for a base table must be based upon primary keys for the derived table
DbStatement.Cascade 

In a configuration file, if a query is marked with Cascade="true", the SQL statement will be built from all non-abstract classes.
For example, Abstract.config defines the basic structure of an Insert statement as:

INSERT INTO {Table} ({Select.Insert}) VALUES ({From.Insert}) SELECT SCOPE_IDENTITY() AS {Table}ID

For a non-derived class like Contact, this will produce:

INSERT INTO Contact (Contact, FirstName, LastName, ...) VALUES (@Contact, @FirstName, @LastName, ...) SELECT SCOPE_IDENTITY() AS ContactID

For a Process-derived class like SkipTrace, this will produce:

INSERT INTO Process (Process, ...) VALUES (@Process, ...) SELECT SCOPE_IDENTITY() AS ProcessID
INSERT INTO SkipTrace (SkipTrace, ..., ProcessID) VALUES (@SkipTrace, ..., SCOPE_IDENTITY()) SELECT SCOPE_IDENTITY() AS SkipTraceID

DbStatement.RequireParameters

An entire statement can be omitted from the formatted results by configuring the DbStatement with RequireParameters="true". 

The URL /SkipTrace/SkipTrace.ashx/Update?SubjectFound=1/1/2012&Status=Found&ID=18

will result in:

UPDATE Process SET Status=@Status WHERE ProcessID IN (SELECT ProcessID FROM SkipTrace WHERE SkipTraceID IN (18))
UPDATE SkipTrace SET SubjectFound=@SubjectFound WHERE SkipTraceID IN (18)

However, if there are no columns specified for the SkipTrace table:

/SkipTrace/SkipTrace.ashx/Update?Status=Found&ID=18

will result in:

UPDATE Process SET Status=@Status WHERE ProcessID IN (SELECT ProcessID FROM SkipTrace WHERE SkipTraceID IN (18))

Likewise, if there are no columns specified for the Process table:

/SkipTrace/SkipTrace.ashx/Update?SubjectFound=1/1/2012&ID=18

will result in:

UPDATE SkipTrace SET SubjectFound=@SubjectFound WHERE SkipTraceID IN (18)

Blob Columns

Occasionally, our table include Blob columns with a lot of data. ImportFile and ServiceOrder are prime examples; large amounts of XML may be stored in these tables. Most of the time, we do not need to fetch the blob data when running queries against these tables. Fetching the blob data is an expensive operation, and causes slow performance as a lot of data moves "over the wire".

In a class, you can specify a property (column) as a Blob column as follows:

[Blob]
[DataMember(EmitDefaultValue = false)]
public string RequestXml
{
get { return _RequestXml; }
set { ... }
}

The Select clauses that SqlPattern builds omit blob columns by default. If you have a query that requires emitting the blob columns by default, you can use a statement like this:

SELECT {Select.ForeignKeyLabel} {Select.Blob} FROM {From.ForeignKey} WHERE {Where.IdList}

Note the lack of a comma between the two Select clauses; Select.Blob expects that columns come before it, and will automatically include a comma before it's first column. If you have a statement with {Select.Blob} by itself, you can include a 'dummy' column like this:

SELECT 'Blob' AS BlobPlaceholder, {Select.Blob} FROM {From.Basic} WHERE {Where.IdList}

Recommended practice is to ignore blob columns for all generic statements. If you need access to a blob column, extend your class to include a method or property that selects the blob column directly on it's own. One incurs an extra database connection when doing so, but this is preferable to passing blob columns when one does not need them.

Why We Don't Use LINQ


For all practical purposes, LINQ requires developers to work with strictly pre-defined data models. In most QBO 2 installations, custom reports are used extensively to meet client customization requirements.  Using LINQ would essentially require that all such customization include a build of custom DLLs to handle these queries via LINQ.

Moreover, LINQ is really designed to assist developers with reducing development time, and it appears to do a very nice job with this.  However, LINQ does not appear to solve some of the practical performance issues QBO system face.  For example:
  • it is not a simple matter to dive under the hood with LINQ to performance tune queries, and
  • very large data set (for reporting and import framework operations) still have the data reside in memory

Comments