FOR XML EXPLICIT

Posted by Chris on October 09, 2006

A couple of weeks ago I was engaged as trainer for the course “2779: Implementing a Microsoft SQL Server 2005 Database”:http://www.microsoft.com/learning/syllabi/en-us/2779afinal.mspx. The module that was by far the most difficult according to the attendees was the one on xml, with none of them having any practical experience with xml. One of the things that was specifically difficult to understand was the @FOR XML EXPLICIT@ clause to the @SELECT@ statement.

The EXPLICIT mode of the FOR XML clause is to be used when you need to create XML of a specific format that cannot be done with AUTO or RAW modes. You can use EXPLICIT to generate xml of more or less any format you wish. It is also the most complex mode to use. The AUTO and RAW modes are normally used to transform the result of an existing query from a tabular resultset into an xml stream. The key word in that sentence is existing, by which I mean that whether or not you want the results in xml or not you still use the same query. Just add the FOR XML clause and you’re good.

With the EXPLICIT mode it is not that easy. The transformation engine that creates the xml stream from the result of a query requires that the resultset is designed specifically for this task. The concept you must understand is what is called a universal table. This table will have all the information that is needed for the transformation engine to generate xml of the format you require. So what is a universal table then? I think it is easiest to start with an example:

Tag | Parent | Employee!1!Id
 1  | null   | 280

So what does the above mean? First thing to note is the column names. This metadata is used by the transformation engine to create the nodes in the resulting xml output. We must make sure that the query we run (the one we use the FOR XML EXPLICIT clause with) returns a resultset like the one above.

The first two columns must always exist with those names. They are used to describe the hierarchy of elements in the xml output. Every element that should exist in the needs to be represented by a row in the universal table. Every unique “type” of element (ie an element name at a specific level in the hierarchy of the xml fragment) needs to be uniquely idientified by an arbitrary tag number, which is specified in the first column. If the element is nested inside a parent element then the tag number of the parent element should be included in the column for that.

The rest of the column names (in this case only one) specify the names of the nodes in the resulting xml fragment. In our example we have Employee!1!Id. This cryptic combination says that the element tagged with number 1 should be called Employee, and it should have an attribute called Id. Then, for all the rows in the table the value in this column will end up in the Id attribute of element Employee. So the very simple xml fragment we get from the universal table above would look like this:


This is about as much as the course documentation says about FOR XML EXPLICIT and universal tables. Well, it contains a little more but it is quite difficult to see how to use it for more complex examples than what you can easily create with the other modes. So I decided to create an example that does a little more, but still should be quite simple to understand.

Lets say that we have a requirement to produce an xml fragment like the one below.


        
                
                
                ...
        

What we want is an xml fragment describing a specific employee (id=280 in this case) and her customers. The data we need for this can be returned by the following query (run in the AdventureWorks database):

SELECT Employee.EmployeeId Id
	, Employee.LoginID [Login]
	, SalesPerson.SalesLastYear
	, Store.CustomerID CustomerId
	, Store.Name
	, Customer.AccountNumber Account
FROM HumanResources.Employee Employee
INNER JOIN Sales.SalesPerson SalesPerson
	ON Employee.EmployeeId = SalesPerson.SalesPersonId
INNER JOIN Sales.Store Store
	ON SalesPerson.SalesPersonId = Store.SalesPersonId
INNER JOIN Sales.Customer Customer
	ON Store.CustomerId = Customer.CustomerId
WHERE Employee.EmployeeId = 280
ORDER BY EmployeeId, CustomerID

This result includes all the data that should go in the resulting xml fragment. However it does not tell transformation engine how to create it. If we would simply add FOR XML AUTO or RAW we would not at all get the result we want. What we need to do is to create a query that will return a universal table consisting of the data above but in a resultset that includes the metadata needed to create the xml. The following query will do the trick:

SELECT 1 AS Tag
	, NULL AS Parent
	, Employee.EmployeeId AS [Employee!1!Id]
	, Employee.LoginID AS [Employee!1!Login]
	, SalesPerson.SalesLastYear AS [Employee!1!SalesLastYear]
	, NULL AS [Customers!2]
	, NULL AS [Store!3!CustomerId]
	, NULL [Store!3!Name]
	, NULL [Store!3!Account]
FROM HumanResources.Employee Employee
INNER JOIN Sales.SalesPerson SalesPerson
	ON Employee.EmployeeId = SalesPerson.SalesPersonId
WHERE Employee.EmployeeId = 280
UNION ALL
SELECT 2
	, 1
	, NULL
	, NULL
	, NULL
	, NULL
	, NULL
	, NULL
	, NULL
UNION ALL
SELECT 3
	, 2
	, NULL
	, NULL
	, NULL
	, NULL
	, Store.CustomerID
	, Store.Name
	, Customer.AccountNumber
FROM Sales.Store Store
INNER JOIN Sales.Customer Customer
	ON Store.CustomerId = Customer.CustomerId
WHERE Store.SalesPersonID = 280
FOR XML EXPLICIT

As you can see, what we have is a query that is really built up of several queries (three in our case) unioned together. Each query must include all the columns that it needs itself as well as the ones necessary for the other queries. The queries assign arbitrary tag numbers to the three different kinds of elements in our expected output (remember we had Employee, Customers and Store) and also hierarchically arrange them under their direct parent element.

Each of the queries returns the data that is necessary for the elements at a specific depth in the xml tree. For the columns that specify other elements at other depths they just specify null. Note that all the column names are aliased in the first query, since the union operator will use those names for the unioned resultset. Also note specifically how the second query will always return one row and is only used to add the Customers element (that’s why the column name is only two parts and does not include an attribute name) to use as parent for the Store elements.

Trackbacks

Trackbacks are closed.

  • mlevitt

    Chris,

    I need your help, Please!
    I have this data in a table

    grp ledger amount
    1 A01 5
    1 A02 3
    1 A03 4
    2 A22 8
    3 A33 3
    3 A36 4

    I need to sum the amounts grouping by grp
    I also need to concatenate the ledgers by grp
    The result needs to look like this. How do I do this without a cursor or loop on a Sql2000 server. I've tried to make sense of FOR XML but I'm not getting it. Also, I'm not sure if I even have to use it.

    grp amt ledger
    1 12 A01, A02, A03
    2 8 A22
    3 7 A33, A36

blog comments powered by Disqus