We had a customer that was trying to pull data into Excel by way of an OData feed. This actually related to a Project Server hosted in a SharePoint 2013 On Premise deployment. Within Excel, we have multiple ways to import data. This could be through the Excel Data Tab, Power Pivot or Power Query.
The issue we had is that for some of the fields in the feed had a data type of Decimal. Here is the schema we had for our internal repro.
<d:ProjCost1 m:type="Edm.Decimal" m:null="true" /><d:ProjCost2 m:type="Edm.Decimal" m:null="true" /><d:ProjCost3 m:type="Edm.Decimal" m:null="true" /><d:ProjCost4 m:type="Edm.Decimal" m:null="true" /><d:ProjCost5 m:type="Edm.Decimal" m:null="true" />
Excel Data Tab
The customer was originally pulling this in via the Excel Data Tab.
Once the data is imported it looked like the following:
You’ll notice that ProjCost2 and ProjCost3 show decimals as compared to 1, 4 and 5. Also, when we pulled this into a PivotTable and summed the ProjCost fields, we hit some odd behavior. The ProjCost2 and ProjCost3 fields were just showing a 0.
Power Query
Next I wanted to see what Power Query would do. Grabbing the data already looked different from what we saw from the Excel Data Tab pull. The 5 fields looked consistent.
Also, in the Query itself it was showing “Number” as the data type.
The PivotTable looked good as well.
However, the customer didn’t want to use Power Query. There is clearly a different here.
Power Pivot
Looking at Power Pivot, we actually saw the same behavior as with the Excel Data Tab.
We also saw that the data type here was showing text for the ProjCost 2 and ProjCost3 fields.
The PivotTable here showed the same behavior that the Excel Data Tab did.
Internals
Under the hoods, grabbing an OData Feed from the Excel Data Tab is actually just using Power Pivot to get it. So, that explains why I saw the same behavior. This uses an assembly named Microsoft.Data.DataFeedClient. One thing I wanted to understand is whether this issue of it thinking it was a string data type was due to the DataFeedClient, or something Power Pivot was doing after the data was pulled in. Like some sort of post import processing to determine data types.
So, I spun up some code to use it and see what the schema looked like coming back from the DataFeedClient. Here is the code I used.
using (Microsoft.Data.DataFeedClient.DataFeedConnection conn = new DataFeedConnection(connBuilder.ConnectionString)){ conn.Open(); DataFeedCommand comm = new DataFeedCommand("Projects", conn); DataTable dataTable = null;
using (IDataReader dbReader = comm.ExecuteReader(CommandBehavior.SchemaOnly | CommandBehavior.KeyInfo)) { dataTable = new DataTable("Projects"); dataTable.Locale = CultureInfo.InvariantCulture; dataTable.Load(dbReader); if (dbReader != null) { dbReader.Close(); } }
int columnCount = dataTable.Columns.Count;
foreach (DataColumn dc in dataTable.Columns) { Console.Write("Column: " + dc.ColumnName); Console.WriteLine("\tData Type: " + dc.DataType.ToString()); }}
Here is the output:
So, this was definitely something that the DataFeedClient was using. Of note, Power Query doesn’t use Microsoft.Data.DataFeedClient. It uses Microsoft.Data.OData.
The problem here is that DataFeedClient doesn’t currently use the schema described by an OData Service. Instead, it sniffs the first n rows of a data set. In our case, the values of the first 31 rows for ProjCost2 and ProjCost3 were blank. If that is the case, DataFeedClient will just default to a string data type.
There is a connection string property for DataFeedClient called “Schema Sample Size”. The default for this is 25. Well, we saw that we didn’t have a value until row 32, so that is a problem. Here is what it looks like when I set Schema Sample Size to 50 using the same test code that I wrote.
That looks much better. They all show decimal now.
Connection Properties
Within Power Pivot, we want to click on Advanced on the Table Import Wizard, after we fill in our Data Feed URL.
From there you can adjust the Schema Sample Size to a value that fits your needs. Then just continue with the import.
For the Excel Data Tab, it isn’t as obvious. You want to get the Data Connection Wizard up by doing to the Data Tab, From Other Sources and From OData Data Feed. Insert the URL and click Next. Select the table that you want and click Next. On the Save Data Connection File and Finish, you want to hit Finish. Then on the Import Data Dialog, you want to click on Properties.
This will bring up the Connection Properties. From here, go to the Definition tab. You can then add Schema Sample Size to the connection string. Don’t forget the semi-colon.
After doing this, the fields should reflect the proper Data Type, and you can use the data as you expected to.
Adam W. Saxton | Microsoft SQL Server Escalation Serviceshttp://twitter.com/awsaxton