Monday, 14 November 2011

More on setting up a DTS package

Remember I was talking about setting up a thing to import stuff from a spreadsheet into SQL Server?


Well, there are a few more wrinkles worth mentioning.


First thing was that when I first tried to run it, there was a problem.  The Data for the Source column is too large for the specified buffer size.  Here's the error message:



Ah?  I don't remember specifying a buffer size.  Well, not surprisingly, because I never did - what's more, as far as I can tell I can't get at the buffer size to change it.  It seems that by default DTS reads the first eight rows of the spreadsheet and works out how big the buffers and field lengths need to be, based on what it finds there.  Unfortunately, the data in this particular field reads something like "Yes" or "No" until we get down to row number 172 where someone has written an essay detailing the sad and complex story of one particular Hospital. (Don't get me started on database design and normalisation - let me just say that not everyone in the past adopted best practice - grrrr!)

The quickest way round this is to take row 172 and move it up to the first eight rows.  This works fine, and the data gets loaded.  But I'm not free to interfere with the master spreadsheet.  (Perhaps I could copy it and sort the copy?  I may come back to that)  That also means I don't want to put a dummy row up at the top with a 2048 character field in it.  (http://social.msdn.microsoft.com/forums/en-US/transactsql/thread/5e4f90d7-b84c-422b-bbd1-35481ae3d1ba/)

Apparently I can edit the registry so that it calculates buffer sizes based on more than the default eight rows.  But I'm leery of mucking about with the registry, and anyway it may be that someone else needs to run the package, not me. 

It might be possible to read in the field in chunks of 255 characters until the source field is empty But that is getting into something that's too heavy duty for me.  Maybe someone cleverer than me can figure it out and post the solution.  

The obvious answer is to find out where the buffer size is specified as 255, and change it to something larger.  But as far as I can see, it isn't possible to change this in DTS.  

One entry in Google suggested that I might need a new service pack?  http://support.microsoft.com/kb/247527 - nope, already got it.  


Had you guessed already that I've been researching this on Google?  I also tried asking the question on the Linkedin SQLDBA group.  I got some good answers but not THE answer, unfortunately.  So I suppose this blog post is really about problem solving.  You know, that favourite interview question - "Right, Jack, tell me about how you would go about solving a problem" and you think, Er...

First thing is research - Google finds lots of posts where it has been done, or something like it has been done, but usually not quite the same thing.  LinkedIn groups helped, but no magic bullet.  One of my colleagues vaguely remembered coming across something similar a couple of years ago, but couldn't really remember what he did to solve it <sigh>.  But it's always worth asking around, because there's a fairly good chance that someone will know the answer.  

Then, evaluate the options.  I did try moving row 172 up to the top, and it worked like a charm.  But it isn't my spreadsheet to modify, and there is no guarantee that the author isn't going to write a long and complicated explanation in row 207 next week.  So that was a no go.  

Then - decision.  I went for editing the registry in the end.  There's an article here on how to do it.  You change the permitted value for the TypeGuessRows variable in the registry so that DTS reads all the rows in the spreadsheet. Unfortunately, this means that the DTS package will only run on a machine which has been so modified. On the other hand, maybe this means that I have a job for life?  Yeah, right.

And Test.  Yep - it works.