SQL code to find the columns of max that need to be moved to the end of code for importing to R
@_JohnMackintosh posted in Twitter about his ‘red pants in the wash’ being those date columns that are in fact varchar()
. I’ve been very lucky as our data warehouse people have consistently formatted dates in the SQL tables including my favourite integer yyyymmdd
format which beautifully imports into R with no issues.
What I have found though, which is my ‘red sock in the wash’, is varchar(max)
or nvarchar(max)
data which, it seems, causes an import error unless these columns are placed at the end of the select script or recoded by using something like CAST(colm AS varchar(150))
.
The first time I noticed this, it was because of a table I had built myself. I had put in a column for comments and naturally went for varchar(max)
. To solve the problem I just changed the data type for the column - because I could.
The next time I encountered this, which wasn’t too long after, it was in a database I couldn’t update so I found my import scripts to R failing with an error:
Error in result_fetch(res@ptr, n) : nanodbc/nanodbc.cpp:3011: 07009: [Microsoft][ODBC SQL Server Driver]Invalid >Descriptor Index Failed to execute SQL chunk
which isn’t really descriptive but searching for the error brought this up for the ODBC package:
Some of the tables I was working with have nearly 100 columns and whilst I can scan through for them, I’ve recently had to import about 20 tables so this would be long-winded.
First, find your data type number:
-- USE database_example
SELECT system_type_id, name
FROM sys.types
WHERE system_type_id = user_type_id
Then when you know what varchar(max)
or even nvarchar(max)
is listed under for your own database search for it with this:
-- USE database_example
SELECT
= s.name
SchemaName
,o.type_desc= o.name
,ObjectName = c.Name
,ColumnName FROM sys.objects AS o
LEFT JOIN sys.schemas AS s ON o.schema_id = s.schema_id
LEFT JOIN sys.all_columns AS c ON o.object_id = c.object_id
WHERE c.system_type_id IN (231, 167)
AND c.max_length = -1
--Other useful code to restrict what is returned
AND s.name IN ('schema_name') -- restrict to the schema
AND o.type_desc = 'VIEW' -- or USER_TABLE
AND o.name LIKE '%table_name%' -- or use = 'precise_table_name'
Unfortunately, now I can’t use the really straight forward script:
SELECT *
FROM schema.Table
and instead I have to list out all 100 columns and move those that are varchar(max)
to the end. Still, this code saves lots of peering at the screen.
If you see mistakes or want to suggest changes, please create an issue on the source repository.
For attribution, please cite this work as
Turner (2021, Feb. 13). Blog: Finding SQL varchar(max). Retrieved from https://philosopher-analyst.netlify.app/posts/2021-02-13-finding-sql-varcharmax/
BibTeX citation
@misc{turner2021finding, author = {Turner, Zoë}, title = {Blog: Finding SQL varchar(max)}, url = {https://philosopher-analyst.netlify.app/posts/2021-02-13-finding-sql-varcharmax/}, year = {2021} }