Sql-server – sp_send_dbmail is populating 00 bits between all characters in a result set

database-mailencodingsql servert-sqlunicode

I'm using sp_send_dbmail in a SQL agent job triggered nightly. The job queries our database and checks for product price updates, if there are some, it will send them in an email as an attachment to our e-commerce vendor. They have some automatic processes they will update our e-commerce platform, however their automated process cannot deal with the file provided by sp_send_dbmail.
sp_send_dbmail seems to be putting null characters between all characters in the result set.

See result of opening the csv in a hex editor:

Viewed through a text editor I see as expected:

sp_send_dbmail query here:

    SET @FileName = 'Update_' + CONVERT(VARCHAR(12),GETDATE),105) '.csv'        
    SET @Query = 'Some Query'

    EXEC msdb.dbo.sp_send_dbmail
        @recipients = 'me@domain.com'
        @query = @Query
        @attach_query_result_as_file = 1
        @query_attachment_filename = @FileName
        @query_result_separator = ','
        @query_result_no_padding =  1
    END

What's going on here?

****EDIT FOR ADDITIONAL CLARIFICATION****

Using Microsoft SQL Server 2008 R2
Collation is Latin1_General_CI_AS

Leaving no_padding out leaves several trailing spaces in each returned field. each space (20) separated by a null (00)

Best Answer

The file encoding for the attachments is standard UTF-16 Little Endian (LE) -- one of the various Unicode encodings. The first two bytes, shown in the first image, are FF FE. These are the Byte Order Mark (BOM), which is an explicit indication of the files encoding. And you can see in the chart on that linked Wikipedia page that FF FE indicates an encoding of UTF-16 Little Endian.

UTF-16 uses 2 or 4 bytes per "character", so the 00 null bytes are really just half of each of those characters. And, because it is Little Endian instead of Big Endian, the bytes in each pair are swapped. Meaning, the highlighted 00 in the top image is paired with the preceding 50 (for the capital P). But the pair 50 00 is just Little Endian for 00 50, which is Unicode Code Point U+0050 (there is a chart at the bottom of that page that shows the encoding variations).

If you were to open that file in any editor that auto-detected the file encoding via the BOM, it would know that it was a UTF-16 LE encoded file and display it correctly, as you would expect to see it. Just try it out in something like Notepad++.

The first thing I would do is contact the vendor and/or check the documentation to see if either the e-commerce platform already has a means of dealing with UTF-16 LE encoded files (a.k.a. "Unicode" in Microsoft-land), OR if they can make a small change to handle Unicode files. I would be surprised if they both don't already handle this in some fashion and won't do it if not already able to. Given that this world is moving towards more internationalization rather than less, you can't be the only client needing to pass in a Unicode file. Also, converting the file to ASCII Extended could lead to data loss if your data naturally has characters in it that do not fit into a single code page.

The last thing I would do is update the definition of sp_send_dbmail. I am not saying to never, ever do that, but in most cases it is unnecessary. If anything, it would be preferable to convert the encoding of the attachment after it is sent (which will probably need to be done by the recipient of the email). An intentional conversion will allow for specifying the correct encoding, whether it be UTF-8 or Extended ASCII using one of the many available code pages.

Related Solutions

T-sql – Result set to eventvwr

A CURSOR with the above query as it's definition would allow you to insert the records into the event log one-by-one.

Another option you may want to look at is querying sys.dm_exec_query_stats:

SELECT TOP 10
    st.text,
    qs.max_logical_writes,
    qs.max_logical_reads,
    qs.max_elapsed_time,
    deqp.query_plan
FROM
    sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text( qs.sql_handle ) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS deqp
ORDER BY
    qs.max_elapsed_time DESC

This query will return the top 10 longest running queries along with their plans. This is just a sample, it could easily be tweaked, and may point you in another direction.

Sql-server – How to transform this result set

The example below uses a CTE to find the gaps in your TimeScheduleId sequence. The field HasPreviousItem identifies if a row is the first in a sequence and RowNum is used to number all of the records. To find the start dates/times select for records where HasPreviousItem = 0.

The Duration assumes that each record is a 15 minute block of time. The cross apply subquery finds all records for the same ApptCode in the same block. That is records in sequence not interuppted by a HasPreviousItem value change. If the "duration" of a record is not always 15 minutes an alternate method to find the duration of an appointment would be needed.

CREATE TABLE ApptTemp (
     ProviderId INT NOT NULL DEFAULT 196
    , OfficeId INT NOT NULL DEFAULT 7
    , ApptCode CHAR(1) NOT NULL
    , Description VARCHAR(20) NOT NULL  
    , ScheduleTimeId INT NOT NULL
    , ScheduleTime DATETIME NOT NULL
    , ApptDate DATETIME NOT NULL
);

GO

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES ( 'L', 'Lunch', 49, '1900-01-01 12:00', '2014-03-20 12:00'  );

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES  ( 'L', 'Lunch', 50, '1900-01-01 12:15', '2014-03-20 12:15'  );

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES  ( 'L', 'Lunch', 51, '1900-01-01 12:30', '2014-03-20 12:30'  );

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES  ( 'L', 'Lunch', 52, '1900-01-01 12:45', '2014-03-20 12:45'  );

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES  ( 'S', 'Staff Meeting', 33, '1900-01-01 08:00', '2014-03-20 08:00'  );

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES  ( 'S', 'Staff Meeting', 34, '1900-01-01 08:15', '2014-03-20 08:15'  );

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES  ( 'S', 'Staff Meeting', 41, '1900-01-01 10:00', '2014-03-20 10:00'  );

INSERT INTO ApptTemp ( ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate )
VALUES  ( 'S', 'Staff Meeting', 42, '1900-01-01 10:15', '2014-03-20 15:00'  );



WITH ApptCte 
AS (
SELECT ApptCode, Description, ScheduleTimeId, ScheduleTime, ApptDate, ProviderId, OfficeId

        , CASE
            WHEN  EXISTS
                (
                SELECT  *
                FROM     ApptTemp AS A2
                WHERE  A2.ScheduleTimeId = A1.ScheduleTimeId - 1
                    AND A2.ApptCode = A1.ApptCode
                )
                    THEN 1
            ELSE 0
        END AS HasPreviousItem 

        , RowNum = Row_Number() OVER(ORDER BY ScheduleTimeId)

FROM ApptTemp AS A1

)


SELECT 
     ApptDate
    , Description AS ApptDetail
    , OfficeId
    , ProviderId
    , Duration.Total AS Duration


FROM ApptCte AS MainCte
    CROSS APPLY
        (
            SELECT (COUNT(*) + 1) * 15 AS Total
            FROM ApptCte AS CountCte
            WHERE CountCte.ApptCode = MainCte.ApptCode
                AND CountCte.HasPreviousItem = 1
                AND CountCte.RowNum > MainCte.RowNum
                AND NOT EXISTS ( SELECT *
                                    FROM  ApptCte AS InnerCte 
                                    WHERE InnerCte.ApptCode = CountCte.ApptCode 
                                        AND InnerCte.HasPreviousItem = 0
                                        AND InnerCte.RowNum != MainCte.RowNum )
        ) AS Duration

WHERE HasPreviousItem = 0

ORDER BY ApptDate;

Best Answer

Related Solutions

T-sql – Result set to eventvwr

Sql-server – How to transform this result set

Related Question