SQL Server – Large table size affecting usage

performance | query-performance | sql-server | sql-server-2005

I have the following database table, which stores all outgoing emails to students:

    CREATE TABLE [dbo].[tblEmailsSent](
        [id] [int] IDENTITY(1,1) NOT NULL,
        [Sent] [datetime] NULL,
        [SentByUser] [nvarchar](50) NULL,
        [ToEmail] [nvarchar](150) NULL,
        [StudentID] [int] NULL,
        [SubjectLine] [nvarchar](200) NULL,
        [MessageContent] [ntext] NULL,
        [ReadStatus] [bit] NULL,
        [Folder] [nvarchar](50) NULL,
     CONSTRAINT [PK_tblMessages] PRIMARY KEY CLUSTERED 
    (
        [id] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
     IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
    GO        

The table currently stores 18,700 records.

I have a number of issues, which I guess might be related:

  1. The table, for some reason, weighs 460 MB – which means I am pretty much hitting the limit of the hosting package. I wouldn't have thought that 18,700 rows would weigh so much. Is there something that can be done about this?

  2. When I load the top X rows from the table, the response is VERY slow. We are talking about a run time of a little more than 2 minutes (in SSMS), for a simple query that looks like this:

    SELECT top 500  * 
    FROM tblEmailsSent
    ORDER BY id desc
    
  3. The following query returns zero records, when I know for a fact (i.e., I can see them in the table) that emails were sent during the day of the 18th:

     SELECT id FROM tblEmailsSent
     WHERE  convert(datetime,[sent],103) = convert(datetime,'18/11/2016',103)
    

I am using Microsoft SQL Server 2005 – 9.00.5000.00.

I am using ntext because I am using Unicode (for Hebrew support).

SET STATISTICS TIME ON for query 2 returns:

(500 row(s) affected). 

SQL Server Execution Times:
    CPU time = 141 ms,  elapsed time = 112347 ms.

A SELECT AVG(DATALENGTH(MessageContent)) query returns 23,363 – an average of roughly 23 KB per message.


I changed the MessageContent column to nvarchar(MAX), and ran query 2 again:

SQL Server Execution Times:
    CPU time = 47 ms,  elapsed time = 116726 ms.

…only a few seconds' difference from before.

The server is remote. It generally responds fairly quickly, but this is extremely slow for such a simple query.

Using SET STATISTICS IO ON returns:

Table 'tblEmailsSent'.
 Scan count 1, logical reads 17, physical reads 0, read-ahead reads 354, 
 lob logical reads 3308, lob physical reads 660, lob read-ahead reads 0.
SQL Server Execution Times:
 CPU time = 62 ms,  elapsed time = 107245 ms.

Best Answer

Community wiki answer collecting information left in comments by Dan Guzman and Aaron Bertrand.

The space requirements for this table will mostly be driven by the size of the MessageContent column. The default behavior of the deprecated ntext type is that the column data is stored off-row, in separate LOB data pages of 8 KB each.
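If you must keep the ntext type for now, SQL Server's documented `text in row` table option lets small LOB values be stored in the data row instead of on dedicated 8 KB LOB pages. A sketch (the 7000-byte limit shown is the maximum allowed; existing values only move in-row when they are next updated):

    -- Allow ntext values up to 7000 bytes to be stored in-row
    EXEC sp_tableoption 'dbo.tblEmailsSent', 'text in row', '7000';

    -- Rewrite existing values so eligible ones move in-row
    UPDATE dbo.tblEmailsSent
    SET    MessageContent = MessageContent;

Note this only helps for values that fit within the limit; with an average of ~23 KB per message, most rows here would stay off-row, which is why switching to nvarchar(MAX) (below) is the better fix.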

So with 18,700 rows, a minimum of about 150 MB will be needed, with values larger than 8 KB requiring additional pages. nvarchar(MAX) will store smaller values in-row and reduce space requirements considerably if most values are under 8 KB.
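A minimal sketch of the conversion, assuming you can take a maintenance window. Simply altering the column does not rewrite existing values, so an in-place update is needed afterwards, and DBCC CLEANTABLE then reclaims the space the dropped LOB data occupied:

    -- Replace the deprecated ntext column with nvarchar(MAX)
    ALTER TABLE dbo.tblEmailsSent
        ALTER COLUMN MessageContent nvarchar(MAX) NULL;

    -- Rewrite each value so eligible data moves in-row
    UPDATE dbo.tblEmailsSent
    SET    MessageContent = MessageContent;

    -- Reclaim space freed from variable-length columns
    DBCC CLEANTABLE ('YourDatabaseName', 'dbo.tblEmailsSent');

(`YourDatabaseName` is a placeholder for your actual database.) Check the result with `EXEC sp_spaceused 'dbo.tblEmailsSent';` before and after.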

If you are in danger of hitting your hosting limits, have you considered a different design, where you store these big messages in flat files on the file system, and just the path in the database? Typically hosted file system storage is much cheaper than hosted database storage.

Since your server is remote, the high elapsed time may be due to low network bandwidth and latency: CPU time is under 150 ms while elapsed time is nearly two minutes, so the server finishes quickly and the rest is spent shipping ~11 MB of LOB data over the wire. There is not much you can do on the database side in that case.
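One thing that does help over a slow link is simply not selecting the LOB column when you don't need it. `SELECT *` drags every MessageContent value across the network; listing the columns from the table definition, minus MessageContent, keeps the result set tiny:

    -- Fetch the 500 newest emails without the large message bodies
    SELECT TOP (500)
           id, Sent, SentByUser, ToEmail,
           StudentID, SubjectLine, ReadStatus, Folder
    FROM   dbo.tblEmailsSent
    ORDER  BY id DESC;

Fetch MessageContent separately, by id, only for the single message being displayed.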