SQL Server – Developer's query performing very poorly in custom database

database-design, performance, query-performance, sql-server, sql-server-2012

We have an in-house-developed web application with a very small database (~20 MB) and a small number of users (~20 max). Unfortunately, it is a very high-visibility application used by executives. They have found that late in the afternoon every week, when multiple users are making changes simultaneously, performance is EXTREMELY poor – saving an update can take 5 minutes, and sometimes it fails completely.

The table design is not ideal, and I've recommended changes to the developer that would take some time to implement. The main table has a number of varchar(max) columns in which delimited lists of users are stored. All of the tables in the database have only a single clustered index, on an ID column – another thing I've recommended the developer improve; a sketch of the kind of normalization I have in mind follows.
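To illustrate, this is roughly the normalization I've suggested: a child table replacing the delimited list, one row per (action, user) pair. The table and column names here are hypothetical, not the real schema:

-- Hypothetical child table replacing a delimited varchar(max) list of users.
-- One row per (action, user) pair instead of 'user1;user2;user3' in one column.
CREATE TABLE dbo.Action_Users (
    Action_ID   int NOT NULL,            -- FK to dbo.Actions(ID)
    User_EmplID varchar(50) NOT NULL,
    CONSTRAINT PK_Action_Users PRIMARY KEY CLUSTERED (Action_ID, User_EmplID),
    CONSTRAINT FK_Action_Users_Actions FOREIGN KEY (Action_ID)
        REFERENCES dbo.Actions (ID)
);

-- Supports lookups by user without scanning delimited strings.
CREATE INDEX IX_Action_Users_User ON dbo.Action_Users (User_EmplID);

-- Finding every action a given user is on becomes an index seek:
SELECT Action_ID FROM dbo.Action_Users WHERE User_EmplID = @EmplID;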

Some "projects" in this application are significantly slower than others, there is one in particular that is almost always slow to save, but outside the heavy-usage period, we can typically see queries perform well.

I am not a developer, but as a DBA this query looks poor/inefficient to me – it is typical of the application's "save project" queries. It appears auto-generated by some tool: the application builds the query itself and passes a parameter for each column in the table, and it does not use stored procedures at all.

UPDATE Actions
SET Name = @Name,
    Start_Date = @Start_Date,
    End_Date = @End_Date,
    Status = @Status,
    Comments = @Comments,
    Comp_Date = @Comp_Date,
    Owner = @Owner,
    Owner_EmplID = @Owner_EmplId,
    Status_Name = @Status_Name
WHERE ID = @ID;

The server is SQL Server 2012 Standard with 4 CPUs and 18 GB RAM (pretty typical for our environment). It hosts other application databases, but overall server load is normal during this application's peak usage times, and no jobs or loads run in the other databases then. While monitoring peak usage I have not found any deadlocks, or indeed any locking at all.
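For reference, a minimal way to check for live blocking with the standard DMVs is something like this (any request with a non-zero blocking_session_id is currently blocked):

-- Show currently executing requests that are blocked, and who is blocking them.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,        -- ms spent waiting so far
       t.text AS query_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;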

We also have a Tableau report connected to this same database that auto-generates some other poor reporting queries. The entire database has only about 25,000 rows across all tables, yet the query Tableau runs generates millions of rows; I suspect a cross join in it somewhere, but none of my peers want to focus on that aspect of the issue.
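One way to check whether the Tableau query really is the heaviest workload is to pull the top statements from the plan cache, e.g.:

-- Top cached statements by total CPU; look for the Tableau-generated SQL.
SELECT TOP (10)
       qs.total_worker_time / 1000 AS total_cpu_ms,
       qs.total_elapsed_time / 1000 AS total_elapsed_ms,
       qs.execution_count,
       SUBSTRING(t.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                        WHEN -1 THEN DATALENGTH(t.text)
                        ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
ORDER BY qs.total_worker_time DESC;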

Do these update queries actually follow good development practice? Can you give any advice that would help me convince the developer to redesign some of this, or is there something I can do to improve performance without involving them?

Definition of the Actions table:

CREATE TABLE [dbo].[Actions](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [Project_ID] [int] NULL,
    [Name] [varchar](1000) NULL,
    [Owner] [varchar](500) NULL,
    [Owner_Emplid] [varchar](500) NULL,
    [Comments] [varchar](max) NULL,
    [Start_Date] [date] NULL,
    [End_Date] [date] NULL,
    [Comp_Date] [date] NULL,
    [Status] [int] NOT NULL,
    [Status_Name] [varchar](50) NULL
);

Best Answer

Storing delimited lists in varchar(max) columns is an anti-pattern, particularly if you then have to split those lists apart to query them. Properly designed relational tables perform much better. However, that is likely secondary to your performance problem.

I'd bet money on the problem being blocking. From Microsoft:

Blocking is an unavoidable characteristic of any relational database management system (RDBMS) with lock-based concurrency. On SQL Server, blocking occurs when one SPID holds a lock on a specific resource and a second SPID attempts to acquire a conflicting lock type on the same resource.

Blocking probably occurs while the long-running Tableau queries hold locks that prevent the updates from completing, creating the impression of slowness.

Brent Ozar has an excellent set of tools that can help understand performance problems. Specifically around blocking, this page contains a load of great resources.
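If you want SQL Server itself to record blocking as it happens, one lightweight option is the blocked process report: set the threshold (in seconds) and capture the resulting events with a server-side trace or Extended Events session. A sketch:

-- Emit a blocked process report for any block lasting 5 seconds or more.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold (s)', 5;
RECONFIGURE;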

By the way, the update statement in your question looks perfectly well-formed to me; I see nothing wrong with it as shown. If you provide query plans for the problematic queries and update statements, I'll offer more definitive advice.

If you do find that blocking is a significant problem, you may want to look into SQL Server's multi-version concurrency control option, READ_COMMITTED_SNAPSHOT, which can be enabled at the database level. From that document:

When the READ_COMMITTED_SNAPSHOT database option is set ON, read committed isolation uses row versioning to provide statement-level read consistency. Read operations require only SCH-S table level locks and no page or row locks. That is, the SQL Server Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions. A user-defined function can return data that was committed after the time the statement containing the UDF began.
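Enabling it is a one-line, database-level change (the database name below is a placeholder). Note that the ALTER needs exclusive access to the database, so either schedule it for a quiet moment or force it:

-- Turn on read committed snapshot isolation for the application database.
-- WITH ROLLBACK IMMEDIATE rolls back in-flight transactions so the ALTER
-- doesn't wait indefinitely for exclusive access; schedule accordingly.
ALTER DATABASE [YourAppDb]
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;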

Be aware that read committed snapshot isolation (RCSI) is not the silver bullet it can appear to be at first glance. Like most things, it has positive and negative effects. By design, it changes the concurrency profile of the database: you can no longer rely on locking to enforce select-then-update semantics, since the select no longer blocks other transactions from accessing the row that is about to be updated. Kendra Little has a [great article](https://littlekendra.com/2016/02/18/how-to-choose-rcsi-snapshot-isolation-levels/) on choosing between RCSI and snapshot isolation.