In order to troubleshoot a problem, I have a one-time question as to whether a specific varchar(max)
field contains non-printing ASCII characters (other than white-space). The following is my straightforward idea about how to determine if there are such characters stored in our production database.
SELECT TOP 10 [CaseNoteId]
,[CaseId]
,[CaseNote]
FROM [DB].[XY].[ReferralCaseNotes]
WHERE CaseNote LIKE ('%[' + CHAR(1) + '-' + CHAR(8) + CHAR(11) + CHAR(12) + CHAR(14) + '-' + CHAR(31) + CHAR(127) + ']%')
My hesitancy to actually run this stems from using wildcards in the LIKE pattern, that there are over a million records in the table in question, the lack of a full-text index on this column, and that this will likely be an exhaustive search because we do not believe that any such characters exist.
I am a neophyte. How can I estimate whether running this query will be a significant load on our production system? Also, is there a better way to get at the same information?
Possible Improvements:
- I'm not worried about data changing while my query runs. Can I change this query to look at a few rows at a time in a way that is beneficial?
- Can I set this query to somehow be a background operation that doesn't get in the way of any other queries?
- Can I run it for a limited time and determine what percentage of the table was searched, so that I can estimate the time required for a full search?
- Would
WITH(READPAST)
improve my performance?
Why?
The database in question involves sensitive data, the government, and security folks making rules. Restoring a backup to a different server makes a ton of sense, but would cost the taxpayer several orders of magnitude more than makes any sense.
If the answer is, "Don't worry, you're just doing a SELECT," then I say, "Great!"
Best Answer
READ COMMITTED
or evenREAD UNCOMMITTED
. It is a common myth that aREAD COMMITTED
scan locks the table.MAXDOP 1
hint. Controlling load of bulk operations is very hard with SQL Server. Depending on the situation you might be 100% fine leaving this running all day, or you might induce timeouts in other parts of the workload. It is not unreasonable to run the query for 10s and cancel it. Then determine whether the application workload was impacted or not.Fulltext search cannot help you because it works on a per-word basis. You'd need to plug in a custom stemmer that knows how to split special characters. Unrealistic. Your query is fine.