Sql-server – Change collation on partition column (SQL Server 2017)

collationpartitioningsql server

I'm trying to change collation on column:

ALTER TABLE [dbo].[yourdad] ALTER COLUMN [yourmam] varchar(36) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL;

and it gives message like this:

Msg 5074, Level 16, State 1, Line 1 The object 'yourdad' is dependent
on column 'yourmam'. Msg 4922, Level 16, State 9, Line 1 ALTER TABLE
ALTER COLUMN yourmam failed because one or more objects access this
column.

It has partition schema and function that depend on this column.

Currently I have like over 40k table with such columns where collation should be changed.

Is it possible to change collation without a table recreation?

Drop the partitioned index(es) – deleted.
But when I'm trying to delete schema:

DROP PARTITION SCHEME yourdad

Msg 7717, Level 16, State 1, Line 1 The partition scheme
"yourdad" is currently being used to partition one or more
tables.

It's a new case insensitive database collation on database I wan't to install, old one was – case sensitive.

Best Answer

You don't need to drop the table(s).

One option (that I have tested) is:

Drop the partitioned non-clustered index(es); move partitioned clustered indexes, primary keys, and unique constraints to a non-partitioned file group; and unpartition heaps by creating a (temporary) clustered index on a non-partitioned file group. For help identifying which indexes are affected and what action needs to be taken for each one, please see the query I have posted in the "Finding and Fixing Ignored Objects" section of "Changing the Collation of the Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong?" (which also contains examples of dealing with clustered indexes, contraints, and heaps).
Drop the partition scheme(s)
Drop the partition function(s)
Alter the column definition with the new collation
Drop and recreate the partition function to specify the COLLATE clause with the new collation for the input parameter (assumption here is that you are not changing the default collation of the database, and the input parameter will use the database's default collation for string types unless specified using the COLLATE clause, and the CREATE INDEX statement using this scheme and function will fail if the collation of this parameter does not match the collation of the partitioning column). Meaning, if you change the column's collation to SQL_Latin1_General_CP1_CI_AI but the database is still using SQL_Latin1_General_CP1_CI_AS, then you need to:
```
CREATE PARTITION FUNCTION PartitionFunc
(
  VARCHAR(36) COLLATE SQL_Latin1_General_CP1_CI_AI
)
...
```

Now, 40k+ tables is perhaps more than a few, but if they are all using the same partition scheme and function, then this is only 4 extra statements (drop and create for the partition schema and partition function) compared to what you already need to do even if partitioning wasn't being used (i.e. you would need to drop and recreate the indexes no matter what). So, while this does require a lot of I/O as the indexes are dropped and recreated, there is no way around that when changing a column's collation.

I am assuming (and have fingers crossed) that these columns are not:

part of the clustered index
key fields in a primary key or unique index/constraint that is foreign-keyed to

If either of those cases is true then you have a lot more work to do, but that has nothing to do with the partitioning.

Another option (given the desire to not recreate everything) might be to use the undocumented sqlservr.exe -q option to change the collation on ALL user columns and databases on the instance. This approach changes the meta-data for the database, columns, parameters, etc, and then rebuilds the indexes (again, no way around this when changing collations, so with 40k+ tables this will likely take a few minutes (or more) and the database, or the entire instance if you update the collation globally, will be offline during that time).

I just tested this scenario on SQL Server 2019 RC1, and it does not work for partition functions, but that doesn't mean that this approach isn't part of the overall solution. It just means that you really have no choice but to go through the process noted in the section above. The sqlservr.exe -q approach will handle everything else (except for user-defined table types, I believe). There are plenty of restrictions on being able to change a column's collation, and even more on changing the database's default collation (lots of dependencies in both cases), so if you are wanting to have a case-sensitive database now, then likely you will have other obstacles besides just the partitioned columns.

The thing that doesn't work with partition functions is that the collation of the input parameter is stored in sys.partition_parameters. This particular meta-data is not changed with the otherwise-awesome-even-if-undocumented -q option. The effects this has are:

Assuming the data is either VARCHAR but no change of code page, or NVARCHAR, then the data will still be partitioned as it was before. For example, let's say that you have a RANGE LEFT partition function using FOR VALUES ( 'D', 'H', 'O', 'V' ). In this case, partition #2 contains values > D and <= H.
- When using the case-insensitive collation of SQL_Latin1_General_CP1_CI_AI, partition #2 would contain E, e, F, f, G, g, H, h. Simple enough, right?
- However, when using the case-sensitive collation of SQL_Latin1_General_CP1_CS_AI, partition #2 would contain d, E, e, F, f, G, g, H. This partition now contains lower-case d, and does not contain lower-case h. This is due to most of the SQL Server collations sorting upper-case before lower-case (Windows collations sort lower-case first).
If data is VARCHAR and the code page is different, then things might get interesting. Any characters with values between 128 and 255 can potentially either have a different value and/or sort position, be mapped to a similar character because it is't available on code page of collation used for partition function collation (i.e. "best fit" mapping), or it might be missing on the old code page hence it becomes a "?" and sorts as that. Depending on what characters are being used, how they map (if they even do) to the other code page, and how the partition ranges are set up, you can experience anything ranging from "no effect at all" to "really strange and difficult to trace/debug behavior".

I have submitted a request to have this particular meta-data added to the sqlservr.exe -q operation, but Microsoft has no real incentive to fix it given that it's an undocumented feature. So, please support it (i.e. vote for it) so they can at least see that people do want it (thanks!): Update Collation of Partition Functions and User-Defined Table Types via SQLSERVR -Q

That being said, the two main questions are:

What is the effect of changing the collation in the first place? What is the current collation and what is the new collation?

If the code page is changing, will there be data loss (the example code in the question shows a VARCHAR column, which makes this a possibility; not so for NVARCHAR)? This approach does not do a code page conversion, so characters with values between 128 and 255 might become some other character, and there are no warnings/errors when this happens (but if you don't have such data, then this is a non-issue).

If sensitivities are changing (i.e. going from Accent-Sensitive to Accent-Insensitive) then will there now be violations of uniqueness in an existing unique index/constraint?
Do you want the new collation used for EVERYTHING? If so, this might work. If not, then this approach is not viable as it changes EVERYTHING.

For complete details on this approach, please see my post:

Changing the Collation of the Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong? (be sure to review the "Finding and Fixing Ignored Objects" section which contains queries to help with fixing partition functions)

If you have multiple databases and want to only change this one (again, EVERYTHING in this database, including the database's default collation), then you can:

Detach (or backup) this database.
Create a new, temporary instance just for making this change, using the same instance-level collation being used for the instance that this database currently belongs to.
Attach (or restore) the database to the new, temporary instance.
Update the temp instance's collation using the sqlservr.exe -q method.
Detach (or backup) the database from the temp instance.
Re-attach (or restore) the database to the instance it came from.

Related Solutions

Sql-server – SQL Server DB restore fails

Msg 5064, Level 16, State 1, Line 1 Changes to the state or options of database 'xyz' cannot be made at this time. The database is in single-user mode, and a user is currently connected to it.

This error does not occur if YOU are the single-user in the database. It only occurs if someone else is in it AND it is ALREADY in single-user mode.

For example, try this:

create database test
--
use test
--
ALTER DATABASE test SET SINGLE_USER WITH ROLLBACK IMMEDIATE
--
ALTER DATABASE test SET SINGLE_USER WITH ROLLBACK IMMEDIATE

Note: there are 4 separate batches, and the 4th one doesn't error. You're setting the mode while IN the database, and also setting it a 2nd time. No error.

Assuming you have been given the mandate to restore the DB regardless of what is happening, do an sp_who2 to look for the user who is connected, and KILL the spid. You can then proceed with the RESTORE.

Sql-server – How to create a partitioned table based on date

As the error message says, any partition-aligned unique index has to include the partitioning key in the index key. This requirement exists so the engine can enforce uniqueness on updates without checking every partition.

In your case, this means including OrderDate in the nonclustered index key, or having a non-aligned index. Both are potentially valid choices, depending on your circumstances. To preserve alignment, your table and index definitions would be:

CREATE TABLE dbo.Orders
(
    OrderID     integer NOT NULL,
    Name        nvarchar(20) NULL,
    OrderDate   date NOT NULL,

    CONSTRAINT PK__Orders_OrderID_OrderDate
        PRIMARY KEY NONCLUSTERED 
            (OrderID, OrderDate)
        ON PS (OrderDate)
)
ON PS (OrderDate);
GO
CREATE CLUSTERED INDEX CX__Orders_OrderDate
ON dbo.Orders (OrderDate)
ON PS (OrderDate);

Of course this changes the uniqueness that the nonclustered index enforces. Now, only the combination of OrderID and OrderDate is guaranteed to be unique. It is theoretically possible to add duplicate OrderIDs, so long as the OrderDate is different. Whether this change of semantic is acceptable to you depends on your circumstances, but it is something to be aware of.

The alternative is to have the nonclustered primary key non-aligned:

CREATE TABLE dbo.Orders
(
    OrderID     integer NOT NULL,
    Name        nvarchar(20) NULL,
    OrderDate   date NOT NULL,

    CONSTRAINT PK__Orders_OrderID
        PRIMARY KEY NONCLUSTERED 
            (OrderID)
        ON [PRIMARY]
)
ON PS (OrderDate);
GO
CREATE CLUSTERED INDEX CX__Orders_OrderDate
ON dbo.Orders (OrderDate)
ON PS (OrderDate);

This preserves the uniqueness of OrderID alone, and has some benefits with queries that compute MIN or MAX aggregates using the index, but you lose the ability to SWITCH partitions in and out without dropping the primary key and recreating it after the switching operation.

You can read more about partitioning in this section of Books Online and more on the issues of aligned and non-aligned indexes in this excellent answer by Remus Rusanu.

Best Answer

Related Solutions

Sql-server – SQL Server DB restore fails

Sql-server – How to create a partitioned table based on date

Related Question