T-sql – Populate ID Field by Converting Group Name Values

functionsssist-sql

This is a bit tricky to explain, but here goes. We have a System Rights database that loads system users and system rights or access levels into two different tables. The database is already built and live with some system data populating it daily. I have been asked to build the SSIS package to populate a few new system users/rights into this database, but a couple of these systems do not have numeric fields identifying the position rights field as an id that I can populate the systemAccessLevelID field in the database with. I will try to outline what I have and what I need to do.

SystemUsers Table:

  • SystemIDInt Defined in a system table
  • SystemAccessLevelIDInt populated from SSIS pull from source
  • Namenvarchar(50) populated from SSIS pull from source

SystemRights Table:

  • SystemIDInt Defined in a system table

  • SystemAccessLevelIDInt populated from SSIS pull from source

  • SystemAccessLevelDescnvarchar populated from SSIS pull from
    source

The idea here is one source database that stores all of the different system users with their rights in those respective systems to be able to run audit reports against the database.

The Problem
As an example in Source1 there is no numeric field that can be converted to Int to identify the Access levels for the different rights in the systems.

This has to get reloaded daily for audit purposes and the SystemAccessLevelID field needs to stay the same regardless if new system access levels are created in the source system.

I debated trying to create a function to loop through the SystemAccessLevelDesc field and converting each letter to an integer through hard coding a numeric value for every letter and special character then adding it all together, but I don't really know where to start with that. I also feel that I must be missing an existing function or process somewhere that I could use and was hoping someone could point me in the right direction.

Example Data:

SystemID, SystemAccessLevelID, SystemAccessLevelDesc
1       , %DontHave%         , SYSADMIN
1       , %DontHave%         , SYSACCESS
1       , %DontHave%         , SYSREAD

How can I convert he different SystemAccessLevelDesc to get unique numeric values for SystemAccessLevelID? I was hoping to be able to do this with a single Select statement still which is why I was thinking the user defined function was my best route.

Best Answer

I would use the CHECKSUM function to determine a numeric value for each SystemAccessLevelDesc. You could make this a persisted, computed column, such that the value is automatically calculated when data is inserted into the column.

From the CHECKSUM page on Microsoft's documentation site:

CHECKSUM computes a hash value, called the checksum, over its list of arguments. The hash value is intended for use in building hash indexes. If the arguments to CHECKSUM are columns, and an index is built over the computed CHECKSUM value, the result is a hash index. This can be used for equality searches over the columns.

CHECKSUM satisfies the properties of a hash function: CHECKSUM applied over any two lists of expressions returns the same value if the corresponding elements of the two lists have the same type and are equal when compared using the equals (=) operator. For this definition, null values of a specified type are considered to compare as equal. If one of the values in the expression list changes, the checksum of the list also generally changes. However, there is a small chance that the checksum will not change. For this reason, we do not recommend using CHECKSUM to detect whether values have changed, unless your application can tolerate occasionally missing a change. Consider using HashBytes instead.

There is a possibility, however extremely small, that CHECKSUM can return the same value for two different inputs. I've created a UNIQUE constraint on the table such that an insert where the checksum results in two identical values will fail. If you have a relatively small number of values for the SystemAccessLevelDesc column, I would not expect you to experience any checksum collisions.

Sometimes an example is helpful:

USE tempdb;

CREATE TABLE dbo.SystemRights
(
    SystemID int NOT NULL
    , SystemAccessLevelID AS CHECKSUM(SystemAccessLevelDesc) PERSISTED
        CONSTRAINT UQ_SystemAccessLevelID
        UNIQUE
    , SystemAccessLevelDesc nvarchar(100) NOT NULL
    , CONSTRAINT PK_SystemRights
        PRIMARY KEY CLUSTERED
        (SystemId, SystemAccessLevelID)
);

INSERT INTO dbo.SystemRights (SystemID, SystemAccessLevelDesc)
VALUES (1, 'SYSADMIN')
    , (1, 'SYSACCESS')
    , (1, 'SYSREAD');

SELECT *
FROM dbo.SystemRights;

The results from the SELECT above:

╔══════════╦═════════════════════╦═══════════════════════╗
║ SystemID ║ SystemAccessLevelID ║ SystemAccessLevelDesc ║
╠══════════╬═════════════════════╬═══════════════════════╣
║ 1        ║ -32772874           ║ SYSACCESS             ║
║ 1        ║ 1727577266          ║ SYSADMIN              ║
║ 1        ║ 2008724622          ║ SYSREAD               ║
╚══════════╩═════════════════════╩═══════════════════════╝

If you are concerned about checksum collisions, and are not worried about space implications, you could use the HASHBYTES function to generate a 32-byte binary value instead of the int value returned by CHECKSUM:

IF OBJECT_ID(N'dbo.SystemRights', N'U') IS NOT NULL
DROP TABLE dbo.SystemRights;

CREATE TABLE dbo.SystemRights
(
    SystemID int NOT NULL
    , SystemAccessLevelID AS CONVERT(binary(32), HASHBYTES('SHA2_256', SystemAccessLevelDesc)) PERSISTED
        CONSTRAINT UQ_SystemAccessLevelID
        UNIQUE
    , SystemAccessLevelDesc nvarchar(100) NOT NULL
    , CONSTRAINT PK_SystemRights
        PRIMARY KEY CLUSTERED
        (SystemId, SystemAccessLevelID)
);

INSERT INTO dbo.SystemRights (SystemID, SystemAccessLevelDesc)
VALUES (1, 'SYSADMIN')
    , (1, 'SYSACCESS')
    , (1, 'SYSREAD');

SELECT *
FROM dbo.SystemRights;

The output:

╔══════════╦════════════════════════════════════════════════════════════════════╦═══════════════════════╗
║ SystemID ║ SystemAccessLevelID                                                ║ SystemAccessLevelDesc ║
╠══════════╬════════════════════════════════════════════════════════════════════╬═══════════════════════╣
║ 1        ║ 0x09B006A24922040C18013ED244C62245B8DEA81A0AB3829FEF2471299138BEC5 ║ SYSACCESS             ║
║ 1        ║ 0x98AC053FBB8875188E9DCB3F77FDB6245CA9A42B8BECFAA204F2FA3054DACB7A ║ SYSREAD               ║
║ 1        ║ 0xDC61780D08D34D244226A6B414FE382E138819B40362B624888D775D31505C1A ║ SYSADMIN              ║
╚══════════╩════════════════════════════════════════════════════════════════════╩═══════════════════════╝