In Oracle, primary keys are case-sensitive, so if two values differ only in case they can be used as two different primary keys.
In SQL Server, by default, primary keys are case-insensitive and when you transfer data from Oracle to SQL Server and then try to create the primary key constraints, you may have duplicate key errors.
Last Update: Oracle 11g R2 and Microsoft SQL Server 2012
Assume there is a table with primary keys that differ in case only:
Oracle:
CREATE TABLE states ( id CHAR(2) NOT NULL, name VARCHAR2(25) ); -- Insert 2 values that differ in case only INSERT INTO states VALUES ('CA', 'California'); INSERT INTO states VALUES ('Ca', 'California'); ALTER TABLE states ADD PRIMARY KEY (id); -- Table altered
Unlike Oracle, SQL Server does not allow you to create the primary key in this case:
SQL Server:
CREATE TABLE states ( id CHAR(2) NOT NULL, name VARCHAR(25) ); -- Insert 2 values that differ in case only INSERT INTO states VALUES ('CA', 'California'); INSERT INTO states VALUES ('Ca', 'California'); ALTER TABLE states ADD PRIMARY KEY (id); -- Msg 1505, Level 16, State 1, Line 1 -- The CREATE UNIQUE INDEX statement terminated because a duplicate key was found -- The duplicate key value is (Ca).
By default, SQL Server treats values 'CA' and 'Ca' as equal.
You can use the following PL/SQL script to find all columns that are part of a primary or unique key, and differ in case only in Oracle for the current user:
Oracle:
SET SERVEROUTPUT ON -- Cursor to select all character columns that are part of primary or unique key DECLARE CURSOR c1 IS SELECT c.table_name, cc.column_name FROM user_constraints c, user_cons_columns cc, user_tab_columns tc WHERE c.constraint_type IN ('P', 'U') AND c.constraint_name = cc.constraint_name AND c.table_name = cc.table_name AND cc.table_name = tc.table_name AND cc.column_name = tc.column_name AND tc.data_type IN ('VARCHAR2', 'NVARCHAR2', 'CHAR', 'NCHAR'); c2 SYS_REFCURSOR; stmt VARCHAR2(300); dup_value VARCHAR2(300); total_found NUMBER := 0; total_columns NUMBER := 0; count_column BOOLEAN; BEGIN DBMS_OUTPUT.PUT_LINE('Opening a cursor...'); -- For each column find duplicates - values that differ in case only FOR r1 IN c1 LOOP count_column := TRUE; -- Statement to select duplicate values stmt := 'SELECT ' || r1.column_name || ' FROM ' || r1.table_name || ' WHERE UPPER(' || r1.column_name || ') <> ' || r1.column_name || ' AND ' || ' UPPER(' || r1.column_name || ') IN ' || ' (SELECT UPPER(' || r1.column_name || ') FROM ' || r1.table_name || ' GROUP BY UPPER(' || r1.column_name || ')' || ' HAVING COUNT(*) > 1)'; DBMS_OUTPUT.PUT_LINE('Checking ' || r1.table_name || '.' || r1.column_name || '...'); OPEN c2 FOR stmt; -- Output all duplicate values LOOP FETCH c2 INTO dup_value; EXIT WHEN c2%NOTFOUND; DBMS_OUTPUT.PUT_LINE('''' || dup_value || ''''); IF count_column = TRUE THEN total_found := total_found + 1; count_column := FALSE; END IF; END LOOP; total_columns := total_columns + 1; END LOOP; DBMS_OUTPUT.PUT_LINE(total_found || ' duplicate columns found'); DBMS_OUTPUT.PUT_LINE(total_columns || ' columns checked'); END; /
Sample output:
Opening a cursor... Checking ML_SUBSCRIPTION.SUBSCRIPTION_ID... Checking STATES.ID... 'Ca' 1 duplicate columns found 2 columns checked PL/SQL procedure successfully completed.
Note that this script checks each character primary key column individually, and even if duplicates are found, it may be not a problem for multi-column keys, but it is still worth checking.
If you transfer an Oracle table that contains primary keys that differ in case only, you will have errors when you try to create the primary key constraints in SQL Server, or data transfer errors if you create the primary key constraints before transferring the data.
The easiest way to solve this problem is to specify a case-sensitive collation for primary columns in SQL Server:
SQL Server:
CREATE TABLE states ( id CHAR(2) COLLATE Latin1_General_CS_AS NOT NULL, name VARCHAR(25) ); -- Insert 2 values that differ in case only INSERT INTO states VALUES ('CA', 'California'); INSERT INTO states VALUES ('Ca', 'California'); ALTER TABLE states ADD PRIMARY KEY (id); -- Command(s) completed successfully.
CS in COLLATE Latin1_General_CS_AS collation specifies case-sensitivity for column data, and now values 'CA' and 'Ca' are different, so you can successfully create the primary key constraint.
In most cases, having multiple primary key values that differ in case only is a result of a data input error, i.e. multiple different primary keys points to the same entity. In this case it often makes sense to clean the data after you have transferred them to SQL Server, but before creating primary key constraints.
To do so, firstly you have to correct the foreign keys that refer to the table, and then remove duplicate primary keys. For more information, see Trailing Spaces in Primary Keys.
Dmitry Tolpeko, dmtolpeko@sqlines.com - July 2012
Please contact me if you are interested in Oracle to SQL Server migration services, consulting, or training.