String Sorting Order - Informix to PostgreSQL Migration

When you migrate an Informix database to PostgeSQL you may notice differences in the results of queries that use sorting (ORDER BY clause).

Let's consider an example.

Informix: Default codepage 819 (Latin 1 ISO-8859-1)

  CREATE TABLE words (name VARCHAR(30));
 
  -- Sample data
  INSERT INTO words VALUES ('Apple');      
  INSERT INTO words VALUES ('air'); 
  INSERT INTO words VALUES ('Box');  
 
  -- Now let's run a query ordering rows
  SELECT name FROM words ORDER BY name;

Result:

name
Apple
Box
air

This is probably not what you expected as word air goes after Box. Let's add some more data and start the query again:

   INSERT INTO words VALUES (' Cloud'); 
   INSERT INTO words VALUES ('9'); 
   INSERT INTO words VALUES ('a9ir');  
 
   SELECT name FROM words ORDER BY name;

Result:

name
Cloud
9
Apple
Box
a9ir
air

You can see that the data ordered according to ASCII code set where an upper case letter's code is less than the code of a lower case letter, and blanks and digits have code lower than upper case letters.

String Sorting Order in PostgreSQL

By default, PostgreSQL uses English_United States.1252 collation. Let's see how our data will be ordered:

PostgreSQL: Collation - English_United States.1252

   -- Run the same query on the same sample data    
   SELECT name FROM words ORDER BY name;

Result:

name
Cloud
9
a9ir
air
Apple
Box

You can see that the results are completely different: first go blanks, then digits, then lower case letters followed by upper case letters.

How to Get Informix Order in PostgreSQL

Although it looks like PostgreSQL returns data in more correct order (air goes before Box in ascending order), sometime you may need to get exactly the same result as in Informix.

Before PostgreSQL 9.1 you can create a database with collate C:

PostgreSQL 9.0.4: Collation - English_United States.1252

   -- Run the same query on the same sample data    
   SELECT name FROM words ORDER BY name;

Result (the same as Informix):

name
Cloud
9
Apple
Box
a9ir
air

Starting from PostgreSQL 9.1 you can use collate ā€œCā€ clause in ORDER BY, so you can use the database with any collation and still be able to get the same result as in Informix:

PostgreSQL 9.2.4: Collation - C

   -- Run the query and specify collation in ORDER BY clause
   SELECT name FROM words ORDER BY name COLLATE "C";

Result (the same as Informix):

name
Cloud
9
Apple
Box
a9ir
air

Database and SQL Migration Tools

About SQLines

SQLines offers services and tools to help you migrate databases and applications. For more information, please contact us at support@sqlines.com.

Written by Dmitry Tolpeko, dmtolpeko@sqlines.com - August 2013.

You could leave a comment if you were logged in.