The world of the SQL Data Definition Language (DDL), which we started exploring over the past few articles, is a world of mystery, contradiction, and occasional frustration and insanity. Most database products support a robust implementation of ANSI-SQL for manipulating data, but there is far more variance in the details of how DDL is implemented. One of the primary areas for potential danger is in the SQL data types supported by each database platform.
Last column, I introduced the SQL DDL commands for creating a table in a database:
CREATE TABLE table_name (
column_name datatype [modifiers],
(column_name datatype [modifiers],
);
Note that each column is required to have a name and a data type. Different databases, however, offer a different array of choices for the data type definition that have significant effects on performance, database size, and even sorting rules. But the general categories should be familiar to anyone that has done any programming in the past:
Each database has many variations on the individual themes. For example, integer data types often come in two or more sizes to increase the storage and calculation efficiency of algorithms underlying the database functionality. One integer, for example, may only represent values up to 65,000 or so, while another handles numbers over 2 billion. The space set aside for each of the two types of integers is different, even if the values in them are identical. Furthermore, algorithms that are efficient for sorting 65,000 records may or may not be as efficient for 2 billion, necessitating either more running time or more complex code to produce the same effect. These issues are transparent to the database user because the developers of the database itself tackled these issues, but knowing the options will allow you to make better design and implementation decisions.
The "same, yet different" nature of SQL data types is of vital importance for any developer working with multiple database products, or those valiantly attempting to write SQL that lives in the application layer, independent of specific database platform choices. While by no means complete, the following table outlines some of the common names of data types between the various database platforms:
| Access | SQL-Server | Oracle | MySQL | PostgreSQL | |
| boolean | Yes/No | Bit | Byte | N/A | Boolean |
| integer | Number (integer) | Int | Number | Int Integer (synonyms) |
Integer Int |
| float | Number (single) | Float Real |
Number | Float | Numeric |
| currency | Currency | Money | N/A | N/A | Money |
| string (fixed) | N/A | Char | Char | Char | Char |
| string (variable) | Text (<256) Memo (65k+) |
Varchar | Varchar Varchar2 |
Varchar | Varchar |
| binary object | OLE Object Memo |
Binary (fixed up to 8K) Varbinary (<8K) Image (<2GB) |
Long Raw | Blob Text |
Binary Varbinary |
|
Also in aboutSQL: |
As you can see, similarities abound, but there are enough differences or alternative options that we come to another nugget of SQL wisdom:
SQL Wisdom #7) The data type is invariably different -- even if it has the same name -- in another database. Always check the documentation.
Even when the name is the same, the size and other details may be different. Hopefully, you can implement everything as stored procedures and let the DBAs earn their keep making the appropriate translations!
Now that we can create databases and database tables, we'll need to know how to change the tables we've created. In addition, we'll have to take a quick look at data types in databases, one of the primary gotchas when moving database schemas from one platform to another. Until then, feel free to contact me with comments and questions.
John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.
Read more aboutSQL columns.
Return to ONLamp.com.
Copyright © 2009 O'Reilly Media, Inc.