The world's most popular open source database
This section describes issues pertaining to Unicode support that you may face when upgrading from MySQL 5.1 to 6.0. It also provides guidelines for downgrading from 6.0 back to 5.1.
In most respects, upgrading from MySQL 5.1 to 6.0 should present few problems with regard to Unicode usage, although there are some potential areas of incompatibility. Some examples:
For the variable-length character data types
(VARCHAR and the
TEXT types), the maximum
length in characters for utf8 columns is
less in MySQL 6.0 than previously.
For all character data types
(CHAR,
VARCHAR, and the
TEXT types), the maximum
number of characters for utf8 columns
that can be indexed is less in MySQL 6.0 than previously.
Consequently, if you want to upgrade tables from the old
utf8 (now utf8mb3) to the
current utf8, it may be necessary to change
some column or index definitions.
You can upgrade from MySQL 5.1 to 6.0 in two different ways:
You can install 6.0 “in place” on top of an
existing 5.1 installation. In this case, the names of
utf8 character sets and collations will
change to use utf8mb3, but no changes to
column or index lengths in table definitions will be needed.
That is, there will be name differences but no structural
changes.
You can dump your 5.1 data using
mysqldump, and then reload the dump into
6.0. Because definitions in the dump file will refer to
utf8, the server will use
utf8 in the reloaded tables, but these
tables will use the new (four-byte) utf8,
not the old 5.1 (three-byte) utf8. In
this case, the names of utf8 character
sets and collations will remain the same, but changes to
column or index lengths might be needed for long columns. In
other words, there will not be name differences, but you
might need to make some structural changes.
If you upgrade MySQL in place by installing 6.0 on top of an existing 5.1 installation, the changes in the 6.0 Unicode implementation have these effects:
For ucs2, there should be no issues
Database and tables that used utf8 in
MySQL 5.1 will be reported as using
utf8mb3 in 6.0 (for example, if you
examine object structure with SHOW
statements or select definitions from
INFORMATION_SCHEMA tables). However, no
data will have been changed and the server can use
utf8mb3 columns and indexes with no
conversion required.
When the server starts, it checks the structure of certain
system tables in the mysql database. If
it finds that columns have the utf8mb3
character set when it expects utf8, it
writes warnings to the error log but continues to use the
tables. To make the warnings go away, convert the system
tables to utf8 by running
mysql_upgrade.
If you use events, a known issue is that if you upgrade from MySQL 5.1 to 6.0.4 though 6.0.6, the event scheduler will not work, even after you run mysql_upgrade. (This is an issue only for an upgrade, not for a new installation of MySQL 6.0.x.) As of MySQL 6.0.7, mysql_upgrade handles upgrading the system tables properly. If you upgrade to 6.0.4 through 6.0.6, you can work around this upgrading problem by using these instructions:
In MySQL 5.1, before upgrading, create a dump file
containing your mysql.event table:
shell> mysqldump -uroot -p mysql event > event.sql
Stop the server, upgrade to MySQL 6.0, and start the server.
Recreate the mysql.event table using the
dump file:
shell> mysql -uroot -p mysql < event.sql
Run mysql_upgrade to upgrade the other
system tables in the mysql database:
shell> mysql_upgrade -uroot -p
Restart the server. The event scheduler should run normally.
To following example illustrates what you should see if you use MySQL 5.1 tables with an installation that has been upgraded in place to 6.0. Suppose that a table was originally defined as follows in MySQL 5.1:
CREATE TABLE t1 ( col1 CHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, col2 CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL ) CHARACTER SET utf8;
In MySQL 5.1, SHOW CREATE TABLE
produces this result:
mysql> SHOW CREATE TABLE t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`col1` char(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`col2` char(10) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8
After upgrading to MySQL 6.0, SHOW CREATE
TABLE produces this result instead:
mysql> SHOW CREATE TABLE t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`col1` char(10) CHARACTER SET utf8mb3 COLLATE utf8mb3_unicode_ci NOT NULL,
`col2` char(10) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb3
Internally, the IDs for the character data have not changed. Only the names associated with the IDs have changed. For example, in MySQL 5.1, we have this:
mysql> SHOW COLLATION LIKE 'utf8%bin';
+-----------+---------+----+---------+----------+---------+
| Collation | Charset | Id | Default | Compiled | Sortlen |
+-----------+---------+----+---------+----------+---------+
| utf8_bin | utf8 | 83 | | Yes | 1 |
+-----------+---------+----+---------+----------+---------+
In MySQL 6.0, we have this:
mysql> SHOW COLLATION LIKE 'utf8%bin';
+-------------+---------+----+---------+----------+---------+
| Collation | Charset | Id | Default | Compiled | Sortlen |
+-------------+---------+----+---------+----------+---------+
| utf8mb3_bin | utf8mb3 | 83 | | Yes | 1 |
| utf8_bin | utf8 | 46 | | Yes | 1 |
+-------------+---------+----+---------+----------+---------+
Thus, for table t1, the columnn
col2 still has a collation ID of 83. What has
changed is the name associated with 83. In other words, the
collation IDs for the old utf8 character set
now belong to utf8mb3. The collations IDs for
the new utf8 character set are new.
Tables can be converted from utf8mb3 to
utf8 by using ALTER
TABLE. The following statement does so for
t1:
ALTER TABLE t1 DEFAULT CHARACTER SET utf8, MODIFY col1 CHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, MODIFY col2 CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;
In terms of table content, conversion from the old
utf8 to the new utf8
presents no problems:
For a BMP character, the 5.1 and 6.0 versions of
utf8 have identical storage
characteristics: same code values, same encoding, same
length.
For a supplementary character, utf8 in
5.1 cannot store the character at all, while
utf8 in 6.0 requires four bytes to store.
Since utf8 in 5.1 cannot store the
character at all, you do not have any supplementary
characters in utf8 columns in your 5.1
database, so you need not worry about converting characters
or losing data.
In terms of table structure, the catch when converting from the
old utf8 to the new utf8
is that the maximum length of a column or index key is unchanged
in terms of bytes. Therefore, it is smaller in terms of
characters, because the maximum length of a character is four
bytes instead of three. The things to watch for when converting
your MySQL 5.1 tables are:
Look at the definitions of utf8 columns,
and make sure they will not exceed the maximum length for
the storage engine.
Look at all indexes on utf8 columns, and
make sure they will not exceed the maximum length for the
storage engine. Sometimes the maximum can change due to
storage engine enhancements.
Check those things for the CHAR,
VARCHAR, and
TEXT data types.
If the preceding conditions apply for you, you will have to
reduce the defined length of columns or indexes, or you will
have to use utf8mb3 rather than
utf8.
Here are some examples where structural changes may be needed:
A TINYTEXT column can hold up
to 255 bytes, so it can hold up to 85 three-byte or 63
four-byte characters. Suppose that you have a
TINYTEXT column that uses
utf8mb3 but must be able to contain more
than 63 characters. You cannot convert it to
utf8 unless you also change the data type
to a longer type such as
TEXT.
Similarly, a very long
VARCHAR column may need to be
changed to one of the longer
TEXT types if you want to
convert it from utf8mb3 to
utf8.
InnoDB has a maximum index length of 767
bytes, so for utf8mb3 or
utf8 columns, you can index a maximum of
255 or 191 characters, respectively. If you currently have
utf8mb3 columns with indexes longer than
191 characters, you will need to index a smaller number of
characters. In an InnoDB table, this
column and index definition is legal:
col1 VARCHAR(500) CHARACTER SET utf8mb3, INDEX (col1(255))
To use utf8 instead, the index must be
smaller:
col1 VARCHAR(500) CHARACTER SET utf8, INDEX (col1(191))
The preceding types of changes are most likely to be required
only if you have very long columns or indexes. Otherwise, you
should be able to convert your tables from
utf8mb3 to utf8 without
problems. You can do this either by dumping them with
mysqldump in MySQL 5.1 and reloading the dump
into 6.0, or by using ALTER TABLE
after upgrading in place from 5.1 to 6.0.
The following items summarize other potential areas of incompatibility:
Performance of four-byte UTF-8 (utf8) is
slower than for three-byte UTF-8
(utf8mb3). There are tradeoffs: If you
want the same character set and collation names, you can
convert from utf8mb3 to
utf8. But that will introduce a small
performance penalty. If you do not want to incur this
penalty, continue to use utf8mb3.
SET NAMES 'utf8' now causes use of
four-byte character set. As long as no four-byte character
are sent in either direction, there should be no problems.
Otherwise, applications that expect to receive a maximum of
three bytes per character may have problems. Conversely,
applications that expect to send four-byte characters must
ensure that the server understands them.
Applications cannot send utf16 or
utf32 character data to a pre-6.0 server.
Applications that use the old collation IDs to detect
utf8 collations need to be updated unless
it is okay that these IDs signify utf8mb3
collations in MySQL 6.0.
Applications that test for the string
'utf8' in character set or collation
names and expect them to refer to the three-byte
utf8 may need to be adjusted to recognize
'utf8mb3' instead.
Look at all application programs that process
utf8 data, and make sure that buffer
lengths are not calculated with an algorithm such as
“number of characters times 3.”
For replication, if four-byte characters are going to be used, all servers involved must understand them.
Check for other uses of utf8, such as
_utf8 introducers,
utf8 stored routine variables, or
utf8 function return types.
Similar restrictions apply if you attempt to replicate from a
MySQL 6.0 master to a MySQL 5.1 slave.
utf8mb3 data will be seen as
utf8 by the slave as long as it is
represented in binary log events by ID, rather than by name (as
in statements such as CREATE
TABLE). But you cannot send utf16
or utf32 data, or data for the new
utf8.
If you have upgraded to MySQL 6.0 and then decide to downgrade back to 5.1, these considerations apply:
ucs2 data should present no problems.
For an in-place downgrade, utf8mb3 data
will be seen as utf8 in 5.1 and should
present no problems.
Any definitions that refer to the
utf8mb3, utf16, or
utf32 character set names will not be
recognized by 5.1.
For objects with definitions that refer to the
utf8 character set name, you can dump
them with mysqldump in 6.0 and then
reload the dump in 5.1, as long as there are no four-byte
characters in the data. The 5.1 server will see
utf8 in the dump file object definitions
and create new objects that use the 5.1 (three-byte)
utf8 character set.
For tables in the mysql database,
mysql_upgrade in 6.0 converts
utf8mb3 columns to
utf8. Thus, if you run
mysql_upgrade after upgrading to 6.0,
these tables will not be legal in 5.1. You will need to
convert utf8 columns to
utf8mb3 with ALTER
TABLE before downgrading, or you can do this:
In MySQL 6.0, before downgrading, create a dump file
containing the mysql database tables:
shell> mysqldump -uroot -p mysql > mysql.sql
Check the dump file to make sure that there are no instances of “utf8mb3”. If there are, change them to “utf8”.
Stop the server, downgrade to MySQL 5.1, and start the
server with the --skip-grant-tables
option so that it does not try to use the
mysql database tables.
Recreate the dump file to recreate the
mysql database tables:
shell> mysql mysql < mysql.sql
Restart the server. It should use the
mysql database tables normally.


User Comments
Add your own comment.