
MySQL: dump the encoding hell

I ran into this issue with MySQL 5.1.x, but not with 5.5 (I have not tested other versions): dumping a UTF-8 encoded database (collation: utf8_general_ci) resulted in non-ASCII characters (like "é", "à" or "ç") being garbled.

I usually back up databases with this mysqldump command:

$ mysqldump -h hostname -u username -p databaseName > backup.sql

I never had to worry about encoding since I use UTF-8 everywhere. But when I tried to import a dump made on 5.1 into 5.5, there were weird symbols like àƒ© àƒ§ all over the place! A workaround is to dump the database using the latin1 charset:

$ mysqldump -h hostname -u username -p --default-character-set=latin1 --databases databaseName -r backup.sql

The -r option makes mysqldump write its output directly to backup.sql instead of going through the shell's redirection, which avoids the risk of further encoding interference from the underlying system.
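Why dumping as latin1 helps is not explained here. One plausible cause, which is an assumption and not something verified for this setup, is that the text was stored double-encoded: UTF-8 bytes sent over a connection declared as latin1 get converted to UTF-8 a second time. The sketch below reproduces that effect on a single "é" with iconv. The function names are made up for the illustration, and ISO-8859-1 stands in for MySQL's latin1.

```shell
# Hypothetical helper: misread incoming UTF-8 bytes as latin1 and
# re-encode them as UTF-8, which is the suspected double encoding.
double_encode() {
    iconv -f ISO-8859-1 -t UTF-8
}

# Hypothetical helper: the reverse conversion, analogous to what
# dumping with --default-character-set=latin1 appears to do.
undo_double_encode() {
    iconv -f UTF-8 -t ISO-8859-1
}

# "é" is the two bytes 0xC3 0xA9 in UTF-8 (octal 303 251).
printf '\303\251\n' | double_encode                        # garbled form
printf '\303\251\n' | double_encode | undo_double_encode   # original bytes
```

In a UTF-8 terminal the first command prints Ã© and the second prints é again. If this is what happened to the data, it would also explain why the dump file can then be relabelled as utf8 before importing.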

Before importing the data, edit the dump file to replace the line

/*!40101 SET NAMES latin1 */;

with

/*!40101 SET NAMES utf8 */;
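For a large dump, opening the file in an editor may be impractical. The replacement can be sketched with sed instead. The helper name fix_dump_charset is hypothetical, and the pattern assumes the directive sits alone on its line exactly as shown above, so matching text inside the data is left untouched.

```shell
# Hypothetical helper: rewrite only the SET NAMES directive of a dump file.
# The pattern is anchored to the whole line, and sed keeps a copy of the
# untouched file as <file>.bak.
fix_dump_charset() {
    sed -i.bak 's|^/\*!40101 SET NAMES latin1 \*/;$|/*!40101 SET NAMES utf8 */;|' "$1"
}

# Usage:
#   fix_dump_charset backup.sql
```

The -i.bak form is used because it is accepted by both GNU and BSD sed. Delete backup.sql.bak once the import has been checked.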

Finally, import the dump file:

$ mysql -u username -p --default-character-set=utf8 databaseName < backup.sql
