Convert a db to UTF8

Monday, December 18, 2006

If you've ever run a UTF8 application on a pre-4.1 MySQL server, or never paid attention to encodings even on a 4.1 setup, you may have a non-UTF8 database that actually contains UTF8 data. Most applications (e.g. PHP weblogs) aren't bothered by this, but it's not clean, and text containing non-Western characters won't sort properly. This procedure will fix it:

code:
mysqldump --user=username --password=password --default-character-set=latin1 --skip-set-charset dbname > dump.sql
chgrep latin1 utf8 dump.sql
mysql --user=username --password=password --execute="DROP DATABASE dbname; CREATE DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;"
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql
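Since the procedure drops the database before re-importing it, it may be worth confirming first that the dump really is valid UTF8. Here is a minimal sketch of such a check using iconv; the two sample files and the is_utf8 helper are made up for illustration, and on a real run you would point the helper at dump.sql instead.

```shell
#!/bin/sh
# Two tiny stand-in files (hypothetical), written with octal escapes:
# good.sql holds "é" as the UTF-8 bytes C3 A9, bad.sql as the single latin1 byte E9.
printf 'INSERT INTO t VALUES (\047\303\251\047);\n' > good.sql
printf 'INSERT INTO t VALUES (\047\351\047);\n' > bad.sql

# Hypothetical helper: succeeds only if the file decodes cleanly as UTF-8.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

if is_utf8 good.sql; then good_result=valid; else good_result=invalid; fi
if is_utf8 bad.sql; then bad_result=valid; else bad_result=invalid; fi

echo "good.sql: $good_result, bad.sql: $bad_result"
```

If the real dump fails this check, the data is probably not UTF8 after all, and dropping the database would be premature.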

The chgrep part is important, because the table definitions in your dump will most likely still say "latin1". If you don't have chgrep, you can use any editor capable of search-and-replace, as long as it opens and saves UTF8 properly.

Edit: Instead of 'chgrep', you can use 'sed', e.g.:
sed -i "" 's/latin1/utf8/g' dump.sql
(The -i "" form is the BSD/Mac OS X one; with GNU sed, use -i without the empty argument.)
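If you'd rather not depend on the in-place flag, whose syntax differs between sed implementations, one option is to write the result to a new file and then check that nothing was missed. The sketch below runs on a made-up three-line table definition (sample_dump.sql is a stand-in for a real dump, not output from an actual database).

```shell
#!/bin/sh
# Small stand-in for a real dump (hypothetical table definition).
printf '%s\n' \
  'CREATE TABLE `posts` (' \
  '  `title` varchar(255) NOT NULL' \
  ') ENGINE=MyISAM DEFAULT CHARSET=latin1;' > sample_dump.sql

# Write to a new file instead of editing in place, so the same
# command works with both GNU and BSD sed.
sed 's/latin1/utf8/g' sample_dump.sql > sample_dump.utf8.sql

# Count the lines that still mention latin1; 0 means nothing was missed.
# grep exits non-zero when there are no matches, hence the "|| true".
remaining=$(grep -c 'latin1' sample_dump.utf8.sql || true)
echo "lines still mentioning latin1: $remaining"
```

Keep in mind that a blanket replace like this also rewrites the word "latin1" wherever it appears in your actual data, so a quick look at the changed lines doesn't hurt.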

Alternatively, you can add "--skip-create-options" to the mysqldump command, but that could omit some options you need (e.g. PACK_KEYS=1).

You may change the utf8_general_ci collation to whatever you need, e.g. utf8_czech_ci for my purposes.
