The thing with LaTeX, Unicode, PDF, Umlaute and OS-X

A well-known criticism of (PDF)LaTeX in the German-speaking region is that it tends to mess up umlauts (äöü) by composing them from the base letter (aou) followed by a separate diacritic character. The causes are buried somewhere deep in the guts of LaTeX/TeX (at least for me as a dumb user). However, this can be avoided by using "T1" fonts, i.e. by selecting the T1 font encoding, for example with \usepackage[T1]{fontenc}. So it is not a general problem, and I kept promoting LaTeX to my colleagues as the way to go for scientific writing (theses, papers etc.).
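To make the "base letter followed by a diacritic" idea concrete, here is a minimal Python sketch of the same distinction in Unicode terms. It is only an illustration under the assumption that the composed and decomposed forms behave like Unicode NFC and NFD; it does not claim to reproduce exactly what LaTeX writes into a PDF. The helper name describe is made up for this example.

```python
import unicodedata


def describe(text):
    """Return the Unicode name of every code point in text (hypothetical helper)."""
    return [unicodedata.name(ch) for ch in text]


# "ä" as a single, precomposed code point (U+00E4)
precomposed = "\u00e4"

# The same letter split into base letter + combining diacritic (NFD form)
decomposed = unicodedata.normalize("NFD", precomposed)

print(describe(precomposed))  # one code point
print(describe(decomposed))   # two code points: base letter, then diacritic

# Both render as "ä", but they are different strings until normalized
print(precomposed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Normalizing pasted text to NFC, as in the last line, is one way to get the single-character umlauts back when a copy contains the decomposed form.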

Then I was taken aback when I found that copying text with diacritics (in fact, umlauts in the German title of a reference) from my thesis PDF, opened in Preview on my Mac, and pasting it into my text editor (Sublime Text) produced this:

Retrieve cleartext from binary-encoded MediaWiki database tables

For me (not being that familiar with databases and SQL) it took some effort to find the right SQL statement to retrieve the cleartext from a binary-encoded MySQL table of a MediaWiki (1.18.1) installation. The following short SQL statement did the trick; it returns the names of all users in cleartext:

SELECT CONVERT(`user_name` USING utf8) FROM `user` WHERE 1;
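If the conversion has to happen on the client side instead, the same idea can be sketched in Python: the binary column is treated as raw bytes and decoded as UTF-8. This is only a sketch under the assumption that the column holds UTF-8 encoded bytes; the sample value and the helper name to_cleartext are made up for illustration and do not come from a real database.

```python
# Hypothetical raw value, as a database client might return it
# from a binary column that stores UTF-8 encoded text.
raw_user_name = b"J\xc3\xb6rg"


def to_cleartext(value: bytes) -> str:
    """Decode raw column bytes as UTF-8 (assumed encoding) into readable text."""
    return value.decode("utf-8")


print(to_cleartext(raw_user_name))
```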
