I tried to use Unicode stars in posts to show how much I liked some books, and I was embarrassed to see them show up as ????? question marks. Yuck. Why is that happening?
I’ve been blogging for a long time and suspected the culprit was the age of this WordPress install. I’ve diligently upgraded the software, but nobody likes to migrate a DB schema. So I dug through the DB and sure enough, my wp_posts table uses the latin1 character set with a case-insensitive (“ci”) collation. For non-nerds, that means text comparisons ignore case and the table is only set up to store “Latin” characters: basic ABC123, punctuation, and Western European accented letters, but none of the fun Unicode symbols or characters from non-Latin scripts.
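To make the symptom concrete, here is a small Python sketch of what a lossy save into a latin1 column does to text. It is only an analogy: the function name store_as_latin1 is made up for illustration, and Python’s "latin-1" codec stands in for the database’s latin1 character set, which is close to it but not identical.

```python
# Five black stars (U+2605), the kind of rating that turned into question marks.
rating = "\u2605" * 5


def store_as_latin1(text: str) -> str:
    """Mimic a lossy save into a latin1 column (hypothetical helper).

    Any character the encoding cannot represent is replaced with '?'.
    """
    return text.encode("latin-1", errors="replace").decode("latin-1")


print(store_as_latin1(rating))         # ?????
print(store_as_latin1("café ABC123"))  # café ABC123
```

The stars have no slot in the encoding, so each one collapses to a question mark, while plain letters, digits, and an accented é survive untouched.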
After some digging, it looks like this isn’t a push-button process, so it’s shaping up to be a weekend scripting adventure.
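As a rough idea of where such a script might start, the sketch below only prints the SQL a conversion could run, one ALTER TABLE ... CONVERT TO CHARACTER SET statement per table. The table names assume the default wp_ prefix, the helper conversion_statements is hypothetical, and the target collation utf8mb4_unicode_520_ci is an assumption that needs a reasonably recent MySQL. It is not a tested migration; anything like this should run against a backup first.

```python
# Assumed targets; adjust to whatever your MySQL version supports.
TARGET_CHARSET = "utf8mb4"
TARGET_COLLATION = "utf8mb4_unicode_520_ci"


def conversion_statements(tables):
    """Build one ALTER TABLE statement per table name (hypothetical helper)."""
    return [
        f"ALTER TABLE `{name}` CONVERT TO CHARACTER SET {TARGET_CHARSET} "
        f"COLLATE {TARGET_COLLATION};"
        for name in tables
    ]


# Assumed table names with the default wp_ prefix; check yours with SHOW TABLES.
for statement in conversion_statements(["wp_posts", "wp_comments"]):
    print(statement)
```

Printing the statements rather than executing them keeps the sketch harmless: the output can be reviewed by hand and tried on a copy of the database before touching the live one.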
I’d wondered about this before, because I’ve had this error message in my Apache logs for several years:
PHP message: WordPress database error Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8mb4_unicode_520_ci,COERCIBLE) for operation 'like' for query
followed by some horrible hex-encoded query.
I knew the fix would be hard, so I haven’t bothered yet, especially since that particular message looks like somebody attempting some sort of exploit.