About a year ago, Luminoso blogged about how to ungarble garbled Unicode in a post called Fixing common Unicode mistakes with Python â€” after they’ve been made. Shortly after that, we released the code in a Python package called ftfy.
You have almost certainly seen the kind of problem ftfy fixes. Here’s a shoutout from a developer who found that her database was full of place names such as “BucureÅŸti, Romania” because of someone else’s bug. That’s easy enough to fix:
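To make the idea concrete, here is a minimal sketch of the underlying trick using only the standard library. It assumes the text was UTF-8 that got decoded as Windows-1252 exactly once, which is the most common way this kind of garbling happens. The helper name `undo_mojibake` is made up for this illustration and is not part of ftfy, which uses more careful heuristics behind its `fix_text` function.

```python
def undo_mojibake(text: str) -> str:
    """Reverse one round of UTF-8 bytes misread as Windows-1252.

    Hypothetical helper for illustration only; not ftfy's implementation.
    """
    try:
        # Turn the wrongly decoded characters back into the original bytes,
        # then decode those bytes the way they were meant to be read.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text does not fit this pattern, so leave it untouched.
        return text


# The garbled place name from above, written with escapes so that the
# example does not depend on how this page itself is encoded.
garbled = "Bucure\u00c5\u0178ti, Romania"
print(undo_mojibake(garbled))
```

The `try`/`except` matters: text that was never garbled usually fails the round trip and is returned unchanged, which is a crude version of the "don't make things worse" behavior you would want from a real fixer.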