Python NBSP DWIM

Chris Angelico rosuav at gmail.com
Wed Jun 10 23:37:59 EDT 2015


On Thu, Jun 11, 2015 at 1:27 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote:
> [...]
>>> Why do the subtitles contain ZWNBSP in the first place? Surely they're
>>> not English subtitles?
>>
>> No, they're not :) The character comes up in the Cantonese and
>> Japanese subs for Once Upon A December.
>>
>> http://youtu.be/CEpcUeWP0bg
>> http://youtu.be/WFZAaHrHens
>>
>> Possibly some others in the series as well. It may well be a fault in
>> the subtitles, but most programs I've seen don't show U+FEFF as a big
>> fat box.
>
> I think that for backwards compatibility, applications (or fonts) are
> permitted to treat U+FEFF as a zero-width invisible character, so perhaps
> you can raise a feature request with VLC.

Yeah. Well, like I said - learn something new every day. I didn't know
it wasn't a bug. (Though it'd still be a font issue, not a VLC one.
With other fonts, it comes up looking different, in some cases
invisible. Unfortunately, the fonts that look good aren't the fonts
that have glyphs for all characters, so I need to figure out why font
substitution isn't working right. But that's a separate issue.)
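
As a stopgap on the data side, the character could be located or removed
from the subtitle text before it ever reaches the renderer. A minimal
sketch, assuming the subtitles are already decoded to str; the sample
line and both helper names are made up for illustration and are not
taken from the actual subtitle files:

```python
import unicodedata

ZWNBSP = "\ufeff"  # U+FEFF; the same code point doubles as the byte order mark


def find_zwnbsp(text):
    """Return the indexes at which U+FEFF occurs in text."""
    return [i for i, ch in enumerate(text) if ch == ZWNBSP]


def strip_zwnbsp(text):
    """Return text with every U+FEFF removed."""
    return text.replace(ZWNBSP, "")


# Hypothetical subtitle line with a stray U+FEFF in the middle.
sample = "Once\ufeff Upon a December"

print(unicodedata.name(ZWNBSP))      # official Unicode name of the character
print(unicodedata.category(ZWNBSP))  # general category (a format character)
print(find_zwnbsp(sample))
print(strip_zwnbsp(sample))
```

Stripping is only safe if the character carries no meaning in the text;
inside a line it is supposed to be invisible anyway, so dropping it
should not change how the subtitle reads.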

ChrisA