Determine whether a Unicode character is fullwidth or halfwidth in Perl

Time:01-26

How can I determine in Perl whether a Unicode character is full-width (double width, taking two terminal cells) or half-width (like ordinary Latin characters)?

E.g. emoji are double width, but there are also double-width characters in lower blocks, such as "\N{MEDIUM BLACK CIRCLE}" (U+26AB).

I tried

Unicode::GCString->new("\N{LARGE RED CIRCLE}")->columns()

but it also returns 1.

CodePudding user response:

I have some C code lying around to calculate character widths. So, a quick conversion to Perl later, and...

#!/usr/bin/env perl
use warnings;
use strict;
use feature qw/state/;
use open qw/:std :locale/;
use charnames qw/:full/;
use Unicode::UCD qw/charinfo charprop/;

# Return the number of fixed-width columns taken up by a unicode codepoint
# Inspired by https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
# First adapted to use C++/ICU functions and then to Perl
sub charwidth ($) {
  state            
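
The listing above breaks off at `state` in this copy, so the rest of `charwidth` is not available here. As a rough stand-in, here is a minimal sketch of the general idea (not the answer's actual code): classify a character by its Unicode East_Asian_Width property using Perl's built-in `\p{Ea=...}` regex properties. The name `ea_width` is made up for this example, and the rules are deliberately simplified: nonspacing and enclosing marks count as 0 columns, Wide and Fullwidth characters as 2, everything else (including control and Ambiguous characters) as 1.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper, NOT the charwidth() from the answer above.
# Returns an estimated column count for a single character, using only
# the Unicode property tables that ship with the running perl.
sub ea_width {
    my ($char) = @_;

    # Nonspacing and enclosing combining marks take no cell of their own.
    return 0 if $char =~ /\A[\p{Mn}\p{Me}]\z/;

    # East_Asian_Width = Wide (W) or Fullwidth (F): two cells.
    return 2 if $char =~ /\A[\p{Ea=W}\p{Ea=F}]\z/;

    # Simplification: everything else, including control characters and
    # East Asian Ambiguous characters, is treated as one cell.
    return 1;
}

# Print the estimate for a few code points, including the two from the question.
for my $cp (0x41, 0x4E2D, 0xFF21, 0x0301, 0x26AB, 0x1F534) {
    printf "U+%04X => %d\n", $cp, ea_width(chr $cp);
}
```

The result for the two circles depends on the Unicode version bundled with your perl: emoji such as U+26AB and U+1F534 were only reclassified as Wide in Unicode 9.0, so an older perl (or a module carrying older tables) would be expected to report 1 for them.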