Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cliptext-] clipstr() to width 1: truncate chars having width > 1 #2667

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

midichef
Copy link
Contributor

@midichef midichef commented Jan 7, 2025

Let's say we have a sheet with full-width characters, like あ,
normal
clipstr() currently causes a problem: if we change a column to width 1, each cell's first character is drawn with its full width, causing columns to be misaligned.
width1-current

This PR changes clipstr() to trim the fullwidth character to '', so that the column will stay aligned:
width1-new

Note that this affects the behavior of the headers of hidden columns. They currently show the first character even when it's full-width:
hidden-current
but after this PR they will become blank:
hidden-new

I'm fine with blank headers for hidden columns. If we want to solve that, it can be fixed in drawColHeader():

clipdraw(scr, y+i, x, name, hdrcattr, w=colwidth)

An alternate design for this PR would have clipstr() replace the single chars with trunch:

        if w_s > 1:
            #add these 2 lines to replace the char with trunch:
            if trunch:
                return trunch, dispwidth(trunch)

But then we'd want to disable the use of the truncator-symbol in certain places, like the top of hidden columns. To do that is easy enough, just call the clipstr() or clipdraw() with truncator=''. But locating all those places would be more effort. So I'm submitting this PR, which is a less invasive change.

w_s = dispwidth(s)
if w_s > 1:
ret = ''
for c in s:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Do we need a for-loop here? Aren't there just two cases?

        if dispwidth(s[0]) == 1:
            return s[0], 1
        else:
            return '', 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if there are more cases, as I'm not sure which characters get a width of 0 from wcwidth(). I put the for-loop there to handle a situation where a string starts with multiple characters with wcwidth/dispwidth of 0, followed by a final character that has a dispwidth of 1.

An example of a character that might get a wcwidth() of 0 is u"\u0ccd" # Joiner, Category 'Lo', East Asian Width property 'N' -- KANNADA SIGN VIRAMA, taken from a test in jquast/wcwidth. Visidata's wcwidth() returns 1 for this character. But wcwidth() from jquast/wcwidth returns 0. Measuring a longer similar string with jquast's wcswidth() function: wcswidth("\u0CCD\u0CCDj)" gives a width of 1, though Visidata's dispwidth gives a width of 3.

So even though I don't know offhand of a case where Visidata's wcwidth() gives 0 for multiple characters in a row, I figured that it could happen.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I guess VisiData's wcwidth is wrong then. We should be able to come up with some basic strings that use a zero-width joiner like this one and watch them clobber some aspects of the VisiData interface. We may want/have to vendorize jquast/wcwidth and extend it for our purposes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants