
Search with wildcard gives partial results, and not accurate #2030

Open
science2002 opened this issue Dec 18, 2024 · 5 comments

Comments

@science2002

science2002 commented Dec 18, 2024

I am not sure whether this is a real bug or just a design limit (it also occurs in GoldenDict 1.5), but for my use it is certainly annoying. When I run a normal search using a wildcard, I get two results that I consider anomalies.

First, the search result box shows only a limited number of words (40), even though there are actually many more (hundreds of) results. If this is not a bug but a design limit, can it be changed?

Second, when I search Italian dictionaries (where some words carry an accent) for the term *ità, the limited results (see above) include many words that contain "ita" rather than "ità", i.e. the "a" with and without an accent are treated as equal. Moreover, "ita" or "ità" is not always the final string of letters; it can also appear in the middle of the word, as in "digitale" (digital).

Behaviour reproduction
To reproduce the first issue and, in part, the second (I do not know any accented English words; for the accent problem, use the Italian example above), I had LDOCE5 (the Longman dictionary) active, but the issue occurs with any dictionary.

  1. In the search box type, for instance: *me
  2. See the results: in my case they are only words that start with "a". The expected word "some", for instance, is missing.
  3. Moreover, the results include the word "abdomen", which I would not expect when searching for just "*me".

Expected results
I would instead expect:

  1. to find the word "some",
  2. and not to see the word "abdomen", which, in my view, would require a search like "*me*" (or "*me?") and not just "*me".
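
The expectation described above corresponds to a fully anchored glob match, where the pattern must cover the whole headword. The following is a minimal sketch of that semantics only; `globMatch` is a hypothetical helper written for illustration, not GoldenDict-ng's matcher, and it compares raw bytes, so `?` would not line up with a multi-byte UTF-8 character.

```cpp
#include <string>

// Hypothetical helper (not part of GoldenDict-ng).
// Returns true only if `pattern` matches the WHOLE of `text`.
// '*' matches any run of characters (including none), '?' matches exactly one.
bool globMatch(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0;
    std::size_t starPos = std::string::npos; // index of the last '*' seen in pattern
    std::size_t starText = 0;                // text index that '*' currently extends to

    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            // Remember the star; first try letting it match nothing.
            starPos = p++;
            starText = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (starPos != std::string::npos) {
            // Mismatch: let the last '*' swallow one more character and retry.
            p = starPos + 1;
            t = ++starText;
        } else {
            return false;
        }
    }
    // Only trailing '*' may remain unmatched in the pattern.
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}
```

Under this reading, `globMatch("*me", "some")` is true and `globMatch("*me", "abdomen")` is false, while `"*me*"` and `"*me?"` both accept "abdomen".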

OS and software versions
Win 10 professional
Portable version of:
Goldendict-ng 24.09.1.ca9dd133 at 2024-11-04T21:40:19Z
Qt 6.7.2 Visual C++ Compiler 194134123 windows winnt 10.0.19044 x86_64-little_endian-llp64
Flags: MAKE_ZIM_SUPPORT MAKE_CHINESE_CONVERSION_SUPPORT NO_TTS_SUPPORT no_ffmpeg_player

@xiaoyifang
Owner

xiaoyifang commented Dec 19, 2024

First, the search result box shows only a limited number of words (40), even though there would be in reality much more (hundreds) results. If it is not a bug, but a designed limit, can it be changed?

This is by design: too many results are of no help to the user. Users need to refine their input to narrow down the results.

Second, when I search for instance in Italian dictionaries (where some words have accent), the term: *ità, among the limited results (see above) there are many words that contain "ita" not "ità"

This is a standard operation in search engines, called normalization.
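
As a rough illustration of what such normalization does, the sketch below folds a handful of accented Italian vowels to their base letters before comparison. The table and the names `foldChar` and `fold` are assumptions made for this example; GoldenDict-ng's actual folding rules are not shown in this thread and are certainly broader.

```cpp
#include <string>
#include <unordered_map>

// Toy folding table (illustrative only): a few accented Italian vowels.
char32_t foldChar(char32_t c) {
    static const std::unordered_map<char32_t, char32_t> table = {
        {U'\u00E0', U'a'}, // a with grave
        {U'\u00E8', U'e'}, // e with grave
        {U'\u00E9', U'e'}, // e with acute
        {U'\u00EC', U'i'}, // i with grave
        {U'\u00F2', U'o'}, // o with grave
        {U'\u00F9', U'u'}, // u with grave
    };
    auto it = table.find(c);
    return it == table.end() ? c : it->second;
}

// Folds every code point of `s`; characters not in the table pass through.
std::u32string fold(const std::u32string& s) {
    std::u32string out;
    out.reserve(s.size());
    for (char32_t c : s) out.push_back(foldChar(c));
    return out;
}
```

If both the query and the headwords are folded this way, "ità" and "ita" become the same key, which is why a search for one also returns the other.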

but it could be a string in the middle of the word, like "digitale" (digital)

Wildcard search is not a recommended way to search for headwords. Requests concerning wildcard search will be put on hold.

@science2002
Author

science2002 commented Dec 19, 2024

Thanks @xiaoyifang for your feedback.

  1. About the results: I was trying to use GD-ng as a sort of rhyming dictionary, that is, to find words that share some string of letters. I see that GD is unfortunately not such a tool. I had hoped it could do this precisely by means of wildcards. Is there any chance for the user to change the 40-word limit on the results? Or do you know of any alternative software, or GD version, that can do that?

  2. Normalization: I can accept that, though I would have preferred an option to look up exactly the word as typed.

  3. I cannot understand your last explanation ("wildcards-search is not a recommend way to search headword. etc"). Why should wildcard search not be recommended for headwords, and why should it be suspended? In my opinion it is, when needed, a very useful feature.

@xiaoyifang
Owner

3. Your last explanation ("wildcards-search is not a recommend way to search headword. etc") I cannot understand. Why wildcards-search should not be raccomended for headwords and .... should be suspended? In my opinion it is a very useful feature.

The whole search logic may be rewritten in the future.

From your description, I think you need two kinds of search: prefix search and suffix search.
Prefix search is already supported: just enter the prefix (without any wildcard) and it should give you the results.
Suffix search, as far as I remember, is not supported right now.
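
One common way to get suffix search out of machinery that already does prefix search is to index the headwords reversed. The sketch below is a generic illustration of that idea, not a description of how GoldenDict-ng works or plans to work; the class `SuffixIndex` and its method `endingWith` are hypothetical names, and matching is done on raw bytes with no normalization.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical index: stores every headword reversed and sorted, so that a
// suffix query becomes a prefix query on the reversed strings.
class SuffixIndex {
public:
    explicit SuffixIndex(const std::vector<std::string>& headwords) {
        for (const auto& w : headwords)
            reversed_.emplace_back(w.rbegin(), w.rend());
        std::sort(reversed_.begin(), reversed_.end());
    }

    // Returns all headwords that end with `suffix`.
    std::vector<std::string> endingWith(const std::string& suffix) const {
        const std::string key(suffix.rbegin(), suffix.rend());
        std::vector<std::string> out;
        // Entries that start with `key` form one contiguous sorted range.
        auto it = std::lower_bound(reversed_.begin(), reversed_.end(), key);
        for (; it != reversed_.end() && it->compare(0, key.size(), key) == 0; ++it)
            out.emplace_back(it->rbegin(), it->rend()); // undo the reversal
        return out;
    }

private:
    std::vector<std::string> reversed_;
};
```

With the headwords "some", "abdomen", "home", "menu" and "me", a call to `endingWith("me")` returns "me", "home" and "some", and leaves out "abdomen" and "menu".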

@science2002
Author

science2002 commented Dec 19, 2024

@xiaoyifang: Thanks for clarifying the third point.
Just a final question, which I added to the first point (by editing the previous post, but just too late): is there a way in GD's config files to remove the 40-word limit on the search results, or are there alternatives to GD for this (you certainly know the possible alternatives better than I do)?
Thanks again

@xiaoyifang
Owner

unsigned long maxResults = 40,

You need to change the code.
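
To show what a cap like this does in practice, here is a minimal sketch of a scan that stops once `maxResults` matches have been collected. Only the parameter name and its default of 40 come from the line quoted above; the function `cappedSearch`, its signature, and the assumption that headwords are visited in sorted order are hypothetical and do not reproduce GoldenDict-ng's code.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical capped scan (not GoldenDict-ng's implementation).
// Visits headwords in the order given and stops after `maxResults` matches.
std::vector<std::string> cappedSearch(
    const std::vector<std::string>& sortedHeadwords,
    const std::function<bool(const std::string&)>& matches,
    unsigned long maxResults = 40) {
    std::vector<std::string> out;
    for (const auto& w : sortedHeadwords) {
        if (out.size() >= maxResults) break; // cap reached: later matches are never seen
        if (matches(w)) out.push_back(w);
    }
    return out;
}
```

If the real search also collects matches in alphabetical order, a cap of this kind would be consistent with the report that all 40 results start with "a" and "some" never appears, but that is an assumption rather than something confirmed in this thread.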
