Search functionality should have more visibility into datasets #57

Open
rcheetham opened this issue May 11, 2023 · 11 comments
Labels: enhancement (New feature or request)

@rcheetham
Collaborator

This is a report from @Kistine about trouble finding datasets:

As I went to search for a dataset I know is listed under a data entry, no results showed. Is there any way to improve the search feature so that it will display dataset entries based on details within the entry, not just the title of the entry itself?

LNI has been updating a dataset (currently called 'subcontractors') and it's to be replaced with the updated dataset called 'Permit contractors.' When I search on the ODP page for 'subcontractors,' it displays 0 results. In this case, I know that the subcontractors dataset is filed under the LNI permits dataset. This will significantly affect end-users being able to find relevant data. On the editing end, I usually search before I add a dataset just to make sure it doesn't already exist somewhere to avoid duplicative entries.

@BryanQuigley
Member

IIRC it's based on the title and the description.

I don't disagree this might be useful generally, but in this specific case having some text describing the Subcontractors resources in that description would be helpful as well (and should make it searchable).
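
A minimal sketch of why the 'subcontractors' search returns nothing, assuming matching only looks at a dataset's title and description as described above. The record layout (`title`, `notes`, `resources`, `name`, `format`) and the sample values are hypothetical stand-ins, not the site's actual data or JKAN's search code.

```python
# Hypothetical dataset record; field names and values are assumptions for illustration.
DATASETS = [
    {
        "title": "LNI Permits",
        "notes": "Example description that does not mention the resource.",
        "resources": [{"name": "Subcontractors", "format": "csv"}],
    },
]


def searchable_text(dataset, include_resources=False):
    """Join the fields a query is matched against into one lowercase string."""
    parts = [dataset["title"], dataset["notes"]]
    if include_resources:
        # Also index each resource's name and format.
        for resource in dataset["resources"]:
            parts.append(resource["name"])
            parts.append(resource["format"])
    return " ".join(parts).lower()


def search(query, include_resources=False):
    """Return titles of datasets whose searchable text contains the query."""
    needle = query.lower()
    return [
        d["title"]
        for d in DATASETS
        if needle in searchable_text(d, include_resources)
    ]


print(search("subcontractors"))                          # [] (title + description only)
print(search("subcontractors", include_resources=True))  # ['LNI Permits']
```

Either route closes the gap: mention the resource in the description, or extend the indexed text to cover resource names.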

@Kistine
Collaborator

Kistine commented May 11, 2023 via email

@Kistine
Collaborator

Kistine commented May 11, 2023 via email

@BryanQuigley
Member

Improving the search is definitely on the TODO, but I have no idea when we can get to it. The goal is actually to completely switch to a new powerful system.

It really doesn't matter if they are created in your fork or in this repo. In fact, for anyone else we definitely want them doing the fork/PR/review setup, not needing write access here.

That has me thinking: when a single dataset is edited by you, @rcheetham, or @lydiascarf, should we drop the second-reviewer requirement (or otherwise lower our requirements to just CI passing)? Thoughts? (I can mock something up.)

@lydiascarf self-assigned this on May 16, 2023
@lydiascarf
Collaborator

Regarding permissions:

  • @Kistine I've invited you to collaborate via your work email, which should fix your permissions issue, but please let me know if it doesn't!
  • Also, @Alexander-M-Waldman and @jrmidkiff, your invites expired without being accepted. Would you like me to send them to different email addresses?
  • @BryanQuigley I can also handle setting up laxer reviewer requirements for trusted people. I'm already working on a CODEOWNERS file so that someone from the city gets requested if a PR touches a city dataset. I think what you're describing can be covered by that file as well but I'll dig in more today.

Regarding search:

  • @BryanQuigley I've got some availability to look into overhauling search. Was there a particular solution/plugin you were looking into? Let's focus this ticket on that search overhaul going forward (changing the title now)

@lydiascarf changed the title from "Extend Search to include Dataset Descriptions and the Title of the Resource records" to "Search functionality should have more visibility into datasets" on May 16, 2023
@BryanQuigley
Member

Regarding permissions: that sounds good to me, @lydiascarf. Thanks!

As for search, here is the JKAN issue: timwis/jkan#225

There aren't specific requirements but I've also been looking at projects like https://stork-search.net/ in addition to lunr.js. I think the first thing is to come up with a list of possible goals:

  • Handle more data - this ticket (search by filetype as well?)
  • Synonyms
  • Highlight found text
  • Be able to search from the home page <- I think this is actually the most important on the list.

Ideally with a similar flow to the current site - I really like how fast it is and how it doesn't break the view. But many of these other search engines might not work like that - they might be separate pages or pop-outs. Is that worth the trade-off?
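
Two of the goals above, synonyms and highlighting found text, can be sketched independently of whichever search engine is chosen. This is a rough illustration only: the synonym table, the function names, and the `<mark>` wrapper are assumptions, with the single synonym pair taken from the 'subcontractors' / 'Permit contractors' rename mentioned earlier in the thread.

```python
import re

# Hypothetical synonym table; a real one would be curated by the maintainers.
SYNONYMS = {
    "subcontractors": ["contractors"],
}


def expand_query(query):
    """Return the query term plus any configured synonyms, lowercased."""
    term = query.lower()
    return [term] + SYNONYMS.get(term, [])


def highlight(text, terms, open_tag="<mark>", close_tag="</mark>"):
    """Wrap each case-insensitive occurrence of any term in highlight tags."""
    # Drop empty terms, then try longer terms first so they win over substrings.
    ordered = sorted((t for t in terms if t), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


terms = expand_query("Subcontractors")
print(terms)  # ['subcontractors', 'contractors']
print(highlight("Permit contractors and subcontractors", terms))
# Permit <mark>contractors</mark> and <mark>subcontractors</mark>
```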

@jrmidkiff
Collaborator

jrmidkiff commented May 18, 2023 via email

@rcheetham added the enhancement (New feature or request) label on Jun 1, 2023
@lydiascarf
Collaborator

lydiascarf commented Dec 3, 2024

@BryanQuigley this has basically been stalled because off-the-shelf, static solutions are by and large not well maintained. what if i set up an SQLite db of searchable metadata and re-implemented search on top of FTS5?
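
For a sense of what search on top of FTS5 could look like, here is a minimal sketch using Python's standard `sqlite3` module. It assumes the linked SQLite library was compiled with FTS5 (common, but not guaranteed); the table name, column names, and sample rows are hypothetical, and nothing here reflects an actual schema for this site.

```python
import sqlite3

# Hypothetical metadata rows: (title, description, resource names).
ROWS = [
    ("LNI Permits", "Example permits dataset.", "Subcontractors"),
    ("Street Trees", "Example tree inventory.", "Tree points"),
]

conn = sqlite3.connect(":memory:")
# FTS5 virtual table; every column is full-text indexed.
conn.execute("CREATE VIRTUAL TABLE datasets USING fts5(title, notes, resources)")
conn.executemany("INSERT INTO datasets VALUES (?, ?, ?)", ROWS)


def search(query):
    """Return (title, highlighted resources) for matching rows, best match first."""
    # Quote the user input so FTS5 treats it as a phrase, not query syntax.
    phrase = '"' + query.replace('"', '""') + '"'
    sql = (
        "SELECT title, highlight(datasets, 2, '<mark>', '</mark>') "
        "FROM datasets WHERE datasets MATCH ? ORDER BY rank"
    )
    return conn.execute(sql, (phrase,)).fetchall()


print(search("subcontractors"))
# [('LNI Permits', '<mark>Subcontractors</mark>')]
```

Because resource names sit in their own indexed column, a query for 'subcontractors' finds the parent dataset even when the title and description never mention it, and the built-in `highlight()` function covers the "highlight found text" goal.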

@BryanQuigley
Member

Where would it run?

Part of me wonders if we should just leave the existing search as is, but add an external search provider as an option to the main page. Google seems like it has the best results from my quick test, but I'm curious about DuckDuckGo/Stract/others too.
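
One way an external-provider option could work is a link that hands the query to the provider with a `site:` restriction. The sketch below is an assumption about that approach, not an agreed design: `data.example.org` is a placeholder hostname, the helper name is made up, and only Google and DuckDuckGo are shown because both accept the query in a `q` parameter.

```python
from urllib.parse import urlencode

# Hypothetical portal hostname; replace with the site's real domain.
SITE = "data.example.org"

# Search endpoints that accept the query in a `q` parameter.
PROVIDERS = {
    "google": "https://www.google.com/search",
    "duckduckgo": "https://duckduckgo.com/",
}


def external_search_url(query, provider="google", site=SITE):
    """Build a provider URL that limits results to `site` via the site: operator."""
    params = urlencode({"q": f"site:{site} {query}"})
    return f"{PROVIDERS[provider]}?{params}"


print(external_search_url("permit contractors"))
# https://www.google.com/search?q=site%3Adata.example.org+permit+contractors
```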

@lydiascarf
Collaborator

my thought was it would run in-memory on the client side using sql.js. SQLite is very lightweight and it could be useful for other issues like a dynamic hero image slideshow, expansions to filtering, etc. all it would need is an action to keep it synced and then it would be available for all kinds of dynamic reads. something like this feels more in keeping with the JKAN principles of relying on FOSS over platforms and building things to be customizable, but i might not be seeing the whole picture
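
The two halves of that idea, a sync step that publishes a database file and a client that reads it entirely in memory, can be sketched with Python's standard `sqlite3` module standing in for sql.js. Everything named here (`build_index`, `load_in_memory`, the `datasets` table, the sample row) is hypothetical, and a plain table is used to keep the sketch short.

```python
import os
import sqlite3
import tempfile


def build_index(path, rows):
    """Sync step: write dataset metadata to a SQLite file that can be published."""
    conn = sqlite3.connect(path)
    conn.execute("DROP TABLE IF EXISTS datasets")
    conn.execute("CREATE TABLE datasets (title TEXT, notes TEXT, resources TEXT)")
    conn.executemany("INSERT INTO datasets VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()


def load_in_memory(path):
    """Client step: copy the published file into an in-memory database."""
    disk = sqlite3.connect(path)
    memory = sqlite3.connect(":memory:")
    disk.backup(memory)  # copies every page of the file into memory
    disk.close()
    return memory


# Hypothetical sample row standing in for the site's dataset metadata.
rows = [("LNI Permits", "Example permits dataset.", "Subcontractors")]

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "search.db")
    build_index(db_path, rows)
    db = load_in_memory(db_path)

# The file is gone, but the in-memory copy still answers reads.
query = "SELECT title FROM datasets WHERE resources LIKE ?"
print(db.execute(query, ("%subcontractors%",)).fetchall())
# [('LNI Permits',)]
```

In the browser, the equivalent of `load_in_memory` would be fetching the published file and handing its bytes to sql.js; the sync step could run as a scheduled or on-push action.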

@lydiascarf
Collaborator

lydiascarf commented Dec 3, 2024

@timwis i remember you were looking to improve search here. i'm curious if you think an SQLite-based search solution would be a good fit for upstream JKAN

5 participants