#Vespa

Vespa Code And Operational Examples

Vespa grouping and facets for organizing results

Grouping Results demonstrates Vespa grouping and faceting for query-time result analytics. Read more in Vespa grouping.
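As a rough illustration of what a grouping (facet) request can look like, the sketch below builds a YQL query that counts hits per distinct value of a field. The field name `category` and the `http://localhost:8080` endpoint are placeholders chosen for this example, not names taken from the sample application, and the helper functions are hypothetical.

```python
from urllib.parse import urlencode


def grouping_yql(field: str, source: str = "*") -> str:
    """Build a YQL query that counts hits per distinct value of `field`."""
    # `limit 0` suppresses regular hits so only the grouping result is returned.
    return (
        f"select * from sources {source} where true limit 0 | "
        f"all(group({field}) each(output(count())))"
    )


def search_url(endpoint: str, yql: str) -> str:
    """Encode the query as a GET request against the /search/ API."""
    return f"{endpoint}/search/?{urlencode({'yql': yql})}"


# "category" and the localhost endpoint are placeholders for illustration.
yql = grouping_yql("category")
print(yql)
print(search_url("http://localhost:8080", yql))
```

The grouping expression follows the `| all(group(...) each(output(...)))` form; the sample application itself should be consulted for the fields and aggregations it actually uses.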

Vespa Predicate Fields

predicate-fields uses Vespa's predicate field type to index document-side boolean expressions. Document-side boolean constraints allow a document to specify which types of queries it can be retrieved for. Predicate fields can express logic like "this document should only be visible in search for readers in age range 20 to 30" or "this product should only be visible in search during campaign hours".
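To make the idea of document-side constraints concrete, here is a toy model in plain Python. It is not Vespa's implementation or API: the `Document` class, the `is_visible` function, and the rule that a missing query attribute fails the constraint are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    # Each constraint maps an attribute name to an inclusive (low, high) range.
    ranges: dict[str, tuple[int, int]] = field(default_factory=dict)


def is_visible(doc: Document, query_attributes: dict[str, int]) -> bool:
    """Return True if the query satisfies every range the document declares."""
    for name, (low, high) in doc.ranges.items():
        value = query_attributes.get(name)
        # Assumption of this toy model: a missing attribute fails the constraint.
        if value is None or not low <= value <= high:
            return False
    return True


# The document, not the query, carries the rule "readers aged 20 to 30".
ad = Document("campaign ad", {"age": (20, 30)})
print(is_visible(ad, {"age": 25}))  # True
print(is_visible(ad, {"age": 42}))  # False
```

The point of the sketch is the inversion of roles: the query supplies attribute values, and each document decides whether it may be retrieved for them.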

Vespa custom linguistics Integration

The vespa-chinese-linguistics app demonstrates integrating custom linguistic processing, in this case the Chinese tokenizer Jieba.

Vespa custom HTTP API using request handlers and processors

http-api-using-request-handlers-and-processors demonstrates how to build custom HTTP APIs and REST interfaces with custom handlers and renderers. See also the Custom HTTP API tutorial.

Vespa container plugins with multiple OSGi bundles

multiple-bundles is a technical sample application demonstrating how to use multiple OSGi bundles for custom plugins (searchers, handlers, renderers).

Distributed joins

Joins shows possibilities for doing distributed query-time joins. This is for use cases where parent-child is not sufficient.

Document processing

Document-processing builds on album-recommendation to show some of the possibilities for doing custom document processing in Java.

Generic request processing

generic-request-processing is a generic request-response processing sample application.

Lucene Linguistics

lucene-linguistics contains two sample application packages:

  1. A bare minimal app.
  2. An app showing advanced configuration of the Lucene-based Linguistics implementation.

Lambda functions in AWS and Google Cloud

aws/lambda and google-cloud/cloud-functions have examples of (lambda) functions for accessing data and logs with the cloud providers.

Automatic data generation for training embedders using LLMs

embedder-auto-training-evaluation generates training data automatically using the ChatGPT API. The goal is to train an embedder to perform better at information retrieval on specific datasets without labor-intensive and expensive manual annotation of training data.

Machine-learned embedder models enable efficient similarity computations, but training these models requires large amounts of (often manually) annotated data. The aim of this app is to investigate whether Large Language Models (LLMs), such as GPT-3.5-turbo, can be employed to generate synthetic data for training an embedder without extensive manual intervention.

The repository contains scripts and notebooks to:

  • Prepare datasets
  • Generate training data for datasets using an LLM
  • Train an embedder
  • Evaluate performance

Read more in the blog post.

Embedding service (WORK IN PROGRESS)

embedding-service demonstrates how a Java handler component can be used to process HTTP requests. In this application, a handler implements an embedding service that takes a string as input and returns a vector embedding of that string.

FastHTML Vespa frontend

FastHTML Vespa frontend is a simple frontend for the Vespa search engine. It is built using FastHTML and written in pure Python. Features:

  • Simple search interface, with links to search results.
  • Accordion with full JSON-response from Vespa.
  • SQLite DB for storing queries.
  • Admin authentication for viewing and downloading queries.
  • Deployment options: Docker and Hugging Face Spaces.

ONNX Model export and deployment example

Use model-deployment to generate a model in ONNX format in the models directory by running the ONNXModelExport notebook.

Reranker sample application

reranker is a stateless application that re-ranks results obtained from another Vespa application. This approach does not perform well and is not recommended for production, but it is useful for quick ranking experiments without rewriting application data.
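The general idea of re-ranking results fetched from elsewhere can be sketched in a few lines of Python. This is not the sample application's code: the `rerank` function and the scoring rule are hypothetical, and the example response only assumes the usual `root` / `children` / `relevance` / `fields` shape of a Vespa JSON query result.

```python
from typing import Callable


def rerank(response: dict, score: Callable[[dict], float]) -> list[dict]:
    """Re-score the hits of a Vespa-style query response and sort them."""
    hits = response.get("root", {}).get("children", [])
    # Copy each hit and overwrite its relevance with the new score.
    rescored = [{**hit, "relevance": score(hit.get("fields", {}))} for hit in hits]
    return sorted(rescored, key=lambda hit: hit["relevance"], reverse=True)


# Hand-written stand-in for a response from another application.
response = {"root": {"children": [
    {"id": "a", "relevance": 0.9, "fields": {"title": "old", "year": 1999}},
    {"id": "b", "relevance": 0.4, "fields": {"title": "new", "year": 2023}},
]}}

# Hypothetical experiment: rank purely by recency.
reranked = rerank(response, score=lambda fields: float(fields.get("year", 0)))
print([hit["id"] for hit in reranked])  # ['b', 'a']
```

Because only the hits already returned are re-scored, an experiment like this needs no change to the indexed data, which is the convenience the paragraph above describes.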

Categorize using an LLM

In-Context Learning is a set of scripts/installs to back up the presentation using In-Context Learning at:

For any questions, please register at the Vespa Slack and discuss in the general channel.


Operations

See operations for sample applications for multinode clusters deployed on various infrastructure, such as Kubernetes. It also contains examples for CI/CD, security, and monitoring.

Note: Applications with pom.xml are Java/Maven projects and must be built before being deployed. Refer to the Developer Guide for more information.

Contribute to the Vespa sample applications.