Releases: UKPLab/sentence-transformers
v0.2.6 - Transformers Update - AutoModel - WKPooling
This release updates huggingface/transformers to release v2.8.0.
New Features
- models.Transformer: The Transformer model can now load any huggingface transformers model, such as BERT, RoBERTa, XLNet, XLM-R, Electra, etc. It is based on the AutoModel class from HuggingFace. You no longer need the architecture-specific models (like models.BERT, models.RoBERTa). It also works with community models. See the sketch after this list for an example.
- Multilingual Training: Code is released for making mono-lingual sentence embedding models multi-lingual. See training_multilingual.py for an example. More documentation and details will follow soon.
- WKPooling: Added a PyTorch implementation of SBERT-WK. Note: due to an inefficient QR decomposition implementation in PyTorch, WKPooling can only be run on the CPU, which makes it about 40x slower than mean pooling. For some models WKPooling improves the performance, for others it doesn't.
- WeightedLayerPooling: A new pooling layer that uses representations from all transformer layers and learns a weighted sum of them. So far no improvement compared to only averaging the last layer.
- New pre-trained models released. Every available model is documented in a Google Spreadsheet for an easier overview.
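As a rough sketch of how a model can be assembled with the new AutoModel-based loading (the model name, sequence length, and pooling settings below are illustrative assumptions, not prescribed by this release):

```python
from sentence_transformers import SentenceTransformer, models

# Load any HuggingFace transformers model via AutoModel (model name is just an example)
word_embedding_model = models.Transformer('bert-base-uncased', max_seq_length=128)

# Standard mean pooling over the token embeddings
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
embeddings = model.encode(['This is a test sentence.'])
```

The pooling module could presumably be swapped in the same way for the models.WKPooling or models.WeightedLayerPooling modules described above.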
Minor changes
- Clean-up of the examples folder.
- Model and tokenizer arguments can now be passed to the corresponding transformers model and tokenizer (see the sketch after this list).
- Previous versions had issues with RoBERTa and XLM-RoBERTa where the wrong special tokens were added. This is now fixed; the correct special tokens are added to the input sentences by relying on huggingface transformers.
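A minimal sketch of how such arguments might be passed through, assuming keyword parameters of models.Transformer named model_args and tokenizer_args (the model name and example values are assumptions as well):

```python
from sentence_transformers import models

# Arguments forwarded to the underlying AutoModel / AutoTokenizer
word_embedding_model = models.Transformer(
    'bert-base-uncased',                         # example model name
    model_args={'output_hidden_states': True},   # forwarded to the transformers model
    tokenizer_args={'do_lower_case': True},      # forwarded to the tokenizer
)
```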
Breaking changes
- STSDataReader: The default parameter values have been changed so that it expects the sentences in the first two columns and the score in the third column. If you want to load the STS benchmark dataset, you can use the STSBenchmarkDataReader (see the sketch below).
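A small sketch of the changed defaults (the dataset paths and the column-index parameter names are assumptions for illustration):

```python
from sentence_transformers.readers import STSDataReader, STSBenchmarkDataReader

# New defaults: sentence1 in column 0, sentence2 in column 1, score in column 2
reader = STSDataReader('datasets/my_sts_data',
                       s1_col_idx=0, s2_col_idx=1, score_col_idx=2)

# The original STS benchmark files keep their own column layout,
# so use the dedicated reader for them
sts_reader = STSBenchmarkDataReader('datasets/stsbenchmark')
examples = sts_reader.get_examples('sts-dev.csv')
```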
v0.2.5 - Transformers updates, T5 and XLM-RoBERTa added
huggingface/transformers was updated to version 2.3.0
Changes:
- ALBERT works (bug was fixed in transformers). Does not yield improvements compared to BERT / RoBERTa
- T5 added (does not run on GPU due to a bug in transformers). Does not yield improvements compared to BERT / RoBERTa
- CamemBERT added
- XLM-RoBERTa added
v0.2.4 - Transformer Update - DistilBERT and ALBERT added
This version updates the underlying HuggingFace Transformers package to v2.2.1.
Changes:
- DistilBERT and ALBERT modules added
- Pre-trained models for RoBERTa and DistilBERT uploaded
- Some smaller bug-fixes
v0.2.3 - Windows bugfixes
No breaking changes. Just update with pip install -U sentence-transformers
Bugfixes:
- SentenceTransformers can now be used on Windows (previously threw an exception about invalid tensor types)
- Outputs a warning if seq. length for BERT / RoBERTa is too long
Improvements:
- A flag can be set to hide the progress bar when a dataset is converted or an evaluator is executed
v0.2.2 - RoBERTa support
Updated pytorch-transformers to v1.1.0. Added support for the RoBERTa model.
Bugfixes:
- Critical bugfix for SoftmaxLoss: Classifier weights were not optimized in previous version
- Minor fix for including the timestamp of the output folders
v0.2.1 - Bugfix pypi
This is a minor fix: packages were not correctly defined for PyPI.
v0.2.0 - New Architecture & Models
v0.2.0 completely changes the architecture of sentence transformers.
The new architecture is sequential: you define individual modules that transform a sentence step-by-step into a fixed-sized sentence embedding.
This modular architecture makes it easy to swap different components. You can choose between different embedding methods (BERT, XLNet, word embeddings), transformations (LSTM, CNN), weighting & pooling methods, as well as adding deep averaging networks. A sketch follows the list below.
New models in this release:
- Word Embeddings (like GloVe) for computation of average word embeddings
- Word weighting, for example, with tf-idf values
- BiLSTM and CNN encoder, for example, to re-create the InferSent model
- Bag-of-Words (BoW) sentence representation. Optionally also with tf-idf weighting.
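As an illustrative sketch of the sequential architecture (the embedding file path, hidden dimension, and pooling settings are assumptions, and the module names follow the repository's training examples), an InferSent-style model could be assembled roughly like this:

```python
from sentence_transformers import SentenceTransformer, models

# Average GloVe word embeddings fed into a BiLSTM encoder with max pooling
word_embedding_model = models.WordEmbeddings.from_text_file('glove.6B.300d.txt.gz')
lstm = models.LSTM(word_embedding_dimension=word_embedding_model.get_word_embedding_dimension(),
                   hidden_dim=1024)
pooling_model = models.Pooling(lstm.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=False,
                               pooling_mode_cls_token=False,
                               pooling_mode_max_tokens=True)

model = SentenceTransformer(modules=[word_embedding_model, lstm, pooling_model])
```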
This release has many breaking changes with the previous release. If you need help with the migration, open a new issue.
New model storage procedure: each sub-module is stored in its own subfolder. If you need to migrate old models, it is best to let the library create the subfolder structure (via model.save()) and then copy the pytorch_model.bin into the correct subfolder.
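A rough sketch of that migration path, assuming an architecture-specific module such as models.BERT and assuming the generated subfolder names follow the module order and class names (e.g. 0_BERT, 1_Pooling):

```python
import shutil
from sentence_transformers import SentenceTransformer, models

# Re-create the architecture of the old model, then save it so the library
# generates the new subfolder structure
word_embedding_model = models.BERT('bert-base-uncased')  # example model name
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True,
                               pooling_mode_cls_token=False,
                               pooling_mode_max_tokens=False)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
model.save('output/migrated-model')  # creates e.g. 0_BERT/ and 1_Pooling/ subfolders

# Copy the old weights into the matching sub-module folder (paths are placeholders)
shutil.copy('old-model/pytorch_model.bin', 'output/migrated-model/0_BERT/pytorch_model.bin')
```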