Releases: UKPLab/sentence-transformers
v0.2.6 - Transformers Update - AutoModel - WKPooling
This release updates huggingface/transformers to release v2.8.0.
New Features
- models.Transformer: The Transformer model can now load any huggingface transformers model, such as BERT, RoBERTa, XLNet, XLM-R, Electra, etc. It is based on the AutoModel class from HuggingFace. You no longer need the architecture-specific models (like models.BERT, models.RoBERTa). It also works with community models. See the sketch after this list for an example.
- Multilingual Training: Code is released for making mono-lingual sentence embedding models multi-lingual. See training_multilingual.py for an example. More documentation and details will follow soon.
- WKPooling: Added a PyTorch implementation of SBERT-WK. Note: due to an inefficient QR decomposition implementation in PyTorch, WKPooling can only be run on the CPU, which makes it about 40x slower than mean pooling. For some models WKPooling improves the performance, for others it doesn't.
- WeightedLayerPooling: A new pooling layer that uses representations from all transformer layers and learns a weighted sum of them. So far no improvement compared to only averaging the last layer.
- New pre-trained models released. Every available model is documented in a Google Spreadsheet for an easier overview.
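As a rough sketch of how a model can be assembled with the new AutoModel-based loading (the model name, sequence length, and pooling settings below are illustrative assumptions, not prescribed by this release):

```python
from sentence_transformers import SentenceTransformer, models

# Load any HuggingFace transformers model via AutoModel (model name is just an example)
word_embedding_model = models.Transformer('bert-base-uncased', max_seq_length=128)

# Standard mean pooling over the token embeddings
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
embeddings = model.encode(['This is a test sentence.'])
```

The pooling module could presumably be swapped in the same way for the models.WKPooling or models.WeightedLayerPooling modules described above.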
Minor changes
- Clean-up of the examples folder.
- Model and tokenizer arguments can now be passed to the corresponding transformers model and tokenizer (see the sketch after this list).
- Previous versions had issues with RoBERTa and XLM-RoBERTa where the wrong special tokens were added. This is now fixed; the correct special tokens are added to the input sentences by relying on huggingface transformers.
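A minimal sketch of how such arguments might be passed through, assuming keyword parameters of models.Transformer named model_args and tokenizer_args (the model name and example values are assumptions as well):

```python
from sentence_transformers import models

# Arguments forwarded to the underlying AutoModel / AutoTokenizer
word_embedding_model = models.Transformer(
    'bert-base-uncased',                         # example model name
    model_args={'output_hidden_states': True},   # forwarded to the transformers model
    tokenizer_args={'do_lower_case': True},      # forwarded to the tokenizer
)
```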
Breaking changes
- STSDataReader: The default parameter values have been changed so that it expects the sentences in the first two columns and the score in the third column. If you want to load the STS benchmark dataset, you can use the STSBenchmarkDataReader (see the sketch below).
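A small sketch of the changed defaults (the dataset paths and the column-index parameter names are assumptions for illustration):

```python
from sentence_transformers.readers import STSDataReader, STSBenchmarkDataReader

# New defaults: sentence1 in column 0, sentence2 in column 1, score in column 2
reader = STSDataReader('datasets/my_sts_data',
                       s1_col_idx=0, s2_col_idx=1, score_col_idx=2)

# The original STS benchmark files keep their own column layout,
# so use the dedicated reader for them
sts_reader = STSBenchmarkDataReader('datasets/stsbenchmark')
examples = sts_reader.get_examples('sts-dev.csv')
```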
v0.2.5 - Transformers updates, T5 and XLM-RoBERTa added
huggingface/transformers was updated to version 2.3.0
Changes:
- ALBERT works (bug was fixed in transformers). Does not yield improvements compared to BERT / RoBERTa
- T5 added (does not run on GPU due to a bug in transformers). Does not yield improvements compared to BERT / RoBERTa
- CamemBERT added
- XLM-RoBERTa added
v0.2.4 - Transformer Update - DistilBERT and ALBERT added
This version updates the underlying HuggingFace Transformers package to v2.2.1.
Changes:
- DistilBERT and ALBERT modules added
- Pre-trained models for RoBERTa and DistilBERT uploaded
- Some smaller bug-fixes
v0.2.3 - Windows bugfixes
No breaking changes. Just update with pip install -U sentence-transformers
Bugfixes:
- SentenceTransformers can now be used on Windows (previously threw an exception about invalid tensor types)
- Outputs a warning if seq. length for BERT / RoBERTa is too long
Improvements:
- A flag can be set to hide the progress bar when a dataset is converted or an evaluator is executed
v0.2.2 - RoBERTa support
Updated pytorch-transformers to v1.1.0. Added support for the RoBERTa model.
Bugfixes:
- Critical bugfix for SoftmaxLoss: Classifier weights were not optimized in previous version
- Minor fix for including the timestamp of the output folders
v0.2.1 - Bugfix pypi
This is a minor fix: packages were not correctly defined for PyPI.
v0.2.0 - New Architecture & Models
v0.2.0 completely changes the architecture of sentence transformers.
The new architecture is sequential: you define individual modules that transform a sentence step-by-step into a fixed-sized sentence embedding.
This modular architecture makes it easy to swap different components. You can choose between different embedding methods (BERT, XLNet, word embeddings), transformations (LSTM, CNN), weighting & pooling methods, as well as adding deep averaging networks. A sketch follows the list below.
New models in this release:
- Word Embeddings (like GloVe) for computation of average word embeddings
- Word weighting, for example, with tf-idf values
- BiLSTM and CNN encoder, for example, to re-create the InferSent model
- Bag-of-Words (BoW) sentence representation. Optionally also with tf-idf weighting.
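As an illustrative sketch of the sequential architecture (the embedding file path, hidden dimension, and pooling settings are assumptions, and the module names follow the repository's training examples), an InferSent-style model could be assembled roughly like this:

```python
from sentence_transformers import SentenceTransformer, models

# Average GloVe word embeddings fed into a BiLSTM encoder with max pooling
word_embedding_model = models.WordEmbeddings.from_text_file('glove.6B.300d.txt.gz')
lstm = models.LSTM(word_embedding_dimension=word_embedding_model.get_word_embedding_dimension(),
                   hidden_dim=1024)
pooling_model = models.Pooling(lstm.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=False,
                               pooling_mode_cls_token=False,
                               pooling_mode_max_tokens=True)

model = SentenceTransformer(modules=[word_embedding_model, lstm, pooling_model])
```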
This release has many breaking changes with the previous release. If you need help with the migration, open a new issue.
New model storage procedure: each sub-module is stored in its own subfolder. If you need to migrate old models, it is best to let the library create the subfolder structure (via model.save()) and then copy the pytorch_model.bin into the correct subfolder.
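A rough sketch of that migration path, assuming an architecture-specific module such as models.BERT and assuming the generated subfolder names follow the module order and class names (e.g. 0_BERT, 1_Pooling):

```python
import shutil
from sentence_transformers import SentenceTransformer, models

# Re-create the architecture of the old model, then save it so the library
# generates the new subfolder structure
word_embedding_model = models.BERT('bert-base-uncased')  # example model name
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True,
                               pooling_mode_cls_token=False,
                               pooling_mode_max_tokens=False)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
model.save('output/migrated-model')  # creates e.g. 0_BERT/ and 1_Pooling/ subfolders

# Copy the old weights into the matching sub-module folder (paths are placeholders)
shutil.copy('old-model/pytorch_model.bin', 'output/migrated-model/0_BERT/pytorch_model.bin')
```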