Paper: [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084)
This model was finetuned with Unsloth. It is a sentence-transformers model finetuned from unsloth/embeddinggemma-300m: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
```
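The `Pooling` module above uses masked mean pooling (`pooling_mode_mean_tokens: True`): padding tokens are excluded and the remaining token vectors are averaged. A minimal NumPy sketch of that step, for illustration only (not the library's exact internals):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Masked mean pooling over one sequence.

    token_embeddings: (seq_len, dim) per-token vectors from the transformer
    attention_mask:   (seq_len,) 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=0)
    count = np.clip(mask.sum(), 1e-9, None)  # avoid division by zero
    return summed / count
```

The pooled vector then passes through the two `Dense` layers (768 → 3072 → 768) and is L2-normalized by the final `Normalize()` module.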
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")

# Run inference
sentences = [
    "TITLE: Essentialism: The Disciplined Pursuit of Less\nGENRES: business, fiction, nonfiction, philosophy, self-help\nAUTHORS: Greg McKeown\nDESCRIPTION: Have you ever found yourself stretched too thin? Do you simultaneously feel overworked and underutilized? Are you often busy but not productive? Do you feel like your time is constantly being hijacked by other people's agendas? If you answered yes to any of these, the way out is the Way of the Essentialist. The Way of the Essentialist isn't about getting more done in less time. It's about getting only the right things done. It is not a time management strategy, or a productivity technique. It is a systematic discipline for discerning what is absolutely essential, then eliminating everything that is not, so we can make the highest possible contribution towards the things that really matter. By forcing us to apply a more selective criteria for what is Essential, the disciplined pursuit of less empowers us to reclaim control of our own choices about where to spend our precious time and energy - instead of giving others the implicit permission to choose for us. Essentialism is not one more thing - it's a whole new way of doing everything. A must-read for any leader, manager, or individual who wants to learn how to do less, but better, in every area of their lives, Essentialism is a movement whose time has come.",
    'TITLE: The One Thing: The Surprisingly Simple Truth Behind Extraordinary Results\nGENRES: business, fiction, nonfiction, philosophy, self-help\nAUTHORS: Gary Keller, Jay Papasan\nDESCRIPTION: The One Thing explains the success habit to overcome the six lies that block our success, beat the seven thieves that steal time, and leverage the laws of purpose, priority, and productivity.',
    'TITLE: Smarter than Squirrels\nGENRES: adventure, children, fantasy, fiction, middle grade, young adult\nAUTHORS: Lucy Nolan, Mike Reed\nDESCRIPTION: THE HILARIOUS ADVENTURES OF TWO CONFUSED CANINES Down Girl and Sit are two dogs who are "smarter than squirrels." They know how to protect their masters from all the things that can go wrong in the neighborhood: they bark at paperboys and guard the garbage cans, and keep mischievous squirrels at bay. But when Here Kitty Kitty moves in next door, their daily routines are turned topsy-turvy. Filled with humor and adventure, this illustrated chapter book takes a look at life in the backyard from the well-intentioned but misguided viewpoint of man\'s best friend.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0010, 0.8911, 0.1610],
#         [0.8911, 1.0010, 0.2207],
#         [0.1610, 0.2207, 1.0000]], dtype=torch.float16)
```
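`model.similarity` uses cosine similarity by default (this model's configured `similarity_fct` is `cos_sim`). A minimal NumPy sketch of what that computes; note that float16 rounding is why the self-similarity scores in the output above come out slightly above 1.0:

```python
import numpy as np

def cos_sim_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and b.

    a: (n, dim) embeddings, b: (m, dim) embeddings -> (n, m) scores in [-1, 1].
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T  # dot products of unit vectors = cosine similarities
```

Since module (4) already L2-normalizes the embeddings, the cosine similarity here is just a matrix product of the encoded vectors.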
Training dataset:

* Columns: `anchor_text` and `positive_text`
* Column types:

  | | anchor_text | positive_text |
  |---|---|---|
  | type | string | string |
  | details | | |

* Samples:

  | anchor_text | positive_text |
  |---|---|
  | TITLE: The View from the Cheap Seats: Selected Nonfiction | TITLE: The Art of Asking; or, How I Learned to Stop Worrying and Let People Help |
  | TITLE: Styxx (Dark-Hunter, #22) | TITLE: Dark Skye (Immortals After Dark, #15) |
  | TITLE: Marked (Dark Protectors, #7) | TITLE: Dark Skye (Immortals After Dark, #15) |

* Loss: `MultipleNegativesRankingLoss` with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
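MultipleNegativesRankingLoss treats each anchor's paired positive as the correct "class" and every other positive in the batch as an in-batch negative; `scale: 20.0` multiplies the cosine similarities before the softmax. A simplified NumPy sketch of the idea (the library version also supports explicit hard negatives and cross-device gathering):

```python
import numpy as np

def mnrl_loss(anchors, positives, scale=20.0):
    """In-batch Multiple Negatives Ranking Loss (illustrative sketch).

    anchors, positives: (batch, dim); row i of positives matches row i
    of anchors, and the other batch rows act as negatives for it.
    """
    # L2-normalize so the dot product is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sim = a @ p.T * scale  # (batch, batch) scaled cosine similarities
    # cross-entropy with the diagonal (matching pair) as the target class
    logits = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))
```

This is why the `no_duplicates` batch sampler matters: a duplicate positive elsewhere in the batch would be scored as a false negative.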
Evaluation dataset:

* Columns: `anchor_text` and `positive_text`
* Column types:

  | | anchor_text | positive_text |
  |---|---|---|
  | type | string | string |
  | details | | |

* Samples:

  | anchor_text | positive_text |
  |---|---|
  | TITLE: White Pine | TITLE: Pia the Penguin Fairy (Rainbow Magic: Ocean Fairies, #3) |
  | TITLE: Skin Game (The Dresden Files, #15) | TITLE: Hunted (The Iron Druid Chronicles, #6) |
  | TITLE: Slow Curve on the Coquihalla (A Hunter Rayne Highway Mystery, #1) | TITLE: The Rock Star |

* Loss: `MultipleNegativesRankingLoss` with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 256
- gradient_accumulation_steps: 4
- learning_rate: 2e-05
- warmup_ratio: 0.1
- dataloader_num_workers: 2
- remove_unused_columns: False
- prompts: {'anchor_text': '', 'positive_text': ''}
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 256
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 4
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3.0
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 2
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: False
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: {'anchor_text': '', 'positive_text': ''}
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss |
|---|---|---|
| 0.1282 | 10 | 2.9737 |
| 0.2564 | 20 | 2.6733 |
| 0.3846 | 30 | 2.3228 |
| 0.5128 | 40 | 2.1395 |
| 0.6410 | 50 | 2.0539 |
| 0.7692 | 60 | 1.9516 |
| 0.8974 | 70 | 1.9170 |
| 1.0256 | 80 | 1.9625 |
| 1.1538 | 90 | 1.8804 |
| 1.2821 | 100 | 1.8654 |
| 1.4103 | 110 | 1.8209 |
| 1.5385 | 120 | 1.8294 |
| 1.6667 | 130 | 1.8817 |
| 1.7949 | 140 | 1.9176 |
| 1.9231 | 150 | 1.9241 |
| 2.0513 | 160 | 1.9469 |
| 2.1795 | 170 | 1.8467 |
| 2.3077 | 180 | 1.8364 |
| 2.4359 | 190 | 1.8705 |
| 2.5641 | 200 | 1.8142 |
| 2.6923 | 210 | 1.8757 |
| 2.8205 | 220 | 1.8214 |
| 2.9487 | 230 | 1.8332 |
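As a rough sanity check on the log above (assuming a single device), step 10 falling at epoch 0.1282 implies about 78 optimizer steps per epoch, each consuming an effective batch of 64 × 4 samples:

```python
# Bookkeeping estimate from the listed arguments; single-device assumption.
effective_batch = 64 * 4                # per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = round(10 / 0.1282)    # step 10 lands at epoch 0.1282 -> ~78 steps/epoch
approx_train_samples = steps_per_epoch * effective_batch
print(effective_batch, steps_per_epoch, approx_train_samples)  # 256 78 19968
```

So the training set holds roughly 20,000 anchor/positive pairs.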
Sentence Transformers:

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
MultipleNegativesRankingLoss:

```bibtex
@misc{henderson2017efficient,
    title = {Efficient Natural Language Response Suggestion for Smart Reply},
    author = {Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year = {2017},
    eprint = {1705.00652},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}
```