com.redis.vl.utils.vectorize.BaseVectorizer

com.redis.vl.utils.vectorize.SentenceTransformersVectorizer

public class SentenceTransformersVectorizer extends BaseVectorizer

Vectorizer that uses Sentence Transformers models downloaded from HuggingFace. Models are downloaded and cached locally, then run using ONNX Runtime. This provides the same functionality as Python's sentence-transformers library.

Nested Class Summary

Nested classes/interfaces inherited from class com.redis.vl.utils.vectorize.BaseVectorizer
BaseVectorizer.BatchCacheResult

Field Summary

Fields inherited from class com.redis.vl.utils.vectorize.BaseVectorizer
cache, dimensions, dtype, modelName

Constructor Summary

Constructor - Description

SentenceTransformersVectorizer(String modelName) - Create a vectorizer with default cache directory.

SentenceTransformersVectorizer(String modelName, String cacheDir) - Create a vectorizer with custom cache directory.

Method Summary

Modifier and Type, Method - Description

void close() - Close the vectorizer and clean up resources.

List<List<Float>> embedBatchAsLists(List<String> texts) - Generate embeddings for a batch of texts with default batch size.

List<float[]> embedSentences(List<String> sentences) - Embed multiple sentences for clustering/selection.

protected float[] generateEmbedding(String text) - Generate embedding for a single text (to be implemented by subclasses).

protected List<float[]> generateEmbeddingsBatch(List<String> texts, int batchSize) - Generate embeddings for multiple texts in batch (to be implemented by subclasses).

Methods inherited from class com.redis.vl.utils.vectorize.BaseVectorizer
embed, embed, embedBatch, embedBatch, getCache, getDataType, getDimensions, getModelName, getType, processEmbedding, setCache

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- SentenceTransformersVectorizer
  
  public SentenceTransformersVectorizer(String modelName)
  
  Create a vectorizer with default cache directory.
  
  Parameters:
  
  modelName - Name of the HuggingFace model to use
- SentenceTransformersVectorizer
  
  public SentenceTransformersVectorizer(String modelName, String cacheDir)
  
  Create a vectorizer with custom cache directory.
  
  Parameters:
  
  modelName - Name of the HuggingFace model to use
  
  cacheDir - Custom cache directory for model storage

Method Details
- generateEmbedding
  
  protected float[] generateEmbedding(String text)
  
  Description copied from class: BaseVectorizer
  
  Generate embedding for a single text (to be implemented by subclasses).
  
  Specified by:
  
  generateEmbedding in class BaseVectorizer
  
  Parameters:
  
  text - The text to embed
  
  Returns:
  
  The embedding vector
- generateEmbeddingsBatch
  
  protected List<float[]> generateEmbeddingsBatch(List<String> texts, int batchSize)
  
  Description copied from class: BaseVectorizer
  
  Generate embeddings for multiple texts in batch (to be implemented by subclasses).
  
  Specified by:
  
  generateEmbeddingsBatch in class BaseVectorizer
  
  Parameters:
  
  texts - The texts to embed
  
  batchSize - Number of texts to process per batch
  
  Returns:
  
  List of embedding vectors
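  
  The batchSize parameter is described only as the number of texts to process per batch. As a rough illustration of that idea, the sketch below splits a list of texts into consecutive chunks of at most batchSize elements. BatchingSketch and partition are hypothetical names; how the library itself forms batches or passes them to ONNX Runtime is not documented on this page.
  
  ```java
  import java.util.ArrayList;
  import java.util.List;
  
  // Hypothetical helper, not part of the library: shows one common meaning of a batch size.
  final class BatchingSketch {
  
      // Splits texts into consecutive sublists holding at most batchSize elements each.
      static List<List<String>> partition(List<String> texts, int batchSize) {
          if (batchSize <= 0) {
              throw new IllegalArgumentException("batchSize must be positive");
          }
          List<List<String>> batches = new ArrayList<>();
          for (int start = 0; start < texts.size(); start += batchSize) {
              int end = Math.min(start + batchSize, texts.size());
              // Copy the view so each batch is independent of the input list.
              batches.add(new ArrayList<>(texts.subList(start, end)));
          }
          return batches;
      }
  
      public static void main(String[] args) {
          List<String> texts = List.of("a", "b", "c", "d", "e");
          System.out.println(partition(texts, 2)); // prints [[a, b], [c, d], [e]]
      }
  }
  ```
  
  The last batch may be shorter than batchSize, which is why the end index is clamped to the list size.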
- embedBatchAsLists
  
  public List<List<Float>> embedBatchAsLists(List<String> texts)
  
  Generate embeddings for a batch of texts with default batch size. Returns a List<List<Float>> for convenience.
  
  Parameters:
  
  texts - List of texts to embed
  
  Returns:
  
  List of embeddings as lists of floats
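  
  How the method builds its result internally is not stated here. The following is a minimal sketch of the boxing step that a List<List<Float>> return type implies, assuming the underlying embeddings are float[] arrays as returned by embedSentences. EmbeddingListSketch, toList, and toLists are hypothetical names, not library API.
  
  ```java
  import java.util.ArrayList;
  import java.util.List;
  
  // Hypothetical helper, not part of the library.
  final class EmbeddingListSketch {
  
      // Boxes one primitive vector into a List<Float>, preserving element order.
      static List<Float> toList(float[] vector) {
          List<Float> boxed = new ArrayList<>(vector.length);
          for (float value : vector) {
              boxed.add(value);
          }
          return boxed;
      }
  
      // Converts every vector in the batch, preserving batch order.
      static List<List<Float>> toLists(List<float[]> vectors) {
          List<List<Float>> result = new ArrayList<>(vectors.size());
          for (float[] vector : vectors) {
              result.add(toList(vector));
          }
          return result;
      }
  
      public static void main(String[] args) {
          List<float[]> vectors = List.of(new float[] {1f, 2.5f}, new float[] {0f, -1f});
          System.out.println(toLists(vectors));
      }
  }
  ```
  
  Boxed lists are easier to serialize or pass to collection-based APIs, at the cost of extra memory per element compared with float[].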
- embedSentences
  
  public List<float[]> embedSentences(List<String> sentences)
  
  Embed multiple sentences for clustering/selection. Useful for extractive summarization where we need to compare sentence similarities.
  
  Parameters:
  
  sentences - List of sentences to embed
  
  Returns:
  
  List of embedding vectors (float arrays)
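  
  To make the "compare sentence similarities" step concrete, here is a small sketch that scores float[] embeddings against each other and picks the sentence most similar to the rest. Cosine similarity is an assumption on my part (the page does not name a metric), and SentenceSimilaritySketch, cosine, and mostCentral are hypothetical names rather than library API.
  
  ```java
  import java.util.List;
  
  // Hypothetical helper, not part of the library.
  final class SentenceSimilaritySketch {
  
      // Cosine similarity of two equal-length vectors; returns 0.0 if either has zero norm.
      static double cosine(float[] a, float[] b) {
          if (a.length != b.length) {
              throw new IllegalArgumentException("vectors must have the same length");
          }
          double dot = 0.0;
          double normA = 0.0;
          double normB = 0.0;
          for (int i = 0; i < a.length; i++) {
              dot += a[i] * b[i];
              normA += a[i] * a[i];
              normB += b[i] * b[i];
          }
          if (normA == 0.0 || normB == 0.0) {
              return 0.0;
          }
          return dot / (Math.sqrt(normA) * Math.sqrt(normB));
      }
  
      // Index of the embedding with the highest summed similarity to all others, or -1 if empty.
      static int mostCentral(List<float[]> embeddings) {
          int best = -1;
          double bestScore = Double.NEGATIVE_INFINITY;
          for (int i = 0; i < embeddings.size(); i++) {
              double total = 0.0;
              for (int j = 0; j < embeddings.size(); j++) {
                  if (i != j) {
                      total += cosine(embeddings.get(i), embeddings.get(j));
                  }
              }
              if (total > bestScore) {
                  bestScore = total;
                  best = i;
              }
          }
          return best;
      }
  
      public static void main(String[] args) {
          // Toy 2-D vectors standing in for real sentence embeddings.
          List<float[]> embeddings = List.of(
                  new float[] {1f, 0f}, new float[] {0.7f, 0.7f}, new float[] {0f, 1f});
          System.out.println(mostCentral(embeddings)); // prints 1
      }
  }
  ```
  
  In an extractive summarizer, the list passed to mostCentral would be the result of embedSentences, and the returned index would select the sentence to keep.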
- close
  
  public void close()
  
  Close the vectorizer and clean up resources.
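  
  This page does not say whether the class implements AutoCloseable, so the sketch below uses a plain try/finally, which works either way. FakeVectorizer is a hypothetical stand-in that only mimics the shape of a vectorizer holding a resource; it is not the library class, and its embed method returns a dummy value.
  
  ```java
  // Hypothetical stand-in types, not part of the library.
  final class CloseSketch {
  
      static final class FakeVectorizer {
          private boolean closed = false;
  
          // Dummy embedding: a one-element vector holding the text length.
          float[] embed(String text) {
              if (closed) {
                  throw new IllegalStateException("vectorizer is closed");
              }
              return new float[] {text.length()};
          }
  
          void close() {
              closed = true;
          }
  
          boolean isClosed() {
              return closed;
          }
      }
  
      // Embeds one text and always closes the vectorizer, even if embed throws.
      static float[] embedOnce(FakeVectorizer vectorizer, String text) {
          try {
              return vectorizer.embed(text);
          } finally {
              vectorizer.close();
          }
      }
  
      public static void main(String[] args) {
          FakeVectorizer vectorizer = new FakeVectorizer();
          float[] vector = embedOnce(vectorizer, "abc");
          System.out.println(vector[0] + " closed=" + vectorizer.isClosed());
      }
  }
  ```
  
  For a long-lived vectorizer shared across requests, the same idea applies at application shutdown rather than per call.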
