gitinsp.domain.interfaces.infrastructure.IngestionStrategy
Strategy for preprocessing and preparing documents for vector database ingestion. Implements different approaches for transforming and splitting content based on content type.
Attributes
Supertypes
class Object
trait Matchable
class Any
Known subtypes
Members list
Creates a document splitter appropriate for the given language
Value parameters
chunkSize
The target size of each document chunk
lang
The programming language of the document content
overlap
The number of tokens to overlap between chunks
Attributes
Returns
A DocumentSplitter configured for the language
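The interplay of chunkSize and overlap can be illustrated with a minimal sketch. It is not the gitinsp implementation: `SketchSplitter`, `SplitterSketch` and `splitterFor` are hypothetical names, and the rule that prose is tokenized on whitespace while source code is tokenized by line is an assumption made only for this example.

```scala
// Hypothetical stand-in for the DocumentSplitter type mentioned above; not the real API.
trait SketchSplitter {
  def split(text: String): List[String]
}

object SplitterSketch {
  // Assumption: a "token" is a whitespace-separated word for prose
  // and a whole line for source code. The real strategy may differ.
  def splitterFor(lang: String, chunkSize: Int, overlap: Int): SketchSplitter = {
    require(chunkSize > 0, "chunkSize must be positive")
    require(overlap >= 0 && overlap < chunkSize, "overlap must be in [0, chunkSize)")

    val isProse = lang == "text" || lang == "markdown"
    val separator = if (isProse) " " else "\n"

    new SketchSplitter {
      def split(text: String): List[String] = {
        val tokens =
          if (isProse) text.split("\\s+").filter(_.nonEmpty).toList
          else text.linesIterator.toList
        // Each window starts (chunkSize - overlap) tokens after the previous
        // one, so consecutive chunks share `overlap` tokens.
        val step = chunkSize - overlap
        tokens.sliding(chunkSize, step).map(_.mkString(separator)).toList
      }
    }
  }
}
```

With this sketch, `SplitterSketch.splitterFor("text", 3, 1).split("a b c d e")` produces two chunks that share the token `c`.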
Transforms a text segment before storing it in the vector database
Value parameters
textSegment
The text segment to transform
Attributes
Returns
The transformed text segment
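What such a transformation might look like is sketched below. The documentation does not say what the real transformation does, so everything here is an assumption: `SketchSegment` is a hypothetical stand-in for the text segment type, and the cleanup rules (strip trailing whitespace, drop surrounding blank lines, record the cleaned length as metadata) are chosen only for illustration.

```scala
// Hypothetical stand-in for the text segment type; the real one may differ.
final case class SketchSegment(text: String, metadata: Map[String, String])

object TransformSketch {
  // Assumed cleanup: strip trailing whitespace from every line, drop blank
  // lines at the start and end, and store the cleaned length in the metadata.
  def transform(segment: SketchSegment): SketchSegment = {
    val lines = segment.text.linesIterator.map(_.replaceAll("\\s+$", "")).toList
    val trimmed = lines
      .dropWhile(_.isEmpty)
      .reverse
      .dropWhile(_.isEmpty)
      .reverse
    val cleaned = trimmed.mkString("\n")
    segment.copy(
      text = cleaned,
      metadata = segment.metadata + ("length" -> cleaned.length.toString)
    )
  }
}
```

Returning a new segment through `copy` instead of mutating the input keeps the step side-effect free, which makes it straightforward to chain several transformations before storage.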