Trait tantivy::tokenizer::Tokenizer

pub trait Tokenizer<'a>: Sized + Clone {
    type TokenStreamImpl: TokenStream;
    fn token_stream(&self, text: &'a str) -> Self::TokenStreamImpl;

    fn filter<NewFilter>(
        self,
        new_filter: NewFilter
    ) -> ChainTokenizer<NewFilter, Self>
    where
        NewFilter: TokenFilter<Self::TokenStreamImpl>,
    { ... }
}

Tokenizers are in charge of splitting text into a stream of tokens before indexing.

See the module documentation for more detail.

Warning

This API may change to use associated types.

Associated Types

Type associated with the resulting token stream.

Required Methods

Creates a token stream for a given str.
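
To make "a stream of tokens" concrete, here is a minimal std-only sketch of a lazy stream that borrows a str and yields one token per run of alphanumeric characters. It is not tantivy's implementation: MiniToken, MiniStream and mini_token_stream are hypothetical names, the field layout is an assumption made for illustration, and the stream is modelled as a plain Iterator rather than the TokenStream trait.

```rust
// Std-only sketch; `MiniToken` and `MiniStream` are hypothetical names,
// not tantivy types.

/// One token plus where it was found in the source text.
#[derive(Debug, Clone, PartialEq)]
pub struct MiniToken {
    pub offset_from: usize, // byte offset where the token starts
    pub offset_to: usize,   // byte offset just past the token's end
    pub position: usize,    // 0-based index of the token in the text
    pub text: String,
}

/// Yields one token per maximal run of alphanumeric characters.
pub struct MiniStream<'a> {
    text: &'a str,
    chars: std::str::CharIndices<'a>,
    position: usize,
}

/// Plays the role of `token_stream`: borrows `text` and returns a lazy stream.
pub fn mini_token_stream(text: &str) -> MiniStream<'_> {
    MiniStream { text, chars: text.char_indices(), position: 0 }
}

impl<'a> Iterator for MiniStream<'a> {
    type Item = MiniToken;

    fn next(&mut self) -> Option<MiniToken> {
        // Skip separators; `?` ends the stream once the text is exhausted.
        let start = loop {
            let (idx, c) = self.chars.next()?;
            if c.is_alphanumeric() {
                break idx;
            }
        };
        // The token ends at the next separator, or at the end of the text.
        let mut end = self.text.len();
        for (idx, c) in self.chars.by_ref() {
            if !c.is_alphanumeric() {
                end = idx;
                break;
            }
        }
        let token = MiniToken {
            offset_from: start,
            offset_to: end,
            position: self.position,
            text: self.text[start..end].to_string(),
        };
        self.position += 1;
        Some(token)
    }
}
```

With this sketch, mini_token_stream("Hello, world!") yields two tokens: "Hello" at bytes 0..5 (position 0) and "world" at bytes 7..12 (position 1). The returned stream borrows the input, which is the reason the trait carries the 'a lifetime on text.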

Provided Methods

Appends a token filter to the current tokenizer.

The method consumes the current tokenizer and returns a new one, a ChainTokenizer that applies the filter to its token stream.

Example


use tantivy::tokenizer::*;

let en_stem = SimpleTokenizer
    .filter(RemoveLongFilter::limit(40))
    .filter(LowerCaser)
    .filter(Stemmer::new());
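
The chaining in the example above can be modelled without the crate. The following is a rough std-only sketch of the pattern, not tantivy's code: MiniTokenizer, MiniFilter, Chain, Whitespace, Lowercase, RemoveLong and demo are all hypothetical stand-ins, and token streams are simplified to iterators of String. It shows how a consuming filter method can wrap the previous tokenizer in a new type whose token stream is the filter applied to the inner stream.

```rust
// Std-only sketch; every type here is hypothetical, not tantivy's API.

/// Wraps one token stream (modelled as an iterator of strings) in another.
pub trait MiniFilter<S: Iterator<Item = String>> {
    type Out: Iterator<Item = String>;
    fn transform(&self, inner: S) -> Self::Out;
}

pub trait MiniTokenizer<'a>: Sized + Clone {
    type Stream: Iterator<Item = String>;
    fn token_stream(&self, text: &'a str) -> Self::Stream;
    /// Consumes `self` and returns a new tokenizer with `new_filter` on top.
    fn filter<F>(self, new_filter: F) -> Chain<F, Self>
    where
        F: MiniFilter<Self::Stream>,
    {
        Chain(new_filter, self)
    }
}

/// A filter (field 0) stacked on top of a tokenizer (field 1).
#[derive(Clone)]
pub struct Chain<F, T>(F, T);

impl<'a, F, T> MiniTokenizer<'a> for Chain<F, T>
where
    T: MiniTokenizer<'a>,
    F: MiniFilter<T::Stream> + Clone,
{
    type Stream = F::Out;
    fn token_stream(&self, text: &'a str) -> Self::Stream {
        self.0.transform(self.1.token_stream(text))
    }
}

/// Base tokenizer: splits on whitespace.
#[derive(Clone)]
pub struct Whitespace;
pub struct WhitespaceStream<'a>(std::str::SplitWhitespace<'a>);

impl<'a> Iterator for WhitespaceStream<'a> {
    type Item = String;
    fn next(&mut self) -> Option<String> {
        self.0.next().map(String::from)
    }
}

impl<'a> MiniTokenizer<'a> for Whitespace {
    type Stream = WhitespaceStream<'a>;
    fn token_stream(&self, text: &'a str) -> Self::Stream {
        WhitespaceStream(text.split_whitespace())
    }
}

/// Filter: lowercases every token.
#[derive(Clone)]
pub struct Lowercase;
pub struct LowercaseStream<S>(S);

impl<S: Iterator<Item = String>> Iterator for LowercaseStream<S> {
    type Item = String;
    fn next(&mut self) -> Option<String> {
        self.0.next().map(|t| t.to_lowercase())
    }
}

impl<S: Iterator<Item = String>> MiniFilter<S> for Lowercase {
    type Out = LowercaseStream<S>;
    fn transform(&self, inner: S) -> Self::Out {
        LowercaseStream(inner)
    }
}

/// Filter: drops tokens whose byte length is `limit` or more.
#[derive(Clone)]
pub struct RemoveLong(pub usize);
pub struct RemoveLongStream<S>(S, usize);

impl<S: Iterator<Item = String>> Iterator for RemoveLongStream<S> {
    type Item = String;
    fn next(&mut self) -> Option<String> {
        let limit = self.1;
        self.0.find(|t| t.len() < limit)
    }
}

impl<S: Iterator<Item = String>> MiniFilter<S> for RemoveLong {
    type Out = RemoveLongStream<S>;
    fn transform(&self, inner: S) -> Self::Out {
        RemoveLongStream(inner, self.0)
    }
}

/// Builds a two-filter chain and collects the tokens it yields.
pub fn demo(text: &str) -> Vec<String> {
    let tokenizer = Whitespace.filter(RemoveLong(6)).filter(Lowercase);
    tokenizer.token_stream(text).collect()
}
```

In this sketch, demo("The Quick extraordinarily BROWN fox") returns ["the", "quick", "brown", "fox"]. Each call to filter moves the previous tokenizer into a new Chain value, so the whole pipeline is a single statically typed value and the filters run in the order they were appended.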

Implementors