
Getting started with tantivy

A Step-by-Step Guide

Tantivy is an open-source full-text search engine library written in Rust. It is fast, efficient, and highly customizable, making it a great choice for building search functionality in your Rust applications. In this guide, we will walk you through the process of getting started with Tantivy, from installing the library to building your first search index.

Prerequisites

Before we begin, you should have the following installed on your system:

  • Rust (a recent stable toolchain; releases as old as 1.45 cannot build Tantivy 0.19)
  • Cargo (Rust’s package manager)

Creating a new project

To create a new Rust project for this guide, open a terminal and run the following command:

cargo new getting-started-tantivy

This will create a new Rust project named “getting-started-tantivy” in your current directory.

Adding Tantivy as a dependency

To use Tantivy in your project, you need to add it as a dependency in your Cargo.toml file. Open the Cargo.toml file in your project’s root directory and add the following line under [dependencies]:

tantivy = "0.19.2"

This declares a dependency on Tantivy 0.19.2; Cargo treats the version as a compatibility requirement, so it may select a later 0.19.x release. Save the file and run the following command to download and compile Tantivy:

cargo build

This will download and install all the necessary dependencies, including Tantivy, and build your Rust project.

Creating an index

The first thing you need to do when using Tantivy is to create an index. An index is a data structure that Tantivy uses to store and search your data. To create one, open the main.rs file that cargo new generated in your project’s src directory and replace its contents with the following code:

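use std::fs;
use std::path::Path;
use tantivy::schema::Schema;
use tantivy::{Index, Result};

fn main() -> Result<()> {
    // An empty schema: no fields yet, so nothing can be indexed.
    let schema = Schema::builder().build();
    let path = Path::new("data");
    // create_in_dir expects the directory to exist already.
    fs::create_dir_all(path)?;
    // Fails if "data" already contains an index.
    let _index = Index::create_in_dir(path, schema)?;
    Ok(())
}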

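This code builds an empty schema and creates an index in a directory called “data”, resolved relative to the directory you run the program from (the project’s root directory when you use cargo run). Index::create_in_dir returns an error if the directory already contains an index, so delete the “data” directory before running the next example.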
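To make the idea of an index more concrete, here is a minimal sketch of an inverted index written with the standard library only. It is an illustration under simplifying assumptions, not a description of how Tantivy stores its data: the ToyIndex type and its methods are hypothetical names invented for this example, and the word splitting is deliberately naive.

```rust
use std::collections::{BTreeSet, HashMap};

/// A toy inverted index: each term maps to the ids of the documents containing it.
/// Hypothetical illustration only; Tantivy's real index is far more elaborate.
#[derive(Default)]
struct ToyIndex {
    postings: HashMap<String, BTreeSet<usize>>,
    docs: Vec<String>,
}

impl ToyIndex {
    /// Stores the text and records every lowercased word under the new document id.
    fn add_document(&mut self, text: &str) -> usize {
        let doc_id = self.docs.len();
        self.docs.push(text.to_string());
        for word in text.split_whitespace() {
            // Trim surrounding punctuation so "document." and "document" match.
            let term = word
                .trim_matches(|c: char| !c.is_alphanumeric())
                .to_lowercase();
            if !term.is_empty() {
                self.postings.entry(term).or_default().insert(doc_id);
            }
        }
        doc_id
    }

    /// Returns the ids of the documents containing the term, in ascending order.
    fn search(&self, term: &str) -> Vec<usize> {
        self.postings
            .get(&term.to_lowercase())
            .map(|ids| ids.iter().copied().collect::<Vec<usize>>())
            .unwrap_or_default()
    }

    /// Returns the original text of a document, if the id is known.
    fn stored(&self, doc_id: usize) -> Option<&str> {
        self.docs.get(doc_id).map(|text| text.as_str())
    }
}
```

Looking up a term is a single map access rather than a scan over every document, which is the property that makes an index worth building before you search.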

Adding documents to the index

Now that you have created an index, you can start adding documents to it. Tantivy uses a Document struct to represent a document. Each document can have one or more fields, and each field has a name and a value. To add a document to the index, replace the contents of your main.rs file with the following code:

use std::fs;
use std::path::Path;
use tantivy::schema::*;
use tantivy::{doc, Index, Result};

fn main() -> Result<()> {
    // Declare two fields that are both searchable (TEXT) and retrievable (STORED).
    let mut schema_builder = Schema::builder();
    schema_builder.add_text_field("title", TEXT | STORED);
    schema_builder.add_text_field("body", TEXT | STORED);
    let path = Path::new("data");
    fs::create_dir_all(path)?;
    // Fails if "data" already holds an index from an earlier run.
    let index = Index::create_in_dir(path, schema_builder.build())?;

    let schema = index.schema();
    // The argument is the writer's memory budget in bytes.
    let mut index_writer = index.writer(50_000_000)?;
    let title_field = schema.get_field("title").unwrap();
    let body_field = schema.get_field("body").unwrap();
    let title = "My First Document";
    let body = "This is the body of my first document.";
    let doc = doc!(
        title_field => title,
        body_field => body,
    );
    index_writer.add_document(doc)?;
    // Nothing is searchable until the writer commits.
    index_writer.commit()?;

    Ok(())
}

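This code builds a schema with two text fields (“title” and “body”) and creates the index from it. Both fields are declared TEXT | STORED: TEXT means the value is tokenized and indexed so it can be searched, and STORED means the original value is kept so it can be returned with search results. The code then creates an IndexWriter with a memory budget of 50,000,000 bytes, builds a document with the doc! macro, adds it to the index, and calls commit() so the document is persisted and becomes visible to searchers.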
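Because both fields are indexed as tokenized text, what ends up in the index is a list of terms rather than the raw string. The sketch below approximates that step using only the standard library, assuming behavior close to Tantivy's default tokenizer (split on non-alphanumeric characters, then lowercase). The tokenize and field_matches functions are hypothetical helpers written for this illustration; they are not part of Tantivy's API.

```rust
/// Splits text on non-alphanumeric characters and lowercases each token.
/// A rough approximation of default text tokenization, for illustration only.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        .map(|token| token.to_lowercase())
        .collect()
}

/// Returns true if any token of `field_text` equals the lowercased query term.
fn field_matches(field_text: &str, query_term: &str) -> bool {
    let wanted = query_term.to_lowercase();
    tokenize(field_text).iter().any(|token| *token == wanted)
}
```

Under these assumptions, the title “My First Document” yields the terms my, first, and document, which is why a lowercase query term can match a capitalized word in the original text.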

Searching the index

Now that you have added a document to the index, you can search for it using Tantivy’s search functionality. To search the index, replace the contents of your main.rs file with the following code:

use std::path::Path;
use tantivy::{Index, ReloadPolicy, Result};
use tantivy::collector::TopDocs;
use tantivy::directory::MmapDirectory;
use tantivy::query::QueryParser;

fn main() -> Result<()> {
    // Open the index that the previous step created in "data".
    let path = Path::new("data");
    let index_dir = MmapDirectory::open(path)?;
    let index = Index::open(index_dir)?;
    let schema = index.schema();

    let title = schema.get_field("title").unwrap();
    let body = schema.get_field("body").unwrap();
    // The reader picks up new commits automatically with this reload policy.
    let reader = index
        .reader_builder()
        .reload_policy(ReloadPolicy::OnCommit)
        .try_into()?;

    let searcher = reader.searcher();
    // Query terms without an explicit field are looked up in title and body.
    let query_parser = QueryParser::for_index(&index, vec![title, body]);
    let query = query_parser.parse_query("first")?;
    // Collect at most the ten highest-scoring matches.
    let top_docs = searcher.search(&query, &TopDocs::with_limit(10))?;

    for (_score, doc_address) in top_docs {
        // Fetch the stored fields of each hit and print them as JSON.
        let retrieved_doc = searcher.doc(doc_address)?;
        println!("{}", schema.to_json(&retrieved_doc));
    }
    Ok(())
}

This code opens the existing index, obtains a Searcher from an IndexReader, and creates a QueryParser that searches the “title” and “body” fields by default. It then parses a query for the word “first”, collects the ten highest-scoring matches, and prints the stored fields of each matching document as JSON.
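The search loop iterates over pairs of a relevance score and a document address, ordered from the best match down. As a rough mental model of what a limit of ten means, the sketch below keeps the highest-scoring hits from an unordered list. It uses only the standard library; the Hit alias and top_docs function are hypothetical stand-ins, with a plain u32 in place of Tantivy's document address, and they say nothing about how Tantivy collects results internally.

```rust
/// A hit as the search loop sees it: a relevance score paired with a document id.
/// A plain u32 stands in for a real document address in this illustration.
type Hit = (f32, u32);

/// Returns the `limit` highest-scoring hits, best first.
fn top_docs(mut hits: Vec<Hit>, limit: usize) -> Vec<Hit> {
    // Sort in descending score order; total_cmp gives f32 a total ordering.
    hits.sort_by(|a, b| b.0.total_cmp(&a.0));
    // Drop everything beyond the requested number of results.
    hits.truncate(limit);
    hits
}
```

With a single document in the index, a limit of ten simply returns that one hit; the limit only starts to matter once more documents match than you want to display.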

Conclusion

In this guide, we have shown you how to get started with Tantivy by creating an index, adding documents to it, and searching for them. Tantivy is a powerful search engine library that is easy to use and can help you quickly build search functionality into your Rust projects. With the knowledge you have gained from this guide, you can now start building your own search applications with Tantivy.

This post is licensed under CC BY 4.0 by the author.