createIndex
Type Definition
Section titled “Type Definition”export function createIndex( index_class: IndexClass, contents: unknown[], opt: IndexOpt = {},): StaticSeekIndex | StaticSeekError;Parameters
Section titled “Parameters”-
index_class: Specifies the search algorithm implementation-
LinearIndex (Default)
- Best for small to medium-sized content
- Simple and reliable
- Good balance of performance and accuracy
-
GPULinearIndex
- WebGPU-accelerated fuzzy search
- 2-10x faster for larger datasets
- Gracefully falls back to LinearIndex when WebGPU is unavailable
-
HybridTrieBigramInvertedIndex
- ~100x faster search performance
- Ideal for larger datasets
- Trade-offs:
- Slower index generation
- Higher false positive rate for CJK-like languages
- Less precise fuzzy search for CJK-like languages
- Limited result metadata
-
-
contents: Array of JavaScript objects to be indexed- Supports string fields and string array fields
- Nested arrays containing strings are excluded from search
-
opt: Configuration options for indexing and searching (optional)export type Path = string;export type IndexOpt = {key_fields?: Path[];search_targets?: Path[];distance?: number;weights?: [Path, number][];};key_fields: Fields to include in search resultssearch_targets: Fields to index for searchingdistance: Default edit distance for fuzzy searchweights: Field-specific weights for ranking
Return Value
Section titled “Return Value”The function returns either a StaticSeekIndex object or StaticSeekError if validation fails.
Configuring Indexing
Section titled “Configuring Indexing”When creating an index, you can control indexing and search behavior by specifying the opt parameter. Here’s an example data structure:
const array_of_articles = [ { slug: "introduction-to-js", content: "JavaScript is a versatile programming language widely used for web development...", data: { title: "Introduction to JavaScript", description: "Learn the basics of JavaScript, a powerful language for modern web applications.", tags: ["javascript", "web", "programming"] } }, // ...];key_fields
Section titled “key_fields”To include specific fields in search results, use the key_fields option:
const index = createIndex(LinearIndex, array_of_articles, { key_fields: ['slug', 'data.title']});search_targets
Section titled “search_targets”You can limit which fields are indexed using the search_targets option:
const index = createIndex(LinearIndex, array_of_articles, { search_targets: ['data.title', 'data.description', 'data.tags']});distance
Section titled “distance”Use the distance option to specify the default edit distance for fuzzy searches. If the distance option is not specified, an edit distance of 1 (allowing up to one character mismatch) is used.
const index = createIndex(LinearIndex, array_of_articles, { distance: 2 // Set default edit distance for all searches});If you set the edit distance to 0, fuzzy search is disabled and exact match is performed instead.
weights
Section titled “weights”To assign different weights to specific fields for ranking, use the weights option:
const index = createIndex(LinearIndex, array_of_articles, { weights: [ ['data.title', 2], // Boost title field ['data.description', 0.5] // Lower weight for description field ]});The default weight is 1. Higher weights increase the relevance of the field in search results.
How to specify fields
Section titled “How to specify fields”There are two ways to specify fields:
- Full path: Use the full path to the field, starting from the root object:
key_fields: ['slug', 'data.title'],search_targets: ['data.title', 'data.description', 'data.tags'],weight: [['data.title', 2]]
- Field name only: If the field is a leaf, you can specify the field name directly:
key_fields: ['slug', 'title'],search_targets: ['title', 'description', 'tags'],weight: [['title', 2]]
You can specify a intermediate node to select all leaf of the intermediate node.
Selecting Another Index
Section titled “Selecting Another Index”The full-text search functionality provided by LinearIndex is sufficient for most static site use cases. However, if you require even faster search performance, alternative indexing methods can be utilized to optimize speed and efficiency.
GPU Linear Index
Section titled “GPU Linear Index”import { GPULinearIndex, createIndex, search, StaticSeekError } from "staticseek";
const index = createIndex(GPULinearIndex, array_of_articles);By leveraging GPULinearIndex, fuzzy searches can be offloaded to the GPU, significantly improving performance. This method can achieve several times the speed of LinearIndex. The usage remains identical to LinearIndex, making it easy to switch between implementations.
If a GPU is not available in the execution environment, GPULinearIndex will automatically fall back to LinearIndex, ensuring compatibility across different devices.
Hybrid Trie Bigram Inverted Index
Section titled “Hybrid Trie Bigram Inverted Index”import { HybridTrieBigramInvertedIndex, createIndex, search, StaticSeekError } from "staticseek";
const index = createIndex(HybridTrieBigramInvertedIndex, array_of_articles);The HybridTrieBigramInvertedIndex offers a 10x - 100x search speed improvement compared to LinearIndex. The API usage remains the same, making integration seamless.
However, this increased speed comes at a cost, introducing several trade-offs:
- Longer Indexing Time: Index creation is slower, taking approximately 2 seconds for 100 articles.
- Higher Search Noise for CJK Language: False positives (irrelevant results appearing in search results) become more frequent in languages such as Chinese, Japanese, and Korean (CJK).
- Reduced Accuracy for CJK Languages: Fuzzy searches in CJK may produce noisier results, matching unintended terms.
- Limited Result Metadata: Some search result details, such as exact match position (
pos) and surrounding text (wordaround), are unavailable. - Incomplete TF-IDF Scoring: Currently, only term frequency (TF) is calculated and weights are not refrected, leading to less refined ranking.
Despite these drawbacks, HybridTrieBigramInvertedIndex ensures fast search performance across all devices, delivering a smooth user experience. If prioritizing responsiveness is critical, this index type is a good choice.