стебло · uk stemmer
Ukrainian · rule-based · pure Go

steblo

стебло — “stem / stalk”. A Porter-style stemmer that strips a Ukrainian word down to its botanical root.

Zero runtime dependencies. Stateless and safe under any number of goroutines. An allocation-free hot path. ~4.9M words/sec.

field notes

Small, sharp, and honest.

i.

Zero deps

No cgo, no models, no regex in the hot path. One package, stdlib only.

ii.

Concurrency-safe

Stateless, no package-level mutable state. Call Stem from any number of goroutines.
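A minimal sketch of what "stateless" buys you, assuming a pure stemming function. `toyStem` and `stemAll` are hypothetical stand-ins written for this example, not part of steblo, and the two-suffix rule is nothing like the real algorithm.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// toyStem is a hypothetical stand-in for a stateless stemmer: it reads only
// its argument and touches no package-level variables.
func toyStem(word string) string {
	for _, suf := range [...]string{"і", "а"} {
		if stem := strings.TrimSuffix(word, suf); stem != word && stem != "" {
			return stem
		}
	}
	return word
}

// stemAll stems each word in its own goroutine. Every goroutine writes only
// to its own index of out, so no mutex is needed.
func stemAll(words []string) []string {
	out := make([]string, len(words))
	var wg sync.WaitGroup
	for i, w := range words {
		wg.Add(1)
		go func(i int, w string) {
			defer wg.Done()
			out[i] = toyStem(w)
		}(i, w)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(stemAll([]string{"слова", "українські", "красиві"}))
	// [слов українськ красив]
}
```

Because the function shares nothing between calls, running this under `go run -race` should report no data races.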

iii.

Alloc-free

The stem is a left-anchored slice of the input — StemRunes allocates nothing.
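Why a left-anchored slice costs nothing can be sketched as follows. This is an illustration under stated assumptions: `toySuffixes`, `hasSuffix`, and `toyStemRunes` are hypothetical names invented here, and the suffix list is not steblo's rule set. The point is only that returning `word[:n]` reuses the caller's backing array.

```go
package main

import (
	"fmt"
	"testing"
)

// toySuffixes is a hypothetical, tiny suffix list for illustration only.
var toySuffixes = [][]rune{
	[]rune("ями"),
	[]rune("ів"),
	[]rune("а"),
	[]rune("і"),
}

// hasSuffix reports whether word ends with suf, comparing rune by rune.
func hasSuffix(word, suf []rune) bool {
	if len(suf) > len(word) {
		return false
	}
	off := len(word) - len(suf)
	for i, r := range suf {
		if word[off+i] != r {
			return false
		}
	}
	return true
}

// toyStemRunes returns a prefix of word. The result shares word's backing
// array, so no new memory is allocated.
func toyStemRunes(word []rune) []rune {
	for _, suf := range toySuffixes {
		// Keep at least two runes of stem.
		if len(word)-len(suf) >= 2 && hasSuffix(word, suf) {
			return word[:len(word)-len(suf)]
		}
	}
	return word
}

func main() {
	word := []rune("слова")
	fmt.Println(string(toyStemRunes(word))) // слов

	// testing.AllocsPerRun reports the average number of heap allocations
	// per call of the function it is given.
	allocs := testing.AllocsPerRun(1000, func() {
		_ = toyStemRunes(word)
	})
	fmt.Println("allocs per call:", allocs)
}
```

In this sketch the `[]rune("слова")` conversion does allocate; the zero-allocation property applies to the call on a rune slice the caller already holds.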

iv.

Spec-driven

A canonical algorithm spec is the source of truth; every cross-impl divergence is resolved on the record.

v.

Differential-tested

A 12k-word corpus with 92.5% cross-implementation consensus gates CI.
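One way a consensus figure like this could be computed, as a rough sketch. Everything here is an assumption: `implA`, `implB`, and `consensus` are hypothetical, the four-word corpus is made up, and "consensus" is taken to mean that every implementation returns the same stem for a word. The repo's actual definition may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// stemFunc is the shape shared by the implementations under comparison.
type stemFunc func(string) string

// implA and implB are hypothetical implementations that disagree on "і".
func implA(w string) string { return strings.TrimSuffix(w, "а") }

func implB(w string) string {
	w = strings.TrimSuffix(w, "а")
	return strings.TrimSuffix(w, "і")
}

// consensus returns the fraction of corpus words on which all
// implementations agree, plus the words on which they diverge.
func consensus(corpus []string, impls ...stemFunc) (rate float64, diverging []string) {
	if len(corpus) == 0 || len(impls) == 0 {
		return 1, nil
	}
	agreed := 0
	for _, w := range corpus {
		want := impls[0](w)
		same := true
		for _, f := range impls[1:] {
			if f(w) != want {
				same = false
				break
			}
		}
		if same {
			agreed++
		} else {
			diverging = append(diverging, w)
		}
	}
	return float64(agreed) / float64(len(corpus)), diverging
}

func main() {
	corpus := []string{"слова", "книга", "красиві", "мова"}
	rate, diverging := consensus(corpus, implA, implB)
	fmt.Printf("consensus %.1f%%, diverging: %v\n", rate*100, diverging)
	// consensus 75.0%, diverging: [красиві]
}
```

A CI gate along these lines would fail the build when the rate drops below a recorded baseline, and the diverging list is what gets reviewed against the spec.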

vi.

Bleve-ready

An optional, decoupled uk analyzer + uk_stem token filter.

cultivation

Plant it in three lines.

// go get github.com/tggo/steblo
package main

import (
	"fmt"

	"github.com/tggo/steblo"
)

func main() {
	fmt.Println(steblo.Stem("випробування"))
	// → випробуван
}

# CLI: stemctl
echo "слова українські красиві" | stemctl
# → слов українськ красив

stemctl --strict --json < words.txt
# {"слова":"слов", ...}

stemctl --bench < bench/words.txt
# 10000 words, 2.2ms, 219 ns/word

growth rate

Measured, not promised.

127 ns · per word, runes
0 allocs · StemRunes hot path
4.9M/s · words per second
91% · test coverage

Apple M4 Max · Go 1.25 · over a 10k-word workload. Numbers, methodology, and an honest accounting live in the repo’s bench report.
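How a wall-clock time over a word list turns into ns/word and words/sec, as a small sketch. `toyStem` and `rates` are hypothetical helpers for this example, the workload is synthetic, and this is not the methodology of the repo's bench report. The last two lines redo the arithmetic for the CLI output quoted earlier: the rounded 2.2 ms over 10000 words gives 220 ns/word, within rounding of the printed 219.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// toyStem is a hypothetical stand-in for the function being measured.
func toyStem(w string) string { return strings.TrimSuffix(w, "а") }

// rates converts a measured wall time over a number of words into
// nanoseconds per word and words per second.
func rates(elapsed time.Duration, words int) (nsPerWord, wordsPerSec float64) {
	nsPerWord = float64(elapsed.Nanoseconds()) / float64(words)
	wordsPerSec = float64(words) / elapsed.Seconds()
	return
}

func main() {
	// Build a synthetic 10k-word workload by cycling a few words.
	base := []string{"слова", "мова", "книга", "кіт"}
	words := make([]string, 0, 10000)
	for len(words) < cap(words) {
		words = append(words, base[len(words)%len(base)])
	}

	start := time.Now()
	checksum := 0
	for _, w := range words {
		checksum += len(toyStem(w)) // use the result so the call is not dropped
	}
	elapsed := time.Since(start)

	ns, wps := rates(elapsed, len(words))
	fmt.Printf("%d words, %v, %.0f ns/word, %.1fM words/s (checksum %d)\n",
		len(words), elapsed, ns, wps/1e6, checksum)

	// Arithmetic for the quoted CLI line: 2.2ms over 10000 words.
	ns, wps = rates(2200*time.Microsecond, 10000)
	fmt.Printf("%.0f ns/word, %.2fM words/s\n", ns, wps/1e6) // 220 ns/word, 4.55M words/s
}
```

A single timed loop like this is noisy; Go's `testing.B` benchmarks repeat the work and are the more usual tool for figures meant to be published.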