Kolect, “collect”, and the moment we realized our search was lying

A friend of ours was beta-testing the agent. His agency is called Kolect. He set it up, ran the first scan, and what came back was… advice on how to organize a Pokémon trading-card collection.

He was good-natured about it. We spent the next three days finding out why.

Layer 1: the obvious culprit

The first guess was that our query expansion was being too aggressive, turning “Kolect” into “collect” through over-eager spell-correction. We logged the actual queries hitting the search providers and confirmed the suspicion: three of the five providers were silently substituting “collect” for “Kolect.”

We turned spell-correction off across the board and re-ran. Things got worse. Now the queries returned almost nothing, and what they did return was unrelated: random Reddit threads where someone had typed “Kolect” as a typo for “collect.”

Layer 2: the disambiguation gap

It turned out spell-correction wasn't the cause; it was a symptom. The real issue was that our query planner had no idea what kind of entity “Kolect” was supposed to be. Without context — industry, geography, what they actually do — the planner was treating the brand name as a free-form keyword. And free-form keywords sit on top of a power-law distribution: most documents matching the literal string are noise.

Stripe doesn't have this problem because everyone knows what Stripe is. Kolect, with 38 employees and a niche in creator-matching, does. So do almost all our users — that's the market we serve.

Layer 3: the rewrite

We rebuilt the query planner around a strict rule: every search query carries the full brand context — industry, market, product description, and disambiguation phrases — whether the underlying provider supports structured queries or not.

For providers that support boolean operators, this becomes:

(Kolect OR "Kolect agency") AND (
    "creator agency" OR "ugc" OR "influencer marketing"
) AND (
    "us" OR "united states"
)

For providers that only accept a single string, we generate a denser query:

Kolect creator agency UGC influencer marketing US
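As a rough illustration of that rule, here is a minimal Python sketch that builds both query forms from one context object. The `BrandContext` class, its field names, and the quoting rule (quote only multi-word phrases) are assumptions made for this example, not the planner's actual code, so the output differs slightly from the hand-written queries above.

```python
from dataclasses import dataclass, field


@dataclass
class BrandContext:
    """Hypothetical container for the context every query must carry."""
    name: str
    aliases: list[str] = field(default_factory=list)
    industry_terms: list[str] = field(default_factory=list)
    market_terms: list[str] = field(default_factory=list)


def _quote(term: str) -> str:
    # Assumption: only multi-word phrases need quoting for an exact match.
    return f'"{term}"' if " " in term else term


def _any_of(terms: list[str]) -> str:
    # One parenthesized OR-group, e.g. (us OR "united states").
    return "(" + " OR ".join(_quote(t) for t in terms) + ")"


def boolean_query(ctx: BrandContext) -> str:
    """Query for providers that support boolean operators."""
    groups = [[ctx.name, *ctx.aliases], ctx.industry_terms, ctx.market_terms]
    # Empty groups are skipped so a sparse context still yields a valid query.
    return " AND ".join(_any_of(g) for g in groups if g)


def flat_query(ctx: BrandContext) -> str:
    """Denser single-string query for providers without operators."""
    seen: set[str] = set()
    terms: list[str] = []
    for term in [ctx.name, *ctx.industry_terms, *ctx.market_terms]:
        key = term.lower()
        if key not in seen:  # drop case-insensitive duplicates, keep order
            seen.add(key)
            terms.append(term)
    return " ".join(terms)


kolect = BrandContext(
    name="Kolect",
    aliases=["Kolect agency"],
    industry_terms=["creator agency", "ugc", "influencer marketing"],
    market_terms=["us", "united states"],
)
```

Calling `boolean_query(kolect)` and `flat_query(kolect)` gives one query per provider type from the same context, which keeps the two forms from drifting apart when the brand description changes.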

And we score every returned result against the same context vector before we surface it to the user. Anything below a 0.6 relevance score gets dropped silently. Results between 0.6 and 0.8 get a small “low-confidence” flag in the UI so the user can see we're less sure.
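The thresholds above can be sketched in a few lines of Python. This assumes the relevance score has already been computed upstream and lies in [0, 1]; the `Result` class and `triage` function are hypothetical names, and treating a score of exactly 0.8 as high-confidence is our guess at the boundary, since the text does not say.

```python
from dataclasses import dataclass

DROP_BELOW = 0.6  # results under this score are dropped silently
FLAG_BELOW = 0.8  # results under this score (but kept) get the flag


@dataclass
class Result:
    title: str
    relevance: float  # assumed precomputed against the brand context


def triage(results: list[Result]) -> list[tuple[Result, bool]]:
    """Return (result, low_confidence) pairs for results worth surfacing."""
    surfaced = []
    for r in results:
        if r.relevance < DROP_BELOW:
            continue  # never shown to the user
        # Flag the middle band; 0.8 and above is shown without a flag.
        surfaced.append((r, r.relevance < FLAG_BELOW))
    return surfaced
```

Keeping the two cutoffs as named constants makes it easy to tune them per provider later without touching the triage logic.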

What it taught us

Three things that have shaped how we think about the agent:

The user's brand name is rarely enough. Treat every search as a context-rich operation, even if that triples the prompt.
Silent quality is more important than apparent quantity. Better to show 7 relevant results than 30 of which 23 are noise.
The user has to be able to feel us being careful. The “low-confidence” flag added cost; it also added trust. People who saw the flag assumed the high-confidence results were actually high-confidence.

Kolect now sees Kolect-relevant signals. Pokémon collectors are, presumably, also seeing Pokémon-relevant signals — somewhere else.

Later note: we added a one-click “Not my brand” reject button on every signal. Each press updates the per-brand filter within minutes. It's the most-used button in the beta.
