What is Tupshar?

Tupshar (Akkadian: scribe; tablet-writer) is an AI-native document store. It's a research platform exploring how AI systems can effectively work with knowledge bases.

You store documents via HTTP API or MCP tools. Tupshar indexes them with BM25 full-text search. You query with natural language or structured filters. Results come back ranked by relevance.
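To make "ranked by relevance" concrete, here is a minimal sketch of textbook Okapi BM25 scoring. It is not Tupshar's implementation. The whitespace tokenizer, the parameters k1=1.2 and b=0.75, and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; real analyzers do much more."""
    return text.lower().split()

def bm25_rank(query, docs, k1=1.2, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Illustrative corpus, not real Tupshar data.
docs = [
    "rust is a systems programming language",
    "bm25 ranks documents by term relevance",
    "tablets were written by scribes",
]
ranking = bm25_rank("bm25 relevance", docs)
```

Only the second document contains either query term, so it ranks first and the others score zero. Rare terms weigh more (IDF), and repeated terms in long documents count for less (saturation and length normalization).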

That's it. No ML magic, no fine-tuning, no black boxes. Just fast, honest search built for AI systems.

Why Now?

Large language models have changed what retrieval means. LLMs don't just need documents. They need relevant context in the right form at the right time.

Most knowledge base systems were built for humans:

Heavy on UI. Light on APIs.
Designed for users to click around. Not designed for AI agents to integrate.
Single-replica, single-tenant, runs-on-your-laptop patterns.

Tupshar is different. It's designed from the ground up as an API. No UI. No console. Just clean, simple HTTP endpoints and MCP tools. Store, search, retrieve. That's the contract.

Research, Not Product

This matters: Tupshar is research software. We're shipping it early to learn.

What that means:

No SLA. No uptime guarantee. No support team.
APIs may change. We'll communicate, but breaking changes may happen.
Data durability guarantees are weaker than those of production systems.
We make decisions for learning, not for scale.

What it doesn't mean:

It's not broken. 220 tests. Stable API. Works.
It's not "beta". We're not hiding behind marketing language.
It won't be archived. If this proves useful, it becomes a product.

Research Goals

We're exploring:

What AI systems actually need from knowledge bases
- Is BM25 search enough, or do we need semantic embeddings?
- How should APIs look? What operations matter?
- What failure modes hurt?
How to build this at reasonable cost
- Can we use commodity SurrealDB for this workload?
- What does it cost to run per-tenant isolation?
- Can we provide search at API-only cost?
How retrieval fits into LLM workflows
- What latency matters?
- What does "relevant" mean when the user is an AI?
- How do quotas affect the experience?

Your feedback shapes these answers.

Technology

Language: Rust (for performance and safety)

Storage: SurrealDB (per-tenant, multi-model: documents + queries)

Search: BM25 full-text search with configurable ranking

API: HTTP/REST with Bearer token authentication

Integration: MCP tools for Claude and compatible clients

Observability: Structured logging, metrics, request tracing
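As a rough sketch of what calling an HTTP/REST API with Bearer token authentication looks like from a client, the snippet below builds (but does not send) two requests using only the Python standard library. The base URL, the /documents and /search paths, and the JSON field names are hypothetical placeholders, not Tupshar's documented endpoints.

```python
import json
import urllib.request

# Hypothetical placeholder: the real base URL is not given in this document.
BASE_URL = "https://tupshar.example.invalid"

def build_request(path, token, payload):
    """Build (but do not send) a JSON POST request with Bearer authentication."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Hypothetical paths and field names, for illustration only.
store_req = build_request(
    "/documents",
    "YOUR_TOKEN",
    {"title": "Notes", "body": "BM25 ranks by relevance"},
)
search_req = build_request("/search", "YOUR_TOKEN", {"query": "relevance"})
```

Passing either request to urllib.request.urlopen would send it. The point is the shape: one token in the Authorization header, JSON in, JSON out.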

Team

Tupshar is maintained by the Upside Down Research team. We build authorization and research systems.

Roadmap

Research Phase (now)

Collect usage patterns
Identify pain points
Decide on long-term architecture

Hardening Phase (TBD)

Multi-tenant safety improvements
Key rotation / tenant separation
Email verification for signup

Product Phase (TBD)

Production SLA
Multi-key tenants
Admin UI
Fine-grained quotas and rate limiting

We'll post updates as we learn. Watch this space.

Questions? Email us at paul@upside-down-research.com