Archy Knowledgebase

Build a context-aware knowledge graph for advanced RAG.

Ingest files, databases, and web pages. Extract entities and relations. Resolve duplicates. Load Neo4j. Then answer questions with hybrid retrieval: vector search plus graph traversal.

Pipeline (Dagster)

Ingest → Extract → Resolve → Graph → Retrieve

Neo4j · Qdrant · Redis · Parquet snapshots

Ingest

Canonical documents + chunks from files, web, and databases.

docs/*.md → Document

html → cleaned text
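As a rough illustration of the ingest step, the sketch below turns raw Markdown or HTML into a canonical document with overlapping word chunks. It uses only the Python standard library; the `Document` fields, the `ingest` and `chunk_words` helpers, and the chunk sizes are assumptions made for this example, not Archy Knowledgebase's actual schema or API.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def clean_html(html):
    """html -> cleaned text (tags, scripts, and styles removed)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


@dataclass
class Document:
    # Hypothetical canonical document shape, for illustration only.
    source: str
    text: str
    chunks: list = field(default_factory=list)


def chunk_words(text, size=50, overlap=10):
    """Split text into windows of `size` words that share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def ingest(source, raw, is_html=False, size=50, overlap=10):
    """Normalize one raw input into a Document with chunks."""
    text = clean_html(raw) if is_html else " ".join(raw.split())
    return Document(source=source, text=text, chunks=chunk_words(text, size, overlap))
```

Overlapping chunks are a common choice here because an entity mention that straddles a chunk boundary still appears whole in at least one chunk.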

Extract + resolve

Entities, relations, and probabilistic deduplication to keep the graph clean.

Profiles choose extractors (spaCy rules, LLMs) and matchers (Splink).

Hybrid retrieval

Answer questions with evidence: vector search plus graph traversals.

Vector recall: Qdrant

Relationship context: Neo4j

Why teams use it

Retrieve the “why” with evidence — not vibes.

Vector search alone finds text. Knowledge graphs capture relationships: systems used by services, ownership, dependencies, decisions, and the trail of evidence. Archy Knowledgebase combines both so answers stay grounded in your sources.

Hybrid retrieval

Combine vector recall with graph traversal to surface the most relevant context and the relationships around it.
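The idea can be sketched in a few lines: rank chunks by vector similarity, collect the entities those chunks mention, then pull the graph edges that touch those entities. In this minimal example, plain Python lists stand in for Qdrant and Neo4j, and the `hybrid_retrieve` function, the chunk layout, and the service names are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query_vec, chunks, edges, k=2):
    """Return the top-k chunks plus one-hop graph context around them.

    chunks: dicts with "id", "vector", and "entities" (stand-in for a vector store)
    edges:  (source, relation, target) triples (stand-in for a graph database)
    """
    # Step 1: vector recall.
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)[:k]
    # Step 2: entities mentioned in the recalled chunks become traversal seeds.
    seeds = {entity for chunk in ranked for entity in chunk["entities"]}
    # Step 3: one-hop traversal, keeping every edge that touches a seed.
    context = [edge for edge in edges if edge[0] in seeds or edge[2] in seeds]
    return ranked, context


# Hypothetical data for illustration.
CHUNKS = [
    {"id": "a", "vector": [1.0, 0.0], "entities": {"billing-service"}},
    {"id": "b", "vector": [0.0, 1.0], "entities": {"wiki"}},
    {"id": "c", "vector": [0.9, 0.1], "entities": {"postgres"}},
]
EDGES = [
    ("billing-service", "USES", "postgres"),
    ("payments-team", "OWNS", "billing-service"),
    ("wiki", "LINKS_TO", "handbook"),
]

hits, relations = hybrid_retrieve([1.0, 0.0], CHUNKS, EDGES, k=2)
```

With this data the traversal surfaces the ownership edge for `billing-service` even though no recalled chunk mentions `payments-team`, which is the kind of relationship context vector recall alone would miss.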

Entity resolution

Probabilistic matching deduplicates entities so your graph stays coherent across sources and naming variations.
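To make "probabilistic matching" concrete, here is a minimal sketch of the Fellegi-Sunter scoring model that Splink is built on, written in plain Python rather than with Splink itself. Each field comparison adds or subtracts evidence, and the total becomes a match probability. The `m`/`u` values, the prior, the `normalize` rule, and the field names are assumptions chosen for illustration; a real deployment would estimate these from data.

```python
import math

# Hypothetical parameters, for illustration only:
#   m = P(field agrees | records are the same entity)
#   u = P(field agrees | records are different entities)
PARAMS = {
    "name": {"m": 0.90, "u": 0.01},
    "type": {"m": 0.95, "u": 0.30},
}
PRIOR = 0.01  # assumed probability that a random pair is a true match


def normalize(value):
    """Lowercase and treat hyphens as spaces so naming variations compare equal."""
    return " ".join(value.lower().replace("-", " ").split())


def match_probability(a, b, params=PARAMS, prior=PRIOR):
    """Combine per-field evidence into a posterior match probability."""
    # Start from the prior odds, expressed as a log2 weight.
    weight = math.log2(prior / (1 - prior))
    for name, p in params.items():
        if normalize(a[name]) == normalize(b[name]):
            weight += math.log2(p["m"] / p["u"])  # agreement adds evidence
        else:
            weight += math.log2((1 - p["m"]) / (1 - p["u"]))  # disagreement subtracts
    odds = 2 ** weight
    return odds / (1 + odds)


same = match_probability(
    {"name": "Billing-Service", "type": "service"},
    {"name": "billing service", "type": "service"},
)
different = match_probability(
    {"name": "Billing-Service", "type": "service"},
    {"name": "Billing DB", "type": "database"},
)
```

Pairs scoring above a chosen threshold would be merged into one graph node; the threshold trades missed duplicates against false merges.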

Snapshots + lineage

Materializations, caching, and Parquet snapshots let you reproduce results, compare runs, and debug extraction changes.
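A small sketch of what comparing runs can look like: fingerprint each run's extracted entities, then diff two runs by entity id. JSON and `hashlib` are used here as a stand-in for Parquet snapshots so the example needs only the standard library; the record shape and the `fingerprint` and `diff_runs` helpers are hypothetical.

```python
import hashlib
import json


def fingerprint(records):
    """Order-independent content hash of a run's records."""
    canonical = json.dumps(sorted(records, key=lambda r: r["id"]), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_runs(old, new):
    """Report which entity ids were added, removed, or changed between runs."""
    old_by_id = {r["id"]: r for r in old}
    new_by_id = {r["id"]: r for r in new}
    shared = old_by_id.keys() & new_by_id.keys()
    return {
        "added": sorted(new_by_id.keys() - old_by_id.keys()),
        "removed": sorted(old_by_id.keys() - new_by_id.keys()),
        "changed": sorted(k for k in shared if old_by_id[k] != new_by_id[k]),
    }


# Hypothetical extraction output from two runs.
run_a = [{"id": "e1", "name": "Postgres"}, {"id": "e2", "name": "Billing"}]
run_b = [{"id": "e2", "name": "Billing Service"}, {"id": "e3", "name": "Redis"}]

changes = diff_runs(run_a, run_b)
```

Equal fingerprints mean a rerun reproduced the earlier result exactly; when they differ, the diff narrows down which entities an extraction or prompt change affected.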

Configurable by profile

Adapt extraction and matching to your domain.

Use profile-driven configuration to choose extractors (spaCy patterns, LLMs), tune chunking and prompts, and switch matching strategies per source type or language.

Profiles control

extractors: [spacy, azure-openai]

matchers: [splink, rules]

graph: neo4j (weighted edges)

retrieval: qdrant + graph traversals
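One way to picture per-source profile selection is a lookup keyed by source type and language with a fallback. The sketch below is an assumption about how such a mechanism could work, not the product's actual configuration format: the `Profile` fields, the `chunk_size` setting, the `PROFILES` keys, and `select_profile` are hypothetical, and only the extractor and matcher names come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    # Hypothetical profile shape, for illustration only.
    extractors: tuple
    matchers: tuple
    chunk_size: int = 500


DEFAULT = Profile(extractors=("spacy",), matchers=("rules",))

# Keys are (source_type, language); a language of None matches any language.
PROFILES = {
    ("markdown", "en"): Profile(
        extractors=("spacy", "azure-openai"),
        matchers=("splink", "rules"),
    ),
    ("html", None): Profile(
        extractors=("spacy",),
        matchers=("splink",),
        chunk_size=300,
    ),
}


def select_profile(source_type, language):
    """Prefer an exact (type, language) match, then a type-only match, then the default."""
    for key in ((source_type, language), (source_type, None)):
        if key in PROFILES:
            return PROFILES[key]
    return DEFAULT
```

For example, `select_profile("html", "de")` falls through to the language-agnostic HTML profile, while an unknown source type gets `DEFAULT`.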

FAQ

A few quick answers about how Archy Knowledgebase fits into your workflow.

Is this replacing our docs?

Is this just vector search?

What data sources can we ingest?

How do snapshots help?