
Karpathy's 400,000-Word Obsidian Wiki Has Zero RAG Infrastructure

Andrej Karpathy posted a short tweet last week that made about half the personal RAG stacks on the internet look like massive overkill. No vector database. No embeddings. No retrieval chain. Just a folder of markdown files, Obsidian, and one schema file that Claude Code reads every session. He is running a 400,000-word personal research wiki on it.

This post is the complete walkthrough of the pattern, the exact schema file every other tutorial glosses over, and a downloadable Obsidian vault template you can use today.

The Core Insight

Most people's experience with LLMs and documents looks like standard RAG. You upload a collection of files, the model retrieves relevant chunks at query time, and generates an answer. This works, but the model is rediscovering knowledge from scratch on every question. There is no accumulation. Ask a subtle question twice, the model has to find and piece together the relevant fragments twice.

The wiki pattern inverts this. Instead of retrieving from raw sources at query time, the LLM incrementally builds and maintains a structured wiki that sits between you and the raw sources. When you add a new source, the model reads it, extracts the key information, and integrates it into the existing wiki. Updating entity pages, revising topic summaries, noting contradictions, strengthening the synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

The wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you have read. It gets richer with every source you add and every question you ask.

The Three Layers

Every version of this pattern has the same three layers. Get these right and nothing else matters much.

Layer 1: raw/

This is your inbox. Articles, papers, transcripts, pasted notes, screenshots. The LLM reads from this folder and never writes to it. Everything in here is immutable. It is your source of truth.

Layer 2: wiki/

This is where the LLM lives. It writes summaries, entity pages, concept pages, topic indexes, and the master index. You read this layer in Obsidian. You do not write in it. Manual edits cause the system to drift session over session.

Layer 3: CLAUDE.md

One file at the vault root. The schema. This is what turns a generic Claude Code session into a disciplined librarian. Every session reads this file first. Every operation follows rules from this file. This is the piece every other tutorial skips.

Optional fourth layer: **output/** for query results, reports, slide decks, and generated artifacts. Good outputs get promoted into the wiki as new articles so your explorations compound.
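The three folders and the schema stub are simple enough to scaffold in a few lines. A minimal sketch in Python: the folder names follow the layers above, but the stub text and file names like `_index.md` are illustrative, not the template's exact contents.

```python
from pathlib import Path

# Illustrative schema stub -- the real CLAUDE.md is around 60 lines.
CLAUDE_MD_STUB = (
    "You are the librarian of this vault. The wiki/ folder is your domain. "
    "You write and maintain every file in wiki/. "
    "The human rarely edits wiki files directly.\n"
)

def bootstrap_vault(root: str) -> Path:
    """Create the raw/wiki/output layers plus a CLAUDE.md stub."""
    vault = Path(root)
    (vault / "raw").mkdir(parents=True, exist_ok=True)   # layer 1: immutable inbox
    wiki = vault / "wiki"                                # layer 2: LLM-owned
    wiki.mkdir(exist_ok=True)
    (wiki / "_index.md").touch()                         # master index
    (wiki / "log.md").touch()                            # append-only log
    (vault / "output").mkdir(exist_ok=True)              # optional layer 4
    schema = vault / "CLAUDE.md"                         # layer 3: the schema
    if not schema.exists():
        schema.write_text(CLAUDE_MD_STUB)
    return vault
```

Point Obsidian at the resulting folder as a vault and run Claude Code in its root.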

The Four Operations

Four verbs run the whole system.

**Ingest.** Drop a source in raw/ and type compile in Claude Code. The model reads the file, decides which topic it belongs to, writes or updates a wiki article, adds backlinks, updates the topic index, updates the master index, and appends a log entry. One source typically touches 10 to 15 wiki files in a single pass. That is the bookkeeping humans abandon and the reason every second brain system eventually dies of neglect.

**Query.** Ask a question. The model reads the master index first, then the topic index, then 1 to 3 specific articles. Three to four file reads, no vectors, no embeddings. The index files are the retrieval layer, and the model maintains them for you automatically.
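The index walk can be made concrete. A sketch under assumptions: index files are named `_index.md`, cross-references use `[[wikilink]]` syntax, and matching a topic by link name is a crude stand-in for the model's judgment.

```python
import re
from pathlib import Path

# Captures the target of [[target]] or [[target|alias]].
WIKILINK = re.compile(r"\[\[([^\]|]+)")

def links_in(path: Path) -> list:
    """Return wikilink targets in reading order -- the 'retrieval layer'."""
    return WIKILINK.findall(path.read_text())

def query_walk(wiki: Path, topic: str, limit: int = 3) -> list:
    """Follow master index -> topic index -> first few articles.
    Mirrors the three-to-four file reads described above."""
    if topic not in links_in(wiki / "_index.md"):   # read 1: master index
        return []
    articles = links_in(wiki / topic / "_index.md")[:limit]  # read 2: topic index
    # reads 3..n: the articles themselves
    return [(wiki / topic / f"{name}.md").read_text() for name in articles]
```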

**Lint.** Periodic health check. The model reads every file in the wiki and produces a report covering contradictions, stale claims, orphan pages, missing cross-links, unsourced claims, and suggested new articles. No changes happen during the lint pass. You approve specific fixes one at a time.
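Two of those checks, unsourced claims and orphan pages, are mechanical enough to sketch without a model. Assumptions: every article carries a `Source:` line, cross-references use `[[wikilinks]]`, and index and log files are exempt from the checks.

```python
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[([^\]|]+)")

def lint(wiki: Path) -> dict:
    """Report-only pass: flag articles with no Source: line and pages
    nothing links to. No files are changed; fixes wait for approval."""
    pages = list(wiki.rglob("*.md"))
    articles = [p for p in pages
                if not p.name.startswith("_") and p.name != "log.md"]
    inbound = set()
    for page in pages:                        # collect every wikilink target
        inbound.update(WIKILINK.findall(page.read_text()))
    report = {"unsourced": [], "orphans": []}
    for a in articles:
        if "Source:" not in a.read_text():
            report["unsourced"].append(a.stem)
        if a.stem not in inbound:
            report["orphans"].append(a.stem)
    return report
```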

**Log.** Every operation appends one line to wiki/log.md. Append-only, timestamped, parseable with grep. Gives the wiki a memory of what happened when.
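The log line is simple enough to pin down in code. A sketch only: the pipe-delimited field order (time, operation, description, files) is an assumption, not an official format.

```python
from datetime import datetime, timezone
from pathlib import Path

def append_log(wiki: Path, op: str, desc: str, files: list) -> str:
    """Append one timestamped line to wiki/log.md. Opened in append
    mode only: existing lines are never rewritten."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    line = f"{stamp} | {op} | {desc} | {','.join(files)}\n"
    with (wiki / "log.md").open("a") as log:
        log.write(line)
    return line
```

Because every line shares the same shape, `grep '| ingest |' wiki/log.md` answers "what did I compile last month" instantly.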

The CLAUDE.md File (The Part Nobody Shows)

Every walkthrough online calls this file "the brain" and never opens it on camera. Here is the exact structure that makes the pattern survive across sessions.

The file is about 60 lines. It opens by telling the model who it is:

> You are the librarian of this vault. The wiki/ folder is your domain. You write and maintain every file in wiki/. The human rarely edits wiki files directly.

That sentence is load-bearing. Without it, Claude Code treats wiki files like any other files and starts deferring to whatever it sees there. With it, the model takes responsibility and rewrites pages confidently when new sources conflict with old ones.

The next section defines the four operations with numbered procedures. When you type compile, the model runs seven specific steps in order. When you ask a question, the model runs five specific steps. When you type lint, the model produces a seven-section report. These procedures are what makes the behavior consistent across sessions.

The conventions section enforces the rules that keep the wiki honest:

  • Every wiki article cites the raw source file it was compiled from.
  • Every article includes a Key Takeaways section.
  • File names use lowercase with hyphens.
  • Wikilinks are required for every cross-reference.
  • Bullets over paragraphs.
  • Never invent claims. Flag gaps in an Open Questions section.

The citation rule is the hallucination fix. Every article has to name the raw file it came from. If the model writes something the source does not say, the next lint pass catches it. That is how you keep a wiki honest at 200 articles.
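Put together, the conventions imply a page shape. A plausible skeleton (the title, source path, and link targets here are hypothetical; the exact template ships in the download):

```markdown
# Progressive Context Management

Source: raw/claude-blog-agent-patterns.md

Two- to four-sentence intro summarizing the concept in plain language,
compiled from the source above.

## Key Takeaways
- Bullets over paragraphs.
- Every claim traceable to the cited raw file.

## Related
- [[prompt-caching-strategies]]
- [[composable-general-tools]]

## Open Questions
- Gaps the source does not answer go here, never invented claims.
```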

The full file is in the downloadable vault template linked at the bottom of this post.

Live Query Walkthrough

Three query patterns work well against a wiki built this way.

**Direct lookup.** Ask a specific factual question about a single article. The model reads the master index, finds the topic, reads the article, answers. Three file reads.

**Cross-topic synthesis.** Ask a question that spans multiple topic folders. The model reads the master index, then multiple topic indexes, then 2 to 4 articles across topics, and synthesizes. Six file reads for a complex question.

**File-back synthesis.** Ask a question and tell the model to file the answer back into the wiki. The model produces the answer as a new wiki article in the appropriate topic folder. Your exploration compounds into the knowledge base. Every future query benefits from this answer being present.

All three patterns run in seconds. None of them touch a vector database.

Vector RAG vs the Wiki Pattern

The question everyone asks is whether this replaces vector RAG. The honest answer: below 500 articles, the wiki wins on four of five factors.

| Factor | Karpathy Wiki | Vector RAG |
| --- | --- | --- |
| Infrastructure | Folder of markdown files | Vector DB, embedding model, hosting |
| Setup time | 15 minutes | Hours to days |
| Scale ceiling | ~500 articles | Millions of chunks |
| Human browsable | Read and navigate freely | Black box |
| Outputs compound | Queries file back as new articles | Chat is ephemeral |

Vector RAG wins on scale. Wiki wins on everything else. Below 500 sources, the wiki is strictly better for solo operators. Above that, hybrid makes sense: wiki for structured synthesis, vector for semantic fallback across long-tail retrieval.

Scale Ceiling and Hallucination

Two objections come up every time this pattern gets posted.

**Scale.** The wiki starts breaking down around 500 articles, because the master index stops being a reliable navigation layer. The fix is either to split into multiple topic-specific vaults with a top-level router, or to bolt on a small BM25 or hybrid search tool over the markdown files. Karpathy mentions using a local search tool at larger scale. You probably will not hit this ceiling for a year.
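If you hit the ceiling before you want to split vaults, a handful of lines gets you a crude lexical fallback. A sketch only: plain TF-IDF-style scoring, far simpler than the real BM25 tools mentioned above, but the same idea of routing past a saturated master index.

```python
import math
import re
from collections import Counter
from pathlib import Path

TOKEN = re.compile(r"[a-z0-9']+")

def keyword_search(wiki: Path, query: str, k: int = 3) -> list:
    """Rank markdown files by overlap with the query terms,
    weighting rare terms higher (TF-IDF style, not true BM25)."""
    docs = {p: Counter(TOKEN.findall(p.read_text().lower()))
            for p in wiki.rglob("*.md")}
    n = len(docs)
    terms = TOKEN.findall(query.lower())

    def score(tf: Counter) -> float:
        total = sum(tf.values()) or 1
        s = 0.0
        for t in terms:
            df = sum(1 for c in docs.values() if t in c)  # document frequency
            if df:
                s += (tf[t] / total) * math.log(1 + n / df)
        return s

    ranked = sorted(docs, key=lambda p: score(docs[p]), reverse=True)
    return [p.stem for p in ranked[:k] if score(docs[p]) > 0]
```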

**Hallucination drift.** If the model writes a wiki page that drifts from the source, the error propagates into every future query. The fix is in the schema file above. Every article cites its raw source. Every lint pass checks citations against sources. Flag anything unsourced, review it, correct it. This is not automatic safety. It is a maintenance loop the model runs for you.

The Free Template

Everything in this post is in a ready-to-use Obsidian vault template. Folder structure, the full CLAUDE.md with all four operations, three example pages (entity, concept, source summary) so the model has shape references, a starter master index, a starter log, and a README with a two-minute setup.

Download the template. Open the folder as a vault in Obsidian. Run Claude Code in the root. Drop a sample article in raw/. Type compile. You have a working self-maintaining wiki in under 90 seconds.

[Download the Karpathy Vault Template]

What Comes Next

The wiki pattern is the foundation. Once it works, the useful extensions start showing up:

  • **Multi-vault federation.** One vault per domain, a top-level CLAUDE.md that routes queries across them.
  • **Auto-ingest.** A cron job that watches a Gmail label, a Slack channel, or an RSS feed and drops new items into `raw/` automatically. The model processes on schedule.
  • **Agent handoff.** Separate agents with different roles all reading from the same wiki. Research agent ingests, executive assistant queries, content agent publishes.
  • **Voice note ingestion.** Record a voice memo, transcribe with Whisper, drop into `raw/`, the model files it as a dated journal entry cross-linked to relevant concepts.

All of these reuse the same vault template. The pattern compounds.
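The auto-ingest idea needs surprisingly little code once something upstream drops files into a watch folder. A sketch: the fetchers (Gmail, Slack, RSS) are out of scope here; this is just the cron-friendly sweep that feeds raw/ without ever overwriting it.

```python
import shutil
from pathlib import Path

def sweep(watch: Path, raw: Path) -> list:
    """Move new markdown files from the watch folder into raw/.
    Run once per cron tick. Never overwrites: raw/ is immutable."""
    raw.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(watch.glob("*.md")):
        dest = raw / f.name
        if not dest.exists():
            shutil.move(str(f), str(dest))
            moved.append(f.name)
    return moved
```

Schedule it with cron (say, every 15 minutes), then open a session with "compile everything new in raw/" to close the loop.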

Key Takeaways

  • The wiki pattern compiles raw sources once into a persistent structured artifact instead of retrieving from them on every query.
  • Three layers: raw (immutable source of truth), wiki (LLM-owned synthesis), schema (CLAUDE.md configuration).
  • Four operations: ingest, query, lint, log.
  • The CLAUDE.md file is the load-bearing piece every tutorial glosses over. The full version is in the free template.
  • Below 500 sources the wiki beats vector RAG on almost every axis.
  • Download the template, open in Obsidian, run Claude Code, start compiling. Under 90 seconds from zero to working.
  • Video walkthrough: [YouTube link]

Download the template: [link]

Free community: https://www.skool.com/stride-ai-academy-7057

Transcript

    Everything you're looking at right now

    was built 100% with Claude Code and

    Obsidian. Every wiki page, every

    backlink, every summary, I simply

    dropped in raw articles, raw transcripts

    into a folder and walked away. When I

    came back, I had this. Now, this entire

    pattern didn't come from me. This came

    from a Andre Karpathy tweet, which went

    mega viral, and this whole entire setup

    takes about 15 minutes, and you're going

    to be getting access to this finished

    vault template at the this video 100%

    for free, no paywall or anything like

    that. Now, you've probably saved

    articles, podcasts, notes, whatever the

    case is, you know, in Notion, PDFs,

    podcast transcripts, and you meant to

    review these, but you never got around

    to them, and part of the reason of that

    is is because none of it is actually

    searchable. None of it's connected, it's

    just scattered all over the place, and

    none of it actually feeds into your

    work. And every second brain system you

    may have tried out there, whether it's

    Notion, different systems and setups,

    they typically fail for the same reason.

    You stop maintaining it, and the reading

    and thinking was never really the

    problem, the bookkeeping was. Now, Andre

    Karpathy just posted the fix on Twitter,

    it went viral, he posted a gist as well,

    which really dives deep, and all this

    will be linked, of course, in the

    description down below. I'm also going

    to be giving you a free resource, it's a

    12-page document that literally goes

    over everything, goes over in-depth this

    entire system from start to end, which

    we're, of course, going to dive deep

    into in today's video. I also will be

    giving you this exact template, so you

    can literally just copy and paste and

    start using this by the end of this

    video for your own knowledge system.

    Now, before we dive in, just to give you

    some quick context in case you don't

    already know, who is Andre Karpathy?

    Well, he was a part of the founding team

    at OpenAI, he was the head of AI at

    Tesla, built the entire autopilot vision

    stack, he created one of the most

    popular deep learning courses on the

    internet. You may have seen his recent

    GitHub project go viral, auto research.

    And when this guy posts a workflow for

    managing knowledge, or really anything

    about AI that he posts, it's worth

    paying attention to, and really everyone

    just listens. So, currently he is

    running a 400,000 word personal research

    wiki with no vector database, no

    embeddings, no retrieval chain, just

    markdown files and Obsidian, and one

    schema file that Claude Code reads every

    single session. Now, in the next 10

    minutes or so, I'm going to show you the

    entire pattern so you have a deep

    understanding of it, every line of the

    schema file that nobody else wants to

    explain. I'm going to show you a live

    compile, a live query, a live lint, and

    by the end of this video, you will have

    a working system and a template that you

    can run on your own stuff. Now, every

    version of this pattern has the same

    three layers. Get these right and

    nothing else really matters. So, layer

    one is raw. This essentially is your

    personal inbox as a human. So, what goes

    in here are articles, papers,

    transcripts, screenshots, anything that

    you dump in here that you want the LLM

    to read, and that's where it's going to

    read is from this raw folder. Now, keep

    in mind the LLM is never going to write

    into this folder. This is literally just

    an inbox for you to dump things in, and

    this is immutable. It is your source of

    truth. Now, layer two right here is the

    actual wiki, so this is where the LLM

    files live. It writes summaries, it

    writes entity pages, it writes concept

    pages, it builds an index, it maintains

    cross-links between every article. You

    read this layer in Obsidian. Now, you as

    the human do not write in this layer.

    Not because you can't, but because every

    edit that you make in here is one that

    the model cannot predict in the next

    session, and the whole system starts to

    drift. And then at the top here for

    layer three is the schema. This is one

    claude.md at the vault root. And this is

    what turns a generic Claude Code session

    into a disciplined librarian. Every

    session it reads this file first, every

    operation follows rules from this file,

    and this is the core piece of the

    system. Now, if you want a deep dive

    into claude.md files, make sure to check

    out this video right here I did a few

    days ago on claude.md files and how to

    properly structure them, but of course,

    like I mentioned, I'm going to be giving

    you this free template, which includes a

    structured Claude MD that you can use

    out of box. So, raw, wiki, schema. That

    is the whole architecture. Now, let's

    dive into the actual layers and how this

    all works. So, there's really four

    operations, and we're going to start off

    with the first one, which is ingest. So,

    this is where you drop one source in the

    raw folder, and you tell Claude Code to

    then compile it. And in one pass, it can

    touch 10 to 15 wiki pages. Here's what

    it actually does though under the hood.

    So, it reads the raw file in full, then

    it runs checks in the master index to

    see if the topic already exists. If yes,

    it updates the existing article and adds

    backlinks from any related pages. If no,

    it creates a new topic folder with its

    own index file. Either way, it updates

    the master index and appends a line to

    the log, and the wiki has just

    compounded by one source. All right, so

    let me show you how this actually works

    in action. So, first things first,

    you're going to want to download the

    Obsidian web clipper. So, link to this

    will be in the resource below, and this

    whole document, as well as all the

    different resources, templates, etc.

    from this video and others is available

    in our free Stride AI Academy. Now, once

    you go ahead and actually download this,

    you'll see the Obsidian icon right here

    for the extension. We can go ahead and

    click on it. You're going to want to

    click on settings to open up the

    Obsidian settings. You'll see right

    here, I actually selected the name of my

    specific vault. By default, it will save

    it to the open vault. You can also

    create new templates for how it's going

    to save it, or use this default one

    right here, and all you're going to want

    to do is just change this note location.

    It's going to be clippings by default,

    but you can change this to raw, or

    whatever you want to have your actual

    intake folder to be. Once we have that

    set up, we're going to find a blog or

    whatever piece of information that you

    want to actually ingest into the system.

    For this, we're going to be using one of

    Claude's blog posts right here. I'm just

    going to go ahead and click on this, and

    then I'm going to click add to Obsidian.

    Next, it's going to ask me to open

    Obsidian. And boom, here you can see we

    have this entire article with different

    properties, such as the title, the

    source, the author, when it was

    published, created, a description, any

    specific tags. And you can see here, it

    literally pulled in everything,

    including the images. Now, by default,

    it won't actually pull in the images,

    but we're using a plugin right here

    called local images plus, which by

    default is actually installed in the

    template that I'm providing you for

    free. Actually, every single plugin that

    I cover is actually installed in this

    template by default. But if you already

    have an Obsidian setup and you're

    setting this up in there, or whatever

    the case is, maybe you just need to

    install this plugin yourself. You can

    just go over to community plugins, make

    sure you turn them on, and then you're

    going to want to search over here,

    browse, and you're going to want to

    search for the specific plugins that we

    cover. So, I have DataView installed

    here. I also have local images plus, as

    well as terminal. So, the terminal one

    right here, you can actually use Claude

    Code if you want in Obsidian if you

    don't want to leave uh Obsidian

    whatsoever. I personally usually prefer

    using it in something like Cursor's

    terminal or VS Code, and I have Obsidian

    and my actual IDE open at the same time.

    So, here you can see we have that same

    Claude blog post that we just saved into

    our Obsidian, and I'm going to say to

    Claude, compile, and then I'm linking to

    that specific uh blog post right here,

    that markdown file, compile this one

    into the wiki. That's all I'm going to

    say, and Claude's just going to do its

    magic. It's going to take maybe uh 30

    seconds to a minute, depending on how

    much you're compiling, and it's going to

    actually go about the process. So, watch

    what happens. It's going to read the

    article, it's going to identify the core

    topics, going to decide what needs its

    own concept page. If the topic doesn't

    exist yet, it's going to then write the

    summary, it's going to write a key

    takeaway section, it's going to add

    inbound and outbound wiki links, and

    then it's going to update the topic

    index, and it's also going to update the

    master index, and then it's going to log

    the entry. All from one simple command,

    which, of course, is powered by our

    claude.md file. So, before Claude Code,

    Obsidian was kind of like a big scary

    tool for a lot of people because you

    have to do all these different things,

    backlinks. It was great for bookkeeping

    your notes and knowledge, but it's

    something that humans aren't really

    going to do, and they're just going to

    abandon, you know, 15 edits across eight

    files from one source, you would never

    personally do this manually, but Claude

    Code does this for us in 30 seconds, and

    it's easy as that. So, if you start

    building this up, you do it 10 times,

    you start to have a real knowledge base,

    and if you do it maybe like 100 times or

    a couple hundred times, you know, the

    graph view is going to start looking

    like actual research. And boom, it's

    done. You can see we have a new topic,

    which is agent design patterns with five

    articles. So, we have three key

    patterns, composable general tools,

    progressive context management, prompt

    caching strategies, and declarative tool

    designs. You can see it also updated the

    master index, the log.md, plus added

    backlinks with a total of 11 files

    touched. And like I said, the template

    that I'm providing you for free with

    this entire knowledge system comes with

    a complete claude.md file, and this is

    really what makes this system work. But

    you can, of course, customize this if

    you want to change the system for your

    specific flow. You can see here, it

    starts off saying, "You are the

    librarian of this vault. The wiki

    {slash} folder is your domain. You

    write, you maintain every file in this

    wiki. The human rarely edits wiki files

    directly." Now, this basically is

    defining the ownership. Without it,

    Claude Code treats wiki files like any

    other file and starts deferring to

    whatever it sees there. But with it,

    it's going to start taking

    responsibility and write these pages

    confidently with new source conflicts.

    So, you can see here for ingest, read

    the raw file in full, identify the core

    topic or topics, check wiki master index

    for matching topic folder. If the topic

    exists, update or extend the relevant

    articles and add backlinks from touched

    pages. If the topic does not exist,

    create a new folder under wiki with a

    lowercase hyphenated name and create a

    underscore index.md inside of it. Every

    wiki article must include a top level

    title, source which is path to raw md

    file line, a two to four sentence intro

    paragraph, a key takeaway section with

    bullet points, a related section with

    wiki links to three to eight related

    pages, update the topic underscore

    index.md,

    update the wiki master index if a new

    topic was created, append one line to

    wiki.log.md

    if the source spans multiple topics

    create articles in both and cross link.

    We have about 10 steps here and the

    model's going to follow them in order

    every single time because this file in

    the cloud.md tells it what to do and

    it's loaded in every single conversation

    so you won't have to prompt it again.

    Next we have the query section so this

    is triggered when the human asks a

    question so read wiki master index first

    then read the matching topic index.md,

    read one to three specific articles in

    full, synthesize the answer with

    citations, and then offer to file

    substantial answers as new wiki

    articles. So three to four file reads to

    answer any question. No vectors, no

    embeddings, no cosine similarity, BM25,

    the index file is the retrieval here and

    that's because the model maintains it

    for you. All right, so next is the lint

    which is the health check. So live is

    the append only record and we're going

    to run both the query and the lint live

    in just a second and you can see here

    this is going to be triggered simply by

    just saying the word lint or audit the

    wiki. It's basically going to read every

    file in the wiki and produce a report

    covering contradictions, stale claims,

    orphan pages, missing concepts, missing

    cross links, unsourced claims, and

    suggested new articles and then output

    the report to output-lint-report

    and then the date here and then wait for

    approval before fixing. And then for log

    every operation appends one line to the

    wiki.log.md in this

    time, operation, short description, and

    then the files touched. And this is

    append only it never rewrites existing

    lines. And then we have conventions so

    things like citations are required every

    wiki article includes source line, key

    takeaway section is required on every

    article, file names are lower case

    hyphenated, use wiki links for every

    cross reference, bullets over

    paragraphs, never invent claims, flag

    gaps and open questions, and then flag

    contradictions when found. And then when

    the human asks something outside of the

    rules ask a clarifying question, do not

    silently invent a new operation. So the

    citation rule right here is the

    hallucination fix. Every article has to

    name the raw file it came from. If the

    model writes something that the source

    doesn't say that in the next lint pass

    catches it. And that's how we're going

    to keep this wiki honest at scale. And

    that's literally the entire system. It's

    essentially a 60 line or so cloud.md

    markdown file. And this essentially is

    the brain of the entire system and

    that's why I went through it for you

    guys because it's very important for you

    guys to know it. Now just so you guys

    know in our template we do have some

    additional things that Karpathy doesn't

    even cover and that I just felt did add

    value to this system so I'll quickly go

    over them. We also have an AI-research

    folder. So this is the AI's research

    folder. So you're going to see in a

    second yes we can ingest manually

    through the means I just showed you with

    the Obsidian Clipper or just you know

    scraping our own podcast transcripts or

    whatever the case is from different

    sources. You can also get Claude code of

    course to conduct autonomous research on

    the web and save the full clean source

    content into this folder as markdown

    files. And you'll see here we're telling

    Claude that it can write to this folder.

    And files here are immutable once saved,

    do not overwrite, create new files. And

    this separates human curated sources

    which are in the raw folder from AI

    discovered sources which are in the AI

    research folder. And you'll see research

    is triggered when the human asks you to

    research a topic or when a query reveals

    gaps the wiki cannot answer from

    existing sources. What it's going to do

    is search the web for relevant high

    quality sources on this topic and then

    for each source found save the full

    clean content not a summary as markdown

    files in this folder. You can see the

    format here and we're basically just

    giving it some additional stuff for the

    format that it's saving it as. So in the

    doc here if you want to see more in

    depth stuff such as the folder

    structure, the page templates, entity

    page templates that we have within the

    system as well as the concept page

    template, the source summary template,

    you can see all that there. But we're

    just going to move on to the query

    pattern. So there's really three types

    of queries that work well against the

    wiki built this way. The first is the

    direct lookup. We have direct lookup,

    cross topic synthesis, and file back

    synthesis. I'm going to run all three

    against this vault that already has

    about 10 plus articles as well as

    transcripts from some of my YouTube

    videos inside of it. All right, so I'm

    actually just going to use the Obsidian

    terminal right here but you can use

    either or the IDE with cursor, VS code,

    whatever the case is. I'm just going to

    say what are the key points from my

    cloud.md video. And you can see here I

    have my YouTube transcripts which

    includes the cloud.md video. So this is

    a direct query. You can see here let me

    find the relevant raw files and query

    the wiki for additional coverage.

    There's a wiki article specifically

    about cloud.md best practices. And boom

    here you can see this is exactly what I

    covered in that video. If you didn't

    haven't watched that video I definitely

    suggest you to go watch it. We covered

    two different research papers right

    here. Why Claude ignores your

    instructions, the fix so six different

    sections right here, and then some

    practical tips.

    All right, so that's the first query

    direct lookup. Next is the cross topic

    synthesis. So I'm going to ask what

    techniques are mentioned across multiple

    videos. You can see here nine videos

    across 11 topics. Let me read the topic

    indexes to trace which videos feed into

    which articles then identify overlapping

    techniques. And boom here we got our

    answers. So we can see technique number

    one context/token management so that's

    from videos one, two, three, four, five.

    And we can see what it pulled there and

    then we can see technique two prompt

    caching from videos one, three, and

    four.

    And then we can see number three

    composable general tools over task

    specific tools video three, four, and

    nine. And then dead weight pruning,

    reevaluate assumptions videos two and

    three, and then cost monitoring and

    mitigation videos one, five, and nine,

    security boundaries, and then persistent

    always on agents. All right, so query

    three file back synthesis. So this is

    where the answer becomes a new wiki.

The cool thing here is that exploration compounds into new knowledge in your knowledge base. The prompt: "Based on everything in the wiki, what are the main approaches to long-term memory for AI agents? Save your answer as a new wiki article and cross-link the sources you referenced."

When it finishes, there is a brand-new article in the vault, filed under the agent-design-patterns wiki: "Long-Term Memory Approaches." It links to related documents (Multbook VPS Setup, AI Workflow Builder, CLAUDE.md Best Practices) and synthesizes four approaches from across the wiki: memory folders, compaction, structured config files, and external RAG. The overarching finding: simpler, model-native memory is replacing external infrastructure as models get more capable, which is exactly what this system demonstrates in front of our eyes.

This query is great because everything you ask now adds value to the system. You never lose a good answer to chat history the way you may have in the past, and this is the part that turns a knowledge base into a research partner.
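For intuition, a synthesized article might look something like this. The structure follows the vault's conventions, but the file names and frontmatter fields here are illustrative, not the exact output:

```markdown
---
type: synthesis
sources: [video-01, video-03, qmd-research]
---

# Long-Term Memory Approaches for AI Agents

Four approaches appear across this wiki: memory folders, compaction,
structured config files, and external RAG. The overarching trend:
simpler, model-native memory is replacing external infrastructure.

Related: [[ai-workflow-builder]], [[claude-md-best-practices]]
```

Because the answer is a plain markdown file with wikilinks, it is immediately browsable in Obsidian and retrievable by every future query.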

The AI Research Query

Next is a query the AI deliberately cannot answer from the current knowledge base. I ask: "Research what QMD is and how it works as a research layer on top of markdown wikis. Save what you find to AI research." Claude runs its fetch tool several times against different pages and writes up the results.

If you have not heard of QMD, it is a BM25-based retrieval tool for markdown created by Tobi Lütke, the founder of Shopify. I will dive deeper into it in future posts, and it is mentioned in the resource doc below. The research gets saved in the AI research folder, and the knowledge-search wiki now has a QMD page covering BM25, vector/semantic search, and hybrid retrieval with re-ranking, broken into articles like QMD Overview, Indexing and Chunking, and MCP Integration.
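QMD layers vector search and re-ranking on top, but the BM25 half is simple enough to sketch in plain Python. This is a toy scorer for intuition, not QMD's implementation:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc against the query with classic Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)
    # document frequency: how many docs contain each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    n = len(docs)
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "bm25 ranking over markdown files",
    "vector embeddings for semantic search",
    "hybrid retrieval with bm25 and re-ranking",
]
# the doc containing both query terms should rank highest
scores = bm25_scores("bm25 markdown", docs)
```

Keyword scoring like this is why a BM25 layer bolts so cleanly onto a markdown vault: there is no index to host and nothing to embed.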

To recap the index structure: every single wiki has its own index, and a master index links to each specific wiki. This is progressive disclosure, exactly the pattern I have covered in other videos, including the CLAUDE.md one, and it is how Claude Code skills work. The master index gives a short description of each wiki so the model can decide which one to enter, and once inside, that wiki's own index carries a description of each specific article.
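Concretely, the two index layers look roughly like this (wiki and article names are illustrative):

```markdown
<!-- index.md (master) -->
# Master Index

- [[knowledge-search/index]]: search and retrieval tooling (QMD, BM25, hybrid RAG)
- [[agent-design-patterns/index]]: agent memory, caching, and tool design

<!-- knowledge-search/index.md (per-wiki) -->
# Knowledge Search

- [[qmd-overview]]: what QMD is and the problem it solves
- [[indexing-and-chunking]]: how sources are split and indexed
- [[mcp-integration]]: exposing search to Claude Code
```

The model only loads a deeper layer when the short description above it says the answer lives there, which is what keeps the context footprint small.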

Keeping the Wiki Healthy

If you stopped right here, you would have a fully working system. The next step is keeping it healthy at 100, 200, or 300 articles. All I do is run lint. The lint operation reads every file in the wiki, cross-references them, and returns a report: contradictions, stale claims, orphan pages, missing cross-links, unsourced claims. All the things we humans tend to miss, it can detect. When the pass completes, the report is saved in the output folder.
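The lint pass itself is Claude reading files, but the purely mechanical checks are easy to picture. Here is a sketch of orphan-page detection over a vault, assuming Obsidian-style `[[wikilinks]]` whose targets are file stems (a simplification of how Obsidian resolves links):

```python
import re
from pathlib import Path

# matches the target portion of [[target]], [[target|alias]], [[target#heading]]
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def find_orphans(wiki_dir):
    """Return pages no other page links to, one of the checks lint performs.
    The top-level index is allowed to be unlinked."""
    pages = {p.stem: p for p in Path(wiki_dir).rglob("*.md")}
    linked = set()
    for page in pages.values():
        for target in WIKILINK.findall(page.read_text()):
            linked.add(target.strip())
    return sorted(name for name in pages if name not in linked and name != "index")
```

An orphan is not necessarily wrong, but a page nothing points to is a page the index-driven navigation can never reach, which is why lint flags it.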

In this run there were no contradictions, no stale claims, and no orphan pages, but there were six missing cross-links, three unsourced claims, and five suggested articles, each detailed in the report. The model then waits for approval before applying any fixes; once I approve, all of them are applied. Run lint weekly or after any big ingest batch. It takes 5 to 10 minutes, give or take, and leaves you with a healthy wiki plus a log entry so you know when you last ran it.

Does This Replace Vector RAG?

The question everyone has been asking is whether this actually replaces vector RAG. The honest answer: below roughly 500 articles, the wiki wins on four of five factors.

- Infrastructure: vector RAG needs a database, an embedding model, and hosting. The wiki is literally a folder of markdown files. Wiki wins on simplicity.
- Setup time: vector RAG can take a few hours to a couple of days depending on complexity. The wiki is a 15-minute setup; you can have it fully running by the time you finish this post. Wiki wins.
- Scale: there is a ceiling. Vector RAG handles millions of chunks, while the wiki starts breaking down past maybe 500 to a few thousand articles. Vector wins here, and this is one of the axes that actually matters.
- Human browsability: vector RAG is a black box. The wiki is a set of pages you can read, navigate, and view in a nice GUI like Obsidian. Wiki wins.
- Compounding outputs: with vector RAG, the chat is ephemeral. Wiki queries file back as new articles. Wiki wins.

So: four out of five below about 500 sources. Above that, a hybrid approach makes sense, with the wiki for structured synthesis and vector search as a semantic fallback for long-tail retrieval. This could change as models progress, which is also why I keep referencing QMD; it is linked in the resources as well, and I will show how to use it in future posts. You could also substitute other vector or embedding strategies. QMD is just the one I am referencing here.

Objections

A few objections come up every time this pattern gets posted on X, so let me handle them directly.

First, scale. We covered this already, but as a baseline: the wiki starts breaking at around 500 articles because the index stops being a reliable navigation layer, similar to how Claude Code degrades when you bloat it with too many skills. Above that, either split into multiple topic-specific vaults or bolt a small search tool onto the markdown files. As mentioned, QMD, the same search tool Karpathy actually uses, is linked in the doc. You probably will not hit the ceiling for quite a while, depending on how heavily you use your vault, but when you do, reach for a separate vault or a hybrid strategy with something like QMD.

Second, hallucinations. This is the big one with anything involving AI models: if the model writes a wiki page that drifts from the actual source, the error propagates into every future query, essentially like a virus. The fix is the CLAUDE.md file, which is why we took such a deep look at it earlier. Every article cites the raw file it came from, and every future lint pass checks those citations against the actual sources, flags anything unsourced, reviews it, and corrects it.
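The citation check is the part worth internalizing. A rough sketch, assuming each article carries a `source:` frontmatter line naming a file in raw/ (the field name and layout are illustrative, not a schema the post defines):

```python
import re
from pathlib import Path

# a frontmatter-style line such as: source: some-talk.md
SOURCE_LINE = re.compile(r"^source:\s*(\S+)", re.MULTILINE)

def unsourced_or_broken(wiki_dir, raw_dir):
    """Articles whose source citation is missing, or points at a raw/
    file that does not exist: candidates for the next lint review."""
    flagged = []
    for page in Path(wiki_dir).rglob("*.md"):
        match = SOURCE_LINE.search(page.read_text())
        if match is None or not (Path(raw_dir) / match.group(1)).exists():
            flagged.append(page.name)
    return sorted(flagged)
```

Because raw/ is immutable, a citation that resolves today keeps resolving, so this check only ever surfaces genuine drift or missing attribution.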

This is not automatic safety, but it is a maintenance loop the model runs for you.

Wrapping Up

The full document breaking down everything covered here in depth, plus some additional reading and the complete Karpathy Obsidian Vault Starter Kit, is available for free in our community, Stride AI Academy. There is no paywall, and you can network with me and other like-minded AI entrepreneurs and enthusiasts.

I tried to be as thorough as possible because, on the surface, Karpathy's tweet alone can be difficult to unpack. The goal was to show the exact process and hand you the starter kit so you can get going as soon as possible, because pairing Claude Code with something like Obsidian, and actually managing your second brain efficiently, is genuinely valuable.

I have personally been using Obsidian for about the last four to five months, across a few different vaults, and it has helped me tremendously with content creation, business, personal life, and knowledge management generally. I am excited to share more about my custom vaults in the future, along with more Claude Code and AI tutorials. If I missed anything, or you have insights or ideas about this vault, Karpathy's setup, or Claude Code in general, let me know, and join the Stride AI Academy below.

    Enjoyed this article?

    Join the Stride AI Academy for more insights and connect with 1,000+ builders.

    Join the Academy