Below is a practical “menu” of the three most common ways people are combining local LLMs and Kiwix ZIM files today. Pick the style that matches your hardware, patience, and privacy needs.
Recipe 1: Install the official “llm-tools-kiwix” plug-in inside the same Python environment as the llm CLI tool. It works with any local model that llm can see (Llamafile, Ollama, LM-Studio, etc.).
# 1. Install the plug-in
pip install llm-tools-kiwix # or: llm install llm-tools-kiwix
# 2. Drop your .zim files in the folder where you will run commands
# (or export KIWIX_HOME=/path/to/zim/folder)
# 3. Ask anything – the LLM automatically searches the ZIMs for you
llm -m deepseek-r1:7b --tool kiwix_search_and_collect \
    "Explain how a transistor works" --tools-debug
Recipe 2: Clone the “zim-llm” project. It extracts every article from your ZIMs, chunks them, creates embeddings, and stores them in Chroma/FAISS so a local LLM can do retrieval-augmented generation; a stand-alone sketch of the same idea follows the commands below.
# 1. Clone & install
git clone https://github.com/rouralberto/zim-llm.git
cd zim-llm && ./setup.sh # creates venv + installs libzim, sentence-transformers, langchain, etc.
# 2. Drop ZIM files into zim_library/
curl -L -o zim_library/en_wiki.zim "https://download.kiwix.org/zim/wikipedia_en_all_maxi.zim"   # full English Wikipedia maxi is ~100 GB; test with a smaller ZIM first
# 3. Build the vector index (one-time, hours for big ZIMs)
source zim_rag_env/bin/activate
python zim_rag.py build
# 4. Chat with your offline archive
python zim_rag.py rag-query "Compare Newton’s and Leibniz’s notation for calculus"
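For a feel of what zim-llm is doing under the hood, here is a minimal stand-alone sketch of the same extract, chunk, embed, retrieve loop. It is not the project's code: it leans on the ZIM's built-in full-text index to pick candidate articles instead of walking the whole archive, and the file name, embedding model, and chunk size are illustrative.

# Stand-alone sketch of the extract -> chunk -> embed -> retrieve idea behind zim-llm.
# Requires: pip install libzim beautifulsoup4 sentence-transformers chromadb
from libzim.reader import Archive
from libzim.search import Query, Searcher
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer
import chromadb

zim = Archive("zim_library/en_wiki.zim")                  # any ZIM with a full-text index
embedder = SentenceTransformer("all-MiniLM-L6-v2")
collection = chromadb.Client().create_collection("zim_chunks")   # in-memory; use PersistentClient to keep it

# Pull a handful of candidate articles via the ZIM's built-in full-text search,
# then chunk and embed them (zim-llm instead indexes every article once, up front).
search = Searcher(zim).search(Query().set_query("calculus notation"))
for path in search.getResults(0, 5):
    html = bytes(zim.get_entry_by_path(path).get_item().content).decode("utf-8", "ignore")
    text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
    chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]
    if not chunks:
        continue
    collection.add(
        ids=[f"{path}-{n}" for n in range(len(chunks))],
        documents=chunks,
        embeddings=embedder.encode(chunks).tolist(),
    )

# Retrieve the chunks nearest to the question; paste them into your local LLM's prompt.
question = "Compare Newton's and Leibniz's notation for calculus"
hits = collection.query(query_embeddings=embedder.encode([question]).tolist(), n_results=4)
print("\n\n".join(hits["documents"][0]))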
Recipe 3: If you prefer point-and-click, run an MCP (Model Context Protocol) server that exposes the ZIMs as tools your chat GUI can call.
pip install openzim-mcp
mkdir ~/zim-files && cd ~/zim-files
# (download a small .zim first, e.g. wikipedia_en_simple)
# start the server
openzim-mcp --mode simple ~/zim-files
Then in LM-Studio (or any MCP client) add a new MCP server pointing to http://localhost:8000 and tick the “search_with_filters” tool. From then on you can simply ask questions in the chat and the LLM will transparently pull the relevant Wikipedia articles from your local ZIM file.
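If you want to confirm from code what the server actually advertises before wiring up a GUI, the official mcp Python SDK can list its tools. The transport, port, and /mcp path below are assumptions; check the openzim-mcp README for the real values.

# Sanity check: list the tools an MCP server advertises.  Requires: pip install mcp
# Assumes the server speaks streamable HTTP at http://localhost:8000/mcp; adjust
# to whatever transport and port openzim-mcp actually uses.
import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    async with streamablehttp_client("http://localhost:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("tools:", [t.name for t in tools.tools])

asyncio.run(main())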
All three recipes assume a local model runner such as Ollama (ollama pull llama3.1:8b) or LM-Studio. Pick one of the three recipes, keep everything on localhost, and you have a completely private, off-grid “AI Internet” running on your own hardware.