NixCon 2024

Llamafiles plus RAG to have a chat with NixOS documentation running locally
2024-10-26, Arena

Picture this: you're on a remote tropical island, with limited internet access and only your ultraportable laptop, which lacks a dedicated GPU. Now imagine you need to search resources like nix.dev, various GitHub configurations, or books like NixOS in Production. Basic indexing and grepping just won't cut it for handling docs, code, and especially PDFs. Plus, relying on exact matches feels outdated. What you really want is to ask natural language questions about the problems you're solving, and get comprehensive, detailed responses—using only the latest, most relevant docs and references.

Until recently, this idea was pure science fiction. But with the rise of large language models (LLMs), it's become reality. However, the usual suspects like ChatGPT have their drawbacks: they don't provide source links, tend to hallucinate, and require careful fact-checking. On top of that, running them is computationally expensive. Most ultraportable laptops lack the necessary GPU power, or, if they have one, it's energy-hungry or inadequate for LLMs. This effectively rules out using LLMs in the tropical laptop scenario. Even with enough compute power, setting up an LLM can be a headache, partly due to the notorious challenges of getting Nvidia CUDA to work smoothly on Linux.

Enter Justine Tunney's llamafile. It's a game-changer: a single executable file that runs on CPUs and across multiple architectures, built on llama.cpp, a C/C++ implementation of LLM inference for various model architectures. This makes it simple to run existing model weights on a CPU, with no GPU required. Combine this with retrieval-augmented generation (RAG), which pulls relevant passages from your chosen docs and GitHub repositories and hands them to the model as context, and you've got a fully functional offline setup for querying NixOS-related content, right from your beachside ultraportable laptop.
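To make the RAG step concrete, here is a minimal sketch of the retrieval half, assuming a toy bag-of-words similarity in place of a real embedding model. It is not the talk's actual pipeline: the sample chunks, the source labels (doc-a, doc-b, doc-c), and the function names are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical sample chunks; a real setup would split your own docs and repos.
CHUNKS = [
    {"source": "doc-a", "text": "A flake is a directory with a flake.nix file that declares inputs and outputs."},
    {"source": "doc-b", "text": "Garbage collection removes unused store paths from the Nix store."},
    {"source": "doc-c", "text": "Overlays let you modify or extend package sets in nixpkgs."},
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    """Return up to k chunks that share vocabulary with the question, best first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]

def build_prompt(question, hits):
    """Assemble a prompt that keeps each passage tagged with its source."""
    context = "\n".join(f"[{c['source']}] {c['text']}" for c in hits)
    return (
        "Answer using only the context below and cite the sources in brackets.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

question = "How do I remove unused store paths?"
hits = retrieve(question, CHUNKS, k=1)
print(build_prompt(question, hits))
```

The resulting prompt would then be passed to the locally running model. Keeping the source label next to each passage is what lets the answer point back to where it came from, which addresses the missing source links mentioned earlier.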


What level of experience in Nix is the talk addressed to?

Beginner: anyone who has to use Nix documentation

Founded Data Science Retreat, the longest-running and most advanced ML training in Europe. Working on similar courses for security engineering (which is why Nix is interesting to me!) and practical reasoning in the AI era. Trying to improve the human side of the human-AI pairing.