Paris 1788
LLM chatbot NPC in engine
Technical Implementation
Challenge
Integrating a large language model into my own Unity / Unreal project without a paid subscription or a dependency on a closed-source provider -> self-hosting. The AI should
- Enact a specific NPC’s personality
- Have access to custom and setting-specific background knowledge
- Be able to recall and reference the chat history
Solution
First, I set up a Jupyter Notebook that would later host my AI.
mkdir "Ollama Paris Chatbot"
cd "Ollama Paris Chatbot"
Many of the next steps require Python. Install it if it isn't installed already, then create a .venv virtual environment with Python as the interpreter. Inside the activated virtual environment I install the essentials for serving my LLM of choice through Ollama.
.\.venv\Scripts\activate
pip install fastapi "uvicorn[standard]"
pip install langchain-ollama
pip install langchain-core
pip install "fastapi[dev]"
From this point on, I can host a script that serves my LLM. Let's set up a new Python file, server.py, that will act as the server. It requires
- a few imports, among them my upcoming chatbot
- the ability to ask my LLM a question with a history
- the ability to clear my history when restarting the chat in-game
from fastapi import FastAPI
from chatbot import RAGChatBot

app = FastAPI()

# One shared chatbot instance, so the chat history survives across requests
chatbot = RAGChatBot()

@app.get("/getanswer/{request}")
def get_answer(request: str):
    response = chatbot.ask(question=request, session_id="default")
    return "Vivienne says: " + response

@app.get("/clearhistory")
def clear():
    chatbot.clear_history()  # note the parentheses: without them, nothing is called
    return "History Cleared"
For my chatbot to know some specific facts, I set up a folder with a few relevant text files populated with custom and general information about the time and setting “Paris 1788”, along with a “knowledge_retriever.py” script:
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_ollama import OllamaEmbeddings
import os

documents_folder = 'paris_knowledge'
docs = []
embedding_model = OllamaEmbeddings(model='mxbai-embed-large:latest')

for filename in os.listdir(documents_folder):
    if filename.endswith('.txt'):
        with open(os.path.join(documents_folder, filename), 'r') as textfile:
            content = textfile.read()
            doc = Document(page_content=filename + content)
            docs.append(doc)

vector_database = Chroma(collection_name='paris', persist_directory='paris_db', embedding_function=embedding_model)
vector_database.add_documents(docs)
retriever = vector_database.as_retriever()
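For illustration, one of the paris_knowledge text files might look like the sketch below. The filename and content are hypothetical; any plain-text facts work, since the retriever only needs chunks of prose to embed.

```text
monarchy.txt
Louis XVI is King of France. The royal treasury is in deep crisis after
costly wars, and the Estates-General has been summoned to meet in May 1789.
```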
This was the last piece required for my "chatbot.py" to be ready for implementation. It contains a system prompt that defines how the AI has to respond:
from langchain_ollama.llms import OllamaLLM
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from knowledge_retriever import retriever

class RAGChatBot:
    def __init__(self, model_name='llama3.2:latest'):
        self.model = OllamaLLM(model=model_name)
        self.store = {}  # Dictionary to store chat sessions

        # Proper LangChain prompt template with message history placeholder
        self.prompt = ChatPromptTemplate.from_messages([
            ("system", '''[...]'''),
            # system prompt includes instructions for how the AI should behave, and what it has to do
            MessagesPlaceholder(variable_name="history"),
            ("human", """{question}""")
        ])

        # Create the chain
        self.chain = self.prompt | self.model

        # Wrap with message history
        self.chain_with_history = RunnableWithMessageHistory(
            self.chain,
            self.get_session_history,
            input_messages_key="question",
            history_messages_key="history",
        )

    def get_session_history(self, session_id: str) -> ChatMessageHistory:
        """Get or create chat history for a session"""
        if session_id not in self.store:
            self.store[session_id] = ChatMessageHistory()
        return self.store[session_id]

    def ask(self, question: str, session_id: str = "default"):
        """Ask a question with proper history handling"""
        # Retrieve relevant documents
        documents = retriever.invoke(question)

        # Invoke with history; the retrieved documents are filled into the
        # (elided) system prompt
        response = self.chain_with_history.invoke(
            {
                "question": question,
                "documents": documents
            },
            config={"configurable": {"session_id": session_id}}
        )
        return response

    def clear_history(self, session_id: str = "default"):
        """Clear history for a session"""
        if session_id in self.store:
            self.store[session_id].clear()
            print(f"History cleared for session: {session_id}")
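The history handling above boils down to a dictionary of per-session message lists. A minimal, LangChain-free sketch of the same pattern (all names here are illustrative, not from the project):

```python
class SessionStore:
    """Keeps an independent message list per session id."""

    def __init__(self):
        self.store = {}

    def get(self, session_id):
        # Create the history lazily on first access,
        # just like get_session_history above
        if session_id not in self.store:
            self.store[session_id] = []
        return self.store[session_id]

    def clear(self, session_id):
        # Empty the list in place, like /clearhistory does for the NPC
        if session_id in self.store:
            self.store[session_id].clear()


store = SessionStore()
store.get("default").append(("human", "Bonjour!"))
store.get("npc2")        # a second, fully independent session
store.clear("default")   # restarting the in-game chat
```

Because each engine conversation can pass its own session_id, multiple NPCs (or multiple players) could keep separate histories against the same server.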
Now just activate the server!
fastapi dev server.py  # (Ctrl + C stops the server)
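Since /getanswer/{request} takes the question as a path parameter, a client has to percent-encode it before appending it to the URL. A quick sketch of building such a request URL (host and port match the FastAPI dev defaults):

```python
from urllib.parse import quote

question = "Who rules France right now?"

# Spaces and reserved characters like '?' are not allowed in a URL path,
# so percent-encode the question before appending it.
url = "http://127.0.0.1:8000/getanswer/" + quote(question)
print(url)
# -> http://127.0.0.1:8000/getanswer/Who%20rules%20France%20right%20now%3F
```

The same encoding has to happen on the engine side before the web request is sent.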
I can now set up my project(s). I won't show the whole setup, but this is the web-request architecture needed to fetch what my hosted server provides.
The relevant lines of the Unity C# script:
using UnityEngine.Networking;
[...]
IEnumerator AwaitRequest()
{
    UnityWebRequest request = UnityWebRequest.Get("http://127.0.0.1:8000/request-string");
    loading = true;
    yield return request.SendWebRequest();
    string answer = request.downloadHandler.text;
    [...]
}
[...]
The "request-string" placeholder stands for one of the endpoints defined above: "getanswer" (followed by the URL-encoded question) or "clearhistory".
In UE5, I didn’t create a front end, but the web requester equivalent is the “Call URL” node:

