Securing large language models with a reverse proxy

In a previous post, I explained how to host a private ChatGPT using Docker and Traefik, but I didn’t spend much time on the security aspect of the project. I see many people asking how to expose their large language model on the Internet and how to secure it. Since most (all?) open-source projects have adopted the OpenAI API, the traffic is standard HTTP. Therefore you can use all the traditional techniques to secure your large language model with a reverse proxy....
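
Because the endpoint is plain HTTP, the usual reverse proxy hardening applies. As a minimal sketch (not taken from the post: the backend image, hostname, router name and the bcrypt hash are all placeholders), basic authentication can be added with Traefik's basicauth middleware through Docker labels:

```yaml
# Hypothetical example: protect an OpenAI-compatible backend with Traefik basic auth.
services:
  llm-backend:
    image: ghcr.io/open-webui/open-webui:main   # placeholder: any HTTP backend works here
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.llm.rule=Host(`llm.example.com`)"
      - "traefik.http.routers.llm.entrypoints=websecure"
      - "traefik.http.routers.llm.tls.certresolver=letsencrypt"
      # users are "name:bcrypt-hash"; generate one with `htpasswd -nB myuser`
      # (note the doubled $$ needed to escape $ inside a compose file)
      - "traefik.http.middlewares.llm-auth.basicauth.users=myuser:$$2y$$05$$replacewithrealhash"
      - "traefik.http.routers.llm.middlewares=llm-auth"
```

The same pattern extends to other Traefik middlewares (IP allow-lists, forwardAuth, rate limiting) without touching the LLM backend itself.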

April 5, 2024

Self-hosted coding assistant with llamafile, continue.dev and docker

There was a recent dramatic improvement in the speed of LLMs on CPU thanks to llamafile’s author. She goes into it extensively on her blog, but the short version is: expect 7-billion-parameter models to be usable on a consumer-grade CPU, even in Q8. It is now certainly possible to self-host a coding assistant with llamafile, continue.dev and Docker on a VPS. Let’s see how to achieve that. I’ll use Docker + Traefik, but you can easily adapt it to anything else (native + nginx, for example)....
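
To give an idea of the shape of such a stack, here is a rough compose sketch, not the post's actual file: the llamafile name, mount path and hostname are placeholders, and the server flags are the usual llamafile/llama.cpp ones (check `--help` for your version).

```yaml
# Hypothetical sketch: run a llamafile's built-in OpenAI-compatible server behind Traefik.
services:
  llamafile:
    image: debian:bookworm-slim
    volumes:
      - ./models:/models          # the downloaded *.llamafile executable lives here
    # llamafile binaries are polyglot shell scripts, so launching them through sh
    # avoids needing binfmt_misc support inside the container
    command: sh /models/mistral-7b-instruct.llamafile --server --host 0.0.0.0 --port 8080 --nobrowser
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.llamafile.rule=Host(`llm.example.com`)"
      - "traefik.http.services.llamafile.loadbalancer.server.port=8080"
```

continue.dev can then be pointed at that endpoint as an OpenAI-compatible provider.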

April 1, 2024

Europe GPU prices update - March 28 2024

With all the buzz about AI these days, let’s have a look at GPU prices in Europe and check which one gives the best “bang for the buck”, as YouTubers like to say. YouTube is filled with people telling you how cheap GPUs are or which model is the best value, but unfortunately most of those people live in the USA. Here in Europe, the story is usually different....

March 28, 2024

Ollama, open-webui, mitmproxy in a docker compose stack, behind traefik

Reading the Ollama Discord channel, I noticed that many people want to self-host their own ChatGPT with Docker but don’t know how to do it. Here’s how to host the whole stack with docker compose. Here’s my docker-compose.yml, including the mitmproxy from the previous article.

```yaml
version: "3"
services:
  ollama:
    build: ollama
    user: 1001:1001
    environment:
      - OLLAMA_HOST=0.0.0.0
      - OLLAMA_DEBUG=1
      - OLLAMA_KEEP_ALIVE=60m
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ollama_models:/home/ollama/.ollama/models
  mitmproxy:
    image: mitmproxy/mitmproxy
    command: mitmweb --web-host 0.0.0.0 --web-port 8080 --mode reverse:http://ollama:11434@11434 --verbose --anticache --anticomp
    depends_on:
      - ollama
    labels:
      - "traefik....
```
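
The excerpt cuts off before the open-webui service mentioned in the title. As a sketch of how that piece could be wired in (this is an assumption, not the continuation of the file above: image tag, hostname and volume name are illustrative), the UI only needs to know where to reach Ollama, and pointing it at mitmproxy keeps every request visible in mitmweb:

```yaml
# Hypothetical sketch, not the author's actual configuration.
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      # go through mitmproxy so every Ollama API call shows up in mitmweb
      - OLLAMA_BASE_URL=http://mitmproxy:11434
    depends_on:
      - mitmproxy
    volumes:
      - open-webui_data:/app/backend/data
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.open-webui.rule=Host(`chat.example.com`)"
      # open-webui listens on port 8080 inside the container
      - "traefik.http.services.open-webui.loadbalancer.server.port=8080"
```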

March 23, 2024

Ollama system prompt

I have recently started to use Ollama and I was unimpressed by some models, as they did not follow instructions, especially regarding their output format. I knew about the model system prompt, but I thought it was baked into the model. Then I found out you can change the system prompt at run time with the /set system command, and immediately most models responded as expected. That was so much better!...
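
For illustration (the model name, prompt and replies below are just examples, not from the post), changing the system prompt inside an interactive session looks like this:

```
$ ollama run mistral
>>> /set system Respond only with valid JSON, no prose.
Set system message.
>>> Give me three European capitals.
{"capitals": ["Paris", "Berlin", "Madrid"]}
```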

March 18, 2024