Let’s say you are using GitOps to deploy your applications on Kubernetes, and you want to be sure you wrote your manifests right. Then kubeconform and kube-score are the tools you are looking for. And you can of course run them on every push or pull/merge request.
kubeconform
By default, kubeconform uses the vanilla Kubernetes schemas, but if you use Custom Resource Definitions, kubeconform can support them too. Use the -schema-location flag to indicate where to find the JSON schemas. It accepts templated URLs, which is handy because there’s a GitHub repository with the most common CRDs out there: https://github.com/datreeio/CRDs-catalog. ...
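As an illustration, a run against that catalog could look something like the following; the exact template variables are documented in the kubeconform README, and manifests/ is just a placeholder path:

```bash
# Validate manifests against the vanilla Kubernetes schemas plus the CRDs-catalog
# (manifests/ is a placeholder for your own manifest directory)
kubeconform -strict -summary \
  -schema-location default \
  -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
  manifests/
```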
The Power of 10: Rules for Developing Safety-Critical Code
Kind of a note to self, a reminder of some NASA programming practices used for JPL code. I’m not a programmer myself but some, if not all, of those rules can be applied to my humble projects.
1. Avoid complex flow constructs, such as goto and recursion.
2. All loops must have fixed bounds. This prevents runaway code.
3. Avoid heap memory allocation.
4. Restrict functions to a single printed page.
5. Use a minimum of two runtime assertions per function.
6. Restrict the scope of data to the smallest possible.
7. Check the return value of all non-void functions, or cast to void to indicate the return value is useless.
8. Use the preprocessor sparingly.
9. Limit pointer use to a single dereference, and do not use function pointers.
10. Compile with all possible warnings active; all warnings should then be addressed before release of the software.
Source: ...
UUIDv7 implementation with sub-millisecond precision in Python by ChatGPT
I am sure I am not the first to do it, but I asked ChatGPT to implement a UUIDv7 function based on RFC 9562. It did not get it right the first time, but after some back and forth, it gave me this answer:

```python
import time, random, uuid

def generate_uuid_v7_fast():
    # 48-bit Unix timestamp in milliseconds
    ts = int(time.time() * 1000) & ((1 << 48) - 1)
    # timestamp, then the version nibble (7) and 12 random bits
    upper = (ts << 16) | ((7 << 12) | random.getrandbits(12))
    # variant bits (10) followed by 62 random bits
    lower = (0b10 << 62) | random.getrandbits(62)
    return str(uuid.UUID(int=(upper << 64) | lower))
```

Compared to the reference implementation, it is more than 2 times slower (+117% on my system). However, the reference implementation returns a bytearray while this version returns a string after a call to uuid.UUID(). ...
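Out of curiosity, timing it yourself is straightforward with Python’s timeit module; a minimal sketch, assuming the function above is saved in a (hypothetical) uuidv7.py:

```bash
# Rough micro-benchmark of the generated function.
# uuidv7.py is a made-up module name; adjust the import to wherever you saved the code.
python3 -m timeit -s 'from uuidv7 import generate_uuid_v7_fast' 'generate_uuid_v7_fast()'
```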
Migrating to Hugo
For a long, long time I had been thinking that using WordPress for a personal blog is kind of wasteful. After all, I don’t have dynamic content; it’s really just a bunch of text. I had wanted to migrate to a static-file CMS for a long time but never had the courage to do so. I recently had a few days of downtime, so I finally did it. I decided to use Hugo as it was the most popular option at the time. ...
Docker Compose: simple firewall using Bash and labels
It has been a long time since I wanted to control connections from/to Docker containers, but I could never find a simple enough solution. We can control reverse proxy settings (Traefik) using labels but we can’t apply iptables rules with them? Nonsense. Add to this that every container lives in a namespace, and each namespace can have its own iptables rules, and you have an easy solution right there. So I wrote a Bash script that listens to Docker events. It filters on container start events that carry the label firewall.enable=true, so it does not wake up often. ...
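To give an idea of the mechanism, here is a minimal sketch (not the full script) that reacts to those events and injects a rule into the container’s network namespace; the label and the iptables rule are only examples:

```bash
#!/usr/bin/env bash
# Minimal sketch: react to container starts carrying the label firewall.enable=true
docker events \
  --filter 'event=start' \
  --filter 'label=firewall.enable=true' \
  --format '{{.ID}}' |
while read -r cid; do
  # Find the container's PID so we can enter its network namespace
  pid=$(docker inspect -f '{{.State.Pid}}' "$cid")
  # Example rule only: drop outbound traffic to a private range from inside the container
  nsenter -t "$pid" -n iptables -A OUTPUT -d 10.0.0.0/8 -j DROP
done
```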
Securing large language models with a reverse proxy
In a previous post, I explained how to host a private ChatGPT using Docker and Traefik. I didn’t spend a lot of time on the security aspect of the project. I see many people asking how to expose their large language model on the Internet and how to secure it. Since most (all?) open-source projects have adopted the OpenAI API, the traffic is standard HTTP. Therefore, you can use all the traditional techniques to secure your large language model behind a reverse proxy. ...
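For instance, HTTP basic authentication at the reverse proxy already goes a long way. A minimal sketch, assuming Traefik is the reverse proxy; the user name, router and middleware names are placeholders:

```bash
# Generate a bcrypt credential for a basic-auth middleware ("alice" is a placeholder)
htpasswd -nB alice

# Then attach it to the router in front of the LLM, e.g. with Traefik labels
# (remember to escape every $ as $$ inside docker-compose.yml):
#   - "traefik.http.middlewares.llm-auth.basicauth.users=alice:$$2y$$05$$...hash..."
#   - "traefik.http.routers.llm.middlewares=llm-auth"
```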
Self-hosted coding assistant with llamafile, continue.dev and docker
There was a recent dramatic improvement in the speed of LLMs on CPU thanks to llamafile’s author. She goes into it extensively on her blog, but the short version is: expect 7-billion-parameter models to be usable on a consumer-grade CPU, even in Q8. Now it’s certainly possible to self-host a coding assistant with llamafile, continue.dev and Docker on a VPS. Let’s see how to achieve that. I’ll use Docker + Traefik but you can easily convert it to anything else (native + nginx for example). ...
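For context, a llamafile is a single self-contained executable that also embeds the llama.cpp server, so running one by hand looks roughly like this; the file name is a placeholder and flags can vary between releases, so check --help on your copy:

```bash
# Make the downloaded llamafile executable (placeholder file name) and start it in server mode
chmod +x ./some-coding-model.Q8_0.llamafile
./some-coding-model.Q8_0.llamafile --server --host 0.0.0.0 --port 8080 --nobrowser
```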
Europe GPU prices update - March 28 2024
With all the buzz about AI these days, let’s have a look at GPU prices in Europe and check which one gives the best “bang for the buck”, as YouTubers like to say. YouTube is filled with people telling you how cheap GPUs are or that this model is the best value, but unfortunately most of those people live in the USA. Here in Europe, the story is usually different. I checked the cheapest model for each chip and sorted them by price per GB of VRAM. The full table is available below. ...
Ollama, open-webui, mitmproxy in a docker compose stack, behind traefik
Reading the Ollama Discord channel, I noticed many people want to self-host their ChatGPT with Docker and don’t know how to do it. Here’s how to host the whole stack with Docker Compose. Here’s my docker-compose.yml, including the mitmproxy from the previous article:

```yaml
version: "3"

services:
  ollama:
    build: ollama
    user: 1001:1001
    environment:
      - OLLAMA_HOST=0.0.0.0
      - OLLAMA_DEBUG=1
      - OLLAMA_KEEP_ALIVE=60m
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ollama_models:/home/ollama/.ollama/models

  mitmproxy:
    image: mitmproxy/mitmproxy
    command: mitmweb --web-host 0.0.0.0 --web-port 8080 --mode reverse:http://ollama:11434@11434 --verbose --anticache --anticomp
    depends_on:
      - ollama
    labels:
      - "traefik.enable=true"
      # ollama endpoint
      - "traefik.http.routers.ollama.rule=Host(`llm.example.com`)"
      - "traefik.http.routers.ollama.tls=true"
      - "traefik.http.routers.ollama.entrypoints=websecure"
      - "traefik.http.routers.ollama.tls.certresolver=le"
      - "traefik.http.routers.ollama.service=ollama"
      - "traefik.http.services.ollama.loadbalancer.server.port=11434"
      - "traefik.http.services.ollama.loadbalancer.server.scheme=http"
      # mitmweb endpoint
      - "traefik.http.routers.ollama-mitm.rule=Host(`inspector.example.com`)"
      - "traefik.http.routers.ollama-mitm.tls=true"
      - "traefik.http.routers.ollama-mitm.entrypoints=websecure"
      - "traefik.http.routers.ollama-mitm.tls.certresolver=le"
      - "traefik.http.routers.ollama-mitm.service=ollama-mitm"
      - "traefik.http.services.ollama-mitm.loadbalancer.server.port=8080"
      - "traefik.http.services.ollama-mitm.loadbalancer.server.scheme=http"
      - "traefik.http.middlewares.ollama-mitm-headers.headers.customrequestheaders.Host=0.0.0.0"
      - "traefik.http.middlewares.ollama-mitm-headers.headers.customrequestheaders.Origin="

  open-webui:
    build:
      context: .
      args:
        OLLAMA_API_BASE_URL: '/ollama/api'
      dockerfile: Dockerfile
    image: ghcr.io/open-webui/open-webui:main
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - open-webui:/app/backend/data
    depends_on:
      - mitmproxy
    environment:
      - 'OLLAMA_API_BASE_URL=http://mitmproxy:11434/api'
      - 'WEBUI_SECRET_KEY='
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.open-webui.rule=Host(`chatgpt.example.com`)"
      - "traefik.http.routers.open-webui.tls=true"
      - "traefik.http.routers.open-webui.entrypoints=websecure"
      - "traefik.http.routers.open-webui.tls.certresolver=le"
      - "traefik.http.routers.open-webui.service=open-webui"
      - "traefik.http.services.open-webui.loadbalancer.server.port=8080"
      - "traefik.http.services.open-webui.loadbalancer.server.scheme=http"

volumes:
  ollama_models:
  open-webui:
```

This exposes 3 different endpoints: ...
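Once the stack is up, a quick smoke test of the llm.example.com endpoint is a plain call to the Ollama API; the model name below is a placeholder, use one you have actually pulled:

```bash
# Smoke test of the Ollama endpoint exposed through Traefik ("llama2" is a placeholder model)
curl https://llm.example.com/api/generate \
  -d '{"model": "llama2", "prompt": "Say hello", "stream": false}'
```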
Troubleshoot HTTP API requests with mitmproxy
Sometimes you connect a new tool to one of your servers and it doesn’t work as expected. You are sure you followed the documentation or tutorials but you don’t get the expected results. Before you throw everything away, you should check what’s actually going on between the two applications. And if neither of them supports logging requests and responses, you can use mitmproxy for troubleshooting. As the name implies (MITM = Man In The Middle), mitmproxy sits between both applications and intercepts all the traffic. You can use it not only to log the traffic but also to modify the content of the requests and/or responses on the fly, though I will not cover that use case here. ...
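For HTTP APIs, the reverse mode is usually the easiest way in: clients talk to mitmproxy, which forwards everything to the real server. A minimal example, with placeholder host and ports:

```bash
# Log all HTTP traffic to a backend: clients talk to port 8080,
# mitmproxy forwards everything to the real server on port 8000 (placeholder host/ports)
mitmdump --mode reverse:http://backend.internal:8000@8080
```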