Trading chart with Kubernetes pods processing data through a pipeline

Mochi: An Algorithmic Trading Backtest Platform on Kubernetes

Mochi is a self-hosted algorithmic trading backtest platform. It takes a stock ticker and date range, downloads historical market data, runs trading strategy simulations across multiple parameter combinations, aggregates the results with Trino, generates statistical graphs with R, and presents everything through a React dashboard. The entire pipeline runs on a homelab Kubernetes cluster orchestrated by Argo Workflows.

Architecture Overview

[Architecture diagram: the React + Vite + TypeScript dashboard (dashboard.minoko.life) talks to MinIO (s3.minoko.life, S3 API, 13 buckets) and the FastAPI Backtest API (backtest-api.minoko.life, POST /backtest), which launches Argo Workflows (workflows.minoko.life). Phase 1 (backtest-pipeline): Polygon (Python) → Enhancer (Kotlin) → Metadata (Python). Phase 2, per scenario (x N): mochi-trades (Java) → sync-partitions (Trino) → mochi-aggregate (Java) → R graphs (years.r, stops.r, best-traders.r) → trade-extract (Kotlin) → py-trade-lens (Python) → trade-summary (Python), with results written back to MinIO. Aggregation runs on Trino (coordinator + worker, 100GB worker) backed by Hive Metastore + Postgres.]

Components

| Component | Language | Purpose | Image |
|---|---|---|---|
| mochi-dashboard | TypeScript/React | UI, S3 browsing, backtest submission | harbor.minoko.life/mochi/mochi-dashboard |
| backtest-api | Python/FastAPI | Accepts backtest requests, creates Argo Workflows | harbor.minoko.life/mochi/backtest-api |
| polygon | Python | Downloads historical data from Polygon.io API | harbor.minoko.life/mochi/polygon |
| trade-data-enhancer | Kotlin | Calculates ATR and technical indicators | harbor.minoko.life/mochi/trade-data-enhancer |
| data-metadata | Python | Generates scenario parameter combinations for Phase 2 | harbor.minoko.life/mochi/data-metadata |
| mochi-java | Java 21 | Core trading simulation engine and Trino aggregation | harbor.minoko.life/mochi/mochi-java |
| r-graphs | R | Statistical visualizations (years, stops, best traders) | harbor.minoko.life/mochi/r-graphs |
| trade-extract | Kotlin | Extracts individual trades from aggregated results | harbor.minoko.life/mochi/trade-extract |
| py-trade-lens | Python | Trade analysis and insights | harbor.minoko.life/mochi/py-trade-lens |
| trade-summary | Python | Final result summarization | harbor.minoko.life/mochi/trade-summary |

Five languages across ten containerized services. ...
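The Phase 1 flow above maps naturally onto an Argo Workflows DAG. A minimal sketch of what such a Workflow could look like; the template names, parameters, and image tags here are illustrative assumptions, not the project's actual manifests:

```yaml
# Hypothetical sketch of Phase 1 (Polygon → Enhancer → Metadata) as an Argo DAG.
# All names/args are assumed for illustration.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: backtest-pipeline-
spec:
  entrypoint: phase1
  arguments:
    parameters:
      - name: ticker
      - name: from-date
      - name: to-date
  templates:
    - name: phase1
      dag:
        tasks:
          - name: polygon                 # download historical data
            template: run-polygon
          - name: enhance                 # compute ATR / indicators
            template: run-enhancer
            dependencies: [polygon]
          - name: metadata                # emit Phase 2 scenario combinations
            template: run-metadata
            dependencies: [enhance]
    - name: run-polygon
      container:
        image: harbor.minoko.life/mochi/polygon:latest
        args: ["--ticker", "{{workflow.parameters.ticker}}",
               "--from", "{{workflow.parameters.from-date}}",
               "--to", "{{workflow.parameters.to-date}}"]
    - name: run-enhancer
      container:
        image: harbor.minoko.life/mochi/trade-data-enhancer:latest
    - name: run-metadata
      container:
        image: harbor.minoko.life/mochi/data-metadata:latest
```

Phase 2 would then fan out over the metadata step's scenario list, e.g. with Argo's `withParam` loop.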

January 27, 2026 · 8 min · Will
Storage Symlink

Kubernetes Local PVs and Symlinks Do Not Mix

Elasticsearch was stuck in a crash loop with the error “health check failed due to broken node lock”. The data directory inside the container was empty, yet the PVC showed as bound. The root cause: the OpenEBS hostPath was a symlink, and Kubernetes local PVs do not follow symlinks.

The Problem

The Elasticsearch pod showed 1/2 containers running, with the readiness probe failing:

Warning  Unhealthy  2m12s (x35651 over 2d)  kubelet  Readiness probe failed: Elasticsearch is not ready yet. Check the server logs.

Elasticsearch logs revealed the issue: ...
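For context, this is roughly the shape of the volume involved. A sketch under assumed paths and node names, not the cluster's actual PV: the `local.path` must resolve to a real directory, because the kubelet's local-volume handling does not follow a symlink at that path, which is how the pod ends up with an empty data directory.

```yaml
# Illustrative local PV of the kind OpenEBS hostPath provisioning produces.
# Paths, sizes, and node names are assumptions for this sketch.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-elasticsearch-data
spec:
  capacity:
    storage: 30Gi
  accessModes: [ReadWriteOnce]
  persistentVolumeReclaimPolicy: Delete
  storageClassName: openebs-hostpath
  local:
    # must be a real directory; a symlink here breaks the mount
    path: /var/openebs/local/pvc-elasticsearch-data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: [node-1]
```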

January 26, 2026 · 3 min · Will
Cache Layers

Avoiding Stale Builds with Kaniko and Container Registry Caching

A blog post was committed and pushed, CI built and pushed the image, but the deployed site showed old content. This post documents the debugging process and the fixes to prevent stale builds.

The Problem

After pushing a new blog post:

- GitLab CI pipeline succeeded
- Kaniko pushed the image to Harbor
- ArgoCD deployed the new image
- The blog showed old content - new post missing

Root Causes

Two caching layers caused the issue: ...
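As a concrete illustration of the moving parts, here is a hedged sketch of a Kaniko build job with registry-backed layer caching. The Kaniko flags (`--cache`, `--cache-repo`, `--cache-ttl`) are real; the registry paths, job name, and TTL value are assumptions:

```yaml
# Sketch of a .gitlab-ci.yml build job: immutable SHA tags avoid stale deploys,
# and the cache repo keeps layer caching explicit and expirable.
build:
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - /kaniko/executor
      --context "${CI_PROJECT_DIR}"
      --dockerfile "${CI_PROJECT_DIR}/Dockerfile"
      --destination "harbor.minoko.life/blog/site:${CI_COMMIT_SHORT_SHA}"
      --cache=true
      --cache-repo "harbor.minoko.life/blog/site/cache"
      --cache-ttl 24h
```

Tagging with `${CI_COMMIT_SHORT_SHA}` instead of `:latest` means the deployment manifest must change for each release, which sidesteps one whole class of "pushed but not updated" problems.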

January 18, 2026 · 4 min · Will
URL Redirect

Clean URLs with NGINX Ingress Redirects

Uptime Kuma status pages use the URL pattern /status/{slug}. With a slug of “status”, the full URL becomes /status/status. This post covers redirecting /status to /status/status using NGINX Ingress annotations.

The Problem

Uptime Kuma’s status page URL structure is fixed: /status/{slug}. Creating a status page with slug “status” results in:

https://uptime.minoko.life/status/status

The goal: make /status redirect to /status/status for a cleaner URL.

Configuration Snippets Are Disabled

The obvious solution is an NGINX configuration snippet: ...
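One annotation-based approach (a sketch, not necessarily the exact fix the post lands on): a dedicated Ingress whose only job is to 301 the `/status` path, using the stock `nginx.ingress.kubernetes.io/permanent-redirect` annotation. Service name and port are assumptions:

```yaml
# Hypothetical sketch: a small extra Ingress that redirects /status.
# The backend is syntactically required but never actually reached.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: uptime-status-redirect
  annotations:
    nginx.ingress.kubernetes.io/permanent-redirect: https://uptime.minoko.life/status/status
spec:
  ingressClassName: nginx
  rules:
    - host: uptime.minoko.life
      http:
        paths:
          - path: /status
            pathType: Exact
            backend:
              service:
                name: uptime-kuma     # assumed service name
                port:
                  number: 3001        # Uptime Kuma's default port
```

`pathType: Exact` keeps `/status/status` itself out of the redirect rule, avoiding a loop.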

January 18, 2026 · 2 min · Will
VLAN Traffic Separation

VLAN Traffic Separation with MikroTik and OPNsense

This post documents setting up VLAN separation to isolate Kubernetes cluster traffic from bulk data transfers on a dual-homed node. The minis node has two NICs: one for the Kubernetes API and overlay networking, another for pod data traffic like large file downloads.

The Problem

The minis Kubernetes node in the DMZ became unresponsive during large file transfers. Pods downloading or uploading large files saturated the network connection, affecting Kubernetes API communication, kubelet health checks, and Calico VXLAN overlay traffic. ...
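On the node side, the dual-NIC split can be sketched with netplan: the first NIC keeps the control-plane/overlay subnet, while the second NIC carries a tagged VLAN for bulk data. Interface names, VLAN ID, and subnets below are assumptions, not the post's actual configuration:

```yaml
# Hedged netplan sketch for a dual-homed node (illustrative addressing).
network:
  version: 2
  ethernets:
    eno1:                       # Kubernetes API + Calico VXLAN overlay
      addresses: [192.168.2.50/24]
      routes:
        - to: default
          via: 192.168.2.1
    eno2: {}                    # carrier interface for the data VLAN
  vlans:
    vlan30:                     # bulk pod data traffic, tagged on eno2
      id: 30
      link: eno2
      addresses: [192.168.30.50/24]
```

The corresponding tagged VLAN then has to exist on the MikroTik switch port and as an interface/firewall rules on OPNsense, which is where the post goes next.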

January 9, 2026 · 4 min · Will
Alertmanager Slack Notifications

Configuring Alertmanager Slack Notifications with kube-prometheus-stack

The kube-prometheus-stack Helm chart deploys Alertmanager with a default configuration that routes all alerts to a “null” receiver, effectively discarding them. This post documents configuring Alertmanager to send notifications to Slack.

The Problem

Default Alertmanager configuration:

```yaml
receivers:
  - name: "null"
route:
  receiver: "null"  # All alerts discarded
```

Alerts fire, but nobody gets notified.

Solution Architecture

[Diagram: Prometheus (fires alerts) → Alertmanager (routes & groups) → Slack (#alerts). Routing rules: critical → 1h repeat, warning → 4h repeat, Watchdog → silenced.]

Directory Structure

```
monitoring/alertmanager/
├── .env.example      # Webhook URL template
├── .env              # Actual webhook (gitignored)
├── create-secret.sh  # Creates Kubernetes secret
└── README.md         # Setup documentation
```

Setup

Step 1: Create Slack Webhook

1. Go to https://api.slack.com/apps
2. Click “Create New App” → “From scratch”
3. Name: Alertmanager, select your workspace
4. Go to “Incoming Webhooks” → Toggle “Activate”
5. Click “Add New Webhook to Workspace”
6. Select the channel for alerts (e.g., #alerts)
7. Copy the webhook URL

Step 2: Create Kubernetes Secret

```
# monitoring/alertmanager/.env.example
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/XXX/YYY/ZZZ
SLACK_CHANNEL=#alerts
```

```bash
#!/bin/bash
# monitoring/alertmanager/create-secret.sh
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

if [ -f "$SCRIPT_DIR/.env" ]; then
  source "$SCRIPT_DIR/.env"
else
  echo "Error: .env file not found"
  exit 1
fi

kubectl create secret generic alertmanager-slack-config \
  --from-literal=slack-webhook-url="${SLACK_WEBHOOK_URL}" \
  --from-literal=slack-channel="${SLACK_CHANNEL}" \
  --namespace=monitoring \
  --dry-run=client -o yaml | kubectl apply -f -
```

Run the setup: ...
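The routing rules described above (critical → 1h repeat, warning → 4h repeat, Watchdog silenced) translate into kube-prometheus-stack Helm values along these lines. This is a hedged sketch: receiver names, grouping labels, and the secret mount path are assumptions, and the webhook secret is the one created in Step 2:

```yaml
# Sketch of alertmanager routing in kube-prometheus-stack values (names assumed).
alertmanager:
  config:
    route:
      receiver: slack
      group_by: ["alertname", "namespace"]
      routes:
        - receiver: "null"                    # heartbeat alert, silenced
          matchers:
            - alertname = "Watchdog"
        - receiver: slack
          matchers:
            - severity = "critical"
          repeat_interval: 1h
        - receiver: slack
          matchers:
            - severity = "warning"
          repeat_interval: 4h
    receivers:
      - name: "null"
      - name: slack
        slack_configs:
          - channel: "#alerts"
            send_resolved: true
            # read the webhook from the mounted secret rather than inlining it
            api_url_file: /etc/alertmanager/secrets/alertmanager-slack-config/slack-webhook-url
```

`api_url_file` (Alertmanager ≥ 0.22) keeps the webhook URL out of the rendered config and out of Git.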

January 4, 2026 · 5 min · Will
NetworkPolicy Defense in Depth

Adding NetworkPolicies for Defense-in-Depth with Linkerd

Linkerd provides automatic mTLS between all pods in the mesh. This encrypts traffic and provides identity verification. However, it does not restrict which pods can communicate with each other. Any pod in the mesh can connect to any other pod.

Kubernetes NetworkPolicies add an additional layer of security by defining explicit allow rules at the network level. This provides defense-in-depth: if Linkerd’s proxy is somehow bypassed, NetworkPolicies still enforce access control. ...
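The usual shape of such policies is a namespace-wide default-deny plus narrow explicit allows. A minimal sketch, with namespace, labels, and ports assumed for illustration:

```yaml
# Default-deny ingress for the namespace, then one explicit allow rule.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: app
spec:
  podSelector: {}            # selects every pod in the namespace
  policyTypes: [Ingress]
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: app
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

One Linkerd-specific wrinkle worth checking in practice: meshed traffic arrives at the destination's linkerd-proxy inbound port (4143 by default), so allow rules may need to target that port rather than the application port, depending on the CNI's enforcement point.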

January 3, 2026 · 6 min · Will
GitLab Runner on Kubernetes

In-Cluster GitLab Runner with Kubernetes Executor

This post covers deploying a GitLab Runner inside a Kubernetes cluster using the Kubernetes executor. Each CI job spawns as a pod, runs its tasks, and is automatically cleaned up. Docker builds use Kaniko (rootless, no privileged containers), and job artifacts/dependencies are cached in MinIO.

Architecture

[Architecture diagram: a GitLab CI job (push to repo) is picked up by the runner manager (polycephala), which auto-creates a job pod; job pods share dependencies through a MinIO cache.]

The runner manager pod runs continuously and polls GitLab for jobs. When a job is picked up, it creates a new pod in the gitlab-runner namespace, executes the job, and deletes the pod when complete. ...
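The manager/executor/cache arrangement above corresponds to GitLab Runner Helm values roughly like the following. A sketch only: the GitLab URL, namespace, bucket, and endpoint names are assumptions, and credentials would come from a Kubernetes secret rather than inline values:

```yaml
# Hedged sketch of gitlab-runner chart values: Kubernetes executor + S3 (MinIO) cache.
gitlabUrl: https://gitlab.minoko.life
runners:
  config: |
    [[runners]]
      executor = "kubernetes"
      [runners.kubernetes]
        namespace = "gitlab-runner"   # job pods created and deleted here
        image = "alpine:3.20"         # default job image
      [runners.cache]
        Type = "s3"
        Shared = true                 # share dependency cache across jobs
        [runners.cache.s3]
          ServerAddress = "s3.minoko.life"
          BucketName = "runner-cache"
          Insecure = false
```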

January 3, 2026 · 6 min · Will
Gold bars with AI neural network overlay

Building a GPU-Accelerated RAG System for Gold Market Intelligence

This post documents the implementation of a Retrieval-Augmented Generation (RAG) system for gold market intelligence, running entirely on a homelab Kubernetes cluster with GPU acceleration.

The Goal

Build a self-hosted AI system that:

- Ingests gold market data from multiple sources (FRED, GoldAPI, RSS feeds)
- Stores embeddings in a vector database
- Provides natural language query capabilities using a local LLM
- Runs on an NVIDIA RTX 5070 Ti GPU

Architecture

[Architecture diagram: data-ingestion CronJobs feed the embedding service (nomic-embed-text), which writes vectors to Qdrant; the query service (RAG API + web UI on :80) retrieves from Qdrant and calls Ollama (Llama 3.1 8B) running on the RTX 5070 Ti (16GB).]

Components

| Component | Purpose | Image |
|---|---|---|
| Ollama | LLM inference (Llama 3.1 8B) + embeddings (nomic-embed-text) | ollama/ollama |
| Qdrant | Vector database for storing embeddings | qdrant/qdrant |
| Data Ingestion | CronJobs fetching from FRED, GoldAPI, RSS | Custom Python/FastAPI |
| Embedding Service | Converts text to vectors, stores in Qdrant | Custom Python/FastAPI |
| Query Service | RAG pipeline + web UI | Custom Python/FastAPI |

Data Sources

| Source | Data | Schedule |
|---|---|---|
| FRED | Gold price history, CPI, Fed Funds Rate, 10Y Treasury, USD Index | Every 6 hours |
| GoldAPI.io | Real-time XAU/USD spot price | Hourly |
| RSS Feeds | Market news from Investing.com | Every 4 hours |

Implementation

Repository Structure

```
gold-intelligence/
├── .gitlab-ci.yml
├── services/
│   ├── data-ingestion/
│   │   ├── Dockerfile
│   │   ├── requirements.txt
│   │   └── src/
│   │       ├── main.py
│   │       └── collectors/
│   │           ├── fred.py
│   │           ├── gold_api.py
│   │           └── news_rss.py
│   ├── embedding-service/
│   │   ├── Dockerfile
│   │   ├── requirements.txt
│   │   └── src/
│   │       ├── main.py
│   │       ├── embedder.py
│   │       └── qdrant_client.py
│   └── query-service/
│       ├── Dockerfile
│       ├── requirements.txt
│       └── src/
│           ├── main.py
│           ├── rag_pipeline.py
│           ├── ollama_client.py
│           └── static/          # Web UI
├── helm/
│   ├── data-ingestion/
│   ├── embedding-service/
│   ├── query-service/
│   ├── ollama-values.yaml
│   └── qdrant-values.yaml
└── kubernetes/
    └── argocd/
```

Ollama Configuration

The key to GPU acceleration is the runtimeClassName: nvidia in the Helm values: ...
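A sketch of what such ollama-values.yaml can look like. Beyond `runtimeClassName: nvidia` (which the post names explicitly), the key names follow the community ollama Helm chart and are assumptions, as are the model list and volume size:

```yaml
# Hedged sketch of ollama-values.yaml (chart keys assumed from the community chart).
runtimeClassName: nvidia        # schedule onto the NVIDIA container runtime
ollama:
  gpu:
    enabled: true
    type: nvidia
    number: 1                   # the single RTX 5070 Ti
  models:
    pull:
      - llama3.1:8b             # generation model
      - nomic-embed-text        # embedding model
persistentVolume:
  enabled: true                 # keep pulled models across pod restarts
  size: 50Gi
```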

January 3, 2026 · 5 min · Will
Wiki HTTPS

Enabling HTTPS on Wiki.js with Let's Encrypt via OPNsense

Wiki.js was running on a LoadBalancer IP with no TLS. This post covers migrating to HTTPS using a Let’s Encrypt wildcard certificate managed by OPNsense, with automatic synchronization to Kubernetes.

The Problem

The wiki was accessible at http://192.168.2.204 with:

- No TLS encryption
- Direct LoadBalancer service exposure
- No ingress controller

The goal: HTTPS with a publicly trusted certificate, no browser warnings.

Architecture

[Architecture diagram: the OPNsense firewall runs the ACME client (Let's Encrypt account, Cloudflare DNS-01 validation, wildcard cert *.minoko.life, auto-renewal at 60 days). A letsencrypt-sync CronJob (5 AM) fetches the certificate daily via the OPNsense API, builds the cert chain, and updates the letsencrypt-wildcard secret in the ingress-nginx and wikijs namespaces. The ingress-nginx-controller (LoadBalancer 192.168.2.224) serves the Ingress wiki.minoko.life → wikijs:80 with TLS from the letsencrypt-wildcard secret.]

Prerequisites

- OPNsense with ACME plugin configured for Let’s Encrypt
- Cloudflare (or other DNS provider) for DNS-01 validation
- Existing wildcard certificate for *.minoko.life

Step 1: Install ingress-nginx Controller

Create the ingress-nginx infrastructure: ...
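The end state of the setup, sketched as the wiki's Ingress: host and secret names match the architecture described above, while the service name and port are assumed from it:

```yaml
# Sketch of the final Ingress: TLS terminates at ingress-nginx using the
# letsencrypt-wildcard secret synced from OPNsense.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wikijs
  namespace: wikijs
spec:
  ingressClassName: nginx
  tls:
    - hosts: [wiki.minoko.life]
      secretName: letsencrypt-wildcard   # kept fresh by the daily sync CronJob
  rules:
    - host: wiki.minoko.life
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: wikijs             # assumed service name
                port:
                  number: 80
```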

January 3, 2026 · 7 min · Will