I have a very simple setup running Gitea, which I love. However, I enabled Elasticsearch because it makes searching much faster than the default method.
I have a VPS with 16GB of memory. The only things running on it are Nginx, PHP, MySQL, Docker, and a few other things; I very rarely hit over 6GB of usage.
The issue comes when I enable Elasticsearch. It seems to wipe me out at 15.7GB out of 16GB as soon as I start it up.
I searched online and found out about /etc/elasticsearch/jvm.options.d/jvm.options, where you can cap the heap by adding:

```
-XmxXG
-XmsXG
```
The question is, what should this amount be? I read that by default Elasticsearch uses 50% of system memory, but when I started it up it was wiping me out of memory and nearly giving the system a stroke.
Setting it to 2GB, on the other hand, seems to make the Gitea website less responsive, sometimes even timing it out.
So I'm not sure what "range" I should be using here, or whether I'm going to have to upgrade my VPS to 32GB in order to run this properly.
Reverse proxying is exactly why I don't have more things set up in Docker. I haven't quite figured out how it, Nginx, and the app work together yet.
I had to set up Caddy when I installed Vaultwarden, and while that was easy because I had a very good guide to assist me, I would have been completely and totally lost if I had to set up Caddy 2 on my own.
So I definitely need to sit down one day and do a full day's reading on reverse proxies: how they work with Docker, what their function is, and what I can do with them. The Vaultwarden setup made it no easier to understand.
I actually wanted to move Nginx and MySQL over to Docker as well, but the reverse proxy is also what's holding me back there.
If you already have Caddy running on that same Docker host, then it's very simple to add another proxy target through the Caddyfile.
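As a sketch, assuming a container named gitea on the same Docker network as Caddy (Gitea listens on port 3000 by default), the new target is just one more site block:

```
gitea.example.com {
	# Caddy resolves the container name via Docker's internal DNS
	reverse_proxy gitea:3000
}
```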
I have been using Traefik for years and am mostly happy with it, but I recently spent a day trying out Caddy together with Authelia for authentication. Here is what came out of it, as a very basic example of using them together. It uses a custom Docker image for Caddy that contains a few extra modules, but that can easily be replaced with the basic official image, depending on which modules you need (for example, Let's Encrypt with the dns-01 challenge requires a module for your DNS provider). My example uses www.desec.io for that, but Cloudflare, DuckDNS, etc. are possible too.
docker-compose.yml

```yaml
version: "3.3"

networks:
  caddy:
    external: true

services:
  caddy:
    container_name: caddy
    image: l33tlamer/caddy-desec:latest
    restart: unless-stopped
    networks:
      - caddy
    ports:
      - "0.0.0.0:80:80/tcp"
      - "0.0.0.0:443:443/tcp"
      - "0.0.0.0:443:443/udp"
    environment:
      - TZ=Europe/Berlin
      - DESEC_TOKEN=CHANGEME
    volumes:
      - ./required/Caddyfile:/etc/caddy/Caddyfile
      - ./config:/config
      - ./data:/data

  authelia:
    container_name: authelia
    image: authelia/authelia:latest
    restart: unless-stopped
    networks:
      - caddy
    ports:
      - "9091:9091"
    environment:
      - TZ=Europe/Berlin
    volumes:
      - ./required/configuration.yml:/config/configuration.yml:ro
      - ./required/users_database.yml:/config/users_database.yml:ro
      - ./required/db.sqlite3:/config/db.sqlite3

### use pre-defined external Docker network: docker network create caddy
### db.sqlite3 needs to exist before first container start, can be created with: touch ./required/db.sqlite3
### Caddy config can be quickly reloaded with: docker exec -w /etc/caddy caddy caddy reload
### changes to Authelia files require its container to be restarted
```
required/Caddyfile

```
{
	debug
	http_port 80
	https_port 443
	email mail@example.com
	# acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
	acme_ca https://acme-v02.api.letsencrypt.org/directory
}

*.example.com {
	tls {
		dns desec {
			token {env.DESEC_TOKEN}
		}
		propagation_timeout -1
	}

	@authelia host auth.example.com
	handle @authelia {
		forward_auth authelia:9091 {
			uri /api/verify?rd=https://auth.example.com
			copy_headers Remote-User Remote-Groups Remote-Name Remote-Email
		}
		reverse_proxy authelia:9091
	}

	### example of a basic site entry
	@matomo host matomo.example.com
	handle @matomo {
		forward_auth authelia:9091 {
			uri /api/verify?rd=https://auth.example.com
			copy_headers Remote-User Remote-Groups Remote-Name Remote-Email
		}
		reverse_proxy matomo:8080
	}

	### example of a site entry when target is HTTPS
	@proxmox host proxmox.example.com
	handle @proxmox {
		forward_auth authelia:9091 {
			uri /api/verify?rd=https://auth.example.com
			copy_headers Remote-User Remote-Groups Remote-Name Remote-Email
		}
		reverse_proxy 192.168.50.75:8006 {
			transport http {
				tls_insecure_skip_verify
			}
		}
	}

	### Fallback for otherwise unhandled domains
	handle {
		abort
	}
}
```
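The mounted required/configuration.yml isn't shown above. As a very rough sketch of what it needs, assuming the Authelia 4.37-era schema that matches the /api/verify endpoint used here (secrets, domains, and paths are placeholders, and newer releases have renamed several keys), note the bypass rule that keeps the portal itself reachable:

```yaml
# required/configuration.yml -- minimal sketch, not a complete or hardened config
jwt_secret: CHANGEME
default_redirection_url: https://auth.example.com

server:
  host: 0.0.0.0
  port: 9091

authentication_backend:
  file:
    path: /config/users_database.yml

access_control:
  default_policy: deny
  rules:
    - domain: auth.example.com
      policy: bypass          # the login portal must be reachable without auth
    - domain: "*.example.com"
      policy: one_factor

session:
  name: authelia_session
  secret: CHANGEME
  domain: example.com

storage:
  encryption_key: CHANGEME-a-long-random-string
  local:
    path: /config/db.sqlite3

notifier:
  filesystem:
    filename: /config/notification.txt
```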
Those values depend on the use case. You can set min and max values and fine-tune as the need arises. More extensive information is discussed in these links:
Thanks, I saw the last link when I first set this up, but not the first two. I'll go through them and see if I can find the sweet spot.
It's hard to tell, because I'm the only user of my Gitea repo website, which is pretty much your own personal GitHub. However, from what I've read, even with only one or two users, Elasticsearch's usage depends greatly on how much code it has to index; when you search for something, Elasticsearch has to go through all of that code.
So from what I understand, the more code you have in a repo, the harder Elasticsearch has to work, which makes figuring out the memory a bit of a gamble.
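For what it's worth, the index can also be reined in from Gitea's side in app.ini, which may shrink what Elasticsearch has to hold. A sketch using keys from the [indexer] section (the connection string and exclude patterns are placeholders to adapt):

```ini
[indexer]
REPO_INDEXER_ENABLED = true
REPO_INDEXER_TYPE = elasticsearch
REPO_INDEXER_CONN_STR = http://localhost:9200
; keep generated/vendored files out of the index to shrink it
REPO_INDEXER_EXCLUDE = **.min.js, vendor/**
; skip files larger than 1 MiB (the default)
MAX_FILE_SIZE = 1048576
```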
I haven't had first-hand experience with Gitea, but there is some fine-tuning that might ease the memory usage. Which backend have you deployed? You can make some config adjustments to it. If you're memory-constrained, then swappiness could be set, and any monitoring could be disabled or kept to a bare minimum. I read somewhere that it's useful to run pprof (https://github.com/google/pprof) to get some visual insight into memory usage, though I haven't used it.
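On the swappiness point, on most Linux distributions that's a sysctl; something like the following, where 10 is only an illustrative value to tune:

```shell
# Apply immediately (lost on reboot)
sudo sysctl vm.swappiness=10

# Persist across reboots
echo "vm.swappiness=10" | sudo tee /etc/sysctl.d/99-swappiness.conf
```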