• kubernetes
    DevOps,  DevSecOps,  Kubernetes,  Monitoring Tools,  Prometheus

    How to expose kubernetes api-server metrics

    The Kubernetes api-server exposes a rich set of metrics which can make a real difference when it comes to detecting potential security threats.

    Accessing the api-server requires a token and a certificate. Both must be tied to a ServiceAccount with sufficient permissions to access the metrics endpoint. This post describes how to achieve such a setup.

    Namespace

    Before starting, make sure your current context is using the “default” namespace:
    kubectl config set-context --current --namespace=default

    Step 1: Create a new ServiceAccount

    kubectl create serviceaccount metrics-explorer

    Step 2: Create a new ClusterRole with sufficient permissions to access api-server metrics endpoint via HTTP GET

    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: metrics-explorer
    rules:
    - nonResourceURLs:
      - /metrics
      - /metrics/cadvisor
      verbs:
      - get
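
    Save the ClusterRole manifest to a file (e.g. metrics-explorer-clusterrole.yaml, a name used here just for illustration) and apply it:

    kubectl apply -f metrics-explorer-clusterrole.yaml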

    Step 3: Create new ClusterRoleBinding to bind the ServiceAccount with ClusterRole

    kubectl create clusterrolebinding metrics-explorer:metrics-explorer --clusterrole metrics-explorer --serviceaccount default:metrics-explorer

    Step 4: Export ServiceAccount’s token Secret’s name

    SERVICE_ACCOUNT=metrics-explorer
    SECRET=$(kubectl get serviceaccount ${SERVICE_ACCOUNT} -o json | jq -Mr '.secrets[].name | select(contains("token"))')

    Step 5: Extract Bearer token from Secret and decode it

    TOKEN=$(kubectl get secret ${SECRET} -o json | jq -Mr '.data.token' | base64 -d)
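
    Note: on recent Kubernetes releases (v1.24 and later), token Secrets are no longer created automatically for ServiceAccounts, so the lookup in Step 4 may return nothing. In that case a short-lived token can be requested directly instead (an alternative to the flow above):

    TOKEN=$(kubectl create token metrics-explorer)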

    Step 6: Extract, decode and write the ca.crt to a temporary location

    kubectl get secret ${SECRET} -o json | jq -Mr '.data["ca.crt"]' | base64 -d > /tmp/ca.crt

    Final step: Test access to metrics endpoint

    curl -s <API-SERVER>/metrics  --header "Authorization: Bearer $TOKEN" --cacert /tmp/ca.crt | less

    Configuring as additional scrape target on Prometheus

    Transfer the certificate file from api-server’s VM to Prometheus’ VM. (e.g. destination filename: /opt/api-server-files/ca.crt)

    Save the TOKEN obtained on steps above to a file on Prometheus’ VM. (e.g. destination filename: /opt/api-server-files/api-server-token)

    Edit Prometheus main configuration file (e.g. /etc/prometheus/prometheus.yml) and add the following scrape target:

      - bearer_token_file: /opt/api-server-files/api-server-token
        job_name: kubernetes-apiservers
        static_configs:
        - targets: ['<API-SERVER-IP>:6443']
        metrics_path: '/metrics'
        scheme: https
        tls_config:
          ca_file: /opt/api-server-files/ca.crt
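
    Before restarting Prometheus, the new configuration can be validated with promtool (shipped alongside Prometheus); the service name below assumes a standard systemd installation:

    promtool check config /etc/prometheus/prometheus.yml
    sudo systemctl restart prometheus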
  • kubernetes
    DevOps,  Kubernetes,  Linux

    Kubernetes multi-node cluster deployment from scratch


    Intro

    The following guide shows how to deploy and configure a multi-node kubernetes cluster on-premise.

    Master/Worker nodes are using Rocky Linux 9.1 as host OS.

    Load balancer in front of all master nodes, is running on Photon OS 4.

    All VMs are using the same network.

    Once all cluster members have been configured as explained below, the following configuration will be effective:

    Multi-node cluster

    Preparing all hosts (master/worker nodes)

    On all hosts (master/worker):

    • Setup hostname DNS resolution (either via a DNS server or by adding entries to /etc/hosts)
    • Setup appropriate firewalld rules:
      • On master nodes:
    sudo firewall-cmd --add-port=6443/tcp --permanent
    sudo firewall-cmd --add-port=2379-2380/tcp --permanent
    sudo firewall-cmd --add-port=10250/tcp --permanent
    sudo firewall-cmd --add-port=10259/tcp --permanent
    sudo firewall-cmd --add-port=10257/tcp --permanent             
    sudo firewall-cmd --reload
    sudo firewall-cmd --list-all
    • On worker nodes:
    sudo firewall-cmd --add-port=10250/tcp --permanent
    sudo firewall-cmd --add-port=30000-32767/tcp --permanent               
    sudo firewall-cmd --reload
    sudo firewall-cmd --list-all
    • Configure SELinux by setting it to “Permissive” mode
    sudo setenforce 0
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    sestatus
    • Enable kernel modules “overlay” and “br_netfilter”
    sudo modprobe overlay
    sudo modprobe br_netfilter
             
    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
             
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables  = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward                 = 1
    EOF
             
    sudo sysctl --system
             
    echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
    • Disable swap
    sudo swapoff -a
    free -m
    sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
    • Install container runtime: Containerd
    sudo dnf install dnf-utils
    sudo yum-config-manager \
        --add-repo \
        https://download.docker.com/linux/centos/docker-ce.repo
             
    sudo dnf repolist
    sudo dnf makecache
             
    sudo dnf install containerd.io
             
    sudo mv /etc/containerd/config.toml /etc/containerd/config.toml.orig
    sudo containerd config default > /etc/containerd/config.toml
    • Edit file /etc/containerd/config.toml and change value of cgroup driver “SystemdCgroup = false” to “SystemdCgroup = true”. This will enable the systemd cgroup driver for the containerd container runtime.
    sudo systemctl enable --now containerd
    sudo systemctl is-enabled containerd
    sudo systemctl status containerd
    • Install kubernetes packages
    cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
    enabled=1
    gpgcheck=1
    gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
    exclude=kubelet kubeadm kubectl
    EOF
             
    sudo dnf repolist
    sudo dnf makecache
             
    sudo dnf install kubelet kubeadm kubectl --disableexcludes=kubernetes
    sudo systemctl enable --now kubelet
    • Install CNI plugin: Flannel (check for latest version available)
    mkdir -p /opt/bin/
    curl -fsSLo /opt/bin/flanneld https://github.com/flannel-io/flannel/releases/download/v0.20.2/flanneld-amd64
    chmod +x /opt/bin/flanneld

    Preparing the Load Balancer

    In this example, we are using a Photon OS 4 image, but it can be any Linux distro on top of which we can run a HAProxy instance.

    • Update existing packages
    tdnf update / apt-get update && apt-get upgrade / yum update
    • Install HAProxy
    tdnf install -y haproxy / apt-get install -y haproxy / yum install -y haproxy
    • Configure HAProxy to load balance the traffic between the three Kubernetes master nodes (replace the <K8S-MASTER-NODE-x> and <K8S-MASTER-NODE-x-IP> placeholders with your node names/IPs)
    $ sudo vim /etc/haproxy/haproxy.cfg
    global
    ...
    defaults
    ...
    frontend kubernetes
        bind <HAProxy Server IP>:6443
        option tcplog
        mode tcp
        default_backend kubernetes-master-nodes
    backend kubernetes-master-nodes
        mode tcp
        balance roundrobin
        option tcp-check
        server <K8S-MASTER-NODE-1> <K8S-MASTER-NODE-1-IP>:6443 check fall 3 rise 2
        server <K8S-MASTER-NODE-2> <K8S-MASTER-NODE-2-IP>:6443 check fall 3 rise 2
        server <K8S-MASTER-NODE-3> <K8S-MASTER-NODE-3-IP>:6443 check fall 3 rise 2
    • Restart the service
    $ sudo systemctl restart haproxy
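
    A quick way to verify that HAProxy is up and listening on the API port (a simple sanity check; adjust the port if you changed it):
    $ sudo ss -tlnp | grep 6443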

    Installing client tools

    The steps below cover the preparation of the TLS certificate that will be used to communicate with each etcd instance.

    The TLS certificate can be prepared on any of the hosts (or on a separate machine), since the resulting certificate will then be copied to all relevant hosts.

    CFSSL (Cloud Flare SSL tool)

    • Download the binaries and grant execution permission
    $ wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
    $ wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    $ chmod +x cfssl*
    • Move the binaries to /usr/local/bin and verify the installation
    $ sudo mv cfssl_linux-amd64 /usr/local/bin/cfssl
    $ sudo mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
    $ cfssl version

    Generating TLS certificate

    • Create a Certification Authority
    $ vim ca-config.json
    {
      "signing": {
        "default": {
          "expiry": "8760h"
        },
        "profiles": {
          "kubernetes": {
            "usages": ["signing", "key encipherment", "server auth", "client auth"],
            "expiry": "8760h"
          }
        }
      }
    }
    • Create the certificate authority signing request configuration file
    $ vim ca-csr.json
    {
      "CN": "Kubernetes",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
      {
        "C": "IE",
        "L": "Cork",
        "O": "Kubernetes",
        "OU": "CA",
        "ST": "Cork Co."
      }
     ]
    }
    • Generate the certificate authority certificate and private key
    $ cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    • Make sure that ca-key.pem and the ca.pem have been generated

    Creating the certificate for the Etcd cluster

    • Create the certificate signing request configuration file
    $ vim kubernetes-csr.json
    {
      "CN": "kubernetes",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
      {
        "C": "IE",
        "L": "Cork",
        "O": "Kubernetes",
        "OU": "Kubernetes",
        "ST": "Cork Co."
      }
     ]
    }
    • Generate the certificate and private key (Replace <MASTER-NODE-1-IP>,<MASTER-NODE-2-IP>,<MASTER-NODE-3-IP>,<LOAD-BALANCER-IP> accordingly)
    $ cfssl gencert \
    -ca=ca.pem \
    -ca-key=ca-key.pem \
    -config=ca-config.json \
    -hostname=<MASTER-NODE-1-IP>,<MASTER-NODE-2-IP>,<MASTER-NODE-3-IP>,<LOAD-BALANCER-IP>,127.0.0.1,kubernetes.default \
    -profile=kubernetes kubernetes-csr.json | \
    cfssljson -bare kubernetes
    • Verify that the kubernetes-key.pem and the kubernetes.pem file were generated
    • Copy the certificate to all nodes
    $ scp ca.pem kubernetes.pem kubernetes-key.pem root@MASTER-NODE-1:~
    $ scp ca.pem kubernetes.pem kubernetes-key.pem root@MASTER-NODE-2:~
    $ scp ca.pem kubernetes.pem kubernetes-key.pem root@MASTER-NODE-3:~
    $ scp ca.pem kubernetes.pem kubernetes-key.pem root@LOAD-BALANCER:~
    $ scp ca.pem kubernetes.pem kubernetes-key.pem root@WORKER-NODE-1:~
    $ scp ca.pem kubernetes.pem kubernetes-key.pem root@WORKER-NODE-2:~

    Etcd installation and configuration (only Master nodes)

    sudo mkdir /etc/etcd /var/lib/etcd
    • Move certificates to the configuration directory
    $ sudo mv ~/ca.pem ~/kubernetes.pem ~/kubernetes-key.pem /etc/etcd
    • Download the etcd binaries (check latest release available), extract and move to /usr/local/bin
    $ wget https://github.com/etcd-io/etcd/releases/download/v3.4.23/etcd-v3.4.23-linux-amd64.tar.gz
    $ tar zxvf etcd-v3.4.23-linux-amd64.tar.gz
    $ sudo mv etcd-v3.4.23-linux-amd64/etcd* /usr/local/bin/
    • Create an etcd systemd unit file (replace <CURRENT-MASTER-NODE-IP> with the IP address of the master node you are configuring and <OTHER-MASTER-NODE-IP> with the IP addresses of the remaining 2 master nodes)
    $ sudo vim /etc/systemd/system/etcd.service
    [Unit]
    Description=etcd
    Documentation=https://github.com/coreos
     
    [Service]
    ExecStart=/usr/local/bin/etcd \
      --name <CURRENT-MASTER-NODE-IP> \
      --cert-file=/etc/etcd/kubernetes.pem \
      --key-file=/etc/etcd/kubernetes-key.pem \
      --peer-cert-file=/etc/etcd/kubernetes.pem \
      --peer-key-file=/etc/etcd/kubernetes-key.pem \
      --trusted-ca-file=/etc/etcd/ca.pem \
      --peer-trusted-ca-file=/etc/etcd/ca.pem \
      --peer-client-cert-auth \
      --client-cert-auth \
      --initial-advertise-peer-urls https://<CURRENT-MASTER-NODE-IP>:2380 \
      --listen-peer-urls https://0.0.0.0:2380 \
      --listen-client-urls https://<CURRENT-MASTER-NODE-IP>:2379,http://127.0.0.1:2379 \
      --advertise-client-urls https://<CURRENT-MASTER-NODE-IP>:2379 \
      --initial-cluster-token etcd-cluster-0 \
      --initial-cluster <CURRENT-MASTER-NODE-IP>=https://<CURRENT-MASTER-NODE-IP>:2380,<OTHER-MASTER-NODE-IP>=https://<OTHER-MASTER-NODE-IP>:2380,<OTHER-MASTER-NODE-IP>=https://<OTHER-MASTER-NODE-IP>:2380 \
      --initial-cluster-state new \
      --data-dir=/var/lib/etcd
    Restart=on-failure
    RestartSec=5
     
    [Install]
    WantedBy=multi-user.target
    • Reload daemon configuration files and enable service to be started at boot
    $ sudo systemctl daemon-reload
    $ sudo systemctl enable etcd
    • Repeat steps above on all master nodes and then:
    • Start the service on all master nodes
    $ sudo systemctl start etcd
    • Wait a few seconds and check that the cluster is up and synchronised (run the command on all master nodes)
    $ ETCDCTL_API=3 etcdctl member list
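    • Optionally, check endpoint health as well (the default endpoint http://127.0.0.1:2379 works here since the unit file above also listens locally over plain HTTP)
    $ ETCDCTL_API=3 etcdctl endpoint health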

    Master nodes initialisation

    Master node #1

    • Create a configuration file for kubeadm (replace the values of <MASTER-NODE-1/2/3-IP> and <LOAD-BALANCER-IP> accordingly)
    $ vim config.yaml
    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    etcd:
      external:
        endpoints:
        - https://<MASTER-NODE-1-IP>:2379
        - https://<MASTER-NODE-2-IP>:2379
        - https://<MASTER-NODE-3-IP>:2379
        caFile: /etc/etcd/ca.pem
        certFile: /etc/etcd/kubernetes.pem
        keyFile: /etc/etcd/kubernetes-key.pem
    networking:
      podSubnet: "10.244.0.0/24"
    controlPlaneEndpoint: "<LOAD-BALANCER-IP>:6443"
    apiServer:
      extraArgs:
        apiserver-count: "3"
      certSANs:
        - "<LOAD-BALANCER-IP>"
      timeoutForControlPlane: 4m0s
    • Initialize the machine as a master node
    $ sudo kubeadm init --config=config.yaml
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    • Install CNI plugin
    kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
    • Copy the certificate to the two other master nodes
    $ sudo scp -r /etc/kubernetes/pki root@<MASTER-NODE-2-IP>:~
    $ sudo scp -r /etc/kubernetes/pki root@<MASTER-NODE-3-IP>:~

    Master node #2

    • Remove the apiserver.crt and apiserver.key
    $ rm ~/pki/apiserver.*
    • Move the certificates to the /etc/kubernetes directory
    $ sudo mv ~/pki /etc/kubernetes/
    • Create a configuration file for kubeadm (same content as file above used on master node #1)
    • Initialize the machine as a master node
    $ sudo kubeadm init --config=config.yaml
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    Master node #3

    Follow the same steps as for master node #2.

    Copy the kubeadm join command printed when initialising the master nodes: it will be required to join the worker nodes.
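
    If the join command was not saved, it can be regenerated at any time from a master node (a standard kubeadm command):

    $ sudo kubeadm token create --print-join-command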

    Worker nodes initialisation

    Run the “kubeadm join” command copied from step above.

    Checking nodes status

    From a master node:

    $ kubectl get nodes

    Check nodes status (Ready/NotReady).

    Possible issues

    Node NotReady – Pod CIDR not available

    Solution: Patch node with the following command (replace NODE-NAME and CIDR value accordingly).

    kubectl patch node <NODE-NAME> -p '{"spec":{"podCIDR":"10.244.0.0/24"}}'

    Assigning role to worker nodes

    kubectl label node <NODE-NAME> node-role.kubernetes.io/worker=worker
  • kube-proxy networking
    DevOps,  Kubernetes

    kube-proxy explained


    Intro

    kube-proxy is a cluster component responsible for network traffic routing. Because of that, one instance runs on each cluster node.

    It is responsible for routing traffic between cluster components but also for traffic incoming from outside the cluster.

    It essentially implements the rules declared by Service objects: a Service describes how traffic should reach a set of Pods, and kube-proxy translates that declaration into actual forwarding rules on each node.

    kube-proxy operating modes

    kube-proxy can implement network traffic rules in 3 different ways:

    • iptables (default)
    • userspace (old, deprecated)
    • IPVS (IP Virtual Server)

    This page focuses on iptables mode.

    kube-proxy – iptables mode

    In iptables mode, whenever a Service is created, kube-proxy creates the related iptables rules on each node.

    Such rules are part of the PREROUTING chain: this means that traffic is redirected as soon as it enters the kernel.

    Listing all iptables PREROUTING chains

    sudo iptables -t nat -L PREROUTING | column -t

    Example:

    root@test:~# sudo iptables -t nat -L PREROUTING | column -t
    Chain            PREROUTING  (policy  ACCEPT)
    target           prot        opt      source    destination
    cali-PREROUTING  all         --       anywhere  anywhere     /*        cali:6gwbT8clXdHdC1b1  */
    KUBE-SERVICES    all         --       anywhere  anywhere     /*        kubernetes             service   portals  */
    DOCKER           all         --       anywhere  anywhere     ADDRTYPE  match                  dst-type  LOCAL

    Listing all rules part of a given chain

    sudo iptables -t nat -L KUBE-SERVICES -n  | column -t

    For a better understanding, let’s consider the following example:

    A new NodePort Service has been created with the following command:

    kubectl expose deployment prometheus-grafana --type=NodePort --name=grafana-example-service -n monitoring

    By executing the command above, a new Service got created:

    [test@test ~]$ kubectl get svc/grafana-example-service -n monitoring
    NAME                      TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    grafana-example-service   NodePort   10.111.189.177   <none>        3000:31577/TCP   100m

    We did not specify any specific node port, therefore a random one between 30000 and 32767 has been automatically assigned: 31577.

    Yaml manifest of Service object created with the command above:

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app.kubernetes.io/instance: prometheus
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: grafana
        app.kubernetes.io/version: 9.2.4
        helm.sh/chart: grafana-6.43.5
      name: grafana-example-service
      namespace: monitoring
    spec:
      clusterIP: 10.111.189.177
      clusterIPs:
      - 10.111.189.177
      externalTrafficPolicy: Cluster
      internalTrafficPolicy: Cluster
      ipFamilies:
      - IPv4
      ipFamilyPolicy: SingleStack
      ports:
      - nodePort: 31577
        port: 3000
        protocol: TCP
        targetPort: 3000
      selector:
        app.kubernetes.io/instance: prometheus
        app.kubernetes.io/name: grafana
      sessionAffinity: None
      type: NodePort

    Assigning a custom nodePort

    If you want to expose the Service on a custom node port, patch/edit the Service object and change the value of spec.ports[].nodePort (it must fall within the configured NodePort range, 30000-32767 by default).
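
    A minimal example of such a patch (the port value 32000 below is just an illustrative choice within the NodePort range):

    kubectl patch svc grafana-example-service -n monitoring --type=json -p '[{"op": "replace", "path": "/spec/ports/0/nodePort", "value": 32000}]'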

    Once the Service got created, we were able to reach grafana with the following URL: http://<NODE_IP_ADDRESS>:31577

    This is made possible by kube-proxy.

    When the Service grafana-example-service was created, kube-proxy created iptables rules within the KUBE-SERVICES chain (which is referenced from the PREROUTING chain), as well as a chain which collects the rules related to all NodePort Services:

    sudo iptables -t nat -L KUBE-SERVICES -n  | column -t
    Chain                      KUBE-SERVICES  (2   references)
    target                     prot           opt  source       destination
    KUBE-SVC-MDD5UT6CKUVXRUP3  tcp            --   0.0.0.0/0    10.98.226.44    /*  loki/loki-write:http-metrics                                      cluster   IP          */     tcp   dpt:3100
    KUBE-SVC-FJOCBQUA67AJTJ4Y  tcp            --   0.0.0.0/0    10.103.120.150  /*  loki/loki-read:grpc                                               cluster   IP          */     tcp   dpt:9095
    KUBE-SVC-GWDJ4KONO5OOHRT4  tcp            --   0.0.0.0/0    10.106.191.67   /*  loki/loki-gateway:http                                            cluster   IP          */     tcp   dpt:80
    KUBE-SVC-XBIRSKPJDNCMT43V  tcp            --   0.0.0.0/0    10.111.129.177  /*  metallb-system/webhook-service                                    cluster   IP          */     tcp   dpt:443
    KUBE-SVC-UZFDVIVO2N6QXLRQ  tcp            --   0.0.0.0/0    10.103.243.43   /*  monitoring/prometheus-kube-prometheus-operator:https              cluster   IP          */     tcp   dpt:443
    KUBE-SVC-L5JLFDCUFDUOSAFE  tcp            --   0.0.0.0/0    10.96.126.22    /*  monitoring/prometheus-grafana:http-web                            cluster   IP          */     tcp   dpt:80
    KUBE-SVC-NPX46M4PTMTKRN6Y  tcp            --   0.0.0.0/0    10.96.0.1       /*  default/kubernetes:https                                          cluster   IP          */     tcp   dpt:443
    KUBE-SVC-OIUYAK75OI4PJHUN  tcp            --   0.0.0.0/0    10.111.189.177  /*  monitoring/grafana-example-service                                cluster   IP          */     tcp   dpt:3000
    KUBE-SVC-FP56U3IB7O2NDDFT  tcp            --   0.0.0.0/0    10.108.50.82    /*  monitoring/prometheus-kube-prometheus-alertmanager:http-web       cluster   IP          */     tcp   dpt:9093
    KUBE-SVC-TCOU7JCQXEZGVUNU  udp            --   0.0.0.0/0    10.96.0.10      /*  kube-system/kube-dns:dns                                          cluster   IP          */     udp   dpt:53
    KUBE-SVC-JD5MR3NA4I4DYORP  tcp            --   0.0.0.0/0    10.96.0.10      /*  kube-system/kube-dns:metrics                                      cluster   IP          */     tcp   dpt:9153
    KUBE-NODEPORTS             all            --   0.0.0.0/0    0.0.0.0/0       /*  kubernetes                                                        service   nodeports;

    iptables evaluates these rules sequentially, in order.

    Rules must be interpreted like this:

    • target: What to do whenever a given packet is matching all entry conditions (can be another rule or an action)
    • prot: The protocol
    • source: Source IP address of packet
    • destination: Destination IP address of packet
    • dpt: Destination port of packet

    Example:

    Consider the following rule:

    target                     prot           opt  source       destination
    KUBE-SVC-OIUYAK75OI4PJHUN  tcp            --   0.0.0.0/0    10.111.189.177  /*  monitoring/grafana-example-service                                cluster   IP          */     tcp   dpt:3000

    Interpreting the rule

    IF transmission protocol = tcp AND
    whatever source IP address (0.0.0.0/0 = ANY) AND
    destination IP address is 10.111.189.177 AND
    destination port is 3000
    THEN
    apply rule KUBE-SVC-OIUYAK75OI4PJHUN

    Moving on with our sample Service: when it was created, the 2 following rules were instantiated by kube-proxy:

    Chain                      KUBE-SERVICES  (2   references)
    target                     prot           opt  source       destination
    KUBE-SVC-OIUYAK75OI4PJHUN  tcp            --   0.0.0.0/0    10.111.189.177  /*  monitoring/grafana-example-service                                cluster   IP          */     tcp   dpt:3000
    KUBE-NODEPORTS             all            --   0.0.0.0/0    0.0.0.0/0       /*  kubernetes                                                        service   nodeports;

    The 1st rule listed above consists of the following entries:

    [test@test~]$ sudo iptables -t nat -L KUBE-SVC-OIUYAK75OI4PJHUN -n  | column -t
    Chain                      KUBE-SVC-OIUYAK75OI4PJHUN  (2   references)
    target                     prot                       opt  source          destination
    KUBE-MARK-MASQ             tcp                        --   !10.244.0.0/16  10.111.189.177  /*  monitoring/grafana-example-service  cluster  IP                 */  tcp  dpt:3000
    KUBE-SEP-LAT64KIID4KEQMCP  all                        --   0.0.0.0/0       0.0.0.0/0       /*  monitoring/grafana-example-service  ->       10.244.0.115:3000  */

    The 1st entry (KUBE-MARK-MASQ) marks the TCP packet as “must go through IP masquerading” whenever the source IP address does NOT belong to 10.244.0.0/16 (in other words, whenever the traffic does not originate from a cluster Pod) AND the destination address is 10.111.189.177 AND the destination port is 3000.

    Then, rule KUBE-SEP-LAT64KIID4KEQMCP is applied.

    Rule KUBE-SEP-LAT64KIID4KEQMCP consists of the following items:

    [test@test ~]$ sudo iptables -t nat -L KUBE-SEP-LAT64KIID4KEQMCP -n  | column -t
    Chain           KUBE-SEP-LAT64KIID4KEQMCP  (1   references)
    target          prot                       opt  source        destination
    KUBE-MARK-MASQ  all                        --   10.244.0.115  0.0.0.0/0    /*  monitoring/grafana-example-service  */
    DNAT            tcp                        --   0.0.0.0/0     0.0.0.0/0    /*  monitoring/grafana-example-service  */  tcp  to:10.244.0.115:3000

    Which means:

    IF the source address is 10.244.0.115 (i.e. the Pod reaching the Service is the same Pod backing it), regardless of the destination IP address, mark the packet as “must go through IP masquerading”.
    THEN, in any case, execute DNAT (Destination Network Address Translation) and forward the packet to 10.244.0.115:3000

    Traffic from source IP addresses which do NOT belong to the cluster's internal network would indeed get discarded; that explains why IP masquerading is required in this case.

    Whenever the 1st rule does not match, meaning the source IP address already belongs to the internal cluster network, no IP masquerading is required, and only the 2nd rule above is applied (the jump to rule KUBE-SEP-LAT64KIID4KEQMCP).

    This rule simply forwards the packet to 10.244.0.115:3000, which is the grafana Pod's IP address:

    [test@test ~]$ kubectl describe pods/prometheus-grafana-5f848c4987-btg95 -n monitoring
    Name:             prometheus-grafana-5f848c4987-btg95
    Namespace:        monitoring
    Priority:         0
    Service Account:  prometheus-grafana
    Node:             cc-sauron/172.25.50.60
    Start Time:       Wed, 16 Nov 2022 15:10:34 +0100
    Labels:           app.kubernetes.io/instance=prometheus
                      app.kubernetes.io/name=grafana
                      pod-template-hash=5f848c4987
    Annotations:      checksum/config: b9e953e845ac788d3c1ac8894062e8234ed2fd5b5ca91d5908976c4daf5c4bb8
                      checksum/dashboards-json-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
                      checksum/sc-dashboard-provider-config: fbdb192757901cdc4f977c611f5a1dceb959a1aa2df9f92542a0c410ce3be49d
                      checksum/secret: 12768ec288da87f3603cb2ca6c39ebc1ce1c2f42e0cee3d9908ba1463576782a
    Status:           Running
    IP:               10.244.0.115

    The traffic therefore eventually reaches the Pod either by going through IP masquerading (re-mapping source IP address) or directly, depending on the initial source IP address.

    Because of these rules, whenever a client connects to http://<NODE_IP_ADDRESS>:31577, traffic is forwarded to the grafana Pod even though there is no LISTENING socket on the node for that port.

    Should any process open a socket and bind it to the same port (31577, in this case), the Pod would still receive all traffic directed to that port since iptables rules are applied as soon as the packet reaches the kernel.
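
    This can be checked directly on the node (assuming the iproute2 ss tool is available); if no process is bound to the port, the command returns no output:

    sudo ss -tlnp | grep 31577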

    We can summarise the traffic flow – from external systems – like this:

    External references

    The following pages helped a lot:

  • DevOps,  DevSecOps,  Kubernetes

    Cloud Security – Curiefense deployment and configuration


    Intro

    Curiefense is an open source project managed by Reblaze (see https://www.reblaze.com/).

    It adds a security layer on top of your existing stack by scanning inbound network traffic.

    It comes with a set of pre-configured rules out of the box which cover most of the known threats.

    Full product documentation is available here: https://docs.curiefense.io/

    Overview

    Curiefense stack is made up of several components:

    Curiefense full stack overview

    As shown in the overview diagram above, all incoming traffic has to go through the proxy. That is where all rules are applied and traffic gets monitored and filtered, before being routed to its final destination (the server).

    Each inbound request gets logged and becomes part of the metrics (data is stored in a MongoDB and a Prometheus instance) and of the traffic logs (stored in Elasticsearch).

    Metrics can then be exposed to Grafana, and traffic logs become available either on Kibana or on Grafana as well (by adding an extra data source pointing to Elasticsearch).

    The rules that determine whether incoming traffic is accepted, rejected or just tagged for future analysis are managed through the Config Server (data is stored in a Redis instance).

    Deploying as NGINX-Ingress

    Official documentation provides instructions to deploy curiefense on top of an existing kubernetes cluster so that it gets attached to the ingress-controller (nginx).

    Official how-to guide is available here: https://docs.curiefense.io/installation/deployment-first-steps/nginx-ingress

    The steps in the official guide linked above cover the Config Server and the corresponding Redis data store, but do NOT cover the other components (proxy, Prometheus, MongoDB, Elasticsearch, Grafana, Kibana).

    Step-by-step guide

    • Create a new namespace:
    kubectl create namespace curiefense
    • In case you need to use a local bucket (rather than storage hosted on some cloud provider):
      • create file secret.yaml with following content:
    export CURIE_BUCKET_LINK=file:///u01/curiefense/prod/manifest.json
    • Create a Secret:
    kubectl create secret generic curiesync --from-file=secret.yaml --dry-run=client -o yaml > curiesync-secret.yaml
    kubectl apply -f curiesync-secret.yaml -n curiefense
    • Create file values.ingress.yaml with following content:
    controller:
      image:
        repository: curiefense/curiefense-nginx-ingress
        tag: e2bd0d43d9ecd7c6544a8457cf74ef1df85547c2
     
      volumes:
        - name: curiesync
          secret:
            secretName: curiesync
     
      volumeMounts:
        - name: curiesync
          mountPath: /etc/curiefense
    • Install the helm chart:
    helm repo add nginx-stable https://helm.nginx.com/stable
    helm repo update
    helm -n curiefense install --version 0.11.1 -f values.ingress.yaml ingress nginx-stable/nginx-ingress
    (wrt version to use, see https://github.com/nginxinc/kubernetes-ingress/releases/tag/v2.0.1)
    • Create a file s3cfg-secret.yaml with the following content (dummy secret values since we are not using a s3 bucket but this secret is required to start up the application):
    apiVersion: v1
    kind: Secret
    metadata:
      name: s3cfg
      namespace: curiefense
    type: Opaque
    stringData:
      s3cfg: |
        [default]
        access_key = test
        secret_key = test
    • Create the secret:
    kubectl -n curiefense apply -f s3cfg-secret.yaml
    • Create a file values.curiefense.yaml with the following content:
    global:
      proxy:
        frontend: "nginx"
     
      settings:
        curieconf_manifest_url: "file:///u01/curiefense/prod/manifest.json"
    • Clone and install the git repo:
    git clone https://github.com/curiefense/curiefense-helm.git
    helm install -n curiefense -f values.curiefense.yaml curiefense ./curiefense-helm/curiefense-helm/curiefense
    • Expose the Config Server web UI to make it accessible via browser:
    kubectl expose service uiserver -n curiefense --port=8088 --target-port=80 --external-ip=172.25.50.44 --name=uiserver-external
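
    The UI should then be reachable at http://172.25.50.44:8088 (the external IP and port passed to the command above), and the new Service can be verified with:
    kubectl get svc uiserver-external -n curiefense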

    Deploying with docker-compose

    In this case, the deployment instructions include all the items listed on the overview diagram above.

    Official how-to deployment guide: https://docs.curiefense.io/installation/deployment-first-steps/docker-compose

    Application components URLs

    Adding Elasticsearch data source on Grafana

    Adding the data source on Grafana

    Testing the traffic filtering rules

    The deployment instructions linked above include a test echoserver to which we can address a malicious request and see how curiefense reacts to that.

    One of the pre-defined rules relates to SQL injection attacks.

    We can simulate the request via curl from the server itself:

    curl -vvv 'http://localhost:30081/?username=%22delete%20from%20a%22'
     
    Response:
    *   Trying ::1:30081...
    * Connected to localhost (::1) port 30081 (#0)
    > GET /?username=%22delete%20from%20a%22 HTTP/1.1
    > Host: localhost:30081
    > User-Agent: curl/7.76.1
    > Accept: */*
    >
    * Mark bundle as not supporting multiuse
    < HTTP/1.1 473 Unknown
    < content-length: 13
    < content-type: text/plain
    < date: Wed, 21 Dec 2022 09:14:19 GMT
    < server: envoy
    <
    * Connection #0 to host localhost left intact
    access denied

    The request has been blocked with HTTP 473 and the error message “access denied”.

    Response codes and messages can be customised through the Config Server UI:

    Curiefense config server UI
    Curiefense content filter rule

    The request has been logged on Elasticsearch with the following JSON entry:

    {
        "_id": "9VT2M4UBh8jqMva1ar--",
        "_type": "_doc",
        "_index": "curieaccesslog-2022.12.20-000001",
        "sort": [
            1671614064340,
            17
        ],
        "@timestamp": "2022-12-21T09:14:24.340Z",
        "global_filter_triggers": [
            {
                "id": "3c266a476d1e",
                "name": "test",
                "active": false
            }
        ],
        "response_code": 473,
        "logs": [
            "D 0µs Inspection init",
            "D 22µs Inspection starts (grasshopper active: true)",
            "D 38µs CFGLOAD logs start",
            "D 1258µs Loading configuration from /cf-config/current/config",
            "D 1645580µs Loaded profile __defaultcontentfilter__ with 188 rules",
            "D 40µs CFGLOAD logs end",
            "D 41µs Selected hostmap default security policy",
            "D 92µs Selected hostmap entry __root_entry__",
            "D 94µs map_request starts",
            "D 110µs headers mapped",
            "D 118µs geoip computed",
            "D 123µs uri parsed",
            "D 123µs no body to parse",
            "D 124µs args mapped",
            "D 198µs challenge phase2 ignored",
            "D 198µs Global filter decision [BlockReason { initiator: GlobalFilter { id: \"3c266a476d1e\", name: \"test\" }, location: Request, extra_locations: [], decision: Monitor, extra: Null }]",
            "D 216µs limit checks done",
            "D 222µs ACL result: bot(none)/human(none)",
            "D 503µs matching content filter signatures: true",
            "D 515µs signature matched [0..15] ContentFilterRule { id: \"100007\", operand: \"(\\\"|'|\\\\s|;)delete\\\\s+from\\\\s+.+(--|'|\\\"|;)\", risk: 5, category: \"sqli\", subcategory: \"statement injection\", tags: {\"rtc:injection\"} }",
            "D 545µs Content Filter checks done"
        ],
        "profiling": [
            {
                "name": "secpol",
                "value": 93
            },
            {
                "name": "mapping",
                "value": 184
            },
            {
                "name": "flow",
                "value": 209
            },
            {
                "value": 216,
                "name": "limit"
            },
            {
                "name": "acl",
                "value": 223
            },
            {
                "value": 544,
                "name": "content_filter"
            }
        ],
        "content_filter_triggers": [
            {
                "id": "100007",
                "risk_level": 5,
                "value": "\"delete from a\"",
                "name": "username",
                "request_element": "uri",
                "active": true
            }
        ],
        "log": {
            "offset": 0,
            "file": {
                "path": ""
            }
        },
        "proxy": [
            {
                "name": "geo_long"
            },
            {
                "name": "geo_lat"
            },
            {
                "name": "container",
                "value": "curieproxyenvoy"
            }
        ],
        "authority": "localhost:30081",
        "reason": "blocking - content filter 100007[lvl5] - [URI argument username=\"delete from a\"]",
        "headers": [
            {
                "name": "x-envoy-internal",
                "value": "true"
            },
            {
                "value": "3b962173-5af0-4c21-b7af-f7b631148f82",
                "name": "x-request-id"
            },
            {
                "value": "curl/7.76.1",
                "name": "user-agent"
            },
            {
                "name": "accept",
                "value": "*/*"
            },
            {
                "name": "x-forwarded-for",
                "value": "172.18.0.1"
            },
            {
                "value": "https",
                "name": "x-forwarded-proto"
            }
        ],
        "ip": "172.18.0.1",
        "uri": "/?username=%22delete%20from%20a%22",
        "processing_stage": 6,
        "security_config": {
            "cf_rules": 188,
            "rate_limit_rules": 4,
            "global_filters_active": 7,
            "revision": "10861a33c58a25fe433596a736d6af8803e85214",
            "acl_active": true,
            "cf_active": true
        },
        "path_parts": [
            {
                "name": "path",
                "value": "/"
            }
        ],
        "path": "/",
        "method": "GET",
        "curiesession": "ab026b48001ae1563689b0171cf7966cefc4f75524f1c3f403cfdfb7",
        "timestamp": "2022-12-21T09:14:20.243210800Z",
        "trigger_counters": {
            "content_filters_active": 1,
            "acl": 0,
            "acl_active": 0,
            "global_filters": 1,
            "global_filters_active": 0,
            "rate_limit": 0,
            "rate_limit_active": 0,
            "content_filters": 1
        },
        "acl_triggers": [],
        "ecs": {
            "version": "1.8.0"
        },
        "restriction_triggers": [],
        "arguments": [
            {
                "name": "username",
                "value": "\"delete from a\""
            }
        ],
        "rate_limit_triggers": [],
        "input": {
            "type": "stdin"
        },
        "host": {
            "name": "curieproxyenvoy"
        },
        "agent": {
            "hostname": "curieproxyenvoy",
            "ephemeral_id": "0e742beb-b416-44ad-880e-08b35c69229b",
            "id": "9fee5a67-484e-4ad9-8740-29ba9c8aa9ec",
            "name": "curieproxyenvoy",
            "type": "filebeat",
            "version": "7.13.3"
        },
        "cookies": [],
        "tags": [
            "cookies:0",
            "geo-region:nil",
            "action:content-filter-block",
            "geo-org:nil",
            "geo-city:nil",
            "cf-rule-subcategory:statement-injection",
            "cf-rule-id:100007",
            "geo-continent-code:nil",
            "action:monitor",
            "securitypolicy:default-security-policy",
            "host:localhost:30081",
            "ip:172-18-0-1",
            "all",
            "bot",
            "args:1",
            "geo-country:nil",
            "headers:6",
            "securitypolicy-entry:--root--",
            "aclid:--acldefault--",
            "aclname:acl-default",
            "contentfilterid:--defaultcontentfilter--",
            "cf-rule-category:sqli",
            "rtc:injection",
            "geo-subregion:nil",
            "contentfiltername:default-contentfilter",
            "cf-rule-risk:5",
            "geo-continent-name:nil",
            "geo-asn:nil",
            "network:nil",
            "status:473",
            "status-class:4xx"
        ],
        "curiesession_ids": []
    }

    Configuring as edge reverse proxy

    Curiefense comes with an envoy proxy which can be used as edge proxy.

    When following the deployment with docker-compose, this component is included.

    There are some steps to take into account when configuring it as a reverse proxy.

    The curiefense envoy reverse proxy image is built on top of the official envoy proxy image. For more details, see https://docs.curiefense.io/reference/services-container-images#curieproxy-envoy

    Even though the official documentation linked above mentions a single configuration, where requests are proxied to 1 destination (TARGET_ADDRESS:TARGET_PORT), the image actually comes pre-configured to route requests toward 2 back-ends (on both 443 and 80).

    Standard envoy proxy (see https://www.envoyproxy.io/ ) can indeed be configured with multiple back-ends, but the docker image built on top of it by curiefense (see https://github.com/curiefense/curiefense/tree/main/curiefense/images/curieproxy-envoy ) actually includes 4 environment variables (2 for each back-end: one for the address and one for the port).

    The docker-compose.yaml part of the repo (https://github.com/curiefense/curiefense.git) looks like this (available at https://github.com/curiefense/curiefense/tree/main/deploy/compose):

    version: "3.7"
    services:
      curieproxyenvoy:
        container_name: curieproxyenvoy
        hostname: curieproxyenvoy
        image: "curiefense/curieproxy-envoy:${DOCKER_TAG}"
        restart: always
        volumes:
          - curieproxy_config:/cf-config
          - ./filebeat/ilm.json:/usr/share/filebeat/ilm.json
          - ./filebeat/template.json:/usr/share/filebeat/template.json
        environment:
          - ENVOY_UID
          - TARGET_ADDRESS_A=${TARGET_ADDRESS_A:-echo}           # 1st back-end
          - TARGET_PORT_A=${TARGET_PORT_A:-8080}             # 1st back-end
          - TARGET_ADDRESS_B=${TARGET_ADDRESS_B:-juiceshop}      # 2nd back-end
          - TARGET_PORT_B=${TARGET_PORT_B:-3000}             # 2nd back-end
          - XFF_TRUSTED_HOPS
          - ENVOY_LOG_LEVEL
          - FILEBEAT
          - FILEBEAT_LOG_LEVEL
          - ELASTICSEARCH_URL=${ELASTICSEARCH_URL:-http://elasticsearch:9200}
          - KIBANA_URL=${KIBANA_URL:-http://kibana:5601}
        networks:
          curiemesh:
            aliases:
              - curieproxy
        ports:
          - "30081:80"                                           # routing traffic from host port 30081 to container port 80
          - "30082:81"                                           # routing traffic from host port 30082 to container port 81
          - "30444:443"                                          # routing traffic from host port 30444 to container port 443
          - "30445:444"                                          # routing traffic from host port 30445 to container port 444
          - "8001:8001"                                          # routing traffic from host port 8001 to container port 8001
        secrets:
          - curieproxysslcrt
          - curieproxysslkey

    Environment variables referring to elastic/kibana endpoints are also listed.

    The service attributes also include 2 secrets (curieproxysslcrt and curieproxysslkey): they refer to these objects, still part of the same docker-compose.yaml file:

    secrets:
      curieproxysslcrt:
        file: "curiesecrets/curieproxy_ssl/site.crt"
      curieproxysslkey:
        file: "curiesecrets/curieproxy_ssl/site.key"

    The secrets relate to the TLS certificate and private key that will be served by the reverse proxy.

    The file paths are relative; the root folder is the one hosting the docker-compose.yaml file (curiefense/deploy/compose, where curiefense is the folder created when you cloned the git repo).

    To configure your own TLS certificate you can either overwrite the 2 files above or change the “secrets” configuration within docker-compose.yaml.

    Regarding the ports to be exposed: as shown in the code snippet above, by default host ports 80 and 443 are not published. To expose them, the following ports configuration will do the job:

        ports:
          - "80:80"
          - "443:443"

    Assuming you need just 1 back-end, the 2 remaining values to be customised relate to TARGET_ADDRESS_A and TARGET_PORT_A.

    They both refer to environment variables.

    Customising environment variables, when using docker-compose, can be achieved in 2 ways:

    • defining their name/value within a file named .env available within the same folder of docker-compose.yaml (recommended)
    • exporting the variable name/value as system-wide environment variable (e.g. export NAME=value). By doing so, you would override any value defined within .env file mentioned above.

    To assign the back-end address/port we can therefore create/edit the .env file:

    TARGET_ADDRESS_A=back-end1.sample.demo
    TARGET_PORT_A=443
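
    After changing .env, the proxy container must be recreated so that the new values are picked up; from the compose folder, for example:

    docker-compose up -d curieproxyenvoy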

    The curiefense proxy image had to be re-built because, as it is, proxying over HTTPS does not work properly.

    The curieproxy-envoy image builds up the complete envoy.yaml configuration file (the main configuration file for the envoy proxy) by putting together the three following files, all available at ~/curiefense/curiefense/curieproxy:

    • envoy.yaml.head
    • envoy.yaml.tls
    • envoy.yaml.tail

    The sections that need to be adapted are in envoy.yaml.tail and envoy.yaml.tls. File envoy.yaml.tail:

      clusters:
      - name: target_site_a
        connect_timeout: 25s
        type: strict_dns # static
    # START EXTRA SECTION 1
        transport_socket:
          name: envoy.transport_sockets.tls
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
            common_tls_context:
              tls_certificates:
                - certificate_chain:
                    filename: "/run/secrets/curieproxysslcrt"
                  private_key:
                    filename: "/run/secrets/curieproxysslkey"
              alpn_protocols: ["h2,http/1.1"]
    # END EXTRA SECTION 1
        # Comment out the following line to test on v6 networks
        dns_lookup_family: V4_ONLY
        lb_policy: round_robin
    # START EXTRA SECTION 2
        typed_extension_protocol_options:
          envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
            "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
            explicit_http_config:
              http2_protocol_options:
                initial_stream_window_size: 65536  # 64 KiB
                initial_connection_window_size: 1048576  # 1 MiB
    # END EXTRA SECTION 2
        load_assignment:
          cluster_name: target_site_a
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: TARGET_ADDRESS_A
                    port_value: TARGET_PORT_A
      - name: target_site_b
        connect_timeout: 25s
        type: strict_dns # static
        # Comment out the following line to test on v6 networks
        dns_lookup_family: V4_ONLY
        lb_policy: round_robin
        load_assignment:
          cluster_name: target_site_b
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: TARGET_ADDRESS_B
                    port_value: TARGET_PORT_B

     File envoy.yaml.tls:

      - name: tls
        address:
          socket_address:
            address: 0.0.0.0
            port_value: 443
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              stat_prefix: ingress_http
              codec_type: auto
              use_remote_address: true
              skip_xff_append: false
              access_log:
                name: envoy.file_access_log
                typed_config:
                  "@type": "type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog"
                  path: /dev/stdout
                  log_format:
                    text_format_source:
                      inline_string: "%DYNAMIC_METADATA(com.reblaze.curiefense:request.info)%\n"
                    content_type: "application/json"
              route_config:
                name: local_route
                virtual_hosts:
                - name: target_site_a
                  domains: ["*"]
                  routes:
                  - match:
                      prefix: "/"
                    route:
                      cluster: target_site_a
                    metadata:
                      filter_metadata:
                        envoy.filters.http.lua:
                          xff_trusted_hops: 1
              http_filters:
              - name: envoy.filters.http.lua
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
                  default_source_code:
                    inline_string: |
                      local session = require "lua.session_envoy"
                      function envoy_on_request(handle)
                        session.inspect(handle)
                      end
                      function envoy_on_response(handle)
                        session.on_response(handle)
                      end
              - name: envoy.filters.http.router
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_certificates:
                - certificate_chain:
                    filename: "/run/secrets/curieproxysslcrt"
                  private_key:
                    filename: "/run/secrets/curieproxysslkey"
    # START EXTRA SECTION
                alpn_protocols: ["h2,http/1.1"]
    # END EXTRA SECTION
      - name: tlsjuice
        address:
          socket_address:
            address: 0.0.0.0
            port_value: 444
        filter_chains:
        - filters:
          - name: envoy.filters.network.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
              stat_prefix: ingress_http
              codec_type: auto
              use_remote_address: true
              skip_xff_append: false
              access_log:
                name: "envoy.access_loggers.tcp_grpc"
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.access_loggers.grpc.v3.HttpGrpcAccessLogConfig
                  common_config:
                    log_name: "test_GRPC_log"
                    transport_api_version: "v3"
                    grpc_service:
                      envoy_grpc:
                        cluster_name: grpc_log_cluster
              route_config:
                name: local_route
                virtual_hosts:
                - name: target_site_b
                  domains: ["*"]
                  routes:
                  - match:
                      prefix: "/"
                    route:
                      cluster: target_site_b
                    metadata:
                      filter_metadata:
                        envoy.filters.http.lua:
                          xff_trusted_hops: 1
              http_filters:
              - name: envoy.filters.http.lua
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
                  default_source_code:
                    inline_string: |
                      local session = require "lua.session_envoy"
                      function envoy_on_request(handle)
                        session.inspect(handle)
                      end
                      function envoy_on_response(handle)
                        session.on_response(handle)
                      end
              - name: envoy.filters.http.router
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_certificates:
                  - certificate_chain:
                      filename: "/run/secrets/curieproxysslcrt"
                    private_key:
                      filename: "/run/secrets/curieproxysslkey"

    Once the 2 files listed above have been changed by adding the extra sections visible on the code snippets, you need to re-build the image.

    cd into the folder curiefense/curiefense/images and you will find the following objects:

    drwxrwxr-x 16 pxcs-admin pxcs-admin 4.0K Dec 24 14:35 .
    drwxrwxr-x  6 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 ..
    -rwxrwxr-x  1 pxcs-admin pxcs-admin 3.9K Dec 23 15:46 build-docker-images.sh
    drwxrwxr-x  4 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 confserver
    drwxrwxr-x  2 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 curiefense-nginx-ingress
    drwxrwxr-x  2 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 curiefense-rustbuild
    drwxrwxr-x  3 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 curieproxy-envoy
    drwxrwxr-x  3 pxcs-admin pxcs-admin 4.0K Dec 23 15:07 curieproxy-extproc
    drwxrwxr-x  3 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 curieproxy-istio
    drwxrwxr-x  3 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 curieproxy-nginx
    drwxrwxr-x  3 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 curiesync
    drwxrwxr-x  2 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 extproc
    drwxrwxr-x  3 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 grafana
    drwxrwxr-x  2 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 openresty
    drwxrwxr-x  2 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 prometheus
    drwxrwxr-x  2 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 redis
    drwxrwxr-x  2 pxcs-admin pxcs-admin 4.0K Dec 22 15:08 traffic-metrics-exporter

    The script build-docker-images.sh must be executed; it takes care of re-building the new image, including the changes you applied to the envoy yaml configuration files.

    As it is, the script rebuilds the images for all components of the curiefense stack. In case you need to re-build only curieproxy-envoy, you will have to edit the script and comment out all the rest:

    #! /bin/bash
     
    # Change directory to this script's location
    cd "${0%/*}" || exit 1
     
    # Parameters should be passed as environment variables.
    # By default, builds and tags images locally, without pushing
    # To push, set `PUSH=1`
    # To specify a different repo, set `REPO=my.repo.tld`
     
    REPO=${REPO:-curiefense}
    BUILD_OPT=${BUILD_OPT:-}
    BUILD_RUST=${BUILD_RUST:-yes}
     
    declare -A status
     
    GLOBALSTATUS=0
     
    if [ -z "$DOCKER_TAG" ]
    then
        GITTAG="$(git describe --tag --long --dirty)"
        DOCKER_DIR_HASH="$(git rev-parse --short=12 HEAD:curiefense)"
        DOCKER_TAG="${DOCKER_TAG:-$GITTAG-$DOCKER_DIR_HASH}"
    fi
     
    STOP_ON_FAIL=${STOP_ON_FAIL:-yes}
     
    IFS=' ' read -ra RUST_DISTROS <<< "${RUST_DISTROS:-bionic focal}"
     
    if [ -n "$TESTIMG" ]; then
        IMAGES=("$TESTIMG")
        OTHER_IMAGES_DOCKER_TAG="$DOCKER_TAG"
        DOCKER_TAG="test"
        echo "Building only image $TESTIMG"
    else
    # SECTION BELOW DEFINES WHICH IMAGES WILL BE RE_BUILT
    #    IMAGES=(confserver curieproxy-istio curieproxy-envoy \
    #        curieproxy-nginx curiesync grafana prometheus extproc \
    #        redis traffic-metrics-exporter)
            IMAGES=(curieproxy-envoy)
    fi
    . . .

    Once you adapted the script according to your needs, you can run it (as root).
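
    For example, from the images folder:

    sudo ./build-docker-images.sh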

    Once the execution completes, stdout will show the version of image just built (e.g. v1.5.0-824-gc904993f-dirty-88950e011065).

    Now, since we decided to rebuild only this specific image and not the images of all containers, we need to define a new environment variable (added to the .env file) and reference it in the curieproxyenvoy service section of docker-compose.yaml.

    .env:

    ENVOY_UID=0
    DOCKER_TAG=main
     
    # BELOW THE IMAGE VERSION TO BE USED FOR ENVOY_PROXY
    DOCKER_TAG_ENVOY_PROXY="v1.5.0-824-gc904993f-dirty-88950e011065"
     
    XFF_TRUSTED_HOPS=1
    ENVOY_LOG_LEVEL=error
    EXTPROC_LOG_LEVEL=info
    ELASTICSEARCH="--elasticsearch http://elasticsearch:9200/"
    FILEBEAT=yes
    CURIE_BUCKET_LINK=file:///bucket/prod/manifest.json

    docker-compose.yaml (extract):

    version: "3.7"
    services:
     
      curieproxyenvoy:
        container_name: curieproxyenvoy
        hostname: curieproxyenvoy
        #image: "curiefense/curieproxy-envoy:${DOCKER_TAG}"
        image: "curiefense/curieproxy-envoy:${DOCKER_TAG_ENVOY_PROXY}"   # <-- CUSTOM IMAGE VERSION
        restart: always
        volumes:
          - curieproxy_config:/cf-config
          - ./filebeat/ilm.json:/usr/share/filebeat/ilm.json
          - ./filebeat/template.json:/usr/share/filebeat/template.json
        environment:
          - ENVOY_UID
          - TARGET_ADDRESS_A=${TARGET_ADDRESS_A:-pxcs-service.sandbox.diit.health}
          - TARGET_PORT_A=${TARGET_PORT_A:-443}
          - TARGET_ADDRESS_B=${TARGET_ADDRESS_B:-juiceshop}
          - TARGET_PORT_B=${TARGET_PORT_B:-3000}
          - XFF_TRUSTED_HOPS
          - ENVOY_LOG_LEVEL
          - FILEBEAT
          - FILEBEAT_LOG_LEVEL
          - ELASTICSEARCH_URL=${ELASTICSEARCH_URL:-http://elasticsearch:9200}
          - KIBANA_URL=${KIBANA_URL:-http://kibana:5601}
        networks:
          curiemesh:
            aliases:
              - curieproxy
        ports:
          - "80:80"
          - "443:443"
          - "8001:8001"
        secrets:
          - curieproxysslcrt
          - curieproxysslkey
    . . .
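
    With the variable in place, recreating only the affected service should be enough for docker-compose to pick up the custom image (a minimal sketch, assuming the service name from the extract above):

    # re-create only the envoy proxy container with the newly built image
    docker-compose up -d --force-recreate curieproxyenvoy
    # verify which image and tag the running container is using
    docker-compose images curieproxyenvoy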

    Caveats

    Replacing elasticsearch with Grafana-loki does not seem to be possible at the time of writing.

    The only reference I could find to a possible implementation is this GitHub issue: https://github.com/curiefense/curiefense/issues/4

  • Grafana loki
    DevOps,  Grafana,  Kubernetes,  Monitoring Tools

    Kubernetes observability – log aggregation – Grafana-loki deployment and configuration

    Table of Contents

    Intro

    This page describes how to deploy and apply basic configurations – including retention policies – to a promtail-loki stack.

    Loki is a log storage solution tightly integrated with Grafana. It can ingest logs from multiple sources (in our case, containers), index them and make them accessible via the Grafana UI.

    Its functionality overlaps with elasticsearch, but Grafana-loki is more lightweight since it indexes only each entry's metadata (labels), not the entire content of each log line.

    Data can be pushed into loki with multiple solutions (e.g. promtail, fluent bit, fluentd, logstash, etc.). See https://grafana.com/docs/loki/latest/clients/

    This page describes how to use promtail for this purpose.

    The following setup is not meant to be used in production environments.

    Requirements

    • A k8s cluster including Grafana
    • All appropriate configuration in place to use the kubectl command-line tool

    Loki deployment

    • Add loki helm chart repo
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update
    • Create a file values.yaml to store all chart settings that must be overridden from the default values
    loki:
      commonConfig:
        replication_factor: 1
      storage:
        type: 'filesystem'
      compactor:
        working_directory: /var/loki/data/retention
        shared_store: filesystem
        compaction_interval: 10m
        retention_enabled: true
        retention_delete_delay: 2h
        retention_delete_worker_count: 150
      schema_config:
        configs:
          - from: "2022-12-01"
            index:
                period: 24h
                prefix: loki_index_
            object_store: filesystem
            schema: v11
            store: boltdb-shipper
      storage_config:
        boltdb_shipper:
            active_index_directory: /var/loki/data/index
            cache_location: /var/loki/data/boltdb-cache
            shared_store: filesystem
      limits_config:
        retention_period: 24h
    write:
      replicas: 1
    read:
      replicas: 1
    • Create the namespace
    kubectl create namespace loki
    • Create 2 PersistentVolumes that will be used by the loki read / write components (see the kubectl apply example after this list)
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: loki-pv-1
      namespace: loki
    spec:
      accessModes:
      - ReadWriteOnce
      capacity:
        storage: 10Gi
      persistentVolumeReclaimPolicy: Retain
      local:
        path: [YOUR_NODE_LOCAL_STORAGE_FOLDER_1]
      nodeAffinity:
        required:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - [YOUR_NODE_NAME]
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: loki-pv-2
      namespace: loki
    spec:
      accessModes:
      - ReadWriteOnce
      capacity:
        storage: 10Gi
      persistentVolumeReclaimPolicy: Retain
      local:
        path: [YOUR_NODE_LOCAL_STORAGE_FOLDER_2]
      nodeAffinity:
        required:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - [YOUR_NODE_NAME]
    • Install the helm chart
    helm install --values values.yaml loki --namespace=loki grafana/loki-simple-scalable
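
    Note that the two PersistentVolume manifests above must be applied before the read/write pods can bind their claims. Assuming they were saved to a file named loki-pv.yaml (hypothetical name):

    # PersistentVolumes are cluster-scoped; the namespace only matters for the claims
    kubectl apply -f loki-pv.yaml
    kubectl get pv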

    Once all components have started up, you should see the following:

    [rockylinux@test-vm grafana-loki]$ kubectl get all -n loki
    NAME                                               READY   STATUS    RESTARTS   AGE
    pod/loki-gateway-55fccf8654-vcxqt                  1/1     Running   0          23h
    pod/loki-grafana-agent-operator-684b478b77-vwh9g   1/1     Running   0          23h
    pod/loki-logs-wwcp5                                2/2     Running   0          23h
    pod/loki-read-0                                    1/1     Running   0          32m
    pod/loki-write-0                                   1/1     Running   0          32m
     
    NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
    service/loki-gateway          ClusterIP   10.106.191.67    <none>        80/TCP              23h
    service/loki-memberlist       ClusterIP   None             <none>        7946/TCP            23h
    service/loki-read             ClusterIP   10.103.120.150   <none>        3100/TCP,9095/TCP   23h
    service/loki-read-headless    ClusterIP   None             <none>        3100/TCP,9095/TCP   23h
    service/loki-write            ClusterIP   10.98.226.44     <none>        3100/TCP,9095/TCP   23h
    service/loki-write-headless   ClusterIP   None             <none>        3100/TCP,9095/TCP   23h
     
    NAME                       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    daemonset.apps/loki-logs   1         1         1       1            1           <none>          23h
     
    NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/loki-gateway                  1/1     1            1           23h
    deployment.apps/loki-grafana-agent-operator   1/1     1            1           23h
     
    NAME                                                     DESIRED   CURRENT   READY   AGE
    replicaset.apps/loki-gateway-55fccf8654                  1         1         1       23h
    replicaset.apps/loki-grafana-agent-operator-684b478b77   1         1         1       23h
     
    NAME                          READY   AGE
    statefulset.apps/loki-read    1/1     23h
    statefulset.apps/loki-write   1/1     23h
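
    As a quick sanity check before deploying promtail, you can port-forward the write service and hit Loki's /ready endpoint (a minimal sketch, assuming the service names shown above):

    kubectl -n loki port-forward svc/loki-write 3100:3100 &
    # prints "ready" once the component is up
    curl -s http://localhost:3100/ready
    # stop the port-forward started above
    kill %1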

    Promtail deployment

    Promtail tails log files and pushes them into loki.

    To deploy all required components, apply the following yaml:

    --- # Daemonset.yaml
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: promtail-daemonset
    spec:
      selector:
        matchLabels:
          name: promtail
      template:
        metadata:
          labels:
            name: promtail
        spec:
          serviceAccount: promtail-serviceaccount
          containers:
          - name: promtail-container
            image: grafana/promtail
            args:
            - -config.file=/etc/promtail/promtail.yaml
            env:
            - name: 'HOSTNAME' # needed when using kubernetes_sd_configs
              valueFrom:
                fieldRef:
                  fieldPath: 'spec.nodeName'
            volumeMounts:
            - name: logs
              mountPath: /var/log
            - name: promtail-config
              mountPath: /etc/promtail
            - mountPath: /var/lib/docker/containers
              name: varlibdockercontainers
              readOnly: true
          volumes:
          - name: logs
            hostPath:
              path: /var/log
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
          - name: promtail-config
            configMap:
              name: promtail-config
    --- # configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: promtail-config
    data:
      promtail.yaml: |
        server:
          http_listen_port: 9080
          grpc_listen_port: 0
     
        clients:
          - url: http://loki-write.loki.svc.cluster.local:3100/loki/api/v1/push
            tenant_id: 1
     
        positions:
          filename: /tmp/positions.yaml
        target_config:
          sync_period: 10s
        scrape_configs:
        - job_name: pod-logs
          kubernetes_sd_configs:
            - role: pod
          pipeline_stages:
            - docker: {}
          relabel_configs:
            - source_labels:
                - __meta_kubernetes_pod_node_name
              target_label: __host__
            - action: labelmap
              regex: __meta_kubernetes_pod_label_(.+)
            - action: replace
              replacement: $1
              separator: /
              source_labels:
                - __meta_kubernetes_namespace
                - __meta_kubernetes_pod_name
              target_label: job
            - action: replace
              source_labels:
                - __meta_kubernetes_namespace
              target_label: namespace
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_name
              target_label: pod
            - action: replace
              source_labels:
                - __meta_kubernetes_pod_container_name
              target_label: container
            - replacement: /var/log/pods/*$1/*.log
              separator: /
              source_labels:
                - __meta_kubernetes_pod_uid
                - __meta_kubernetes_pod_container_name
              target_label: __path__
    --- # Clusterrole.yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: promtail-clusterrole
    rules:
      - apiGroups: [""]
        resources:
        - nodes
        - services
        - pods
        verbs:
        - get
        - watch
        - list
     
    --- # ServiceAccount.yaml
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: promtail-serviceaccount
     
    --- # Clusterrolebinding.yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: promtail-clusterrolebinding
    subjects:
        - kind: ServiceAccount
          name: promtail-serviceaccount
          namespace: default
    roleRef:
        kind: ClusterRole
        name: promtail-clusterrole
        apiGroup: rbac.authorization.k8s.io
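
    Assuming all of the manifests above are saved to a single file, e.g. promtail.yaml (hypothetical name), they can be applied and verified as follows:

    kubectl apply -f promtail.yaml
    # one promtail pod should be scheduled on each node
    kubectl get daemonset promtail-daemonset
    kubectl logs daemonset/promtail-daemonset -c promtail-container --tail=20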

    Loki datasource configuration on Grafana admin UI

    Datasource configuration on Grafana
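
    If Grafana runs inside the same cluster, the datasource URL can point at the gateway service created by the chart (visible in the kubectl get all output above); a plausible value, assuming the loki namespace used here, is http://loki-gateway.loki.svc.cluster.local. Depending on whether multi-tenancy is enabled on Loki, an X-Scope-OrgID header matching the tenant_id configured in promtail may also be needed.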

    Browsing logs from Grafana UI

    Data browser on Grafana
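
    Once logs are flowing, the labels attached by the relabel rules above (namespace, pod, container, job) can be used to filter them in Grafana's Explore view. A couple of example LogQL queries (the pod and container names are illustrative):

    {namespace="loki", container="loki"}
    {job="loki/loki-write-0"} |= "error"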