Some of these date back to older versions but efforts are made to keep the most important - sometimes :)
Design Docs
- 1: ARM
- 1.1: CentOS ARM Analysis
- 1.2: RedHat ARM Analysis
- 1.3: Ubuntu ARM Analysis
- 2: Autoscaling
- 3: AWS
- 4: Backup
- 4.1: Operational
- 4.2: Cloud
- 5: Cache Storage
- 6: CI/CD
- 7: Command Line
- 8: Harbor Registry
- 9: Health Monitor
- 10: Infrastructure
- 11: Kubernetes/Vault Integration
- 12: Kafka Authentication
- 13: Kafka Monitoring Tools
- 14: Kubernetes HA
- 15: Leader Election Pod
- 16: Modularization
- 17: Offline Upgrade
- 18: Persistence Storage
- 19: PostgreSQL
- 20: Ceph (Rook)
1 - ARM
Some of these date back to older versions but efforts are made to keep the most important - sometimes :)
LambdaStack ARM design document
Affected version: 1.1.x
Goals
This document outlines an approach to add (partial) ARM support to LambdaStack. The requirements:
- ARMv8/ARM64 architecture
- CentOS 7
- "any" provider as we do not want to provide ARM infrastructure on any cloud providers yet through Terraform
- LambdaStack components needed ordered by priority:
- Kubernetes
- Kafka
- HAProxy
- Keycloak (This is the only deployment we need to support from the `applications` role)
- PostgreSQL (Would only be used by Keycloak and only needs to support a single deployment)
- RabbitMQ
- Logging (ELK + filebeat)
- Monitoring (Prometheus + Grafana + Exporters)

Initial research here shows additional information about available packages and affected roles for each component.
Approach
The two high-level approaches that have been proposed so far:
- Add “architecture” flag when using LambdaStack
- Add a new OS (e.g. CentosARM64)
Both have big disadvantages from the start:
- Will require additional input, which makes things more confusing as users will need to supply not only the OS but also the architecture for an (offline) install. This should not be needed, as we can detect the architecture we are working on at all required levels.
- Does not require additional input, but will lead to code duplication in the `repository` role, as we would then need to maintain `download-requirements.sh` for each OS and architecture.
That is why I opt for an approach where we don't add any architecture flag or new additional OS. The architecture we can handle at the code level, and on the OS level only the `requirements.txt` might be different for each architecture, as indicated by the initial research here.
Changes required
Repository role
In the repository role we need to change the download of the requirements to support additional architectures, because the download requirements might differ:
- Some components/roles might not have packages/binaries/containers that support ARM
- Some filenames for binaries will be different per architecture
- Some package repositories will have different URLs per architecture
Hence we should make a requirements.txt for each architecture we want to support, for example:
- requirements_x86_64.txt (Should be the default and present)
- requirements_arm64.txt
The `download-requirements.sh` script should be able to figure out which one to select based on the output of `uname -i`.
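A minimal sketch of that selection logic, assuming the file names proposed above (the `aarch64`/`arm64` mapping is an assumption):

```bash
# Pick the requirements file matching the machine architecture (sketch).
ARCH=$(uname -i)
case "$ARCH" in
  x86_64)        REQUIREMENTS="requirements_x86_64.txt" ;;
  aarch64|arm64) REQUIREMENTS="requirements_arm64.txt" ;;
  *) echo "Unsupported architecture: $ARCH" >&2; exit 1 ;;
esac
echo "Using $REQUIREMENTS"
```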
Download role
In the download role, which is used to download plain files from the repository, we should add support for filename patterns and automatically look for the current architecture (optionally with a regex-based suffix like `linux[_-]amd64\.(tar\.gz|tar|zip)`):

For example, select between:
- haproxy_exporter-0.12.0.linux-x86_64.tar.gz
- haproxy_exporter-0.12.0.linux-arm64.tar.gz

based on the `ansible_architecture` fact.
Note that this should be optional, as some filenames do not contain the architecture (Java-based packages, for example).
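A small shell sketch of the same idea, assuming the example file names above and a `uname -m` to suffix mapping:

```bash
# Resolve an architecture-specific filename (sketch; names are illustrative).
case "$(uname -m)" in
  x86_64)  SUFFIX="x86_64" ;;
  aarch64) SUFFIX="arm64" ;;
esac
FILE="haproxy_exporter-0.12.0.linux-${SUFFIX}.tar.gz"
echo "Selected: $FILE"
```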
Architecture support for each component/role
As per current requirements not every LambdaStack component is required to support ARM and there might be cases that a component/role can't support ARM as indicated by initial research here.
That's why every component/role should be marked with the architectures it supports. Maybe something in `<rolename>/defaults/main.yml` like:
supported_architectures:
- all ?
- x86_64
- arm64
We can assume the role/component will support everything if `all` is defined or if `supported_architectures` is not present.
Pre-flight check
The `preflight` should be expanded to check if all the components/roles we want to install from the inventory actually support the architecture we want to use. We should be able to do this with the definition from the above point. This way we will make sure people can only install components on ARM which we actually support.
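A minimal shell sketch of such a check, assuming the `supported_architectures` layout proposed above (the role names and file paths are assumptions, not the actual preflight implementation):

```bash
# Fail early if any requested role does not support the current architecture (sketch).
ARCH=$(uname -m)            # e.g. x86_64 or aarch64
[ "$ARCH" = "aarch64" ] && ARCH="arm64"
for role in kubernetes-master kafka haproxy; do
  defaults="roles/${role}/defaults/main.yml"
  # Roles without the key (or with "all") are assumed to support every architecture.
  if [ -f "$defaults" ] && grep -q '^supported_architectures:' "$defaults"; then
    if ! grep -Eq "^[[:space:]]*-[[:space:]]*(all|${ARCH})[[:space:]]*$" "$defaults"; then
      echo "ERROR: role '${role}' does not support ${ARCH}" >&2
      exit 1
    fi
  fi
done
```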
Replace Skopeo with Crane
Currently we use Skopeo to download the image requirements. Skopeo however has the following issues with newer versions:
- No support anymore for universal Go binaries. Each OS would need to have its own build version
- Sketchy support for ARM64
That is why we should replace it with Crane.
- This tool can do the same as Skopeo:
./skopeo --insecure-policy copy docker://kubernetesui/dashboard:v2.3.1 docker-archive:skopeodashboard:v2.3.1
./crane pull --insecure kubernetesui/dashboard:v2.3.1 dashboard.tar
The above will produce the same Docker image package.
- Supports the universal cross distro binary.
- Has support for both ARM64 and x86_64.
- Has official pre-built binaries, unlike Skopeo (see the example below).
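Crane also lets us pull an image for a specific architecture; a hedged example (flag availability depends on the crane version):

```bash
# Pull architecture-specific variants of the same image (example).
./crane pull --platform linux/arm64 kubernetesui/dashboard:v2.3.1 dashboard-arm64.tar
./crane pull --platform linux/amd64 kubernetesui/dashboard:v2.3.1 dashboard-amd64.tar
```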
1.1 - CentOS ARM Analysis
CentOS requirements.txt ARM analysis
Packages
Name | ARM Supported | Info | Required |
---|---|---|---|
apr | + | + | |
apr-util | + | + | |
centos-logos | + | ? | |
createrepo | + | + | |
deltarpm | + | + | |
httpd | + | + | |
httpd-tools | + | + | |
libxml2-python | + | + | |
mailcap | + | + | |
mod_ssl | + | + | |
python-chardet | + | + | |
python-deltarpm | + | + | |
python-kitchen | + | + | |
yum-utils | + | + | |
audit | + | + | |
bash-completion | + | + | |
c-ares | + | --- | |
ca-certificates | + | + | |
cifs-utils | + | + | |
conntrack-tools | + | + | |
containerd.io | + | + | |
container-selinux | + | ? | |
cri-tools-1.13.0 | + | ? | |
curl | + | + | |
dejavu-sans-fonts | + | + | |
docker-ce-19.03.14 | + | + | |
docker-ce-cli-19.03.14 | + | + | |
ebtables | + | + | |
elasticsearch-curator-5.8.3 | --- | elasticsearch-curator-3.5.1 (from separate repo v3) | + |
elasticsearch-oss-7.9.1 | + | + | |
erlang-23.1.4 | + | + | |
ethtool | + | + | |
filebeat-7.9.2 | + | + | |
firewalld | + | + | |
fontconfig | + | + | |
fping | + | + | |
gnutls | + | + | |
grafana-7.3.5 | + | + | |
gssproxy | + | + | |
htop | + | + | |
iftop | + | + | |
ipset | + | + | |
java-1.8.0-openjdk-headless | + | + | |
javapackages-tools | + | + | |
jq | + | + | |
libini_config | + | + | |
libselinux-python | + | + | |
libsemanage-python | + | + | |
libX11 | + | + | |
libxcb | + | + | |
libXcursor | + | + | |
libXt | + | + | |
logrotate | + | + | |
logstash-oss-7.8.1 | + | + | |
net-tools | + | + | |
nfs-utils | + | + | |
nmap-ncat | + | ? | |
opendistro-alerting-1.10.1* | + | + | |
opendistro-index-management-1.10.1* | + | + | |
opendistro-job-scheduler-1.10.1* | + | + | |
opendistro-performance-analyzer-1.10.1* | + | + | |
opendistro-security-1.10.1* | + | + | |
opendistro-sql-1.10.1* | + | + | |
opendistroforelasticsearch-kibana-1.10.1* | --- | opendistroforelasticsearch-kibana-1.13.0 | + |
openssl | + | + | |
perl | + | + | |
perl-Getopt-Long | + | + | |
perl-libs | + | + | |
perl-Pod-Perldoc | + | + | |
perl-Pod-Simple | + | + | |
perl-Pod-Usage | + | + | |
pgaudit12_10 | + | --- | |
pgbouncer-1.10.* | --- | --- | |
pyldb | + | + | |
python-firewall | + | + | |
python-kitchen | + | + | |
python-lxml | + | + | |
python-psycopg2 | + | + | |
python-setuptools | + | ? | |
python-slip-dbus | + | + | |
python-ipaddress | + | ? | |
python-backports | + | ? | |
quota | + | ? | |
rabbitmq-server-3.8.9 | + | + | |
rh-haproxy18 | --- | --- | |
rh-haproxy18-haproxy-syspaths | --- | --- | |
postgresql10-server | + | + | |
repmgr10-4.0.6 | --- | --- | |
samba-client | + | + | |
samba-client-libs | + | + | |
samba-common | + | + | |
samba-libs | + | + | |
sysstat | + | + | |
tar | + | + | |
telnet | + | + | |
tmux | + | + | |
urw-base35-fonts | + | + | |
unzip | + | + | |
vim-common | + | + | |
vim-enhanced | + | + | |
wget | + | + | |
xorg-x11-font-utils | + | + | |
xorg-x11-server-utils | + | + | |
yum-plugin-versionlock | + | + | |
yum-utils | + | + | |
rsync | + | + | |
kubeadm-1.18.6 | + | + | |
kubectl-1.18.6 | + | + | |
kubelet-1.18.6 | + | + | |
kubernetes-cni-0.8.6-0 | + | + | |
Files
Images
Name | ARM Supported | Info | Required |
---|---|---|---|
haproxy:2.2.2-alpine | + | arm64v8/haproxy | + |
kubernetesui/dashboard:v2.3.1 | + | + | |
kubernetesui/metrics-scraper:v1.0.7 | + | + | |
registry:2 | + | ||
hashicorp/vault-k8s:0.7.0 | --- | https://hub.docker.com/r/moikot/vault-k8s / custom build | --- |
vault:1.7.0 | + | --- | |
apacheignite/ignite:2.9.1 | --- | https://github.com/apache/ignite/tree/master/docker/apache-ignite / custom build | --- |
bitnami/pgpool:4.1.1-debian-10-r29 | --- | --- | |
brainsam/pgbouncer:1.12 | --- | --- | |
istio/pilot:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
istio/proxyv2:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
istio/operator:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
jboss/keycloak:4.8.3.Final | --- | + | |
jboss/keycloak:9.0.0 | --- | + | |
rabbitmq:3.8.9 | + | + | |
coredns/coredns:1.5.0 | + | + | |
quay.io/coreos/flannel:v0.11.0 | + | + | |
calico/cni:v3.8.1 | + | + | |
calico/kube-controllers:v3.8.1 | + | + | |
calico/node:v3.8.1 | + | + | |
calico/pod2daemon-flexvol:v3.8.1 | + | + | |
k8s.gcr.io/kube-apiserver:v1.18.6 | + | k8s.gcr.io/kube-apiserver-arm64:v1.18.6 | + |
k8s.gcr.io/kube-controller-manager:v1.18.6 | + | k8s.gcr.io/kube-controller-manager-arm64:v1.18.6 | + |
k8s.gcr.io/kube-scheduler:v1.18.6 | + | k8s.gcr.io/kube-scheduler-arm64:v1.18.6 | + |
k8s.gcr.io/kube-proxy:v1.18.6 | + | k8s.gcr.io/kube-proxy-arm64:v1.18.6 | + |
k8s.gcr.io/coredns:1.6.7 | --- | coredns/coredns:1.6.7 | + |
k8s.gcr.io/etcd:3.4.3-0 | + | k8s.gcr.io/etcd-arm64:3.4.3-0 | + |
k8s.gcr.io/pause:3.2 | + | k8s.gcr.io/pause-arm64:3.2 | + |
Custom builds
Build multi arch image for Keycloak 9:
Clone repo: https://github.com/keycloak/keycloak-containers/
Checkout tag: 9.0.0
Change dir to: keycloak-containers/server
Create new builder: docker buildx create --name mybuilder
Switch to builder: docker buildx use mybuilder
Inspect builder and make sure it supports linux/amd64, linux/arm64: docker buildx inspect --bootstrap
Build and push container: docker buildx build --platform linux/amd64,linux/arm64 -t repo/keycloak:9.0.0 --push .
Additional info:
https://hub.docker.com/r/jboss/keycloak/dockerfile
https://github.com/keycloak/keycloak-containers/
https://docs.docker.com/docker-for-mac/multi-arch/
Components to roles mapping
Component name | Roles |
---|---|
Repository | repository image-registry node-exporter firewall filebeat docker |
Kubernetes | kubernetes-master kubernetes-node applications node-exporter haproxy_runc kubernetes_common |
Kafka | zookeeper jmx-exporter kafka kafka-exporter node-exporter |
ELK (Logging) | logging elasticsearch elasticsearch_curator logstash kibana node-exporter |
Exporters | node-exporter kafka-exporter jmx-exporter haproxy-exporter postgres-exporter |
PostgreSQL | postgresql postgres-exporter node-exporter |
Keycloak | applications |
RabbitMQ | rabbitmq node-exporter |
HAProxy | haproxy haproxy-exporter node-exporter haproxy_runc |
Monitoring | prometheus grafana node-exporter |
Apart from the above table, the following roles need to be checked for each component:
- upgrade
- backup
- download
- firewall
- filebeat
- recovery (n/a kubernetes)
1.2 - RedHat ARM Analysis
RedHat requirements.txt ARM analysis
Packages
Name | ARM Supported | Info | Required |
---|---|---|---|
apr | + | + | |
apr-util | + | + | |
redhat-logos | + | ? | |
createrepo | + | + | |
deltarpm | + | + | |
httpd | + | + | |
httpd-tools | + | + | |
libxml2-python | + | + | |
mailcap | + | + | |
mod_ssl | + | + | |
python-chardet | + | + | |
python-deltarpm | + | + | |
python-kitchen | + | + | |
yum-utils | + | + | |
audit | + | + | |
bash-completion | + | + | |
c-ares | + | --- | |
ca-certificates | + | + | |
cifs-utils | + | + | |
conntrack-tools | + | + | |
containerd.io | + | + | |
container-selinux | + | ? | |
cri-tools-1.13.0 | + | ? | |
curl | + | + | |
dejavu-sans-fonts | + | + | |
docker-ce-19.03.14 | + | + | |
docker-ce-cli-19.03.14 | + | + | |
ebtables | + | + | |
elasticsearch-curator-5.8.3 | --- | elasticsearch-curator-3.5.1 (from separate repo v3) | + |
elasticsearch-oss-7.10.2 | + | + | |
ethtool | + | + | |
filebeat-7.9.2 | + | + | |
firewalld | + | + | |
fontconfig | + | + | |
fping | + | + | |
gnutls | + | + | |
grafana-7.3.5 | + | + | |
gssproxy | + | + | |
htop | + | + | |
iftop | + | + | |
ipset | + | + | |
java-1.8.0-openjdk-headless | + | + | |
javapackages-tools | + | + | |
jq | + | + | |
libini_config | + | + | |
libselinux-python | + | + | |
libsemanage-python | + | + | |
libX11 | + | + | |
libxcb | + | + | |
libXcursor | + | + | |
libXt | + | + | |
logrotate | + | + | |
logstash-oss-7.8.1 | + | + | |
net-tools | + | + | |
nfs-utils | + | + | |
nmap-ncat | + | ? | |
opendistro-alerting-1.13.1* | + | + | |
opendistro-index-management-1.13.1* | + | + | |
opendistro-job-scheduler-1.13.1* | + | + | |
opendistro-performance-analyzer-1.13.1* | + | + | |
opendistro-security-1.13.1* | + | + | |
opendistro-sql-1.13.1* | + | + | |
opendistroforelasticsearch-kibana-1.13.1* | + | + | |
unixODBC | + | + | |
openssl | + | + | |
perl | + | + | |
perl-Getopt-Long | + | + | |
perl-libs | + | + | |
perl-Pod-Perldoc | + | + | |
perl-Pod-Simple | + | + | |
perl-Pod-Usage | + | + | |
pgaudit12_10 | ? | --- | |
pgbouncer-1.10.* | ? | --- | |
policycoreutils-python | + | + | |
pyldb | + | + | |
python-cffi | + | + | |
python-firewall | + | + | |
python-kitchen | + | + | |
python-lxml | + | + | |
python-psycopg2 | + | + | |
python-pycparser | + | + | |
python-setuptools | + | ? | |
python-slip-dbus | + | + | |
python-ipaddress | + | ? | |
python-backports | + | ? | |
quota | + | ? | |
rabbitmq-server-3.8.9 | + | + | |
rh-haproxy18 | --- | --- | |
rh-haproxy18-haproxy-syspaths | --- | --- | |
postgresql10-server | + | + | |
repmgr10-4.0.6 | --- | --- | |
samba-client | + | + | |
samba-client-libs | + | + | |
samba-common | + | + | |
samba-libs | + | + | |
sysstat | + | + | |
tar | + | + | |
telnet | + | + | |
tmux | + | + | |
urw-base35-fonts | ? | Need to be verified, no package found | + |
unzip | + | + | |
vim-common | + | + | |
vim-enhanced | + | + | |
wget | + | + | |
xorg-x11-font-utils | + | + | |
xorg-x11-server-utils | + | + | |
yum-plugin-versionlock | + | + | |
yum-utils | + | + | |
rsync | + | + | |
kubeadm-1.18.6 | + | + | |
kubectl-1.18.6 | + | + | |
kubelet-1.18.6 | + | + | |
kubernetes-cni-0.8.6-0 | + | + | |
Files
Images
Name | ARM Supported | Info | Required |
---|---|---|---|
haproxy:2.2.2-alpine | + | arm64v8/haproxy | + |
kubernetesui/dashboard:v2.3.1 | + | + | |
kubernetesui/metrics-scraper:v1.0.7 | + | + | |
registry:2 | + | ||
hashicorp/vault-k8s:0.7.0 | --- | https://hub.docker.com/r/moikot/vault-k8s / custom build | --- |
vault:1.7.0 | + | --- | |
lambdastack/keycloak:9.0.0 | + | custom build | + |
bitnami/pgpool:4.1.1-debian-10-r29 | --- | --- | |
brainsam/pgbouncer:1.12 | --- | --- | |
istio/pilot:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
istio/proxyv2:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
istio/operator:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
jboss/keycloak:4.8.3.Final | --- | --- | |
jboss/keycloak:9.0.0 | --- | --- | |
rabbitmq:3.8.9 | --- | --- | |
coredns/coredns:1.5.0 | + | + | |
quay.io/coreos/flannel:v0.11.0 | + | + | |
calico/cni:v3.8.1 | + | + | |
calico/kube-controllers:v3.8.1 | + | + | |
calico/node:v3.8.1 | + | + | |
calico/pod2daemon-flexvol:v3.8.1 | + | + | |
k8s.gcr.io/kube-apiserver:v1.18.6 | + | k8s.gcr.io/kube-apiserver-arm64:v1.18.6 | + |
k8s.gcr.io/kube-controller-manager:v1.18.6 | + | k8s.gcr.io/kube-controller-manager-arm64:v1.18.6 | + |
k8s.gcr.io/kube-scheduler:v1.18.6 | + | k8s.gcr.io/kube-scheduler-arm64:v1.18.6 | + |
k8s.gcr.io/kube-proxy:v1.18.6 | + | k8s.gcr.io/kube-proxy-arm64:v1.18.6 | + |
k8s.gcr.io/coredns:1.6.7 | --- | coredns/coredns:1.6.7 | + |
k8s.gcr.io/etcd:3.4.3-0 | + | k8s.gcr.io/etcd-arm64:3.4.3-0 | + |
k8s.gcr.io/pause:3.2 | + | k8s.gcr.io/pause-arm64:3.2 | + |
Custom builds
Build multi arch image for Keycloak 9:
Clone repo: https://github.com/keycloak/keycloak-containers/
Checkout tag: 9.0.0
Change dir to: keycloak-containers/server
Create new builder: docker buildx create --name mybuilder
Switch to builder: docker buildx use mybuilder
Inspect builder and make sure it supports linux/amd64, linux/arm64: docker buildx inspect --bootstrap
Build and push container: docker buildx build --platform linux/amd64,linux/arm64 -t repo/keycloak:9.0.0 --push .
Additional info:
https://hub.docker.com/r/jboss/keycloak/dockerfile
https://github.com/keycloak/keycloak-containers/
https://docs.docker.com/docker-for-mac/multi-arch/
Components to roles mapping
Component name | Roles |
---|---|
Repository | repository image-registry node-exporter firewall filebeat docker |
Kubernetes | kubernetes-master kubernetes-node applications node-exporter haproxy_runc kubernetes_common |
Kafka | zookeeper jmx-exporter kafka kafka-exporter node-exporter |
ELK (Logging) | logging elasticsearch elasticsearch_curator logstash kibana node-exporter |
Exporters | node-exporter kafka-exporter jmx-exporter haproxy-exporter postgres-exporter |
PostgreSQL | postgresql postgres-exporter node-exporter |
Keycloak | applications |
RabbitMQ | rabbitmq node-exporter |
HAProxy | haproxy haproxy-exporter node-exporter haproxy_runc |
Monitoring | prometheus grafana node-exporter |
Apart from the above table, the following roles need to be checked for each component:
- backup
- recovery (n/a kubernetes)
Known issues:
- The PostgreSQL repository needs to be verified: "https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7Server-aarch64/repodata/repomd.xml: [Errno 14] HTTPS Error 404 - Not Found"
- Additional repositories need to be enabled: "rhel-7-for-arm-64-extras-rhui-rpms" and "rhel-7-for-arm-64-rhui-rpms"
- No package found for urw-base35-fonts
- Only RHEL-7.6 and 8.x images are available for AWS
1.3 - Ubuntu ARM Analysis
Ubuntu requirements.txt ARM analysis
Packages
Name | ARM Supported | Info | Required |
---|---|---|---|
adduser | + | + | |
apt-transport-https | + | + | |
auditd | + | + | |
bash-completion | + | + | |
build-essential | + | + | |
ca-certificates | + | + | |
cifs-utils | + | + | |
containerd.io | + | + | |
cri-tools | + | + | |
curl | + | + | |
docker-ce | + | + | |
docker-ce-cli | + | + | |
ebtables | + | + | |
elasticsearch-curator | + | + | |
elasticsearch-oss | + | + | |
erlang-asn1 | + | + | |
erlang-base | + | + | |
erlang-crypto | + | + | |
erlang-eldap | + | + | |
erlang-ftp | + | + | |
erlang-inets | + | + | |
erlang-mnesia | + | + | |
erlang-os-mon | + | + | |
erlang-parsetools | + | + | |
erlang-public-key | + | + | |
erlang-runtime-tools | + | + | |
erlang-snmp | + | + | |
erlang-ssl | + | + | |
erlang-syntax-tools | + | + | |
erlang-tftp | + | + | |
erlang-tools | + | + | |
erlang-xmerl | + | + | |
ethtool | + | + | |
filebeat | + | + | |
firewalld | + | + | |
fping | + | + | |
gnupg2 | + | + | |
grafana | + | + | |
haproxy | + | + | |
htop | + | + | |
iftop | + | + | |
jq | + | + | |
libfontconfig1 | + | + | |
logrotate | + | + | |
logstash-oss | + | + | |
netcat | + | + | |
net-tools | + | + | |
nfs-common | + | + | |
opendistro-alerting | + | + | |
opendistro-index-management | + | + | |
opendistro-job-scheduler | + | + | |
opendistro-performance-analyzer | + | + | |
opendistro-security | + | + | |
opendistro-sql | + | + | |
opendistroforelasticsearch-kibana | + | + | |
openjdk-8-jre-headless | + | + | |
openssl | + | + | |
postgresql-10 | + | + | |
python-pip | + | + | |
python-psycopg2 | + | + | |
python-selinux | + | + | |
python-setuptools | + | + | |
rabbitmq-server | + | + | |
smbclient | + | + | |
samba-common | + | + | |
smbclient | + | + | |
software-properties-common | + | + | |
sshpass | + | + | |
sysstat | + | + | |
tar | + | + | |
telnet | + | + | |
tmux | + | + | |
unzip | + | + | |
vim | + | + | |
rsync | + | + | |
libcurl4 | + | + | |
libnss3 | + | + | |
libcups2 | + | + | |
libavahi-client3 | + | + | |
libavahi-common3 | + | + | |
libjpeg8 | + | + | |
libfontconfig1 | + | + | |
libxtst6 | + | + | |
fontconfig-config | + | + | |
python-apt | + | + | |
python | + | + | |
python2.7 | + | + | |
python-minimal | + | + | |
python2.7-minimal | + | + | |
gcc | + | + | |
gcc-7 | + | + | |
g++ | + | + | |
g++-7 | + | + | |
dpkg-dev | + | + | |
libc6-dev | + | + | |
cpp | + | + | |
cpp-7 | + | + | |
libgcc-7-dev | + | + | |
binutils | + | + | |
gcc-8-base | + | + | |
libodbc1 | + | + | |
apache2 | + | + | |
apache2-bin | + | + | |
apache2-utils | + | + | |
libjq1 | + | + | |
gnupg | + | + | |
gpg | + | + | |
gpg-agent | + | + | |
smbclient | + | + | |
samba-libs | + | + | |
libsmbclient | + | + | |
postgresql-client-10 | + | + | |
postgresql-10-pgaudit | + | + | |
postgresql-10-repmgr | + | + | |
postgresql-common | + | + | |
pgbouncer | + | + | |
ipset | + | + | |
libipset3 | + | + | |
python3-decorator | + | + | |
python3-selinux | + | + | |
python3-slip | + | + | |
python3-slip-dbus | + | + | |
libpq5 | + | + | |
python3-psycopg2 | + | + | |
python3-jmespath | + | + | |
libpython3.6 | + | + | |
python-cryptography | + | + | |
python-asn1crypto | + | + | |
python-cffi-backend | + | + | |
python-enum34 | + | + | |
python-idna | + | + | |
python-ipaddress | + | + | |
python-six | + | + | |
kubeadm | + | + | |
kubectl | + | + | |
kubelet | + | + | |
kubernetes-cni | + | + | |
Files
Images
Name | ARM Supported | Info | Required |
---|---|---|---|
haproxy:2.2.2-alpine | + | arm64v8/haproxy | + |
kubernetesui/dashboard:v2.3.1 | + | + | |
kubernetesui/metrics-scraper:v1.0.7 | + | + | |
registry:2 | + | ||
hashicorp/vault-k8s:0.7.0 | --- | https://hub.docker.com/r/moikot/vault-k8s / custom build | --- |
vault:1.7.0 | + | --- | |
apacheignite/ignite:2.9.1 | --- | https://github.com/apache/ignite/tree/master/docker/apache-ignite / custom build | --- |
bitnami/pgpool:4.1.1-debian-10-r29 | --- | --- | |
brainsam/pgbouncer:1.12 | --- | --- | |
istio/pilot:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
istio/proxyv2:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
istio/operator:1.8.1 | --- | https://github.com/istio/istio/issues/21094 / custom build | --- |
jboss/keycloak:4.8.3.Final | --- | + | |
jboss/keycloak:9.0.0 | --- | + | |
rabbitmq:3.8.9 | + | + | |
coredns/coredns:1.5.0 | + | + | |
quay.io/coreos/flannel:v0.11.0 | + | + | |
calico/cni:v3.8.1 | + | + | |
calico/kube-controllers:v3.8.1 | + | + | |
calico/node:v3.8.1 | + | + | |
calico/pod2daemon-flexvol:v3.8.1 | + | + | |
k8s.gcr.io/kube-apiserver:v1.18.6 | + | k8s.gcr.io/kube-apiserver-arm64:v1.18.6 | + |
k8s.gcr.io/kube-controller-manager:v1.18.6 | + | k8s.gcr.io/kube-controller-manager-arm64:v1.18.6 | + |
k8s.gcr.io/kube-scheduler:v1.18.6 | + | k8s.gcr.io/kube-scheduler-arm64:v1.18.6 | + |
k8s.gcr.io/kube-proxy:v1.18.6 | + | k8s.gcr.io/kube-proxy-arm64:v1.18.6 | + |
k8s.gcr.io/coredns:1.6.7 | --- | coredns/coredns:1.6.7 | + |
k8s.gcr.io/etcd:3.4.3-0 | + | k8s.gcr.io/etcd-arm64:3.4.3-0 | + |
k8s.gcr.io/pause:3.2 | + | k8s.gcr.io/pause-arm64:3.2 | + |
Custom builds
Build multi arch image for Keycloak 9:
Clone repo: https://github.com/keycloak/keycloak-containers/
Checkout tag: 9.0.0
Change dir to: keycloak-containers/server
Create new builder: docker buildx create --name mybuilder
Switch to builder: docker buildx use mybuilder
Inspect builder and make sure it supports linux/amd64, linux/arm64: docker buildx inspect --bootstrap
Build and push container: docker buildx build --platform linux/amd64,linux/arm64 -t repo/keycloak:9.0.0 --push .
Additional info:
https://hub.docker.com/r/jboss/keycloak/dockerfile
https://github.com/keycloak/keycloak-containers/
https://docs.docker.com/docker-for-mac/multi-arch/
Components to roles mapping
Component name | Roles |
---|---|
Repository | repository image-registry node-exporter firewall filebeat docker |
Kubernetes | kubernetes-master kubernetes-node applications node-exporter haproxy_runc kubernetes_common |
Kafka | zookeeper jmx-exporter kafka kafka-exporter node-exporter |
ELK (Logging) | logging elasticsearch elasticsearch_curator logstash kibana node-exporter |
Exporters | node-exporter kafka-exporter jmx-exporter haproxy-exporter postgres-exporter |
PostgreSQL | postgresql postgres-exporter node-exporter |
Keycloak | applications |
RabbitMQ | rabbitmq node-exporter |
HAProxy | haproxy haproxy-exporter node-exporter haproxy_runc |
Monitoring | prometheus grafana node-exporter |
Apart from the above table, the following roles need to be checked for each component:
- upgrade
- backup
- download
- firewall
- filebeat
- recovery (n/a kubernetes)
2 - Autoscaling
Some of these date back to older versions but efforts are made to keep the most important - sometimes :)
LambdaStack Autoscaling
Affected version: 0.7.x
1. Goals
We want to provide automatic scale up / down feature for cloud-based LambdaStack clusters (currently Azure and AWS).
- Clusters will be resized in reaction to the resource utilisation (CPU and Memory).
- Existing LambdaStack automation will be reused and optimized for the purpose of autoscaling.
- Additional nodes will be added (removed) to (from) running Kubernetes clusters.
- Horizontal Pod Autoscaler will be used to control number of pods for particular deployment.
2. Design proposal
PHASE 1: Adding ability to scale-down the pool of worker nodes.
- The current LambdaStack codebase does not allow scaling down Kubernetes clusters in a proper way.
- This is crucial for autoscaling to work, as we need to properly drain and delete physically-destroyed nodes from Kubernetes.
- Also, this step needs to be performed before the terraform code is executed (which requires a refactor of lambdastack code).
PHASE 2: Moving terraform's state and lambdastack-cluster-config to a shared place in the cloud.
- Currently LambdaStack keeps state files and cluster configs in the `build/xxx/` directories, which makes them hard to share.
- To solve the issue, terraform backends can be used: for Azure and for AWS.
- For simplicity the same "bucket" can be used to store and share lambdastack-cluster-config.
PHASE 3: Building packer images to quickly add new Kubernetes nodes.
- Autoscaling is expected to react reasonably quickly. Providing pre-built images should result in great speed-ups.
- Packer code should be added to the lambdastack codebase somewhere "before" the terraform code executes.
PHASE 4: Realistic provisioning minimization and speedup.
- Currently LambdaStack's automation takes lots of time to provision clusters.
- Limits and tags can be used to filter-out unnecessary plays from ansible execution (for now, narrowing it just to the Kubernetes node provisioning).
PHASE 5: Adding ability to authenticate and run lambdastack from a pod.
- To be able to execute lambdastack from a running LambdaStack cluster, it is required to deploy SSH keys and cloud access configuration (i.e. Service Principal).
- SSH keys can be created and distributed automatically (in Ansible) just for the purpose of autoscaling.
- For now, it seems reasonable to store them in Kubernetes secrets (later the Hashicorp Vault will be used); see the sketch below.
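A minimal sketch of how such a key pair could be stored as a Kubernetes secret (the secret and file names are assumptions):

```bash
# Store an SSH key pair for the autoscaler as a Kubernetes secret (sketch).
kubectl create secret generic autoscaler-ssh-key \
  --from-file=id_rsa=./autoscaler_id_rsa \
  --from-file=id_rsa.pub=./autoscaler_id_rsa.pub
```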
PHASE 6: Introducing python application that will execute lambdastack from a pod (in reaction to performance metrics) to scale the pool of worker nodes.
- Metrics can be obtained from the metrics server (see the example after this list).
- For simplicity, standard CPU / Memory metrics will be used, but later it should be possible to introduce custom metrics taken from Prometheus.
- Best way to package and deploy the application would be to use Helm (v3).
- The docker image for the application can be stored in a public docker registry.
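For reference, the kind of data the scaler would read from the metrics server can be previewed with kubectl (assuming metrics-server is already deployed):

```bash
# Basic CPU/memory utilisation exposed by metrics-server.
kubectl top nodes
kubectl top pods --all-namespaces
```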
PHASE 7: Introducing standard Horizontal Pod Autoscaler to scale pods in LambdaStack clusters.
- To scale Kubernetes pods in LambdaStack clusters the Horizontal Pod Autoscaler will be used.
- This step will be dependent on the user / customer (the user will deploy and configure proper resources inside Kubernetes).
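A minimal example of the standard HPA usage described above (the deployment name and thresholds are placeholders):

```bash
# Scale a user deployment between 2 and 10 replicas at 70% average CPU.
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10
kubectl get hpa my-app
```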
3 - AWS
Some of these date back to older versions but efforts are made to keep the most important - sometimes :)
LambdaStack AWS support design document
Affected version: 0.3.0
Goals
Provide AWS support:
- Infrastructure setup automation
- AWS OS images support (RHEL, Ubuntu)
- Cluster security based on rules
- Virtual machines should be able to belong to different subnets within the LambdaStack cluster. Requirement is to have at least two subnets - one for Load Balancing (internet facing) and one for other components.
- Virtual machines should have data disk (when configured to have such)
- Components (Kafka, Postgresql, Prometheus, ElasticSearch) should be configured to use data disk space
- Cluster should not use any public IP except the Load Balancer
Use cases
Support AWS cloud to not rely only on single provider.
Proposed network design
LambdaStack on AWS will create a `Resource Group` that will contain all cluster components. One of the resources will be an Amazon VPC (Virtual Private Cloud), which is an isolated section of the AWS cloud.
Inside the VPC, many subnets will be provisioned by LambdaStack automation - based on data provided by the user or using defaults. Virtual machines and data disks will be created and placed inside a subnet.
4 - Backup
Some of these date back to older versions but efforts are made to keep the most important - sometimes :)
LambdaStack backup design document
Affected version: 0.4.x
Goals
Provide backup functionality for LambdaStack - cluster created using lambdastack tool.
Backup will cover the following areas:

- Kubernetes cluster
  - 1.1 etcd database
  - 1.2 kubeadm config
  - 1.3 certificates
  - 1.4 persistent volumes
  - 1.5 applications deployed on the cluster
- Kafka
  - 2.1 Kafka topic data
  - 2.2 Kafka index
  - 2.3 Zookeeper settings and data
- Elastic stack
  - 3.1 Elasticsearch data
  - 3.2 Kibana settings
- Monitoring
  - 4.1 Prometheus data
  - 4.2 Prometheus settings (properties, targets)
  - 4.3 Alertmanager settings
  - 4.4 Grafana settings (datasources, dashboards)
- PostgreSQL
  - 5.1 All databases from DB
Use cases
A user/background service/job is able to back up the whole cluster or only selected parts and store the files in a desired location. There are a few possible options for storing backups:
- S3
- Azure file storage
- local file
- NFS
The application/tool will create a metadata file that will be the definition of the backup - information that can be useful for the restore tool. This metadata file will be stored within the backup file.
The backup is packed into a zip/gz/tar.gz file that has a timestamp in the name. If a name collision occurs, `name+'_1'` will be used.
Example use
lsbackup -b /path/to/build/dir -t /target/location/for/backup
Where `-b` is the path to the build folder that contains the Ansible inventory and `-t` is the target path to store the backup.
Backup Component View
User/background service/job executes the `lsbackup` (code name) application. The application takes parameters:

- `-b`: build directory of an existing cluster. Most important is the Ansible inventory existing in this directory, so it can be assumed that this should be the folder of the Ansible inventory file.
- `-t`: target location of the zip/tar.gz file that will contain the backup files and the metadata file.
The tool, when executed, looks for the inventory file in the `-b` location and executes the backup playbooks. All playbooks are optional; in the MVP version it can try to back up all components (if they exist in the inventory). After that, some components can be skipped (by providing an additional flag or parameter to the cli).
Tool also produces metadata file that describes backup with time, backed up components and their versions.
1. Kubernetes cluster backup
There are a few ways of doing backups of an existing Kubernetes cluster. Two approaches are taken for further research.
First: Backup the etcd database and kubeadm config of a single master node. Instructions can be found here. This simple solution backs up etcd, which contains all workload definitions and settings.
Second: Use 3rd party software to create a backup like Heptio Velero - Apache 2.0 license, Velero GitHub
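For the first approach, a minimal sketch of the etcd snapshot step on a kubeadm-based master (the certificate paths, endpoint and target paths are assumptions):

```bash
# Snapshot etcd and copy the kubeadm config and PKI material (sketch).
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
cp -r /etc/kubernetes/pki /backup/pki
kubectl -n kube-system get configmap kubeadm-config -o yaml > /backup/kubeadm-config.yaml
```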
2. Kafka backup
Possible options for backing up Kafka broker data and indexes:
- Mirror using Kafka Mirror Maker. It requires a second Kafka cluster running independently that will replicate all data (including current offsets and consumer groups). It is used mostly for multi-cloud replication.
- Kafka Connect - use Kafka Connect to get all topic and offset data from Kafka and save it to a filesystem (NFS, local, S3, ...) using a Sink connector.
  - 2.1 Confluent Kafka connector - uses the Confluent Kafka Community License Agreement
  - 2.2 Use another open source connector like kafka-connect-s3 (BSD) or kafka-backup (Apache 2.0)
- File system copy: take Kafka broker and ZooKeeper data stored in files and copy it to the backup location. It requires the Kafka broker to be stopped. Solution described in a Digital Ocean post.
3. Elastic stack backup
Use built-in features of Elasticsearch to create backup like:
PUT /_snapshot/my_unverified_backup?verify=false
{
"type": "fs",
"settings": {
"location": "my_unverified_backup_location"
}
}
More information can be found here.
OpenDistro uses similar way of doing backups - it should be compatible. OpenDistro backups link.
4. Monitoring backup
Prometheus from version 2.1 is able to create data snapshot by doing HTTP request:
curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot
Snapshot will be created in <data-dir>/snapshots/SNAPSHOT-NAME-RETURNED-IN-RESPONSE
Files like targets and Prometheus/AlertManager settings should be also copied to backup location.
5. PostgreSQL backup
Relational DB backup mechanisms are the most mature ones. The simplest solution is to use standard PostgreSQL backup functions. A valid option is also to use pg_dump.
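A minimal example of both tools (the database name and target paths are placeholders):

```bash
# Dump a single database and the whole cluster (including roles and tablespaces).
pg_dump -U postgres dbname > /backup/dbname.sql
pg_dumpall -U postgres > /backup/all_databases.sql
```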
6. RabbitMQ settings and user data
RabbitMQ has a standard way of creating backups.
7. HAProxy settings backup
Copy HAProxy configuration files to backup location.
4.1 - Operational
LambdaStack backup design document with details
Affected version: 0.7.x
Goals
This document is an extension of the high-level design doc: LambdaStack backup design document, and describes a more detailed, operational point of view of this case. The document does not include the Kubernetes and Kafka stack.
Components
lsbackup application
Example use:
lambdastack backup -b build_dir -t target_path
Where `-b` is the path to the build folder that contains the Ansible inventory and `-t` is the target path to store the backup.
- `backup` - runs tasks from the Ansible backup role
- `build_dir` - contains the cluster's Ansible inventory
- `target_path` - location to store the backup, see the Storage section below
Consider adding a disclaimer for the user to check whether the backup location has enough space to store the whole backup.
Storage
Location created on master node to keep backup files. This location might be used to mount external storage, like:
- Amazon S3
- Azure blob
- NFS
- Any external disk mounted by administrator
In a cloud configuration, blob or S3 storage might be mounted directly on every machine in the cluster and can be configured by LambdaStack. For an on-prem installation it's up to the administrator to attach an external disk to the backup location on the master node. This location should be shared with other machines in the cluster via NFS.
Backup scripts structure:
Role backup
The main role for backup contains Ansible tasks to run backups on cluster components.
Tasks:
- Elasticsearch & Kibana

  1.1. Create a local location where the snapshot will be stored: /tmp/snapshots

  1.2. Update the elasticsearch.yml file with the backup location:

  ```bash
  path.repo: ["/tmp/backup/elastic"]
  ```

  1.3. Reload the configuration

  1.4. Register the repository:

  curl -X PUT "https://host_ip:9200/_snapshot/my_backup?pretty" -H 'Content-Type: application/json' -d ' { "type": "fs", "settings": { "location": "/tmp/backup/elastic" } } '

  1.5. Take a snapshot:

  curl -X PUT "https://host_ip:9200/_snapshot/my_repository/1" -H 'Content-Type: application/json'

  This command will create a snapshot in the location set in step 1.2.

  1.6. Backup restoration:

  curl -X POST "https://host_ip:9200/_snapshot/my_repository/2/_restore" -H 'Content-Type: application/json'

  Consider the options described in the opendistro documentation.

  1.7. Back up the configuration files:

  /etc/elasticsearch/elasticsearch.yml
  /etc/kibana/kibana.yml
- Monitoring

  2.1.1. Prometheus data

  Prometheus delivers a solution to create a data snapshot. Admin access is required to connect to the application API with admin privileges. By default admin access is disabled and needs to be enabled before snapshot creation. To enable admin access, `--web.enable-admin-api` needs to be set while starting the service (service configuration: /etc/systemd/system/prometheus.service):

  systemctl daemon-reload
  systemctl restart prometheus

  Snapshot creation:

  curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot

  By default the snapshot is saved in the data directory, which is configured in the Prometheus service configuration file as the flag:

  --storage.tsdb.path=/var/lib/prometheus

  This means that the snapshot directory is created under:

  /var/lib/prometheus/snapshots/yyyymmddThhmmssZ-*

  After the snapshot, admin access through the API should be reverted.

  The snapshot restoration process is just pointing the `--storage.tsdb.path` parameter to the snapshot location and restarting Prometheus.

  2.1.2. Prometheus configuration

  Prometheus configurations are located in:

  /etc/prometheus

  2.2. Grafana backup and restore

  Copy files from the Grafana home folder to the desired location and set up correct permissions:

  location: /var/lib/grafana
  content:
  - dashboards
  - grafana.db
  - plugins
  - png (contains rendered png images - not necessary to back up)

  2.3. Alertmanager

  Configuration files are located in:

  /etc/prometheus

  The file `alertmanager.yml` should be copied in step 2.1.2 if it exists.
- PostgreSQL

  3.1. Basically PostgreSQL delivers two main tools for backup creation: pg_dump and pg_dumpall.

  `pg_dump` - creates a dump of a selected database:

  pg_dump dbname > dbname.bak

  `pg_dumpall` - creates a dump of all databases of a cluster into one script. This also dumps global objects that are common to all databases, like: users, groups, tablespaces and properties such as access permissions (pg_dump does not save these objects):

  pg_dumpall > pg_backup.bak

  3.2. Database restore: psql or pg_restore:

  psql < pg_backup.bak
  pg_restore -d dbname db_name.bak

  3.3. Copy configuration files:

  /etc/postgresql/10/main/* - configuration files
  .pgpass - authentication credentials
- RabbitMQ

  4.1. RabbitMQ definitions might be exported using the API (the rabbitmq_management plugin needs to be enabled):

  rabbitmq-plugins enable rabbitmq_management
  curl -v -X GET http://localhost:15672/api/definitions -u guest:guest -H "content-type:application/json" -o json

  Import backed up definitions:

  curl -v -X POST http://localhost:15672/api/definitions -u guest:guest -H "content-type:application/json" --data backup.json

  or add the backup location to the configuration file and restart rabbitmq:

  management.load_definitions = /path/to/backup.json

  4.2. Backing up RabbitMQ messages

  To back up messages, RabbitMQ must be stopped. Copy the content of the RabbitMQ mnesia directory:

  RABBITMQ_MNESIA_BASE
  ubuntu: /var/lib/rabbitmq/mnesia

  Restoration: place these files in a similar location.

  4.3. Backing up configuration:

  Copy the /etc/rabbitmq/rabbitmq.conf file.
- HAProxy

  Copy /etc/haproxy/ to the backup location.

  Copy certificates stored in the /etc/ssl/haproxy/ location.
4.2 - Cloud
LambdaStack cloud backup design document
Affected version: 0.5.x
Goals
Provide backup functionality for LambdaStack - cluster created using lambdastack tool.
Use cases
Creating snapshots of disks for all elements in an environment created in the cloud.
Example use
lsbackup --disks-snapshot -f path_to_data_yaml
Where `-f` is the path to the data yaml file with the configuration of the environment. `--disks-snapshot` informs about the option that will create whole disk snapshots.
Backup Component View
User/background service/job executes the `lsbackup` (code name) application. The application takes parameters:

- `-f`: path to the data yaml file with the configuration of the environment.
- `--disks-snapshot`: option to create whole disk snapshots.
The tool, when executed, takes the resource group from the file provided with the `-f` flag and creates snapshots of all elements in the resource group.
The tool also produces a metadata file that describes the backup with the time and the names of the disks for which snapshots have been created.
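A hedged example of what the per-disk snapshot step could look like with the AWS CLI (the volume ID and description are placeholders):

```bash
# Create a snapshot of a single EBS volume; the tool would iterate over all disks.
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "lsbackup $(date +%Y%m%dT%H%M%SZ)"
```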
5 - Cache Storage
Some of these date back to older versions but efforts are made to keep the most important - sometimes :)
LambdaStack cache storage design document
Affected version: 0.4.x
Goals
Provide in-memory cache storage that will be capable of storing large amounts of data with high performance.
Use cases
LambdaStack should provide cache storage for key-value stores, latest value taken from queue (Kafka).
Architectural decision
Considered options are:
- Apache Ignite
- Redis
Description | Apache Ignite | Redis |
---|---|---|
License | Apache 2.0 | three clause BSD license |
Partition method | Sharding | Sharding |
Replication | Yes | Control Plane-Node - yes, Control Plane - Control Plane - only enterprise version |
Transaction concept | ACID | Optimistic lock |
Data Grid | Yes | N/A |
In-memory DB | Distributed key-value store, in-memory distributed SQL database | key-value store |
Integration with RDBMS | Can integrate with any relational DB that supports JDBC driver (Oracle, PostgreSQL, Microsoft SQL Server, and MySQL) | Possible using 3rd party software |
Integration with Kafka | Using Streamer (Kafka Streamer, MQTT Streamer, ...) possible to insert to cache | Required 3rd party service |
Machine learning | Apache Ignite Machine Learning - tools for building predictive ML models | N/A |
Based on the above, Apache Ignite is not just a scalable in-memory cache/database but a cache and processing platform which can run transactional, analytical and streaming workloads. While Redis is simpler, Apache Ignite offers a lot more features with an Apache 2.0 license.
Choice: Apache Ignite
Design proposal
[MVP] Add an Ansible role to `lambdastack` that installs Apache Ignite and sets up a cluster if there is more than one instance. The Ansible playbook is also responsible for adding more nodes to an existing cluster (scaling).
Possible problems while implementing Ignite clustering:
- Ignite uses multicast for node discovery, which is not supported on AWS. The Ignite distribution comes with `TcpDiscoveryS3IpFinder`, so S3-based discovery can be used.
To consider:
- Deploy Apache Ignite cluster in Kubernetes
6 - CI/CD
Some of these date back to older versions but efforts are made to keep the most important - sometimes :)
Comparison of CI/CD tools
Research of available solutions
After some research I found the tools below. I grouped them by categories in columns:
name | paid | open source | self hosted | cloud hosted |
---|---|---|---|---|
jenkin-x | 0 | 1 | 1 | 0 |
tekton | 0 | 1 | 1 | 0 |
jenkins | 0 | 1 | 1 | 0 |
gitlabCI | 0 | 1 | 1 | 0 |
goCD | 0 | 1 | 1 | 0 |
bazel | 0 | 1 | 1 | 0 |
argoCD | 0 | 1 | 1 | 0 |
spinnaker | 0 | 1 | 1 | 0 |
buildBot | 0 | 1 | 1 | 0 |
Travis | 0 | 0 | 0 | 1 |
buddy | 1 | 0 | 1 | 1 |
circleCI | 1 | 0 | 1 | 1 |
TeamCity | 1 | 0 | 1 | 1 |
CodeShip | 1 | 0 | 0 | 1 |
azureDevOps | 1 | 0 | 0 | 1 |
Bamboo | 1 | 0 | 1 | 0 |
Only open source and free (at least in our usage model) tools go to the first round of recognition.
Closer look at chosen tools
name | paid | open source | self hosted | cloud hosted | comment |
---|---|---|---|---|---|
jenkins-x | 0 | 1 | 1 | 0 | |
tekton | 0 | 1 | 1 | 0 | |
jenkins | 0 | 1 | 1 | 0 | |
gitlabCi | 0 | 1 | 1 | 0 | requires using GitLab |
goCD | 0 | 1 | 1 | 0 | |
argoCD | 0 | 1 | 1 | 0 | CD tool, requires another CI tool |
bazel | 0 | 1 | 1 | 0 | this is a build engine, not a build server |
spinnaker | 0 | 1 | 1 | 0 | mostly used for CD purposes |
buildBot | 0 | 1 | 1 | 0 | looks worse than previous tools |
Travis | 0/1 | 0 | 0 | 1 | In our usage model we will have to pay |
After a closer look I considered these tools:
- goCD
- jenkins-x
- tekton
- jenkins
- argoCD - this is a CD tool so it's not compared in the table below
- spinnaker - wasn't tested because it is a CD tool and we also need a CI tool
Comparison
Run server on kubernetes
gocd: easily installed by helm chart, requires to be accessible from outside the cluster if we want to access the UI. Can also be run on Linux systems
jenkins: can be easily started on any cluster
jenkins-x: hard to set up on a running cluster. I created a new kubernetes cluster with their tool, which generally is ok - but in my vision it would be good to use it on a LambdaStack cluster (eat your own dog food vs drink your own champagne). Many (probably all) services work based on DNS names, so I also had to use a public domain (I used my personal one)
tekton: easily started on LambdaStack cluster.
Access
gocd: OAuth, LDAP or internal database
jenkins: OIDC, LDAP, internal, etc.
jenkins-x: Jenkins X uses Role-Based Access Control (RBAC) policies to control access to its various resources
tekton: For building purposes there is a small service to which webhooks can connect, and then a predefined pipeline is started. For browsing purposes the dashboard has no restrictions - it's open for everybody - this could be restricted by HAProxy or nginx. The only things you can do in the dashboard are re-running a pipeline or removing historical builds. Nothing more can be done.
Pipeline as a Code
gocd: possible and looks ok, pipeline code can be in different repository
jenkins: possible and looks ok
jenkins-x: possible looks ok (Tekton)
tekton: pipelines are CRD so can be only as a code
Build in pods
gocd: Elastic agent concepts. Can create many groups (probably on different clusters - not tested yet) and assign them to proper pipelines
jenkins: plugin for building in kubernetes
jenkins-x: building in pods in cluster jenkins-x is installed. Possible to install many jenkins-x servers (according to documentation per each team in different namespace). Able to run in multi cluster mode
tekton: builds in the cluster easily. Not possible to build on a different server - but I didn't see any sense in that use case. Possible to deploy on another kubernetes service.
Secrets
gocd: Plugins for secrets from: hashicorp vault, kubernetes secrets, file based
jenkins: plugins for many options: hashicorp vault, kubernetes secrets, internal secrets, etc
jenkins-x: Providers for secrets from: hashicorp vault, kubernetes secrets
tekton: Use secrets from kubernetes so everything what is inside kubernetes can be read
Environment variables:
gocd: multiple level of variables: environment, pipeline, stage, job
jenkins: environment variables can be overridden
jenkins-x: Didn't find any information, but expect it will not be worse than in gocd
tekton: You can read env variables from any config map so this is kind of overriding.
Plugins
gocd: not a big number of plugins (but is this really bad?) but many of them are really useful (LDAP, running in pods, vault, k8s secrets, docker registry, push to S3, slack notification, etc)
jenkins: many plugins. But if there are too many of them they start causing serious issues. Each plugin has different quality and each can break the server and has its own security issues, so we have to be very careful with them.
jenkins-x: plugins are called apps. There are a few of them and these apps are helm charts. Jenkins-x uses embedded nexus, chartmuseum and monocular services. I don't know if there is an option to get rid of them.
tekton: tekton itself is a kind of plugin for building. You can create whatever you want in a different pod and get it.
Personal conclusion
gocd:
- This looks like a really good CI/CD central server which can be used by many teams.
- Really mature application. Oldest release on GitHub from Nov 2014. According to the wiki, first release in 2007.
- very intuitive
- Working really good in kubernetes
- Good granularity of permissions.
- Good documentation
- Small amount of help on the Internet (compared to jenkins)
- Small community
GoCD can be easily set up for our organizations. Adding new customers should not be a big deal. Working with it is very intuitive - an old-school concept of CI/CD.
jenkins:
- Production ready
- The most searched CI/CD tool in Google - so almost every case is described somewhere
- Very simple
- Working very good in kubernetes
- After using it for some time pipelines are getting bigger and harder to maintain
- Good granularity of permissions
- XML configuration for many plugins
- Big amount of information on the Internet
- Big community
The most popular CI/CD tool. Small and simple. You can do everything as code or via the GUI - which is not good, because there is a temptation to fix things right now and then probably not put them in the repository. A lot of plugins, and each of them is a single point of failure. Hard to configure some plugins as code - but still possible.
jenkins-x:
- There is a new sheriff in town - a new way of maintaining a CI/CD server
- New application still under heavy development (I don't know what exactly, but the number of commits is really big)
- New concept of CI/CD, a lot of magic happening under the hood, GitOps and ChatOps
- Designed to work inside of kubernetes
- Still don't know how to manage permissions
- Big community (CDFoundation is under Linux Foundation)
Jenkins-x is definitely the new sheriff in town. But enabling it in a big existing organization with a new way of the CI/CD process requires changing the way of thinking about the whole process. So it's a really hot topic, but is it ok for us to pay that price?
tekton:
- New concept of CI - serverless.
- Tekton is young (first release 20 Feb 2019).
- It is a part of jenkins-x, so it's simpler when you start playing with it, and you can still configure everything yourself as in jenkins-x.
- Easy to install in LambdaStack cluster - kubernetes CRD
- Easy to install triggers which allow building when a request is coming.
- There should be a separate namespace for every team. Builds will be running in one cluster using the same hosts.
- No permissions for the dashboard. It has to be resolved by properly configuring HAProxy or nginx in front of the dashboard. The dashboard is running as a kubernetes service.
- Big community.
- Small but good enough help regarding tekton itself. Under the hood it's kubernetes, so you can configure it as you want.
Compared to the previous solutions: jenkins-x is using tekton, so tekton has fewer features than jenkins-x - and thanks to that it is simpler - but by default I was not able to configure the really useful feature of building on push. There is such a possibility by running tekton triggers, which is really simple. This project is under the CDFoundation and has a big community, which is really good. My personal choice.
Another concept: separate CI and CD tools
Use separate tools for Continuous Integration and Continuous Deployment. In this concept I recognized Tekton for building and ArgoCD for delivery purposes.
ArgoCD
In ArgoCD you can easily deploy one of your applications, described as kubernetes resources, into one of your kubernetes clusters. In that case the recommended option is to have two repos: one for code and one for configuration. Thanks to that you can easily separate code from configuration. It also works with a single repo where you keep code and configuration together.
When Argo detects changes in the configuration, it applies the new configuration to the cluster. It's as simple as that.
User management
Possible to use: local users, SSO with Bundled Dex OIDC provider, SSO with Existing OIDC provider
Secrets
- Bitnami Sealed Secrets
- Godaddy Kubernetes External Secrets
- Hashicorp Vault
- Banzai Cloud Bank-Vaults
- Helm Secrets
- Kustomize secret generator plugins
- aws-secret-operator
- KSOPS
Conclusion
ArgoCD looks very good if you are managing a really big number of clusters. Thanks to that you can deploy whatever you want wherever you need. But this is only needed at really big scale.
7 - Command Line
This directory contains design documents related to cli functionality itself.
7.1 - CLI
LambdaStack CLI design document
Affected version: 0.2.1
Goals
Provide a simple to use CLI program that will:
- provide input validation (cmd arguments and data file)
- maintain LambdaStack cluster state (json file, binary, tbd)
- allow to create empty project (via command-line and data file)
- maintain information about LambdaStack version used on each machine (unique identifier generation?)
- allow to add/remove resources via data file.
- separate infrastructure data files from configuration
- internal file with default values will be created
- allow to add resources via command-line (networks, vpn, servers, roles, etc.)
- allow all messages from cli to be convertible to json/yaml (like -o yaml, -o json)
- plugable storage/vault for LambdaStack state and Terraform state
Use cases
CLI deployments/management usage
Create empty cluster:
> LambdaStack create cluster --name='lambdastack-first-cluster'
Add resources to cluster:
> LambdaStack add machine --create --azure --size='Standard_DS2_v2' --name='master-vm-hostname'
> LambdaStack add master -vm 'master-vm-hostname'
> ...
Read information about cluster:
> LambdaStack get cluster-info --name='lambdastack-first-cluster'
CLI arguments should override default values which will be provided almost for every aspect of the cluster.
Data driven deployments/management usage - Configuration and Infrastructure definition separation
While CLI usage will be good for ad-hoc operations, production environments should be created using data files.
Data required for creating infrastructure (like network, vm, disk creation) should be separated from configuration (Kubernetes, Kafka, etc.).
Each data file should include following header:
kind: configuration/component-name # configuration/kubernetes, configuration/kafka, configuration/monitoring, ...
version: X.Y.Z
title: my-component-configuration
specification:
# ...
Many configuration files will be handled using the `---` document separator. Like:
kind: configuration/kubernetes
# ...
---
kind: configuration/kafka
# ...
Creating infrastructure will be similar but it will use another file kinds. It should look like:
kind: infrastructure/server
version: X.Y.Z
title: my-server-infra-specification
specification:
# ...
One format to rule them all
Just as many configurations can be enclosed in one file with the `---` separator, configuration and infrastructure yamls should also be treated in that way.
Example:
kind: configuration/kubernetes
# ...
---
kind: configuration/kafka
# ...
---
kind: infrastructure/server
#...
Proposed design - Big Picture
Input
LambdaStack engine console application will be able to handle configuration files and/or commands.
Commands and data files will be merged with default values into a model that from now on will be used for configuration. If a data file (or command argument) contains some values, those values should override the defaults.
Infrastructure
Data file based on which the infrastructure will be created. Here user can define VMs, networks, disks, etc. or just specify a few required values and defaults will be used for the rest. Some of the values - like machine IPs (and probably some more) will have to be determined at runtime.
Configuration
Data file for cluster components (e.g. Kubernetes/Kafka/Prometheus configuration). Some of the values will have to be retrieved from the Infrastructure config.
State
The state will be a result of platform creation (aka build). It should be stored in configured location (storage, vault, directory). State will contain all documents that took part in platform creation.
7.2 - CLI UX
LambdaStack CLI UX
Affected version: unknown
Goals
The aim of this document is to improve the user experience with the lambdastack tool, with a strong emphasis on lowering the entry level for new users. It provides ideas for the following scenarios:
- lambdastack installation
- environment initialization and deployment
- environment component update
- cli tool update
- add component to existing environment
Assumptions
Following scenarios assume:
- there is a component version introduced - the lambdastack version is separated from the component version. It means that e.g. lambdastack v0.0.1 can provide the component PostgreSQL 10.x and/or PostgreSQL 11.x.
- there is a server-side component - the LambdaStack environment is always equipped with a server-side daemon component exposing some API to lambdastack.
Convention
I used square brackets with dots inside: `[...]` to indicate processing or some output that is not important for this document.
Story
lambdastack installation
To increase user base we need to provide brew formulae to allow simple installation.
> brew install lambdastack
environment initialization and deployment
init
As before, the user should be able to start interaction with lambdastack with the `lambdastack init` command. In case of no parameters an interactive version would be opened.
> lambdastack init
What cloud provider do you want to use? (Azure, AWS): AWS
Is that a production environment? No
Do you want Single Node Kubernetes?: No
How many Kubernetes Control Planes do you want?: 1
How many Kubernetes Nodes do you want?: 2
Do you want PostgreSQL relational database?: Yes
Do you want RabbitMQ message broker?: No
Name your new LambdaStack environment: test1
There is already environment called test1, please provide another name: test2
[...]
Your new environment configuration was generated! Go ahead and type: 'lambdastack status' or 'lambdastack apply'.
It could also be lambdastack init -p aws -t nonprod -c postgresql ....
or lambdastack --no-interactive -p aws
for non-interactive run.
inspect .lambdastack/
Previous command generated files in ~/.lambdastack directory.
> ls -la ~/.lambdastack
config
environments/
> ls -la ~/.lambdastack/environments/
test2/
> ls -la ~/.lambdastack/environments/test2/
test2.yaml
> cat ~/.lambdastack/config
version: v1
kind: Config
preferences: {}
environments:
- environment:
name: test2
localStatus: initialized
remoteStatus: unknown
users:
- name: aws-admin
contexts:
- context:
name: test2-aws-admin
user: aws-admin
environment: test2
current-context: test2-admin
status after init
The output from lambdastack init asked to run lambdastack status.
> lambdastack status
Client Version: 0.5.3
Environment version: unknown
Environment: test2
User: aws-admin
Local status: initialized
Remote status: unknown
Cloud:
Provider: AWS
Region: eu-central-1
Authorization:
Type: unknown
State: unknown
Components:
Kubernetes:
Local status: initialized
Remote status: unknown
Nodes: ? (3)
Version: 1.17.1
PostgreSQL:
Local status: initialized
Remote status: unknown
Nodes: ? (1)
Version: 11.2
---
You are not connected to your environment. Please type 'lambdastack init cloud' to provide authorization information!
As the output says, for now this command only uses local files in the ~/.lambdastack directory.
init cloud
Follow instructions to provide cloud provider authentication.
> lambdastack init cloud
Provide AWS API Key: HD876KDKJH9KJDHSK26KJDH
Provide AWS API Secret: ***********************************
[...]
Credentials are correct! Type 'lambdastack status' to check environment.
Or in non-interactive mode something like: lambdastack init cloud -k HD876KDKJH9KJDHSK26KJDH -s dhakjhsdaiu29du2h9uhd2992hd9hu.
status after init cloud
Follow instructions.
> lambdastack status
Client Version: 0.5.3
Environment version: unknown
Environment: test2
User: aws-admin
Local status: initialized
Remote status: unknown
Cloud:
Provider: AWS
Region: eu-central-1
Authorization:
Type: key-secret
State: OK
Components:
Kubernetes:
Local status: initialized
Remote status: unknown
Nodes: ? (3)
Version: 1.17.1
PostgreSQL:
Local status: initialized
Remote status: unknown
Nodes: ? (1)
Version: 11.2
---
Remote status is unknown! Please type 'lambdastack status update' to synchronize status with remote.
status update
As lambdastack was able to connect to the cloud but doesn't know the remote state, it asked to update the state.
> lambdastack status update
[...]
Remote status updated!
> lambdastack status
Client Version: 0.5.3
Environment version: unknown
Environment: test2
User: aws-admin
Local status: initialized
Remote status: uninitialized
Cloud:
Provider: AWS
Region: eu-central-1
Authorization:
Type: key-secret
State: OK
Components:
Kubernetes:
Local status: initialized
Remote status: uninitialized
Nodes: 0 (3)
Version: 1.17.1
PostgreSQL:
Local status: initialized
Remote status: uninitialized
Nodes: 0 (1)
Version: 11.2
---
Your cluster is uninitialized. Please type 'lambdastack apply' to start cluster setup.
Please type 'lambdastack status update' to synchronize status with remote.
It connected to cloud provider and checked that there is no cluster.
apply
> lambdastack apply
[...]
---
Environment 'test2' was initialized successfully! Please type 'lambdastack status' to see status or 'lambdastack components' to list components. To log in to the kubernetes cluster as root please type 'lambdastack components kubernetes login'.
Command 'lambdastack status' will synchronize every time now, so no need to run 'lambdastack status update'
lambdastack now knows that there is a cluster and it will connect for status every time the user types lambdastack status, unless some additional preferences are used.
status after apply
Now it connects to the cluster to check status. That relates to the assumption from the beginning of this document that there is some server-side component providing status. Otherwise lambdastack status would have to call multiple services for status.
> lambdastack status
[...]
Client Version: 0.5.3
Environment version: 0.5.3
Environment: test2
User: aws-admin
Status: OK
Cloud:
Provider: AWS
Region: eu-central-1
Authorization:
Type: key-secret
State: OK
Components:
Kubernetes:
Status: OK
Nodes: 3 (3)
Version: 1.17.1
PostgreSQL:
Status: OK
Nodes: 1 (1)
Version: 11.2
---
Your cluster is fully operational! Please type 'lambdastack components' to list components. To log in to the kubernetes cluster as root please type 'lambdastack components kubernetes login'.
kubernetes login
> lambdastack components kubernetes login
[...]
You can now operate your kubernetes cluster via 'kubectl' command!
Content is added to the ~/.kube/config file. How exactly to do it is still to be agreed (see the sketch below the kubectl example).
> kubectl get nodes
[...]
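One possible approach - shown here only as an illustrative sketch, since the mechanism is still to be agreed, and the generated kubeconfig path below is hypothetical - is to merge the environment's kubeconfig into the user's existing file with kubectl itself:
> export KUBECONFIG=~/.kube/config:~/.lambdastack/environments/test2/kubeconfig
> kubectl config view --flatten > /tmp/merged-config && mv /tmp/merged-config ~/.kube/config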
components
RabbitMQ is here on the list but with "-" because it is not installed.
> lambdastack components
[...]
+kubernetes
+postgresql
- rabbitmq
component status
> lambdastack components kubernetes status
[...]
Status: OK
Nodes: 3 (3)
Version: 1.17.1 (current)
Running containers: 12
Dashboard: http://12.13.14.15:8008/
environment component update
3 months passed and new version of LambdaStack component was released. There is no need to update client and there is no need to update all components at once. Every component is upgradable separately.
component status
The lambdastack status command will notify the user that there is a new component version available.
> lambdastack components kubernetes status
[...]
Status: OK
Nodes: 3 (3)
Version: 1.17.1 (outdated)
Running containers: 73
Dashboard: http://12.13.14.15:8008/
---
Run 'lambdastack components kubernetes update' to update to 1.18.1 version! Use '--dry-run' flag to check update plan.
component update
> lambdastack components kubernetes update
[...]
Kubernetes was successfully updated from version 1.17.1 to 1.18.1!
It means that it updated ONLY one component. The user could probably write something like lambdastack components update or even lambdastack update, but there is no need to go all in if one does not want to.
cli tool update
The user typed brew update and lambdastack was updated to the newest version.
status
> lambdastack status
[...]
Client Version: 0.7.0
Environment version: 0.5.3
Environment: test2
User: aws-admin
Status: OK
Cloud:
Provider: AWS
Region: eu-central-1
Authorization:
Type: key-secret
State: OK
Components:
Kubernetes:
Status: OK
Nodes: 3 (3)
Version: 1.18.1
PostgreSQL:
Status: OK
Nodes: 1 (1)
Version: 11.2
---
Your cluster is fully operational! Please type 'lambdastack components' to list components. To log in to the kubernetes cluster as root please type 'lambdastack components kubernetes login'.
Your client version is newer than the environment version. You might consider updating the environment metadata to the newest version. Read more at https://lambdastack.github.io/environment-version-update.
It means that there is some metadata on cluster with information that it was created and governed with lambdastack version 0.5.3 but new version of lambdastack binary can still communicate with environment.
add component to existing environment
There is already existing environment and we want to add new component to it.
component init
> lambdastack components rabbitmq init
[...]
RabbitMQ config was added to your local configuration. Please type 'lambdastack apply' to apply changes.
Component configuration files were generated in .lambdastack directory. Changes are still not applied.
apply
> lambdastack apply
[...]
---
Environment 'test2' was updated! Please type 'lambdastack status' to see status or 'lambdastack components' to list components. To log in to the kubernetes cluster as root please type 'lambdastack components kubernetes login'.
Command 'lambdastack status' will synchronize every time now, so there is no need to run 'lambdastack status update'.
Daemon
We should also consider scenario with web browser management tool. It might look like:
> lambdastack web
open http://127.0.0.1:8080 to play with environments configuration. Type Ctrl-C to finish ...
[...]
The user would be able to access the tool via a web browser based UI to operate it even more easily.
Context switching
The content of the ~/.lambdastack directory indicates that if the user types lambdastack init -n test3, additional content will be generated and the user will be able to do something like lambdastack context use test3 and lambdastack context use test2.
7.3 -
This directory contains design documents related to cli functionality itself.
- document LambdaStack CLI design document describes the general idea for the lambdastack tool. This document is outdated and does not reflect current functionality.
- document LambdaStack CLI UX describes a scenario with improved user experience.
8 - Harbor Registry
Docker Registry implementation design document
Goals
Provide a Docker container registry as a LambdaStack service: a registry for application container storage, Docker image signing and Docker image security scanning.
Use cases
Store application docker images in private registry. Sign docker images with passphrase to be trusted. Automated security scanning of docker images which are pushed to the registry.
Architectural decision
Comparison of the available solutions
Considered options:
- Harbor https://github.com/goharbor/
- Quay https://github.com/quay/
- Portus https://github.com/SUSE/Portus
Feature comparison table
Feature | Harbor | Quay.io | Portus |
---|---|---|---|
Ability to Determine Version of Binaries in Container | Yes | Yes | Yes |
Audit Logs | Yes | Yes | Yes |
Content Trust and Validation | Yes | Yes | Yes |
Custom TLS Certificates | Yes | Yes | Yes |
Helm Chart Repository Manager | Yes | Partial | Yes |
Open source | Yes | Partial | Yes |
Project Quotas (by image count & storage consumption) | Yes | No | No |
Replication between instances | Yes | Yes | Yes |
Replication between non-instances | Yes | Yes | No |
Robot Accounts for Helm Charts | Yes | No | Yes |
Robot Accounts for Images | Yes | Yes | Yes |
Tag Retention Policy | Yes | Partial | No |
Vulnerability Scanning & Monitoring | Yes | Yes | Yes |
Vulnerability Scanning Plugin Framework | Yes | Yes | No |
Vulnerability Whitelisting | Yes | No | No |
Complexity of the installation process | Easy | Difficult | Difficult |
Complexity of the upgrade process | Medium | Difficult | Difficult |
Source of comparison: https://goharbor.io/docs/1.10/build-customize-contribute/registry-landscape/ and also based on own experience (stack installation and upgrade).
Design proposal
Harbor services architecture
Implementation architecture
Additional components are required for Harbor implementation.
- Shared storage volume between kubernetes nodes (in example NFS),
- Component for TLS/SSL certificate request (maybe cert-manager?),
- Component for TLS/SSL certificate store and manage certificate validation (maybe Vault?),
- Component for TLS/SSL certificate share between server and client (maybe Vault?).
- HELM component for deployment procedure.
Diagram for TLS certificate management:
Kubernetes deployment diagram:
Implementation steps
- Deploy shared storage service (in example NFS) for K8s cluster (M/L)
- Deploy Helm3 package manager and also Helm Charts for offline installation (S/M)
- Deploy Hashicorp Vault for self-signed PKI for Harbor (external task + S for Harbor configuration)
- Deploy "cert request/management" service and integrate with Hashicorp Vault - require research (M/L)
- Deploy Harbor services using Helm3 with self-signed TLS certs (for non-production environments) (L)
- Deploy Harbor services using Helm3 with commercial TLS certs (for prod environments) (M/L) - a rough Helm sketch follows below
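A rough sketch of the Harbor deployment step with Helm3 (the hostname is a placeholder and the chart values are illustrative - they should be verified against the Harbor chart documentation):
helm repo add harbor https://helm.goharbor.io
helm install harbor harbor/harbor \
  --namespace harbor --create-namespace \
  --set expose.type=ingress \
  --set expose.ingress.hosts.core=harbor.example.com \
  --set externalURL=https://harbor.example.com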
9 - Health Monitor
LambdaStack Health Monitor service design proposal
Affected version: 0.6.x/0.7.x
Goals
Provide a service that will monitor components (Kubernetes, Docker, Kafka, EFK, Prometheus, etc.) deployed using LambdaStack.
Use cases
Service will be installed and used on Virtual Machines/Bare Metal on Ubuntu and RedHat (systemd service). Health Monitor will check status of components that were installed on the cluster. Combinations of those components can be different and will be provided to the service through configuration file.
Components that Health Monitor should check:
- Kubernetes (kubelet)*
- Query Kubernetes health endpoint (/healthz)*
- Docker*
- Query Docker stats*
- PostgreSQL
- HAProxy
- Prometheus
- Kafka
- ZooKeeper
- ElasticSearch
- RabbitMQ
* means MVP version.
Health Monitor exposes an endpoint that is compliant with the Prometheus metrics format and serves data about health checks. This endpoint should listen on a configurable port (default 98XX).
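For illustration, the exposed data could look like the sketch below (the metric and label names are hypothetical, since the exact format is not defined yet):
# HELP healthmonitor_component_up Health of a monitored component (1 = healthy, 0 = unhealthy)
# TYPE healthmonitor_component_up gauge
healthmonitor_component_up{component="kubelet"} 1
healthmonitor_component_up{component="docker"} 1
healthmonitor_component_up{component="postgresql"} 0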
Design proposal
TODO
10 - Infrastructure
Cloud resources naming convention
This document describes recommendations on how to name infrastructure resources that are usually created by Terraform. Unifying resource names makes it easy to identify and search for any resource even if no specific tags were provided.
The listed points are based on the development of LambdaStack modules and on best practices provided by Microsoft Azure.
In general a resource name should match the following schema:
<prefix>-<resource_type>-<index>
Prefix
LambdaStack modules are developed in a way that allows the user to specify a prefix for created resources. This approach gives such benefits as ordered sorting and identifying who the owner of the resource is. The prefix can include the following parts, with a dash - as the delimiter.
Type | Required | Description | Examples |
---|---|---|---|
Owner | yes | The name of the person or team which resource belongs to | LambdaStack |
Application or service name | no | Name of the application, workload, or service that the resource is a part of | kafka, ignite, opendistro |
Environment | no | The stage of the development lifecycle for the workload that the resource supports | prod, dev, qa |
VM group | no | The name of VM group that resource is created for | group-0 |
Resource type
Resource type is a short name of resource that is going to be created. Examples:
- rg: resource group
- nsg: network security group
- rt-private: route table for private networking
Index
Index is a serial number of the resource. If a single resource is created, 0 is used as the value.
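For example, combining the prefix parts with the resource type and the index could produce names like these (values are purely illustrative):
lambdastack-kafka-prod-rg-0
lambdastack-kafka-prod-group-0-nsg-0
lambdastack-opendistro-dev-rt-private-0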
11 - Kubernetes/Vault Integration
LambdaStack Kubernetes with Hashicorp Vault integration
Affected version: 0.7.x
1. Introduction
We want to provide integration of Kubernetes with Hashicorp Vault with couple of different modes:
- vault - prod/dev mode without https
- vault - prod/dev mode with https
- vault - cluster with raft storage
We are not providing vault in vault development mode as this doesn't provide data persistence.
If the user would like to, they can use automatic injection of secrets into Kubernetes pods using the sidecar integration provided by the Hashicorp Vault agent. Based on pod annotations, the sidecar will inject secrets as files into the annotated pods.
2. Goal
In LambdaStack you can use Kubernetes secrets stored in etcd. We want to provide integration with Hashicorp Vault to provide additional security for secrets used inside applications running in LambdaStack, and also to provide the possibility of safely using secrets for components that are running outside of the Kubernetes cluster.
3. Design proposals
In all deployment models vault is installed outside the Kubernetes cluster as a separate service. There is a possibility of using Hashicorp Vault deployed on a Kubernetes cluster, but this scenario is not covered in this document.
Integration between Kubernetes and Hashicorp Vault can be achieved via the Hashicorp Vault Agent that is deployed on the Kubernetes cluster using Helm. To provide this, Hashicorp Vault also needs to be configured with proper policies and with the kubernetes authentication method enabled.
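A minimal sketch of that Vault-side configuration (role, policy, service account and file paths are placeholders):
vault auth enable kubernetes
vault write auth/kubernetes/config \
    kubernetes_host="https://<kubernetes-api-address>:6443" \
    kubernetes_ca_cert=@/path/to/ca.crt \
    token_reviewer_jwt=@/path/to/reviewer-token
vault write auth/kubernetes/role/my-app \
    bound_service_account_names=my-app \
    bound_service_account_namespaces=default \
    policies=my-app-policy ttl=1h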
In every mode we want to provide possibility to perform automatic unseal via script, but this solution is better suited for development scenario. In production however to maximize security level unseal should be performed manually.
In all scenarios, swap will be disabled on the machine on which Hashicorp Vault will be running, and Hashicorp Vault will run under a user with limited privileges (e.g. vault). The user under which Hashicorp Vault will be running will have the ability to use the mlock syscall. In the configuration, from the LambdaStack side we want to provide the possibility to turn off dumps at the system level (turned off by default), use auditing (turned on by default), expose the UI (disabled by default) and disable the root token after configuration (by default the root token will be disabled after deployment).
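For illustration, the swap and mlock part of that hardening could boil down to steps like these (a sketch only - the exact tasks would be implemented in the LambdaStack automation):
swapoff -a                                                  # plus removing the swap entry from /etc/fstab
setcap cap_ipc_lock=+ep "$(readlink -f "$(which vault)")"   # allow the vault binary to use mlock without running as root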
We want to provide three scenarios of installing Hashicorp Vault:
- vault - prod/dev mode without https
- vault - prod/dev mode with https
- vault - cluster with raft storage
1. vault - prod/dev mode without https
In this scenario we want to use file storage for secrets. Vault can be set to manual or automatic unseal with a script. In automatic unseal mode the file with unseal keys is stored in a safe location with permission to read only for the vault user. In case of manual unseal, the vault post-deployment configuration script needs to be executed against vault. Vault is installed as a service managed by systemd. Traffic in this scenario is served via http, which makes it possible to perform man-in-the-middle attacks, so this option should only be used in development scenarios.
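A minimal sketch of a Vault server configuration for this scenario (paths and addresses are illustrative):
storage "file" {
  path = "/opt/vault/data"
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1          # http only - acceptable for development scenarios
}

ui = false                 # UI disabled by default, as described above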
2. vault - prod/dev mode with https
This scenario differs from the previous one by the usage of https. In this scenario we should also cover generation of keys using PKI, to provide certificates and mutual trust between the endpoints.
3. vault - cluster with raft storage
In this scenario we want to use raft storage for secrets. Raft storage is used for the cluster setup and doesn't require an additional Consul component, which makes configuration easier and requires less maintenance. It also limits network traffic and increases performance. In this scenario we can also implement auto-unseal provided with the Transit secrets engine from Hashicorp Vault.
In this scenario at least 3 nodes are required, but a 5 node setup is preferable to provide quorum for the raft protocol. This can cover both http and https traffic.
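A sketch of the storage part of such a configuration (node names and addresses are illustrative):
storage "raft" {
  path    = "/opt/vault/raft"
  node_id = "vault-1"

  retry_join {
    leader_api_addr = "https://vault-2:8200"
  }
  retry_join {
    leader_api_addr = "https://vault-3:8200"
  }
}

api_addr     = "https://vault-1:8200"
cluster_addr = "https://vault-1:8201"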
4. Further extensions
We can provide additional components for vault unsealing - like integration with PGP keys to encrypt the unseal keys, and auto-unsealing with the Transit secrets engine from Hashicorp Vault. We can also add integration with Prometheus to share statistics with it.
12 - Kafka Authentication
LambdaStack Kafka authentication design document
Affected version: 0.5.x
Goals
Provide authentication for Kafka clients and brokers using: 1). SSL 2). SASL-SCRAM
Use cases
1). SSL - Kafka will be authorizing clients based on a certificate, where the certificate will be signed by a common CA root certificate and validated against it. 2). SASL-SCRAM - Kafka will be authorizing clients based on credentials, validated using SASL with SCRAM credentials stored in Zookeeper.
Design proposal
Add a field to the LambdaStack configuration/kafka document that will select the authentication method - SSL or SASL with SCRAM. Based on this, the method of authentication will be selected together with the available settings (e.g. the number of iterations for SCRAM).
For the SSL option the CA certificate will be fetched to the machine where LambdaStack has been executed, so the user can sign their client certificates with the CA certificate and use them to connect to Kafka.
For the SASL with SCRAM option, LambdaStack can also create additional SCRAM credentials that will be used for client authentication.
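As an illustration, creating such SCRAM credentials could look like this (user name, password and iteration count are placeholders):
kafka-configs.sh --zookeeper localhost:2181 --alter \
  --add-config 'SCRAM-SHA-512=[iterations=8192,password=client-secret]' \
  --entity-type users --entity-name client1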
13 - Kafka Monitoring Tools
KAFKA MONITORING TOOLS - RESEARCH RESULTS
1. CONFLUENT CONTROL CENTER link
- Commercial feature, only trial version for free
- Out of the box UI
- Managing and monitoring Kafka cluster (including view consumer offset)
- Possibility to set up alerts
- Detailed documentation, lots of tutorials, blog articles and a wide community
- All-in-one solution with additional features through Confluent Platform/Cloud
2. LENSES link
- Commercial feature, only trial version for free
- Out of the box UI
- Deliver monitoring of Kafka data pipelines
- Managing and monitoring Kafka cluster (including view consumer offset)
- Possibility to set up alerts
- Smaller community, fewer articles and tutorials around Lenses compared to the Control Center
3. SEMATEXT link
- Commercial feature, only trial version for free
- ChatOps integrations
- Out of the box UI
- Built-in anomaly detection, threshold, and heartbeat alerts
- Managing and monitoring Kafka cluster (including view consumer offset)
- Possibility to set up alerts
4. DATADOG KAFKA link
- Commercial feature, only trial version for free
- Out of the box Kafka monitoring dashboards
- Monitoring tool (including view consumer offset). Displays key metrics for Kafka brokers, producers, consumers and Apache Zookeeper. Less focused on cluster state
- Possibility to set up alerts
5. CLOUDERA MANAGER link
- Commercial feature, only trial version for free
- Less rich monitoring tool compared to Confluent, Lenses and Datadog but is very convenient for companies that are already customers of Cloudera and need their monitoring mechanisms under the same platform
6. KAFKA TOOL link
- Commercial feature, only trial version for free
- Out of the box UI
- Monitoring tool (including view consumer offset)
- Poor documentation
- In the latest changelogs, only support for Kafka 2.1 is mentioned
- Some of the open source projects look much better than this one
7. KADECK link
- Commercial feature, only trial version for free
- Out of the box UI
- Focused on filtering the messages within the topics and the creation of custom views
- No possibility to set up alerts
- Focuses more on business monitoring than on technical monitoring like Control Center or Lenses
- KaDeck could be used in addition to the other monitoring tools
8. YAHOO CLUSTER MANAGER link
- Opensource project, Apache-2.0 License
- Managing and monitoring Kafka cluster (including view consumer offset)
- Out of the box UI
- No possibility to set up alerts
9. LINKEDIN CRUISE CONTROL link
- Opensource project, BSD 2-Clause "Simplified" License
- Managing and monitoring Kafka cluster (not possible to view consumer offset :warning:)
- Possible to track resource utilization for brokers, topics, and partitions, query cluster state, to view the status of partitions, to monitor server capacity (i.e. CPU, network IO, etc.)
- Anomaly Detection and self-healing and rebalancing
- No possibility to set up alerts
- UI as seperated component link
- It can use the metrics reporter from LinkedIn (necessary to add a jar file to the kafka lib directory) but it is also possible to use Prometheus for metric aggregation
10. LINKEDIN BURROW link
- Opensource project, Apache-2.0 License
- Provides consumer lag checking as a service without the need for specifying thresholds. It monitors committed offsets for all consumers and calculates the status of those consumers on demand
- It does not monitor anything related to the health of the brokers
- Possibility to set up alerts
11. KAFKA DROP 3 link
- Opensource project, Apache-2.0 License, reboot of Kafdrop 2.x
- Monitoring tool (including view consumer offset)
- Out of the box UI
- No possibility to set up alerts
12. KAFKA MONITOR link
- Opensource project, Apache-2.0 License
- Kafka monitor is a framework to implement and execute long-running Kafka system tests in a real cluster
- It plays a role as a passive observer and reports what it observes (broker availability, produce/consume latency, etc) by emitting metrics. In other words, it pretends to be a Kafka user and keeps reporting metrics from the user's PoV
- It is more a load generation and reporting tool
- UI does not exist
- No possibility to set up alerts
13. OTHERS
Things like on the list below are there as well, but usually such smaller projects and have little or no development activity:
14. CONCLUSIONS
Currently in LambdaStack monitoring and getting metrics from Kafka are based on:
In real scenarios, based on some use cases and opinions from internal teams:
- Kafka Exporter is used in order to get consumer offset and lag
- JMX Exporter is used in order to get some standard broker's metrics such as cpu, memory utilization and so on
If it is possible to pay for a commercial license, Confluent, Lenses and Sematext offer richer functionality compared to the other monitoring tools, and they are very similar to each other.
As far as open source projects are considered:
- LinkedIn Cruise Control looks like the winner. Provides not only managing and monitoring Kafka cluster but also some extra features such as rebalancing, anomaly detection or self-healing
- Yahoo Cluster Manager looks like a good competitor, but only for managing and monitoring a Kafka cluster. However, in comparison to Cruise Control, some issues were encountered during installation, it was not able to receive some consumer data, and a few issues related to the problem are already reported in the official repository link. The project does not have a good spirit of open source software at all.
- LinkedIn Burrow looks like a good additional tool for LinkedIn Cruise Control when it comes to a consumer lag checking service, and it can be used instead of the Kafka exporter plugin, which causes some outstanding issues
14 - Kubernetes HA
LambdaStack Kubernetes HA design document
Affected version: 0.6.x
1. Goals
Provide highly-available control-plane version of Kubernetes.
2. Cluster components
2.1 Load balancer
2.1.1 External
Kubernetes HA cluster needs single TCP load-balancer to communicate from nodes to masters and from masters to masters (all internal communication has to go through the load-balancer).
PROS:
- standard solution
CONS:
- it's not enough just to create one instance of such load-balancer, it needs failover logic (like virtual IP), so in the end for fully highly-available setup we need automation for whole new service
- requires additional dedicated virtual machines (at least 2 for HA) even in the case of single-control-plane cluster
- probably requires infrastructure that can handle virtual IP (depending on the solution for failover)
2.1.2 Internal
Following the idea from kubespray's HA-mode we can skip creation of dedicated external load-balancer (2.1.1).
Instead, we can create identical instances of lightweight load-balancer (like HAProxy) on each master and each kubelet node.
PROS:
- no need for creation of dedicated load-balancer clusters with failover logic
- since we could say that internal load-balancer is replicated, it seems to be highly-available by definition
CONS:
- increased network traffic
- longer provisioning times as (in case of any changes in load-balancer's configs) provisioning needs to touch every node in the cluster (master and kubelet node)
- debugging load-balancer issues may become slightly harder
2.2 Etcd cluster
2.2.1 External
PROS:
- in the case of high network / system load external etcd cluster deployed on dedicated premium quality virtual machines will behave more stable
CONS:
- requires automation for creation and distribution of etcd's server and client PKI certificates
- upgrading etcd is difficult and requires well-tested automation that works on multiple nodes at once in perfect coordination - in the case when etcd's quorum fails, it is unable to auto-heal itself and needs to be reconstructed from scratch (where data loss or discrepancy seems to be likely)
2.2.2 Internal
PROS:
- adding / removing etcd nodes is completely automated and behaves as expected (via kubeadm)
- etcd's PKI is automatically re-distributed during joining new masters to control-plane
CONS:
- etcd is deployed in containers alongside other internal components, which may impact its stability when system / network load is high
- since etcd is containerized it may be prone to docker-related issues
3. Legacy single-master solution
After HA logic is implemented, it is probably better to reuse new codebase also for single-master clusters.
In the case of using internal load-balancer (2.1.2) it makes sense to use scaled-down (to single node) HA cluster (with single-backended load-balancer) and drop legacy code.
4. Use cases
The LambdaStack delivers highly-available Kubernetes clusters deploying them across multiple availability zones / regions to increase stability of production environments.
5. Example use
kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: any
name: "k8s1"
build_path: # Dynamically built
specification:
name: k8s1
admin_user:
name: ubuntu
key_path: id_ed25519
path: # Dynamically built
components:
kubernetes_master:
count: 3
machines:
- default-k8s-master1
- default-k8s-master2
- default-k8s-master3
kubernetes_node:
count: 2
machines:
- default-k8s-node1
- default-k8s-node2
logging:
count: 0
monitoring:
count: 0
kafka:
count: 0
postgresql:
count: 0
load_balancer:
count: 0
rabbitmq:
count: 0
---
kind: infrastructure/machine
provider: any
name: default-k8s-master1
specification:
hostname: k1m1
ip: 10.10.1.148
---
kind: infrastructure/machine
provider: any
name: default-k8s-master2
specification:
hostname: k1m2
ip: 10.10.2.129
---
kind: infrastructure/machine
provider: any
name: default-k8s-master3
specification:
hostname: k1m3
ip: 10.10.3.16
---
kind: infrastructure/machine
provider: any
name: default-k8s-node1
specification:
hostname: k1c1
ip: 10.10.1.208
---
kind: infrastructure/machine
provider: any
name: default-k8s-node2
specification:
hostname: k1c2
ip: 10.10.2.168
6. Design proposal
As for the design proposal, the simplest solution is to take the internal load-balancer (2.1.2) and internal etcd (2.2.2) and merge them together, then carefully observe and tune the network traffic coming from the haproxy instances for a big number of worker nodes.
Example HAProxy config:
global
log /dev/log local0
log /dev/log local1 notice
daemon
defaults
log global
retries 3
maxconn 2000
timeout connect 5s
timeout client 120s
timeout server 120s
frontend k8s
mode tcp
bind 0.0.0.0:3446
default_backend k8s
backend k8s
mode tcp
balance roundrobin
option tcp-check
server k1m1 10.10.1.148:6443 check port 6443
server k1m2 10.10.2.129:6443 check port 6443
server k1m3 10.10.3.16:6443 check port 6443
Example ClusterConfiguration:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.6
controlPlaneEndpoint: "localhost:3446"
apiServer:
extraArgs: # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/
audit-log-maxbackup: "10"
audit-log-maxsize: "200"
audit-log-path: "/var/log/apiserver/audit.log"
enable-admission-plugins: "AlwaysPullImages,DenyEscalatingExec,NamespaceLifecycle,ServiceAccount,NodeRestriction"
profiling: "False"
controllerManager:
extraArgs: # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/
profiling: "False"
terminated-pod-gc-threshold: "200"
scheduler:
extraArgs: # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/
profiling: "False"
networking:
dnsDomain: cluster.local
podSubnet: 10.244.0.0/16
serviceSubnet: 10.96.0.0/12
certificatesDir: /etc/kubernetes/pki
To deploy first master run (Kubernetes 1.14):
$ sudo kubeadm init --config /etc/kubernetes/kubeadm-config.yml --experimental-upload-certs
To add one more master run (Kubernetes 1.14):
$ sudo kubeadm join localhost:3446 \
--token 932b4p.n6teb53a6pd1rinq \
--discovery-token-ca-cert-hash sha256:bafb8972fe97c2ef84c6ac3efd86fdfd76207cab9439f2adbc4b53cd9b8860e6 \
--experimental-control-plane --certificate-key f1d2de1e5316233c078198a610c117c65e4e45726150d63e68ff15915ea8574a
To remove one master run (it will properly clean up the config inside Kubernetes - do not use kubectl delete node):
$ sudo kubeadm reset --force
In later versions (Kubernetes 1.17) this feature became stable and the "experimental" word was removed from the command-line parameters.
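For reference, the same commands with the stable flag names look roughly like this (token, hash and certificate key values are placeholders):
$ sudo kubeadm init --config /etc/kubernetes/kubeadm-config.yml --upload-certs
$ sudo kubeadm join localhost:3446 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <certificate-key>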
7. Post-implementation erratum
- It turned out that the init-phase upload-certs does not take into account the etcd encryption feature and does not copy such configuration to newly joined masters.
- Instead, for consistency, master joining has been implemented via automated replication of the Kubernetes PKI in Ansible.
15 - Leader Election Pod
Leader election in Kubernetes
Control plane components such as the controller manager or the scheduler use endpoints to select the leader. The instance which creates the endpoint of this service first adds an annotation to the endpoint with the leader information.
The leaderelection.go package is used for the leader election process, which leverages the above Kubernetes endpoint resource as some sort of LOCK primitive to prevent any follower from creating the same endpoint in the same Namespace.
Leader election for pods
As far as leader election for pods is considered, a few solutions are possible:
- Since Kubernetes introduced the coordination.k8s.io API group in version 1.14 (March 2019), it is possible to create a lease object in the cluster which can hold the lock for a set of pods. It is necessary to implement a simple piece of code in the application using the leaderelection.go package in order to handle the leader election mechanism (see the sketch after this list).
Helpful article:
This is the recommended solution, simple, based on existing API group and lease object and not dependent on any external cloud object.
- Kubernetes already uses Endpoints to represent a replicated set of pods, so it is possible to use the same object for this purpose. It is possible to use the already existing leader election framework from Kubernetes, which implements a simple mechanism. It is necessary to run a leader-election container as a sidecar container for the replica set of application pods. Using the leader-election sidecar container, an endpoint will be created which will be responsible for locking the leader to one pod. Thanks to that, when creating a deployment with 3 pods, only one container with the application will be in the ready state - the one that runs inside the leader pod. For the application container, it is necessary to add a readiness probe to the sidecar container.
Helpful article:
This solution was recommended by Kubernetes in 2016 and looks a little bit outdated; it is complex and requires some work.
- Microsoft and Google came up with a proposal to use cloud native storage with a single object that contains the leader data, but it requires each node to read that file, which can be problematic in some situations.
Helpful articles:
- https://cloud.google.com/blog/topics/developers-practitioners/implementing-leader-election-google-cloud-storage
- https://docs.microsoft.com/en-us/azure/architecture/patterns/leader-election
It is not a recommended solution since the single object is a potential single point of failure.
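A minimal sketch of the first (lease based) option using the client-go leaderelection package; the lease name, namespace, identity source and timings below are illustrative:
package main

import (
	"context"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Each replica uses its own identity, e.g. the pod name injected via the Downward API.
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "my-app-leader", Namespace: "default"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: os.Getenv("POD_NAME")},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// only the elected leader runs this code
			},
			OnStoppedLeading: func() {
				// lost leadership - stop doing leader-only work
			},
		},
	})
}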
16 - Modularization
This directory contains design documents related to modularization of LambdaStack.
- document LambdaStack modular design document describes python level modularization.
- document LambdaStack modularization approaches compares 3 discussed modularization scenarios.
- document Component Provider Template presents example implementation on very early stage.
- document Offline modes in modularised LambdaStack proposes high level definition for offline modes
16.1 -
Basic Infra Modules VS LambdaStack Infra
Basic overview
This represents the current status on: 05-25-2021
:heavy_check_mark: : Available, :x: : Not available, :heavy_exclamation_mark: : Check the notes
Area | Feature | LambdaStack Azure | LambdaStack AWS | Azure BI | AWS BI |
---|---|---|---|---|---|
Network | Virtual network | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
Private subnets | :heavy_exclamation_mark: | :heavy_exclamation_mark: | :heavy_check_mark: | :heavy_check_mark: | |
Public subnets | :heavy_exclamation_mark: | :heavy_exclamation_mark: | :heavy_check_mark: | :heavy_check_mark: | |
Security groups with rules | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | |
Possibility for Bastion host | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | |
Possibility to connect to other infra (EKS, AKS) | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | |
VM | "Groups" with similar configuration | :heavy_check_mark: | :heavy_exclamation_mark: | :heavy_check_mark: | :heavy_check_mark: |
Data disks | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | |
Shared storage (Azure Files, EFS) | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | |
Easy configuration | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: |
Notes
- On LambdaStack AWS/Azure infrastructure we can have a cluster with either private or public subnets, as public IPs can only be applied cluster wide and not on a VM "group" basis.
- On LambdaStack AWS we use Auto Scaling Groups to represent groups of similar VMs. This approach however has lots of issues when it comes to scaling the group/component.
Missing for Modules
- Currently, the Azure BI module does not have a way to implement security groups per subnet with rules configuration. An issue already exists for that here.
- Both BI modules currently only give a default configuration, which makes it hard to create a full component layout for a full cluster.
16.2 -
Context
This design document presents findings on what the important pieces of module communication are in the Dockerized Custom Modules approach described here.
Plan
The idea is to have something running and working that mimics real world modules. I used GNU make to perform this. With GNU make I was able to easily implement "run" logic. I also wanted to package everything into docker images to experience real world container limitations around communication, work directory sharing and other things.
Dependencies problem
First list of modules is presented here:
version: v1
kind: Repository
components:
- name: c1
type: docker
versions:
- version: 0.1.0
latest: true
image: "docker.io/hashicorp/terraform:0.12.28"
workdir: "/terraform"
mounts:
- "/terraform"
commands:
- name: init
description: "initializes terraform in local directory"
command: init
envs:
TF_LOG: WARN
- name: apply
description: "applies terraform in local directory"
command: apply
envs:
TF_LOG: DEBUG
args:
- -auto-approve
... and it didn't have any dependencies section. We know that some kind of dependencies will be required very soon. I created an idea of how to define dependencies between modules in the following mind map:
It shows following things:
- every module has some set of labels. I don't think we need to have any "obligatory" labels. If you create very custom ones, your module will be very hard to find.
- module has a requires section with possible subsections strong and weak. A strong requirement is one that has to be fulfilled for the module to be applied. A weak requirement, on the other hand, is something we can proceed without, but it is in some way connected when present.
It's worth noticing each requires rule. I used the kubernetes matchExpressions approach as the main way of defining dependencies, as one of the main usages here would be "version >= X", and we cannot use a simple label matching mechanism without being forced to update all modules using my module every time I release a new version of that module.
Influences
I started to implement example docker based mocked modules in the tests directory, and I found a 3rd section required: influences. To explain this, let's notice one folded module in the picture above: "BareMetalMonitoring". It is a Prometheus based module so, as it works in pull mode, it needs to know the addresses of the machines it should monitor. Let's imagine the following scenario:
- I have Prometheus already installed, and it knows about IP1, IP2 and IP3 machines to be monitored,
- in the next step I install, let's say, the BareMetalKafka module,
- so now, I want Prometheus to monitor the Kafka machines as well,
- so, I need the BareMetalKafka module to "inform" in some way the BareMetalMonitoring module to monitor the IP4, IP5 and IP6 addresses in addition to what it monitors already.
This example explains "influences" section. Mocked example is following:
labels:
version: 0.0.1
name: Bare Metal Kafka
short: BMK
kind: stream-processor
core-technology: apache-kafka
provides-kafka: 2.5.1
provides-zookeeper: 3.5.8
requires:
strong:
- - key: kind
operator: eq
values: [infrastructure]
- key: provider,
operator: in,
values:
- azure
- aws
weak:
- - key: kind
operator: eq
values:
- logs-storage
- - key: kind
operator: eq
values:
- monitoring
- key: core-technology
operator: eq
values:
- prometheus
influences:
- - key: kind
operator: eq
values:
- monitoring
As presented, there is an influences section notifying that "there is something that I'll do to the selected module (if it's present)". I do not feel the urge to define it more strictly at this point in time, before development. I know that this kind of influences section will be required, but I do not know exactly how it will end up.
Results
During implementation of the mocks I found that:
- an influences section would be required
- the name of the validate-config method (or later just validate) should in fact be plan
- there is no need to implement a get-state method in the module container provider, as the state will be local and shared between modules; in fact some state related operations would probably be implemented on the cli wrapper level
- instead, there is a need for an audit method, which would be extremely important to check that no manual changes were applied to the remote infrastructure
Required methods
As already described there would be 5 main methods required to be implemented by module provider. Those are described in next sections.
Metadata
That is a simple method to display static YAML/JSON (or any kind of structured data) information about the module. In fact, the information from this method should be exactly the same as what is in the repo file section about this module. Example output of the metadata method might be:
labels:
version: 0.0.1
name: Bare Metal Kafka
short: BMK
kind: stream-processor
core-technology: apache-kafka
provides-kafka: 2.5.1
provides-zookeeper: 3.5.8
requires:
strong:
- - key: kind
operator: eq
values: [infrastructure]
- key: provider,
operator: in,
values:
- azure
- aws
weak:
- - key: kind
operator: eq
values:
- logs-storage
- - key: kind
operator: eq
values:
- monitoring
- key: core-technology
operator: eq
values:
- prometheus
influences:
- - key: kind
operator: eq
values:
- monitoring
Init
The init method's main purpose is to jump start usage of a module by generating (in a smart way) a configuration file using information in the state. In the example Makefile which is stored here you can test the following scenario:
make clean
make init-and-apply-azure-infrastructure
- observe what is in the ./shared/state.yml file:
it mocked that it created some infrastructure with VMs having some fake IPs:
azi:
  status: applied
  size: 5
  provide-pubips: true
  nodes:
  - privateIP: 10.0.0.0
    publicIP: 213.1.1.0
    usedBy: unused
  - privateIP: 10.0.0.1
    publicIP: 213.1.1.1
    usedBy: unused
  - privateIP: 10.0.0.2
    publicIP: 213.1.1.2
    usedBy: unused
  - privateIP: 10.0.0.3
    publicIP: 213.1.1.3
    usedBy: unused
  - privateIP: 10.0.0.4
    publicIP: 213.1.1.4
    usedBy: unused
- change IP manually a bit to observe what I mean by "smart way"
azi:
  status: applied
  size: 5
  provide-pubips: true
  nodes:
  - privateIP: 10.0.0.0
    publicIP: 213.1.1.0
    usedBy: unused
  - privateIP: 10.0.0.100   <---- here
    publicIP: 213.1.1.100   <---- and here
    usedBy: unused
  - privateIP: 10.0.0.2
    publicIP: 213.1.1.2
    usedBy: unused
  - privateIP: 10.0.0.3
    publicIP: 213.1.1.3
    usedBy: unused
  - privateIP: 10.0.0.4
    publicIP: 213.1.1.4
    usedBy: unused
make just-init-kafka
- observe what was generated in ./shared/bmk-config.yml
it used what it found in the state file and generated a config to actually work with the given state:
bmk:
  size: 3
  clusterNodes:
  - privateIP: 10.0.0.0
    publicIP: 213.1.1.0
  - privateIP: 10.0.0.100
    publicIP: 213.1.1.100
  - privateIP: 10.0.0.2
    publicIP: 213.1.1.2
make and-then-apply-kafka
- check it got applied to state file:
azi:
  status: applied
  size: 5
  provide-pubips: true
  nodes:
  - privateIP: 10.0.0.0
    publicIP: 213.1.1.0
    usedBy: bmk
  - privateIP: 10.0.0.100
    publicIP: 213.1.1.100
    usedBy: bmk
  - privateIP: 10.0.0.2
    publicIP: 213.1.1.2
    usedBy: bmk
  - privateIP: 10.0.0.3
    publicIP: 213.1.1.3
    usedBy: unused
  - privateIP: 10.0.0.4
    publicIP: 213.1.1.4
    usedBy: unused
bmk:
  status: applied
  size: 3
  clusterNodes:
  - privateIP: 10.0.0.0
    publicIP: 213.1.1.0
    state: created
  - privateIP: 10.0.0.100
    publicIP: 213.1.1.100
    state: created
  - privateIP: 10.0.0.2
    publicIP: 213.1.1.2
    state: created
So the init method is not just about providing a "default" config file, but about actually providing a "meaningful" configuration file. What is significant here is that it's very easily testable whether that method generates the desired state when given different example state files.
Plan
The plan method is a method to:
- validate that config file has correct structure,
- get the state file, extract the module related piece and compare it to the config to "calculate" whether any changes are required and, if yes, what they are.
This method should always be started before apply by the cli wrapper.
The general reason for this method is that after we "smart initialized" the config, we might have wanted to change some values in some way, and then it has to be validated. Another scenario would be the influence mechanism I described in the Influences section. In that scenario it's easy to imagine that the output of the BMK module would produce proposed changes to the BareMetalMonitoring module or even apply them to its config file. It looks obvious that an automatic "apply" operation on the BareMetalMonitoring module is not a desired option. So we want to suggest to the user: "hey, I applied the Kafka module, and usually it influences the configuration of the Monitoring module, so go ahead and do a plan operation on it to check the changes". Or we can even do an automatic "plan" operation and show what those changes are.
Apply
apply is the main "logic" method. Its purpose is to do 2 things:
- apply module logic (i.e.: install software, modify a config, manage service, install infrastructure, etc.),
- update state file.
In fact, you might debate which of those is more important, and I could argue that updating state file is more important.
To perform its operations it uses the config file previously validated in the plan step.
Audit
The audit method's use case is to check whether an existing component is "understandable" by the component provider logic. A standard situation would be an upgrade procedure. We can imagine the following history:
- I installed the BareMetalKafka module in version 0.0.1
- Then I manually customized the configuration on the cluster machines
- Now I want to update BareMetalKafka to version 0.0.2 because it provides something I need
In such a scenario, checking whether the upgrade operation will succeed is critical, and that is the duty of the audit operation. It should check on the cluster machines whether the "known" configuration is still "known" (whatever that means for now) and that the upgrade operation will not destroy anything.
Another use case for the audit method is to reflect manually introduced changes in the configuration (and / or state). If I manually upgraded the minor version of some component (e.g. 1.2.3 to 1.2.4), it's highly possible that it might be easily reflected in the state file without any trouble for the rest of the configuration.
Optional methods
There are also already known methods which would be required by most (or maybe all) modules, but are not core to module communication. Those are purely "internal" module business. The following examples are probably just a subset of the optional methods.
Backup / Restore
Provide backup and restore functionalities to protect data and configuration of installed module.
Update
Perform steps to update module components to newer versions with data migration, software re-configuration, infrastructure remodeling and any other required steps.
Scale
Operations related to scale up and scale down module components.
Check required methods implementation
All accessible methods would be listed in module metadata as proposed here. That means that it's possible to:
- validate if there are all required methods implemented,
- validate if required methods return in expected way,
- check if state file is updated with values expected by other known modules.
All that means that we would be able to automate modules release process, test it separately and validate its compliance with modules requirements.
Future work
We should consider during the development phase if and how to present in the manifest what external fields a module requires for the apply operation. That way we might be able to catch inconsistencies between what one module provides and what another module requires from it.
Another topic to consider is some standardization over modules labeling.
16.3 -
Ansible based module
Purpose
To provide separation of concern on middleware level code we need to have consistent way to produce ansible based modules.
Requirements
There are following requirements for modules:
- Allow two-ways communication with other modules via Statefile
- Allow a reuse of ansible roles between modules
Design
Components
- Docker – infrastructure modules are created as Docker containers so far so this approach should continue
- Ansible – we do have tons of ansible code which could be potentially reused. Ansible is also a de facto industry standard for software provisioning, configuration management, and application deployment.
- Ansible-runner – due to need of automation we should use ansible-runner application which is a wrapper for ansible commands (i.e.: ansible-playbook) and provides good code level integration features (i.e.: passing of variables to playbook, extracting logs, RC and facts cache). It is originally used in AWX.
- E-structures – we started to use e-structures library to simplify interoperability between modules.
- Ansible Roles – we need to introduce more loosely coupled ansible code while extracting it from main LambdaStack code repository. To achieve it we need to utilize ansible roles in “ansible galaxy” way, which means each role should be separately developed, tested and versioned. To coordinate multiple roles between they should be connected in a modules single playbook.
Commands
Current state of understanding of modules is that we should have at least two commands:
- Init – would be responsible for building the configuration file for the module. In design, it would be exactly the same as the "init" command in the infrastructure modules.
- Apply – that command would start the ansible logic in the following order (a rough sketch follows at the end of this section):
- Template inventory file – the command would get the configuration file and, using its values, would generate an ansible inventory file with all the variables required by the playbook.
- Provide ssh key file – the command would copy the key provided in the "shared" directory into the expected location in the container
There is also a possibility to introduce an additional "plan" command with usage of the "--diff" and "--check" flags for ansible-playbook, but:
- It doesn't look like a required step, unlike in terraform-based modules
- It requires additional investigation for each role on how to implement it.
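A hypothetical sketch of what the apply command could execute internally, assuming the ansible-runner private data directory is the module's "resources" directory and the generated inventory is placed inside it:
# run the main module playbook with the inventory templated from the module configuration
ansible-runner run /resources -p entrypoint.yml --inventory /resources/inventory/hosts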
Structure
Module repository should have structure similar to following:
- Directory “cmd” – Golang entrypoint binary files should be located here.
- Directory “resources” – would be root for ansible-runner “main” directory
- Subdirectory “project” – this directory should contain entrypoint.yml file being main module playbook.
- Subdirectory “roles” – this optional directory should contain local (not shared) roles. Having this directory would be considered “bad habit”, but it's possible.
- Files in “root” directory – Makefile, Dockerfile, go.mod, README.md, etc.
16.4 -
LambdaStack modular design document
Affected version: 0.4.x
Goals
Make lambdastack easier to work on with multiple teams and make it easier to maintain/extend by:
- Splitting up the monolithic LambdaStack into separate modules which can run as standalone CLI tools or be linked together through LambdaStack.
- Create an extendable plug and play system for roles which can be assigned to components based on certain tasks: apply, upgrade, backup, restore, test, etc.
Architectural design
The current monolithic lambdastack will be split up into the following modules.
Core
Shared code between other modules and not executable as standalone. Responsible for:
- Config
- Logging
- Helpers
- Data schema handling: Loading, defaults, validating etc.
- Build output handling: Loading, saving, updating etc.
- Ansible runner
Infrastructure
Module for creating/destroying cloud infrastructure on AWS/Azure/Google... + "Analysing" existing infrastructure. Maybe at a later time we want to split up the different cloud providers into plugins as well.
Functionality (rough outline and subjected to change):
- template:
"lambdastack infra template -f outfile.yaml -p aws/azure/google/any (--all)"
"infra template -f outfile.yaml -p aws/azure/google/any (--all)"?
"Infrastructure.template(...)"
Task: Generate a template yaml with lambdastack-cluster definition + possible infra docs when --all is defined
Input: File to output data, provider and possible all flag
Output: outfile.yaml template
- apply:
"lambdastack infra apply -f data.yaml"
"infra apply -f data.yaml"?
"Infrastructure.apply(...)"
Task: Create/Update infrastructure on AWS/Azure/Google...
Input: Yaml with at least lambdastack-cluster + possible infra docs
Output: manifest, ansible inventory and terraform files
- analyse:
"lambdastack infra analyse -f data.yaml"
"infra analyse -f data.yaml"?
"Infrastructure.analyse(...)"
Task: Analysing existing infrastructure
Input: Yaml with at least lambdastack-cluster + possible infra docs
Output: manifest, ansible inventory
- destroy:
"lambdastack infra destroy -b /buildfolder/"
"infra destroy -b /buildfolder/"?
"Infrastructure.destroy(...)"
Task: Destroy all infrastructure on AWS/Azure/Google?
Input: Build folder with manifest and terraform files
Output: Deletes the build folder.
Repository
Module for creating and tearing down a repo + preparing requirements for offline installation.
Functionality (rough outline and subjected to change):
- template:
"lambdastack repo template -f outfile.yaml (--all)"
"repo template -f outfile.yaml (--all)"?
"Repository.template(...)"
Task: Generate a template yaml for a repository
Input: File to output data, provider and possible all flag
Output: outfile.yaml template
- prepare:
"lambdastack repo prepare -os (ubuntu-1904/redhat-7/centos-7)"
"repo prepare -o /outputdirectory/"?
"Repo.prepare(...)"
Task: Create the scripts for downloading requirements for a repo for offline installation for a certain OS.
Input: OS which we want to output the scripts for: (ubuntu-1904/redhat-7/centos-7)
Output: Outputs the scripts
- create:
"lambdastack repo create -b /buildfolder/ (--offline /foldertodownloadedrequirements)"
"repo create -b /buildfolder/"?
"Repo.create(...)"
Task: Create the repository on a machine (either by running the requirement script or copying already prepared requirements) and set up the other VMs/machines to point to said repo machine. (Online or offline depending on the --offline flag)
Input: Build folder with manifest and ansible inventory and possible offline requirements folder for on-prem installation.
Output: repository manifest or something only with the location of the repo?
- teardown:
"lambdastack repo teardown -b /buildfolder/"
"repo teardown -b /buildfolder/"?
"Repo.teardown(...)"
Task: Disable the repository and reset the other VMs/machines to their previous state.
Input: Build folder with manifest and ansible inventory
Output: -
Components
Module for applying a command on a component which can contain one or multiple roles. It will take the Ansible inventory to determine which roles should be applied to which component. The command each role can implement are (rough outline and subjected to change):
- apply: Command to install roles for components
- backup: Command to backup roles for components
- restore: Command to restore roles for components
- upgrade: Command to upgrade roles for components
- test: Command to test roles for components
The `apply` command should be implemented for every role but the rest are optional. From an implementation perspective each role will be just a separate folder inside the plugins directory inside the `components` module folder, with command folders which will contain the ansible tasks:
components-|
|-plugins-|
|-master-|
| |-apply
| |-backup
| |-restore
| |-upgrade
| |-test
|
|-node-|
| |-apply
| |-backup
| |-restore
| |-upgrade
| |-test
|
|-kafka-|
| |-apply
| |-upgrade
| |-test
Based on the Ansible inventory and the command we can easily select which roles to apply to which components. For the commands we probably also want to introduce some extra flags to only execute commands for certain components.
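To illustrate, a hypothetical wrapper could map inventory groups to plugin folders roughly like this; the use of ansible-inventory and jq, the inventory path and the way tasks would actually be invoked are all assumptions:
# Hypothetical sketch: run the "apply" command for every component group found in the inventory.
COMMAND=apply
for component in $(ansible-inventory -i build/inventory --list | jq -r 'keys[]' | grep -vE '^(all|_meta|ungrouped)$'); do
  tasks_dir="components/plugins/${component}/${COMMAND}"
  if [ -d "${tasks_dir}" ]; then
    echo "Running ${COMMAND} for component: ${component}"
    # a real implementation would include the tasks from ${tasks_dir} in an ansible-playbook run here
  fi
done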
Finally we want to add support for an external plugin directory where teams can specify their own role plugins which are not (yet) available inside LambdaStack itself. This feature can also be used by other teams to more easily start contributing and developing new components.
LambdaStack
Bundles all executable modules (Infrastructure, Repository, Component) and adds functions to chain them together:
Functionality (rough outline and subject to change):
- template:
"lambdastack template -f outfile.yaml -p awz/azure/google/any (--all)" "LambdaStack.template(...)" Task: Generate a template yaml with lambdastack-cluster definition + possible infrastrucure, repo and component configurations Input: File to output data, provider and possible all flag Output: outfile.yaml with templates
- apply:
"lambdastack apply -f input.yaml" "LambdaStack.template(...)" Task: Sets up a cluster from start to finish Input: File to output data, provider and possible all flag Output: Build folder with manifest, ansible inventory, terrafrom files, component setup.
...
16.5 -
Intent
This document tries to compare 3 existing propositions to implement modularization.
Compared models
To introduce modularization in LambdaStack we identified 3 approaches to consider. The following sections briefly describe those 3 approaches.
Dockerized custom modules
This approach would look the following way:
- Each component management code would be packaged into docker containers
- Components would need to provide some known call methods to expose metadata (dependencies, info, state, etc.)
- Each component would be managed by one management container
- Components (thus management containers) can depend on each other in ‘pre-requisite’ manner (not runtime dependency, but order of executions)
- Separate wrapper application to call components execution and process metadata (dependencies, info, state, etc.)
All that means that if we would like to install the following stack:
- On-prem Kubernetes cluster
- Rook Operator with Ceph cluster working on that on-prem cluster
- PostgreSQL database using persistence provided by Ceph cluster,
Then the steps would need to look something like this:
- CLI command to install PostgreSQL
  - It should check pre-requisites and throw an error that it cannot be installed because the persistence layer is missing
- CLI command to search for a persistence layer
  - It would provide some possibilities
- CLI command to install rook
  - It should check pre-requisites and throw an error that it cannot be installed because a Kubernetes cluster is missing
- CLI command to search for a Kubernetes cluster
  - It would provide some possibilities
- CLI command to install on-prem Kubernetes
  - It should perform the whole installation process
- CLI command to install rook
  - It should perform the whole installation process
- CLI command to install PostgreSQL
  - It should perform the whole installation process
Terraform providers
This approach would mean the following:
- We reuse most of terraform providers to provide infrastructure
- We reuse Kubernetes provider to deliver Kubernetes resources
- We provide “operator” applications to wrap ansible parts in terraform-provider consumable API (???)
- Separate wrapper application to instantiate “operator” applications and execute terraform
All that means that if we would like to install the following stack:
- On-prem Kubernetes cluster
- Rook Operator with Ceph cluster working on that on-prem cluster
- PostgreSQL database using persistence provided by Ceph cluster,
Then the steps would need to look something like this:
- Prepare a terraform configuration setting up the infrastructure containing the 3 required elements
- CLI command to execute that configuration
  - It would need to find that there is an on-prem cluster provider which has nothing to connect to yet, and that it needs to instantiate an “operator” container
  - It instantiates the “operator” container and exposes its API
  - It executes the terraform script
  - It terminates the “operator” container
Kubernetes operators
This approach would mean the following:
- To run anything, we need a Kubernetes cluster of any kind (a local Minikube is good as well)
- We provide Kubernetes CR’s to operate components
- We would reuse some existing operators
- We would need to create some operators on our own
- There would need to be a separate mechanism to create “on-prem” Kubernetes clusters (it might be an operator too)
All that means that if we would like to install the following stack:
- On-prem Kubernetes cluster
- Rook Operator with Ceph cluster working on that on-prem cluster
- PostgreSQL database using persistence provided by Ceph cluster,
Then the steps would need to look something like this:
- Start Minikube instance on local node
- Provide CRD of on-prem Kubernetes operator
- Deploy on-prem Kubernetes operator
- Wait until new cluster is deployed
- Connect to it
- Deploy rook operator definition
- Deploy PostgreSQL operator definition
Comparison
Question | Dockerized custom modules (DCM) | Terraform providers (TP) | Kubernetes operators (KO) |
---|---|---|---|
How much work does it require to package lambdastack to first module? | Customize entrypoint of current image to provide metadata information. | Implement API server in current image to expose it to TP. | Implement ansible operator to handle CR’s and (possibly?) run current image as tasks. |
Sizes: | 3XL | Too big to handle. We would need to implement just new modules that way. | 5XL |
How much work does it require to package module CNS? | From kubectl image, provide some parameters, provide CRD’s, provide CR’s | Use (possibly?) terraform-provider-kubernetes. Prepare CRD’s, prepare CR’s. No operator required. | Just deploy Rook CRD’s, operator, CR’s. |
Sizes: | XXL | XL | XL |
How much work does it require to package module AKS/EKS? | From terraform, provide some parameters, provide terraform scripts | Prepare terraform scripts. No operator required. | [there is something called rancher/terraform-controller and it tries to be what we need. It’s alpha] Use (possibly?) rancher terraform-controller operator, provide DO module with terraform scripts. |
Sizes: | XL | L | XXL |
How would dependencies be handled? | Not defined so far. It seems that some kind of “selectors” would be used to check if modules are installed and in state “applied”, or something like this. | Standard terraform dependency tree. It’s worth remembering that terraform dependencies sometimes work very weirdly and if you change one value it has to touch multiple places. We would need to assess how many dependencies there should be. | It seems that embedding all Kubernetes resources into helm charts, and adding dependencies between them, could solve the problem. |
Sizes: | XXL | XL | XXL |
Would it be possible to install CNS module on LambdaStack Kubernetes in version 0.4.4? | yes | yes | yes |
If I want to install CNS, how would dependencies be provided? | By selectors mechanism (that is proposition) | By terraform tree | By helm dependencies |
Let’s assume that in version 0.8.0 of LambdaStack PostgreSQL is migrated to a new component (not managed in the lambdastack config). How would the migration from 0.7.0 to 0.8.0 on existing environments be processed? | The proposition is that for this kind of operation we can create a separate “image” to conduct just that upgrade operation, for example ls-o0-08-upgrader. It would check that the v0.7.x environment had PostgreSQL installed, then it would generate config for the new PostgreSQL module, initialize that module and allow the upgrade of the lambdastack module to v0.8.x | It doesn’t look like there is a way to do it automatically with terraform. You would need to add a new PostgreSQL terraform configuration and import the existing state into it, then remove the PostgreSQL configuration from the old module (while preventing it from deleting resources). If you are an advanced terraform user it still might be tricky. I’m not sure if we are able to handle it for the user. | We would need to implement the whole functionality in an operator. Basically very similar to the DCM scenario, but triggered by a CR change. |
Sizes: | XXL | Unknown | 3XL |
Where would a module store its configuration? | Locally in the ~/.e/ directory. In the future we can implement remote state (like a terraform remote backend) | All terraform options. | As a Kubernetes CR. |
How would status of components be gathered by module? | We would need to implement it. | Standard terraform output and datasource mechanisms. | Status is continuously updated by operator in CR so there it is. |
Sizes: | XL | XS | S |
How would modules pass variables between each other? | CLI wrapper should be aware that one module needs the other module's output and it should call `module1 get-output` and pass that json or part of it to `module2 apply` | Standard terraform. | Probably by Config resources. But not defined. |
Sizes: | XXL | XS | XL |
How would an upstream module notify downstream modules that something changed in its values? | We would need to implement it. | Standard terraform tree update. Overly active changes in the tree should be considered here, as with dependencies. | It’s not clear. If the upstream module can change the downstream Config resource (which seems to be a ridiculous idea) then it’s simple. The other way is that the downstream module periodically checks the upstream Config for changes, but that introduces problems if we use existing operators. |
Sizes: | XXL | XL | XXL |
Sizes summary: | 1 3XL, 5 XXL, 2 XL | 1 Too big, 1 Unknown, 3 XL, 1 L, 2 XS | 1 5XL, 1 3XL, 3 XXL, 2 XL, 1 S |
Conclusions
Strategic POV
DCM and KO are interesting. TP is too strict and not elastic enough.
Tactical POV
DCM has the smallest standard deviation when you look at task sizes. It indicates the smallest risk. TP is on the opposite side of the list, with the biggest estimations and some significant unknowns.
Fast gains
If we were to consider only cloud-provided resources, TP is the fastest way. Since we need to provide multiple different resources and work on-prem, it is not that attractive. The KO approach looks interesting, but it might be hard at the beginning. DCM looks like the simplest to implement with backward compatibility.
Risks
DCM has a significant risk of “custom development”. KO has risks related to the requirement to use the operator-framework and its concepts from the very beginning of lsc work. TP has huge risks related to on-prem operational overhead.
Final thoughts
Risks related to DCM are the smallest and the learning curve looks best. We would also be able to stay backward compatible in a relatively simple way.
DCM looks like the desired approach.
16.6 -
Offline modes in modularised LambdaStack
Context
Due to the ongoing modularization process and the introduction of middleware modules we need to decide how modules would obtain required dependencies for “offline” mode.
This document will describe installation and upgrade modes and will discuss the ways to implement the whole process that were considered during the design process.
Assumptions
Each module has access to the "/shared" directory. The preferred way to use modules is via the "e" command line app.
Installation modes
There are 2 main identified ways (each with 2 variants) to install a LambdaStack cluster.
- Online - it means that one machine in the cluster has access to the public internet. We would call this machine the repository machine, and that scenario would be named "Jump Host". A specific scenario in this group is when all machines have access to the internet. We are not really interested in that scenario because in all scenarios we want all cluster machines to download required elements from the repository machine. We would call this scenario "Full Online".
- Offline - it means that none of the machines in the cluster have access to the public internet. There are also two versions of this scenario. The first version assumes that the installation process is initialized on the operator's machine (i.e. his/her laptop). We would call this scenario "Bastion v1". The second scenario is when the whole installation initialization process is executed directly from the "Downloading Machine". We would call that scenario "Bastion v2".
The following diagrams present a high-level overview of those 4 scenarios:
Key machines
The scenarios described in the previous section show that there are a couple of machine roles identified in the installation process. The following list explains those roles in more detail.
- Repository - key role in the whole lifecycle process. This is the central cluster machine containing all the dependencies, providing the image repository for the cluster, etc.
- Cluster machine - a common cluster member providing computational resources to the middleware being installed on it. This machine has to be able to see the Repository machine.
- Downloading machine - a temporary machine required to download OS packages for the cluster. This is a known process in which we download OS packages on a machine with access to the public internet, and then transfer those packages to the Repository machine on which they are accessible to all the cluster machines.
- Laptop - terminal machine for a human operator to work on. There is no formal requirement for this machine to exist or be part of the process. All operations performed on that machine could be performed on the Repository or Downloading machine.
Downloading
This section describes the identified ways to provide dependencies to the cluster. There are 6 identified ways, described in the following subsections with pros and cons.
Option 1
Docker image for each module has all required binaries embedded in itself during build process.
Pros
- There is no “download requirements” step.
- Each module has all requirements ensured on build stage.
Cons
- Module image is heavy.
- Possible licensing issues.
- Unknown versions of OS packages.
Option 2
There is separate docker image with all required binaries for all modules embedded in itself during build process.
Pros
- There is no “download requirements” step.
- All requirements are stored in one image.
Cons
- Image would be extremely large.
- Possible licensing issues.
- Unknown versions of OS packages.
Option 3
There is separate “dependencies” image for each module containing just dependencies.
Pros
- There is no “download requirements” step.
- Module image itself is still relatively small.
- Requirements are ensured on build stage.
Cons
- “Dependencies” image is heavy.
- Possible licensing issues.
- Unknown versions of OS packages.
Option 4
Each module has “download requirements” step and downloads requirements to some directory.
Pros
- Module is responsible for downloading its requirements on its own.
- Already existing “export/import” CLI feature would be enough.
Cons
- Offline upgrade process might be hard.
- Each module would perform the download process a bit differently.
Option 5
Each module has “download requirements” step and downloads requirements to docker named volume.
Pros
- Module is responsible for downloading its requirements on its own.
- Generic docker volume practices could be used.
Cons
- Offline upgrade process might be hard.
- Each module would perform the download process a bit differently.
Option 6
Each module contains a “requirements” section in its configuration, but there is one single module that downloads the requirements for all modules (a rough sketch follows after the pros and cons below).
Pros
- Module is responsible for creation of BOM and single “downloader” container satisfies needs of all the modules.
- Centralised downloading process.
- Manageable offline installation process.
Cons
- Yet another “module”
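A rough sketch of how option 6 could work, assuming each module image ships a requirements manifest and a dedicated downloader module aggregates them; the image names, file names and CLI below are all assumptions:
# each module dumps its requirements manifest into the shared directory
docker run --rm -v /shared:/shared lambdastack/kafka-module:0.0.1 sh -c 'cat /requirements.yaml >> /shared/requirements.yaml'
docker run --rm -v /shared:/shared lambdastack/postgresql-module:0.0.1 sh -c 'cat /requirements.yaml >> /shared/requirements.yaml'
# the single downloader module reads the aggregated manifest and fetches everything
docker run --rm -v /shared:/shared lambdastack/downloader:0.0.1 download --input /shared/requirements.yaml --output /shared/downloads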
Options discussion
- Options 1, 2 and 3 are probably not viable due to the licenses of components and the possibly big or even huge size of the produced images.
- The main issue with options 1, 2 and 3 is that they would only work for containers and binaries but not OS packages, as these depend on the targeted OS version and installation. This is something we cannot foresee or bundle for.
- Options 4 and 5 will possibly introduce a bit of a mess related to each module managing downloads on its own. Also the upgrade process in offline mode might be problematic due to the burden of providing new versions for each module separately.
- Option 6 sounds like the most flexible one.
Export
It's visible in the offline scenarios that the "export" process is as important as the "download" process. For offline scenarios "export" has to cover the following elements:
- downloaded images
- downloaded binaries
- downloaded OS packages
- defined modules images
- e command line app
- e environment configuration
All those elements have to be packaged into an archive to be transferred to the cluster's Repository machine.
Import
After all elements are packaged and transferred to the Repository machine, they have to be imported into the repository. The current impression is that the repository module would be responsible for the import operation.
Summary
In this document we provide a high-level definition of how to approach offline installation and upgrade. The current understanding is:
- each module provides a list of its requirements
- a separate module collects those and downloads the required elements
- the same separate module exports all artefacts into an archive
- after the archive is transferred, the repository module imports its content
17 - Offline Upgrade
LambdaStack offline upgrade design document
Affected version: 0.4.x
Goals
Provide upgrade functionality for LambdaStack so Kubernetes and other components can be upgraded when working offline.
Use cases
LambdaStack should be upgradeable when there is no internet connection. It requires all packages and dependencies to be downloaded on a machine that has an internet connection and then moved to the air-gapped server.
Example use
lsupgrade -b /path/to/build/dir
Where `-b` is the path to the build folder that contains the Ansible inventory.
Design proposal
The MVP of the upgrade function will contain the Kubernetes upgrade procedure to the latest supported version of Kubernetes. Later it will be extended to all other LambdaStack components.
The `lsupgrade` application or module takes the build path location (the directory path that contains the Ansible inventory file).
The first part of upgrade execution is to download/upload packages to the repository so the new packages exist and are ready for the upgrade process. When the repository module finishes its work, the upgrade Ansible playbooks are executed.
The upgrade application/module shall implement the following functions:
- [MVP] `apply` - it will execute the upgrade
- `--plan` - no changes will be made to the cluster; it will return the list of changes that would be made during upgrade execution
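Hypothetical usage, combining the example command above with the proposed --plan behaviour (exact flag placement is an assumption):
lsupgrade -b /path/to/build/dir --plan   # dry run: only list the changes an upgrade would make
lsupgrade -b /path/to/build/dir          # perform the actual upgrade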
18 - Persistence Storage
Intent
The aim of this document is to initialize the evaluation of possible persistence layers for a Kubernetes cluster (a.k.a. Cloud Native Storage, CNS) in various setups.
Conditions
There is a need to provide a persistence layer for the Kubernetes cluster installed as the LambdaStack containers orchestrator. We need to consider the performance of the persistence layer as well. There is a possibility to utilize external persistence solutions in the future with managed Kubernetes cluster installations, but that is out of the scope of this document.
OKR
This section proposes Objectives and Key Results for CNS.
- O1: Introduce Cloud Native Storage
- O1KR1: Have stable CNS released
- O1KR2: Have CNS performance tests automation
- O1KR3: Have CNS performance tests results
Possible solutions
As of now I can see the following solutions:
- Ceph managed by Rook Operator
- GlusterFS (managed by Heketi or Kadalu, but that would need further assessment)
We should review more solutions presented here.
There are numerous other solutions possible to use over CSI, but they require separate management.
Requirements
- It has to be able to work on-premise
- It has to be able to work offline
- The difference in performance of middleware components needs to be known
- Storage layer should be tightly integrated with Kubernetes
- As much as possible automation is required (zero-management)
Tests
- We need to have performance tests automated
- Tests have to be executed daily
- We should have PostgreSQL database performance tests automated
- We should have kafka performance tests automated
Initial Plan
- Have LambdaStack cluster with PostgreSQL database
- Create a performance test running in a Kubernetes pod using PostgreSQL in the current setup (pgbench can be used; see the sketch below)
- Deploy rook operator and create Ceph cluster
- Create PostgreSQL database running in Kubernetes pod using Ceph PVC
- Run performance test using Kubernetes PostgreSQL instance
- Compare results
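A minimal pgbench run that could serve as the baseline test, assuming a dedicated test database already exists; the database name, scale factor and duration below are illustrative only:
pgbench -i -s 50 pgbench_db            # initialise pgbench tables with scale factor 50
pgbench -c 10 -j 2 -T 300 pgbench_db   # 10 clients, 2 worker threads, 5-minute run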
19 - PostgreSQL
LambdaStack database connection design document
Affected version: 0.5.x
1. Introduction
Deploying PostgreSQL in a high-demand environment requires reliability and scalability. Even if you don't scale your infrastructure and you work on only one database node, at some point you will reach the connection limit. The number of connections to a Postgres database is limited and is defined by the `max_connections` parameter. It's possible to extend this limit, but you shouldn't do that recklessly - it depends on the machine's resources.
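For illustration, the current limit and the number of connections in use can be checked like this (assuming local psql access as the postgres user):
psql -U postgres -c "SHOW max_connections;"
psql -U postgres -c "SELECT count(*) FROM pg_stat_activity;"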
2. Use case
LambdaStack delivers a solution to build a master-slave database node configuration. This means that by default the application connects to the master database. The database replica is updated immediately when the master is modified.
3. Assumptions
- Database replica is read only
- Write data only to Control Plane Node
- Select operations on replica
- There is no Pgpool-II software available for Ubuntu 18.04 - not officially supported
4. Design proposal
4.1. Minimal configuration
The minimal solution to meet the client requirements is to install PgBouncer on the database master node to maintain a connection pool. This will partially solve the problem with exceeded connection limits. All applications need to be reconfigured to connect not directly to the database, but to the PgBouncer service, which will redirect connections to the database master. This solution can be delivered fast and it's quite easy to implement.
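A minimal sketch of what such a PgBouncer configuration could look like, pointing applications at the master node; the addresses, database name, pool sizes and auth settings below are assumptions, not LambdaStack defaults:
cat > /etc/pgbouncer/pgbouncer.ini <<'EOF'
[databases]
appdb = host=10.0.0.10 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
EOF
Applications would then connect to port 6432 on the master node instead of directly to PostgreSQL on 5432.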
4.2. High Availability configuration
The chart above presents a high availability database cluster. PgBouncer and Pgpool are located in separate pods in the Kubernetes cluster. PgBouncer maintains the connection pool and redirects connections to Pgpool, which is responsible for pooling connections between the master and slave nodes. This allows write operations to be redirected to the master database node and read (select) operations to the slave database node(s). Additionally, repmgr takes care of database availability (it must be installed on every database node) and promotes a subsequent slave node to be master when the previous master goes down.
LambdaStack PostgreSQL auditing design document
Affected version: 0.5.x
Goals
Provide functionality to perform auditing of operations performed on PostgreSQL.
Use cases
For SOX and other regulatory compliance, the platform should provide an auditing function for the PostgreSQL database. This should be set via LambdaStack automation in the LambdaStack configuration yaml.
Example use
In the configuration for PostgreSQL we can add additional parameters that configure additional properties of PostgreSQL. A config similar to the one proposed below can be used to configure auditing using pgaudit.
kind: configuration/postgresql
title: PostgreSQL
name: default
specification:
...
extensions:
pgaudit:
enabled: false
shared_preload_libraries:
- pgaudit
config_file_parameters:
pgaudit.log: 'all, -misc'
log_connections: 'on'
log_disconnections: 'on'
log_line_prefix: "'%m [%p] %q%u@%d,host=%h '"
log_statement: 'none'
...
Design proposal
Add additional settings to the PostgreSQL configuration that would install and configure the pgaudit extension. For RHEL we use PostgreSQL installed from the Software Collections repository, which doesn't provide a pgaudit package for PostgreSQL versions older than 12. For this reason, on RHEL pgaudit will be installed from the PostgreSQL repository.
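For reference, after the package is installed and `pgaudit` is listed in `shared_preload_libraries` (which requires a PostgreSQL restart), the extension still has to be created in the audited database; a minimal example, with an illustrative database name:
psql -U postgres -d appdb -c "CREATE EXTENSION IF NOT EXISTS pgaudit;"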
20 - Ceph (Rook)
Replication / configuration
Configuration data is stored in: /var/lib/ceph. Storage data is located on dedicated devices which are connected via OSD pods.
Replication: Like Ceph Clients, Ceph OSD Daemons use the CRUSH algorithm, but the Ceph OSD Daemon uses it to compute where replicas of objects should be stored (and for rebalancing). In a typical write scenario, a client uses the CRUSH algorithm to compute where to store an object, maps the object to a pool and placement group, then looks at the CRUSH map to identify the primary OSD for the placement group. The client writes the object to the identified placement group in the primary OSD. Then, the primary OSD with its own copy of the CRUSH map identifies the secondary and tertiary OSDs for replication purposes, and replicates the object to the appropriate placement groups in the secondary and tertiary OSDs (as many OSDs as additional replicas), and responds to the client once it has confirmed the object was stored successfully.
Prerequisite
Since version 1.4, the lvm package is required to be present on the nodes. This applies to AWS machines (not tested on Ubuntu). Example installation command:
RHEL:
yum install lvm2 -y
Rook ceph design
https://rook.io/docs/rook/v1.4/ceph-storage.html
Cluster setup
A Rook Ceph cluster can be easily deployed using the default example definitions from the GH repo:
git clone --single-branch --branch release-1.4 https://github.com/rook/rook.git
open location:
rook/cluster/examples/kubernetes/ceph
and list examples:
-rw-r--r--. 1 root root 395 Jul 28 13:00 ceph-client.yaml
-rw-r--r--. 1 root root 1061 Jul 28 13:00 cluster-external-management.yaml
-rw-r--r--. 1 root root 886 Jul 28 13:00 cluster-external.yaml
-rw-r--r--. 1 root root 5300 Jul 28 13:00 cluster-on-pvc.yaml
-rw-r--r--. 1 root root 1144 Jul 28 13:00 cluster-test.yaml
-rw-r--r--. 1 root root 10222 Jul 28 14:47 cluster.yaml
-rw-r--r--. 1 root root 2143 Jul 28 13:00 common-external.yaml
-rw-r--r--. 1 root root 44855 Jul 28 13:00 common.yaml
-rw-r--r--. 1 root root 31424 Jul 28 13:00 create-external-cluster-resources.py
-rw-r--r--. 1 root root 2641 Jul 28 13:00 create-external-cluster-resources.sh
drwxr-xr-x. 5 root root 47 Jul 28 13:00 csi
-rw-r--r--. 1 root root 363 Jul 28 13:00 dashboard-external-https.yaml
-rw-r--r--. 1 root root 362 Jul 28 13:00 dashboard-external-http.yaml
-rw-r--r--. 1 root root 839 Jul 28 13:00 dashboard-ingress-https.yaml
-rw-r--r--. 1 root root 365 Jul 28 13:00 dashboard-loadbalancer.yaml
-rw-r--r--. 1 root root 1554 Jul 28 13:00 direct-mount.yaml
-rw-r--r--. 1 root root 3308 Jul 28 13:00 filesystem-ec.yaml
-rw-r--r--. 1 root root 780 Jul 28 13:00 filesystem-test.yaml
-rw-r--r--. 1 root root 4286 Jul 28 13:00 filesystem.yaml
drwxr-xr-x. 2 root root 115 Jul 28 13:00 flex
-rw-r--r--. 1 root root 4530 Jul 28 13:00 import-external-cluster.sh
drwxr-xr-x. 2 root root 183 Jul 28 13:00 monitoring
-rw-r--r--. 1 root root 1409 Jul 28 13:00 nfs.yaml
-rw-r--r--. 1 root root 495 Jul 28 13:00 object-bucket-claim-delete.yaml
-rw-r--r--. 1 root root 495 Jul 28 13:00 object-bucket-claim-retain.yaml
-rw-r--r--. 1 root root 2306 Jul 28 13:00 object-ec.yaml
-rw-r--r--. 1 root root 2313 Jul 28 13:00 object-openshift.yaml
-rw-r--r--. 1 root root 698 Jul 28 13:00 object-test.yaml
-rw-r--r--. 1 root root 488 Jul 28 13:00 object-user.yaml
-rw-r--r--. 1 root root 3573 Jul 28 13:00 object.yaml
-rw-r--r--. 1 root root 19075 Jul 28 13:00 operator-openshift.yaml
-rw-r--r--. 1 root root 18199 Jul 28 13:00 operator.yaml
-rw-r--r--. 1 root root 1080 Jul 28 13:00 pool-ec.yaml
-rw-r--r--. 1 root root 508 Jul 28 13:00 pool-test.yaml
-rw-r--r--. 1 root root 1966 Jul 28 13:00 pool.yaml
-rw-r--r--. 1 root root 410 Jul 28 13:00 rgw-external.yaml
-rw-r--r--. 1 root root 2273 Jul 28 13:00 scc.yaml
-rw-r--r--. 1 root root 682 Jul 28 13:00 storageclass-bucket-delete.yaml
-rw-r--r--. 1 root root 810 Jul 28 13:00 storageclass-bucket-retain-external.yaml
-rw-r--r--. 1 root root 681 Jul 28 13:00 storageclass-bucket-retain.yaml
-rw-r--r--. 1 root root 1251 Jul 28 13:00 toolbox.yaml
-rw-r--r--. 1 root root 6089 Jul 28 13:00 upgrade-from-v1.2-apply.yaml
-rw-r--r--. 1 root root 14957 Jul 28 13:00 upgrade-from-v1.2-crds.yaml
After creating the basic setup (`common.yaml`, `operator.yaml`, `cluster.yaml`), install the toolbox (`toolbox.yaml`) as well for checking the Ceph cluster status.
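For reference, the apply order for those manifests could look like this (the kubectl commands and the toolbox deployment name follow the Rook v1.4 examples, but verify against the cloned repo):
kubectl apply -f common.yaml
kubectl apply -f operator.yaml
kubectl apply -f cluster.yaml
kubectl apply -f toolbox.yaml
# check cluster health from the toolbox pod once it is running
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status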
IMPORTANT:
ensure the osd container is created and running. It requires a storage device to be available on the nodes.
During cluster startup it searches for the devices available and creates osd containers for them.
Kubelet nodes have to use the default flag `enable-controller-attach-detach` set to `true`. Otherwise the PVC will not attach to the pod.
The flag can be found in the following file on every worker node running kubelet:
/var/lib/kubelet/kubeadm-flags.env
After changing it we need to restart kubelet:
systemctl restart kubelet
If the cluster is working we can create storage, which can be one of the following types:
Block: Create block storage to be consumed by a pod
Object: Create an object store that is accessible inside or outside the Kubernetes cluster
Shared Filesystem: Create a filesystem to be shared across multiple pods
E.g. -> `filesystem.yaml` and then -> `storageclass.yaml`
CRD:
There are 2 ways the cluster can be set up:
- Host-based Cluster
- PVC-based Cluster
PVC example:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: rbd-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: rook-ceph-block
Application using PVC example:
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgresql
namespace: default
labels:
k8s-app: postgresql
kubernetes.io/cluster-service: "true"
spec:
replicas: 1
selector:
matchLabels:
k8s-app: postgresql
template:
metadata:
labels:
k8s-app: postgresql
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: postgres
image: postgres:10.1
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: dbdb
- name: POSTGRES_USER
value: test
- name: POSTGRES_PASSWORD
value: test
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
volumeMounts:
- mountPath: "/var/lib/postgresql/data"
name: "image-store"
volumes:
- name: image-store
persistentVolumeClaim:
claimName: rbd-pvc
readOnly: false
When choosing Block Storage, which allows a single pod to mount storage, be aware that if a node hosting your application crashes, all the pods located on the crashed node will go into terminating state and the application will be unavailable, since the terminating pods block access to the ReadWriteOnce volume and a new pod can't be created. You have to manually delete the volume attachment (see the commands below) or use CephFS instead of RBD.
Related discussion: https://stackoverflow.com/questions/61186199/why-does-kubernetes-not-terminating-pods-after-a-node-crash
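For reference, the stale attachment can be listed and removed manually like this (the attachment name comes from the listing; the commands are illustrative):
kubectl get volumeattachment
kubectl delete volumeattachment <attachment-name>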
Internal k8s automated setup and tests
Step by step procedure for setting environment up and testing it (together with backup/restore) is available in the following repo: https://github.com/mkyc/k8s-rook-ceph
Useful links:
Good starting point:
https://rook.io/docs/rook/v1.4/ceph-quickstart.html
Toolbox for debugging:
https://rook.io/docs/rook/v1.4/ceph-toolbox.html
Filesystem storage:
https://rook.io/docs/rook/v1.4/ceph-filesystem.html
Custom Resource Definitions:
https://rook.io/docs/rook/v1.4/ceph-cluster-crd.html
Add/remove osd nodes: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/administration_guide/adding_and_removing_osd_nodes
Useful rook ceph guide: https://www.cloudops.com/2019/05/the-ultimate-rook-and-ceph-survival-guide/