How to deploy a k8s cluster on VMware ESX

k8s, vmware, ansible, metalLB, prometheus, grafana, alertmanager, metrics-server, harbor, clair, notary

If you’re short on resources in your home lab and simply can’t deploy an OpenStack private cloud to play around with, this tutorial will walk you through setting up a highly-available k8s cluster on VMware ESX vms. The principles used to deploy them closely resemble those used to deploy OpenStack instances, so you’ll be able to walk through the very first step of this tutorial quickly and effectively.

The k8s cluster we’ll be deploying will have 3x master and 3x worker nodes and will use metalLB for bare-metal load balancing; prometheus, grafana and alertmanager for monitoring, metrics and alerting; and harbor with clair and notary for the image registry, scanning and signing.

We will be:

  • checking out and customising ansible-deploy-vmware-vm
  • creating vms from templates
  • setting up:
    • the k8s cluster via kubespray
    • metalLB
    • prometheus, grafana and alertmanager
    • metrics-server
    • harbor
    • notary and clair

What we won’t be doing is:

  • setting up VMware ESX (see here)
  • creating a VMware template (see here)

INFO: the VMware templates were made from an Ubuntu 18.04 vm.


STEP 1 — deploy multiple VMware virtual machines from a template

Deploy the vms via Ansible. First and foremost, check out ansible-deploy-vmware-vm and cd to it.

Second, we need to tell ansible how to connect to our VMware ESX cluster. Edit or create the answerfile.yml and fill in the self-explanatory blanks:
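
The exact variable names depend on the version of the playbook you checked out; a minimal sketch, with illustrative hostnames and credentials, looks something like this:

    # answerfile.yml -- connection details for vCenter/ESX (all values are examples)
    vcenter_hostname: "vcenter.example.local"
    vcenter_username: "administrator@vsphere.local"
    vcenter_password: "changeme"
    vcenter_datacenter: "Datacenter"
    vcenter_cluster: "Cluster01"
    vm_template: "ubuntu-18.04-template"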

Define the vms we want to deploy. Sample vms-to-deploy:
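
This file simply lists the vms the playbook should create; the names defined here are reused later in the kubespray inventory. An illustrative layout in Ansible inventory form (the exact format depends on the playbook version), assuming three masters, three workers and a harbor vm:

    [k8s-masters]
    k8s-master-01
    k8s-master-02
    k8s-master-03

    [k8s-workers]
    k8s-worker-01
    k8s-worker-02
    k8s-worker-03

    [harbor]
    harbor-01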

Define the playbook. Sample deploy-k8s-vms-prod.yml:
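
A minimal sketch of such a playbook, built around the vmware_guest module and the variables assumed in answerfile.yml above (adjust to whatever the checked-out playbook actually expects):

    ---
    # deploy-k8s-vms-prod.yml -- clone one vm per inventory host from the template
    - hosts: all
      gather_facts: no
      vars_files:
        - answerfile.yml
      tasks:
        - name: Clone the template into a new vm
          vmware_guest:
            hostname: "{{ vcenter_hostname }}"
            username: "{{ vcenter_username }}"
            password: "{{ vcenter_password }}"
            validate_certs: no
            datacenter: "{{ vcenter_datacenter }}"
            cluster: "{{ vcenter_cluster }}"
            name: "{{ inventory_hostname }}"
            template: "{{ vm_template }}"
            state: poweredon
            wait_for_ip_address: yes
          delegate_to: localhost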

Let’s deploy the vms:
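
Assuming the inventory and playbook names from above:

    ansible-playbook -i vms-to-deploy deploy-k8s-vms-prod.yml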


STEP 2

Deploy the k8s cluster on top of the created vms.

sample hosts-prod.yaml:
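
An illustrative hosts file in kubespray’s YAML inventory format, assuming the vm names from step 1 and example addresses (group names follow the kubespray releases of that era; newer releases renamed them):

    all:
      hosts:
        k8s-master-01: {ansible_host: 192.168.1.21, ip: 192.168.1.21}
        k8s-master-02: {ansible_host: 192.168.1.22, ip: 192.168.1.22}
        k8s-master-03: {ansible_host: 192.168.1.23, ip: 192.168.1.23}
        k8s-worker-01: {ansible_host: 192.168.1.24, ip: 192.168.1.24}
        k8s-worker-02: {ansible_host: 192.168.1.25, ip: 192.168.1.25}
        k8s-worker-03: {ansible_host: 192.168.1.26, ip: 192.168.1.26}
      children:
        kube-master:
          hosts:
            k8s-master-01:
            k8s-master-02:
            k8s-master-03:
        etcd:
          hosts:
            k8s-master-01:
            k8s-master-02:
            k8s-master-03:
        kube-node:
          hosts:
            k8s-worker-01:
            k8s-worker-02:
            k8s-worker-03:
        k8s-cluster:
          children:
            kube-master:
            kube-node: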

sample inventory.ini (make sure the inventory file contains the vms’ proper names, i.e. the ones defined under “ansible-deploy-vmware-vm/vms-to-deploy”):
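
The equivalent in INI form; whichever file you pass to ansible-playbook with -i is the one that matters, and the host names must match the vms created in step 1 (addresses again are examples):

    k8s-master-01 ansible_host=192.168.1.21 ip=192.168.1.21
    k8s-master-02 ansible_host=192.168.1.22 ip=192.168.1.22
    k8s-master-03 ansible_host=192.168.1.23 ip=192.168.1.23
    k8s-worker-01 ansible_host=192.168.1.24 ip=192.168.1.24
    k8s-worker-02 ansible_host=192.168.1.25 ip=192.168.1.25
    k8s-worker-03 ansible_host=192.168.1.26 ip=192.168.1.26

    [kube-master]
    k8s-master-01
    k8s-master-02
    k8s-master-03

    [etcd]
    k8s-master-01
    k8s-master-02
    k8s-master-03

    [kube-node]
    k8s-worker-01
    k8s-worker-02
    k8s-worker-03

    [k8s-cluster:children]
    kube-master
    kube-node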

Run the playbook to create the cluster:
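
From the kubespray checkout, assuming the inventory file above:

    ansible-playbook -i inventory.ini --become --become-user=root cluster.yml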


STEP 3

MetalLB is a bare-metal load balancer for k8s that lets you hand out IPs from your existing network to Kubernetes services of type LoadBalancer, effectively extending your current network into the cluster.

Connect to one of the master nodes over ssh and setup your environment:
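
A kubespray-built master keeps the admin kubeconfig in /etc/kubernetes/admin.conf, so copying it into your user’s home is enough to get kubectl working:

    mkdir -p $HOME/.kube
    sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    kubectl get nodes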

Deploy metalLB by performing the following on one of the master nodes:
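
The upstream manifests can be applied directly; v0.9.6 is used below as an example, so check the MetalLB docs for whatever version is current when you read this:

    kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/namespace.yaml
    kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.6/manifests/metallb.yaml
    # on first install only, create the secret used by the speakers' memberlist
    kubectl create secret generic -n metallb-system memberlist \
      --from-literal=secretkey="$(openssl rand -base64 128)"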

MetalLB remains idle until configured. As such, we need to define an IP range the k8s cluster can use and that is outside of any DHCP pool.
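
A layer2 address pool is the simplest option; the range below is an example, so pick one that suits your network:

    # metallb-config.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      namespace: metallb-system
      name: config
    data:
      config: |
        address-pools:
        - name: default
          protocol: layer2
          addresses:
          - 192.168.1.240-192.168.1.250

Apply it with kubectl apply -f metallb-config.yaml.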

Check the status of the pods with:
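
All of MetalLB’s pods (the controller plus one speaker per node) live in the metallb-system namespace:

    kubectl get pods -n metallb-system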


STEP 4

Install prometheus, grafana and alertmanager:
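
One straightforward route is the kube-prometheus project, which creates prometheus, grafana and alertmanager in a “monitoring” namespace; a sketch (check the project README for the branch matching your k8s version):

    git clone https://github.com/prometheus-operator/kube-prometheus.git
    cd kube-prometheus
    kubectl create -f manifests/setup
    # wait for the CRDs to be established, then:
    kubectl create -f manifests/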

In order to access the dashboards via the LoadBalancer IPs, we need to change a few service types from “ClusterIP” to “LoadBalancer”:
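
Assuming the kube-prometheus service names (grafana, prometheus-k8s and alertmanager-main in the monitoring namespace), a patch per service does the trick; adjust the names if your install differs:

    kubectl -n monitoring patch svc grafana -p '{"spec": {"type": "LoadBalancer"}}'
    kubectl -n monitoring patch svc prometheus-k8s -p '{"spec": {"type": "LoadBalancer"}}'
    kubectl -n monitoring patch svc alertmanager-main -p '{"spec": {"type": "LoadBalancer"}}'
    # MetalLB should now hand each service an external IP
    kubectl -n monitoring get svc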

In order to access the web interface of these services, use the following ports:

– grafana -> 3000 (default usr/pass is admin/admin)
– prometheus -> 9090
– alertmanager -> 9093

STEP 5

Installing the metrics-server or how to get “kubectl top nodes” and “kubectl top pods” to work.
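
A sketch using the upstream manifest; in a home lab with self-signed kubelet certificates you will typically also need to add the --kubelet-insecure-tls flag to the metrics-server deployment:

    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    # a minute or so later:
    kubectl top nodes
    kubectl top pods --all-namespaces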


STEP 6

Installing Harbor with clair and notary support.

ssh to your harbor vm and install docker and docker-compose:
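
On Ubuntu 18.04 the distro packages are enough for a lab setup:

    sudo apt-get update
    sudo apt-get install -y docker.io docker-compose
    sudo systemctl enable --now docker
    sudo usermod -aG docker $USER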

Logout and login and check docker:
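
Logging back in picks up the new docker group membership, after which docker should respond without sudo:

    docker version
    docker-compose version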

Time to generate the SSL certificates:
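
A self-signed CA plus a server certificate is enough for a lab; harbor.example.local is used as the hostname throughout these examples, so substitute your own:

    # CA
    openssl genrsa -out ca.key 4096
    openssl req -x509 -new -nodes -sha512 -days 3650 -key ca.key \
      -subj "/CN=harbor.example.local" -out ca.crt

    # server key and CSR
    openssl genrsa -out harbor.example.local.key 4096
    openssl req -new -sha512 -key harbor.example.local.key \
      -subj "/CN=harbor.example.local" -out harbor.example.local.csr

    # sign it, including a SAN (required by modern docker clients)
    echo "subjectAltName = DNS:harbor.example.local" > v3.ext
    openssl x509 -req -sha512 -days 3650 -extfile v3.ext \
      -CA ca.crt -CAkey ca.key -CAcreateserial \
      -in harbor.example.local.csr -out harbor.example.local.crt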

Download and install Harbor:
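
Grab the offline installer from the Harbor releases page; the version below is only an example, so adjust it to whatever you want to run:

    wget https://github.com/goharbor/harbor/releases/download/v1.10.0/harbor-offline-installer-v1.10.0.tgz
    tar xzvf harbor-offline-installer-v1.10.0.tgz
    cd harbor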

Edit harbor.yml and change a few options:
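
The relevant bits are the hostname, the https section pointing at the certificates generated earlier, and the admin password (paths and values below are examples):

    hostname: harbor.example.local

    https:
      port: 443
      certificate: /etc/harbor/certs/harbor.example.local.crt
      private_key: /etc/harbor/certs/harbor.example.local.key

    harbor_admin_password: Harbor12345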

Finally, install Harbor, enabling clair and notary:
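
The installer has flags for both:

    sudo ./install.sh --with-notary --with-clair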

Configure the docker daemon on each of your worker nodes to trust the Harbor CA certificate:
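
Docker trusts per-registry CAs dropped under /etc/docker/certs.d/<registry>/; assuming the harbor.example.local name from earlier:

    sudo mkdir -p /etc/docker/certs.d/harbor.example.local
    sudo cp ca.crt /etc/docker/certs.d/harbor.example.local/ca.crt
    sudo systemctl restart docker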

Login to a k8s master node and create a secret object for harbor:
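
A docker-registry secret carries the Harbor credentials the kubelets will use when pulling; the secret name and credentials below are examples and are referenced again in the deployment later:

    kubectl create secret docker-registry harbor-registry \
      --docker-server=harbor.example.local \
      --docker-username=admin \
      --docker-password='Harbor12345' \
      --docker-email=admin@example.local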

Login to the harbor web-interface and create a new project called “private”.

To deploy images to Harbor, we need to pull them, tag them and push them. From the harbor machine (or any other that has the certificates in place as above), perform the following:
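
Using the kuard image that gets deployed in the next step as an example:

    docker login harbor.example.local   # admin / the password from harbor.yml
    docker pull gcr.io/kuar-demo/kuard-amd64:blue
    docker tag gcr.io/kuar-demo/kuard-amd64:blue harbor.example.local/private/kuard-amd64:blue
    docker push harbor.example.local/private/kuard-amd64:blue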

Deploy the kuard app on k8s. SSH to a node where you have access to the cluster and create kuard-deployment.yaml:
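
A minimal deployment pulling from the “private” project and referencing the pull secret created earlier (adjust the registry hostname and secret name if yours differ):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: kuard
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: kuard
      template:
        metadata:
          labels:
            app: kuard
        spec:
          containers:
          - name: kuard
            image: harbor.example.local/private/kuard-amd64:blue
            ports:
            - containerPort: 8080
          imagePullSecrets:
          - name: harbor-registry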

Apply it.
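
Then confirm the pod starts and that the image was pulled from Harbor:

    kubectl apply -f kuard-deployment.yaml
    kubectl get pods -o wide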


STEP 7

Signing docker images with Notary. Install it first:
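
Prebuilt binaries are published on the Notary GitHub releases page; v0.6.1 is used below as an example, so check the releases page for the current asset name:

    wget https://github.com/theupdateframework/notary/releases/download/v0.6.1/notary-Linux-amd64
    chmod +x notary-Linux-amd64
    sudo mv notary-Linux-amd64 /usr/local/bin/notary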

If needed, copy the “ca.crt” to the client machine you’re working from.

Check if you can connect to the harbor server:

You should get something like this:

Now pull an image from docker hub and tag it but don’t push it just yet.
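
For example, using alpine and the hostname from earlier:

    docker pull alpine:latest
    docker tag alpine:latest harbor.example.local/private/alpine:signed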

Let’s enable Docker Content Trust and then push the image. Please note that when first pushing a signed image, you will be asked to create a password.
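
Harbor exposes its Notary server on port 4443 by default; with content trust enabled, the push signs the tag and prompts for the root and repository key passphrases:

    export DOCKER_CONTENT_TRUST=1
    export DOCKER_CONTENT_TRUST_SERVER=https://harbor.example.local:4443
    docker push harbor.example.local/private/alpine:signed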

Ok, so now the Docker image is pushed to our registry and it is signed by the Notary server. We can verify the signed tags with the Notary client:
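
Assuming the CA file and hostname from the earlier steps:

    notary -s https://harbor.example.local:4443 --tlscacert ./ca.crt \
      -d ~/.docker/trust list harbor.example.local/private/alpine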

Test that you can pull from docker hub and harbor:
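
With content trust still enabled, both a signed upstream image and our signed Harbor image should pull fine:

    docker pull alpine:latest
    docker pull harbor.example.local/private/alpine:signed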

To use a completely private registry, leave the “DOCKER_CONTENT_TRUST_SERVER” and “DOCKER_CONTENT_TRUST” environment variables set on all the k8s cluster machines.

STEP 8

Clair is an open source project for the static analysis of vulnerabilities in application containers. To enable it go to the Harbor web-interface and click on “vulnerability” -> “edit” and select the scan frequency. Save and also click “scan now” if it’s your first time doing this.

ESX 6.5 and 6.7 on HPE G6/G7 server PSOD fix

Background

HP’s done it again. They’ve managed to break their custom ESX ISO on the G6 and G7 servers. I suspect it’s the same for G8.

If like me you found yourself perplexed by the PSOD (pink screen of death) after upgrading from ESX 6.0 to 6.5, keep on reading for the fix (skip to the bottom for the download link to an already fixed ISO).

The issue

It seems the “hpe-smx-provider” driver version 6.5.0 from the ESX 6.5 ISO is causing the PSOD.

The fix and the drawbacks

(Simply) Replace the driver with its older counterpart (version 6.0.0) and reinstall (see below for the procedure, or check the “download link” section at the very end for an already fixed ISO). Unfortunately, if you decide to create a custom ISO with the previous version of “hpe-smx-provider”, you will no longer be able to upgrade to any future ESX version and will need to do a full installation every time (thank you, HP!). That being said, here is the procedure to customize the ISO:

Customize your own HPE ESX ISO

  1. Install PowerCLI from here.  At the time of this writing, the latest version was 6.5.0R1.
  2. Download both the ESX 6.5 and 6.0 offline bundles and save them to a convenient place (ex: C:\HP) — download from this link. To make it easier, rename them to something like HPE_ESX6.5.zip and HPE_ESX6.0.zip.
  3. Open up PowerCLI and do the following (profile names in angle brackets are placeholders; use the names that Get-EsxImageProfile actually prints):

    # Add the 6.5 bundle
    Add-EsxSoftwareDepot C:\HP\HPE_ESX6.5.zip

    # check that the profile was loaded and the vendor is HPE
    Get-EsxImageProfile | Select-Object Name, Vendor

    # clone the profile (feel free to specify whatever you like for the “vendor” when prompted)
    New-EsxImageProfile -CloneProfile "<original-HPE-6.5-profile-name>" -Name "HPE-ESX6.5-CUSTOM"

    # double check the profile was indeed cloned
    Get-EsxImageProfile | Select-Object Name, Vendor

    # Remove the “hpe-smx-provider” driver from the clone
    Remove-EsxSoftwarePackage -ImageProfile "HPE-ESX6.5-CUSTOM" -SoftwarePackage hpe-smx-provider

    # Add the 6.0 bundle
    Add-EsxSoftwareDepot C:\HP\HPE_ESX6.0.zip

    # check the profile
    Get-EsxImageProfile | Select-Object Name, Vendor

    # check driver versions from both bundles
    Get-EsxSoftwarePackage -Name hpe-smx-provider

    # Add hpe-smx-provider version 6.00 to the custom profile
    $smx = Get-EsxSoftwarePackage -Name hpe-smx-provider | Where-Object { $_.Version -like "600.*" }
    Add-EsxSoftwarePackage -ImageProfile "HPE-ESX6.5-CUSTOM" -SoftwarePackage $smx

    # Export the profile to an ISO
    Export-EsxImageProfile -ImageProfile "HPE-ESX6.5-CUSTOM" -ExportToIso -FilePath C:\HP\VMware-ESXi-6.5.0-HPE-CUSTOM.iso

The results

Download links

The following images have hpe-smx-provider version 600.03.11.00.9-2768847. Nothing else was changed. The 6.5.0d (build 5310538) has been running on an HP ProLiant DL380 G7 since March 2018 and was also tested on a DL380 G6 for a couple of months. For convenience, I’ve also built a 6.7.0 ISO and, at the request of a reader, 6.7.0 Update 1 from April 2019:

VMware-ESXi-6.5.0-5310538-HPE-650.10.1.5.20-Oct2017_CUSTOM.iso
VMware-ESXi-6.7.0-9484548-HPE-Gen9plus-670.10.3.5.6-Sep2018_CUSTOM.iso
VMware-ESXi-6.7.0-Update1-11675023-HPE-Gen9plus-670.U1.10.4.0.19-Apr2019_CUSTOM.iso

Enjoy!