Category Archives: Server

For information related to servers.

Proxmox Monitoring

Proxmox Cluster Metrics

We set up a Grafana dashboard to monitor our Proxmox cluster. The main components in this monitoring stack are Grafana, InfluxDB, and the built-in Proxmox metric server feature.

The following sections cover the setup and configuration of our Proxmox monitoring stack.

Set Up and Configuration

The following video explains how to set up a Grafana dashboard for Proxmox. This installation uses the Proxmox external metric server feature to feed data to InfluxDB.


Monitoring Proxmox with Grafana
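
For reference, the Proxmox side of this setup lives in the external metric server configuration, which Proxmox stores in /etc/pve/status.cfg. A minimal sketch of such an entry (the name, address, and port here are assumptions, not our actual values):

influxdb: grafana-metrics
        server 192.168.1.50
        port 8089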

And here is a video that explains setting up self-signed certificates –


Configuring Self-Signed Certificates

We are using the Proxmox [Flux] dashboard with our setup.

Server Cluster

Proxmox Cluster Configuration

Our server cluster consists of three servers. Our approach was to pair one high-capacity server (a Dell R740 dual-socket machine) with two smaller Supermicro servers.

| Node | Model | CPU | RAM | Storage | OOB Mgmt. | Network |
| pve1 | Dell R740 | 2 x Xeon Gold 6154 3.0 GHz (36 cores) | 768 GB | 16 x 3.84 TB SSDs | iDRAC | 2 x 10 GbE, 2 x 25 GbE |
| pve2 | Supermicro 5018D-FN4T | Xeon D-1540 2.0 GHz (8 cores) | 128 GB | 2 x 7.68 TB SSDs | IPMI | 2 x 1 GbE, 4 x 10 GbE |
| pve3 | Supermicro 5018D-FN4T | Xeon D-1540 2.0 GHz (8 cores) | 128 GB | 2 x 7.68 TB SSDs | IPMI | 2 x 1 GbE, 4 x 10 GbE |

Cluster Servers

This approach allows us to handle most of our workloads on the high-capacity server, gain the advantages of high availability (HA), and move workloads to the smaller servers to prevent downtime during maintenance activities.

Server Networking Configuration

All three servers in our cluster have similar networking interfaces consisting of:

  • An OOB management interface (iDRAC or IPMI)
  • Two low-speed ports (1 GbE or 10 GbE)
  • Two high-speed ports (10 GbE or 25 GbE)
  • PVE2 and PVE3 each have an additional two high-speed ports (10 GbE) via an add-on NIC

The following table shows the interfaces on our three servers and how they are mapped to the various functions available via a standard set of bridges on each server.

| Cluster Node | OOB Mgmt. | PVE Mgmt. | Low-Speed Svcs. | High-Speed Svcs. | Storage Svcs. |
| pve1 (R740) | 1 GbE iDRAC | 10 GbE Port 1 | 10 GbE Port 2 | 25 GbE Port 1 | 25 GbE Port 2 |
| pve2 (5018D-FN4T) | 1 GbE IPMI | 10 GbE Port 1 | 1 GbE Ports 1 & 2 (LAG) | 10 GbE Ports 3 & 4 (LAG) | 10 GbE Port 2 |
| pve3 (5018D-FN4T) | 1 GbE IPMI | 10 GbE Port 1 | 1 GbE Ports 1 & 2 (LAG) | 10 GbE Ports 3 & 4 (LAG) | 10 GbE Port 2 |

Each machine uses a combination of interfaces and bridges to realize a standard networking setup. PVE2 and PVE3 also utilize LACP bonds to provide higher capacity for the low-speed and high-speed service bridges.

You can see how we configured the LACP Bond interfaces in this video.

Network Bonding on Proxmox
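
As a rough sketch of what an LACP bond and its service bridge look like in /etc/network/interfaces (the interface names, bond number, and bridge number here are illustrative, not our exact configuration):

auto bond0
iface bond0 inet manual
        bond-slaves enp1s0f2 enp1s0f3
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4

auto vmbr2
iface vmbr2 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0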

We must add specific routes to ensure the separate Storage VLAN is used for Virtual Disk I/O. This is done via the following adjustments to the vmbr3 bridge in /etc/network/interfaces.
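
A minimal sketch of this kind of adjustment (the bridge address, the 192.168.100.0/24 storage subnet, and the bond name are assumptions based on the addressing described below, not our exact configuration):

auto vmbr3
iface vmbr3 inet static
        address 192.168.100.11/32
        bridge-ports bond1
        bridge-stp off
        bridge-fd 0
        # Send Storage VLAN traffic directly out this bridge rather than via the router
        post-up ip route add 192.168.100.0/24 dev vmbr3
        post-down ip route del 192.168.100.0/24 dev vmbr3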

Finally, use the IP address the target NAS uses in the Storage VLAN when configuring the NFS share for PVE-storage. This ensures that the dedicated Storage VLAN will be used for Virtual Disk I/O by all nodes in our Proxmox Cluster. We ran

# traceroute <storage NAS IP>

from each of our servers to confirm that we have a direct LAN connection to PVE-Storage that does not go through our router.

Cluster Setup

We are currently running a three-server Proxmox cluster. Our servers consist of:

  • A Dell R740 Server
  • Two Supermicro 5018D-FN4T Servers

The first step was to prepare each server in the cluster as follows:

  • Install and configure Proxmox
  • Set up a standard networking configuration
  • Confirm that all servers can ping the shared storage NAS using the storage VLAN

We used the procedure in the following video to set up and configure our cluster –

The first step was to use the pve1 server to create a cluster. Next, we added the other servers to the cluster. If there are problems with connecting to shared stores, check the following:

  • Is the Storage VLAN connection using an address like 192.168.100.<srv>/32?
  • Is there a direct route for VLAN 1000 (Storage) that does not use the router? Check via traceroute <storage-addr>
  • Is the target NAS drive sitting on the Storage VLAN with multiple gateways enabled?
  • Can you ping the storage server from inside the server Proxmox instances?

Backups

For backups to work correctly, we need to modify the Proxmox /etc/vzdump.conf file to set the tmpdir to /var/tmp/ as follows:

# vzdump default settings

tmpdir:  /var/tmp/
#tmpdir: DIR
#dumpdir: DIR
...

This causes all of our backups to use /var/tmp/ as the working directory when creating backup archives.

We later upgraded to Proxmox Backup Server. You can see how PBS was installed and configured here.

NFS Backup Mount

We set up an NFS backup mount on one of our NAS drives to store Proxmox backups.

An NFS share was set up on NAS-5 as follows:

  • Share PVE-backups (/volume2/PVE-backups)
  • Used the default Management Network

A Storage volume was configured in Proxmox to use for backups as follows:

NAS-5 NFS Share for PVE Backups

A Note About DNS Load

Proxmox constantly does DNS lookups on the servers associated with NFS and other mounted filesystems, which can result in very high transaction loads on our DNS servers. To avoid this problem, we replaced the server domain names with the associated IP addresses. Note that this cannot be done for the virtual mount for the Proxmox Backup Server, as PBS uses a certificate to validate the domain name used to access it. These adjustments can be made by editing the storage configuration file at /etc/pve/storage.cfg on any node in the cluster (changes in this file are synced for all nodes).
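
As an illustration (hypothetical values), an NFS entry in /etc/pve/storage.cfg with the hostname replaced by the NAS's IP address looks roughly like this; before the change, the server line would have held a name such as nas-5.<our-domain> instead:

nfs: PVE-backups
        server 192.168.1.35
        export /volume2/PVE-backups
        path /mnt/pve/PVE-backups
        content backup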

NFS Virtual Disk Mount

We also created an NFS share for VM and LXC virtual disk storage. The volume chosen provides high-speed SSD storage on a dedicated Storage VLAN.

Global Backup Job

A Datacenter level backup job was set up to run daily at 1 am for all VMs and containers as follows (this was later replaced with Proxmox Backup Server backups as explained here):

Proxmox Backup Job

The following retention policy was used:

Proxmox Backup Retention Policy

Node File Backups

We installed the Proxmox Backup Client on each of our server nodes and created a cron-scheduled script that backs up the files on each node to our Proxmox Backup Server each day. The following video explains how to install and configure the PBS client.

For the installation to work properly, the locations of the PBS repository and access credentials must be set in both the script and the login bash shell. We also need to create a cron job to run the backup script daily.
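
A minimal sketch of such a script and its cron entry (the repository string, datastore name, and paths are assumptions; use the values from your own PBS setup):

# /root/pbs-host-backup.sh (illustrative)
#!/bin/bash
# PBS repository and credentials (also exported in the login shell)
export PBS_REPOSITORY="root@pam@pbs.example.net:pbs-backups"
export PBS_PASSWORD="<PBS password or API token secret>"

# Back up this node's root filesystem as a host backup
proxmox-backup-client backup root.pxar:/ --backup-id "$(hostname)"

# Cron entry (crontab -e) to run the script daily at 02:00
0 2 * * * /root/pbs-host-backup.sh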

Setup SSL Certificates

We use the procedure in the video below to set up signed SSL certificates for our three server nodes and the Proxmox Backup server.

This approach uses a DNS-01 challenge via Cloudflare DNS to authenticate with Let’s Encrypt and obtain a signed certificate for each server node in the cluster and for PBS.

Setup SSH Keys

A public/private key pair is created and set up for Proxmox VE and all VMs and LXC containers to ensure secure SSH access. The following procedure is used to do this. The public key is installed on each server using the ssh-copy-id username@host command.
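
For example (the user and host names here are illustrative):

# Generate an ed25519 key pair on the admin machine
ssh-keygen -t ed25519 -C "homelab-admin"

# Install the public key on each server or VM
ssh-copy-id root@pve1.example.net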

Setup Remote Syslog

Sending logs to a remote syslog server requires the manual installation of the rsyslog service as follows –

# apt update && apt install rsyslog

Once the service is installed, you can create the following file to set the address of your remote Syslog server –

# vi /etc/rsyslog.d/remote-logging.conf
...
# Forward all log messages to the remote syslog server
# (a single @ forwards via UDP; use @@ for TCP)
*.*  @syslog.mydomain.com:514
...
# systemctl restart rsyslog

High Availability (HA)

Proxmox can support automatic failover (High Availability) of VMs and Containers to any node in a cluster. The steps to configure this are:

  • Move the virtual disks for all VMs and LXC containers to shared storage. In our case, this is PVE-storage. Note that our TrueNAS VM must run on pve1 as it uses disks that are only available on pve1.
  • Enable HA for all VMs and LXCs (except TrueNAS)
  • Set up an HA group to govern where the VMs and LXC containers migrate to if a node fails

Cluster Failover Configuration – VMs & LXCs

We generally run all of our workloads on pve1 since it is our cluster’s highest-performance and highest-capacity node. Should this node fail, we want the pve1 workload to be distributed evenly between the pve2 and pve3 nodes. We can do this by setting up an HA Failover Group as follows:

HA Failover Group Configuration

The nofailback option is set so workloads don’t automatically migrate back to pve1 when we manually move them to other nodes to support maintenance operations.
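
We configured this through the GUI; a rough CLI equivalent (the group name, node priorities, and VM ID are illustrative) looks like this:

# HA group that prefers pve1 and fails over evenly to pve2/pve3;
# nofailback keeps workloads where they are after a manual migration
ha-manager groupadd ha-main --nodes "pve1:2,pve2:1,pve3:1" --nofailback 1

# Add a VM as an HA resource in that group
ha-manager add vm:101 --group ha-main --state started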

Proxmox Backup Server

This page covers the installation of the Proxmox Backup Server in our HomeLab. Our approach will be to run the Proxmox Backup Server (PBS) in a VM on our server and use shared storage on one of our NAS drives to store backups.

Proxmox Backup Server Installation

We used the following procedure to install PBS on our server.

PBS was created using the recommended VM settings in the video. The VM is created with the following resources:

  • 4 CPUs
  • 4096 MB Memory
  • 32 GB SSD Storage (Shared PVE-storage)
  • HS Services Network

Once the VM is created, the next step is to run the PBS installer.

After the PBS install is complete, PBS is booted, the QEMU Guest Agent is installed, and the VM is updated using the following commands –

# apt update
# apt upgrade
# apt-get install qemu-guest-agent
# reboot

PBS can now be accessed via the web interface using the following URL –

https://<PBS VM IP Address>:8007

Create a Backup Datastore on a NAS Drive

The steps are as follows –

  • Install CIFS utils
# Install the CIFS utilities package
apt install cifs-utils
  • Create a mount point for the NAS PBS store
mkdir /mnt/pbs-store
  • Create a Samba credentials file to enable logging into the NAS share
vi /etc/samba/.smbcreds
...
username=<NAS Share User Name>
password=<NAS Share Password>
...
chmod 400 /etc/samba/.smbcreds
  • Test mount the NAS share in PBS and make a directory to contain the PBS backups
mount -t cifs \
    -o rw,vers=3.0,credentials=/etc/samba/.smbcreds,uid=backup,gid=backup \
    //<nas-#>.anita-fred.net/PBS-backups \
    /mnt/pbs-store
mkdir /mnt/pbs-store/pbs-backups
  • Make the NAS share mount permanent by adding it to /etc/fstab
vi /etc/fstab
...after the last line add the following line
# Mount PBS backup store from NAS
//nas-#.anita-fred.net/PBS-backups /mnt/pbs-store cifs vers=3.0,credentials=/etc/samba/.smbcreds,uid=backup,gid=backup,defaults 0 0
  • Create a datastore to hold the PBS backups in the Proxmox Backup Server as follows. The datastore will take some time to create (be patient).
PBS Datastore Configuration
PBS Datastore Prune Options
  • Add the PBS store as storage at the Proxmox datacenter level. Use the information from the PBS dashboard to set the fingerprint (a sketch of the resulting storage entry follows this list).
PBS Storage in Proxmox VE
  • The PBS-backups store can now be used as a target in Proxmox backups. NOTE THAT YOU CANNOT BACK UP THE PBS VM TO PBS-BACKUPS.
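
The datacenter-level PBS storage definition from the step above ends up in /etc/pve/storage.cfg. A hedged sketch of what the entry looks like (the IDs, address, and fingerprint are placeholders; the password is entered in the GUI and stored separately under /etc/pve/priv/):

pbs: PBS-backups
        datastore pbs-backups
        server <PBS VM IP Address>
        username root@pam
        fingerprint <fingerprint from the PBS dashboard>
        content backup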

Setup Boot Delay

The NFS share for the Proxmox Backup store needs time to start before the Backup server starts on boot. This can be set for each node under System/Options/Start on Boot delay. A 30-second delay seems to work well.

Setup Backup, Pruning, and Garbage Collection

The overall schedule for Proxmox backup operations is as follows:

  • 03:00 – Run Pruning on the PBS-backups store
  • 03:30 – Run PBS Backups on all VMs and LXCs EXCEPT for the PBS Backup Server VM
  • 04:00 – Run a standard PVE Backup on the PBS Backup Server VM (run in suspend mode; stop mode causes problems)
  • 04:30 – Run Garbage Collection on the PBS-backups store
  • 05:00 – Verify all backups in the PBS-backups store

Local NTP Servers

We want Proxmox and Proxmox Backup Server to use our local NTP servers for time synchronization. To do this, modify /etc/chrony/chrony.conf to use our servers for the pool. This must be done on each server individually and inside the Proxmox Backup Server VM. See the following page for details.

Backup Temp Directory

Proxmox backups use vzdump to create compressed backups. By default, backups use /var/tmp, which lives on the boot drive of each node in a Proxmox Cluster. To ensure adequate space for vzdump and reduce the load on each server’s boot drive, we have configured a temp directory on the local ZFS file systems on each of our Proxmox servers. The tmp directory configuration needs to be done on each node in the cluster (details here). The steps to set this up are as follows:

# Create a tmp directory on local node ZFS stores
# (do this once for each server in the cluster)
cd /zfsa
mkdir tmp

# Turn on and verify ACL for ZFSA store
zfs get acltype zfsa
zfs set acltype=posixacl zfsa
zfs get acltype zfsa

# Configure vzdump to use the ZFS tmp dir
# add/set tmpdir as follows 
# (do on each server)
cd /etc
vi vzdump.conf
tmpdir: /zfsa/tmp
:wq

Proxmox VE

This page covers the Proxmox install and setup on our server. You can find a great deal of information about Proxmox in the Proxmox VE Administrator’s Guide.

Proxmox Installation/ZFS Storage

Proxmox was installed on our server using the steps in the following video:

The Proxmox boot images are installed on NVMe drives (ZFS RAID1 on our Dell server’s BOSS card, or a single ZFS disk on the NVMe drives in our Supermicro servers). This video also covers the creation of a ZFS storage pool and filesystem. A single filesystem called zfsa was set up using RAID10 and lz4 compression across four SSDs on each server.

A Proxmox VE Community subscription was purchased and installed for each node. The Proxmox installation was updated on each server using the Enterprise Repository.

Linux Configuration

I like to install a few additional tools to help me manage our Proxmox installations. They include the nslookup and ifconfig commands and the tmux terminal multiplexer. The commands to install these tools are found here.
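
On a stock Proxmox (Debian) install, these typically come from the following packages (a sketch assuming the standard Debian repositories):

# nslookup is provided by dnsutils, ifconfig by net-tools
apt update
apt install dnsutils net-tools tmux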

Cluster Creation

With these steps done, we can create a 3-node cluster. See our Cluster page for details.

ZFS Snapshots

Creating ZFS snapshots of the Proxmox installation can be useful before making changes. This enables rollback to a previous version of the filesystem should any changes need to be undone. Here are some useful commands for this purpose:

# List existing snapshots
zfs list -t snapshot
# List datasets (to identify the node's root dataset)
zfs list
# Create, roll back to, or destroy a snapshot of the root dataset
zfs snapshot rpool/ROOT/<node-name>@<snap-name>
zfs rollback rpool/ROOT/<node-name>@<snap-name>
zfs destroy rpool/ROOT/<node-name>@<snap-name>

Be careful to select the proper dataset – snapshots of the pool that contains the dataset don’t support this use case. Also, you can only roll back directly to the latest snapshot. If you want to roll back to an earlier snapshot, you must first destroy all of the later snapshots.

In the case of a Proxmox cluster node, the shared files in the associated cluster filesystem will not be included in the snapshot. You can learn more about the Proxmox cluster file system and its shared files here.

You can view all of the snapshots inside the invisible /.zfs directory on the host filesystem as follows:

# cd /.zfs/snapshot/<name>
# ls -la

Local NTP Servers

We want Proxmox and Proxmox Backup Server to use our local NTP servers for time synchronization. To do this, we need to modify /etc/chrony/chrony.conf to use our servers for the pool. This needs to be done on each server individually and inside the Proxmox Backup Server VM. See the following page for details.

The first step before following the configuration procedures above is to install chrony on each node –

apt install chrony
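
A hedged example of the chrony.conf change (the server names are placeholders for our local NTP servers):

# vi /etc/chrony/chrony.conf
...
# Comment out the default Debian pool and use our local NTP servers instead
#pool 2.debian.pool.ntp.org iburst
server ntp1.example.net iburst
server ntp2.example.net iburst
...
# systemctl restart chrony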

Mail Forwarding

We used the following procedure to configure postfix to support forwarding e-mail through smtp2go. Postfix does not seem to work with passwords containing a $ sign. A separate login was set up in smtp2go for forwarding purposes.

Some key steps in the process include:

# Install postfix and the supporting modules
# for smtp2go forwarding
sudo apt-get install postfix
sudo apt-get install libsasl2-modules

# Install mailx
sudo apt -y install bsd-mailx
sudo apt -y install mailutils

# Run this command to configure postfix
# per the procedure above
sudo dpkg-reconfigure postfix

# Use a working prototype of main.cf to edit
sudo vi /etc/postfix/main.cf

# Setup /etc/mailname -
#   use version from working server
#   MAKE SURE mailname is lower case/matches DNS
uname -n | sudo tee /etc/mailname

# Restart postfix
sudo systemctl reload postfix
sudo service postfix restart

# Reboot may be needed
sudo reboot

# Test
echo "Test" | mailx -s "PVE email" <email addr>

vGPU

Our servers each include an Nvidia Tesla P4 GPU. This GPU is sharable using Nvidia’s vGPU software. The information on how to set up Proxmox for vGPU may be found here. This procedure also explains how to enable IOMMU for GPU pass-through (not sharing). We do not have IOMMU set up on our servers at this time.

You’ll need to install the git command and the cc compiler to use this procedure. This can be done with the following commands –

# apt update
# apt install git
# apt install build-essential

Now you can follow the procedure here. Be sure to include the steps to enable IOMMU. I downloaded and installed the 6.4 vGPU driver from the Nvidia site and did a final reboot of the server.

vGPU Types

The vGPU drivers support a number of GPU types. You’ll want to select the appropriate one in each VM. Note that mixing vGPU sizes on a GPU is not allowed (i.e., if one vGPU instance uses 2 GB of memory, all instances on that GPU must). The following table shows the types available (this data can be obtained by running mdevctl types on your system).

Q Profiles - Not Good for OpenGL/Games

| vGPU Type | Name | Memory | Instances |
| nvidia-63 | GRID P4-1Q | 1 GB | 8 |
| nvidia-64 | GRID P4-2Q | 2 GB | 4 |
| nvidia-65 | GRID P4-4Q | 4 GB | 2 |
| nvidia-66 | GRID P4-8Q | 8 GB | 1 |

A Profiles - Windows VMs

| vGPU Type | Name | Memory | Instances |
| nvidia-67 | GRID P4-1A | 1 GB | 8 |
| nvidia-68 | GRID P4-2A | 2 GB | 4 |
| nvidia-69 | GRID P4-4A | 4 GB | 2 |
| nvidia-70 | GRID P4-8A | 8 GB | 1 |

B Profiles - Linux VMs

| vGPU Type | Name | Memory | Instances |
| nvidia-17 | GRID P4-1B | 1 GB | 8 |
| nvidia-243 | GRID P4-1B4 | 1 GB | 8 |
| nvidia-157 | GRID P4-2B | 2 GB | 4 |
| nvidia-243 | GRID P4-2B4 | 2 GB | 4 |
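
Once a profile is selected, it is assigned to a VM as a mediated device on the GPU’s PCI address. A hedged example (the VM ID and PCI address are placeholders):

# List the mediated device types supported by the host
mdevctl types

# Give VM 101 a 2 GB A-series vGPU on the GPU at PCI address 3b:00.0
qm set 101 --hostpci0 3b:00.0,mdev=nvidia-68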

Problems with Out Of Date Keys on Server Nodes

I have occasionally seen problems with the SSH keys getting out of date on our servers. The fix for this is to run the following commands on all of the servers. A reboot is also sometimes necessary.

# Update certs and reload the PVE daemon and proxy
pvecm updatecerts -F && systemctl restart pvedaemon pveproxy

# Reboot if needed
reboot

Welcome To Our Home Lab

Home Network Dashboard

This site is dedicated to documenting the setup, features, and operation of our Home Lab. Our Home Lab consists of several different components and systems, including:

  • A high-performance home network
  • A storage system that utilizes multiple NAS devices
  • An enterprise-grade server
  • Applications, services, and websites

Home Network

Gen 2 Home Network Core Rack

Our Home Network is a two-tiered structure with a core based upon high-speed, 25 GbE-capable aggregation switches and optically connected edge switches. We use UniFi equipment throughout. We have installed multiple OM4 multi-mode fiber links from the core to each room in our house. The speed of these links ranges from 1 Gbps to 25 Gbps, with most connections running as dual-fiber LACP LAG links.

Telephone System

To be added

Surveillance System

To be added

Storage System

To be added

Enterprise Server

To be added

Backups

Daily backups for all VMs and LXC containers are configured as follows.

Applications, Services, and Websites

We are hosting several websites, including:

Set-up information for our self-hosted sites may be found here.