Tag Archives: Storage

Samba File Server

Samba File Server

We have quite a bit of high-speed SSD storage available on the pve1 server in our Proxmox cluster. We made this storage available as a NAS drive using the Turnkey File Server.

Installing the Turnkey File Server

We installed the Turnkey File Server in an LXC container that runs on our pve1 storage. This LXC will not be movable as it will be associated with SSD disks that are only available on pve1. The first step is to create a ZFS file system (zfsb) on pve1 to hold the LXC boot drive and storage.

The video below explains the procedure used to set up the File Server LXC and configure Samba shares.

The LXC container for our File Server was created with the following parameters –

  • 2 CPUs
  • 1 GB Memory
  • 8 GB Boot Disk in zfsb_mp
  • 8 TB Share Disk in zfsb_mp (mounted as /mnt/shares with PBS backups enabled.)
  • High-speed Services Network, VLAN Tag=10
  • The container is unprivileged

File Server LXC Configuration

The following steps were performed to configure our File Server –

  • Set the system name to nas-10
  • Configured postfix to forward email
  • Set the timezone
  • Installed standard tools
  • Updated the system via apt update && apt upgrade
  • Installed SSL certificates using a variation of the procedures here and here.
  • Set up Samba users, groups, and shares per the video above

Backups

Our strategy for backing up our file server is to run an rsync job via cron inside the LXC container that hosts the file server. The rsync job copies the contents of our file shares to one of our NAS drives. The NAS drive then implements a 3-2-1 Backup Strategy for our data.
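
As a rough illustration of the cron-driven rsync job described above, the script below mirrors the share disk to a NAS over SSH. The destination login, host name, target path, install path, and hourly schedule are all hypothetical placeholders, and the script defaults to a dry run that only prints the command so it can be checked before it is scheduled.

```shell
#!/bin/sh
# backup-shares.sh: mirror the file shares to a NAS drive (illustrative sketch).
# SRC is the share disk mount point from the container setup above.
# DEST is a hypothetical NAS login and path; replace it with your own.
SRC="${SRC:-/mnt/shares/}"
DEST="${DEST:-backup@nas.example.lan:/volume1/nas-10-backup/}"

backup_shares() {
    # -a preserves ownership, modes, and timestamps; --delete removes files
    # from the copy that no longer exist in the source.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "rsync -a --delete -e ssh $SRC $DEST"
    else
        rsync -a --delete -e ssh "$SRC" "$DEST"
    fi
}

backup_shares

# Example crontab entry (assumed schedule and install path), hourly on the hour:
#   0 * * * * DRY_RUN=0 /usr/local/bin/backup-shares.sh
```

Running the copy unattended from cron assumes that SSH key-based login to the NAS has already been set up for the user that owns the crontab.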

Raspberry Pi NAS

Raspberry Pi NAS

We’ve built a NAS and Docker Staging environment using a Raspberry Pi 5. Our NAS features a 2 TB NVMe SSD drive for fast shared storage on our network.

Hardware Components

Raspberry Pi 5 SBC

We use the following components to build our system –

Here’s a photo of the completed hardware assembly –

Pi NAS Internals

Software Components and Installation

We installed the following software on our system to create our NAS –

CasaOS

CasaOS GUI

CasaOS is included to add a very nice GUI for managing our NAS. Here’s a useful video on how to install CasaOS on the Raspberry Pi –

Installation

The first step is to install the 64-bit Lite version of Raspberry Pi OS. This is done by first installing a full desktop version on a flash card and booting from it.

Once this installation was done, we used the Raspberry Pi Imager to install the Lite version of the same OS release on our NVMe SSD. After removing the flash card and booting from the NVMe SSD, the following configuration changes were made –

  • Set the system name to NAS-12
  • Enabled SSH
  • Set our user ID and password
  • Applied all available updates
  • Updated /boot/firmware/config.txt to enable PCIe Gen3 operation with our SSD
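
For reference, the config.txt change usually amounts to two lines. The snippet below is a sketch based on the Raspberry Pi 5 firmware options as we understand them (dtparam=pciex1 enables the external PCIe connector and dtparam=pciex1_gen=3 requests Gen 3 speed). It only prints the lines so they can be reviewed before being appended to the file.

```shell
#!/bin/sh
# Print the config.txt lines that request PCIe Gen 3 on a Raspberry Pi 5.
pcie_gen3_lines() {
    printf '%s\n' 'dtparam=pciex1' 'dtparam=pciex1_gen=3'
}

pcie_gen3_lines

# To apply (as a user with sudo rights), then reboot:
#   pcie_gen3_lines | sudo tee -a /boot/firmware/config.txt
```

After a reboot, the LnkSta line reported by sudo lspci -vv for the NVMe device should show a speed of 8GT/s if the Gen 3 link was negotiated.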

We used the process covered in the video above to install CasaOS.

CasaOS makes all of its shares public and does not password-protect shared folders. While this may be acceptable for home use where the network is isolated from the public Internet, it certainly is not a good security practice.

Fortunately, the Debian Linux-derived distro we are running includes Samba file share support, which we can use to protect our shares properly. This article explains the basics of how to do this.

Here’s an example of the information in smb.conf for one of our shares –

[Public]
    path = /DATA/Public
    browsable = yes
    writeable = Yes
    create mask = 0644
    directory mask = 0755
    public = no
    comment = "General purpose public share"

You will also need to create a Samba user for your Samba shares to work. Samba user privileges can be added to any of the existing Raspberry Pi OS users with the following command –

# sudo smbpasswd -a <User ID to add>

It’s also important to correctly set the shared folder’s owner, group, and modes.

We need to restart the Samba service anytime configuration changes are made. This can be done with the following command –

# sudo systemctl restart smbd
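
The owner, group, and mode settings mentioned above can be applied with a small helper like the one below. This is only a sketch: it sets the same modes as the create mask and directory mask in the smb.conf example (0644 for files, 0755 for directories), and the owner and group names in the comments are placeholders.

```shell
#!/bin/sh
# Apply modes that match the smb.conf masks: 0755 for directories, 0644 for files.
set_share_modes() {
    share_dir="$1"
    find "$share_dir" -type d -exec chmod 0755 {} +
    find "$share_dir" -type f -exec chmod 0644 {} +
}

# Typical use, run as root on the NAS (owner and group are placeholders):
#   chown -R <owner>:<group> /DATA/Public
#   set_share_modes /DATA/Public
#   testparm -s        # check smb.conf for errors before restarting smbd
```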

High-Availability Storage Cluster

Synology HA Storage Cluster

We are building a High-Availability (HA) Storage Cluster to complement our Proxmox HA Server Cluster. Synology has a nice HA solution that we can use for this. To use Synology’s HA solution, one must have the following:

  • Two identical Synology NAS devices (we are using a pair of RS1221+ rack-mounted Synology NAS units)
  • Both NAS devices must have identical memory and disk configurations.
  • Both NAS devices must have at least two network interfaces available (we are using dual 10 GbE network cards in both of our NAS devices)

The two NAS devices work in an active/standby configuration and present a single IP interface for access to storage and administration.

Synology HA Documentation

Synology provides good documentation for their HA system. Here are some useful links:

The video above provides a good overview of Synology HA and how to configure it.

Storage Cluster Hardware

Synology RS1221+ NAS

We are using a pair of Synology RS1221+ rack-mounted NAS servers. Each one is configured with the following hardware options:

Networking

Our Proxmox Cluster will connect to our HA Storage Cluster via Ethernet connections. We will be storing the virtual disk drives for the VMs and LXCs in our Proxmox Cluster on our HA Storage Cluster, so maximizing the speed and minimizing the latency of these connections is important to our workload’s overall performance.

Each node in our Proxmox Cluster has dedicated high-speed connections (25 GbE for pve1, 10 GbE for pve2 and pve3) to a dedicated Storage VLAN. These connections are made through a UniFi switch – an Enterprise XG 24. This switch is supported by a large UPS that provides battery backup power for our Networking Rack.

Ubiquiti EnterpriseXG 24 Switch

This approach is taken to minimize latency, as the cluster’s storage traffic is handled entirely by a single switch.

Ideally, we would have a pair of these switches and redundant connections to our Proxmox and HA Storage clusters to maximize reliability. While this would be a nice enhancement, we have chosen to use a single switch for cost reasons.

The NAS drives in our HA Storage Cluster are configured to provide a direct interface to our Storage VLAN. This approach ensures that the nodes in our Proxmox cluster can access the HA Storage Cluster directly without a routing hop through our firewall. We also set the MTU for this network to 9000 (Jumbo Frames) to minimize packet overhead.
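
One way to confirm that jumbo frames work end to end is to send a ping that is not allowed to fragment and that exactly fills one frame. The sketch below computes the largest ICMP payload for a given MTU (the MTU minus a 20-byte IPv4 header and an 8-byte ICMP header) and prints the matching Linux ping command. The target address is a placeholder.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet for a given MTU.
icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

MTU=9000
# -M do forbids fragmentation, so the ping fails if any hop has a smaller MTU.
echo "ping -M do -c 3 -s $(icmp_payload "$MTU") <storage-vlan-address>"
```

For an MTU of 9000 this yields a payload size of 8972 bytes. If the printed command succeeds from a Proxmox node against the cluster address, every device on the path is passing full-size frames.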

Storage Design

Each Synology RS1221+ in our cluster has eight 960 GB Enterprise SSDs. The performance of the resulting storage system is important as we will be storing the disks for the VMs and LXCs in our Proxmox Cluster on our HA Storage System. The following are the criteria we used to select a storage pool configuration:

  • Performance – we want to be able to saturate the 10 GbE interfaces to our HA Storage Cluster
  • Reliability – we want to be protected against single-drive failures. We will keep spare drives and use backups to manage the chance of simultaneous multiple-drive failures.
  • Storage Capacity – we want to use the available SSD storage capacity efficiently.

We considered using either a RAID-10 or RAID-5 configuration.

Storage Devices – 960 GB Enterprise SSDs

Toshiba 960 GB SSD Performance

Our SSD drives are enterprise models with good throughput and IO/s (IOPs) performance.

960 GB SSD Reliability Features

They also offer some desirable reliability features, including good write endurance and MTBF numbers. Our drives also include sudden power-off protection to maintain data integrity in the event of a power failure that our UPS system cannot cover.

Performance Comparison – RAID-10 vs. RAID-5

We used a RAID performance calculator to estimate the performance of our storage system. Based on actual runtime data from our VMs and LXCs running in Proxmox, our IO workload is almost completely dominated by write operations. This is probably because read caching handles most read operations from memory on our servers.

The first option we considered was RAID-10. The estimated performance for this configuration is shown below.

RAID-10 Throughput Performance

As you can see, this configuration’s throughput will more than saturate our 10 GbE connections to our HA Storage Cluster.

The next option we considered was RAID-5. The estimated performance for this configuration is shown below.

RAID-5 Throughput Performance

As you can see, performance takes a substantial hit due to the need to generate and store parity data each time storage is written. Even so, the RAID-5 configuration should still be able to saturate our 10 GbE connections to the Storage Cluster.

The result is that the RAID-10 and RAID-5 configurations will provide effectively the same performance level, since both are limited by our 10 GbE connections to our Storage Cluster.

Capacity Comparison – RAID-10 vs. RAID-5

The next step in our design process was to compare the usable storage capacity between RAID-10 and RAID-5 using Synology’s RAID Calculator.

RAID-10 vs. RAID-5 Usable Storage Capacity

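Not surprisingly, the RAID-5 configuration provides considerably more usable storage than the RAID-10 configuration. With eight drives, RAID-5 keeps seven drives’ worth of capacity versus four for RAID-10, or about 1.75 times as much.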
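
The capacity difference follows directly from how the two RAID levels use drives. The short calculation below is a simplified sketch that ignores file system and DSM overhead, so the figures will be somewhat higher than what Synology’s calculator reports.

```shell
#!/bin/sh
DRIVES=8        # SSDs per RS1221+
DRIVE_GB=960    # capacity of each SSD

# RAID-10 mirrors every drive, so half of the raw capacity is usable.
raid10_gb() { echo $(( $1 * $2 / 2 )); }
# RAID-5 spends one drive's worth of capacity on parity.
raid5_gb() { echo $(( ($1 - 1) * $2 )); }

echo "RAID-10: $(raid10_gb "$DRIVES" "$DRIVE_GB") GB"
echo "RAID-5:  $(raid5_gb "$DRIVES" "$DRIVE_GB") GB"
```

For our eight 960 GB drives, this gives 3840 GB for RAID-10 and 6720 GB for RAID-5 before overhead.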

Chosen Configuration

We decided to format our SSDs as a Btrfs storage pool configured as a RAID-5. We chose RAID-5 for the following reasons:

  • A good balance between write performance and reliability
  • Efficient use of available SSD storage space
  • Acceptable overall reliability (single disk failures) given the following:
    • Our storage pools are fully redundant between the primary and secondary NAS pools
    • We run regular automatic snapshots, replications, and backups via Synology’s Hyper Backup as well as server-side backups via Proxmox Backup Server.

The following shows the expected IO/s (IOPs) for our storage system.

RAID-5 IOPs Performance

This level of performance should be more than adequate for our three-node cluster’s workload.

Dataset / Share Configuration

The final dataset format that we will use for our vdisks is TBD at this point. We plan to test the performance of both iSCSI LUNs and NFS shares. If these perform roughly the same for our workloads, we will use NFS to gain better support for snapshots and replication features. At present, we are using an NFS dataset to store our vdisks.

HA Configuration

Configuring the pair of RS1221+ NAS servers for HA was straightforward. Only minimal configuration is needed on the secondary NAS to get its storage and network configurations to match the primary NAS. The process that enables HA on the primary NAS will overwrite all of the settings on the secondary NAS.

Here are the steps that we used to do this.

  • Install all of the upgrades and SSDs in both units
  • Connect both units to our network and install an Ethernet connection between the two units for heartbeats and synchronization
  • Install DSM on each unit and set a static IP address for the network-facing Ethernet connections (we do not set IPs for the heartbeat connections – Synology HA takes care of this)
  • Configure the network interfaces on both units to provide direct interfaces to our Storage VLAN (see the previous section)
  • Make sure that the MTU settings are identical on each unit. This includes the MTU setting for unused Ethernet interfaces. We had to edit the /etc/synoinfo.conf file on each unit to set the MTU values for the inactive interfaces.
  • Ensure both units are running up-to-date versions of the DSM software
  • Configure the pair for HA (see the documentation above)
  • Complete the configuration of the cluster pair, including –
    • Shares
    • Backups
    • Snapshots and Replication
    • Install Apps

The following shows the completed configuration of our HA Storage Cluster.

Completed HA Cluster Configuration

The cluster uses a single IP address to present a GUI that configures and manages the primary and secondary NAS units as if they were a single NAS. The same IP address always points to the active NAS for file sharing and iSCSI I/O operations.

Voting Server

A voting server avoids split-brain scenarios where both units in the HA cluster try to act as the master. Any server that is always accessible via ping to both NAS drives in the cluster can serve as a Voting Server. We used the gateway for the Storage VLAN where the cluster is connected for this purpose.

Performance Benchmarking

We used the ATTO Disk Benchmarking Tool to perform benchmark tests on the complete HA cluster. The benchmarks were run from an M2 Mac Mini running macOS, which used an SMB share to access the Storage Cluster over a 10 GbE connection on the Storage VLAN.

Storage Cluster Benchmark Configuration

The following are the benchmark results –

Storage Cluster Throughput Benchmarks

The Storage Cluster’s performance is quite good, and the 10 GbE connection is saturated for 128 KB writes and larger. The slightly lower read throughput results from a combination of our SSDs’ write performance and the additional latency on writes due to the need to copy data from the primary NAS storage to the secondary NAS.

Storage Cluster IOPs Benchmarks

IOs/sec (IOPs) performance is important for the virtual disks used by VMs and LXC containers, as they frequently perform smaller writes.

We also ran benchmarks from a VM running Windows 10 in our Proxmox Cluster. These benchmarks benefit from a number of caching and compression features in our architecture, including:

  • Write Caching with the Windows 10 OS
  • Write Caching with the iSCSI vdisk driver in Proxmox
  • Write Caching on the NAS drives in our Storage Cluster

Windows VM Disk Benchmarks

The overall performance figures for the Windows VM benchmark exceed the capacity of the 10 GbE connections to the Storage Cluster and are quite good. Also, the IOPs performance is close to the specified maximum performance values for the RS1221+ NAS.

Windows VM IOPs Benchmarks

Failure Testing

The following scenarios were tested under a full workload –

  • Manual Switch between Active and Standby NAS devices
  • Simulate a network failure by disconnecting the primary NAS ethernet cable.
  • Simulate active NAS failure by pulling power from the primary NAS.
  • Simulate a disk failure by pulling a disk from the primary NAS pool.

In all cases, our system failed over within 30 seconds and continued handling the workload without error.

TrueNAS

TrueNAS SCALE Dashboard

We have quite a bit of high-speed SSD storage available on the pve1 server in our Proxmox cluster. We made this storage available as a NAS drive using TrueNAS SCALE.

Useful Links

Installing TrueNAS SCALE in a VM

We installed TrueNAS SCALE in a Virtual Machine on our pve1 storage. This VM will not be movable as it will be associated with SSD disks that are only available on pve1. TrueNAS will use the available disks to form a new ZFS pool, so we must pass through the physical disks to the TrueNAS VM. The video below explains the procedure used to set up TrueNAS SCALE in a Proxmox VM and properly pass through the disks.

The Virtual Machine for TrueNAS SCALE was created with the following parameters –

  • 4 CPUs (1 socket, enable NUMA)
  • 32 GB Memory (8 GB + 24 GB to support ZFS caching, Not a Ballooning device)
  • 64 GB Disk (use zfsa_mp for boot drive), enable Discard and SSD emulation, turn off Backup.
  • High-speed Services Network, VLAN Tag=10, use Bridge MTU (=1)
  • QEMU Guest Agent checked
  • Do not start the VM until the disks are passed through and configured in the VM (see below)

It is important to pass the physical disk drives on pve1 through to the TrueNAS VM by referencing the physical device names and serial numbers, as explained in the video.

The following is the disk name and serial number information for our server. Use the commands and procedure here to get this information.

Dev Name        Model          Unique Storage ID (SCSI)   Serial
sde - scsi1     KPM5XRUG3T84   scsi-358ce38ee208944e9     39U0A020TNVF
sdf - scsi2     KPM5XRUG3T84   scsi-358ce38ee2089452d     39U0A02HTNVF
sdg - scsi3     KPM5XRUG3T84   scsi-358ce38ee207e13c1     29P0A0GWTNVF
sdh - scsi4     KPM5XRUG3T84   scsi-358ce38ee207e876d     29S0A038TNVF
sdi - scsi5     KPM5XRUG3T84   scsi-358ce38ee2089451d     39U0A02DTNVF
sdj - scsi6     KPM5XRUG3T84   scsi-358ce38ee208844f1     39S0A0FCTNVF
sdk - scsi7     KPM5XRUG3T84   scsi-358ce38ee207e8771     29S0A039TNVF
sdl - scsi8     KPM5XRUG3T84   scsi-358ce38ee2088b599     39T0A08WTNVF
sdm - scsi9     PX04SRB384     scsi-3500003976c8a5d71     Y6T0A101TG2D
sdn - scsi10    PX04SRB384     scsi-3500003976c8a5099     Y6T0A0GLTG2D
sdo - scsi11    PX04SRB384     scsi-3500003976c8a408d     Y6S0A0AHTG2D
sdp - scsi12    PX04SRB384     scsi-3500003976c8a5259     Y6T0A0KWTG2D

Physical Disk Information for Passthrough
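
The passthrough itself is done with one qm set command per disk, referencing the stable /dev/disk/by-id name rather than the sdX name. The sketch below only prints the commands so they can be reviewed first. The VM ID of 100 is a hypothetical value, and backup=0 reflects our choice to exclude these disks from VM-level backups.

```shell
#!/bin/sh
VMID="${VMID:-100}"   # hypothetical VM ID of the TrueNAS VM

# Print the qm command that attaches one physical disk to a SCSI slot of the VM.
passthrough_cmd() {
    slot="$1"      # SCSI slot number, e.g. 1 for scsi1
    disk_id="$2"   # name under /dev/disk/by-id, e.g. scsi-358ce38ee208944e9
    echo "qm set $VMID -scsi$slot /dev/disk/by-id/$disk_id,backup=0"
}

passthrough_cmd 1 scsi-358ce38ee208944e9
passthrough_cmd 2 scsi-358ce38ee2089452d
```

The disk names, models, and serial numbers can be listed on the Proxmox node with lsblk -o NAME,MODEL,SERIAL,WWN or ls -l /dev/disk/by-id.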

The Backup option was turned off for all of the disks that were passed through to TrueNAS so that they could be backed up at a file level, which is much faster.

Options for the initial install after the first boot differ slightly from the video in the updated version of TrueNAS SCALE. The differences include –

  • Configure admin login
  • Do not create a Swap file
  • Allow EFI Boot = Yes
  • There is no need to install the qemu guest agent; it’s already installed

Create a ZFS Storage Pool

After booting TrueNAS, a ZFS storage pool was created with the passthrough disks as follows –

  • Datastore name – zfs1
  • RAID-10: Type – mirror, Width=2, VDEVs=6 (final capacity is 20.95 TB)
  • All 12 disks are auto-selected
  • All other options were left as defaults

The image below shows the configuration of the ZFS pool.

TrueNAS ZFS Dataset Configuration

Once the pool was created, the following were configured –

  • Enabled Auto TRIM
  • Configured Pool Scrubs to run on Sunday at midnight every 30 days

Expand ARC Cache Size

By default, TrueNAS will use only half of its available memory for the ZFS ARC cache. We used the procedure in the video below to expand the ARC Cache memory limit to 24 GB. You must create an init script in TrueNAS that sets the maximum ARC size (the zfs_arc_max module parameter) to an absolute number of bytes (25769803776 for 24 GB).
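
The absolute number is simply 24 GiB expressed in bytes. The sketch below shows the arithmetic and prints the kind of command an init script would run. It assumes the limit is set through the OpenZFS zfs_arc_max module parameter under /sys/module/zfs/parameters, which is where OpenZFS on Linux exposes it.

```shell
#!/bin/sh
# Convert a size in GiB to bytes.
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

ARC_MAX_BYTES="$(gib_to_bytes 24)"
echo "$ARC_MAX_BYTES"

# Command for the init script (must run as root at boot):
echo "echo $ARC_MAX_BYTES > /sys/module/zfs/parameters/zfs_arc_max"
```

The first line of output is 25769803776, which matches the value used above. The current setting can be read back from the same file to confirm that the init script took effect.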

TrueNAS Configuration

The following steps were performed to configure the TrueNAS system –

  • Configured the Dashboard
  • Admin account email set
  • Set up e-mail relay/forwarding
  • Set up a hostname (nas-10), DNS servers, MTU, etc., under the Network menu
  • Set up local NTP servers; eliminated default servers
  • Set the Timezone to New_York
  • Set up Syslog server
  • IMPORTANT: set the applications pool to zfs1 BEFORE creating any shares
  • Set up Homes dataset for user logins
  • Set up an account for user logins in the users group
  • Create shares as follows –
    • First, create a dataset and set owner and group for each share
    • Then, create an SMB share for the dataset
    • Use Default Share Parameters
    • Set up a snapshot task that runs every 15 minutes on each of the datasets/shares (creates snapshots for file rollback); also set up daily snapshots on the zfs1 and ix-applications datasets to capture ZFS and apps setups
  • Set up shares and enable access by user group
  • Created a signed SSL certificate – see this procedure
  • Extended GUI session timeout to 30 minutes (System Advanced Settings)

These commands are useful for working with snapshots –

# Make snapshots visible
# zfs set snapdir=visible|hidden <dataset>

# display snapshot list
# zfs list -t snapshot

App Catalog

TrueNAS comes with the iXsystems apps catalog preconfigured. We added the TrueCharts catalog using the minimal variant of the procedures outlined here. Note that initially, it will take a LONG time (like overnight) to set up the TrueCharts catalog.

Good information on configuring Apps on TrueNAS SCALE can be found here.

Backups

The TrueNAS VM is included in our cluster’s daily Proxmox Backup Server backup. This backs up the TrueNAS boot disk.

The large ZFS datastore would take a long time to back up at a block level, so we set up an rsync job to copy it to one of our Synology NAS drives.

We used the procedure in the following video to set up TrueNAS backups using rsync.

The approach uses an SSH connection between TrueNAS and one of our Synology NAS drives to transfer and update a copy of the files in our main dataset. Here’s an example of the setup of the rsync task on the TrueNAS side.

Example TrueNAS Backup Task (rsync)

The rsync jobs run every 15 minutes and only copy the files that are changed on the TrueNAS side. The target Synology drive takes snapshots, does replication, and runs off-site backups to protect the data in the TrueNAS dataset.

Here’s another procedure for setting this up that looks pretty good. I have not tried this one.

Data Protection and Integrity

We are using the ZFS file system, pool scrubbing, S.M.A.R.T tests, snapshots, and rsync replication to protect our data. Here’s an overview of our final data integrity setup –

TrueNAS Data Integrity Configuration

File Browser App

We installed the File Browser App from the TrueNAS catalog. This app provides a web GUI that enables file and directory manipulation on our NAS. The following videos will help to get File Browser set up –

The key to getting this working without permission errors is to set the ACLs on each of the datasets that are exposed in shares and the File Browser (if used) as follows:

Dataset ACLs for Shares and the File Browser

Proxmox Backup Server

This page covers the installation of the Proxmox Backup Server in our HomeLab. Our approach will be to run the Proxmox Backup Server (PBS) in a VM on our server and use shared storage on one of our NAS drives to store backups.

Proxmox Backup Server Installation

We used the following procedure to install PBS on our server.

The PBS VM was created using the recommended VM settings in the video. The VM is created with the following resources:

  • 4 CPUs
  • 4096 MB Memory
  • 32 GB SSD Storage (Shared PVE-storage)
  • HS Services Network

Once the VM is created, the next step is to run the PBS installer.

After the PBS install is complete, PBS is booted, the QEMU Guest Agent is installed, and the VM is updated using the following commands –

# apt update
# apt upgrade
# apt-get install qemu-guest-agent
# reboot

PBS can now be accessed via the web interface using the following URL –

https://<PBS VM IP Address>:8007

Create a Backup Datastore on a NAS Drive

The steps are as follows –

  • Install CIFS utils
# Install the CIFS share package on the PBS VM
apt install cifs-utils
  • Create a mount point for the NAS PBS store
mkdir /mnt/pbs-store
  • Create a Samba credentials file to enable logging into NAS share
vi /etc/samba/.smbcreds
...
username=<NAS Share User Name>
password=<NAS Share Password>
...
chmod 400 /etc/samba/.smbcreds
  • Test mount the NAS share in PBS and make a directory to contain the PBS backups
mount -t cifs \
    -o rw,vers=3.0,credentials=/etc/samba/.smbcreds,uid=backup,gid=backup \
    //<nas-#>.anita-fred.net/PBS-backups \
    /mnt/pbs-store
mkdir /mnt/pbs-store/pbs-backups
  • Make the NAS share mount permanent by adding it to /etc/fstab
vi /etc/fstab
...after the last line add the following line
# Mount PBS backup store from NAS
//nas-#.anita-fred.net/PBS-backups /mnt/pbs-store cifs vers=3.0,credentials=/etc/samba/.smbcreds,uid=backup,gid=backup,defaults 0 0
  • Create a datastore to hold the PBS backups in the Proxmox Backup Server as follows. The datastore will take some time to create (be patient).
PBS Datastore Configuration
PBS Datastore Prune Options
  • Add the PBS store as storage at the Proxmox datacenter level. Use the information from the PBS dashboard to set the fingerprint.
PBS Storage in Proxmox VE
  • The PBS-backups store can now be used as a target in Proxmox backups. NOTE THAT YOU CANNOT BACK UP THE PBS VM TO PBS-BACKUPS.

Setup Boot Delay

The NAS share for the Proxmox Backup store needs time to mount before the Backup server starts on boot. This can be set for each node under System/Options/Start on Boot delay. A 30-second delay seems to work well.

Setup Backup, Pruning, and Garbage Collection

The overall schedule for Proxmox backup operations is as follows:

  • 03:00 – Run Pruning on the PBS-backups store
  • 03:30 – Run PBS Backups on all VMs and LXCs EXCEPT for the PBS Backup Server VM
  • 04:00 – Run a standard PVE Backup on the PBS Backup Server VM (run in suspend mode; stop mode causes problems)
  • 04:30 – Run Garbage Collection on the PBS-backups store
  • 05:00 – Verify all backups in the PBS-backups store

Local NTP Servers

We want Proxmox and Proxmox Backup Server to use our local NTP servers for time synchronization. To do this, modify /etc/chrony/chrony.conf to use our servers for the pool. This must be done on each server individually and inside the Proxmox Backup Server VM. See the following page for details.

Backup Temp Directory

Proxmox backups use vzdump to create compressed backups. By default, backups use /var/tmp, which lives on the boot drive of each node in a Proxmox Cluster. To ensure adequate space for vzdump and reduce the load on each server’s boot drive, we have configured a temp directory on the local ZFS file systems on each of our Proxmox servers. The tmp directory configuration needs to be done on each node in the cluster (details here). The steps to set this up are as follows:

# Create a tmp directory on local node ZFS stores
# (do this once for each server in the cluster)
cd /zfsa
mkdir tmp

# Turn on and verify ACL for ZFSA store
zfs get acltype zfsa
zfs set acltype=posixacl zfsa
zfs get acltype zfsa

# Configure vzdump to use the ZFS tmp dir
# (do on each server): edit /etc/vzdump.conf
# and add/set the tmpdir line as follows
vi /etc/vzdump.conf
...
tmpdir: /zfsa/tmp
...

Proxmox VE

This page covers the Proxmox install and setup on our server. You can find a great deal of information about Proxmox in the Proxmox VE Administrator’s Guide.

Proxmox Installation/ZFS Storage

Proxmox was installed on our server using the steps in the following video:

The Proxmox boot images are installed on NVMe drives (ZFS RAID1 on our Dell Server BOSS Card, or ZFS single on the NVMe drives on our Supermicro Servers). This video also covers the creation of a ZFS storage pool and filesystem. A single filesystem called zfsa was set up using RAID10 and lz4 compression using four SSD disks on each server.

A Community Proxmox VE License was purchased and installed for each node. The Proxmox installation was updated on each server using the Enterprise Repository.

Linux Configuration

I like to install a few additional tools to help me manage our Proxmox installations. They include the nslookup and ifconfig commands and the tmux terminal multiplexer. The commands to install these tools are found here.

Cluster Creation

With these steps done, we can create a 3-node cluster. See our Cluster page for details.

ZFS Snapshots

Creating ZFS snapshots of the Proxmox installation can be useful before making changes. This enables rollback to a previous version of the filesystem should any changes need to be undone. Here are some useful commands for this purpose:

zfs list -t snapshot
zfs list
zfs snapshot rpool/ROOT/<node-name>@<snap-name>
zfs rollback rpool/ROOT/<node-name>@<snap-name>
zfs destroy rpool/ROOT/<node-name>@<snap-name>

Be careful to select the proper dataset – snapshots on the pool that contains the dataset don’t support this use case. Also, you can only roll back to the latest snapshot directly. If you want to roll back to an earlier snapshot, you must first destroy all of the later snapshots.

In the case of a Proxmox cluster node, the shared files in the associated cluster filesystem will not be included in the snapshot. You can learn more about the Proxmox cluster file system and its shared files here.

You can view all of the snapshots inside the invisible /.zfs directory on the host filesystem as follows:

# cd /.zfs/snapshot/<name>
# ls -la

Local NTP Servers

We want Proxmox and Proxmox Backup Server to use our local NTP servers for time synchronization. To do this, we need to modify /etc/chrony/chrony.conf to use our servers for the pool. This needs to be done on each server individually and inside the Proxmox Backup Server VM. See the following page for details.

The first step before following the configuration procedures above is to install chrony on each node –

apt install chrony
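
As an illustration of the chrony.conf change, the helper below prints one server line per local NTP host. The host names are hypothetical placeholders, and this is a sketch rather than our exact configuration.

```shell
#!/bin/sh
# Print a chrony "server" line for each local NTP host given as an argument.
chrony_server_lines() {
    for host in "$@"; do
        echo "server $host iburst"
    done
}

chrony_server_lines ntp1.example.lan ntp2.example.lan
```

The printed lines would replace the default pool entries in /etc/chrony/chrony.conf. After restarting the service with systemctl restart chrony, the chronyc sources command shows which servers are actually in use.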

Mail Forwarding

We used the following procedure to configure postfix to support forwarding e-mail through smtp2go. Postfix does not seem to work with passwords containing a $ sign. A separate login was set up in smtp2go for forwarding purposes.

Some key steps in the process include:

# Install postfix and the supporting modules
# for smtp2go forwarding
sudo apt-get install postfix
sudo apt-get install libsasl2-modules

# Install mailx
sudo apt -y install bsd-mailx
sudo apt -y install mailutils

# Run this command to configure postfix
# per the procedure above
sudo dpkg-reconfigure postfix

# Use a working prototype of main.cf to edit
sudo vi /etc/postfix/main.cf

# Setup /etc/mailname -
#   use version from working server
#   MAKE SURE mailname is lower case/matches DNS
uname -n | sudo tee /etc/mailname

# Restart postfix
sudo systemctl reload postfix
sudo service postfix restart

# Reboot may be needed
sudo reboot

# Test
echo "Test" | mailx -s "PVE email" <email addr>
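
For orientation, the relay-related part of main.cf typically looks like the output of the sketch below. The parameter names are standard Postfix settings, but the smtp2go host name and port shown are assumptions, so use the values from your smtp2go account and the procedure referenced above.

```shell
#!/bin/sh
# Print the main.cf settings for relaying through an authenticated smarthost.
relay_settings() {
    relay_host="$1"
    relay_port="$2"
    cat <<EOF
relayhost = [$relay_host]:$relay_port
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt
EOF
}

# Assumed smtp2go relay host and port; check your account settings.
relay_settings mail.smtp2go.com 2525
```

The login itself goes in /etc/postfix/sasl_passwd as a single line pairing the same bracketed host and port with the user name and password. That file is then compiled with postmap /etc/postfix/sasl_passwd and should be readable only by root.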

vGPU

Our servers each include an Nvidia Tesla P4 GPU. This GPU is sharable using Nvidia’s vGPU. The information on how to set up Proxmox for vGPU may be found here. This procedure also explains how to enable IOMMU for GPU pass-through (not sharing). We do not have IOMMU set up on our servers at this time.

You’ll need to install the git command and the cc compiler to use this procedure. This can be done with the following commands –

# apt update
# apt install git
# apt install build-essential

Now you can follow the procedure here. Be sure to include the steps to enable IOMMU. I downloaded and installed the 6.4 vGPU driver from the Nvidia site and did a final reboot of the server.

vGPU Types

The vGPU drivers support a number of vGPU types. You’ll want to select the appropriate one in each VM. Note that multiple sizes of vGPUs are not allowed (i.e., if one vGPU uses 2 GB of memory, all must). The following table shows the types available (this data can be obtained by running mdevctl types on your system).

Q Profiles - Not Good for OpenGL/Games
vGPU Type     Name            Memory   Instances
nvidia-63     GRID P4-1Q      1 GB     8
nvidia-64     GRID P4-2Q      2 GB     4
nvidia-65     GRID P4-4Q      4 GB     2
nvidia-66     GRID P4-8Q      8 GB     1

A Profiles - Windows VMs
vGPU Type     Name            Memory   Instances
nvidia-67     GRID P4-1A      1 GB     8
nvidia-68     GRID P4-2A      2 GB     4
nvidia-69     GRID P4-4A      4 GB     2
nvidia-70     GRID P4-8A      8 GB     1

B Profiles - Linux VMs
vGPU Type     Name            Memory   Instances
nvidia-17     GRID P4-1B      1 GB     8
nvidia-243    GRID P4-1B4     1 GB     8
nvidia-157    GRID P4-2B      2 GB     4
nvidia-243    GRID P4-2B4     2 GB     4

Problems with Out Of Date Keys on Server Nodes

I have occasionally seen problems with the SSH keys getting out of date on our servers. The fix for this is to run the following commands on all of the servers. A reboot is also sometimes necessary.

# Update certs and restart the PVE daemon and proxy
pvecm updatecerts -F && systemctl restart pvedaemon pveproxy

# Reboot if needed
reboot

Welcome To Our Home Lab

Home Network Dashboard

This site is dedicated to documenting the setup, features, and operation of our Home Lab. Our Home Lab consists of several different components and systems, including:

  • A high-performance home network
  • A storage system that utilizes multiple NAS devices
  • An enterprise-grade server
  • Applications, services, and websites

Home Network

Gen 2 Home Network Core Rack

Our Home Network is a two-tiered structure with a core based upon high-speed 25 GbE capable aggregation switches and optically connected edge switches. We use UniFi equipment throughout. We have installed multiple OM4 multi-mode fiber links from the core to each room in our house. The speed of these links ranges from 1 Gbps to 25 Gbps, with most connections running as dual-fiber LACP LAG links.

Telephone System

To be added

Surveillance System

To be added

Storage System

To be added

Enterprise Server

To be added

Backups

Daily backups for all VMs and LXC containers are configured as follows.

Applications, Services, and Websites

We are hosting several websites, including:

Set-up information for our self-hosted sites may be found here.