This page covers the Proxmox installation and setup on our servers. You can find a great deal of information about Proxmox in the Proxmox VE Administrator’s Guide.
Proxmox Installation/ZFS Storage
Proxmox was installed on our server using the steps in the following video:
The Proxmox boot images are installed on NVMe drives (ZFS RAID1 on the BOSS card in our Dell server, or a single-disk ZFS setup on the NVMe drives in our Supermicro servers). This video also covers the creation of a ZFS storage pool and filesystem. A single filesystem called zfsa was set up on each server using RAID10 and lz4 compression across four SSD disks.
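For reference, a minimal CLI sketch of building an equivalent pool (the device names are placeholders; the video walks through the steps we actually used):

# Create a RAID10 (striped mirrors) pool named zfsa from four SSDs
zpool create zfsa mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

# Enable lz4 compression
zfs set compression=lz4 zfsa

# Register the pool with Proxmox as ZFS storage
pvesm add zfspool zfsa --pool zfsa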
A Proxmox VE Community subscription was purchased and installed for each node. The Proxmox installation was updated on each server using the Enterprise Repository.
Linux Configuration
I like to install a few additional tools to help me manage our Proxmox installations. They include the nslookup and ifconfig commands and the tmux terminal multiplexer. The commands to install these tools are found here.
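As a quick sketch (package names assume the standard Debian repositories that Proxmox uses), the installs look something like:

# nslookup comes from dnsutils, ifconfig from net-tools
apt update
apt install dnsutils net-tools tmux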
Cluster Creation
With these steps done, we can create a 3-node cluster. See our Cluster page for details.
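For context, the cluster creation itself boils down to a few pvecm commands (the names and addresses below are placeholders; the Cluster page has the full procedure):

# On the first node, create the cluster
pvecm create <cluster-name>

# On each additional node, join using the first node's IP address
pvecm add <ip-of-first-node>

# Verify quorum and membership
pvecm status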
ZFS Snapshots
Creating ZFS snapshots of the Proxmox installation can be useful before making changes. This enables rollback to a previous version of the filesystem should any changes need to be undone. Here are some useful commands for this purpose:
# List existing snapshots
zfs list -t snapshot

# List datasets (to find the root dataset for this node)
zfs list

# Create a snapshot of the Proxmox root dataset
zfs snapshot rpool/ROOT/<node-name>@<snap-name>

# Roll back to a snapshot
zfs rollback rpool/ROOT/<node-name>@<snap-name>

# Delete a snapshot
zfs destroy rpool/ROOT/<node-name>@<snap-name>
Be careful to select the proper dataset – snapshots taken on the pool that contains the dataset don’t support this use case. Also, you can only roll back directly to the latest snapshot. If you want to roll back to an earlier snapshot, you must first destroy all of the later snapshots.
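As a concrete example (the node and snapshot names here are hypothetical), a typical flow around a risky change looks like:

# Snapshot the root dataset before making changes
zfs snapshot rpool/ROOT/pve-1@pre-upgrade

# If the changes need to be undone, roll back
zfs rollback rpool/ROOT/pve-1@pre-upgrade

# Once the changes are confirmed good, remove the snapshot
zfs destroy rpool/ROOT/pve-1@pre-upgrade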
In the case of a Proxmox cluster node, the shared files in the associated cluster filesystem will not be included in the snapshot. You can learn more about the Proxmox cluster file system and its shared files here.
You can view all of the snapshots inside the hidden /.zfs directory on the host filesystem as follows:
# cd /.zfs/snapshot/<name>
# ls -la
Local NTP Servers
We want Proxmox and Proxmox Backup Server to use our local NTP servers for time synchronization. To do this, we need to modify /etc/chrony/chrony.conf to use our servers for the pool. This needs to be done on each server individually and inside the Proxmox Backup Server VM. See the following page for details.
The first step before following the configuration procedures above is to install chrony on each node –
apt install chrony
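As a sketch of the change itself (the hostnames below are placeholders for our local NTP servers), the edit amounts to replacing the default pool line in /etc/chrony/chrony.conf:

#pool 2.debian.pool.ntp.org iburst
server ntp1.example.local iburst
server ntp2.example.local iburst

Then restart chrony and verify the sources:

systemctl restart chrony
chronyc sources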
Mail Forwarding
We used the following procedure to configure postfix to support forwarding e-mail through smtp2go. Postfix does not seem to work with passwords containing a $ sign. A separate login was set up in smtp2go for forwarding purposes.
Some key steps in the process include:
# Install postfix and the supporting modules
# for smtp2go forwarding
sudo apt-get install postfix
sudo apt-get install libsasl2-modules

# Install mailx
sudo apt -y install bsd-mailx
sudo apt -y install mailutils

# Run this command to configure postfix
# per the procedure above
sudo dpkg-reconfigure postfix

# Use a working prototype of main.cf to edit
sudo vi /etc/postfix/main.cf

# Setup /etc/mailname -
# use version from working server
# MAKE SURE mailname is lower case/matches DNS
sudo uname -n > /etc/mailname

# Restart postfix
sudo systemctl reload postfix
sudo service postfix restart

# Reboot may be needed
sudo reboot

# Test
echo "Test" | mailx -s "PVE email" <email addr>
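For reference, a hedged sketch of the relay settings that end up in /etc/postfix/main.cf for smtp2go forwarding (the hostname, port, and the separate forwarding login are examples; match them to the working prototype mentioned above):

relayhost = [mail.smtp2go.com]:2525
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt

The credentials live in /etc/postfix/sasl_passwd as a single line of the form [mail.smtp2go.com]:2525 <username>:<password> (remember, no $ in the password), and are activated with:

sudo postmap /etc/postfix/sasl_passwd
sudo chmod 600 /etc/postfix/sasl_passwd /etc/postfix/sasl_passwd.db
sudo systemctl restart postfix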
vGPU
Our servers each include an Nvidia Tesla P4 GPU. This GPU is shareable using Nvidia’s vGPU. The information on how to set up Proxmox for vGPU may be found here. This procedure also explains how to enable IOMMU for GPU pass-through (not sharing). We do not have IOMMU set up on our servers at this time.
You’ll need to install the git command and the cc compiler to use this procedure. This can be done with the following commands –
# apt update
# apt install git
# apt install build-essential
Now you can follow the procedure here. Be sure to include the steps to enable IOMMU. I downloaded and installed the 6.4 vGPU driver from the Nvidia site and did a final reboot of the server.
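As a hedged sketch of the IOMMU step on an Intel system booted with GRUB (the linked procedure is authoritative; ZFS-boot/systemd-boot installs edit /etc/kernel/cmdline and run proxmox-boot-tool refresh instead):

# Add the flags to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
update-grub
reboot

# After rebooting, confirm IOMMU is active
dmesg | grep -e DMAR -e IOMMU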
vGPU Types
The vGPU drivers support a number of vGPU types. You’ll want to select the appropriate one in each VM. Note that mixing vGPU sizes on a single GPU is not allowed (i.e., if one vGPU uses 2 GB of memory, they all must). The following tables show the types available. (This data can be obtained by running mdevctl types on your system.)
Q Profiles - Not Good for OpenGL/Games

| vGPU Type | Name | Memory | Instances |
|-----------|------|--------|-----------|
| nvidia-63 | GRID P4-1Q | 1 GB | 8 |
| nvidia-64 | GRID P4-2Q | 2 GB | 4 |
| nvidia-65 | GRID P4-4Q | 4 GB | 2 |
| nvidia-66 | GRID P4-8Q | 8 GB | 1 |

A Profiles - Windows VMs

| vGPU Type | Name | Memory | Instances |
|-----------|------|--------|-----------|
| nvidia-67 | GRID P4-1A | 1 GB | 8 |
| nvidia-68 | GRID P4-2A | 2 GB | 4 |
| nvidia-69 | GRID P4-4A | 4 GB | 2 |
| nvidia-70 | GRID P4-8A | 8 GB | 1 |

B Profiles - Linux VMs

| vGPU Type | Name | Memory | Instances |
|-----------|------|--------|-----------|
| nvidia-17 | GRID P4-1B | 1 GB | 8 |
| nvidia-243 | GRID P4-1B4 | 1 GB | 8 |
| nvidia-157 | GRID P4-2B | 2 GB | 4 |
| nvidia-243 | GRID P4-2B4 | 2 GB | 4 |
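To assign one of these profiles to a VM from the shell, something like the following can be used (the VM ID and PCI address are placeholders; the profile can also be selected in the web UI when adding a PCI device to the VM):

# List the vGPU types the host's GPU advertises
mdevctl types

# Attach a 2 GB Q profile (nvidia-64) to VM 100
qm set 100 --hostpci0 01:00.0,mdev=nvidia-64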
Problems with Out Of Date Keys on Server Nodes
I have occasionally seen problems with the SSH keys getting out of date on our servers. The fix for this is to run the following commands on all of the servers. A reboot is also sometimes necessary.
# Update certs and reload PVE proxy
pvecm updatecerts -F && systemctl restart pvedaemon pveproxy

# Reboot if needed
reboot