Recently, we have received questions about why our VSAN Ready Nodes don’t have local drives dedicated to running ESXi. This post provides some guidance on how to successfully deploy an ESXi host that boots from an SD card (such as Dell’s IDSDM solution) or a USB drive.

Booting ESXi from SD/USB is fully supported as long as you take into account some requirements and recommendations. Dell takes this one step further with the Integrated Dual Secure Digital Module (IDSDM). This module supports up to two 16GB SD cards and can mirror them in a RAID 1 configuration. This means you get all the benefits of running ESXi from an SD card, with hardware-enabled redundancy as well.

This post was originally going to focus only on our Ready Node architecture, but I felt it prudent to discuss this particular topic on a more general scale. Items relevant to VSAN are discussed, but if VSAN is not enabled on your host, you can ignore them.

Requirements to install ESXi on SD/USB

  1. A supported SD/USB drive is required. Each vendor has specific devices that they support. Dell SD cards are certified to the highest standards to ensure reliable operation. The use of non-Dell SD cards is not recommended because Dell support has seen reliability and performance issues in the field with non-Dell SD cards. Dell’s Integrated Dual SD Module automatically provides some boot redundancy: it holds two SD cards of up to 16GB each and mirrors them in RAID 1.
  2. Total system memory must be 512GB or less. For systems with more than 512GB of RAM, installation to a local HDD/SSD is required.
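
As a rough illustration, the two requirements above can be expressed as a small pre-installation check. This is only a sketch: the function name, the message wording, and the idea of passing in the card sizes by hand are assumptions made for the example, not part of any Dell or VMware tool. The limits themselves (512GB of RAM, two cards, 16GB per card) are the ones stated in this post.

```python
MAX_RAM_GB_FOR_SD_BOOT = 512  # requirement 2 in this post
MAX_IDSDM_CARDS = 2           # the IDSDM holds up to two SD cards
MAX_IDSDM_CARD_GB = 16        # largest card size the IDSDM supports


def sd_boot_problems(total_ram_gb, card_sizes_gb):
    """Return a list of reasons this host should not boot ESXi from the IDSDM.

    Hypothetical helper for illustration; an empty list means both
    requirements from this post are met.
    """
    problems = []
    if total_ram_gb > MAX_RAM_GB_FOR_SD_BOOT:
        problems.append("more than 512GB of RAM: install to a local HDD/SSD")
    if not card_sizes_gb:
        problems.append("no SD card present")
    if len(card_sizes_gb) > MAX_IDSDM_CARDS:
        problems.append("the IDSDM holds at most two SD cards")
    if any(size > MAX_IDSDM_CARD_GB for size in card_sizes_gb):
        problems.append("the IDSDM supports cards of up to 16GB")
    return problems


if __name__ == "__main__":
    # A host with 256GB of RAM and two mirrored 16GB cards passes.
    print(sd_boot_problems(256, [16, 16]))
    # A host with 768GB of RAM needs a local drive instead.
    print(sd_boot_problems(768, [16, 16]))
```

The check does not cover whether a given card is on the vendor’s supported list; that still has to be confirmed with the vendor.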

ESXi Scratch Partition

The ESXi scratch partition is used by ESXi to store syslog files, core dump files, VSAN trace files, and other files. The most important ones to manage when booting from SD/USB are:

  • Syslog
  • Core dump (PSOD)
  • VSAN trace files

What are these three items? And why do I care?

  • Syslog files
    • This is a group of files that store critical log data from various processes running on the host.
    • Examples include: vmkernel.log, hostd.log, vmkwarning.log, etc.
    • These files are written to in real time by the host.
  • Core Dump Files
    • A core dump is the diagnostic data a host captures when it hits a Purple Screen of Death, otherwise known as a PSOD.
    • The core dump contains helpful troubleshooting data to determine the cause of the PSOD and how to fix it.
    • Although core dumps are not written often (hopefully not ever!), we need a persistent place to store them so they can be retrieved.
  • VSAN Trace Files
    • VSAN trace files contain actions taken by VSAN on the data that is being written to and read from VSAN data stores.
    • This data is needed by VMware support should any issue with the VSAN data store occur.
    • These files are written to in real time, like the syslog files.
    • These are completely separate from the syslog files.

So why do we have to “manage” these files? Doesn’t ESXi just store them on the SD/USB drive?

Not by default. When ESXi is installed to an SD/USB device, the scratch partition is not created on the drive itself, but in a RAMDisk. A RAMDisk is a block storage device dynamically created within the system RAM. This device is mounted into the root file system for the ESXi installation to use.

  • This location is very high performance (it’s in RAM).
  • Unfortunately, the RAMDisk is not persistent across reboots, so all the data written there is lost forever if the host crashes or is even rebooted manually.
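
To make the distinction concrete, here is a minimal sketch of how you might classify a host’s scratch location once you have it as a string, for example from the ScratchConfig.CurrentScratchLocation advanced setting. It assumes the RAMDisk-backed scratch shows up under /tmp (typically /tmp/scratch); the function name and the datastore path in the example are hypothetical.

```python
import posixpath

# Assumption: when no persistent scratch location is configured, the
# scratch location points into the RAMDisk under /tmp (e.g. /tmp/scratch).
RAMDISK_PREFIXES = ("/tmp",)


def scratch_is_persistent(scratch_location):
    """Return True if the given scratch path does not live in the RAMDisk."""
    path = posixpath.normpath(scratch_location)
    for prefix in RAMDISK_PREFIXES:
        if path == prefix or path.startswith(prefix + "/"):
            return False
    return True


if __name__ == "__main__":
    # RAMDisk-backed scratch: contents are lost on reboot.
    print(scratch_is_persistent("/tmp/scratch"))
    # Hypothetical datastore-backed scratch directory: survives a reboot.
    print(scratch_is_persistent("/vmfs/volumes/datastore1/.locker-esx01"))
```

A host that reports a RAMDisk location here is the case this post is about: its logs and dumps will not survive a reboot unless they are redirected somewhere persistent.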

There is one exception to this rule. In failure scenarios other than complete system failure, the VSAN trace files are written to the locker partition on the SD/USB drive. These trace files are written in order from newest to oldest until the locker partition is full. This won’t necessarily capture all the VSAN trace files, because together they can be much larger than the locker partition.
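
The newest-to-oldest behavior can be sketched as follows. This is a simplified model rather than VSAN’s actual implementation: the TraceFile type, the file names, and the sizes are made up for the example, and stopping at the first file that does not fit is an assumption about what “until the locker partition is full” means.

```python
from typing import NamedTuple


class TraceFile(NamedTuple):
    name: str        # hypothetical file name
    mtime: float     # last-modified time; larger means newer
    size_bytes: int


def select_traces_for_locker(files, free_bytes):
    """Pick trace files newest first until the locker has no room left."""
    saved = []
    remaining = free_bytes
    for f in sorted(files, key=lambda t: t.mtime, reverse=True):
        if f.size_bytes > remaining:
            break  # locker is full; this file and all older ones are lost
        saved.append(f)
        remaining -= f.size_bytes
    return saved


if __name__ == "__main__":
    traces = [
        TraceFile("trace-oldest", mtime=100.0, size_bytes=40),
        TraceFile("trace-middle", mtime=200.0, size_bytes=50),
        TraceFile("trace-newest", mtime=300.0, size_bytes=30),
    ]
    kept = select_traces_for_locker(traces, free_bytes=100)
    print([t.name for t in kept])  # ['trace-newest', 'trace-middle']
```

In this example the oldest trace is dropped because only 20 bytes remain after the two newer files are saved, which is the situation the paragraph above warns about.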

Part 2 of this post will discuss some different methods and software for managing the files we have discussed. Look for it coming soon. Once it is posted I will link it HERE.