Introduction
VMware vSAN continues to be the Hyperconverged Infrastructure (HCI) market leader. Customers today run traditional applications like Microsoft SQL Server and SAP HANA; next-generation applications like Cassandra, Splunk, and MongoDB; and container-based services orchestrated through Kubernetes on vSAN. The success of vSAN is attributed to many factors, such as performance, flexibility, ease of use, robustness, and pace of innovation.
Traditional infrastructure deployment, operations, and maintenance rely on various disaggregated tools and often on specialized skill sets. The hyperconverged approach of vSphere and vSAN simplifies these tasks by using familiar tools to deploy, operate, and manage private-cloud infrastructure. VMware vSAN provides best-in-class enterprise storage and is the cornerstone of VMware Cloud Foundation, accelerating customers’ multi-cloud journey.
VMware HCI, powered by vSAN, is the cornerstone for modern data centers, whether they are on-premises or in the cloud. vSAN runs on standard x86 servers from more than 18 OEMs. Deployment options include over 500 vSAN ReadyNode choices, integrated systems such as Dell EMC VxRail systems, and build-your-own using validated hardware on the VMware Compatibility List. It fits large and small deployments, with options ranging from a 2-node cluster for small implementations to multiple clusters, each with as many as 64 nodes, all centrally managed by vCenter Server.
Whether you are deploying traditional or container-based applications, vSAN delivers developer-ready infrastructure, scales without compromise, and simplifies operations and management tasks as the best HCI solution today and tomorrow.
Architecture
vSAN is VMware’s software-defined storage solution, built from the ground up for vSphere virtual machines. It abstracts and aggregates locally attached disks in a vSphere cluster to create a storage solution that can be provisioned and managed from vCenter and vSphere Client. vSAN is embedded within the hypervisor, hence storage and compute for VMs are delivered from the same x86 server platform running the hypervisor.
vSAN-backed HCI provides a wide array of deployment options, from a 2-node setup to a standard cluster of up to 64 hosts. A cluster can also use a stretched cluster topology to serve as an active disaster recovery solution. vSAN includes a capability called HCI Mesh that allows customers to remotely mount a vSAN datastore to other vSAN clusters. This disaggregates storage and compute, so each can be scaled independently.
vSAN integrates with the entire VMware stack, including features such as vMotion, HA, and DRS. VM storage provisioning and day-to-day management of storage SLAs can all be controlled through VM-level policies that can be set and modified. vSAN delivers enterprise-class features, scale, and performance, making it the ideal storage platform for VMs.
Servers with Local Storage
Each host contains flash drives (all-flash configuration) or a combination of magnetic disks and flash drives (hybrid configuration) that contribute cache and capacity to the vSAN distributed datastore.
Each host has one to five disk groups. Each disk group contains one cache device and one to seven capacity devices.
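The disk group limits above can be expressed as a small validation routine. The following is a minimal sketch, not part of any VMware tooling: the `DiskGroup` class and the function names are hypothetical, and only the numeric limits come from the text.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the text: 1 to 5 disk groups per host,
# 1 cache device and 1 to 7 capacity devices per disk group.
MAX_DISK_GROUPS_PER_HOST = 5
MAX_CAPACITY_DEVICES_PER_GROUP = 7


@dataclass
class DiskGroup:
    """Hypothetical description of one disk group (device counts only)."""
    cache_devices: int
    capacity_devices: int


def validate_host_layout(disk_groups: List[DiskGroup]) -> List[str]:
    """Return a list of problems; an empty list means the layout fits the limits."""
    problems = []
    if not 1 <= len(disk_groups) <= MAX_DISK_GROUPS_PER_HOST:
        problems.append(
            f"expected 1 to {MAX_DISK_GROUPS_PER_HOST} disk groups, "
            f"found {len(disk_groups)}"
        )
    for index, group in enumerate(disk_groups):
        if group.cache_devices != 1:
            problems.append(
                f"disk group {index}: expected exactly 1 cache device, "
                f"found {group.cache_devices}"
            )
        if not 1 <= group.capacity_devices <= MAX_CAPACITY_DEVICES_PER_GROUP:
            problems.append(
                f"disk group {index}: expected 1 to "
                f"{MAX_CAPACITY_DEVICES_PER_GROUP} capacity devices, "
                f"found {group.capacity_devices}"
            )
    return problems


def max_capacity_devices_per_host() -> int:
    """Upper bound implied by the limits: 5 groups x 7 capacity devices."""
    return MAX_DISK_GROUPS_PER_HOST * MAX_CAPACITY_DEVICES_PER_GROUP
```

Under these limits, a storage-contributing host holds at most 5 × 7 = 35 capacity devices plus 5 cache devices.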
Pre-requisites
- vCenter Server must be installed. vCenter Server is used to manage vSAN.
- A minimum of three ESXi hosts is required for a standard cluster. A vSAN cluster supports up to 64 hosts.
- ESXi hosts must run version 5.5 or higher.
- A dedicated vSAN network is required. A 1 Gbps network can be used, but a 10 Gbps network is recommended, with two NICs for fault tolerance.
- All ESXi hosts with local storage must have at least one SSD and one hard disk.
- The SSDs must make up at least 10 percent of the total amount of storage.
- vCenter should have at least one cluster.
Notes:
- Not every host in a vSAN cluster needs local storage to take advantage of vSAN storage resources.
- Hosts without local storage contribute only compute resources.
- After you enable vSAN on a cluster, a single vSAN datastore is created. This datastore uses storage from every ESXi host in the vSAN cluster and contains all VM files.
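Several of the prerequisites above are simple enough to check programmatically before enabling vSAN. The sketch below is illustrative only: the `Host` class and `check_prerequisites` function are hypothetical, and it assumes the 10 percent SSD rule is evaluated across the whole cluster, which the text does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """Hypothetical inventory record: disk sizes in GB per device type."""
    name: str
    ssd_sizes_gb: List[int] = field(default_factory=list)
    hdd_sizes_gb: List[int] = field(default_factory=list)

    @property
    def has_local_storage(self) -> bool:
        return bool(self.ssd_sizes_gb or self.hdd_sizes_gb)


def check_prerequisites(hosts: List[Host]) -> List[str]:
    """Return a list of unmet prerequisites; an empty list means all checks pass."""
    problems = []
    if len(hosts) < 3:
        problems.append("at least three ESXi hosts are required")

    # Hosts without local storage are allowed; they only contribute compute.
    storage_hosts = [h for h in hosts if h.has_local_storage]
    for host in storage_hosts:
        if not host.ssd_sizes_gb or not host.hdd_sizes_gb:
            problems.append(f"{host.name}: needs at least one SSD and one hard disk")

    # Assumption: the 10 percent rule is applied to cluster-wide totals.
    ssd_total = sum(sum(h.ssd_sizes_gb) for h in storage_hosts)
    all_total = ssd_total + sum(sum(h.hdd_sizes_gb) for h in storage_hosts)
    if all_total and ssd_total * 10 < all_total:
        problems.append("SSDs make up less than 10 percent of total storage")
    return problems
```

For example, two hosts with a 200 GB SSD and a 1000 GB hard disk each, plus one diskless host, pass every check: SSDs account for 400 of 2400 GB.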
Configure vSAN
Overview of the steps required to configure Virtual SAN (vSAN) in your vSphere environment:
- Create a dedicated VMkernel network for vSAN. The network has to be accessible by all ESXi hosts. A 1 Gbps network can be used; however, a 10 Gbps network is recommended, with two NICs for fault tolerance.
- Create a vSAN cluster. When you create a cluster using vSphere Web Client, the vSAN option is available.
- The vSAN cluster can be configured in two modes:
- Automatic mode – vSAN claims all empty local disks to create the vSAN datastore.
- Manual mode – manually select disks to add to the vSAN datastore.
- If you configure the vSAN cluster in Automatic mode, all ESXi hosts are scanned for empty disks that are then configured for vSAN.
- If you configure the vSAN cluster in Manual mode, you need to create disk groups for vSAN.
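The difference between the two modes comes down to which disks end up claimed. The following toy model illustrates that selection logic only; it does not call any vSphere API, and the `claim_disks` function and its dictionary layout are assumptions made for this example.

```python
from typing import Dict, List, Optional


def claim_disks(
    disks_by_host: Dict[str, List[dict]],
    mode: str,
    selected: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, List[str]]:
    """Return the disk names claimed per host for the given mode.

    disks_by_host maps a host name to disk records such as
    {"name": "disk0", "empty": True} (hypothetical shape).
    """
    if mode == "automatic":
        # Automatic mode: every host is scanned and all empty disks are claimed.
        return {
            host: [d["name"] for d in disks if d["empty"]]
            for host, disks in disks_by_host.items()
        }
    if mode == "manual":
        # Manual mode: only disks the administrator picked are claimed.
        selected = selected or {}
        return {
            host: [d["name"] for d in disks if d["name"] in selected.get(host, [])]
            for host, disks in disks_by_host.items()
        }
    raise ValueError(f"unknown mode: {mode!r}")
```

In Manual mode the claimed disks would then still need to be arranged into disk groups, which this sketch leaves out.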
Features of VMware vSAN
The features of VMware vSAN depend greatly on the kind of license, but include the following:
- Supports Storage Policy-Based Management (SPBM) for automated management of storage profiles.
- Supports software-defined data-at-rest encryption, preventing unauthorized access to data at rest.
- A cluster can include 2 to 64 nodes.
- Offers stretched clusters, in which the hosts of a single cluster are spread across sites for higher availability.
- A cluster supports deduplication (eliminating duplicate copies of the same data), compression, and erasure coding (protecting data by splitting it into fragments stored with parity information), ensuring efficient storage management and resilience.
- Offers support for storage quality of service (QoS), which enables administrators to limit the number of input-output operations per second (IOPS) consumed by specific VMs.
- VMware vSAN 7.0 Update 2 incorporates the following new features:
- HCI Mesh Updates:
vSAN clusters can share storage capacity with non-HCI vSphere clusters, so customers can adopt HCI without having to scale compute resources and storage or replace existing servers. Licensing: an HCI Mesh deployment for vSAN clusters, or for vSAN clusters sharing storage, requires vSAN Enterprise or Enterprise Plus licenses.
- Simplified File Services:
Backup of file shares is simplified by APIs that allow backup and recovery software vendors to integrate with vSAN File Services. The new API enables backup software to track new data and adds scalability enhancements for files.
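Erasure coding, listed among the features above, can be illustrated with single-parity XOR: data is split into fragments, a parity fragment is computed, and any one lost fragment can be rebuilt from the survivors. This is a generic sketch of the technique, not vSAN's on-disk implementation; the function names and the fragment count of three are assumptions.

```python
from functools import reduce
from typing import List, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, fragments: int = 3) -> Tuple[List[bytes], bytes]:
    """Split data into equal-size fragments and compute one XOR parity fragment."""
    size = -(-len(data) // fragments)  # ceiling division
    padded = data.ljust(size * fragments, b"\0")  # zero-pad the last fragment
    parts = [padded[i * size:(i + 1) * size] for i in range(fragments)]
    parity = reduce(xor_bytes, parts)
    return parts, parity


def rebuild(parts: List[bytes], parity: bytes, lost_index: int) -> bytes:
    """Recover the fragment at lost_index from the parity and the other fragments."""
    survivors = [p for i, p in enumerate(parts) if i != lost_index]
    return reduce(xor_bytes, survivors, parity)
```

With three data fragments and one parity fragment, the scheme stores 4/3 of the original size and tolerates the loss of any single fragment, whereas keeping a full mirror copy stores twice the original size.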
Integration Mechanism
- Uses the VMware vSAN Management SDK for Python to gather storage usage and performance statistics for a vSAN cluster.
- Uses NativeBridgeService from the Gateway to connect and execute the Python script.
- In Inventory, vSAN entities (disks) are mapped as components to existing vCenter resources (hosts).
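The last point, mapping discovered disks as components of existing host resources, can be pictured with plain data structures. This is a hedged sketch of the idea only: it does not use the vSAN Management SDK, and the record shape (`name`, `host`) and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def map_disks_to_hosts(
    hosts: List[str], disks: List[dict]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Attach each discovered disk to its host resource.

    Returns (components_by_host, unmapped_disk_names). Disks whose host
    is not an existing resource are reported instead of silently dropped.
    """
    known_hosts = set(hosts)
    components = defaultdict(list)
    unmapped = []
    for disk in disks:
        if disk["host"] in known_hosts:
            components[disk["host"]].append(disk["name"])
        else:
            unmapped.append(disk["name"])
    # Every known host appears in the result, even with no disks.
    return {host: components[host] for host in hosts}, unmapped
```

Reporting unmapped disks separately is a design choice for this example; it makes a host that has not yet been discovered visible rather than hiding its disks.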
Integration Configuration
To enable vSAN for a VMware integration, navigate to Setup → Integrations and Apps. In the vCenter Plug-ins Configurations section, select the vSAN checkbox.
Discover physical components of vSAN.
- Disks
Discovered vSAN disks can be seen under any of the following paths:
Infrastructure → vCenter → vSAN Components
Infrastructure → vCenter → DataCenter → Cluster → vSAN Components
Infrastructure → vCenter → DataCenter → Cluster → Host → vSAN Components
The health icon statuses are described below:
Health Icon Color | Health Icon Status | Description |
---|---|---|
Green | Good | The health of the object is normal. |
Yellow | Warning | The object is experiencing some problems. |
Red | Critical | The object is either not functioning properly or will stop functioning soon. |
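The health metrics listed later in this document report an enumerated value (0 red, 1 green, 2 yellow, 3 None), which lines up with the icon colors in the table above. A minimal sketch of that lookup follows; the dictionary and function names are hypothetical.

```python
from typing import Optional, Tuple

# Enumerated values as listed for the disk health metrics in this document.
HEALTH_COLOR_BY_VALUE = {0: "red", 1: "green", 2: "yellow", 3: None}

# Icon color to status, as in the health icon table.
HEALTH_STATUS_BY_COLOR = {"green": "Good", "yellow": "Warning", "red": "Critical"}


def describe_health(value: int) -> Tuple[Optional[str], Optional[str]]:
    """Translate an enumerated health value into (icon color, status).

    Value 3 means no health status was reported, so both parts are None.
    """
    if value not in HEALTH_COLOR_BY_VALUE:
        raise ValueError(f"unexpected health value: {value!r}")
    color = HEALTH_COLOR_BY_VALUE[value]
    return color, HEALTH_STATUS_BY_COLOR.get(color)
```

Note that the numeric order does not follow severity: 0 is the critical state, so comparisons such as `value > 1` should not be used to detect problems.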
- Click Disk Name to see disk attributes.
Notes:
- If isSSD is True, the disk is an SSD.
- If isSSD is False, the disk is an HDD.
- If the Capacity attribute value is None, 0MB is displayed.
- When the Rebalance Result Status is None, the Fullness, Fullness Above Threshold, Variance, and DataToMoveB attribute values are not available.
- Navigate to Infrastructure → vCenter → DataCenter → Cluster → Attributes → vSAN to view discovered vSAN Cluster Attributes.
- Navigate to Infrastructure → vCenter → DataCenter → Cluster → Host → Attributes → vSAN to view discovered vSAN Host Attributes.
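The rules in the notes on disk attributes can be summarized as a small formatting function. This is an illustrative sketch rather than the integration's actual code: apart from `isSSD` and `DataToMoveB`, the key names and the input shape are assumptions.

```python
from typing import Any, Dict


def format_disk_attributes(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the display rules from the notes to one raw disk record."""
    capacity_mb = raw.get("capacityMB")  # hypothetical key name
    attributes = {
        # isSSD True means SSD, False means HDD.
        "Disk Type": "SSD" if raw.get("isSSD") else "HDD",
        # A missing capacity is displayed as 0MB.
        "Capacity": "0MB" if capacity_mb is None else f"{capacity_mb}MB",
    }
    # Rebalance details exist only when a rebalance result status was returned.
    rebalance = raw.get("rebalanceResult")  # hypothetical key name
    if rebalance is not None:
        for key in ("Fullness", "Fullness Above Threshold", "Variance", "DataToMoveB"):
            attributes[key] = rebalance.get(key)
    return attributes
```

A record without a rebalance result therefore yields only the disk type and capacity, which matches the behavior described in the last note.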
Monitoring metrics and Templates
Template Name | Monitor Name | Metric Name | Unit | Description |
---|---|---|---|---|
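VMware vSAN VirtualMachine Performance | VMware vSAN VirtualMachine Performance | vmware_vsan_virtual_machine_iopsRead | number | Virtual machine read IOPS. |
VMware vSAN VirtualMachine Performance | VMware vSAN VirtualMachine Performance | vmware_vsan_virtual_machine_iopsWrite | number | Virtual machine write IOPS. |
VMware vSAN VirtualMachine Performance | VMware vSAN VirtualMachine Performance | vmware_vsan_virtual_machine_throughputRead | rate_bytes | Virtual machine read throughput. |
VMware vSAN VirtualMachine Performance | VMware vSAN VirtualMachine Performance | vmware_vsan_virtual_machine_throughputWrite | rate_bytes | Virtual machine write throughput. |
VMware vSAN VirtualMachine Performance | VMware vSAN VirtualMachine Performance | vmware_vsan_virtual_machine_latencyRead | time_ms | Virtual machine read latency. |
VMware vSAN VirtualMachine Performance | VMware vSAN VirtualMachine Performance | vmware_vsan_virtual_machine_latencyWrite | time_ms | Virtual machine write latency. |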
VMware vSAN Host Network Performance | VMware vSAN Host Network Performance | vmware_vsan_host_net_rxThroughput | rate_bytes | Inbound host throughput across all VMkernel network adapters enabled for vSAN traffic. |
VMware vSAN Host Network Performance | VMware vSAN Host Network Performance | vmware_vsan_host_net_txThroughput | rate_bytes | Outbound host throughput across all VMkernel network adapters enabled for vSAN traffic. |
VMware vSAN Host Network Performance | VMware vSAN Host Network Performance | vmware_vsan_host_net_rxPackets | number | Inbound host network packets per second across all VMkernel network adapters enabled for vSAN traffic. |
VMware vSAN Host Network Performance | VMware vSAN Host Network Performance | vmware_vsan_host_net_txPackets | number | Outbound host network packets per second across all VMkernel network adapters enabled for vSAN traffic. |
VMware vSAN Host Network Performance | VMware vSAN Host Network Performance | vmware_vsan_host_net_rxPacketsLossRate | permille | Inbound host packet loss rate across all VMkernel network adapters enabled for vSAN traffic. |
VMware vSAN Host Network Performance | VMware vSAN Host Network Performance | vmware_vsan_host_net_txPacketsLossRate | permille | Outbound host packet loss rate across all VMkernel network adapters enabled for vSAN traffic. |
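VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_iopsDevRead | number | vSAN disk physical/firmware layer read IOPS. |
VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_iopsDevWrite | number | vSAN disk physical/firmware layer write IOPS. |
VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_throughputDevRead | rate_bytes | vSAN disk physical/firmware layer read throughput. |
VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_throughputDevWrite | rate_bytes | vSAN disk physical/firmware layer write throughput. |
VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_latencyDevRead | time_ms | vSAN disk physical/firmware layer read latency. |
VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_latencyDevWrite | time_ms | vSAN disk physical/firmware layer write latency. |
VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_latencyDevGAvg | time_ms | vSAN disk Guest IO latency (total latency). |
VMware vSAN Host Cache Disk Performance | VMware vSAN Host Cache Disk Performance | vmware_vsan_host_cache_disk_latencyDevDAvg | time_ms | vSAN disk IO device latency (from HBA to backend storage). |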
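VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_iopsDevRead | number | vSAN disk physical/firmware layer read IOPS. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_iopsDevWrite | number | vSAN disk physical/firmware layer write IOPS. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_throughputDevRead | rate_bytes | vSAN disk physical/firmware layer read throughput. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_throughputDevWrite | rate_bytes | vSAN disk physical/firmware layer write throughput. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_latencyDevRead | time_ms | vSAN disk physical/firmware layer read latency. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_latencyDevWrite | time_ms | vSAN disk physical/firmware layer write latency. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_latencyDevGAvg | time_ms | vSAN disk Guest IO latency (total latency). |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_latencyDevDAvg | time_ms | vSAN disk IO device latency (from HBA to backend storage). |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_iopsRead | number | Disk vSAN layer read IOPS. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_iopsWrite | number | Disk vSAN layer write IOPS. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_latencyRead | time_ms | Disk vSAN layer read latency. |
VMware vSAN Host Capacity Disk Performance | VMware vSAN Host Capacity Disk Performance | vmware_vsan_host_capacity_disk_latencyWrite | time_ms | Disk vSAN layer write latency. |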
VMware vSAN Cluster Virtual Disk Performance | VMware vSAN Cluster Virtual Disk Performance | vmware_vsan_virtual_disk_iopsLimit | number | The applied IOPS limit. |
VMware vSAN Cluster Virtual Disk Performance | VMware vSAN Cluster Virtual Disk Performance | vmware_vsan_virtual_disk_NIOPS | number | Normalized IOPS, calculated using a weighted size of 32KB by default. For example, a 64KB read or write operation counts as two normalized IOs. The weighted size is a configurable parameter. |
VMware vSAN Cluster Virtual Disk Performance | VMware vSAN Cluster Virtual Disk Performance | vmware_vsan_virtual_disk_NIOPSDelayed | number | The IOPS for normalized IOs that are delayed. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_capacity_global_freeCapacityB | Bytes | The amount of free Virtual SAN capacity in bytes. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_capacity_global_totalCapacityB | Bytes | The total Virtual SAN capacity in bytes. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_capacity_summary_usedB | Bytes | The amount of Virtual SAN capacity being used in bytes. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_capacity_other_used | Bytes | The amount of Virtual SAN capacity being used in bytes. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_congestion | count | vSAN cluster congestion for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_iopsRead | count | vSAN cluster read IOPS for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_iopsWrite | count | vSAN cluster write IOPS for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_latencyAvgRead | MilliSeconds | vSAN cluster average read latency for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_latencyAvgWrite | MilliSeconds | vSAN cluster average write latency for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_oio | count | vSAN cluster outstanding I/O for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_throughputRead | Bytes | vSAN cluster read throughput for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_backend_throughputWrite | Bytes | vSAN cluster write throughput for the vSAN backend. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_congestion | count | Congestion of I/Os generated by all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_iopsRead | count | Read IOPS consumed by all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_iopsWrite | count | Write IOPS consumed by all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_latencyAvgRead | MilliSeconds | Average read latency of I/Os generated by all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_latencyAvgWrite | MilliSeconds | Average write latency of I/Os generated by all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_oio | count | Outstanding I/O from all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_throughputRead | Bytes | Read throughput consumed by all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
VMware vSAN Cluster Performance | VMware vSAN Cluster Performance | vmware_vsan_cluster_client_throughputWrite | Bytes | Write throughput consumed by all vSAN clients in the cluster, such as virtual machines, stats object, etc. |
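VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_compCongestion | count | vSAN disk group Comp-congestion. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_iopsCongestion | count | vSAN disk group IOPS-congestion. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_logCongestion | count | vSAN disk group Log-congestion. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_memCongestion | count | vSAN disk group Mem-congestion. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_slabCongestion | count | vSAN disk group Slab-congestion. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_ssdBytesDrained | bytes | vSAN disk group SSD-congestions. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_iopsRead | count | vSAN disk group (cache tier disk) front end read IOPS, including RC read misses. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_iopsWrite | count | vSAN disk group (cache tier disk) front end write IOPS. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_latencyAvgRead | milliseconds | vSAN disk group (cache tier disk) front end read latency. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_latencyAvgWrite | milliseconds | vSAN disk group (cache tier disk) front end write latency. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_rcHitRate | percentage | vSAN disk group (cache tier disk) Read Cache hit rate. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_throughputRead | bytes | vSAN disk group (cache tier disk) front end read throughput. |
VMware vSAN Cluster DiskGroup Performance | VMware vSAN Cluster DiskGroup Performance | vmware_vsan_cluster_disk_group_throughputWrite | bytes | vSAN disk group (cache tier disk) front end write throughput. |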
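VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_congestion | count | vSAN host congestion for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_iopsRead | count | vSAN host read IOPS for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_iopsWrite | count | vSAN host write IOPS for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_latencyAvgRead | milliseconds | vSAN host read I/O average latency for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_latencyAvgWrite | milliseconds | vSAN host write I/O average latency for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_oio | count | vSAN host outstanding I/O for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_iopsResyncRead | count | vSAN host read IOPS of resync traffic from resyncing objects, including policy change, repair, maintenance mode / evacuation, and rebalance, from the perspective of the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_iopsRecWrite | count | vSAN host recovery write IOPS from the perspective of the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_throughputRead | bytes | vSAN host read throughput for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_backend_throughputWrite | bytes | vSAN host write throughput for the vSAN backend. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_congestion | count | Congestion of I/Os generated by all vSAN clients in the host, such as virtual machines, stats object, etc. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_iopsRead | count | Read IOPS consumed by all vSAN clients in the host, such as virtual machines, stats object, etc. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_iopsWrite | count | Write IOPS consumed by all vSAN clients in the host, such as virtual machines, stats object, etc. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_latencyAvgRead | milliseconds | Average read latency of I/Os generated by all vSAN clients in the host, such as virtual machines, stats object, etc. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_latencyAvgWrite | milliseconds | Average write latency of I/Os generated by all vSAN clients in the host, such as virtual machines, stats object, etc. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_oio | count | Outstanding I/O from all vSAN clients in the host, such as virtual machines, stats object, etc. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_throughputRead | bytes | Read throughput consumed by all vSAN clients in the host, such as virtual machines, stats object, etc. |
VMware vSAN Host Backend and Client Performance | VMware vSAN Host Backend and Client Performance | vmware_vsan_host_client_throughputWrite | bytes | Write throughput consumed by all vSAN clients in the host, such as virtual machines, stats object, etc. |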
VMware vSAN Host Disk Performance | VMware vSAN Host Disk Performance | vmware_vsan_host_disk_summaryHealth | Enumerated map: 0 = red, 1 = green, 2 = yellow, 3 = None | The overall health status of the disk. It is the aggregated health status for the disk operational health, disk congestion health, disk metadata health, disk capacity health, disk component limit health, and disk dedup usage health. The status is reported as one of the following values: green (Good): the health of the object is normal; yellow (Warning): the object is experiencing some problems; red (Critical): the object is either not functioning properly or will stop functioning soon. |
VMware vSAN Host Disk Performance | VMware vSAN Host Disk Performance | vmware_vsan_host_disk_capacityHealth | Enumerated map: 0 = red, 1 = green, 2 = yellow, 3 = None | The disk capacity health status. |
VMware vSAN Host Disk Performance | VMware vSAN Host Disk Performance | vmware_vsan_host_disk_operationalHealth | Enumerated map: 0 = red, 1 = green, 2 = yellow, 3 = None | The disk operational health status. The status is reported as one of the following values: green (Good): the health of the object is normal; yellow (Warning): the object is experiencing some problems; red (Critical): the object is either not functioning properly or will stop functioning soon. |
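The normalized IOPS metric (vmware_vsan_virtual_disk_NIOPS) in the table above weights each operation by a 32KB unit, so a 64KB operation counts as two normalized IOs. The arithmetic can be sketched as follows. The function names are hypothetical, and rounding operations smaller than the weighted size up to one normalized IO is an assumption that the table does not state.

```python
from typing import Iterable

DEFAULT_WEIGHTED_SIZE_BYTES = 32 * 1024  # 32KB default from the metric description


def normalized_io_count(
    io_size_bytes: int, weighted_size_bytes: int = DEFAULT_WEIGHTED_SIZE_BYTES
) -> int:
    """Number of normalized IOs represented by one operation of the given size."""
    if io_size_bytes <= 0 or weighted_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: partial units count as a whole normalized IO (assumption).
    return -(-io_size_bytes // weighted_size_bytes)


def normalized_iops(
    io_sizes_bytes: Iterable[int],
    interval_seconds: float,
    weighted_size_bytes: int = DEFAULT_WEIGHTED_SIZE_BYTES,
) -> float:
    """Normalized IOPS for the operations observed during one interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    total = sum(normalized_io_count(size, weighted_size_bytes) for size in io_sizes_bytes)
    return total / interval_seconds
```

For instance, one 64KB write and one 4KB read within a one-second interval amount to three normalized IOs under these assumptions, which is the figure that would be compared against the applied IOPS limit.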