This article describes the supportability of having a
Windows server boot from a Storage Area Network (SAN).
Microsoft supports booting from a Storage Area Network
(SAN) if the SAN vendor supports their particular hardware platform booting a
Windows server. The SAN and host bus adapter (HBA) must be configured according
to the SAN vendor's guidelines and the SAN vendor must
act as the main point of contact for boot-related issues. This
requirement exists because booting from a SAN is extremely complex, and the
vendor needs to support the particular configuration because the SAN vendor
provides the SAN boot supportability statement. It is important to note that
the information that is included in this article is not intended to be an
all-inclusive list of the items that are required to boot from a SAN. The SAN
vendor must provide specific steps, drivers, firmware revisions and resources
about how to make their hardware (storage systems, switches, Host Bus Adapters,
and so on) work properly together.
The following issues must be addressed so multiple computers can
successfully boot from a SAN:
- To boot multiple computers from a SAN, the SAN must either be configured in a switched environment, or it must be directly attached from each host to one of the storage sub-system's Fibre Channel ports. The use of Fiber Channel - Arbitrated Loop (FC-AL) is not supported when booting multipe servers from the SAN because it does not allow the hosts that are attached to the SAN to be properly segregated from each other. A switched environment allows the hosts to be separate from each other. Booting to a SAN with a Fiber Channel-Arbitrated Loop topology is only supported when booting a single server from the SAN.
- The host must have exclusive access to the disk that it is
booting from. No other host on the SAN should be able to detect or have access
to the same logical disk. This can be accomplished by using a type of Logical
Unit Number (LUN) management such as LUN masking, zoning or some combination of
these methods. LUN management is normally configured at the switch, storage
subsystem and/or Host Bus Adapter (HBA) level and not within Windows. Windows
provide no capabilities for mapping LUNs.
- Multi-path software and multiple HBAs improve your chances
of recovery from a path failure. The purpose of having multiple HBAs in a
single host is to have redundancy and (possibly) increased throughput. However,
if a failure occurs and a path to the SAN is lost, there may be a period of
time where the drives on the SAN are not accessible. This path failure may
cause problems with the Windows server. Behavior of multi-path software varies
greatly between vendors. Check the Windows Catalog (formerly Hardware Compatibility List or HCL) for
Storage/RAID systems to make sure that the multi-path driver is in the Windows Catalog with the storage system. If you cannot find the multi-path software, contact your
To see the Storage/RAID Catalog, please view the following
Microsoft Web site:
- If the hosts that are attached are part of a Windows 2000 cluster solution, you must use one HBA for the boot process and a separate HBA for shared storage.
- If the hosts that are attached are part of a Windows 2000 cluster solution and are using the Microsoft multipath I/O (MPIO) feature, you need four HBAs.
This section describes several issues that may prevent a Windows
server from successfully booting from a SAN:
- A very common problem when you configure a SAN is that it
is possible that multiple hosts may have access to the same logical disk. This
usually occurs because proper LUN management was not employed. The default
behavior of Windows is to attach and mount every logical unit that it detects
when the HBA driver loads. If multiple hosts mount the same disk, file system
damage can occur. It is up to the configuration of the SAN to insure that only
one host can access a particular logical disk at a time. Symptoms of multiple
hosts accessing the same logical disk are:
Disk Management displays the same logical disk on multiple hosts.Plug and Play notification that new hardware is found may occur on multiple hosts when you add or configure a new logical disk.When you try to access a logical disk by using My Computer or Windows Explorer, you may receive an "Access Denied", "Device not Ready", or similar error message that may indicate that other hosts have access to the same logical disk.
- Your computer stops responding (hangs) or has slow response
times. This can indicate that there is a high latency to the pagefile, and this
may be accompanied by events in the System Log such as:
If the preceding error messages are in the System Log, it
indicates that Windows was trying to access a disk and there was a problem. If
the disk that is referenced is on the SAN, it could indicate a latency issue.
If an Event ID 51 is shown, this indicates that Memory Manager was attempting
to copy data to or from memory and had a problem. Another indicator of pagefile
latency issues is if the Windows server has a system failure, and either of the
following error messages are displayed on a blue screen:
Event ID: 51
Event Type: Warning
Description: An error was detected on device \Device\Harddisk0\DR0
during a paging operation.
Event ID: 11
Description: The driver detected a controller error on
Event ID: 9
Description: The device, \Device\ScsiPort0, did not respond within the timeout
A possible resolution is to
place the pagefile on the host's local hard disk. Windows needs reliable access
to the pagefile as data is paged in or out of memory. Having the pagefile local
to the host guarantees that access is not influenced by other devices and hosts
on the SAN.
Note If the pagefile is not on the same partition as the boot
partition (typically c:\Windows or c:\WINNT), the creation of a Memory.dmp file
will not occur. A Memory.dmp file is used for troubleshooting a Windows
computer that has a STOP error. For information about how to configure your
computer for a crashdump, see Windows Help.
There are several ways to resolve the preceding problems. The
first method is to try and correlate the time with any events that are
occurring on the SAN. For example, it HostA was doing a large copy operation
and HostB reports the Error 9s, it may imply that proper LUN management is not
in place. Another example is if HostB produces errors whenever HostA is
rebooted. This may indicate that FC-AL is being used and HostB is being
affected by a Loop Initialization Primitive (LIP) sequences from HostA. These
can often be corrected by re-configuring the SAN, and this requires the
assistance of the hardware vendor. Any type of latency issues might be resolved
by placing pagefile on the Windows server's local hard disk, but again, this
disables the creation of a memory dump. A key point to understand is that the
hardware vendor of the SAN will have the most information about the proper
configuration, and must be the first point of contact for all configuration
questions and concerns.
For more information about Windows server clusters in a SAN environment, click the following article numbers to view the articles in the Microsoft Knowledge Base:
Windows clustering and geographically separate sites
Cluster Service improvements for storage area networks