Several major platform vendors and business-critical application providers
have begun to make formal announcements about their intent to support
Microsoft's Wolfpack clustering initiative (for information on this initiative,
see Joel Sloss, "Wolfpack Beta 2," June 1997). As a result, many
systems administrators and designers are in a panic over planning for the
orderly introduction of this new technology. Much of this planning will focus on
load distribution and failover scenarios for key applications. Although many
administrators and designers see storage planning and data management as minor
components of the overall process, these factors are key to the success or
failure of a clustering implementation.
As of this writing, Microsoft plans to formally announce Wolfpack's
availability this summer. Several early adopter partners (Compaq Computer,
Digital Equipment, HP, IBM, NCR, and Tandem) and other platform partners
(Amdahl, Siemens-Nixdorf, and Stratus) will make available a
broad range of Microsoft-certified clustering configurations. At the same time,
cluster-aware versions of business-critical applications such as Oracle Parallel
Server, SAP R/3, the Microsoft BackOffice Suite, and Computer Associates (CA)
Unicenter TNG will complement these hardware announcements. These partnering
forces will combine to quickly move this technology into the enterprise. During
this time, many MIS executives will face a lot of marketing hype that oversells
the ease of integrating these clustering solutions. Systems administrators faced
with this situation need to realistically look at integrating clustering storage
and data management components (for information about why you would need to
implement a clustering solution, see Mark Smith, "Clusters for Everyone,"
June 1997).
Storage Pragmatics and Challenges
Each vendor that supports Wolfpack will offer a slightly different solution.
Even so, you need to consider some common storage issues across the board. One
key issue is the need to establish a storage hierarchy and a backup and fault
recovery plan early on.
In establishing this storage hierarchy, you take a different approach from
the one you typically take with Hierarchical Storage Management (HSM). Instead
of focusing on classes of service (i.e., cache/SSD, online, nearline, offline)
and the use of various storage devices (e.g., disk, tape, optical) to provide
these classes, you need to focus on segregating SCSI disk storage devices and
subsystems depending on whether they are server, Wolfpack, or application
driven. With application-driven requirements, you also need to determine whether
your disk requirements are for raw or formatted (NTFS) drives (raw for Oracle
Parallel Server and other business-critical applications that manage their own
disk space, and formatted for all other applications). You can then apply
optional data protection and availability schemes (e.g., RAID, disk mirroring)
to these hierarchies as you see fit.
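As a rough illustration of this kind of hierarchy, the following Python sketch groups disk volumes by what drives them (server, Wolfpack, or application) and by whether they are raw or NTFS-formatted. The volume names, the protection labels, and the choice of NTFS for the non-application volumes are assumptions made for the example, not part of any Wolfpack tool or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Driver(Enum):
    SERVER = "server"            # local to one server
    WOLFPACK = "wolfpack"        # needed by the cluster software itself
    APPLICATION = "application"  # owned by a clustered application


class Format(Enum):
    RAW = "raw"    # application manages its own disk space
    NTFS = "ntfs"  # formatted drive


@dataclass(frozen=True)
class Volume:
    name: str                  # hypothetical label, for illustration only
    driver: Driver
    fmt: Format
    protection: str = "none"   # optional scheme, e.g. "mirror" or "raid5"


def build_hierarchy(volumes):
    """Group volume names by (driver, format)."""
    tiers = defaultdict(list)
    for vol in volumes:
        tiers[(vol.driver, vol.fmt)].append(vol.name)
    return dict(tiers)


# Illustrative inventory; every entry here is an assumption.
volumes = [
    Volume("boot-a", Driver.SERVER, Format.NTFS, "mirror"),
    Volume("cluster-shared", Driver.WOLFPACK, Format.NTFS, "mirror"),
    Volume("db-data", Driver.APPLICATION, Format.RAW, "raid5"),
    Volume("file-share", Driver.APPLICATION, Format.NTFS, "raid5"),
]

hierarchy = build_hierarchy(volumes)
for (driver, fmt), names in hierarchy.items():
    print(f"{driver.value:12} {fmt.value:5} {', '.join(names)}")
```

Keeping the protection scheme as a separate field reflects the point that RAID or mirroring is applied to each tier after the segregation is done, rather than defining the tiers.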
Storage Requirements
Microsoft built Wolfpack Release 1 so that both servers in a Wolfpack pair
operate in an active/active mode (for definitions of such terms, see "Clustering
Terms and Technologies," June 1997). As a result, each server supports its own
set of applications and day-to-day workloads, and must also hold licensed copies
of the other server's applications in case that server fails over. This
configuration requires that you take a worst-case approach to planning RAM,
cache, and disk capacity on each server and storage subsystem.
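To make the worst-case approach concrete, here is a minimal Python sketch that sizes each server for the case where it must carry both its own workloads and its partner's after a failover. The workload names and the RAM and disk figures are invented for illustration, and summing the two sets is a simplifying assumption that gives an upper bound rather than a measured requirement.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str      # hypothetical application name
    ram_mb: int    # illustrative RAM requirement
    disk_gb: int   # illustrative disk requirement


def worst_case(own, partner):
    """Resources one server needs if it runs both workload sets after failover."""
    combined = own + partner
    return {
        "ram_mb": sum(w.ram_mb for w in combined),
        "disk_gb": sum(w.disk_gb for w in combined),
    }


# Active/active pair: each server normally runs a different set of applications.
server_a = [Workload("database", 512, 40)]
server_b = [Workload("mail", 256, 20), Workload("file-print", 128, 30)]

need_a = worst_case(server_a, server_b)  # server A after B fails over to it
need_b = worst_case(server_b, server_a)  # server B after A fails over to it

print("Server A worst case:", need_a)
print("Server B worst case:", need_b)
```

Because either server may end up hosting everything, both come out with the same worst-case figures even though their normal workloads differ.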
You also need to analyze which protection or availability scheme works best
for each server and storage subsystem. No one scheme is universal, so you must
plan for some type of partitioning or multiple storage device or subsystem.
In addition to these concerns, you must be aware that Wolfpack relies on a
quorum drive (a dedicated drive--or spindle--that both servers share to store
and retrieve quorum resource log information). This drive can be a single point
of failure: If it crashes or becomes corrupted, Wolfpack loses all its
housekeeping information and neither failover nor failback can occur. At a
minimum, you must provide hardware mirroring on this drive or consider
backup-on-the-fly, 24-hour-per-day protection.
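A storage plan can be checked against these quorum requirements before deployment. The Python sketch below is one hypothetical way to do so; the QuorumDrive fields and the quorum_risks function are names invented for this example and do not correspond to any Wolfpack interface.

```python
from dataclasses import dataclass


@dataclass
class QuorumDrive:
    dedicated: bool           # drive holds only quorum resource log data
    hardware_mirrored: bool   # protected by hardware mirroring
    continuous_backup: bool   # protected by backup-on-the-fly, around the clock


def quorum_risks(drive):
    """Return a list of planning risks for the quorum drive (empty if none)."""
    risks = []
    if not drive.dedicated:
        risks.append("quorum drive is not a dedicated spindle")
    if not (drive.hardware_mirrored or drive.continuous_backup):
        risks.append("quorum drive is a single point of failure")
    return risks


plan = QuorumDrive(dedicated=True, hardware_mirrored=False, continuous_backup=False)
for risk in quorum_risks(plan):
    print("WARNING:", risk)
```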