In computing, a storage area network (SAN) is an architecture for attaching remote computer storage devices, such as disk array controllers, tape libraries and CD arrays, to servers in such a way that the devices appear to the operating system as locally attached.
Although cost and complexity are dropping, as of 2007 SANs are still uncommon outside larger enterprises.
Network types
Most storage networks use the SCSI protocol for communication between servers and disk drive devices, though they do not use its low-level physical interface, instead using a mapping layer such as the FCP mapping standard.
Benefits
Sharing storage usually simplifies storage administration and adds flexibility, since cables and storage devices do not have to be physically moved to move storage from one server to another. Note, though, that with the exception of SAN file systems and clustered computing, SAN storage is still a one-to-one relationship. That is, each device, or Logical Unit Number (LUN), on the SAN is "owned" by a single computer (or initiator). In contrast, Network Attached Storage (NAS) allows many computers to access the same set of files over a network. The distinction between SAN and NAS has been blurred by the creation of the NAS head.
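The one-to-one ownership described above can be pictured as a small mapping table. The following is a minimal sketch in Python; the class name `LunMap`, its methods, and the server names are hypothetical illustrations and do not correspond to any vendor's management interface.

```python
class LunMap:
    """Toy model of one-to-one LUN ownership on a SAN (illustrative only)."""

    def __init__(self):
        self._owner = {}  # LUN id -> initiator (server) name

    def assign(self, lun, initiator):
        # A LUN may be owned by only one initiator at a time.
        current = self._owner.get(lun)
        if current is not None and current != initiator:
            raise ValueError(f"LUN {lun} is already owned by {current}")
        self._owner[lun] = initiator

    def reassign(self, lun, new_initiator):
        # Moving storage to another server is a table update;
        # no cables or devices are physically moved.
        if lun not in self._owner:
            raise KeyError(f"LUN {lun} is not assigned")
        self._owner[lun] = new_initiator

    def owner(self, lun):
        return self._owner.get(lun)


lun_map = LunMap()
lun_map.assign(0, "server-a")
lun_map.reassign(0, "server-b")   # e.g. server-a failed and was replaced
print(lun_map.owner(0))
```

In this model, a second `assign` of an owned LUN to a different server is rejected, while `reassign` hands the LUN over in a single step.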
SANs tend to increase storage capacity utilization, since multiple servers can share the same growth reserve.
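A rough worked example shows why a shared growth reserve raises utilization. All figures here (ten servers, 100 GB used on each, 50 GB of private headroom per server, a single 200 GB reserve for the shared pool) are invented for illustration and are not measurements.

```python
def utilization(used_gb, reserved_gb):
    """Fraction of provisioned capacity that actually holds data."""
    return used_gb / (used_gb + reserved_gb)


servers = 10
used_per_server = 100   # GB in use on each server (assumed)
private_reserve = 50    # GB of headroom per server with direct-attached disks (assumed)
shared_reserve = 200    # GB of headroom kept once for the whole SAN pool (assumed)

das = utilization(servers * used_per_server, servers * private_reserve)
san = utilization(servers * used_per_server, shared_reserve)

print(f"direct-attached: {das:.0%}, shared pool: {san:.0%}")
```

Under these assumed numbers, utilization rises from about 67% to about 83%, because the headroom is provisioned once for the pool rather than once per server.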
Other benefits include the ability to boot servers from the SAN itself. This allows quick and easy replacement of faulty servers, since the SAN can be reconfigured so that a replacement server uses the LUN of the faulty server. This process can take as little as half an hour and is a relatively new idea being pioneered in newer data centers. A number of emerging products are designed to facilitate and speed up this process further. For example, Brocade Communications Systems offers an Application Resource Manager product which automatically provisions servers to boot off a SAN, with typical-case load times measured in minutes. While this area of technology is still new, many view it as the future of the enterprise data center.
SANs also tend to enable more effective disaster recovery processes. A SAN-attached storage array can replicate data belonging to many servers to a secondary storage array. This secondary array can be local or, more typically, remote. The goal of disaster recovery is to place copies of data outside the radius of effect of an anticipated threat, so the long-distance transport capabilities of SAN protocols such as Fibre Channel and FCIP are required to support these solutions. (The physical layer options for the traditional direct-attached SCSI model could support only a few meters of distance: not nearly enough to ensure business continuance in a disaster.) Demand for this SAN application increased dramatically after the September 11th attacks in the United States, and with the increased regulatory requirements associated with Sarbanes-Oxley and similar legislation.
Newer SANs offer duplication functionality such as "cloning", "Business Continuance Volumes (BCV)" and "snapshotting", which allows real-time duplication of a LUN for the purposes of backup, disaster recovery, or system duplication. With higher-end database systems, this can occur without downtime and is geographically independent, being limited primarily by available bandwidth and storage. Cloning and BCVs create a complete replica of the LUN in the background (consuming I/O resources in the process), while snapshotting stores only the original states of any blocks of the original LUN that change after the "snapshot" (also known as the delta blocks), and does not significantly slow the system. In time, however, snapshots can grow to be as large as the original system, and they are normally recommended only for temporary storage. The two types of duplication are otherwise identical: a cloned or snapshotted LUN can be mounted on another system for execution, backed up to tape or another device, or replicated to a distant point.
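The difference between a clone and a snapshot can be sketched with a toy block device. This is a simplified copy-on-write model written for illustration; the class names `Volume` and `Snapshot` are hypothetical, and real arrays implement these features in controller firmware with many refinements omitted here.

```python
class Volume:
    """Toy block device: a list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def clone(self):
        # Full replica: every block is copied up front.
        return Volume(self.blocks)

    def snapshot(self):
        # Copy-on-write: starts empty and saves originals only when blocks change.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Preserve the original block in each snapshot before overwriting it.
        for snap in self.snapshots:
            snap.preserve(index)
        self.blocks[index] = data


class Snapshot:
    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> contents at snapshot time (delta blocks)

    def preserve(self, index):
        # Only the first overwrite matters; later ones leave the saved copy alone.
        if index not in self.saved:
            self.saved[index] = self.volume.blocks[index]

    def read(self, index):
        # Unchanged blocks are read straight from the live volume.
        if index in self.saved:
            return self.saved[index]
        return self.volume.blocks[index]


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
print([snap.read(i) for i in range(3)], len(snap.saved))
```

The snapshot holds one saved block after one block has changed, while a clone of the same volume would hold all three from the start. This is also why a long-lived snapshot can eventually grow as large as the original.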
Disk controllers
The driving force for the SAN market in the enterprise space is the rapid growth of highly transactional data that requires high-speed, block-level access to the hard drives (such as data from email servers, databases, and high-usage file servers). Historically, enterprises would have "islands" of high-performance SCSI RAID storage that were locally attached to each application server. These "islands" would be backed up over the network, and when the application data exceeded the maximum amount of data storable by the individual server, the end user would often have to upgrade the server to keep up.
The disk controllers used in enterprise SAN environments are designed to provide applications with block-level access to high-speed, reliable "virtual hard drives" (or LUNs). In addition, modern SANs allow enterprises to intermix FC SATA drives with their FC SCSI drives. Some studies indicate that SATA drives have lower performance, a higher failure rate, higher capacity, and lower prices than SCSI. Mixing the two allows enterprises to maintain multiple tiers of data that migrate over time to different types of media. For example, many enterprises relegate files that are rarely accessed to FC SATA while keeping their frequently used data on FC SCSI.
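A tiering rule of the kind described above can be as simple as a cutoff on the time since last access. The sketch below is illustrative only: the function name `choose_tier` and the 90-day threshold are assumptions, not values from any product or from this article.

```python
from datetime import datetime, timedelta


def choose_tier(last_access, now, cold_after=timedelta(days=90)):
    """Pick a drive tier from how recently the data was accessed.

    The 90-day default is an arbitrary, assumed threshold.
    """
    if now - last_access > cold_after:
        return "FC SATA"   # rarely accessed: cheaper, higher-capacity tier
    return "FC SCSI"       # frequently used: faster tier


now = datetime(2007, 6, 1)
print(choose_tier(datetime(2007, 5, 20), now))
print(choose_tier(datetime(2006, 1, 1), now))
```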
Another feature of most enterprise disk controllers is an I/O cache. This feature allows higher overall performance for writes to the controller and, in some cases (such as contiguous file access with read-ahead enabled), for reads from the controller.
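The two effects of the cache, fast acknowledgement of writes and read-ahead on sequential reads, can be sketched as follows. This is a toy model under simplifying assumptions: the class name `CachedController` is hypothetical, the cache is unbounded with no eviction, and there is no protection against power loss, all of which a real controller must handle.

```python
class CachedController:
    """Toy disk controller with a write-back cache and sequential read-ahead."""

    def __init__(self, disk, read_ahead=4):
        self.disk = disk              # backing store: a list of blocks
        self.cache = {}               # block index -> data
        self.dirty = set()            # indices written but not yet flushed
        self.read_ahead = read_ahead  # extra blocks fetched on a miss
        self.disk_reads = 0           # number of trips to the disk

    def write(self, index, data):
        # Acknowledge from cache immediately; the disk is updated on flush().
        self.cache[index] = data
        self.dirty.add(index)

    def read(self, index):
        if index not in self.cache:
            # Miss: fetch the requested block plus the next few in one trip.
            end = min(index + 1 + self.read_ahead, len(self.disk))
            for i in range(index, end):
                if i not in self.cache:   # never overwrite unflushed writes
                    self.cache[i] = self.disk[i]
            self.disk_reads += 1
        return self.cache[index]

    def flush(self):
        # Write all pending blocks back to the disk.
        for index in sorted(self.dirty):
            self.disk[index] = self.cache[index]
        self.dirty.clear()


ctrl = CachedController(list(range(10)))
data = [ctrl.read(i) for i in range(5)]   # contiguous read
print(data, ctrl.disk_reads)
```

In this model, the five contiguous reads are served by a single trip to the disk, because the first miss also fetches the following four blocks.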
SAN types
SANs require an infrastructure specially designed to handle storage communications, called a fabric. As a result, they tend to provide faster and more reliable access than higher-level file protocols such as those used by NAS. A fabric is similar in concept to a segment in a local area network.
The industry standard SAN technology is Fibre Channel networking with the SCSI command set. A typical Fibre Channel SAN fabric is made up of a number of Fibre Channel switches. Today, all major SAN equipment vendors also offer some form of Fibre Channel routing solution, and these bring substantial scalability benefits to the SAN architecture by allowing data to cross between different fabrics without merging them. These offerings use proprietary protocol elements, and the top-level architectures being promoted are radically different. When extending Fibre Channel over long distances for disaster recovery solutions, it can be mapped over other protocols. For example, products exist to map Fibre Channel over IP (FCIP) and over SONET/SDH. It can also be extended natively using signal repeaters, high-power laser media, or multiplexers such as DWDMs.
Types of SAN
A centralized storage area network contains many heterogeneous servers connected to one single storage space. The single storage space can have heterogeneous storage entities or disk drives. Centralized storage area networks are useful for simplifying the storage architecture in large organizations. The storage space can be treated as a black box so that administration of storage is easy. Centralized storage area networks are compatible with many heterogeneous server environments including Unix, Solaris, Linux, Windows based servers and more.
A distributed storage area network contains many geographically dispersed disk drive networks. All the networks are treated as one unit and are connected by the iSCSI storage area network protocol. A distributed storage area network is a sub-network of shared storage devices that allows all stored information to be shared among all of the servers on the network. Distributed storage area networks are most popular in large organizations with geographically dispersed storage pools that can be connected and communicate through iSCSI.
SAN vs Traditional Server Based Storage
In a typical large LAN-installation, each of a number of servers (and perhaps mainframes) may have its own dedicated storage devices. If a client needs access to a particular storage device, it must go through the server that controls the device. In a SAN, no server sits between the storage devices and the network; instead, the storage devices and servers are linked directly to the network. The SAN arrangement improves client-to-storage access efficiency, as well as direct storage-to-storage communications for backup and replication functions.
SANs at work
SANs are primarily used in large-scale, high-performance enterprise storage operations. It would be unusual to find a Fibre Channel disk drive connected directly to a SAN. Instead, SANs are normally networks of large disk arrays. SAN equipment is relatively expensive; Fibre Channel host bus adapters are therefore rare in desktop computers. The iSCSI SAN technology is expected to eventually produce cheap SANs, but it is unlikely that this technology will be used outside the enterprise data center environment. Desktop clients are expected to continue using NAS protocols such as CIFS and NFS. The exception to this may be remote replication sites. Remote replication enables the data center environment to exist in multiple locations for disaster recovery and business continuity purposes. The performance issues formerly inherent in iSCSI
SANs in a Small Office / Home Office (SOHO)
With the rise of digital media in all phases of life and its effect on storage needs, SANs have begun to enter the SOHO market. Historically, this market was dominated by NAS systems, but SOHO is poised to become a major market for SAN infrastructure as SOHO performance requirements rise.
Systems such as film scanners and video editing applications require performance that traditional file servers cannot provide. For example, motion picture film at 2048x1536 requires more than 300 MB/s for each real-time stream, and several of these streams can be required simultaneously. As a result, several gigabits per second can be required, which creates a problem for standard NAS technologies. In addition, these systems need to work on the same files collaboratively, so the files cannot be distributed across different file servers or DAS connections.
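The 300 MB/s figure can be reproduced with a short calculation. The article gives only the frame size, so two inputs here are assumptions: 4 bytes per pixel (for example, 10-bit RGB packed into 32 bits) and 24 frames per second.

```python
def stream_rate_mb_s(width, height, bytes_per_pixel, fps):
    """Uncompressed video data rate in megabytes per second (1 MB = 10**6 bytes)."""
    return width * height * bytes_per_pixel * fps / 1e6


# bytes_per_pixel and fps are assumed values, not stated in the text.
rate = stream_rate_mb_s(2048, 1536, bytes_per_pixel=4, fps=24)
gbit = rate * 8 / 1000

print(f"{rate:.0f} MB/s per stream, {gbit:.2f} Gbit/s")
# prints "302 MB/s per stream, 2.42 Gbit/s"
```

Under these assumptions a single stream already needs about 2.4 Gbit/s, so three simultaneous streams would need more than 7 Gbit/s.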
Instead of many computers connected to the network, each requiring low bandwidth, with only the server stressed under heavy traffic, the SOHO "real-time" area needs to integrate only a few systems, but all of them require high-bandwidth access to the same files. These problems are addressed very well by 4 Gbit/s Fibre Channel SAN infrastructures, where the aggregate bandwidth for sequential I/O operations is extremely high.