Data & System Availability - Storage options
Thin provisioning
With a SAN, storage costs more per gigabyte than local storage. The advantage of having the data available independently of the servers makes up for much of that cost, but it still pays to be conservative when allocating storage. Application developers and server administrators tend to ask for more storage than they actually need.
One solution to this problem is to present them with the storage they ask for, but physically allocate only what they really use. This is called 'thin provisioning'. The array sizes the LUN dynamically, growing its physical allocation as data is written.

Figure 2: Thin Provisioning
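
To make the thin provisioning idea concrete, here is a minimal sketch in Python. It is not how any particular array implements it. The class name `ThinLun`, the 4 KiB block size and the dictionary used as backing store are all assumptions made for illustration. The point it shows is that the advertised size and the physically allocated size are tracked separately, and that blocks are only allocated on first write.

```python
class ThinLun:
    """Hypothetical thin-provisioned LUN: advertises a size, allocates on write."""

    def __init__(self, advertised_blocks, block_size=4096):
        self.advertised_blocks = advertised_blocks
        self.block_size = block_size
        self._blocks = {}  # block number -> bytes, filled only on first write

    def _check(self, block_no):
        if not 0 <= block_no < self.advertised_blocks:
            raise IndexError("block outside advertised size")

    def write(self, block_no, data):
        self._check(block_no)
        if len(data) > self.block_size:
            raise ValueError("data larger than one block")
        # Pad to a full block so every allocated block has the same size.
        self._blocks[block_no] = data.ljust(self.block_size, b"\0")

    def read(self, block_no):
        self._check(block_no)
        # Blocks that were never written read back as zeros and cost nothing.
        return self._blocks.get(block_no, b"\0" * self.block_size)

    @property
    def advertised_bytes(self):
        return self.advertised_blocks * self.block_size

    @property
    def allocated_bytes(self):
        return len(self._blocks) * self.block_size


lun = ThinLun(advertised_blocks=1_000_000)
lun.write(0, b"boot sector")
lun.write(42, b"some data")
print(lun.advertised_bytes, lun.allocated_bytes)
```

In this sketch the server sees roughly 4 GB, while only two blocks are actually backed by storage.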
Linked Clones
Another way to save storage is to use linked clones. The principle of this technique is that it provides one set of data to multiple virtual machines, while keeping track of the differences between them and storing those differences in a separate location. When this is done on the array, the performance impact is negligible.
A physical server can also provide virtual machines with linked clone disks. This is somewhat slower and takes some CPU resources away from the VMs, but it doesn't require an intelligent storage array, which makes it a good alternative.

Figure 3: Linked Clones
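
The linked clone principle can be sketched as a shared read-only base plus a per-clone set of differences. The following Python example is illustrative only: `LinkedClone`, the tuple used as base image and the block contents are hypothetical, and real hypervisors and arrays work on disk blocks with considerably more bookkeeping.

```python
class LinkedClone:
    """Hypothetical linked clone: reads fall through to a shared base image."""

    def __init__(self, base):
        self._base = base  # shared by all clones, never modified
        self._delta = {}   # block number -> data changed by this clone only

    def read(self, block_no):
        # Prefer this clone's own version, otherwise use the shared base block.
        if block_no in self._delta:
            return self._delta[block_no]
        return self._base[block_no]

    def write(self, block_no, data):
        if not 0 <= block_no < len(self._base):
            raise IndexError("block outside base image")
        # Writes never touch the base; they are stored as a difference.
        self._delta[block_no] = data

    def delta_blocks(self):
        return len(self._delta)


base_image = (b"bootloader", b"operating system", b"applications")
vm1 = LinkedClone(base_image)
vm2 = LinkedClone(base_image)
vm1.write(2, b"applications + patch")

print(vm1.read(2))  # vm1 sees its own change
print(vm2.read(2))  # vm2 still sees the shared base block
```

Each clone stores only the blocks it changed, so two virtual machines that differ in one block need one base image plus one extra block rather than two full copies.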
Deduplication
At the moment, deduplication is mainly used in backup scenarios. That means data is first stored on a main storage system and is deduplicated at backup time, either on a separate system or on a different tier of the storage system. It is not yet used on active data, mainly because deduplication is a very computation-intensive process that, at the moment, simply isn't fast enough for modern storage demands.
The deduplication process works by first accepting all data. Then, either inline or in a background process, it compresses the data and checks at block level whether each block already exists. If it does, the system simply stores a pointer to the existing block; if not, the new block is stored. Across multiple backups, this can reduce the backup data size by 50% to as much as 90% compared with a traditional backup data set.

Figure 4: Deduplication
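
The block-level check described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it uses fixed-size blocks and SHA-256 fingerprints to decide whether a block already exists, and it leaves out the compression step. The text does not say how real products compare blocks, so `DedupStore` and its methods are hypothetical.

```python
import hashlib


class DedupStore:
    """Hypothetical block-level deduplicating backup store."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self._blocks = {}   # fingerprint -> block data, each block stored once
        self._backups = {}  # backup name -> ordered list of fingerprints

    def store(self, name, data):
        refs = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            # Known block: keep only a pointer. New block: store it once.
            self._blocks.setdefault(fingerprint, block)
            refs.append(fingerprint)
        self._backups[name] = refs

    def restore(self, name):
        # Follow the pointers to rebuild the original data.
        return b"".join(self._blocks[fp] for fp in self._backups[name])

    def stored_bytes(self):
        return sum(len(block) for block in self._blocks.values())


store = DedupStore()
monday = b"A" * 4096 + b"B" * 4096
tuesday = b"A" * 4096 + b"C" * 4096  # only the second block changed
store.store("monday", monday)
store.store("tuesday", tuesday)
print(len(monday) + len(tuesday), store.stored_bytes())
```

Here two backups of 8 KiB each occupy three blocks instead of four, because the unchanged block is stored only once.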
Archiving
Because a central high-performance storage system can be quite expensive, many companies decide to move less frequently used data to less expensive, high-capacity storage. This is typically done by setting up the storage in multiple tiers. The process can run entirely inside the storage system, which moves data at block level to a slower, and therefore less expensive, set of disks. When the data is accessed again, it is moved back to the fast storage tier. Clients and applications can still access all the data because it stays online at all times.
Another way to archive data is to have a data management solution decide what data to move. This is then done at file, mail or database object level. The advantage of this approach is that it actually moves data out of the systems, possibly leaving a so-called 'stub' behind as a reference for clients and applications. The drawback is that when the data is accessed again, it needs to be restored from another location, which can be a time-consuming process. On the other hand, it significantly reduces the active data size, which in turn can reduce backup time considerably.
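
The stub-based approach can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `TieredStore`, `Stub`, the `archive/` key prefix and the explicit `archive_item` call stand in for whatever policy and storage a real data management product would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stub:
    """Small placeholder left behind in the active store."""
    archive_key: str


class TieredStore:
    """Hypothetical object-level archiving with stubs and recall on access."""

    def __init__(self):
        self.active = {}   # name -> data or Stub (fast, expensive tier)
        self.archive = {}  # archive key -> data (slow, cheap tier)

    def write(self, name, data):
        self.active[name] = data

    def archive_item(self, name):
        item = self.active[name]
        if isinstance(item, Stub):
            return  # already archived
        key = "archive/" + name
        self.archive[key] = item
        self.active[name] = Stub(key)  # clients still find the name

    def read(self, name):
        item = self.active[name]
        if isinstance(item, Stub):
            # Recall: restore from the archive, which is slow in practice.
            item = self.archive.pop(item.archive_key)
            self.active[name] = item
        return item

    def active_bytes(self):
        return sum(len(v) for v in self.active.values()
                   if not isinstance(v, Stub))


store = TieredStore()
store.write("report-2009.doc", b"x" * 1000)
store.write("todo.txt", b"y" * 10)
store.archive_item("report-2009.doc")
print(store.active_bytes())           # only the small file counts as active
print(len(store.read("report-2009.doc")))  # access triggers a recall
```

A backup that only covers the active store would skip the archived report until someone reads it again, which is where the reduction in backup time comes from.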