Wednesday, May 13, 2009

Solid State Drive's MTBF

Using the architectural definitions and modeling tools for the STEC ZeusIOPS wear-leveling algorithms, and assuming that the SLC NAND flash will tolerate exactly 100,000 Program/Erase (P/E) cycles, the math says that the latest version of the 256GB (raw) STEC ZeusIOPS drive, when exposed to a 100% 8KB write workload with a 0% internal cache hit rate at a constant arrival rate of 5,000 IOs per second, will wear out below its rated usable capacity in 4.92 years when configured at 200GB, and in 8.91 years when configured at 146GB (yeah, I was off by .08 years).

Unfortunately, I cannot share the actual data or spreadsheet used to compute these numbers because they contain STEC proprietary information about their architecture and wear-leveling algorithms. So you'll have to trust me on this, and trust that IBM and EMC are in fact using the same STEC drives with the identical wear-leveling algorithms, just formatted at different capacities.

At a 50/50 read/write mix, the projected life of the drive is 9.84 years @ 200GB, and 17.8 years @ 146GB. And for what TonyP asserts is the "traditional business workload" (70% read / 30% write), the projected life expectancy is a healthy 16 years @ 200GB and 30 years @ 146GB.
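If you want to sanity-check the shape of these numbers (not the exact figures, which depend on the proprietary STEC wear-leveling and write-amplification behavior that I can't share), here's a minimal back-of-envelope sketch. The write-amplification values are illustrative assumptions chosen so the model lands near the figures above; they are not STEC's published numbers.

```python
# Back-of-envelope SSD wear-out estimate. The write-amplification (WA)
# values below are illustrative assumptions, NOT STEC's published figures;
# the real numbers depend on the proprietary wear-leveling algorithms.

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_wearout(raw_bytes, pe_cycles, iops, io_bytes,
                     write_fraction, write_amplification):
    """Years until the total NAND program volume exhausts the P/E budget."""
    nand_budget = raw_bytes * pe_cycles                      # total bytes the NAND can absorb
    host_write_rate = iops * io_bytes * write_fraction       # bytes/s written by the host
    nand_write_rate = host_write_rate * write_amplification  # bytes/s actually programmed
    return nand_budget / (nand_write_rate * SECONDS_PER_YEAR)

RAW = 256e9          # 256 GB raw NAND
PE = 100_000         # rated SLC P/E cycles
IOPS = 5000
IO = 8 * 1024        # 8 KB writes

# Assumed write amplification for each over-provisioning level (illustrative):
for usable, wa in [("200GB", 4.0), ("146GB", 2.2)]:
    for wf in (1.0, 0.5, 0.3):   # 100% write, 50/50, 70/30 read/write
        yrs = years_to_wearout(RAW, PE, IOPS, IO, wf, wa)
        print(f"{usable} usable, {int(wf*100)}% writes, WA={wa}: {yrs:.1f} years")
```

With those assumed factors, the output lands close to the figures quoted above, which is really the point: the 146GB versus 200GB difference is just the write-amplification benefit of the extra over-provisioning, and the read/write mix scales the lifetime linearly.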

Now, that's long enough for the drives to be downright ancient - more likely they will have been replaced with newer/faster technology long before the drive is even halfway through its P/E life expectancy under those conditions.

So in the Real World that we all actually live in, nothing is ever 100% write – even database logs (which are not recommended for Flash drives) will not typically generate a 100% constant write workload at max drive IOPS. And the current generation of SLC NAND has been observed to easily exceed 100,000 P/E cycles, so even the above numbers are extremely conservative.

No, the truth is, the difference between the projected life at 146GB and 200GB on a 256GB (raw) ZeusIOPS is truly insignificant...and your data is no more at risk for the expected life of the drive either way.

Unless, of course, your array can't adequately buffer writes or frequently writes blocks smaller than 8KB, which will drive up the write amplification factor...two issues I suspect the DS8K in fact suffers from. Which, of course, would explain why IBM's Distinguished Engineers wouldn't want to take the risk with the DS8K. They don't get to be DEs by leaving things to chance, to be sure.

Symmetrix, on the other hand, isn't subject to these risk factors. Writes are more deeply buffered and delayed by the larger write cache of Symmetrix (DS8K is limited to 4GB or 8GB of non-volatile write cache vs. 80% of 256GB on DMX4 and 80% of 512GB on V-Max). Symmetrix writes are always aligned to the ZeusIOPS' logical page size to minimize write amplification, and the P/E cycles experienced by the NAND in the drive are proactively monitored to enable pre-emptive replacement should a drive exhibit premature or runaway wear-out.
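To see why small or misaligned writes matter so much, here's a simplified worst-case sketch of write amplification when host writes are smaller than, or straddle, the drive's logical page. The 4KB page size is an assumption for illustration only, not the ZeusIOPS' actual page geometry.

```python
# Simplified worst-case write-amplification estimate for unaligned writes.
# Assumes every host write that touches a page forces that whole page to be
# reprogrammed (read-modify-write); real controllers coalesce writes in
# cache, so this is an upper bound, not a measurement of any specific drive.

def pages_touched(offset, size, page):
    """Number of logical pages a write at byte `offset` of `size` bytes spans."""
    first = offset // page
    last = (offset + size - 1) // page
    return last - first + 1

def worst_case_wa(offset, size, page=4096):
    return pages_touched(offset, size, page) * page / size

print(worst_case_wa(0, 8192))        # 8KB aligned write  -> WA 1.0
print(worst_case_wa(2048, 8192))     # 8KB unaligned      -> WA 1.5 (3 pages touched)
print(worst_case_wa(0, 512))         # 512-byte write     -> WA 8.0
```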

Not so the DS8K, apparently…hence the conservative approach.

ZZ from: http://thestorageanarchist.typepad.com/weblog/2009/05/2002-meh-ibm-really-really-doesnt-get-flash.html

Tuesday, April 14, 2009

How to make good use of technology


The following is an interesting view of orphaned storage, or stranded capacity, which will vary depending on how you look at your own data/storage stack:

Storage Capacity Definition

  • Changing RAID levels will impact the delta between raw and usable capacity.
  • Array virtualization will reduce the waste between usable and allocated capacity.
  • Thin provisioning will save the space difference between allocated and used capacity.
  • Deduplication will impact the gap between utilized capacity and application content.
  • Archiving will reduce both allocated and utilized volumes.
  • SRM will help in managing and monitoring all of the areas above.

The right architecture is needed to right-size the total storage estate. Look at short-term and long-term options to improve ROA and overall utilization.

But the ultimate goal is to reduce TCO, so we should strike a balance among aspects such as performance, space efficiency, purchase budget, and maintenance cost.




Wednesday, April 8, 2009

Enterprise Flash Drive Cost and Technology Projections

In January of last year, EMC surprised the IT world with the introduction of Flash drives. In the Wikibon Peer Insight this Tuesday (2/24/2009) we heard that EMC had introduced 200GB and 400GB flash drives, and reduced the price of flash drives relative to disk drives. Other leading vendors such as HDS, IBM, HP and Sun have all introduced flash drives, and most if not all storage vendors have plans to introduce them in 2009.

Flash drives have two major benefits for reducing storage and IT energy budgets.

  1. The ability to perform hundreds of times more I/O than traditional disks and replace large numbers of disks that are I/O constrained. This allows the remaining data that is I/O light to be spread across fewer high-capacity, lower-speed SATA hard drives. The impact is fewer actuators, fewer drives and more efficient storage controllers, leading to lower storage and energy costs.
  2. The ability to increase system throughput by reducing I/O response times. Flash can have a profound effect on workloads that are elapsed-time sensitive. One EMC customer was able to avoid purchasing 1,000 system Z mainframe MIPS and software by reducing batch I/O times with flash drives. Others have placed critical database tables on flash volumes and significantly improved throughput. By eliminating a large proportion of “Wait for I/O” time, Wikibon estimates that between 2% and 7% of processor power and energy consumption can be saved.

In the Peer Insight, Daryl Molitor of JC Penney articulated a clear storage strategy of replacing FC disks with flash, and meeting the rest of the storage requirements with high density SATA disks. Daryl’s objective was to reduce storage costs and energy requirements. This bold strategy begs two questions:

In what time scale will Flash drives replace FC drives?

In June 2008 I wrote a Wikibon article, "Will NAND storage obsolete FC drives?" An update of the projection chart in the original article is shown in Chart 1 below.


It shows that the price of NAND storage is actually coming down at about 60% per year. At this rate of comparative reduction, FC drives will be obsolete in less than three years' time. There is already significant opportunity to move some data to flash drives, and by starting now Daryl is placing himself in a good strategic position.
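To make the "less than three years" arithmetic concrete: if NAND keeps falling about 60% per year relative to FC disk, the crossover date depends only on today's price premium. The starting premiums below are my own illustrative assumptions, not figures taken from Chart 1.

```python
# Years until NAND reaches price parity with FC disk, assuming NAND's
# $/GB falls ~60% per year relative to FC. The starting premiums are
# illustrative assumptions, not numbers taken from Chart 1.

import math

def years_to_parity(premium, relative_decline=0.60):
    # premium * (1 - relative_decline)**t = 1  ->  solve for t
    return math.log(premium) / -math.log(1 - relative_decline)

for premium in (5, 10, 20):
    print(f"{premium}x premium -> parity in {years_to_parity(premium):.1f} years")
```

At roughly a 10x premium, parity arrives in about 2.5 years, which is consistent with the projection above.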

What architectural, infrastructure and/or ecosystem changes must be available to implement this strategy? Some vendors and analysts have predicted that Flash technology will profoundly change the way that systems are designed, leading to flash being implemented in multiple places in the systems architecture. However, such fundamental architectural changes will also require significant changes in the operating systems, database software and even application software to exploit it. Gaining industry agreement to such changes will not happen within three years. Disk drives are currently the standard technology for non-volatile secure access to data and will remain the standard for at least the next three to five years. EMC was right to introduce flash technology as a disk drive as the simplest way to introduce the technology within the current software ecosystem.

That is not to say that technology changes are not required. Vendors and analysts have pointed out that the architectures of all current array systems were not designed to cope with flash storage devices that operate at such low latencies. This leads to limited numbers of flash drives being supported within an array, and less than optimal performance from the flash drives. Vendors are moving to fix this, and this will happen within three years.

The most fundamental architectural change required is to ensure that the right data is placed on flash storage. To begin with, specific database tables and high-activity volumes are being moved to flash drives manually, on an individual basis. The next stage will be to automate the dynamic movement of data to and from flash drives to optimize overall IO performance. A prerequisite is to be able to track I/O activity on blocks of data and hold the metadata. Virtualization architectures will have a head-start in providing the infrastructure to provide monitoring and automated dynamic movement of data blocks.
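The "track block-level I/O activity and move the hot blocks" idea can be sketched very simply. This is a generic illustration of heat-based tiering, not any vendor's actual placement algorithm; the decay factor and flash budget are assumptions.

```python
# Minimal sketch of heat-based block tiering: count recent I/Os per block
# (with exponential decay), then promote the hottest blocks up to a flash
# budget. Purely illustrative - not any vendor's actual placement algorithm.

from collections import defaultdict

class HeatTracker:
    def __init__(self, decay=0.9):
        self.decay = decay
        self.heat = defaultdict(float)   # block id -> decayed I/O count

    def record_io(self, block_id, count=1):
        self.heat[block_id] += count

    def end_interval(self):
        for b in self.heat:
            self.heat[b] *= self.decay   # older activity matters less

    def hottest(self, flash_capacity_blocks):
        """Blocks that should live on flash for the next interval."""
        ranked = sorted(self.heat, key=self.heat.get, reverse=True)
        return set(ranked[:flash_capacity_blocks])

# Usage: feed per-block I/O counts each interval, then migrate the
# difference between hottest() and the blocks currently on flash.
tracker = HeatTracker()
tracker.record_io("lun3:block0042", count=500)
tracker.record_io("lun7:block0010", count=20)
tracker.end_interval()
print(tracker.hottest(flash_capacity_blocks=1))
```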

So which vendors will provide the flash technology that operates efficiently in a storage array and provides automated dynamic (second-by-second) data balancing? At the moment, none can. Clearly EMC have a head-start in understanding the technology, understanding customer usage and understanding the storage array requirements. The storage vendors offering virtualization are also well positioned; Compellent has probably the most versatile architecture with its unique ability to dynamically move data within a storage volume to different tiers, and IBM has broken the 1 million IOPS barrier for an SPC workload with flash storage connected to an IBM SAN Volume Controller (SVC). Other storage virtualization vendors such as 3PAR, NetApp, HP and Hitachi are also well positioned.


Action Item: The race is on. Storage executives should be exploring the use of flash drives for trouble spots in the short term on existing arrays in order to build up knowledge and confidence in the technology. For full scale implementation, storage executives should wait for solutions that provide storage arrays modified to accommodate low-latency flash drives and automated dynamic placement of data blocks to optimize the use of flash.


ZZ From:http://wikibon.org/?c=wiki&m=v&title=Enterprise_Flash_Drive_Cost_and_Technology_Projections

VMWare and how it affects Storage

1: On a VMware server, many guest OSes share one or a few HBAs, so the I/O becomes small-block random I/O; as a result, the requirement for IOPS is higher than the requirement for bandwidth.


"VMWare Changes Everything"


That's a lovely marketing phrase, but when it comes to storage, it does, and it doesn't. What you really need to understand is how VMWare can affect your storage environment, as well as the effects that storage has on your VMWare environment. Once you do, you'll realize that it's really just a slightly different take on what storage administrators have always battled. First, some background.

Some Server Virtualization Facts

  1. The trend of server virtualization is well under way, and it's moving rapidly from test/dev environments into production environments. Some people are implementing in a very aggressive way. For example, I know one company whose basic philosophy is "it goes in a VM unless it absolutely can be proven it won't work, and even then we will try it there first."
  2. While a lot of people think that server consolidation is the primary motivating factor in the VMware trend, I have found that many companies are also driven by Disaster Recovery, since replicating VMs is so much easier than building duplicate servers at a DR site.
  3. 85% of all virtual environments are connected to a SAN, that's down from nearly 100% a short time ago. Why? Because NFS is making a lot of headway, and that makes a lot of sense since it's easier to address some of the VMWare storage challenges with NFS than it is with traditional fiber channel LUNs.
  4. VMWare changes the way that servers talk to the storage. For example, they force the use of more advanced file systems like VMFS. VMFS is basically a clustered file system and that's needed in order to perform some of the more attractive/advanced things you want to do with VMWare like VMotion.

Storage Challenges in a VMWare Environment

  1. Application performance is dependent on storage performance. This isn't news for most storage administrators. However, what's different is that since VMWare can combine a number of different workloads all talking through the same HBA(s), the workload as seen by the storage array turns into a highly random, usually small-block I/O workload. These kinds of workloads are typically far more sensitive to latency than to bandwidth. Therefore the storage design in a VMWare environment needs to be able to provide for this type of workload across multiple servers. Again, something that storage administrators have done in the past for Exchange servers, for example, but on a much larger scale.
  2. End-to-end visibility from VM to physical disk is very difficult to obtain for storage admins with current SRM software tools. These tools were typically designed with the assumption that there was a one-to-one correspondence between a server and the application that ran on that server. Obviously this isn't the case with VMWare, so reporting for things like chargeback becomes a challenge. This also affects troubleshooting and change management, since the clear lines of demarcation between server administration and storage administration are now blurred by things like VMFS, VMotion, etc.
  3. Storage utilization can be significantly decreased. This is due to a couple of factors, the first of which is that VMWare requires more storage overhead to hold all of the memory, etc. so that it can perform things like VMotion. The second reason that VMWare uses more storage is that VMWare admins tend to want very large LUNs assigned to them to hold their VMFS file systems and to have a pool of storage that they can use to rapidly deploy a new VM. This means that there is a large pool of unused storage sitting around on the VMWare servers waiting to be allocated to a new VM. Finally, there is a ton of redundancy in the VMs. Think about how many copies of Windows are sitting around in all those VMs. This isn't new, but VMware sure shows it to be an issue.

Some Solutions to these Challenges

As I see it there are three technical solutions to the challenges posed above.

  1. Advanced storage virtualization - Things like thin provisioning to help with the issue of empty storage pools on the VMWare servers. Block storage virtualization to provide the flexibility to move VMWare's underlying storage around to address issues of performance, storage array end of lease, etc. Data de-duplication to reduce the redundancy inherent in the environment (see the dedup sketch just after this list).
  2. Cross domain management tools - Tools that have the ability to view storage all the way from the VM to the physical disk and to correlate issues between the VM, server, network, SAN, and storage array are beginning to come onto the market and will be a necessary part of any successful large VMWare rollout.
  3. Virtual HBAs - These are beginning to make their way onto the market and will help existing tools to work in a VMWare environment.
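As noted in the first item above, here is a minimal sketch of block-level deduplication by content hash: identical guest-OS blocks across many VMs collapse to a single stored copy. The fixed 4KB block size and SHA-256 are my assumptions for illustration; real dedup engines use variable-length chunking, on-disk fingerprint indexes, and so on.

```python
# Minimal block-level dedup sketch: store each unique 4KB block once,
# keyed by its SHA-256 digest. Illustrative only; not any product's engine.

import hashlib

BLOCK_SIZE = 4096          # assumed fixed block size
store = {}                 # digest -> block data (the single stored copy)
lun_map = {}               # (lun, lba) -> digest

def write_block(lun, lba, data):
    digest = hashlib.sha256(data).hexdigest()
    store.setdefault(digest, data)       # only the first copy is kept
    lun_map[(lun, lba)] = digest

def read_block(lun, lba):
    return store[lun_map[(lun, lba)]]

# Twenty VMs writing the same guest-OS block consume one stored block:
windows_block = b"\x90" * BLOCK_SIZE
for vm in range(20):
    write_block(f"vm{vm}", 0, windows_block)
print(len(store), "unique block(s) stored for 20 logical copies")
```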

Conclusion

Organizations need to come to the realization that with added complexity comes added management challenges, and that cross-domain teams that encompass VMWare Admins, Network Admins, and SAN/Storage Admins will be necessary in order for any large VMWare rollout to be successful. However, the promise of server virtualization to reduce hardware costs and make Disaster Recovery easier is just too attractive to ignore for many companies, and the move to server virtualization over the last year shows that a lot of folks are being drawn in. Unfortunately, unless they understand some of the challenges I outlined above, they may be in for some tough times and learn these lessons the hard way.

--joerg

ZZ From: http://joergsstorageblog.blogspot.com/2008/06/vmware-and-how-it-effects-storage.html

IBM XIV Could Be Hazardous to Your Career

XIV's dual-drive-failure exposure is a real weakness. Drives are only going to get bigger, rebuild times will get longer, and the probability of two drives failing is just a bit too high. If the 180 drives could be organized into something like RAID-6, it would probably be much safer...

So, I haven't blogged in a while. I guess I should make all of the usual excuses about being busy (which is true), etc. But the fact of the matter is that I really haven't had a whole heck of a lot that I thought would be of interest, certainly there wasn't a lot that interested me!

But now, I have something that really gets my juices flowing. The new IBM XIV. I don't know if you've heard about this wonderful new storage platform from the folks at IBM, but I'm starting to bump into a lot of folks that are either looking seriously at one, or have one, or more, on the floor now. It's got some great pluses:

  • It's dirt cheap. On top of that, I heard that IBM is willing to do whatever it takes on price to get you to buy one of these boxes, to the point that they are practically giving them away. And, as someone I know and love once said "what part of free, isn't free"?
  • Fiber channel performance from a SATA box. I guess that's one of the ways that they are using to keep the price so low.
  • Tier 1 performance and reliability at a significantly lower price point.

So, that's the deal, but like with everything in this world, there's no free lunch. Yes, that's right, I hate to break it to you folks, but you really can't get something for nothing. The question to ask yourself is, is the XIV really too good to be true? The answer is yes, it is.

But the title of this blog is pretty harsh, don't you think? Well, I think that once you understand that the real price you are paying for the "almost free" XIV could be your career, or at least your job, then you might start to understand where I'm coming from. How can that be? Well, I think that in most shops, if you are the person who brought in a storage array which eventually causes a multi-day outage in your most critical systems, your job is going to be in jeopardy. And that's what could happen to you if you buy into all of the above from IBM regarding the XIV.

What are you talking about Joerg?!? IBM says that the XIV is "self healing", and that it can rebuild the lost data on a failed drive in 30 minutes or less. So how can what you said be true? Well folks, here's the dirty little secret that IBM doesn't want you to know about the XIV. Due to its architecture, if you ever lose two drives in the entire box (not a shelf, not a RAID group, the whole box all 180 drives) within 30 minutes of each other, you lose all of the data on the entire array. Yup, that's right, all your tier 1 applications are now down, and you will be reloading them from tape. This is a process that could take you quite some time, I'm betting days if not weeks to complete. That's right, SAP down for a week, Exchange down for 3 days, etc. Again, do you think that if you brought that box in, after something like that your career at this company wouldn't be limited?

So, IBM will tell you that the likelihood of that happening is very small, almost infinitesimal. And they are right, but it's not zero, so you are the one taking on that risk. Here's another thing to keep in mind. Studies done at large data centers have shown that disk drives don't fail in a completely random way. They actually fail in clusters, so the chances of a second drive failing within the 30-minute window after that first drive failed are actually a lot higher than IBM would like you to believe. But, hey, let's keep in mind that we play the risk game all the time with RAID-protected arrays, right? But the big difference here is that the scope of the data loss is so much greater. If I lose two drives in a 4+1 RAID-5 group, I'm going to lose some LUNs, and I'm going to have to reload that data from tape. However, it's not the entire array! So I've had a much smaller impact across my Tier 1 applications, and the recovery from that should be much quicker. With the XIV, all my Tier 1 applications are down, and they have to all be reloaded from tape.
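To put rough numbers on that risk, here's a simple estimate of a second drive failing during a 30-minute rebuild window across the remaining 179 drives, first assuming independent failures and then applying a crude "clustering" multiplier. The 3% annual failure rate and the multipliers are illustrative assumptions, not measured XIV figures; the data-center studies report correlation but not a single universal multiplier.

```python
# Rough estimate of a second drive failure within XIV's ~30-minute rebuild
# window, across the remaining 179 drives. The 3% AFR and the clustering
# multipliers are illustrative assumptions, not measured XIV figures.

HOURS_PER_YEAR = 24 * 365
AFR = 0.03                    # assumed annual failure rate per SATA drive
REMAINING_DRIVES = 179
WINDOW_HOURS = 0.5            # 30-minute rebuild window

failure_rate_per_hour = AFR / HOURS_PER_YEAR
p_one_drive = failure_rate_per_hour * WINDOW_HOURS
p_any_second = 1 - (1 - p_one_drive) ** REMAINING_DRIVES

print(f"Independent failures: {p_any_second:.2e} per first-failure event")
for cluster_factor in (1, 5, 10):     # crude allowance for correlated failures
    print(f"  with {cluster_factor}x clustering: {p_any_second * cluster_factor:.2e}")
```

Small, yes, but not zero, and clustering pushes it up; and unlike a single RAID group, the blast radius when it does hit is the entire array.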

Just so you don't think that I'm entirely negative about the XIV, let me say that what I really object to here is the use of an XIV with Tier 1 applications or even Tier 2 applications. If you want to use one for Tier 3 applications (i.e. archive data) I think that makes a lot of sense. Having your archive down for a week or two won't have much in the way of a negative impact on your business, unlike having your Tier 1 or Tier 2 applications down. The one exception to that I can think of is VTL. I would never use an XIV as the disks behind a VTL. Can you imagine what would happen if you lost all of the data in your VTL? Let's hope that you have second copies of the data!

Finally, one of the responses from IBM to all of this is "just replicate the XIV if you're that worried". They're right, but that doubles the cost of storage, right?

ZZ From:http://joergsstorageblog.blogspot.com/2009/01/ibm-xiv-could-be-hazardous-to-your.html

xiv does hitachi math with roman numerals

It seems you really have to scrutinize published benchmark numbers carefully, especially XIV's write-nothing-but-zeroes testing method, which is truly creative.
XIV random IOPS at 20ms response time: cache-miss IOPS max out at roughly 22,000 - 27,000...



I almost didn’t believe it.

And I still wouldn’t, if it wasn’t corroborated from several sources.

I’ve been told that there are actually people trying to sell XIV to unsuspecting prospects using good old Hitachi Math.

That’s right. Hitachi Math. That “modernistic form of algebra that arrives at irreproducible results that also have the unique property of having absolutely no bearing on reality” that I’ve talked about here on numerous occasions. That same whacky logic that Hitachi has been using for years to mislead us all about how many meel-yun IOPS a USP can do by counting reads serviced exclusively from the buffers on the front-end Fibre Channel ports – a totally meaningless statistic.

Apparently, there are at least some who sell XIV arrays that are willing to stoop to these same lows in their quest to unseat the competition and gain footprint.

I guess given the growing market comprehension of the inarguable space and power inefficiencies of XIV’s “revolutionary” approach, coupled with the forced admissions that simultaneous dual drive failures in two separate XIV drive bays are indeed fatal and the growing realization that just because Moshe was there for the dawn of the Symmetrix era doesn’t make him all-powerful (nor the parent of today’s DMX)…well, I guess this all has proven just too much to overcome with IBM’s vaunted “trusted partner” approach to sales.

Nope, you won’t get no vendor bashing from those guys, just plain unadulterated crap-ola. When the facts get in the way, all you can do is lead with what you do best, I guess.

But I never would have guessed that anyone would attempt Hitachi Math using roman numerals.

Apparently it has been done.

those iops aren’t real iops

Several sources have identified the sleight of hand that has been pulled by more than one XIV sales representative in various accounts. The trick centers around the seemingly simple demonstrations of the performance of the XIV array in random I/O workloads.

Given the architecture of XIV, where every host volume is actually spread across 180 1TB SATA drives in “chunks” of 1MB, sequential I/O performance can be expected to be pretty good in an XIV - as it should be with any such wide-striped configuration.

But logic says that running a reasonably sized random I/O workload against an XIV should quickly exceed the capability of cache to mask the slowness of the 7200 rpm SATA drives it uses. Sooner or later random workloads will overpower cache and start forcing read misses and write destages to meet the I/O demands.

However, many customers report that XIV pre-sales have demonstrated IOPS and response times for random workloads that exceed all logic. In fact, to anyone who understands storage performance, the results have seemed outright too good to be true.

Results like these usually set alarms off in the minds of skeptics and cynics. People started digging into these results, and were shocked at what they found.

They’d been scammed.

Here’s how: Apparently, the standard XIV performance demonstration uses the popular iometer workload generation and measurement tool (why they don’t use an SPC workload is beyond me, but that’s a story for another day). Only here’s the twist: the version and configuration of iometer used for the XIV demo has been carefully tuned to write (and read) data blocks that are entirely zero-filled – blocks of null data.

No big deal, right? I mean, it takes the same amount of time to write a block of all zeroes as it does to write a block with any other combinations of 1’s and 0’s, right?

Wrong!

At least, not on an XIV storage system.

One of the (apparently overlooked) features of the XIV storage system is that it defines the default state of all untouched/unwritten/unmodified data to be all-zeroes. And it checks incoming writes to see if they contain any 1’s, and if they DON’T, then the XIV storage system DOESN’T WRITE THE DATA, neither to disk nor to cache. And it doesn’t have to, in fact, because the data is already zero!

Similarly, reads of any unmodified data blocks don't require actually reading data off the disk – a quick check of the LUN-to-block mapping tables, and if the block is either unallocated or not yet modified, a buffer of zeroes can be returned without that annoying wait for the disk drive to actually retrieve the data.

Get the picture?

Run iometer set to write only zeros against an XIV array, and you’ll get incredibly high IOPS with amazingly low response times – because the array never has to actually move data to or from the disk drives!
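A toy model of the behavior being described (as I understand it from these reports) looks like the sketch below: the front end checks whether an incoming block is all zeroes and, if so, only updates the mapping table, never touching cache or disk. This is an illustration of the concept, not IBM's actual code.

```python
# Toy model of zero-detection on the write path: all-zero blocks only
# update the mapping table; nothing is written to cache or disk. A sketch
# of the behavior described above, not IBM's actual implementation.

ZERO = object()               # marker: block is logically zero / never written
block_map = {}                # lba -> ZERO or the bytes actually stored
physical_writes = 0           # writes that actually reach cache/disk

def write(lba, data: bytes):
    global physical_writes
    if not any(data):                 # every byte is 0x00
        block_map[lba] = ZERO         # fast path: bookkeeping only
        return
    physical_writes += 1              # slow path: real destage
    block_map[lba] = data

def read(lba, block_size=8192):
    stored = block_map.get(lba, ZERO)
    if stored is ZERO:
        return bytes(block_size)      # synthesize zeroes, no disk access
    return stored

# An iometer-style run writing only zero-filled blocks never touches "disk":
for lba in range(100_000):
    write(lba, bytes(8192))
print("physical writes:", physical_writes)   # -> 0
```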

UPDATED 21 Jan, 2009: An acquaintance emailed me about a similar ploy he witnessed performed by XIV representatives. Instead of iometer, he was shown how fast an XIV array could write data, using the UNIX dd command to copy from /dev/zero to the XIV array.

Same trick, different tool.

In effect, you’re simply measuring how fast the front-end nodes can figure out that a block is all zeroes.

Hitachi Math, XIV style.

life in the real world

Now, it’s a cute trick, you gotta admit – especially if you didn’t fall prey to the trickery.

Needless to say, I seriously doubt that there are many practical applications that routinely create mounds of zero-filled I/O blocks. And the people who explained this misdirection to me noted that when you ran the same tests using iometer configured to write (and read) non-zero data, random IOPS fell back in line with what someone experienced in storage performance would expect.

Actually, much slower. It seems that for all the hype, an XIV storage array is only able to deliver somewhere between 22,000 and 27,000 cache miss IOPS maximum (dependent upon block size and referential locality). For perspective, that’s a fraction of what a CLARiiON or a Symmetrix DMX4 can deliver from 180 drives, whether they’re SATA or FC-based (XIV only supports the slower SATA drives), and whether they’re spinning hard drives or Enterprise Flash Drives.
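For what it's worth, that 22,000-27,000 range is almost exactly what simple spindle arithmetic predicts for 180 7200 rpm SATA drives; the per-drive IOPS figures below are common rules of thumb, not XIV measurements.

```python
# Back-of-envelope: aggregate random cache-miss IOPS from 180 SATA spindles.
# The 120-150 IOPS-per-drive figures are typical rules of thumb for 7200 rpm
# drives, not XIV measurements.

DRIVES = 180
for per_drive_iops in (120, 150):
    print(f"{per_drive_iops} IOPS/drive -> ~{DRIVES * per_drive_iops:,} aggregate IOPS")
# 120 IOPS/drive -> ~21,600 ; 150 IOPS/drive -> ~27,000
```

In other words, once cache stops helping, you get pretty much the spindle count times what a single 7200 rpm drive can do.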

See, there is no free lunch when it comes to storage performance. Sure, wide striping can deliver a lot of IOPS in aggregate. But in the end, when you need a specific block of data off of a SATA drive, and that block isn't in cache, you're going to have to wait for the disk to get the data (unless, of course, it's all zeroes). A 7200 rpm SATA drive simply cannot deliver data as fast as a 10K or 15K rpm disk drive can – not to mention the sub-millisecond response times you can get from high-performance Enterprise Flash Drives.

So you’re going to wait.

Oh, and speaking of wide striping. Another reported application of Hitachi Math with Roman Numerals is to compare test results of 8 LUNs on an XIV vs. 8 LUNs on a Symm or a CLARiiON. But the XIV team will insist on comparing XIV’s “thin and wide” implementation against the standard “fat” allocation scheme of its competitors. For the Symm, this effectively constrains the test to using 64 drives on the Symm vs. 180 on the XIV – hardly a fair comparison. A much more accurate comparison would be to put 180 drives in both arrays, and to use Virtual Provisioning on the Symmetrix against a pool formed of all 180 drives.

you gotta wonder why?

When I started posting about the technical inefficiencies and data risks of the XIV approach to storage, I was accused by many of being scared of XIV. And even now, when other bloggers have begun to raise awareness about these same XIV issues (most recently HDS’s Claus Mikkelsen and Dimitris over at Recovery Monkey) there are those who claim it’s all just FUD and competition bashing.

Hardly.

No, we’re all just shining the lights on the issues so that people can make informed decisions – pointing out things that XIV sales folks don’t find important to share with prospects for some reason.

Things I encourage you to ask your IBM sales team to come clean on. Don’t take my word for these things – make IBM come clean with answers to the operational and technical flaws of the XIV storage systems that I discussed last year (here and there) starting even before IBM officially launched the product (did they ever)?

But first, put aside my motivations and those of Claus and Dimitris for the moment…

Ask yourself why someone selling XIV storage systems would resort to such a deplorable application of Hitachi Math as to mislead prospects about the performance of their product using a hacked up version of iometer to generate an outlandishly fake IO workload?

If the XIV array was as fast as it has been claimed, why would anyone ever have to resort to such tactics?

And if they’re offering you the XIV storage system for free, be sure you understand their motivation. Attractive as it may seem in today’s economy, we all know that there is no such thing as “free storage” in this world. No matter what promises they make, you know for sure that you’re going to have to pay for that XIV version of “free” sooner or later.

With a lot of luck, your price won’t include the unrecoverable loss of your data.

And unless your data is made up of nothing but 0’s,
with an XIV storage system, sooner or later you ARE going to pay…dearly!

ZZ from: http://thestorageanarchist.typepad.com/weblog/2009/01/1037-xiv-does-hitachi-math-with-roman-numbers.html

Wide striping is a two edged sword

When the XIV sales rep tries to sweet-talk you into believing that XIV can deliver the same IOPS as an FC array, always remember: that is achieved at the expense of storage utilization.

I have spent a lot of time lately talking with some of my coworkers, friends, etc. on the topic of wide striping. This topic keeps coming up since there are now a number of vendors selling storage arrays with SATA drives that claim to have "the same performance as fiber channel". Some of the Sales folks I work with keep asking how we are supposed to dissuade people from that idea, or if it's true. One of the prime offenders in this regard is IBM with their new XIV array. The XIV uses wide striping and SATA drives and they claim to have "enterprise performance" at a very low price point. But they aren't the only ones; you have Dell telling people the same thing about their EqualLogic line of storage as well, and there are others too. For an excellent article about the XIV and its performance claims, take a look at http://thestorageanarchist.typepad.com/weblog/2009/01/1037-xiv-does-hitachi-math-with-roman-numbers.html.

What I usually tell them is that the statement is true; you can get fiber channel performance by striping across a large number of SATA drives. The only problem is that you have to give up a lot of usable disk space in order to keep it that way. A quick example usually illustrates the point quite well. Let's say that for the sake of easy math the average application in your environment uses about 5TB of space (I'm sure some are a lot more, and some a lot less, but we are talking average here). Let's also say that you need about 2,000 IOPS per application in order to maintain the 20ms max response time you need in order to meet the SLAs you have with your customers. Finally, let's also assume that your SATA array has about 90TB of usable space using 180 750GB SATA drives and you can get about 20,000 IOPS in total from the array. So, let's do some basic math here. That means that you can run about 10 applications at 5TB apiece, which will take up about 50TB. So, your array will perform well, right up until you cross the ½-full barrier. After that, performance will slowly decline as you add more applications/data to the array.
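Here's the same arithmetic as a quick sketch, using the round numbers from the example above (5TB and 2,000 IOPS per application, 90TB usable, 20,000 array IOPS):

```python
# Worked version of the example above: the array runs out of IOPS long
# before it runs out of capacity, stranding roughly half the usable space.

usable_tb = 90
array_iops = 20_000
app_tb = 5
app_iops = 2_000

apps_by_capacity = usable_tb // app_tb        # 18 apps if space were the limit
apps_by_iops = array_iops // app_iops         # 10 apps before response time suffers

apps = min(apps_by_capacity, apps_by_iops)
used_tb = apps * app_tb
print(f"IOPS limit: {apps_by_iops} apps; capacity limit: {apps_by_capacity} apps")
print(f"Practical fill level: {used_tb} TB of {usable_tb} TB "
      f"({100 * used_tb / usable_tb:.0f}%)")
```

The array tops out at 10 applications with roughly 50TB of the 90TB actually in use, which is where the half-full barrier comes from.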

So, what does this mean? It means that the cost per GB of these arrays is really about twice what the vendors would have you believe. OK, but considering how much cheaper SATA drives are than 15K fiber channel drives, that's still OK, right? Sure, as long as you are willing to run your XIV at ½ capacity. In today's economic climate, that's going to be tough to do. I can just imagine the conversation between your typical CIO and his Storage Manager.

Storage Manager – "I need to buy some more disk space."

CIO – "What are you talking about, you're only at 50% used in theses capacity reports you send me and we didn't budget for a storage expansion in the first year after purchase!"

Storage Manager – "Well, you know all that money we are saving by using SATA drives? Well, it means I can't fill up the array; I have to add space once I reach 50% or performance will suffer."

CIO – "So let performance suffer! We don't have budget for more disk this year. Why didn't you tell me this when you came to me with that 'great idea' of replacing our 'enterprise' arrays with a XIV?!?!"

Storage Manager – "Ahhh … ummmmm … gee, I didn't know, IBM didn't tell me! But we had some performance issues early on, and figured this out. Do you really want to tell the SAP folks that their response time is going to double over the next year?"

CIO – "WHAT! We can't let that happen, we have an SLA with the SAP folks and my bonus is tied to keeping our SLAs! How could you let something like this happen! Maybe I should use the money for your raise to pay for the disks!"

Storage Manager – "Um, well, actually, we need to buy an entire new XIV, the one we have is already full."

OK, enough fun, you get the idea … make sure you understand what wide striping really buys you and if you decide that the TCO and ROI make sense, make sure you communicate that up the management tree in the clearest possible terms. Look at the applications that you currently run, see how much space they require, but don't base the sizing of your EqualLogic (see, I'm not just bashing the XIV) just on your space requirements. Base them more on your IOPS requirements. With SATA drives chances are pretty good that if you size for IOPS, you'll have more than enough space.


ZZ from:http://joergsstorageblog.blogspot.com/2009/01/wide