An I/O booster for big data? That’s the question! – Part 2

In part 1 of this blog post, we talked about the need for fast communications in a smarter architecture and used the example of the postal industry to look at how communication delays have been reduced over time. In this second part, we will see how the same type of evolution applies to big data and the input/output (I/O) world.

In the same way that we defined orders of magnitude for transporting our packages through the postal service, we define here the typical orders of magnitude for I/O access time.

I/O time improvement with recent technologies

The conventional units for measuring I/O time are the millisecond (ms) and the microsecond (µs), where 1 ms = 1,000 µs. Let’s assume that the processor needs about 100 µs to initiate the I/O request and about 100 µs to interpret the result once the storage array has sent the information back. We’ll also assume that the data are on an optimized storage area network (SAN) with optical fiber links. The remaining component is the service time: the time required to bring the information stored on disk back to the processor.

As a rule of thumb, response times beyond 10 ms (10,000 µs) are considered unacceptable to the application. In practice, a storage array with serial-attached SCSI (SAS) disks provides access to records in roughly 5 ms (5,000 µs).

We can summarize the orders of magnitude commonly observed in customer production environments as follows:

| Storage | I/O request by CPU (µs) | Service time (µs) | I/O processed by CPU (µs) | Total I/O time (µs) |
|---|---|---|---|---|
| SAS | 100 | 5,000 | 100 | 5,200 |

These elements are of course visible and measurable at:

  • Infrastructure level, processor and disks (system administrator)
  • Application level (database administrator)

To reduce this time, solid-state drive (SSD) flash devices have emerged, and I have already described in a previous blog post how these devices are used in IBM hardware. In that setup, fast SSDs mixed with cheaper conventional SAS disks increase the performance of the array through automated tiering.
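As a rough illustration of what tiering buys, the average service time of a mixed array can be sketched as a weighted average of the two tiers. This is a simplified model, not a description of how any particular tiering product behaves: the function name and the `ssd_hit_fraction` values are hypothetical, and the 5,000 µs (SAS) and 1,000 µs (SSD) service times are simply the orders of magnitude used in this post.

```python
# Orders of magnitude used in this post, in microseconds.
SAS_SERVICE_US = 5_000
SSD_SERVICE_US = 1_000


def blended_service_time_us(ssd_hit_fraction: float) -> float:
    """Average service time when a fraction of I/Os is served from SSD.

    ssd_hit_fraction is a hypothetical input: the share of I/O requests
    that land on the SSD tier (0.0 = all SAS, 1.0 = all SSD).
    """
    if not 0.0 <= ssd_hit_fraction <= 1.0:
        raise ValueError("ssd_hit_fraction must be between 0 and 1")
    return (ssd_hit_fraction * SSD_SERVICE_US
            + (1.0 - ssd_hit_fraction) * SAS_SERVICE_US)


# Illustrative fractions only; real hit rates depend on the workload.
for fraction in (0.0, 0.5, 0.8, 1.0):
    average = blended_service_time_us(fraction)
    print(f"SSD share {fraction:.0%}: about {average:,.0f} µs per I/O")
```

The closer the hot data fits into the SSD tier, the closer the array gets to the all-SSD figure.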

However, assuming that we put all the data on SSD, the service time would be on the order of 1 ms (1,000 µs).

Thus this scenario would improve our table as follows:

| Storage | I/O request by CPU (µs) | Service time (µs) | I/O processed by CPU (µs) | Total I/O time (µs) | Improvement ratio |
|---|---|---|---|---|---|
| SAS | 100 | 5,000 | 100 | 5,200 | 1x |
| SSD | 100 | 1,000 | 100 | 1,200 | 5x |

This is a solid result and sufficient in many cases, but what solution would change the order of magnitude? How do we build an I/O booster?

How do we get a breakthrough?

Texas Memory Systems (TMS) is a strong leader in the field of flash components. IBM’s recent acquisition of this company brings a new range of products: the IBM FlashSystem family.

IBM FlashSystem

These devices allow a service time of 200 µs, so here is our breakthrough!

Look at how this improves our table:

| Storage | I/O request by CPU (µs) | Service time (µs) | I/O processed by CPU (µs) | Total I/O time (µs) | Improvement ratio |
|---|---|---|---|---|---|
| SAS | 100 | 5,000 | 100 | 5,200 | 1x |
| SSD | 100 | 1,000 | 100 | 1,200 | 5x |
| IBM FlashSystem | 100 | 200 | 100 | 400 | 20x |
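The arithmetic behind the table can be checked with a short sketch. It assumes the model used in this post (total I/O time = CPU request time + service time + CPU processing time); the function and variable names are illustrative only. It prints two ratios against SAS, one based on total I/O time and one based on service time alone, since the two differ once the fixed 200 µs of CPU time is included.

```python
# Fixed CPU-side costs assumed in this post, in microseconds.
CPU_REQUEST_US = 100
CPU_PROCESS_US = 100

# Service times from the table, in microseconds.
SERVICE_TIME_US = {
    "SAS": 5_000,
    "SSD": 1_000,
    "IBM FlashSystem": 200,
}


def total_io_time_us(service_us: int) -> int:
    """Total I/O time = request + service + processing."""
    return CPU_REQUEST_US + service_us + CPU_PROCESS_US


baseline_service = SERVICE_TIME_US["SAS"]
baseline_total = total_io_time_us(baseline_service)

for name, service in SERVICE_TIME_US.items():
    total = total_io_time_us(service)
    total_ratio = baseline_total / total
    service_ratio = baseline_service / service
    print(f"{name:<16} total={total:>5} µs  "
          f"total ratio={total_ratio:.1f}x  service ratio={service_ratio:.1f}x")
```

With these inputs, the ratio on total I/O time comes out at about 4.3x for SSD and 13x for FlashSystem, while the ratio on service time alone is 5x and 25x. The 5x and 20x in the table are best read as rounded orders of magnitude within that range.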

Recently, with one of my clients, I deployed an IBM FlashSystem machine directly into production on an Oracle database and I was amazed! The figures we measured matched those shown for FlashSystem in the table above.

For example, the run time of an I/O-intensive process (99 percent I/O wait) decreased from 4,000 seconds to 200 seconds, an improvement of 20x. Obviously I/O is “boosted” and the processors are kept busier, which significantly shortens your processing time.
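A back-of-the-envelope check shows why roughly 20x is plausible for such a job. The sketch below applies Amdahl's law under two assumptions that are not measurements from the client system: that all of the I/O wait is storage service time, and that this service time drops from 5,000 µs to 200 µs (a 25x speedup), as in the table. The function name is illustrative only.

```python
def runtime_after_io_speedup(runtime_s: float,
                             io_fraction: float,
                             io_speedup: float) -> float:
    """Amdahl's law: only the I/O share of the run time gets faster."""
    cpu_part = runtime_s * (1.0 - io_fraction)       # unchanged
    io_part = runtime_s * io_fraction / io_speedup   # accelerated
    return cpu_part + io_part


original_runtime_s = 4_000.0   # run time quoted in this post
io_fraction = 0.99             # 99 percent I/O wait, as quoted
io_speedup = 5_000 / 200       # assumption: service-time ratio SAS vs. FlashSystem

new_runtime_s = runtime_after_io_speedup(original_runtime_s,
                                         io_fraction, io_speedup)
overall_speedup = original_runtime_s / new_runtime_s

print(f"Estimated run time: {new_runtime_s:.1f} s")
print(f"Overall speedup:    {overall_speedup:.1f}x")
```

Under these assumptions the estimate lands just under 200 seconds, about 20x overall, which is consistent with the observed result. The 1 percent of the run time that is not I/O is what keeps the overall gain below the 25x service-time ratio.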

Besides the improvement in performance, the implementation is remarkably simple: the device is inserted into the I/O path. With mirroring techniques, this can be transparent for critical applications. In our case we implemented it in only four hours.

Unlike with traditional storage devices, you don’t have to change your entire infrastructure to tie in with one manufacturer’s box. You keep your existing storage infrastructure, and advanced replication or copy functions continue to be performed by the existing arrays.

IBM FlashSystem is a 1U unit that is easy to place in the data center and economical in terms of space, energy and price per gigabyte compared to SAS and SSD solutions.

Conclusion

We are at the beginning of the extensive use of IBM FlashSystem devices. They can be easily combined with your current architectures to drastically reduce your response time, and they are more than complementary to the well-known SSD solutions. This highly innovative approach fully meets cloud and smarter computing requirements.

Honey, I Shrunk the Kids is a movie. “Honey, I shrunk the I/O” is now a reality!

Please ask a representative for a chance to try this wonderful technology. You can also tweet any comments or experimental results to me @philip7787.


Philippe Lamarche has been an IBM Systems Architect in the hardware division (STG) since 1995, working with French industry customers and system integrators. He has spent over 30 years at IBM in different technical positions. In his technical presales role, he is a Certified IT Specialist at expert level. You can reach him on Twitter: @philip7787.

This entry was posted in Big Data, Smarter Storage.