High Performance Computing

High Performance Computing (HPC) is essential to many engineering and science activities. Whether the field is energy research, defense, automotive, oil & gas exploration or semiconductor design, the problem requires many processors to be harnessed to work on a common, very large data set.

Violin Memory appliances hold hot content so that storage can keep up with the petaflop-scale compute capacity of today’s supercomputers. Hard drives will continue to provide the bulk capacity needed for these applications, but Violin provides key technical benefits that can dramatically improve the effective performance of an HPC environment:

  • Random Access Bandwidth: Disks have excellent characteristics for streaming data (around 100 MByte/sec). However, they can slow dramatically (to around 1 MByte/sec) if the disk accesses are random or multiple CPUs are simultaneously accessing the same disks. Memory appliances can sustain over 1 GByte/sec of random accesses, up to 1000 times faster than a drive under the same random workload.
  • Latency: Disk accesses typically take 5 to 10 milliseconds. During this time, the CPUs often sit idle or run inefficiently while waiting for data. Violin reduces access latency to microseconds, which increases CPU efficiency for some applications.
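The gap between the streaming and random figures above follows from simple arithmetic: every random request pays the full access latency before any data moves. The sketch below is a rough model, not a measurement. The function name, the 8 ms access time (chosen from within the 5 to 10 millisecond range quoted above), and the 8 KByte and 64 MByte request sizes are assumptions made for illustration.

```python
def effective_bandwidth_mb_s(request_bytes, access_latency_s, streaming_mb_s):
    """Approximate sustained bandwidth (MByte/sec) when every request
    pays a fixed access latency before transferring at the streaming rate."""
    transfer_s = request_bytes / (streaming_mb_s * 1e6)
    return request_bytes / (access_latency_s + transfer_s) / 1e6


# Assumed disk parameters: 8 ms per access, 100 MByte/sec streaming rate.
DISK_LATENCY_S = 8e-3
DISK_STREAM_MB_S = 100.0

# Small random requests (assumed 8 KByte each): latency dominates.
disk_random = effective_bandwidth_mb_s(8 * 1024, DISK_LATENCY_S, DISK_STREAM_MB_S)

# Large sequential requests (assumed 64 MByte each): transfer time dominates.
disk_sequential = effective_bandwidth_mb_s(64 * 1024 * 1024, DISK_LATENCY_S, DISK_STREAM_MB_S)

print(f"random 8 KByte requests:      {disk_random:.2f} MByte/sec")
print(f"sequential 64 MByte requests: {disk_sequential:.2f} MByte/sec")
```

Under these assumptions the small random requests land close to 1 MByte/sec while the large sequential requests stay close to the 100 MByte/sec streaming rate, which is the same hundredfold gap described in the bullet above.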


Violin appliances can be deployed in either of two models:

  • Centralized: The hot data for the applications is housed in a set of Violin appliances accessible by all processors.
  • Distributed: Each rack of equipment can be allocated a Violin appliance. This can hold "hot" shared data and also be used for scratch space and to assist rapid checkpointing and booting of applications.

Is sequential storage performance all you need?

Some well-written applications have managed to structure their data in large files that require only sequential access. This is often true of simplistic benchmarks, which do not have to cope with real data. There are many situations, however, where sequential access to the storage is not possible:

Fine-grained data: Where data objects are small (under 1 MByte) and may be retrieved independently, sequential disk access is difficult to achieve and lower latency would benefit application performance. Metadata is the most common example.

Shared data: Where many CPUs are accessing the same data, the disks can thrash as each CPU asks for a different part of the data set. Caching this data in network-attached memory is preferred.

Multi-dimensional data: Where data is stored in sequence in one dimension, but may also be accessed via another dimension, the accesses are not sequential. Geoseismic data, for example, may be stored in 3 or even 4 dimensions.
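The multi-dimensional case can be made concrete with a small offset calculation. This is a minimal sketch, assuming a 3-dimensional volume stored in row-major order with 4-byte samples. The function name, the 1000 x 1000 x 1000 shape, and the sample size are hypothetical and chosen only for illustration.

```python
def byte_offset(i, j, k, shape, item_bytes=4):
    """Byte offset of element (i, j, k) in a row-major 3-D array,
    where the last index varies fastest on storage."""
    ni, nj, nk = shape
    return ((i * nj + j) * nk + k) * item_bytes


# Hypothetical volume: 1000 x 1000 x 1000 samples of 4 bytes each.
shape = (1000, 1000, 1000)

# Walking along the last dimension touches adjacent bytes (sequential).
along_k = [byte_offset(0, 0, k, shape) for k in range(4)]

# Walking along the first dimension jumps 4,000,000 bytes per step (not sequential).
along_i = [byte_offset(i, 0, 0, shape) for i in range(4)]

print(along_k)  # [0, 4, 8, 12]
print(along_i)  # [0, 4000000, 8000000, 12000000]
```

A read along the stored dimension is a single streaming access, while a read of the same number of samples along another dimension becomes one small, widely separated access per sample, which is the random pattern that disks handle poorly.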