Sanify, Never Degrade



Why rely on Sanify?

Trusting your data to any storage solution carries inherent risks. Here are a few good reasons to have confidence in Sanify.

Sanify continues to withstand the test of time.

Sanify has been deployed in multiple clusters, running 24 hours a day since Q1 2010, with minimal downtime and no data loss in any environment despite equipment failures, routine maintenance, moves, and upgrades.

Reliability is always the top priority.

Although the development path has been slower and more difficult, it has always been Sanify's strategy to build software supported primarily by those with the greatest vested interest in its utility: its users. Sanify has never been funded by public or private venture capital and has no debt; there are no smoke-and-mirrors shortcuts taken to meet financial or marketing deadlines.

Sanify's software engineering methodology is sound.

Sanify is 100% custom code written from scratch specifically for this purpose. It is heavily tested 24 hours a day on dozens of computers and has been deployed in production since Q1 2010. Chief architect Mike Hayward has over thirty years of software engineering experience, including math and computer science education at several universities, among them Carnegie Mellon University, home of the Software Engineering Institute and the renowned Capability Maturity Model. With well over 50,000 hours of real-world experience developing software for dozens of companies, ranging from international startups to supercomputers at Intel, Mike has the qualifications to build reliable systems.

A zero defect release process maintains a solid foundation.

Since the very first line of code was written, hundreds of versions ago, Sanify has consistently maintained zero known defects at each release and continues to vigorously pursue quality and safety as job one. The vast majority of software, despite being far less complex, is typically riddled with thousands or even tens of thousands of defects. Corruptive defects in particular are simply not acceptable in storage software; it is Sanify policy to test each release extensively within its parameters, and no production release is ever made with a known corruption defect. This is not to say that Sanify is feature complete or that defects never occur, but that the software is tested rigorously and defects get fixed.

All development is test driven.

Over ninety percent of the Sanify code base exists specifically to cope with or test fault behavior. Sanify incorporates a great many real-time cross checks, and in addition to real-time testing, both the design and the implementation are driven by extensive simulated fault testing, for which there is no substitute. Before each release, Sanify typically endures hundreds and sometimes even thousands of years of simulated constant network and process failures.
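The idea of simulated fault testing can be illustrated with a minimal sketch. This is a hypothetical model, not Sanify's actual test harness: it simulates a small replicated cluster, randomly injects node crashes and disk wipes, and after every step asserts the invariant that no acknowledged write is ever lost on a live node.

```python
import random

REPLICAS = 3  # assumed replication factor for this toy model

def run_simulation(steps=10_000, seed=42):
    """Randomly inject faults and check that acknowledged data survives."""
    rng = random.Random(seed)
    nodes = [set() for _ in range(REPLICAS)]  # data held by each node
    up = [True] * REPLICAS                    # which nodes are alive
    acked = set()                             # writes acknowledged to clients

    for step in range(steps):
        op = rng.random()
        if op < 0.05:                         # fault: crash a node
            i = rng.randrange(REPLICAS)
            up[i] = False
            if rng.random() < 0.5:
                nodes[i] = set()              # worse fault: its disk is wiped
        elif op < 0.10:                       # recovery: resync from a live peer
            i = rng.randrange(REPLICAS)
            live = [j for j in range(REPLICAS) if up[j]]
            if not up[i] and live:
                nodes[i] = set(nodes[live[0]])
                up[i] = True
        else:                                 # client write
            if all(up):                       # ack only at full redundancy
                for j in range(REPLICAS):
                    nodes[j].add(step)
                acked.add(step)
        # invariant: every live node holds every acknowledged write
        for j in range(REPLICAS):
            if up[j]:
                assert acked <= nodes[j], f"data loss detected at step {step}"
    return len(acked)

run_simulation()
```

Because writes are only acknowledged at full redundancy and recovery resyncs from a live peer, the invariant survives any sequence of single faults in this model; weakening either rule makes the assertion fire within a few thousand simulated steps, which is exactly the kind of design error this style of testing exposes before release.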

Sanify relies on Sanify.

We eat our own dog food. All Sanify software development is conducted on servers with 100% of their file systems stored in Sanify clusters.


Design Goals

The Sanify architecture relies upon the following prioritized storage design goals as fundamental guiding principles when trade-offs must be made. Sanify never sacrifices reliability for performance.

1. Correctness: The system must never silently corrupt data.

Many storage systems sacrifice correctness for performance, or are simply incorrect to decrease implementation cost or time to market. Consider USB hard drives with volatile disk caches that cannot be flushed or disabled. Consider distributed file systems that give up cache coherency guarantees to achieve availability or performance. Consider two-drive mirroring, which can trivially split-brain.
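The USB example above comes down to the write path. A minimal Python sketch of a durable write looks like this; note that even a correct sequence of flushes can be defeated by a drive whose volatile cache ignores flush commands, which is exactly the silent incorrectness this goal forbids:

```python
import os
import tempfile

def durable_write(path, data: bytes):
    """Write data and ask the OS to push it all the way to the device."""
    with open(path, "wb") as f:
        f.write(data)          # may still sit in user-space buffers
        f.flush()              # push to the kernel page cache
        os.fsync(f.fileno())   # ask the kernel to flush to stable storage
    # Caveat: a drive with a volatile cache that cannot be flushed or
    # disabled may still lose this data on power failure, despite fsync.

path = os.path.join(tempfile.mkdtemp(), "example.dat")
durable_write(path, b"critical data")
```

Correct storage software must issue every one of these steps and still account for hardware that lies about the last one.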

2. Durability: The system must not degrade below the specified level of redundancy.

Many fault-tolerant storage systems quietly lower redundancy when faults occur, dramatically decreasing safety, typically because that implementation is both simpler and cheaper. This is true of traditional RAID implementations and even of many expensive storage area networking products, despite their having sufficient storage hardware to maintain redundancy.
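The difference this goal describes can be sketched in a few lines. In this hypothetical placement model (not Sanify's implementation), a system that merely degrades keeps serving with fewer copies after a replica is lost, while a system that maintains durability immediately rebuilds onto spare capacity:

```python
TARGET_COPIES = 3  # assumed redundancy target for this illustration

def rebalance(replicas: set, available: set) -> set:
    """Restore a replica placement to TARGET_COPIES using spare nodes."""
    replicas = replicas & available          # drop replicas on failed nodes
    spares = sorted(available - replicas)    # nodes with spare capacity
    while len(replicas) < TARGET_COPIES and spares:
        replicas.add(spares.pop())           # rebuild onto a spare, now
    return replicas

# A five-node cluster holds three copies; node "n2" fails.
placement = rebalance({"n1", "n2", "n3"}, {"n1", "n3", "n4", "n5"})
```

A degraded-mode design would stop after the intersection step and serve from two copies indefinitely; restoring the target count immediately is what keeps a second fault from becoming data loss.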

3. Availability: The system must enable data to be accessed with minimal down time when faults occur.

Many fault-tolerant storage systems are difficult or nearly impossible to access when typical single-point failures occur. Even when data is not lost, being unable to use it for hours, days, or even weeks can be very problematic and costly. Consider the consequences of a typical hardware RAID controller failure.

4. Performance: The system must provide reasonable performance even in fault scenarios.

Many fault-tolerant storage systems neither degrade gracefully nor recover automatically when faults occur. Consider the performance impact of reconstructing a RAID5 array after a drive failure: every single byte of every surviving drive must be read. Even some expensive storage area networking products require recovery to be manually initiated during periods of low activity.
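Back-of-the-envelope arithmetic makes the RAID5 example concrete. Even reading the surviving drives in parallel, the rebuild is bounded below by the time to read one entire drive at sustained speed; the figures here are illustrative assumptions, not measurements:

```python
# Illustrative assumptions: a 4 TB drive with 150 MB/s sustained reads.
DRIVE_BYTES = 4 * 10**12
SUSTAINED_BYTES_PER_SEC = 150 * 10**6

# Lower bound on rebuild time: one full sequential pass over a drive,
# assuming zero contention from ongoing client I/O.
rebuild_hours = DRIVE_BYTES / SUSTAINED_BYTES_PER_SEC / 3600
print(f"minimum rebuild time: {rebuild_hours:.1f} hours")  # ≈ 7.4 hours
```

In practice the array must keep serving client I/O during the rebuild, so real rebuild times are often several times this floor, with degraded performance throughout.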