SoftRAID Features and Compatibility
Using SoftRAID with Disks
Using SoftRAID with Volumes
Using SoftRAID with Mirror Volumes
SoftRAID Best Practices
Resolving Specific Problems
What do these log entries mean?
SoftRAID Features and Compatibility
During the 30 day evaluation period, you can use all of the features of SoftRAID. This allows you to try out SoftRAID for free and learn about its advanced RAID and disk monitoring features.
After the 30 day evaluation period is over, your SoftRAID volumes will continue to mount and you can continue to read and write files on them. If you are using a SoftRAID volume as your startup volume, you can continue to use it to start up your Mac. All other SoftRAID features will be disabled.
SoftRAID version 4 is compatible with Mac OS X 10.4 (Tiger) and later. We recommend that you run the latest bug fix release for the version of Mac OS X you are using.
SoftRAID is compatible with all PowerPC Macs which run Mac OS X 10.4. We test every release on a 533 MHz PowerMac G4 with 512 MB of RAM to ensure that SoftRAID runs on legacy hardware.
Most third party utilities are fully compatible with SoftRAID volumes. This includes: DiskWarrior, TechTool Pro, Drive Genius and Data Rescue.
When you are using SoftRAID for both your startup volume and all your data volumes, the SoftRAID Monitor and driver will use less than 0.2% of one of your Mac's CPUs. On the average Mac Pro, that means that SoftRAID will be using less than 0.01% of your CPU power. In addition, all parts of SoftRAID together will use less than 30 MB of your physical RAM. We have designed SoftRAID to have as little impact on the speed of your system as possible. This allows SoftRAID to be very usable on a 533 MHz PowerMac G4 with 512 MB of RAM.
Even when you are performing a mirror rebuild on more than one volume, the driver still uses less than 1% of the power of one CPU. If you find that your Mac is not as responsive as you want during a mirror rebuild, you can change the optimization for that volume, which will give your work higher priority than the mirror rebuild operation.
XSAN and other SANs are incompatible with SoftRAID. SANs require direct access to the disks they use and cannot use disks which are under SoftRAID's control.
Each Mac that has SoftRAID volumes needs to have a separate serial number. SoftRAID only checks the serial number on a Mac if it has SoftRAID volumes. If a Mac does not have SoftRAID volumes, the serial number is not checked.
This means that if you have two Macs, each of which has SoftRAID volumes attached to it, you will need 2 serial numbers.
However, if you have two Macs and one set of SoftRAID disks which you move back and forth between the two Macs, you will only need one serial number. You will need to restart your Mac after you disconnect the SoftRAID disks to get that Mac to release the serial number so that the other one can use it.
Using SoftRAID with Disks
Most modern disks only get limited testing during the manufacturing process. They get tested to ensure that they can read and write data correctly, spin at the correct speed and can start and stop. They are not tested to ensure that every sector can read and write correctly.
You can perform this advanced level of testing on your new disk using the SoftRAID disk certify feature. This will write a pattern to each sector on your new disk and then read it back to ensure that it is identical. Testing your new disks before you use them will prevent you from relying on an unreliable disk, which might result in you losing valuable files.
You can also certify a used disk before you reuse it in a different Mac or for a different volume. Certifying a disk will allow you to make sure the disk is working reliably.
Remember that certifying a disk will destroy all files and data on it.
You can use SoftRAID to certify a CF or SDHC card before using it in your digital camera. This will ensure that the card is working reliably and that all sectors on it can read and write without errors. If you routinely certify your CF or SDHC cards, you will greatly reduce the chance that you will lose photographs due to media failure. (In a professional photography studio, we recommend that you certify each card every 30 - 60 days.)
Remember that certifying a card will destroy all photos on it.
If you ever have to recover a file that you have accidentally erased or send your disk to a data recovery service after it has failed, it is much easier to locate the files you need if all the unused space on the disk is filled with zeros. SoftRAID makes it easier to recover files by filling disks with zeros during the last pass of certifying a disk.
Some of the controllers used on SSDs (Solid State Disks) use data compression to minimize the amount of data they have to write to flash memory. This allows them to minimize the wear on the flash memory and to attain much higher write performance when tested using benchmarking applications. (Most benchmark applications write blocks of zeros when testing the write speed of a disk.)
The disk certify function in SoftRAID was written with these data compression SSD controllers in mind. Every pass of the disk certify function, except the last one, will write out a noncompressible random data pattern. This ensures that SoftRAID tests as many of the locations in the memory chips as possible.
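The pass structure described above can be illustrated with a minimal Python sketch. This is not SoftRAID's code: the `certify` function, the 512 byte sector size and the in-memory buffer standing in for a disk are all assumptions made for illustration. It writes random (incompressible) data on every pass except the last, writes zeros on the last pass, and reads every sector back to compare.

```python
import io
import os

SECTOR_SIZE = 512  # assumed sector size for this sketch


def certify(device, sector_count, passes=3):
    """Write a pattern to every sector, read it back, and compare.

    Every pass except the last writes random (incompressible) data;
    the last pass writes zeros. Returns the sector numbers that
    failed verification on any pass.
    """
    bad_sectors = []
    for pass_number in range(1, passes + 1):
        last_pass = pass_number == passes
        patterns = []
        device.seek(0)
        for _ in range(sector_count):
            pattern = bytes(SECTOR_SIZE) if last_pass else os.urandom(SECTOR_SIZE)
            patterns.append(pattern)
            device.write(pattern)
        # Read everything back and compare it with what was written.
        device.seek(0)
        for sector, expected in enumerate(patterns):
            if device.read(SECTOR_SIZE) != expected and sector not in bad_sectors:
                bad_sectors.append(sector)
    return bad_sectors


# Demonstration on an in-memory buffer standing in for a disk.
disk = io.BytesIO()
failed = certify(disk, sector_count=64)
print("bad sectors:", failed)
```

Because the final pass writes zeros, the buffer contains only zeros when the function returns, which matches the description of the last certify pass.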
Every time you start up the SoftRAID application, it gets the SMART status of every disk which supports SMART. In addition, the SoftRAID Monitor gets the SMART status every time you restart your Mac and every 24 hours after that. Each time SoftRAID gets the SMART status of a disk, it checks to see if the disk has failed the SMART test. In addition, the SMART measurements are used to predict whether the disk is more likely to fail. This prediction is based on the results of a study by Google engineers of disk failure using 100,000 disks over an 8 month period.
SoftRAID uses SMART to ask a disk how many hours it has been used. Disks which support SMART are on SATA, SAS and Fibre Channel buses. Disks which don't support SMART will say "SMART status: test unavailable" in the expanded part of the disk tile in the SoftRAID application main window.
For disks which don't support SMART, the SoftRAID driver maintains an hours of use counter which it updates every time your Mac is shut down or volumes on the disk are unmounted. This counter is stored on the disk itself and is still valid if you move the disk to a different Mac. SoftRAID will even restore this counter to its correct value if you initialize the disk a second time.
We recommend that you replace older disk drives even if they have not failed. As disks age, the chance that they will fail increases. It is always better to replace a disk before it fails than to wait for it to fail and have to restore data from a backup or replace a disk on a Mac which is currently in use.
We currently have no recommendations for replacement intervals for SSDs (Solid State Disks). These disks are too new and we do not have enough experience with them to make meaningful recommendations.
We recommend that disks in laptops be replaced after 5,000 hours of use. These disks are smaller and less reliable than the disks found in desktop computers and servers. This amount of use corresponds to 2 - 3 years of use by an average user.
We recommend that disks in desktop computers be replaced after 10,000 hours. These disks are more reliable than the smaller ones in laptops, but they are subjected to the repeated stress of being turned on and off. This number of hours corresponds to 4 - 5 years of use in an average office environment.
We recommend that disks in servers be replaced after 20,000 - 25,000 hours. These disks are usually properly cooled and are not subject to the stress of being turned on and off but they often experience periods of intense activity. This number of hours corresponds to 2 - 3 years of use in a server which is on 24 hours a day.
These recommendations are corroborated by the Google study on disk failure in servers, which showed that disks fail at a rate of 2 - 3% during the first year and 7 - 10% during the subsequent years.
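As a rough illustration of how the hour thresholds above relate to calendar time, here is a small Python sketch. The names `REPLACE_AFTER_HOURS`, `should_replace` and `years_of_continuous_use` are made up for this example, and the server threshold uses the low end of the 20,000 - 25,000 hour range as an assumption.

```python
HOURS_PER_YEAR = 24 * 365

# Replacement thresholds in hours of use, taken from the recommendations above.
# The server entry assumes the low end of the 20,000 - 25,000 hour range.
REPLACE_AFTER_HOURS = {"laptop": 5_000, "desktop": 10_000, "server": 20_000}


def should_replace(kind, hours_used):
    """Return True once a disk has reached the recommended hours of use."""
    return hours_used >= REPLACE_AFTER_HOURS[kind]


def years_of_continuous_use(hours):
    """Convert hours of use into years for a disk that is on 24 hours a day."""
    return hours / HOURS_PER_YEAR


print(should_replace("laptop", 5_200))            # True
print(round(years_of_continuous_use(20_000), 1))  # 2.3
print(round(years_of_continuous_use(25_000), 1))  # 2.9
```

The last two lines show why 20,000 - 25,000 hours corresponds to 2 - 3 years in a server that is never turned off.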
If you have many identical external disks, it is easy to confuse one with another unless you add some sort of label to the outside of each disk. SoftRAID allows you to add a label to each SoftRAID disk. This label will appear in the SoftRAID user interface, log entries and email notifications. If the label you add to a disk in SoftRAID is the same as the label you place on the outside of the physical disk, you can easily keep track of which physical disk corresponds to the one you have selected in the SoftRAID application. This will help prevent you from inadvertently initializing the wrong disk or disconnecting it from your Mac when it is still in use.
Using SoftRAID with Volumes
SoftRAID volume safeguards protect you from accidentally destroying a volume which contains files that you need. When you enable the safeguard on a volume, SoftRAID prevents you from doing anything to the volume's disks which would destroy the volume. For example, you cannot initialize or certify a disk if any of the volumes on that disk has a safeguard. SoftRAID also prevents you from deleting or erasing a volume with a safeguard.
When you change the optimization setting for a volume in SoftRAID, you are telling the SoftRAID driver the primary use for that volume. This allows the driver to fine tune its behavior, such as the mirror rebuild rate and when status information is written to the disk, so that the driver will not interfere with your work. For more information on the different types of optimization, see the volume optimization help page.
You can make any SoftRAID volume read-only as long as it is not your startup volume and does not contain files which another application is using. You can do this by selecting "Make Read-Only" under the Volume menu in the SoftRAID application. Once a volume has been made read-only, the SoftRAID driver will prevent any application from writing to the volume. At a later time, you can convert the volume back to read-write. For more information, see the help page on read-only volumes.
If you are creating a stripe volume for editing uncompressed digital video footage, there are several things you can do to maximize its performance.
If the volume uses disks with rotating media (and not SSDs), you should create the volume with only the first 30% of each disk. You can do this by initializing each disk with SoftRAID and then creating your new stripe volume. In the SoftRAID application new volume window, click the Max Size button to see the maximum size that you can create. Then enter a value which is 30% of this as the size of your new volume.
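The size to enter can be worked out with a quick calculation. The following Python sketch assumes a hypothetical helper named `stripe_volume_size_gb` and an example Max Size value of 4000 GB; neither comes from SoftRAID itself.

```python
def stripe_volume_size_gb(max_size_gb, fraction=0.30):
    """Return the volume size to enter: a fraction of the Max Size value."""
    return round(max_size_gb * fraction, 1)


# If the Max Size button reports 4000 GB, enter 30% of that value.
print(stripe_volume_size_gb(4000))  # 1200.0
```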
You should also use SoftRAID to disable the journal on your stripe volume to prevent the file system from writing to the journal when you are capturing or playing back video data.
You should also shut down your Mac each evening or restart it every 12 - 18 hours if you leave it on all the time. SoftRAID performs several maintenance functions when you first start up your Mac and every 24 hours thereafter. These functions include checking for new versions of SoftRAID and checking the SMART status of each disk attached to your Mac. You don't want these functions to be performed while you are capturing or playing back video data. (You can also disable these functions in the SoftRAID preferences. They are found in the Application and Volume tabs of the Preferences window.)
Using SoftRAID with Mirror Volumes
We recommend that your mirror volumes have 3 disks. This allows you to have two disks, a mirror pair, connected to your Mac at all times. They can be two internal disks in a Mac Pro or Mac Mini Server, or a pair of external eSATA, FireWire or USB disks. The two disks act as a traditional mirror: if one disk fails while you are working, the second disk takes over and allows you to continue working.
The third disk in your mirror volume should be an external disk which is normally stored in another building. This disk should be connected periodically to the Mac with the other two disks in the mirror so it can be rebuilt and updated with the most recent volume files. This third disk provides you with a disaster recovery mechanism. If your Mac gets stolen or the building your Mac is in burns down, you will still have the third mirror disk which you can use. At SoftRAID, we rebuild to our third mirror disk every 2 weeks. That way, the most we will ever lose is 2 weeks' worth of work.
Fast Mirror Rebuilds happen so quickly that users often wonder whether all the disks in their mirror volumes contain the same data. If your 2 terabyte volume rebuilds in 10 - 20 minutes, you might think that some parts of the volume weren't rebuilt correctly.
We added the volume validate function to SoftRAID for just this purpose. If you validate a mirror volume, SoftRAID will check whether all the disks in the mirror are identical. SoftRAID does this by reading every sector from each of the disks in the mirror and comparing them. It issues an error message if it finds any sector which is not identical on all of the disks in the mirror volume. (We used this feature extensively when we were testing the Fast Mirror Rebuild code to ensure that it was rebuilding mirror volumes correctly.)
When you validate a SoftRAID volume, the SoftRAID driver reads all the sectors used for that volume on all the disks for that volume. When you validate a mirror volume, the driver takes the extra step of comparing the data on each disk and making sure it is identical. If a mirror volume has been validated successfully with SoftRAID, you are guaranteed that all the disks for that mirror volume contain identical data.
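The comparison step can be pictured with a short Python sketch. It is only an illustration of the idea, not the SoftRAID driver's implementation: the `validate_mirror` function, the 512 byte sector size and the in-memory buffers standing in for mirror disks are assumptions.

```python
import io

SECTOR_SIZE = 512  # assumed sector size for this sketch


def validate_mirror(disks, sector_count):
    """Return the sector numbers whose contents differ between mirror disks."""
    mismatched = []
    for sector in range(sector_count):
        copies = []
        for disk in disks:
            disk.seek(sector * SECTOR_SIZE)
            copies.append(disk.read(SECTOR_SIZE))
        # A sector is valid only if every disk holds the same bytes.
        if any(copy != copies[0] for copy in copies[1:]):
            mismatched.append(sector)
    return mismatched


data = bytes(range(256)) * 8           # 2048 bytes, which is 4 sectors
disk_a = io.BytesIO(data)
disk_b = io.BytesIO(data)
damaged = bytearray(data)
damaged[2 * SECTOR_SIZE] ^= 0xFF       # flip one byte in sector 2
disk_c = io.BytesIO(bytes(damaged))

print(validate_mirror([disk_a, disk_b], 4))          # []
print(validate_mirror([disk_a, disk_b, disk_c], 4))  # [2]
```

An empty result corresponds to a successful validate; any listed sector corresponds to the error message described above.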
SoftRAID Best Practices
Most modern disks only receive limited testing when they are manufactured. They are tested to ensure that they can start and stop reliably and that they can read and write to some small fraction of the sectors on them. You should not trust your valuable work to these disks until you know that all of the sectors on them can reliably store data. Otherwise, you might write out an important file to a volume on the disk only to discover later that you can't read it back.
You can test all of the sectors on your new disk by using SoftRAID's disk certify function. This will write out data to every sector on the disk and then read it back to ensure that it is reliable. After successfully completing a multi-pass disk certify using SoftRAID on your new disk, you can be certain that it can store your files and read them back correctly.
It is tempting to ignore disk errors when they occur. Your Mac seems to be working fine and yet SoftRAID keeps telling you that one of your disks has an error.
You should take all disk errors seriously. Often a disk which has been reliable will have one or two errors a few months before it fails catastrophically. These disk errors are your early warning sign that a disk is about to fail. For more information on the steps we recommend taking when you encounter a disk with errors, see the disk errors help page.
We recommend that your mirror volumes have 3 disks. This allows you to have two disks, a mirror pair, connected to your Mac at all times. They can be two internal disks in a Mac Pro or Mac Mini Server, or a pair of external eSATA, FireWire or USB disks. The two disks act as a traditional mirror: if one disk fails while you are working, the second disk takes over and allows you to continue working.
The third disk in your mirror volume should be an external disk which is normally stored in another building. This disk should be connected periodically to the Mac with the other two disks in the mirror so it can be rebuilt and updated with the most recent volume files. This third disk provides you with a disaster recovery mechanism. If your Mac gets stolen or the building your Mac is in burns down, you will still have the third mirror disk which you can use. At SoftRAID, we rebuild to our third mirror disk every 2 weeks. That way, the most we will ever lose is 2 weeks' worth of work.
While we all rely on the internet and the servers on it, they are not available 100% of the time. In our testing we have seen times when email servers are unreachable, even the ones at MobileMe, Google and Yahoo. We therefore always recommend that you have two outgoing email servers configured for your outgoing email notifications. This will ensure that you always get the email notifications which SoftRAID sends out.
When you configure 2 outgoing email servers, you won't get two emails for every notification from the SoftRAID Monitor. The second (called the “alternate”) outgoing email server is only used if there is a problem sending email with the primary server.
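The primary and alternate behavior can be sketched in Python as follows. This is a generic illustration of the fallback pattern, not the SoftRAID Monitor's code: the `send_notification` function, the example host names and addresses, and the `fake_send` stand-in (used so the example runs without a network) are all assumptions.

```python
import smtplib
from email.message import EmailMessage


def send_notification(message, servers, send=None):
    """Try each outgoing server in order and stop at the first success.

    Returns the host that accepted the message, so only one email is
    sent per notification. Raises RuntimeError if every server fails.
    """
    def smtp_send(host, msg):
        with smtplib.SMTP(host, timeout=30) as connection:
            connection.send_message(msg)

    send = send or smtp_send
    last_error = None
    for host in servers:
        try:
            send(host, message)
            return host
        except (OSError, smtplib.SMTPException) as error:
            last_error = error  # fall through to the next server
    raise RuntimeError("no outgoing email server accepted the message") from last_error


def fake_send(host, msg):
    """Stand-in sender for the demonstration: the primary server is down."""
    if host == "primary.example.com":
        raise ConnectionRefusedError("primary unreachable")


message = EmailMessage()
message["Subject"] = "Disk error notification"
message["From"] = "monitor@example.com"
message["To"] = "admin@example.com"
message.set_content("One of your disks has reported an error.")

used = send_notification(
    message, ["primary.example.com", "alternate.example.com"], send=fake_send
)
print(used)  # alternate.example.com
```

The alternate server is contacted only after the primary raises an error, which is why a notification never arrives twice.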
Resolving Specific Problems
Whenever I start up my Mac, SoftRAID tells me one of my disks has errors. Why does this disk keep having errors?
SoftRAID maintains an error count for each disk. The error count is stored on the disk and allows you to determine which disk in your system is not reading and writing reliably. SoftRAID will display a dialog and send you an email if you have a disk with a non-zero error count. This happens every time you start up your Mac. Since the error count is stored on disk, you can see which disk had an error days or weeks after the error occurred, even if you have restarted your Mac multiple times since then.
If you want to reset the error count, you can read the Disk IO and Error Counts help page for instructions. You should also read the help page on disk errors to see the steps SoftRAID engineers recommend when one of your disks has errors.
If you are using a volume on a disk which is not reading or writing correctly, you may or may not be notified of these errors. If your volume was created with Apple's Disk Utility program, either a normal volume or an AppleRAID volume, you will be notified of less than 30% of the errors which occur. It can look like the disk is working correctly, when in fact, it is not able to write your files or read them back reliably. Disk Utility will not always report errors when you create a new volume. Once your volume is created, the applications and file system code will report less than half of the disk errors back to you.
With SoftRAID, all of the disk errors are reported to you. These include all the errors which occur when you initialize a disk or create a volume on one or more disks. They also include all the errors the driver encounters when you are reading and writing files on a SoftRAID volume.
You may have one disk which SoftRAID predicts will fail based on SMART measurements but which you still want to keep using. Rather than having to disable SMART measurements on all disks, you can get SoftRAID to ignore just that one disk. To do this, you will have to modify the “MonitorExclusionList.plist” file found in the SoftRAID application support folder (at "/Library/Application Support/SoftRAID/"). Once you create an entry for the disk you want to ignore, with the disk's serial number, you can disable SMART measurements on that disk. For complete instructions on how to get SoftRAID to ignore the SMART measurements for a particular disk, go to the Excluding Disks and Volumes help page.
If you are convinced that SoftRAID is causing problems with your system, you can remove SoftRAID entirely. To do this, you will have to convert all of your SoftRAID volumes to normal or AppleRAID volumes. Once you have converted all your volumes, you can uninstall SoftRAID using the SoftRAID application.
What do these log entries mean?
The SoftRAID driver writes this entry to the system log whenever some other process tries to write to a SoftRAID volume which it does not have permission to write to. The SoftRAID driver considers this an illegal operation, writing to a volume which is only opened for reading, and returns an error to the other process. It also writes an error message to the system log.
The most frequent process which causes this error is the journal code in the HFS file system. Starting with Mac OS X 10.5, the journal code would open a volume for reading and then write to it. The journal code would do this whenever the Mac was shut down incorrectly (either as a result of power loss or a kernel panic). All other storage drivers on the Mac permitted this write even though it violated the permissions model inherent in Mac OS X. This bug was finally fixed in Mac OS X 10.6.4.
When you start up a Mac from an AppleRAID or SoftRAID volume, the early part of the system startup uses the helper partitions associated with that volume. Once the kernel and drivers have been loaded from the helper partition, the system mounts the startup volume itself.
If the system is unable to mount the startup volume itself, it writes "waiting for root device" onto the screen and will continue to wait. If you are not starting up using verbose booting, you will not see this string, but will instead see your Mac hang with a grey screen.
The usual cause of this problem is that the helper partitions have not been updated correctly. This is usually caused by one of several bugs in Mac OS X. These bugs occur much more frequently in Mac OS X versions 10.4 and 10.5 than they do in version 10.6 and later.
If you upgrade your startup volume from one version of Mac OS X to the next, a new SoftRAID driver will be installed which lacks the SoftRAID Monitor and SoftRAID daemon. Although the newly installed copy of Mac OS X lacks the SoftRAID Monitor and SoftRAID daemon, your startup volume will still contain the two files which tell it to start these two SoftRAID background tasks. Mac OS X writes an error to the system.log file each time it tries to start these two missing background tasks.
The solution is either to run the latest version of SoftRAID, which will reinstall the SoftRAID driver, the SoftRAID Monitor and the SoftRAID daemon, or to delete the two files which cause Mac OS X to try to start the SoftRAID Monitor and SoftRAID daemon. (If you want to delete these two files, they can be found at: /Library/LaunchAgents/com.softraid.SoftRAID_Monitor.plist and /Library/LaunchDaemons/com.softraid.softraidd.)