RAID File Recovery

tl;dr: recovering data from hard drives can be confusing if RAID is involved

A family member asked if I could get the files off their old computer. I opened it up and there were two hard drives! There were also two optical drives, a 3.5-inch floppy, and a graphics card with DVI and S-Video. The case did not have a name brand and I had a feeling it was a custom build. The sticker on the front said Pentium III, and the one on the side said Windows XP. I would have wanted to try starting the computer, but there wasn’t a monitor or anything on hand, so I took out the two 80GB IDE hard drives and brought them home. I also ordered an appropriate USB adaptor and a flash drive to put the files on.

80GB, 2x

The plan was to connect the drives and have them mount on my Mac as if they were Windows-formatted external drives. This isn’t what happened. Neither Mac OS nor Windows (via Parallels) could read either disk, though both helpfully offered to format them. My first thought was that somehow the disks or the filesystems were damaged. I created full-disk images of both disks using the dd program, then ran TestDisk against each. This reported that there were potentially partitions on the images, but some were larger than the disk itself, and none could be recovered.
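For anyone who hasn't used dd: the imaging step boils down to a chunked, byte-for-byte copy of the raw device into a file. Here is a minimal Python sketch of that idea. It is an illustration, not the command that was actually run, and the function name `image_device` and any paths you pass it are hypothetical.

```python
def image_device(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path byte for byte; return the number of bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means end of device/file
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

Working from an image instead of the physical disk means every later experiment is read-only as far as the original drive is concerned.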

TestDisk partition recovery results, disk 1
TestDisk partition recovery results, disk 2

TestDisk comes with a tool called PhotoRec, which scans the entire partition or disk for anything that looks like files. I ran this on both images, and it produced over 100,000 “files”. The files represented a broad range of data: code, legal disclaimers, images, video clips, and programs (my antivirus was blowing up!). None were very large, and most seemed incomplete or corrupted. The files were more like shreds of files, as if I was looking inside a paper shredder. I also noticed that similar-looking file-shreds appeared on both disks, which didn’t agree with my assumption that the disks were used independently for separate things.
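The general technique behind this kind of scan is often called file carving: ignore the filesystem and search the raw bytes for known file signatures. The sketch below shows the idea for JPEGs only, using the standard start-of-image and end-of-image markers. It is a deliberately simplified illustration and not how PhotoRec itself is implemented; `carve_jpegs` is a made-up name.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_END = b"\xff\xd9"        # end-of-image marker

def carve_jpegs(data):
    """Return (start, end) byte ranges in data that look like JPEG files."""
    found = []
    pos = 0
    while True:
        start = data.find(JPEG_START, pos)
        if start == -1:
            break
        end = data.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        end += len(JPEG_END)
        found.append((start, end))
        pos = end  # keep scanning after this candidate
    return found
```

A carver like this assumes a file's bytes sit next to each other on the disk. If every other block actually lives on a different drive, each carved "file" comes out with gaps, which would fit the shredded look described above.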

Turns out the BMP image format starts with the pixels at the bottom

I had some prior exposure to RAID (Redundant Array of Independent/Inexpensive Disks), having installed a RAID-1 array on the NAS unit in my closet. At some point it dawned on me that these two disks were potentially a RAID-0 “stripe set” array. One hint, aside from what I could see in the data, was that the two hard drives were the same model, a condition that is recommended for RAID to work properly. While RAID-1 mirrors the disks to create one fault-tolerant array the size of a single disk, RAID-0 combines the disks to make one larger array with no redundancy. Both schemes promise some improvement in read speed, while RAID-0 also improves write speed. There are other RAID configurations if you have more disks, but these are the two options if you have two disks. Another possibility was that the two disks were combined head-to-tail, creating a “Spanned Volume” (Windows name) / “Concatenated Disk Set” (Mac name) / “JBOD” (Just a Bunch Of Disks; industry loves acronyms). This possibility wasn’t entirely off the table, but based on how the data was shredded and distributed across the disks, it was looking more like a RAID striped set.
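The difference between striping and spanning is easiest to see as address arithmetic. The sketch below maps a logical byte offset in the combined volume to a disk and an offset on that disk, once for a striped set and once for a spanned volume. The function names and the 64-byte block size in the usage note are made up for illustration; real arrays use much larger blocks and may reserve space for controller metadata.

```python
def raid0_locate(offset, block_size, num_disks=2):
    """Map a logical byte offset to (disk_index, offset_on_disk) in a striped set."""
    block = offset // block_size          # which logical block the offset falls in
    disk = block % num_disks              # blocks alternate between the disks
    disk_offset = (block // num_disks) * block_size + offset % block_size
    return disk, disk_offset

def span_locate(offset, disk_sizes):
    """Map a logical byte offset to (disk_index, offset_on_disk) in a spanned volume."""
    for disk, size in enumerate(disk_sizes):
        if offset < size:
            return disk, offset
        offset -= size                    # skip past this disk entirely
    raise ValueError("offset is beyond the end of the volume")
```

With a 64-byte block, `raid0_locate(64, 64)` lands at the start of the second disk, while `span_locate(64, [100, 100])` is still on the first. In a striped set any file bigger than one block is split across both drives, which is why similar-looking fragments show up on each.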

mystery music remixed because of RAID0 disk striping


I started to include “RAID” in my Google queries. There are a number of commercial products offering RAID support or recovery, as well as some open-source tools for Linux. I was also looking into how a basic Windows PC might get set up to use RAID, which is more commonly seen in servers. It sounds like many motherboard BIOS implementations support RAID, sometimes after a firmware upgrade. Windows now has built-in support, but I don’t think XP did. Either way, it’s software-based RAID, meaning the striping logic is handled by the CPU, potentially competing with whatever else the processor is working on. A server or custom PC can also use a hardware disk array controller. These are dedicated cards that connect directly to the hard drives and often (but not always) implement RAID on their own.

Free RAID Recovery – result screen

The tool that finally helped me was “ReclaiMe Free RAID Recovery”. At first I was sure it was a scam, but it turned out to do just what I needed. I ran this program on my Windows VM and pointed it to the images of the disks. It examined the images and eventually concluded that they did in fact form an array. It gave me the essential details of the array, namely disk order and block size, and provided an option to save the array as a single image. The resulting disk image was 160GB, as expected. I double-clicked the image and right away it mounted 4 partitions on my Mac. Two were FAT format and two were NTFS, and they had names like PHOTOS and GAMES. I copied the contents of each partition to different folders on the flash drive using cp.
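Once the disk order and block size are known, rebuilding the combined image is conceptually simple: take one block from each member in turn and write them out back to back. Here is a rough Python sketch of that interleaving. It says nothing about how ReclaiMe detects the parameters or writes its output, it assumes the member images are the same size with no controller metadata in front of the data, and `destripe` is a hypothetical name.

```python
def destripe(image_paths, out_path, block_size):
    """Interleave RAID-0 member images (given in disk order) into one image.

    Returns the number of bytes written.
    """
    written = 0
    members = [open(path, "rb") for path in image_paths]
    try:
        with open(out_path, "wb") as out:
            while True:
                # Read one block from every member, in disk order.
                blocks = [member.read(block_size) for member in members]
                if not any(blocks):  # all members exhausted
                    break
                for block in blocks:
                    out.write(block)
                    written += len(block)
    finally:
        for member in members:
            member.close()
    return written
```

Getting the disk order or block size wrong still produces a file of the right size, just with the blocks in the wrong places, so having a tool work those two values out is the hard part.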

After a fair amount of head scratching and a number of painfully slow progress bars, I am happy to say the project was successful. It gave me plenty to ponder for a while, and seemed like a good topic for a blog post, so… ta da!

I could probably write another article about the things I tried that didn’t work and the things I only considered trying. One tool that helped me check the health of the disks was HDDScan. This gave me SMART reports, which were hard to decipher, but generally indicated that the drives were OK. I also tried converting the raw disk images to a Parallels-compatible format using qemu-img, though this route was never fruitful.

Wi-Fi Temperature Logging with ESP8266

Temperature logging is awesome! It gives us visibility into a dimension of our local environment that we can usually only feel. There also seems to be some demand for precision temperature tracking for processes like beer brewing and clay firing.

A few months ago I built a Bluetooth Thermometer using a digital thermometer and Bluetooth LE unit from Adafruit. I also started making an iOS App that could theoretically display the temperature in real-time and give an interactive graph, with curve-fitting, so it could notify you 5 minutes before a setpoint is reached.

The ESP8266 is a shiny new low-cost WiFi chip that everyone’s excited about.

ESP-01 Top
ESP-01 Bottom

The ESP-01 module (pictured) includes the ESP8266 chip, flash memory, a crystal oscillator for the CPU, some indicator LEDs, and an antenna for Wi-Fi.

ESP8266 Block Diagram

The ESP8266 IC contains both the analog Wi-Fi radio and the digital components of a microcontroller. It can run a custom application from the flash, like an Arduino. Another option is to install NodeMCU, which runs Lua scripts either interactively over serial or from flash.

NodeMCU comes with a Lua module for my digital thermometer, DS18B20! Once that’s installed, the temperature can be read like so:

The next step is to send it somewhere over the internet. Before that’s possible, we need to connect to the Wi-Fi:

Now that we’re online, we can send those temperature numbers somewhere.

We could POST them to a server using HTTP (like a web form), but another option is to use MQTT, a Pub/Sub protocol designed for embedded sensor applications just like this. The MQTT server is called a broker. The sensor can publish messages to the broker with a topic, e.g. temp/0. Applications can subscribe to topics they’re interested in, e.g. temp/# (# is a wildcard). I chose the open-source broker Mosquitto.
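To make the wildcard idea concrete, here is a small Python sketch of how a subscription filter can be checked against a topic. In MQTT, `#` matches any number of remaining levels and `+` matches exactly one. This is a simplified illustration, not Mosquitto's code; it skips the special handling of topics that start with `$`, and `topic_matches` is a made-up name.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below it.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without a trailing '#', both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)
```

So a subscriber to temp/# would see temp/0 and temp/1, but not anything published under a different top-level topic.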

NodeMCU comes with an MQTT client implementation. First we connect to the broker:

Then we can publish temperature messages:

MQTT has 3 levels of QoS (Quality-of-Service). Level 1 means the message will be delivered “at least once”. The client will re-send the message until the broker acknowledges receipt.
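The “at least once” wording matters: if an acknowledgement gets lost, the client re-sends and the broker ends up with a duplicate. Here is a toy Python simulation of that behavior. It is not a real MQTT client (the real protocol uses PUBLISH and PUBACK packets with packet identifiers and a DUP flag), and `publish_qos1` and `FlakyBroker` are invented for the example.

```python
def publish_qos1(send, message, max_attempts=5):
    """Call send(message) until it returns True (the ack); return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
    raise TimeoutError("no acknowledgement from broker")

class FlakyBroker:
    """Toy broker that stores every message but loses its first acknowledgement."""

    def __init__(self):
        self.received = []

    def deliver(self, message):
        self.received.append(message)
        return len(self.received) > 1  # the first ack is "lost"
```

Publishing "72.5" through `FlakyBroker().deliver` takes two attempts and leaves two copies on the broker. For a temperature log an occasional duplicate sample is usually harmless.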

Now that the data is published to the server, we can do anything with it!

I wanted to make graphs, so I looked around and found the time-series database InfluxDB and the graphing front-end Grafana. This excellent open-source stack is also available as a hosted service. I made a little node.js script called measure that subscribes to temp/# (using MQTT.js) and dumps the samples into InfluxDB.
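For a sense of what “dumping samples into InfluxDB” can look like, here is a Python sketch that turns one MQTT message into a record in InfluxDB's text-based line protocol (measurement, tags, fields, timestamp). This is not the measure script, which is node.js; the `sensor` tag, the `value` field, and the function name are assumptions made for the example.

```python
def to_line_protocol(topic, payload, timestamp_ns):
    """Turn an MQTT sample such as ("temp/0", "77.5") into a line protocol record.

    Assumes a two-level topic of the form "<measurement>/<sensor id>".
    """
    measurement, _, sensor = topic.partition("/")
    value = float(payload)  # raises ValueError on a malformed reading
    # Format: measurement,tag_key=tag_value field_key=field_value timestamp
    return f"{measurement},sensor={sensor} value={value} {timestamp_ns}"
```

Storing the sensor id as a tag rather than a field lets a dashboard filter or group by sensor without changing the query shape.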

Here’s a diagram of the whole flow:

System Block Diagram

I ran the sensor in my room for a while, sampling the temperature every second, and made some graphs!

Grafana - Air Conditioning

This screenshot of the Grafana dashboard shows the temperature of my room with the AC running. Note the fluctuations as the AC cycles, keeping the temperature between 77 and 78°F all night.

The thermometer probe I have is waterproof, and can go up to 125°C (257°F), and I was boiling some potatoes, so I figured it’d be a good test:

Grafana - Boiling Potatoes

The graph shows the water rising to 210°F, where it stays (boiling) until I turn off the heat, after which it cools down to room temperature over a few hours.

The DS18B20 sensor has a maximum resolution of 1/16°C (0.1125°F). This is made visible by zooming in:
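That step size comes straight from how the sensor reports readings: at 12-bit resolution the temperature register is a signed 16-bit value counted in sixteenths of a degree Celsius. Here is a minimal Python sketch of the conversion, assuming the two register bytes have already been read off the 1-Wire bus; the function names are made up for illustration.

```python
def ds18b20_to_celsius(lsb, msb):
    """Convert the two DS18B20 temperature register bytes to degrees Celsius."""
    raw = (msb << 8) | lsb
    if raw & 0x8000:       # sign bit set: interpret as two's complement
        raw -= 0x10000
    return raw / 16.0      # one count is 1/16 of a degree Celsius

def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0
```

One count is 0.0625°C, and 0.0625 × 9/5 gives the 0.1125°F steps visible in the graph.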

Grafana - Zoomed In

Whee temperature logging!

What next? One possibility is to make my own controller for our old AC unit, with precise temperature and mode controls, scheduling, and HomeKit integration!

Howdy!

Hello World.

This is my WordPress blog! Jon and I are hosting it using Amazon Web Services. I keep meaning to post things here, about projects, adventures, musings… but it hasn’t happened yet (I’ll have to update this if/when I actually post).

My Old Site has some interesting projects.

Cheers, Ames