$USER@bit-of-a-byte ~ $ cat /var/log/ubiquiti-edgerouter-lite-squashfs-block-read-error-part-1.log

Ubiquiti EdgeRouter Lite SquashFS Block Read Error: Part 1

Problem Description

Upon booting the EdgeRouter Lite, the system will not boot. Connecting to console reveals the following log:

SQUASHFS error: zlib_inflate error, data probably corrupt
SQUASHFS error: squashfs_read_data failed to read block 0x1b4574
SQUASHFS error: Unable to read fragment cache entry [1b4574]
SQUASHFS error: Unable to read page, block 1b4574, size 8519
SQUASHFS error: Unable to read fragment cache entry [1b4574]
SQUASHFS error: Unable to read page, block 1b4574, size 8519

This can be caused by an unclean shutdown, which could potentially occur with a power failure. Due to the failure, parts of the internal ext3 filesystem become corrupted, which causes the inability to load the SquashFS partition which contains EdgeOS. The end result, of course, is a non-functional unit.

Problem Severity

There are four potential levels to determining the severity of this issue:

  • Can the filesystem be salvaged by fscking the ext3 partition until it is salvageable? (Severity: Annoying.)
  • If the answer to the above is no, can the flash drive be saved by zeroing out the device and writing a new filesystem to it? (Severity: Hardware issue - End user repair possible..)
  • If the answer to the above is no, is the device capable of booting? If yes, it is possible you may be suffering from the EdgeRouter Lite RAM issue. (Severity: Hardware issue - RMA the device.)
  • If the answer to the above is no, is the device under warranty? If yes, (Severity: Hardware issue - RMA the device.) If no, (Severity: Device Replacement.)

Problem Resolution

Resolutions will be broken down into two posts, one for each problem the user could resolve without warranty work.

Resolution Credits

A special thanks to Ubiquiti for providing a community support option via their official website. Additionally, special thanks to all those users that utilize said forum, your information was invaluable in helping me to fix my EdgeRouter, and write this article.

Resolution One: Repair the File System

This is by far the easiest resolution. Open the EdgeRouter Lite (Caution: Your warranty is now void. Depending on how soon you need the device back in operation, it may be acceptable for you to void the warranty on a $100USD device if you can get it back in service by the end of the day. If you think your device may need to be RMA’d, do not open the device and proceed to UBNT’s Support Site.) by removing two screws on the bottom of the device, located right under where the interface connections are. Your device will slide apart, revealing the internal board. On the board you will see a soldered on female USB port with a small silver USB flash drive, roughly 3cm long. This is your 2GB of internal memory, but the device itself is 2.6GB.

Remove the USB drive and plug it into a USB port on your Mac OSX or Linux powered device. Open a terminal, and run dmesg. You are looking for the devfs name of your device.

[  262.645401] scsi 2:0:0:0: Direct-Access                               5.00 PQ: 0 ANSI: 2
[  262.647109] sd 2:0:0:0: Attached scsi generic sg1 type 0
[  262.648527] sd 2:0:0:0: [sdb] 4057088 512-byte logical blocks: (2.07 GB/1.93 GiB)
[  262.648795] sd 2:0:0:0: [sdb] Write Protect is off
[  262.648817] sd 2:0:0:0: [sdb] Mode Sense: 0b 00 00 08
[  262.649095] sd 2:0:0:0: [sdb] No Caching mode page found
[  262.649114] sd 2:0:0:0: [sdb] Assuming drive cache: write through
[  262.652309]  sdb: sdb1
[  262.655018] sd 2:0:0:0: [sdb] Attached SCSI removable disk

In this case, my device name is sdb. Please note that for this guide I am not using an actual EdgeRouter flash drive. This is a stand-in device. The next thing to do is make sure none of the partitions on the device have been automatically mounted. Where I used sdb in the following example, use whatever dmesg identified as your USB device.

zyradyl@captor ~ $ mount | grep sdb
/dev/sdb1 on /media/zyradyl/b7bcf200-26a1-41ed-9122-625558dbc907 type ext4 (rw,nosuid,nodev,relatime,data=ordered,uhelper=udisks2)

This would indicate that at least one of the partitions on the disk has been mounted. To correct this we need to use the umount command. Note that in some operating systems an eject button exists in the taskbar. In my experience, not only do these buttons unmount the mounted partitions, it also removed the device entry from the kernel. So I prefer to use manual umount commands. In the following example, do one line per mounted partition, substituting the device names of your partitions where I specified /dev/sdb1.

zyradyl@captor ~ $ sudo umount /dev/sdb1

Once you have unmounted the devices, you can use fdisk to get a good view of the partition layout on your USB device. I like to do this even if I know what I should expect, just as a form of a sanity test to make sure I have the right device. Note that, as previously stated, this device is a stand in. I have tried to recreate the partition table to the best of my memory.

zyradyl@captor ~ $ sudo fdisk -l /dev/sdb

Disk /dev/sdb: 2 GiB, 2077229056 bytes, 4057088 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xd9dd6356

Device     Boot  Start     End Sectors  Size Id Type
/dev/sdb1         2048  264191  262144  128M  b W95 FAT32
/dev/sdb2       264192 4057087 3792896  1.8G 83 Linux

So we can see from the above that we have two partitions on the USB device. In your case, when using an actual EdgeRouter flash drive, one of the partitions will be a vfat partition type, and the second one will be a linux partition type. Now that we know where our partitions are, and that they are safely unmounted, we can use fsck. Remember to replace my instance of sdb2 with the correct partition for your device. Note that depending on the damage to the filesystem, this has the potential to be a destructive operation.

zyradyl@captor ~ $ fsck.ext3 -fvy /dev/sdb2

Allow the program to run till completion. If you notice the command has become looped, you will need to clear the journal out on your device. Note that this is a potentially destructive operation to data that had not been written from the journal to the disk. The first thing that we need to do is clear out the recovery indicator using debugfs Remember to replace sdb2 with the proper device name of your ext3 partition.

zyradyl@captor ~ $ debugfs -w -R “feature ^needs_recovery” /dev/sdb2

After clearing out this flag, it is now possible to use tune2fs to force removal of the journal.

zyradyl@captor ~ $ tune2fs -f -O ^has_journal /dev/sdb2

Now you can go back up and run the fsck command. Once that is done, you will need to re-enable the journal on your filesystem.

zyradyl@captor ~ $ tune2fs -f -O has_journal -j size=128M /dev/sdb2

Once you have a clean fsck output, and you have your journal enabled again, plug the USB device back into the EdgeRouter and boot. If you are lucky, everything will come back as it should. As there may be data loss, make sure to restore from a recent configuration backup (you do have configuration backups, right?) and reboot, before logging into the device through ssh to make sure everything is where it should be.

If you are still having problems, stick around for part two of this guide, where I will be walking you through using a separate host computer to recreate the filesystem.

EDIT (2018/12/09): Part 2 was never written, but this is actually a fairly common issue with EdgeRouter Lites. A simple Google search should turn up the EdgeRouter Emergency Recovery Kit GitHub, and that plus some posts on the UBNT forums should help you solve the problem. Sorry about my inability to deliver on what I mention in my posts.

[How to Recover an EXT3 Volume with an Unreadable Journal][2]