This is DEFFS

DEFFS stands for "Distributed, Encrypted, Fractured File System". Its primary goal is to secure a group's files while also allowing said files to be accessible from multiple machines on a local network, and preserving the ability to keep files private from others.

Table of Contents

Example Use Case

A small laboratory has 3 employees: a Chemist, a Biologist, and a Physicist. They're working on an experiment together, so they'd like to share files. Traditional solutions to this issue include git or Google Drive, however none of the 3 employees are programmers, and their files are large enough that they don't want to have to pay for extra storage from Google.

DEFFS provides a non-invasive, easy-to-use solution for their issue. On each of their machines, they create two folders, "storepoint" and "mountpoint", at any location they desire.
$ mkdir storepoint mountpoint
They execute the DEFFS binary, providing the paths to the two folders they created, and the number of machines that DEFFS will interact with (2, as the Chemist's machine will interact with the Biologist's and the Physicist's, and etc).
$ DEFFS mountpoint storepoint -n 2
Now, whenever they want to share files, all they have to do is copy the files into "mountpoint", or just work in that directory by default. Whenever files are written inside of the mountpoint, their contents are encrypted and split into shards which are distributed to the other machines via their shared local network.

For example, the Physicist would like to share his new star charts with his coworkers. He copies the chart files and pastes them into a new folder inside of "mountpoint". Each of the charts is split into 3 encrypted shards, 2 of them being sent to the Chemist's and Biologist's machines. When the Chemist opens her own "mountpoint" directory, she will see the new folder containing the star charts. Upon opening the chart, DEFFS will send a TCP request to the other 2 machines, asking for the corresponding file shards. The shards will be recombined, decrypted, and displayed to the Chemist. No manual uploading or downloading required. In this way, DEFFS treats a network of machines as a single machine, distributing its contents over each node.

Security

DEFFS currently encrypts file contents with the plain AES cipher provided by OpenSSL, but this will soon be replaced by AES GCM for improved security. The symmetric encryption keys are passed through Shamir's Secret Sharing Scheme, providing a number of key chunks equal to the number of shards to be created and machines to interact with. Shard files are stored in a hidden directory within the storepoint (the shardpoint) with an SHA256-generated filename, and contain the Shamir-encoded encryption key chunks. "Header" files, or the normally-visible files that users will interact with, actually only contain the SHA256 hash that corresponds with the file's shards. To access a file, all shards for the corresponding file must be retrieved to unlock the file's contents, due to the nature of Shamir's algorithm.

FAQs

What distinguishes DEFFS from other distributed filesystems?

Distributed filesystems are not a new concept, and DEFFS builds on the conventions that previous projects have set up. Among other distributed filesystems, DEFFS is most similar to Tahoe-LAFS.

DEFFS stands apart in a few key ways:

Do I have to approve shard requests when my colleagues want to view files?

No, shard requests and replies are completely automatic.

What if a new machine joins the network after DEFFS has been mounted?

By default, the new DEFFS instance will sniff the network for other DEFFS instances, and notify them that it has joined the network. The new node will receive file headers but shards will not be redistributed. The new machine will only begin to store shards if a file is written to.

What if a hard drive disconnects, dies, or otherwise leaves the network?

The default DEFFS configuration is geared towards an always-on network of reliable machines, so this is not a top issue right now. However, the current planned solution to this issue is to base DEFFS shard architecture on RAID 1 mirroring to allow any machine to disconnect without data loss. A simpler option that will be implemented soon is the option to automatically create redundant shards.

What's the deal with concurrent file writes?

This solution to the issue of concurrent file modifications is a currently-undecided one, but there are a few different ways to go. The simplest solution is to lock or make readonly a file on all machines if it's open on one, preventing more than one user from working on a given file at once. This solution is better for smaller networks with users that don't modify each other's files a lot. The second option is to have a git-like commit feature, which would be harder to implement and more complicated for users to understand, but would handle cases where files are highly-shared.