In this thesis we present the design of Keso, a distributed and completely decentralized file system based on the peer-to-peer overlay network DKS. While designing Keso we have taken into account many of the problems that exist in today's distributed file systems.
Traditionally, distributed file systems have been built around dedicated file servers which often use expensive hardware to minimize the risk of breakdown and to handle the load. System administrators are required to monitor the load and disk usage of the file servers and to manually add clients and servers to the system.
Another drawback of centralized file systems is that a lot of storage space on clients goes unused. Measurements we have taken on existing computer systems show that a large part of the storage capacity of workstations is unused. In the system we looked at, there was three times as much storage space available on workstations as was stored in the distributed file system. We have also shown that much of the data stored in a production distributed file system is redundant.
The main goals for the design of Keso have been that it should make use of spare resources, avoid storing unnecessarily redundant data, scale well, be self-organizing, and be a secure file system suitable for a real-world environment.
Because Keso is based on peer-to-peer techniques, it is highly scalable, fault tolerant and self-organizing. Keso is intended to run on ordinary workstations and can make use of their previously unused storage space. Keso also provides means for access control and data privacy despite being built on top of untrusted components. The file system exploits the fact that a lot of data stored in traditional file systems is redundant by letting all files that contain a datablock with the same contents reference the same datablock in the file system. This is achieved while still maintaining access control and data privacy.
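
The idea of letting files with identical datablocks reference a single stored copy can be illustrated with a minimal sketch. It is not Keso's implementation: the block size, the use of SHA-256 as a block identifier, and the names `BlockStore`, `store_file` and `read_file` are all assumptions made for illustration, and the sketch leaves out distribution over DKS, access control and data privacy entirely.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size; the abstract does not specify one


class BlockStore:
    """Toy in-memory store that keeps each distinct datablock only once."""

    def __init__(self):
        # Maps a block identifier (SHA-256 hex digest, an assumption) to its contents.
        self.blocks = {}

    def put(self, data: bytes) -> str:
        block_id = hashlib.sha256(data).hexdigest()
        # Only store the block if no block with the same contents exists yet.
        self.blocks.setdefault(block_id, data)
        return block_id

    def get(self, block_id: str) -> bytes:
        return self.blocks[block_id]


def store_file(store: BlockStore, contents: bytes) -> list:
    """Split a file into fixed-size blocks and return the list of block identifiers."""
    return [
        store.put(contents[i:i + BLOCK_SIZE])
        for i in range(0, len(contents), BLOCK_SIZE)
    ]


def read_file(store: BlockStore, block_ids: list) -> bytes:
    """Reassemble a file from its list of block identifiers."""
    return b"".join(store.get(block_id) for block_id in block_ids)


if __name__ == "__main__":
    store = BlockStore()
    file_a = store_file(store, b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    file_b = store_file(store, b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE)
    # Both files reference the same first block, so only three blocks are stored.
    print(file_a[0] == file_b[0], len(store.blocks))
```

In this sketch the two files together contain four blocks, but the store holds only three, because the shared first block is referenced by both files.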