I recently explored the idea of a Distributed Bioinformatics File System on the Eagle Genomics blog.
In that article, I discussed GlusterFS, a distributed file system, and described my earlier experience installing it on Amazon Web Services for a bioinformatics project.
Finally, I proposed two features that a future file system, the Distributed Bioinformatics File System (DBFS), should ideally have: data deduplication and delta encoding.
I have received interesting and encouraging comments from one of the GlusterFS developers, who informed us that they have in fact already been experimenting with integrating these features!
Read the full article here.