Comments (11)
There exist other storage solutions for baremetal clusters, but among them, I would only use Rook/Ceph. Kubernetes does support NFS natively, though I think only a madman would actually use NFS in production (though I understand there are many people who will vehemently disagree).
Personally, I would stay away from any shared-filesystem type approach. You will trade away either speed or reliability, and there is almost always a better way to achieve the actual requirements than sharing POSIX-like filesystems amongst disparate machines.
If you really must have a shared filesystem, GlusterFS is another option. It offers a native shared filesystem as well as an NFS compatibility layer. It is supported by Kubernetes out-of-the-box. Since it is simpler than Ceph and concentrates just on a shared filesystem, it is probably the most appropriate answer to your needs. Keep in mind that, like any distributed storage system, you must install n+1 (i.e. minimum 3 nodes) for redundancy, and it is easy to overlook this fact in the GlusterFS documentation.
from torus.
Have you checked out https://github.com/rook/rook?
from torus.
@SkinyMonkey If you are in a single cloud provider, why are you not simply using the provided storage services? Azure offers both NFS-like storage and volume-like storage. AKS already has storage classes for azure-disk (volume) types out of the box.
Torus' use cases (in my opinion) are for multi-cloud architectures and baremetal clusters, where you are not already provided an economical storage infrastructure. Rook is serving those needs well, and since it is just Ceph under the hood, it is pretty easily debuggable. By and large, the Go codebase of Rook's operators and wrappers is pretty comprehensible and well-architected.
from torus.
Sadly, the NFS-like storage, AzureFile, is extremely slow, to the point of not being usable in a realistic environment. This has been reported multiple times since 2017 and still hasn't been addressed.
As for the volume-like storage, as far as I know you cannot attach the same Azure disk to multiple pods, which makes it useless in a shared file system context.
If it did, that would be a good temporary solution, but the product I'm working on aims to be deployable on premise anyway. We need an abstract solution to this problem.
Rook is nice, and maybe I'm looking at it from the wrong angle, but again, it's an overkill solution for a simple temporary shared file system.
It also has prerequisites that make it difficult to use in my case, as I described earlier (kernel updates, etc.).
I might use Rook or a solution like Scality's or Minio in the end, it's just that torus seemed like a really good project that could benefit a lot of people.
Right now I'm stuck between overkill solutions that need more and more infrastructure/maintenance/development to support them and simple ones that won't survive a node crash.
from torus.
I've checked/considered a lot. Rook/Ceph is in the top 5 with GlusterFS.
I agree that NFS offers no guarantee of reliability.
The product I'm working on demands that I share files between the pods I'm executing. Big files too, hundreds of gigabytes.
Another solution would be to execute all the pods on the same node, but that would lose the real plus of Kubernetes: distributing the computing load over several machines.
GlusterFS was considered, but I met the same problem as I met with Rook: kernel updates or modules are needed.
Yes, any distributed system needs at least 3 nodes to run: MongoDB, RabbitMQ, etcd, any clustered program needs 3 nodes for redundancy. K8s is another example.
Thank you, I'll recheck GlusterFS. Maybe I'll have two separate clusters: one for the file system, one for the computation. That way the filesystem will be less subject to autoscaling needs and problems.
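On the three-node point above: for systems that rely on a majority quorum (etcd is one such system), the minimum follows from simple arithmetic. The sketch below is only an illustration of that arithmetic; the function names are made up for this example, and other systems may use different redundancy rules.

```python
def quorum(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return n - quorum(n)


if __name__ == "__main__":
    # With 1 or 2 nodes, losing a single node loses the majority.
    # 3 is the smallest cluster size that survives one node failure.
    for n in range(1, 6):
        print(f"nodes={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

Note that going from 3 to 4 nodes does not raise the number of tolerated failures, which is why odd cluster sizes are usually recommended for quorum-based systems.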
from torus.
Separate clusters would be precisely my advice, in that case, yes.
Also to note, though, since your ultimate goal is to deploy to baremetal, you might as well be using bare VM instances running your own kernel, rather than the cloud provider's prefabricated Kubernetes, which would give you exactly the same freedom of deployment, kernel-wise, as you would have with baremetal. You can also easily follow a progressive build-out, to reduce the size of each hurdle:
- Deploy to AKS using offered (however slow) shared filesystem there.
- Deploy to Azure bare VMs and construct your storage layer.
- Deploy to baremetal.
The storage interfaces for Kubernetes are nicely abstracted, so you don't need to change your applications just because you change your underlying storage infrastructure. Your YAML files may change, but your application interfaces will be the same throughout that progression.
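As a rough sketch of what "your YAML files may change" can look like in practice, the snippet below builds the same PersistentVolumeClaim for two stages of that progression and shows that only the storage class differs. The claim name, the size, and the storage class names `azurefile` and `rook-cephfs` are assumptions for illustration; storage class names are cluster-specific, so check what your own cluster defines.

```python
import json


def shared_claim(name: str, storage_class: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest requesting shared (RWX) storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany: the volume may be mounted read-write by many nodes.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical storage class names; adjust to the classes in your cluster.
on_aks = shared_claim("shared-data", "azurefile", "500Gi")
on_own_storage = shared_claim("shared-data", "rook-cephfs", "500Gi")

if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(on_aks, indent=2))
    print(json.dumps(on_own_storage, indent=2))
```

Pods would reference the claim by name (`shared-data` here) in both cases, so the application side stays the same.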
I know I keep harping, but I would have very serious reservations about an architecture which calls for shared filesystems, much less with individual files of that size. In my opinion, you are just asking for trouble. I'm not about to dive into an architectural analysis, but I would strongly urge you to take a step back and really analyze your storage parameter needs.
from torus.
Thank you for your time and analysis. We work on both hosted and on-premise solutions.
I'm not sure I understand the problems tied to a shared file system. Maybe because I have never worked with one in production?
In a case where a chain of interdependent programs produces heavy files, I would be curious to see what solution that avoids a shared file system would allow the system to work reasonably fast.
Edit: by juggling with a 'per job' Azure disk, for example, I could build something in a hosted situation, I guess. But I'm not sure how that would work on premise.
from torus.
The core problem is always locking, and POSIX-oriented filesystem semantics just don't provide sufficient locking guarantees or semantics to facilitate safe operation. Various tricks are used to work around this limitation, and the trade-off is always either speed or safety.
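One concrete illustration of the locking point, offered as a sketch rather than as the specific failure mode meant above: the common file locks on POSIX-like systems are advisory, so they only protect against writers that also ask for the lock. The example runs on a single local Unix machine (not NFS) using `fcntl.flock`; the function name and result keys are made up for this example.

```python
import fcntl
import os
import tempfile


def advisory_lock_demo() -> dict:
    """Show that an flock() lock does not stop a writer that ignores it."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    results = {}
    holder = open(path, "r+b")  # takes the exclusive lock
    other = open(path, "r+b")   # a second, independent open of the same file
    try:
        fcntl.flock(holder, fcntl.LOCK_EX)
        try:
            # A cooperating writer asks for the lock and is refused.
            fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            results["second_lock_acquired"] = True
        except BlockingIOError:
            results["second_lock_acquired"] = False
        # A non-cooperating writer simply writes; nothing prevents it.
        other.write(b"unlocked write")
        other.flush()
        holder.seek(0)
        results["data_seen_by_holder"] = holder.read()
    finally:
        holder.close()
        other.close()
        os.remove(path)
    return results


if __name__ == "__main__":
    print(advisory_lock_demo())
```

Safety therefore depends on every program touching the shared files following the same locking discipline, and on the network filesystem propagating those locks correctly between machines.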
If a 'per job' Azure disk is something you are considering, consider using an object store instead. Every major cloud provider supplies one, and there are a few options for self-hosted object stores (including Rook/Ceph). You can have any number of readers, and they handle locking automatically. For extremely large chunks of data, too, most object store APIs (the de facto standard is S3) allow for server-side seeking, to efficiently process smaller chunks at a time. In general, the APIs are very simple, so an object store is nearly as easy to use as a filesystem.
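In S3-style APIs, the "server-side seeking" mentioned above is typically done with an HTTP `Range` header on the object read request, so a worker fetches only the bytes it needs. The sketch below only computes the ranges and header values; it does not talk to any object store, and the function names and sizes are illustrative assumptions.

```python
from typing import Iterator, Tuple


def byte_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets covering an object in chunks."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_header(start: int, end: int) -> str:
    """Format an HTTP Range header value; both offsets are inclusive."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    # Example: a 10-byte object read in 4-byte chunks.
    for start, end in byte_ranges(10, 4):
        print(range_header(start, end))
```

Each header value would be sent with a separate read request, so several workers can process different parts of one large object in parallel.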
from torus.
https://github.com/tiglabs/containerfs
https://github.com/tiglabs/containerfs-csi-driver
from torus.
@Ulexus Ok! That makes sense. I did see some waits on read operations with NFS. They could have been caused by the locking.
I'm going to check object stores on Azure and the on premise solutions.
Thanks again :)
@zhengxiaochuan-3, I did find this repo, but the lack of documentation or orientation made me consider other solutions. Thank you.
from torus.