Question 1

Why is s3fs a problem in production?

Accepted Answer

Three reasons: its POSIX support is weak (rename isn't atomic, file locks are no-ops, mmap is unreliable); it has no shared cache, so every node fetches data independently; and it requires per-pod FUSE setup in containers, which breaks in many Kubernetes environments.
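
To make the rename point concrete, here is a minimal sketch of the common "write to a temp file, then rename" pattern that checkpoint writers rely on. The helper name `atomic_write` is hypothetical. On a local POSIX filesystem the final `os.replace` is atomic; on a mount that emulates rename as a copy followed by a delete (as s3fs does on top of S3), a reader can observe a missing or partial file in between.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory (same filesystem) as the target,
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to storage before renaming
        # Atomic on a POSIX filesystem; NOT atomic where rename is copy + delete.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The pattern is only as safe as the filesystem's rename, which is why the "POSIX atomic rename" property matters for checkpointing.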

Question 2

Is Mountpoint for Amazon S3 any better?

Accepted Answer

Somewhat. It's faster and more stable than s3fs for read-heavy sequential access, but it still has limited POSIX support (no random writes, no atomic rename in the general case) and no shared cache across nodes. It's still FUSE-based.

Question 3

Can I use Training Pipes with my existing S3 bucket?

Accepted Answer

Yes. This is the BYO (bring your own) connections feature. Point Training Pipes at your S3 bucket, and we expose it over NFS or SMB through a regional gateway. Your data stays in your S3 bucket; we operate only the gateway layer in front of it.

Question 4

What happens on a cache miss?

Accepted Answer

The gateway fetches the data from the backing object store, caches it on local NVMe, and returns it to the client. Subsequent reads of the same data are served from the cache. The first read pays object-storage latency (~30-50 ms); subsequent reads complete in under a millisecond.
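
The miss-then-hit behavior described above is the classic read-through cache pattern. The sketch below illustrates the pattern only; it is not the gateway's implementation. `ReadThroughCache` and `fake_object_store` are hypothetical names, an in-memory dict stands in for the NVMe cache, and eviction is left out.

```python
from typing import Callable, Dict, List


class ReadThroughCache:
    """Serve reads from a local store, fetching from the backend on a miss."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                 # slow path: the object store
        self._store: Dict[str, bytes] = {}  # fast path: stand-in for local NVMe
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: fetch once, keep a copy, then return it to the caller.
        self.misses += 1
        data = self._fetch(key)
        self._store[key] = data
        return data


backend_calls: List[str] = []


def fake_object_store(key: str) -> bytes:
    """Hypothetical backend that records every request it receives."""
    backend_calls.append(key)
    return f"contents of {key}".encode()


cache = ReadThroughCache(fake_object_store)
first = cache.read("shard-0001")   # miss: goes to the backend
second = cache.read("shard-0001")  # hit: served locally
```

After the two reads, `backend_calls` holds a single entry: only the first read reached the object store.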

Question 5

Do I still need to manage S3 directly?

Accepted Answer

For managed buckets: no, we handle it. For BYO connections: you still own the bucket, its lifecycle rules, and its permissions. The gateway reads and (optionally) writes via scoped credentials you provide.
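
As an illustration of what "scoped credentials" could look like on the S3 side, the sketch below builds a standard IAM policy document that grants list and read access to one bucket, with write access as an opt-in. The exact permissions Training Pipes needs are not stated here, so treat the action list as an assumption; the bucket name and the `scoped_bucket_policy` helper are hypothetical.

```python
import json
from typing import Any, Dict


def scoped_bucket_policy(bucket: str, allow_writes: bool = False) -> Dict[str, Any]:
    """Build an IAM policy limited to one bucket (assumed permission set)."""
    object_actions = ["s3:GetObject"]
    if allow_writes:
        object_actions.append("s3:PutObject")  # only when the gateway should write
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object reads (and optional writes) apply to keys in the bucket.
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy = scoped_bucket_policy("example-training-data")
print(json.dumps(policy, indent=2))
```

Keeping the policy limited to a single bucket means that bucket ownership, lifecycle rules, and broader permissions stay under your control.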

Feature	s3fs	Training Pipes
Protocol	FUSE → HTTP (S3 API)	NFSv4 / SMB → gateway → S3
POSIX atomic rename	Emulated (unsafe)	Yes
POSIX file locking	No-op	Yes
mmap support	Unreliable	Yes
Shared cache across nodes (the biggest difference for multi-node training)	No	Yes
Prefetch / preload	No	Yes
Container/Kubernetes support	Requires privileged pods	Standard NFS CSI
Per-read S3 request cost	Every read on cold cache	Only on cache miss
Tail latency under load	100ms-1s+	Sub-ms on cache hit
Works with BYO S3/GCS/Azure	S3-compatible only	Yes
Managed service	No	Yes
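
To see why the shared cache row matters for multi-node training, here is a rough back-of-the-envelope sketch. It assumes every node reads every object once from a cold start and that each object costs one backend request; real clients may split reads into several ranged requests. The function name and the node and object counts are illustrative, not measurements.

```python
def backend_fetches(nodes: int, objects: int, shared_cache: bool) -> int:
    """Object-store requests needed for every node to read every object once."""
    if shared_cache:
        # The first reader populates the cache; other nodes hit it.
        return objects
    # Each node fetches independently from the object store.
    return nodes * objects


independent = backend_fetches(nodes=8, objects=10_000, shared_cache=False)
shared = backend_fetches(nodes=8, objects=10_000, shared_cache=True)
print(independent, shared)  # 80000 10000
```

Under these assumptions the request count grows with the number of nodes when caches are independent, and stays flat when the cache is shared.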

Comparison

s3fs vs Training Pipes
