Except, S3 does let you query by prefix and so the keys have more structure than...

klodolph · on March 10, 2024

That’s kind of stretching the idea of “more structure” to the breaking point, I think. The key is just a string. There is no entry for directories.

> the API implies that common prefixes indicate related objects.

That’s something users do. The API doesn’t imply anything is related.

And prefixes can be anything, not just directories. If you have /some/dir/file.jpg, then you can query using /some/dir/ as a prefix (like a directory!) or you can query using /so as a prefix, or /some/dir/fil as a prefix. It’s just a string. It only looks like a directory when you, the user, decide to interpret the / in the file key as a directory separator. You could just as easily use any other character.
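A minimal sketch of that point, using a hypothetical in-memory list of keys rather than the real S3 API. The helpers `list_by_prefix` and `common_prefixes` are made up for illustration; they only loosely mirror the `Prefix` and `Delimiter` parameters of a list request.

```python
# Hypothetical stand-in for a bucket: a flat list of key strings, nothing more.
keys = ["/some/dir/file.jpg", "/some/dir/other.png", "/some/notes.txt", "/sofa.txt"]

def list_by_prefix(keys, prefix):
    """Return every key that starts with `prefix`, in sorted order."""
    return sorted(k for k in keys if k.startswith(prefix))

def common_prefixes(keys, prefix, delimiter):
    """Group keys under `prefix` by the text up to the next `delimiter`."""
    groups = set()
    for k in keys:
        if not k.startswith(prefix):
            continue
        rest = k[len(prefix):]
        idx = rest.find(delimiter)
        if idx != -1:
            groups.add(prefix + rest[: idx + len(delimiter)])
    return sorted(groups)

print(list_by_prefix(keys, "/some/dir/"))      # looks like a directory listing
print(list_by_prefix(keys, "/so"))             # also matches /sofa.txt
print(list_by_prefix(keys, "/some/dir/fil"))   # a prefix that ends mid-name
print(common_prefixes(keys, "/some/", "/"))    # "/" treated as the separator
print(common_prefixes(keys, "/some/dir/", "."))  # "." works just as well
```

Nothing in the key store knows about `/`. The "directory" only appears when the caller picks a delimiter.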

hiyer · on March 10, 2024

One operation where this difference is significant is renaming a "folder". In UNIX (and even UNIX-y distributed filesystems like HDFS) a rename operation at "folder" level is O(1) as it only involves metadata changes. In S3, renaming a "folder" is O(number of files).

Someone · on March 10, 2024

> In S3, renaming a "folder" is O(number of files).

More like O(max(number of files, total file size)). You can’t rename objects in S3. To simulate a rename, you have to copy an object and then delete the old one.

Unlike renames in typical file systems, that isn’t atomic (there will be a time period in which both the old and the new object exist), and it becomes slower the larger the file.
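A rough sketch of why the cost grows with both object count and total size, assuming a hypothetical in-memory `bucket` dict (key to bytes) in place of real copy and delete requests. The function name `rename_prefix` is invented for this example.

```python
def rename_prefix(bucket, old_prefix, new_prefix):
    """Simulate a "folder" rename on a flat key -> bytes mapping.

    Returns (objects_copied, bytes_copied) to make the cost visible.
    """
    objects_copied = 0
    bytes_copied = 0
    # Materialize the matching keys first so the dict can be mutated safely.
    for key in sorted(k for k in bucket if k.startswith(old_prefix)):
        data = bucket[key]
        new_key = new_prefix + key[len(old_prefix):]
        bucket[new_key] = data      # copy: old and new key both exist here
        if new_key != key:
            del bucket[key]         # delete the old key as a separate step
        objects_copied += 1
        bytes_copied += len(data)
    return objects_copied, bytes_copied

bucket = {"dir/a.txt": b"hello", "dir/b.txt": b"world!!", "other.txt": b"x"}
stats = rename_prefix(bucket, "dir/", "folder/")
print(stats)           # one copy per object, plus every byte copied
print(sorted(bucket))
```

Each object is handled on its own, so a failure partway through would leave some keys under the old prefix and some under the new one.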

pepa65 · on March 10, 2024

From reading the above, if you have a folder 'dir' and a file 'dir/file', after renaming 'dir' to 'folder', you would just have 'folder' and 'dir/file'.

klodolph · on March 10, 2024

There is really no such thing as a folder in S3.

If you have something which is dir/file, then NORMALLY “dir” does not exist at all. Only dir/file exists. There is nothing to rename.

If you happen to have something which is named “dir”, then it’s just another file (a.k.a. object). In that scenario, you have two files (objects) named “dir” and “dir/file”. Weird, but nothing stopping you from doing that. You can also have another object named “dir///../file” or something, although that can be inconvenient, for various reasons.
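To make that concrete, here is a small sketch with a hypothetical flat dict standing in for the bucket. It assumes nothing about the real API beyond "keys are strings".

```python
# Three unrelated objects; the key names merely look nested.
bucket = {"dir": b"", "dir/file": b"data", "dir///../file": b"odd"}

# "Rename" the single object whose key is exactly "dir".
bucket["folder"] = bucket.pop("dir")

remaining = sorted(bucket)
print(remaining)  # "dir/file" is untouched: it was never inside anything
```

Renaming the object called `dir` has no effect on `dir/file`, because the two keys are related only by how a reader chooses to interpret them.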

okr · on March 10, 2024

IMHO, renaming "folders" on S3 results in O(number of files) copy and delete operations.

hiyer · on March 11, 2024

Exactly.

fiddlerwoaroof · on March 10, 2024

> That’s something users do. The API doesn’t imply anything is related.

Querying ids by prefix doesn’t make any sense for a normal ID type. Just making this operation available and part of your public API indicates that prefixes are semantically relevant to your API’s ID type.

klodolph · on March 10, 2024

“Prefix” is not the same thing as “directory”.

I can look up names with the prefix “B” and get Bart, Bella, Brooke, Blake, etc. That doesn’t imply that there’s some kind of semantics associated with prefixes. It’s just a feature of your system that you may find useful. The fact that these names have a common prefix, “B”, is not a particularly interesting thing to me. Just like if I had a list of files, 1.jpg, 10.jpg, 100.jpg, it’s probably not significant that they’re being returned sequentially (because I probably want 2.jpg after 1.jpg).
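A tiny sketch of both examples, assuming results come back in plain lexicographic order of the key strings (the lists here are made up):

```python
names = ["Bart", "Alice", "Bella", "Carl", "Brooke", "Blake"]
b_names = sorted(n for n in names if n.startswith("B"))
print(b_names)   # a prefix match, with no meaning attached to "B"

files = ["1.jpg", "2.jpg", "10.jpg", "100.jpg"]
listed = sorted(files)
print(listed)    # ['1.jpg', '10.jpg', '100.jpg', '2.jpg']
```

The ordering falls out of string comparison, not out of any relationship between the items.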

afiori · on March 10, 2024

By this logic, the file "foo/bar/" corresponds to the filename "f:o:o:/:b:a:r:/" (using a different character as the separator).

fiddlerwoaroof · on March 13, 2024

Exactly