Hacker News

You might have missed the big H2 section in the article:

"Recommendation: Stick with sequences, integers, and big integers"

After that then, yes, UUIDv7 over UUIDv4.

This article is a little older. PostgreSQL didn't have native support so, yeah, you needed an extension. Today, PostgreSQL 18 is released with UUIDv7 support... so the extension isn't necessary, though the extension does make the claim:

"[!NOTE] As of Postgres 18, there is a built in uuidv7() function, however it does not include all of the functionality below."

What those features are and if this extension adds more cruft in PostgreSQL 18 than value, I can't tell. But I expect that the vast majority of users just won't need it any more.
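For context on the v7 versus v4 preference: a UUIDv7 value starts with a 48-bit Unix millisecond timestamp, so newer keys sort after older ones, while a v4 value is random throughout. The Python below is a minimal sketch of that bit layout as described in RFC 9562. `uuid7_sketch` is a hypothetical helper for illustration only; it is not PostgreSQL's `uuidv7()` nor the extension's implementation.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-shaped value (illustrative, not PostgreSQL's code)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                  # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)      # low 62 random bits
    value = (unix_ms & ((1 << 48) - 1)) << 80  # timestamp in the top 48 bits
    value |= 0x7 << 76                   # version field = 7
    value |= rand_a << 64
    value |= 0b10 << 62                  # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


older = uuid7_sketch(unix_ms=1_700_000_000_000)
newer = uuid7_sketch(unix_ms=1_700_000_000_001)
print(older, newer, older < newer)
```

Because the timestamp occupies the most significant bits, values created in later milliseconds compare higher, which is what keeps inserts near the right-hand edge of a B-tree index.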





Sticking with sequences and other integer types will cause problems if you need to shard later.

Especially in larger systems, how does one solve the issue of reaching the max value of an integer in their database? Sure, for an unsigned bigint that's hard to achieve, but regular ints? Apps quickly outgrow that.

OK... but that concern seems a bit artificial. If bigints are appropriate: use them. If the table won't get to bigint sizes: don't. I've even used smallint for some tables I knew were going to be very limited in size. But I wouldn't worry about smallint's very limited number of values for those tables that required a larger size for more records: I'd just use int or bigint for those other tables as appropriate. The reality is that, unless I'm doing something very specific where being worried about the number of bytes will matter... I just use bigint. Yes, I'm probably being wasteful, but in the cases where those several extra bytes per record are going to really add up... I probably need bigint anyway, and in cases where bigint isn't going to matter the extra bytes are relatively small in aggregate. The consistency of simply using one type itself has value.

And for those using ints as keys... you'd be surprised how many databases in the wild won't come close to consuming that many IDs or are for workloads where that sort of volume isn't even aspirational.

Now, to be fair, I'm usually in the UUID camp and am using UUIDv7 in my current designs. I think the parent article makes good points, but I'm after a different set of trade-offs where UUIDs are worth their overhead. Your mileage and use-cases may vary.


Idk, I use whatever scales best, and that would be a close-to-infinite scaling key. The performance compromise is probably zeroed out once you have to adapt your database to a different one supporting the current scale of the product. That's for software that has to scale. Whole different story for stuff that doesn't have to grow, obviously. I am in the UUID camp too, but I don't care whether it's v4 or v7.

It's not like there are dozens of options and you constantly have to switch. You just have to estimate whether, at maximum growth, your table will have 32 thousand, 2 billion, or 9 quintillion entries. And even if you go with 9 quintillion for all cases, you still use half the space of a UUID.
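The three figures above are the maximum values of signed 2-, 4-, and 8-byte integers. A small Python sketch of that arithmetic, assuming signed key types as in PostgreSQL (`smallest_type_for` is a hypothetical helper):

```python
# Storage sizes in bytes for signed integer key types, plus a 16-byte UUID.
KEY_BYTES = {"smallint": 2, "integer": 4, "bigint": 8}
UUID_BYTES = 16


def max_signed(n_bytes):
    """Largest value a signed integer of n_bytes can hold."""
    return 2 ** (8 * n_bytes - 1) - 1


def smallest_type_for(max_rows):
    """Pick the narrowest type whose range covers the estimated row count."""
    for name, n_bytes in KEY_BYTES.items():
        if max_rows <= max_signed(n_bytes):
            return name
    raise ValueError("more rows than a bigint can number")


for name, n_bytes in KEY_BYTES.items():
    print(f"{name:8} {n_bytes} bytes, max {max_signed(n_bytes):,}")
print("bigint / UUID size ratio:", KEY_BYTES["bigint"] / UUID_BYTES)
```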

UUIDv4s are great for when you add sharding, and UUIDs in general prevent issues with mixing IDs from different tables. But if you reach the kind of scale where you have 2 billion of anything, UUIDs are probably not the best choice either.


There are plenty of ways to deal with that. You can shard by some other identifier (though I then question your table design), you can assign ranges to each shard, etc.
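As one illustration of the range approach, each shard can own a disjoint block of the integer key space, so the shard is recoverable from the ID by integer division. This is a sketch only; `BLOCK_SIZE` and both function names are assumptions, not something from the thread.

```python
BLOCK_SIZE = 1_000_000_000  # assumed number of IDs reserved per shard


def id_range_for(shard):
    """Inclusive (first, last) ID that the given shard may hand out."""
    start = shard * BLOCK_SIZE
    return start, start + BLOCK_SIZE - 1


def shard_for(record_id):
    """Recover the owning shard from an ID allocated under this scheme."""
    return record_id // BLOCK_SIZE


print(id_range_for(2), shard_for(2_500_000_000))
```

Each shard would run its own sequence within its block; adding a shard means assigning it the next unused block rather than moving existing rows.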

I’m really no expert on sharding but if you’re using increasing ints why can’t you just shard on (id % n) or something?

Because then you run into an issue when 'n' changes. Plus, where are you incrementing it? This will require a single fault-tolerant ticker (some do that, btw).

Once you encode the shard number into the ID:

- you instantly* know which shard to query

- each shard has its own ticker

* programmatically, maybe visually as well, depending on implementation

I had IDs that encoded: entity type (IIRC 4 bits?), timestamp, shard, and sequence per shard. We even had an admin page where you could paste an ID and it would decode it.
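A minimal sketch of what such an ID could look like in Python. Only the field list (entity type, timestamp, shard, per-shard sequence) and the rough 4-bit type width come from the comment above; the other bit widths, the custom epoch, and the function names are assumptions chosen so the result fits in a signed 64-bit bigint.

```python
# Assumed widths: 4 + 41 + 10 + 8 = 63 bits, which fits a signed bigint.
TYPE_BITS, TIME_BITS, SHARD_BITS, SEQ_BITS = 4, 41, 10, 8
EPOCH_MS = 1_600_000_000_000  # assumed custom epoch, in Unix milliseconds


def encode_id(entity_type, timestamp_ms, shard, sequence):
    """Pack the four fields into one integer, most significant field first."""
    fields = (
        (entity_type, TYPE_BITS),
        (timestamp_ms - EPOCH_MS, TIME_BITS),
        (shard, SHARD_BITS),
        (sequence, SEQ_BITS),
    )
    value = 0
    for field, bits in fields:
        if not 0 <= field < (1 << bits):
            raise ValueError(f"{field} does not fit in {bits} bits")
        value = (value << bits) | field
    return value


def decode_id(value):
    """Unpack an ID produced by encode_id (the 'admin page' direction)."""
    sequence = value & ((1 << SEQ_BITS) - 1)
    value >>= SEQ_BITS
    shard = value & ((1 << SHARD_BITS) - 1)
    value >>= SHARD_BITS
    timestamp_ms = (value & ((1 << TIME_BITS) - 1)) + EPOCH_MS
    entity_type = value >> TIME_BITS
    return {"entity_type": entity_type, "timestamp_ms": timestamp_ms,
            "shard": shard, "sequence": sequence}


example = encode_id(entity_type=3, timestamp_ms=1_700_000_000_000, shard=42, sequence=7)
print(example, decode_id(example))
```

Routing a query then only needs `decode_id(some_id)["shard"]`, with no lookup table and no dependence on how many shards exist.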

id % n is fine for a cache, because you can just throw the whole thing away and repopulate, or when 'n' never changes, but it usually does.
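To make the cost of changing 'n' concrete, this short Python sketch counts how many IDs land on a different shard when the shard count changes under plain `id % n` placement (`moved_fraction` is a hypothetical helper):

```python
def moved_fraction(ids, old_n, new_n):
    """Fraction of IDs whose shard changes when going from old_n to new_n shards."""
    ids = list(ids)
    moved = sum(1 for i in ids if i % old_n != i % new_n)
    return moved / len(ids)


print(moved_fraction(range(1000), 4, 5))
```

For IDs 0 through 999, going from 4 to 5 shards relocates 80% of the rows: an ID stays put only when its remainder modulo 20 is 0, 1, 2, or 3. That is tolerable for a cache you can rebuild, and painful for a primary data store.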


^ This

This is mentioned, and in many applications you can safely say you will never need to shard.

Yes, but if you do need to, it's much simpler if you were using UUIDs from the beginning. I'm personally not convinced that any of the tradeoffs that come with a more traditional key are worth the headache that could come in a scenario where you do need to shard. I started a company last year, and the DB has grown wildly beyond our expectations. I did not expect this, and it continues to grow (good problem to have). It happens!


