Because then you run into an issue when your 'n' changes. Plus, where do you increment it? That would require a single fault-tolerant ticker (some systems do that, btw).
Once you encode the shard number into the ID, you get:
- instantly* knowing which shard to query
- each shard has its own ticker
* programmatically, and maybe visually as well, depending on the implementation
I had IDs that encoded: entity type (IIRC 4 bits?), timestamp, shard, sequence per shard. We even had an admin page where you could paste an ID and it would decode it.
id % n is fine for a cache, because you can just throw the whole thing away and repopulate, or when 'n' never changes, but it usually does.
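A minimal sketch of that kind of ID layout, with the decode step the admin page would do. The bit widths and field order here are my guesses (Snowflake-style), not the actual scheme from the comment:

```python
# Illustrative bit widths: 4-bit entity type, 41-bit ms timestamp,
# 8-bit shard, 11-bit per-shard sequence. All assumptions.
ENTITY_BITS, TS_BITS, SHARD_BITS, SEQ_BITS = 4, 41, 8, 11

def encode_id(entity_type: int, timestamp_ms: int, shard: int, seq: int) -> int:
    # Pack the fields into one integer, highest-order field first.
    assert entity_type < (1 << ENTITY_BITS)
    assert timestamp_ms < (1 << TS_BITS)
    assert shard < (1 << SHARD_BITS)
    assert seq < (1 << SEQ_BITS)
    return ((entity_type << (TS_BITS + SHARD_BITS + SEQ_BITS))
            | (timestamp_ms << (SHARD_BITS + SEQ_BITS))
            | (shard << SEQ_BITS)
            | seq)

def decode_id(id_: int) -> dict:
    # Reverse the packing; this is what the "paste an ID" page would show.
    return {
        "entity_type": id_ >> (TS_BITS + SHARD_BITS + SEQ_BITS),
        "timestamp_ms": (id_ >> (SHARD_BITS + SEQ_BITS)) & ((1 << TS_BITS) - 1),
        "shard": (id_ >> SEQ_BITS) & ((1 << SHARD_BITS) - 1),
        "seq": id_ & ((1 << SEQ_BITS) - 1),
    }
```

With this layout, routing a query is just `decode_id(some_id)["shard"]`, no modulo over a changing 'n' involved.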
Up until Van Buren v. United States in 2021, ToS violations were sometimes prosecuted as unauthorized access under the CFAA. I suspect there are other jurisdictions that still do the equivalent.
Yeah the needle in a haystack tests are so stupid. It seems clear with LLMs that performance degrades massively with context size, yet those tests claim the model performs perfectly.
As someone who regularly abuses Gemini with a 90% full context, model performance definitely degrades, but I wouldn't call it massive.
I can't show any evidence as I don't have such tests, but it's like coding normally vs coding after a beer or two.
For the massive effect, fill it 95% and we're talking vodka shots. 99%? A zombie who can code. But perhaps that's not fair when you have 1M token context size.
Basing it around the act of selling data seems like a much better approach to me than what OP suggested, I agree. I imagine there are edge cases to consider around how acquisitions of company assets would work, although it’s not a use case I particularly care to defend.
“Intent to track” could be an approach, but the toll bridges near me use license plate scanners for payment, so I could see it not being that clear cut. There are likely other valid use cases, like statistical surveys, congestion pricing laws, etc.