binsquare · 2025-12-24T20:01:59 1766606519

Things get better as the technology gets more mature. It's a promising start for sure.

binsquare · 2025-12-21T00:44:40 1766277880

:( wish it was free

binsquare · 2025-12-20T20:36:53 1766263013

I run a crowd-sourced website that collects data on the best and cheapest hardware setups for local LLMs: https://inferbench.com/

Source code: https://github.com/BinSquare/inferbench

nodja · 2025-12-20T22:39:04 1766270344

Cool site, I noticed the 3090 is on there twice.

https://inferbench.com/gpu/NVIDIA%20GeForce%20RTX%203090

https://inferbench.com/gpu/NVIDIA%20RTX%203090

binsquare · 2025-12-20T22:40:53 1766270453

Oh nice catch, I'll fix that

---

Edit: Fixed
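For anyone curious how duplicates like the two 3090 entries above can be caught, here is a minimal sketch of name normalization. It is not how inferbench does it; the `canonical_gpu_name` helper and the list of dropped tokens are assumptions made up for illustration.

```python
import re

# Assumption: marketing tokens that do not distinguish one card from another.
DROPPED_TOKENS = {"geforce"}


def canonical_gpu_name(name: str) -> str:
    """Lowercase, collapse whitespace, and drop tokens that only add noise."""
    tokens = re.split(r"\s+", name.strip().lower())
    return " ".join(t for t in tokens if t not in DROPPED_TOKENS)


# The two spellings from the URLs above map to the same key.
a = canonical_gpu_name("NVIDIA GeForce RTX 3090")
b = canonical_gpu_name("NVIDIA RTX 3090")
```

Submissions could then be grouped by the canonical key rather than by the raw string the user typed.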

kilpikaarna · 2025-12-20T23:45:33 1766274333

Nice! Though for older hardware it would be nice if the price reflected the current second-hand market (harder to get data for, I know). E.g. the Nvidia RTX 3070 ranks as the second-best GPU in tok/s/$ even at its MSRP of $499, but you can get one for half that now.
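To make the arithmetic concrete: tok/s/$ is inversely proportional to the price used, so halving the price doubles the score. A small sketch, where the throughput figure is a made-up placeholder and not a measured result:

```python
def tokens_per_second_per_dollar(tok_per_s: float, price_usd: float) -> float:
    """Throughput per dollar; the price must be positive."""
    if price_usd <= 0:
        raise ValueError("price_usd must be positive")
    return tok_per_s / price_usd


# Hypothetical throughput, chosen only for illustration.
tok_per_s = 50.0
msrp = 499.0     # MSRP quoted in the comment above
used = msrp / 2  # "half that", per the comment above

at_msrp = tokens_per_second_per_dollar(tok_per_s, msrp)
at_used = tokens_per_second_per_dollar(tok_per_s, used)
```

Whatever the real throughput is, `at_used` comes out at twice `at_msrp`, which is why the price source matters so much for the ranking.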

binsquare · 2025-12-21T00:43:41 1766277821

Great idea - I've added it, using prices I collected by manually browsing eBay as the initial data.

So it's just a static value in this hardware list: https://github.com/BinSquare/inferbench/blob/main/src/lib/ha...

Let me know if you know of a better way, or contribute :D

jsight · 2025-12-21T02:02:09 1766282529

It seems like verification might need to be improved a bit? I looked at Mistral-Large-123B. Someone is claiming 12 tokens/sec on a single RTX 3090 at FP16.

Perhaps some filter could cut out submissions that don't really make sense?
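One possible filter, sketched under assumptions: compare the size of the weights alone against the total VRAM and flag anything that cannot fit. The function names and the bytes-per-parameter table are hypothetical, and a flagged entry is only a candidate for review, since CPU offload can make an oversized model run (just much more slowly).

```python
# Approximate storage per parameter; ignores KV cache and runtime overhead.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}


def weights_gb(params_billion: float, precision: str) -> float:
    """Size of the weights alone in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def needs_review(params_billion: float, precision: str,
                 vram_gb: float, gpu_count: int = 1) -> bool:
    """True when the weights cannot fit in the combined VRAM."""
    return weights_gb(params_billion, precision) > vram_gb * gpu_count


# The case above: a 123B model at FP16 on one RTX 3090 (24 GB of VRAM).
flagged = needs_review(123, "FP16", vram_gb=24)
```

Here the weights come to 246 GB against 24 GB of VRAM, so the submission would be flagged.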

binsquare · 2025-12-18T22:40:16 1766097616

I am not working on web or server stuff.

I'm building a better primitive for infrastructure using microVMs (think virtual machines, but fast and easy to use).

I am about to launch a complete rewrite of this: https://github.com/BinSquare/ERA

binsquare · 2025-12-18T09:35:05 1766050505

I run a keyword research tool that scans posts across social media sites like Bluesky, Mastodon, Hacker News, etc.

KeywordsPal.com

The technical side of scanning 50k posts a day as cheaply as possible is actually super interesting. I write about it here: https://keywordspal.com/blog/building-multi-platform-content...

I also built it because I was unsatisfied with f5bot.

lippihom · 2025-12-22T14:13:08 1766412788

What were your issues with f5bot? I tried to sign up and Supabase auth went to spam btw.

binsquare · 2025-12-17T21:28:39 1766006919

Rohit was the driving force behind Alexa before he went on to AGI.

With this change in leadership, I'm not confident the ship is going in the right direction.

xendo · 2025-12-17T23:01:02 1766012462

You really think that Rohit was driving Alexa or AGI in the right direction?

binsquare · 2025-12-18T10:45:11 1766054711

I think the Nova foundation models + Bedrock + ML services are headed in the right direction.

binsquare · 2025-12-16T20:18:22 1765916302

The same way you learn to trust other devs to do work.

I see AI as a tool, not a peer. I trust a peer when we're aligned on the requirements of the project and where we want to go.

So how do we get confidence in a workflow of agents that develop, review, and test without a human verifying the implementation?

I personally don't see myself getting there.

binsquare · 2025-12-16T19:15:33 1765912533

added!

binsquare · 2025-12-14T23:43:16 1765755796

Woah, what are you using for the isolation?

shinpr · 2025-12-15T02:25:59 1765765559

It’s not OS-level sandboxing or containers.

Each sub-agent is executed as a separate CLI invocation (e.g. Cursor CLI or Claude Code), which means it gets a fresh model context window. The isolation is purely at the LLM context level, not process or filesystem isolation.

The main agent passes only minimal inputs (file paths, task instructions), gets a concise result back, and keeps its own context clean.
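A rough sketch of that pattern, assuming nothing about the real CLIs: each sub-agent is a fresh child process that receives only a task and file paths and returns a short result. The `run_subagent` helper is hypothetical, and the child here is a stand-in Python one-liner rather than Cursor CLI or Claude Code, whose actual flags are not shown in this thread.

```python
import subprocess
import sys


def run_subagent(command: list[str], task: str, file_paths: list[str]) -> str:
    """Run one sub-agent as a separate process and return its trimmed output.

    Only the task and the file paths cross the boundary, so nothing from
    the caller's conversation leaks into the child's context.
    """
    prompt = task + "\n" + "\n".join(file_paths)
    completed = subprocess.run(
        command, input=prompt, capture_output=True, text=True, check=True
    )
    return completed.stdout.strip()


# Stand-in for a real agent CLI: it just reports how many lines it received.
stand_in = [
    sys.executable,
    "-c",
    "import sys; print('received', len(sys.stdin.read().splitlines()), 'lines')",
]
summary = run_subagent(stand_in, "Summarize the module", ["src/a.py", "src/b.py"])
```

Note this gives a separate process, but as described above the isolation that matters is the fresh context window; the child still shares the same filesystem.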

binsquare · 2025-12-14T23:42:27 1765755747

neat - but the link seems to be broken btw