
I’m not sure why you’d want to build an LLM these days - you won’t be able to train it anyway. It’d make a lot of sense to teach people how to build stuff with LLMs, not LLMs themselves.


This has been said about pretty much every subject: writing your own browsers, compilers, cryptography, etc. But at least for me, even if nothing comes of it, just knowing how it really works and what steps are involved is part of using things properly. Some people are perfectly happy using a black box, but without knowing how it's made, how do we know the limits? How will the next generation of LLMs happen if nobody can get excited about the internal workings?


You don’t need to write your own LLM to know how it works. And unlike, say, a browser, it doesn’t really do anything even remotely impressive unless you have at least a few tens of thousands of dollars to spend on training. Source: my day job is to do precisely what I’m telling you not to bother doing, but I do have access to a large pool of GPUs. If I didn’t, I’d be doing what I suggest above.


Good points. For learning purposes, just understanding what a neural network is and how it works covers it all.


But I mean, people can always rent GPUs too. And they're getting pretty ubiquitous as we ramp up from the AI hype craze. I am just an IT monkey at the moment, and even I have on-demand access to a server with something like 4x192GB GPUs at work.


Have you tried renting a few hundred GPUs in public clouds? Or TPUs for that matter? For weeks or months on end?


It's possible to train useful LLMs on affordable hardware. It depends on what kind of LLM you want. Sure, you won't build the next ChatGPT, but not every language task requires a universal general-purpose LLM with billions of parameters.


It's so fun! And for me at least, it sparks a lot of curiosity to learn the theory behind them, so I would imagine it is similar for others. And some of that theory will likely cross over to the next AI breakthrough. So I think this is a fun and interesting vehicle for a lot of useful knowledge. It's not like building compilers is still super relevant for most of us, but many people still learn to do it!



