It's certainly not uncommon to cache deps in CI. But at least at one point, CircleCI was so slow at saving and restoring the cache that it was actually faster to just download all the deps. Generally speaking, for small/medium projects installing all deps is very fast and bandwidth is basically free, so it's natural that many projects don't cache any of it.
They're not being harassed. You're basically saying that because FFmpeg doesn't have enough resources to fix all the security vulnerabilities, we should fix it by pretending there are none.
It requires "AI" in the sense of how we all wave our hands and call everything AI nowadays. But a daily digest of the past day, upcoming day and future events/tasks would be a good "AI" feature that might actually be useful.
It's trivial to get a better score than GPT-4 at 1% of the cost by using my proprietary routing algorithm that routes all requests to Gemini 2.5 Flash. It's called GASP (Gemini Always, Save Pennies).
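The entire "routing algorithm", sketched in Python for anyone who wants to license it (the function name is mine, purely illustrative):

    def gasp_route(request: str) -> str:
        # GASP: Gemini Always, Save Pennies.
        # Sophisticated, proprietary model-selection logic:
        return "gemini-2.5-flash"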
Does anyone working in an individual capacity actually end up paying for Gemini (Flash or Pro)? Or does Google boil you like a frog and you end up subscribing?
If I actually had time to work on my hobby projects, Gemini Pro would be the first thing I'd spend money on. As it is, it's amazing how much progress you can squeeze out of those 5 chats every 24h; I can get a couple of hours of before-times hacking done in 15 minutes, which is incidentally when free usage gets throttled and my free time runs out.
I've used Gemini in a lot of personal projects. At this point I've probably made tens of thousands of requests, sometimes exceeding 1k per week. So far, I haven't had to pay a dime!
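For anyone curious, all it takes is a free API key from AI Studio; a minimal sketch, assuming the google-generativeai Python SDK (the model name and prompt are just placeholders for whatever's current):

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # free-tier key from AI Studio
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content("Summarize these release notes: ...")
    print(response.text)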
"When you use Unpaid Services, including, for example, Google AI Studio and the unpaid quota on Gemini API, Google uses the content you submit to the Services and any generated responses to provide, improve, and develop Google products and services and machine learning technologies, including Google's enterprise features, products, and services, consistent with our Privacy Policy.
To help with quality and improve our products, human reviewers may read, annotate, and process your API input and output. Google takes steps to protect your privacy as part of this process. This includes disconnecting this data from your Google Account, API key, and Cloud project before reviewers see or annotate it. Do not submit sensitive, confidential, or personal information to the Unpaid Services."
You get 1500 prompts on AI Studio across a few Gemini Flash models. I think I saw 250 or 500 for 2.5. It's basically free and beats the consumer rate limits of the big apps (Claude, ChatGPT, Gemini, Meta). I wonder when they'll cut this off.
What does "choked on it" mean for you? Gemini 2.5 pro gives this, even estimating what amouns of those 3m ships that sank after pianos became common item. Not pasting the full reasoning here since it's rather long.
Combining our estimates:
From Shipwrecks: 12,500
From Dumping: 1,000
From Catastrophes: 500
Total Estimated Pianos at the Bottom of the Sea ≈ 14,000
Also I have to point out that 4o isn't a reasoning model and neither is Sonnet 4, unless thinking mode was enabled.
That very much depends on which AGI definition you are using. I imagine there are a dozen or so variants out there. See also "AI" and "agents" and (apparently) "vibe coding" and pretty much every other piece of jargon in this field.
I think it's a very widely accepted definition, and there are really no competing definitions either, as far as I know. While some people might think AGI means superintelligence, that's only because they've heard the term but never bothered to look up what it means.
"Artificial general intelligence (AGI) is a field of theoretical AI research that attempts to create software with human-like intelligence and the ability to self-teach. The aim is for the software to be able to perform tasks that it is not necessarily trained or developed for."
"Artificial General Intelligence (AGI) is an important and sometimes controversial concept in computing research, used to describe an AI system that is at least as capable as a human at most tasks. [...] We argue that any definition of AGI should meet the following six criteria: We emphasize the importance of metacognition, and suggest that an AGI benchmark should include metacognitive tasks such as (1) the ability to learn new skills, (2) the ability to know when to ask for help, and (3) social metacognitive abilities such as those relating to theory of mind. The ability to learn new skills (Chollet, 2019) is essential to generality, since it is infeasible for a system to be optimized for all possible use cases a priori [...]"
The key difference appears to be around self-teaching and meta-cognition. The OpenAI one shortcuts that by focusing on "outperform humans at most economically valuable work", but others make that ability to self-improve key to their definitions.
Note that you said "AI that will perform on the level of average human in every task" - which disagrees very slightly with the OpenAI one (they went with "outperform humans at most economically valuable work"). If you read more of the DeepMind paper it mentions "this definition notably focuses on non-physical tasks", so their version of AGI does not incorporate full robotics.
I think the G is what really screws things up. I thought it meant "as good as the typical human", but upon googling, it has a defined meaning among researchers. There appears to be confusion all over the place, though.
General-Purpose (Wide Scope): It can do many types of things.
Generally as Capable as a Human (Performance Level): It can do what we do.
Possessing General Intelligence (Cognitive Mechanism): It thinks and learns the way a general intelligence does.
So, for researchers, general intelligence is characterized by: applying knowledge from one domain to solve problems in another, adapting to novel situations without being explicitly programmed for them, and having a broad base of understanding that can be applied across many different areas.
The Towers of Hanoi example demonstrates that SOTA RLMs struggle with tasks a pre-schooler can solve.
The implication here is that they excel at things that occur very often and are bad at novelty. This is good for individuals (by using RLMs I can quickly learn about many other areas of the human body of knowledge in a way that would be impossible or inefficient with traditional methods), but they are bad at innovation. Which, honestly, is not necessarily bad: we can offload lower-level tasks [0] to RLMs and pursue innovation as humans.
[0] Usual caveats apply: with time, the population of people actually good at these low-level tasks will diminish, just as we now have very few assembly programmers for Intel/AMD processors.
The argument in (1) doesn't really have anything to do with humans or anthropomorphising. We're not even discussing AGI, we're just talking about the property of "thinking".
If somebody claims "computers can't do X, hence they can't think", a valid counterargument is "humans can't do X either, but they can think."
It's not important for the rebuttal that we used humans, just that there exist entities that don't have property X but are able to think. This shows X is not required for our definition of "thinking".
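Made explicit, the rebuttal is just a counterexample to a universal claim (Thinks and CanDoX are made-up predicates, purely for illustration):

    \text{Objection's premise: } \forall e,\ \text{Thinks}(e) \Rightarrow \text{CanDoX}(e)
    \text{Counterexample: } \exists h,\ \text{Thinks}(h) \land \neg\text{CanDoX}(h) \quad \text{(e.g. a human)}

One counterexample sinks the universal premise, so "computers can't do X" on its own tells us nothing about whether they can think.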
Certainly many cultures and religions believe in some flavor of intelligent design, but you could argue that if the natural world (or what we generally regard as "the natural world") is created by the same entity or entities that created humans, that doesn't make humans artificial. Ignoring the metaphysical (souls and such), I'm struggling to think of a culture that believes the origin of humans isn't shared by the world.
In this case, I was thinking of unusual beliefs like aliens creating humans or humans appearing abruptly from an external source such as through panspermia.
Why does AGI even need to be as good as the average human? Someone with an 80 IQ is still smart enough to reason and do plenty of menial tasks. Also, I'm not sure why AGI needs to be as good in every task? The average human will excel at a few tasks and suck terribly at many others.
Because that’s how AGI is defined. https://en.wikipedia.org/wiki/Artificial_general_intelligenc...: “Artificial general intelligence (AGI)—sometimes called human‑level intelligence AI—is a type of artificial intelligence that would match or surpass human capabilities across virtually all cognitive tasks”
But yes, you’re right that software need not be AGI to be useful. Artificial narrow intelligence or weak AI (https://en.wikipedia.org/wiki/Weak_artificial_intelligence) can be extremely useful, even something as narrow as a service that transcribes speech and can’t do anything else.
AGI should perform on the level of an experienced professional in every task. The average human is useless for pretty much everything but capable of learning to perform almost any task, given enough motivation and effort.
Or perhaps AGI should be able to reach the level of an experienced professional in any task. Maybe a single system can't be good at everything, if there are inherent trade-offs in learning to perform different tasks well.
For comparison, the average person can't print Hello World in Python. Your average programmer (probably) can.
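For reference, the entirety of the task in question:

    print("Hello, world!")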
It's surprisingly simple to be above average at most tasks, which people often confuse with having expertise. It's probably pretty easy to get into the 80th percentile of most subjects. That won't put you in the 80th percentile of people who actually do the thing, but most people don't do it at all. I'd wager the 80th percentile is still amateur.
The G in AGI stands for "general", not for "superhuman". An intelligence that can't learn to perform information processing and decision-making tasks people routinely do does not seem very general to me.
Here is the big question: should it be equal to or better than every single person? If we assume that every healthy person is 'generally intelligent', then that is probably the benchmark. Not every person can do the tasks that other people do routinely, so probably we shouldn't demand that from AGI either, at least not from a single model. But it makes sense to require that a specialized model can be created (or trained, fine-tuned) for every task humans can do.
The average human is good at something and sucks at almost everything else. Peak human performance at chess and average performance at chess differ by 7 orders of magnitude.
Most people are. One of my pet peeves is that people falsely equate AGI with ASI, constantly. We have had full AGI for years now. It is a powerful tool, but not what people tend to think of as god-like “AGI.”
Models still have extreme limits relative to humans, context size and reasoning depth being the two most obvious. A third is their inability to incorporate new information with as little effort as humans do, without creating unintended conflicts with previously learned information.
But they vastly exceed human capabilities in other ways. The most obvious is their ability to do shallow reasoning that incorporates information from virtually any combination of the vast number of topics humans find useful or interesting. Another is their ability to produce, by default, discourse with a high degree of written organization and grammatical quality.
For now, they are artificial "better at different things" intelligences.
I don't think most of the objections are poor at all, apart from 3; it's this article that seems to make lots of strawmen. The first objection in particular is often heard because people claim "this paper proves LLMs don't reason". The author moves the goalposts and argues instead about whether LLMs lead to AGI, which is already a strawman against those arguments. And in addition, he even seems to misunderstand AGI, thinking it's some sort of superintelligence ("We have every right to expect machines to do things we can’t"). AI that can do everything at least as well as the average human is AGI by definition.
It's an especially weird argument considering that LLMs are already ahead of humans at the Tower of Hanoi. I bet the average person would not be able to "one-shot" the moves for an 8-disk Tower of Hanoi without writing anything down or tracking the state with actual disks. LLMs have far bigger obstacles to reaching AGI, though.
5 is also a massive strawman, with the "not see how well it could use preexisting code retrieved from the web" bit, given that these models will write code to solve these kinds of problems even if you come up with some new problem that doesn't exist in their training data.
Most of these are just valid issues with the paper. They're not supposed to be arguments that try to make everything the paper said invalid. The paper didn't really even make any bold claims; it only concluded that LLMs have limitations in their reasoning. It had a catchy title, and many people didn't read past that.
> It's an especially weird argument considering that LLMs are already ahead of humans at the Tower of Hanoi
No one cares about the Towers of Hanoi, nor about any other logic puzzle like it. People want AIs that solve novel problems for their businesses: the kind of problems regular business employees solve every single day, yet LLMs make a mess of.
The purpose of the Apple paper is not to reveal the fact that LLMs routinely fail to solve these problems. Everyone who uses them already knows this. The paper is an argument for why this happens (lack of reasoning skills).
No number of demonstrations of LLMs solving well-known logic puzzles (or other problems humans have already solved) will prove reasoning. It's not interesting at all to solve a problem that humans have already solved (with working software to solve every instance of the problem).
I'm more saying that points 1 and 2 get subsumed under point 5 - to the extent that existing algorithms / logical systems for solving such problems are written by humans, an AGI wouldn't need to match the performance of those algorithms / logical systems - it would merely need to be able to create / use such algorithms and systems itself.
You make a good point though that the question of whether LLMs reason or not should not be conflated with the question of whether they're on the pathway to AGI or not.
Right, I agree there. Also, that's something LLMs can already do. If you give the problem to ChatGPT's o3 model, it will actually write Python code, run it, and give you the solution. But I think points 1 and 2 are still very valid things to talk about, because while the Tower of Hanoi can be solved by writing code, that doesn't apply to every problem that requires extensive reasoning.
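For the 8-disk case, the code the model writes is usually some variant of the classic recursion; a rough sketch:

    def hanoi(n, source, target, spare, moves):
        # Move n disks from source to target, using spare as the auxiliary peg.
        if n == 0:
            return
        hanoi(n - 1, source, spare, target, moves)
        moves.append((source, target))
        hanoi(n - 1, spare, target, source, moves)

    moves = []
    hanoi(8, "A", "C", "B", moves)
    print(len(moves))  # 2**8 - 1 = 255 moves

Generating and running that is easy; the harder question is whether the model can do the equivalent for problems that don't have a three-line textbook solution, which is exactly what points 1 and 2 are about.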
The benchmark also says Tauri takes 25s to launch on Linux and that building an empty app takes over 4 minutes on Windows. Not sure those numbers are really correct.
A few months ago, I experimented with Wails and Tauri on Windows. The builds did indeed take unreasonably long with the Rust option and were way faster with Go. No idea why, but I ditched Tauri because of that, since Wails did more or less the same thing.
It was an internal app, a GUI to configure a CLI tool in a user-friendly manner. For that use case, I essentially built a local SPA with Vue that can also call some endpoints on server-side software that we also host. There, the rendering differences between the webviews didn't really matter, but the small distribution size was a major boon, plus being able to interface with Go code was really pleasant (as is that whole toolchain). No complaints so far; then again, it's not a use case where polish would matter that much.
I'd say that the biggest hurdle for that sort of thing is just the documentation or examples of how to do things online - because Electron is the one everyone seems to use and has the most collective knowledge out there.