It's interesting to me to see how easily you can reach a much safer C without adding _everything_ from C++ as a toy project. I really enjoyed the read!
Though yes, you should probably just write C-like C++ at that point, and the result sum types used made me chuckle in that regard because they were added with C++17. This person REALLY wants modern CPP features..
I've seen Optuna used with some of the prompt optimization frameworks lately, where it's a really great fit and has yielded much better results than the "hyperparameter" tuning I had attempted myself. I can't stop mentioning how awesome a piece of software it is.
Also, I'm eager to see how well gpt-oss-120b gets uncensored if it really was using the phi-5 approach, since that seems fundamentally difficult given the training.
FWIW, I already used Heretic to decensor gpt-oss-20b [1], and it works just fine. Note that the number of refusals listed on the model card is actually an overestimate because refusal trigger words occur in the CoT, even though the model doesn't actually end up refusing in the end.
What's your intuition on other "directions"? Have you tried it on something other than "refusals"? Say "correctness" in math or something like that. I have some datasets prepared for DPO on "thinking" traces that are correct / incorrect, wondering if it'd be something that could work, or if it's out of scope (i.e. correctness is not a single direction, like refusal training)
The problem is that in order to do optimization, you need a classifier that can distinguish the two types of responses (like refusal/compliance). In case of refusals, that's relatively easy to do using trigger words like "disallowed" or "I can't". I imagine this would be much, much harder to do automatically for classes like correctness.
And I also suspect, as you hint at, that "correctness" isn't just a direction in residual space, but a concept so broad that no simple mechanistic description can capture it.
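To make the trigger-word idea concrete, here is a minimal sketch of such a classifier. It is not Heretic's actual implementation. Only "disallowed" and "I can't" come from the comment above; the other marker phrases, the `</think>` delimiter, and all function names are assumptions for illustration.

```python
# Hypothetical marker list: "disallowed" and "i can't" are from the thread,
# the remaining phrases are illustrative assumptions.
REFUSAL_MARKERS = ("disallowed", "i can't", "i cannot", "i won't")


def normalize(text: str) -> str:
    """Lowercase and straighten curly apostrophes so "I can’t" matches "i can't"."""
    return text.lower().replace("\u2019", "'")


def is_refusal(response: str) -> bool:
    """Return True if any marker phrase occurs anywhere in the response."""
    text = normalize(response)
    return any(marker in text for marker in REFUSAL_MARKERS)


def strip_reasoning(response: str, end_tag: str = "</think>") -> str:
    """Keep only the text after the reasoning block.

    The end tag is an assumed delimiter; real models mark their CoT differently.
    If the tag is absent, the response is returned unchanged.
    """
    _, sep, final = response.partition(end_tag)
    return final if sep else response


def refusal_rate(responses: list[str], skip_reasoning: bool = False) -> float:
    """Fraction of responses flagged as refusals."""
    if not responses:
        return 0.0
    flagged = 0
    for response in responses:
        text = strip_reasoning(response) if skip_reasoning else response
        flagged += is_refusal(text)
    return flagged / len(responses)
```

Scoring the whole response counts a marker that only shows up in the reasoning, which is the overestimate described earlier in the thread; passing `skip_reasoning=True` scores only the final answer. Nothing comparably cheap exists for "correctness", which is the point being made above.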
Because everyone was always a user in the definition of free software! Because it's free as in free speech. In the first bulletin where the definition was made, Stallman envisioned no restrictions on distribution, and a user being a business was entirely unrelated to how compensation was to occur: https://www.gnu.org/bulletins/bull1.txt
In the very early days they were always the same, but differences between use and distribution emerged quickly.
For example, there are zero restrictions, duties, or obligations on using the software. But once you distribute changes (or in the AGPL case allow other people to use your changes), duties and obligations attach.
>In the very early days they were always the same, but differences between use and distribution emerged quickly.
I think those concerns existed at the time of the writing of the first bulletin, if you read how they were expecting to be compensated. See the part titled "So, how could programmers make a living?".
>For example, there are zero restrictions, duties, or obligations on using the software. But once you distribute changes (or in the AGPL case allow other people to use your changes), duties and obligations attach.
Yep, the duty and obligation to redistribute, as mentioned in the bulletin above - but without a single company being the sole arbiter or commercializer of the source, as defined in the Free Software Definition you mention elsewhere. Freely, as in free speech.. A quote from the original bulletin:
```
This means much more than just saving everyone the price of a license.
It means that much wasteful duplication of system programming effort
will be avoided. This effort can go instead into advancing the state
of the art.
Complete system sources will be available to everyone. As a result, a
user who needs changes in the system will always be free to make them
himself, or hire any available programmer or company to make them for
him. Users will no longer be at the mercy of one programmer or
company which owns the sources and is in sole position to make
changes.
```
In the SaaS era, freedom is impinged not because hyperscalers make money off of free software. That was always the intended goal, because it isn't freedom like free beer or simply 'non-commercial uses'. Freedom is impinged because modifications of the software aren't redistributed if distribution is only done over generated artifacts on a network. AGPL is specifically for networked software like this.
Unless you're implying that the GNU foundation, Richard Stallman, or the free software movement generally ever viewed even narrowly commercially restrictive licenses as free software. As you can tell from the source documents and everything else in this comment thread, that is obviously not the case.
Wrt the legal concerns with AGPL, they're not actually that it wouldn't provide any protection, but rather that it might offer the originally distributing entity too much power: legal power to declare all software used in the stack to produce a network request MUST be made source available. Basically, a "contagious" or copyleft license as GPLv3 intended, but even more viral than intended in the AGPL variant since it extends well beyond the source software. I have not seen any lawyer concerned with how Amazon would be able to bypass its protections, *because they're otherwise the same as GPLv3 and have already been tested.*
I think this poster created the legal theory themselves because they were aware of other legal concerns with the AGPL affecting the above scenario. I've read a lot of legalblogging about AGPL, and none bring up this as even a remote possibility, because unless you think GPLv3 case law is somehow irrelevant then you don't think AGPL will be simply bypassed.
One last thing: I'm surprised the poster was concerned about AGPL being untested, despite it being based on GPLv3, and not that FSL has only existed for 2 years and has 0 case law surrounding it.
> legal power to declare all software used in the stack to produce a network request MUST be made source available
If I understand correctly what you say, this is one of the main concerns with the SSPL because of the following [1]:
> The SSPL is based on the GNU Affero General Public License (AGPL), with a modified Section 13 that requires that those making SSPL-licensed software available to third-parties (modified or not) as part of a "service" must release the source code for the entirety of the service, including without limitation all "management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available", under the SSPL.
I'm not familiar with this concern for the AGPL itself.
Yes, that's the MongoDB variant which codifies it directly, and SF Conservancy and other legal entities promoting FOSS licenses state that the network stack contagion concern does not actually apply to the AGPL. But because AGPL doesn’t dig into the definition of "access", simply defining it as “users interacting with it remotely through a computer network”, nor define clear boundaries for how the "contagious" part of GPLv3 interacts with the rest of the network stack in this clause, some lawyers think that a court may interpret the definition overly broadly.
So far this contagion concern hasn't actually played out, and big corporations/hyperscalers are often using AGPL software somewhere in their stack if they're using common Linux distros - and nothing thus far has been compelled to be open sourced that isn't AGPL software.
> But because AGPL doesn’t dig into the definition of "access", simply defining it as “users interacting with it remotely through a computer network”, nor define clear boundaries for how the "contagious" part of GPLv3 interacts with the rest of the network stack of this clause, it has meant that some lawyers think that a court may overly broadly interpret the definition.
Oh yeah, I have encountered this argument before, indeed. Thanks for the pointers btw. I do agree with Drew (your last link) here. I think it's part of the FUD from Google & Co I mentioned in my first comment in this thread. To me, it's even an evidence that the AGPL actually works as intended: it's not convenient for the Big Tech companies who can't reuse the AGPL without having to release their code that's targeted to end users, which they don't want to do.
> big corporations/hyperscalers are often using AGPL software somewhere in their stack if they're using common Linux distros
Do you have specific software in mind? What's AGPL in a common Linux distro? I'm asking because this surprises me. AGPL isn't usually used for something that's not an internet service, so I wouldn't expect to find it in Linux distros' basic blocks.
Is Amazon Linux a common Linux distro? If so, it's often distributed with AGPL licensed code, I can think of a few pieces of software it has that are AGPL. They haven't been able to do internal forks of Ghostscript, if they were ever to do so, because of AGPL.
Debian is the other common distro with AGPL software included.
Other things like forks of Berkeley DB by hyperscalers have all ended up as FOSS because of AGPL. Presumably this is a better example of where non-AGPL code would not have seen the light of day.
These distros package AGPL software, but are these AGPL packages part of the base install (I don't think so), and does Amazon use this software on production?
Okay, I believe you, I'm not familiar with this. I'd still be interested in knowing which specific AGPL software Amazon would use themselves (note, I'm sure they distribute AGPL software through their distro, that doesn't mean they use it themselves).
> For Debian, the software are in the main archive, actually.
I mentioned the base install. Whatever you get by running debootstrap without parameters, or with a base debian docker container. Of course there's AGPL software in main. main is huge.
So for Amazon, I used to work there and I'm not sure I can talk about specifics, but there was AGPL software used outside of the AMIs, approved on a case by case basis. Ghostscript is public and used in the AMIs that are shipped standard, and ofc is used sometimes by Amazon. And if any modifications went out, they were of course gladly republished, but to the best of my knowledge no forks of AGPL software were being maintained.
>I mentioned the base install. Whatever you get by running debootstrap without parameters, or with a base debian docker container. Of course there's AGPL software in main. main is huge.
No, afaik, unfortunately. That might drastically change how you distribute its base. I was a little unclear but I had meant "No but at least the most common distro ships it in their archive" with my first comment.
AGPL isn't viral or contagious, it's copyleft. You need permission from the author to copy. If you violate the license terms you're copying something you're not allowed to copy. That's a copyright violation like illegally downloading music and the rights holder is allowed to tell you to stop doing it.
Oh I agree! And I think it's straightforward to comply with.
I was just explaining the common legal concerns that pop up with the license, and that too much 'contagion' has historically been a gripe about its lack of case law.
Sorry, I'll put that in air quotes, I don't believe free software is disease causing :) I was just speaking about the common concern of whether or not AGPL copyleft applies to everything involved in responding to a network request (it does not).
FSL is a much simpler court case. “You weren’t allowed to compete with us. You did. Here are the actual damages incurred. Pay us.”
An AGPL enforcement would require the court to interpret its virality which is an open question before even deciding whether a violation occurred.
The potentially overreaching nature of AGPL is one reason it may be unenforceable. On the other hand, if courts lean towards the less viral interpretation, Google could get around these issues by modding an AGPL project to run on their proprietary hardware that no one has access to and then simply releasing the modified source code.
>An AGPL enforcement would require the court to interpret its virality which is an open question before even deciding whether a violation occurred.
In US courts, the case law shows that the "virality" is not really an open question because of GPLv3 case law, and has never been interpreted that way. I'm not sure why you're commenting about this scenario when you're unaware that this has been actually tried in courts.
In fact, we saw that in the infamous Neo4j AGPL case. AGPL worked as intended and protected the AGPL software in a similar way to LGPL. The court even went on to protect the non-GPL compliant additions that Neo4j made, treating them as contagious, going further to protect the original licensee than the original unmodified license intended.
So, just recapping: you've gone from stating that Amazon could firewall off AGPL because it has no case law, to (after learning that it does, and that its case law includes GPLv3) arguing that it simply may not be 'viral' enough because that hasn't been tested in court, to now learning it has been tested in court and successfully enforced.
>The extent of virality added by the additional clauses is not clear.
The Neo4J case was one piece of a longstanding part of GPLv3 caselaw where the virality is clear.
>My point is that it doesn’t matter. If it is “viral” to the extent some people are concerned about, Amazon can find ways to firewall it.
Just a recap of your responses so far:
So AGPL has no case law and might even be unenforceable, so therefore you should use non-free source available licenses. Oh, it does have case law and hyperscalers have been forced to open source their forks, like those of BDB?
Well, the virality hasn't been tested and FSL would be an easier case. Oh, it has been tested, multiple times and licensees have had to work out an agreement like in the Neo4j case - such that judges would actually be able to rely on prior art unlike FSL?
Okay, well, even if that's all true - Amazon could just firewall it anyways. How? Well they would simply use vast resources to create proprietary hardware, create a fork for proprietary hardware despite that making it impossible to receive patches from the main fork, and then sell that as a service.
Based on the above, I think you've done what you can to convince me.
> Google could get around these issues by modding an AGPL project to run on their proprietary hardware that no one has access to and then simply releasing the modified source code.
Well I guess they could today, I don't see the AGPL preventing them from doing so. As long as the modified source is available under the AGPL I suppose they'd be good to go.
A license that forces someone to release software for specific hardware would be non-free I suppose.
I don't see this being practical though. Running proprietary hardware just for this reason would likely be costly, and not really efficient: someone could restore support for general hardware from upstream / only keep the interesting changes.
"First, the freedom to copy a program and redistribute it to your neighbors, so that they can use it as well as you" I can't do this with FSL unless it's a permitted purpose. So, even under this definition it is not free or open source.
The GNU Project and Richard Stallman, who made this statement, would agree that it's not free under even this earliest definition. In fact, they eventually defined freedom of "use" as the distinct 0th freedom to make it even clearer that being able to use the software freely is fundamental to their idea of freedom. Again, freedom isn't about price, it's about usage, availability, redistribution and lack of restrictions on this. I cannot freely redistribute FSL licensed code under the original definition of free software.
"Giant trillion dollar conglomerates repackaging and selling a product backed by free labor without contributing back wasn’t something they were contemplating back then."
Yes, the GNU project was acutely aware of this and designed the GPL licenses around such scenarios - they just didn't design them for SaaS businesses, where if you modified the program but only distributed its responses over a network, rather than redistributing the built program itself, you technically weren't obligated to open source that modification. AGPL resolved this issue, has more case law behind it than this 2 year old license, and certainly has less daunting implications than a legally ill-defined 'competing purpose'.
Wrt the legal concerns with AGPL, they're not actually that it wouldn't provide any protection, but rather that it might offer the originally distributing entity too much power: legal power to declare all software used in the stack to produce a network request MUST be made source available. I have not seen any lawyer concerned with whether or not Amazon would be able to bypass its protections, and the license was made by lawyers to clearly provide protection. Did you create this legal theory yourself? Because I've not seen any writing from a lawyer on the internet that suggests that Amazon could firewall themselves off in a friendly jurisdiction under any reading of the license, and I read a lot of AGPL lawyerblogging.
Sentry, the company who created FSL, even states that this license restricts user freedom explicitly - for the sake of the business interests of the original developer.
So summing up.. Richard Stallman, the FSF, the GNU Project, the OSI, the creators of the FSL, the company now currently using FSL, all agree that this source available license does not meet the definition of "free software". So, whose definition are we using out of thin air?
>I can't do this with FSL unless it's a permitted purpose.
You’re free to distribute it to your neighbors for free for any purpose. You’re free to distribute it for a fee for almost any purpose save one. You just can’t commercialize it as a competing product.
“Source available” again. Calling this source available is disingenuous. You’re deliberately using the least free term that is technically accurate.
This isn’t my favorite license, but it provides a lot more freedoms than merely looking at the source code.
With respect to AGPL providing “too much control”. That is a valid and likely reason for courts to find it unenforceable.
>You’re free to distribute it for a fee for almost any purpose save one.
So it does not meet the original free software's required freedoms, and is therefore not free software?
>“Source available” again. Calling this source available is disingenuous. You’re deliberately using the least free term that is technically accurate.
No, the source is available to read and the software is not free based on the historical definitions you're providing, unfortunately. Happy to understand from a different lens, but Stallman specifically meant freedom in the way even FSL writers agreed with.
Also, please refrain from calling the common use of commonly used terms 'disingenuous'; it doesn't lead to interesting discussion and is how these threads end up needing to be patrolled by dang: https://news.ycombinator.com/newsguidelines.html
>With respect to AGPL providing “too much control”. That is a valid and likely reason for courts to find it unenforceable.
So, this is a personal non-legal theory that does not have a basis in jurisprudence, then? GPLv3 is proven as enforceable, and is what AGPL is based on. No court in any legal system would throw away a license based on giving "too much control". That's just not how copyright or licensing contracts work. You may want to disclaim conjectures like this with IANAL..
My entire point is that “Source available” is a term frequently used in a derogatory way to make software that doesn’t follow the principles they espouse sound dirty.
My entire point is how big tech has captured the zeitgeist, so the common use of that term is irrelevant.
>No court in any legal system would throw away a license based on giving "too much control".
You are 100% incorrect. Contracts are frequently found unenforceable for this exact reason.
>So it does not meet the original free software's required freedoms, and is therefore not free software?
The original definition says nothing about a fee or what restrictions may be in place.
>My entire point is that “Source available” is a term frequently used in a derogatory way to make software that doesn’t follow the principles they espouse sound dirty.
It's not dirty, it just doesn't follow the principles the rest of us espouse. We're interested in software that follows these principles via a license like this.
That you're ascribing malice to the entire FOSS community seems a bit strange, when they're the ones who created the free software definition in the first place. The source is available but is not free software even in the original definition.
>Contracts are frequently found unenforceable for this exact reason.
So, personal theory, wrt AGPL. Given you've recently been made aware of the stack of case law for AGPL and that it is largely _just_ GPLv3, I wonder why you think this is a possibility given it is your uninformed non-expert opinion.
>The original definition says nothing about a fee or what restrictions may be in place.
Completely out of context, because even the original definition defines it as "free speech", as in there are no restrictions on the ways you can use it, including distributing it.
You're right that a business might offer a fee for free software under this definition, but that's unrelated to it being free to distribute under any clauses.
Given that Stallman is alive, doing dubious Stallman legal textualism to justify source available licenses, when even source available license writers and users are fine with that distinction, seems a bit strange.
>It's not dirty, it just doesn't follow the principles the rest of us espouse. We're interested in software that follows these principles via a license like this.
I've been involved in this for decades at this point. Free Software and Open Source folks generally use "source available" as a pejorative.
By using a term that implies the lowest level of freedoms possible for software that doesn't restrict access to the source code, you are implying that no freedoms exist beyond reading the source.
>Given you've recently been made aware of the stack of case law for AGPL and that it is largely _just_ GPLv3, I wonder why you think this is a possibility given it is your uninformed non-expert opinion.
AGPL significantly changes GPLv3. If you want to understand how that could cause it to be unenforceable read up on severability and its limitations in various jurisdictions. Courts have wide latitude in most jurisdictions to decide how much of a contract or license (in civil law jurisdictions they are always the same thing) to uphold if certain parts are invalidated.
>Completely out of context, because even the original definition defines it as "free speech", as in there are no restrictions on the ways you can use it, including distributing it.
Free speech has restrictions in every jurisdiction in the world. Saying something is "free as in free speech" has no implication that it is absolutely free from all duties, obligations, or restrictions.
If that is a requirement for free software, the GPL isn't a free software license because it does place obligations on distribution.
>Given that Stallman is alive, doing dubious Stallman legal textualism to justify source available licenses, when even source available license writers and users are fine with that distinction, seems a bit strange.
I don't care what a single individual says about what he believes now. I'm more interested in what he said in 1985 and what the people who made up the community believed.
Mostly though I only care about any of the past cruft because Open Source, and to a lesser extent Free Software, have taken the air out of the room in any discussion about software freedoms.
I'm interested in realistic compromises to make more free software more viable in a world where Amazon, Google, and Facebook exist. I'm not interested in ideals about a very specific meaning of absolutely free software.
>I'm interested in realistic compromises to make more free software more viable in a world where Amazon, Google, and Facebook exist. I'm not interested in ideals about a very specific meaning of absolutely free software.
Okay, I'm confused why you bring free software or the free software definition into this at all then if you're just picking and choosing what parts of the original statement/bulletin you care about and what parts you choose to disregard, on top of disregarding the original movement and organization founded at its inception.
If you're hoping to rebrand source available software, why not call it something other than _free software_? You could propose similarly internally consistent principles and attempt to cultivate a community. Call it 'fair source' or 'managed availability' or something. Refer to the 'freedoms' as rights, instead. You'd convince a much larger group and wouldn't have to pretend that principles for commercialization weren't considered in 1985.
Since, again, from the start the goal of free software was that no single company was supposed to be the single commercializer of a piece of software. That principle carries to the GPL.
If you're hoping to convince us that source available software is actually free software, you're giving me a great platform to talk to others about the history of actually free software and making yourself appear wrongheaded as if you didn't read the original bulletin or understand the larger software development community, or worse that you're attempting to co-opt our very specific yet widely accepted professional definition of free software.
>Okay, I'm confused why you bring free software or the free software definition into this at all then if you're just picking and choosing what parts of the original statement/bulletin you care about and what parts you choose to disregard, on top of disregarding the original movement and organization founded at its inception.
1. It's important for people to understand how OSI co-opted the goodwill and some of the ideas from the Free Software movement.
2. I think they have some good ideas even if I don't agree with all of them.
>If you're hoping to rebrand source available software, why not call it something other than _free software_? You could propose similarly internally consistent principles and attempt to cultivate a community. Call it 'fair source' or 'managed availability' or something. Refer to the 'freedoms' as rights, instead. You'd convince a much larger group and wouldn't have to pretend that principles for commercialization weren't considered in 1985.
I'm just a guy with 3 kids under 5 and not enough time to run any kind of rebranding project. I'm just angry that whenever someone launches a project that is more free than proprietary software but that isn't OSI approved, 90% of the comments are about why it isn't free or isn't open source.
I could publish a new project on hacker news and call it "fair source" and then explain how fair source isn't free software, but it's like free software with an extra restriction.
The 5 freedoms:
-1: You can't distribute this software if your name ends in "ezos".
0-4 same as the rest.
I guarantee you 90% of the comments would be attacks on the license (even if -1 was something reasonable). And it would start off with negative goodwill. Most people haven't actually read the 4 freedoms or the OSD, most people just follow the zeitgeist and it says Open Source == good, everything else == bad.
I do not think that a group financed primarily by big tech should have this kind of control over the goodwill doled out by the community. But they do. I think that the more people that understand that the better.
>1. It's important for people to understand how OSI co-opted the goodwill and some of the ideas from the Free Software movement.
Okay I don't understand why this is happening in the same breath you're suggesting that OSI is responsible for making everyone in the free software movement believe freedom of use (even in commercial cases) is required otherwise things are source available. GNU foundation, OSI, and even source available license writers basically agree on this part. Can you be specific here?
Because otherwise you're just reinforcing the perception I explained above, since largely the disagreement between OSI and the original free software people is that OSI supports too _permissive_ and too many non-copyleft licenses, not that the permissive or copyleft licenses need to enshrine certain license holders or disenfranchise others, or block commercialization or competitors. That's deeply antithetical to the idea of free or open software, regardless of the camp.
>2. I think they have some good ideas even if I don't agree with all of them.
AGPL, despite achieving all of your goals to prevent hyperscalers from free riding, is not one of them?
>I'm just a guy with 3 kids under 5 and not enough time to run any kind of rebranding project. I'm just angry that whenever someone launches a project that is more free than proprietary software but that isn't OSI approved, 90% of the comments are about why it isn't free or isn't open source.
Because the community has largely agreed on the principles codified by OSI. The principles you propose seem to betray the larger movement's intentions significantly, which is much bigger in scope than OSI.
>-1: You can't distribute this software if your name ends in "ezos".
>0-4 same as the rest.
That's a lot different from source available licenses actually, which usually enshrine the original license holder, even though it's not technically free under the other principles. I think if you thought up a new consistent principle that didn't enshrine a single distributor or disenfranchise entire classes of other distributors, people would be open to the idea of a variant of free software.
But I think the bigger issue is that you think AGPL is failing somehow in not being restrictive enough compared to source available licenses. Maybe you could articulate that more clearly, and _that_ would gather more mind share. Merely stating that OSI is bad doesn't really change people's opinion of source available. It mostly reinforces free software/copyleft maxis' ideas and insinuates that GPL needs to be more common.
>An interesting alternative I've been meaning to try out is inverting this flow. Instead of using an LLM at time of searching to find relevant pieces to the query, you flip it around: at time of ingesting you let an LLM note all of the possible questions that you can answer with a given text and store those in an index.
You may already know of this one, but consider giving Google LangExtract a look. A lot of companies are doing what you described in production, too!
This is just a variation of index time HyDE (Hypothetical Document Embedding). I used a similar strategy when building the index and search engine for findsight.ai
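For anyone who hasn't seen the inverted flow, a toy sketch of the idea follows. This is not how findsight.ai or LangExtract work. `generate_questions` is a hypothetical stand-in for the LLM call made at ingestion, and word-overlap (Jaccard) scoring is swapped in for the embedding similarity a real system would use.

```python
import re
from typing import Callable, Optional


def tokenize(text: str) -> set[str]:
    """Lowercased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def generate_questions(chunk: str) -> list[str]:
    """Hypothetical placeholder for an LLM call at ingestion time.

    A real system would prompt a model for the questions this chunk can
    answer. Here we emit one templated question per sentence.
    """
    sentences = [s.strip() for s in re.split(r"[.!?]", chunk) if s.strip()]
    return [f"What does the text say about {s.lower()}?" for s in sentences]


class QuestionIndex:
    """Index keyed by generated questions instead of by the chunk text."""

    def __init__(self, question_fn: Callable[[str], list[str]] = generate_questions):
        self.question_fn = question_fn
        self.chunks: list[str] = []
        self.entries: list[tuple[set[str], int]] = []  # (question tokens, chunk id)

    def ingest(self, chunk: str) -> None:
        """Store the chunk and index every question generated for it."""
        chunk_id = len(self.chunks)
        self.chunks.append(chunk)
        for question in self.question_fn(chunk):
            self.entries.append((tokenize(question), chunk_id))

    def search(self, query: str) -> Optional[str]:
        """Return the chunk whose stored question best overlaps the query."""
        query_tokens = tokenize(query)
        best_id, best_score = None, 0.0
        for tokens, chunk_id in self.entries:
            union = query_tokens | tokens
            score = len(query_tokens & tokens) / len(union) if union else 0.0
            if score > best_score:
                best_id, best_score = chunk_id, score
        return self.chunks[best_id] if best_id is not None else None
```

The LLM cost moves from query time to ingestion time, and a query is matched against question-shaped text rather than against the document itself.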
The approach used here, breaking large documents down into summarized chunks that can more easily be reasoned about, is how a lot of AI systems deal with documents that surpass effective context limits in general. In my experience, though, it only works up to a certain point: after that, the summaries start to hide enough detail that you do need semantic search or another RAG approach like GraphRAG. I think the efficacy of this approach will really fall apart after a certain number of documents.
Would've loved to see the author run experiments on how this compares to other RAG approaches or what its limitations are.
Thanks, that’s a great point! That’s why we use the tree structure, which can be searched layer by layer without putting the whole tree into the context (which would compromise the summary quality). We’ll update with more examples and experiments on this. Thanks for the suggestion!
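A rough sketch of what "search layer by layer" could look like, so only one layer of summaries is ever in view at a time. This is not the project's actual implementation: the `Node` class, `tree_search`, and the word-overlap `relevance` function are all made up for illustration, and `relevance` stands in for the LLM judgment that would pick which branch to descend into.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Node:
    summary: str
    children: list[Node] = field(default_factory=list)
    text: str = ""  # only leaves carry the raw chunk


def relevance(query: str, summary: str) -> int:
    """Hypothetical scorer: shared-word count instead of an LLM call."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    s = set(re.findall(r"[a-z0-9]+", summary.lower()))
    return len(q & s)


def tree_search(root: Node, query: str, beam: int = 1) -> list[Node]:
    """Descend one layer at a time, keeping the `beam` best nodes per layer."""
    frontier = [root]
    while any(node.children for node in frontier):
        candidates: list[Node] = []
        for node in frontier:
            # Leaves stay in play; inner nodes are replaced by their children.
            candidates.extend(node.children if node.children else [node])
        candidates.sort(key=lambda n: relevance(query, n.summary), reverse=True)
        frontier = candidates[:beam]
    return frontier


root = Node("all topics", [
    Node("software licenses and copyleft", [
        Node("AGPL network clause", text="A"),
        Node("source available license terms", text="B"),
    ]),
    Node("vision language models and images", [
        Node("prompt injection hidden in resized images", text="C"),
        Node("synthetic captions for training image models", text="D"),
    ]),
])

hits = tree_search(root, "hidden prompt in resized images")
```

With `beam=1` a wrong choice near the root can never be recovered, which is one concrete way the "summaries hide detail" concern above would show up; a wider beam trades context size for recall.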
Yea, as someone building systems with VLMs, this is downright frightening. I'm hoping we can get a good set of OWASP-y guidelines just for VLMs that cover all these possible attacks because it's every month that I hear about a new one.
You feed it an image. It determines what is in the image and gives you text.
The output can be objects, or something much richer like a full text description of everything happening in the image.
VLMs are hugely significant. Not only are they great for product use cases, giving users the ability to ask questions with images, but they're how we gather the synthetic training data to build image and video animation models. We couldn't do that at scale without VLMs. No human annotator would be up to the task of annotating billions of images and videos at scale and consistently.
Since they're a combination of an LLM and an image encoder, you can ask them questions and they can give you smart feedback. You can ask, "Does this image contain a fire truck?" or, "You are labeling scenes from movies, please describe what you see."
> VLMs are hugely significant. Not only are they great for product use cases, giving users the ability to ask questions with images, but they're how we gather the synthetic training data to build image and video animation models. We couldn't do that at scale without VLMs. No human annotator would be up to the task of annotating billions of images and videos at scale and consistently.
Weren't Dall-E, Midjourney and Stable Diffusion built before VLMs became a thing?
These are in the same space, but are diffusion models that match text to picture outputs. VLMs are common in the space, but to my understanding work in reverse, extracting text from images.
The modern VLMs are more powerful. Instead of invoking text to image or image to text as a tool, the models are trained as multimodal models and it’s a single transformer model where the latent space between text and image is blurred. So you can say something like “draw me an image with the instructions from this image” and without any tool calling it’ll read the image, understand the text instructions contained therein and execute that.
There’s no diffusion anywhere; diffusion is kind of dying out, except maybe in purpose-built image editing tools.
I don't think so. You have to know exactly what resolution the image will be resized to in order to predict the solution where dithering produces the model you want. How would they know that?
Auto resizing is usually to only a handful of common resolutions, and if inexpensive to generate (probably the case) you could generate versions of this for all of them and see which ones worked.
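To make the "try every common resolution" argument concrete, here is a toy sketch. It assumes nearest-neighbor downscaling on a tiny grayscale grid, which is much simpler than the bilinear or bicubic filters real pipelines tend to use, and every function name is invented for this example. The idea it shows is only that a crafted image reveals its payload at the one target size it was built for, and that checking a short list of candidate sizes is cheap.

```python
def downscale_nearest(img, out_w, out_h):
    """Nearest-neighbor downscale of a 2D list of pixel values."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def embed(cover, payload):
    """Overwrite only the pixels that the downscale to payload's size will sample."""
    in_h, in_w = len(cover), len(cover[0])
    out_h, out_w = len(payload), len(payload[0])
    crafted = [row[:] for row in cover]
    for y in range(out_h):
        for x in range(out_w):
            crafted[y * in_h // out_h][x * in_w // out_w] = payload[y][x]
    return crafted


def sizes_that_reveal(crafted, payload, candidate_sizes):
    """Return the candidate (width, height) sizes at which the payload appears."""
    return [
        (w, h)
        for w, h in candidate_sizes
        if downscale_nearest(crafted, w, h) == payload
    ]


cover = [[0] * 16 for _ in range(16)]    # 16x16 all-black cover image
payload = [[1] * 4 for _ in range(4)]    # 4x4 pattern meant to appear after resizing
crafted = embed(cover, payload)
changed = sum(sum(row) for row in crafted)          # how many cover pixels were touched
hits = sizes_that_reveal(crafted, payload, [(2, 2), (4, 4), (8, 8)])
```

Only 16 of the 256 cover pixels change here, and the payload shows up only at the size it was crafted for, so an attacker who does not know the target size could simply build one variant per candidate.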
I don't think this is any different from an LLM reading text and trusting it. Your system prompt is supposed to be higher priority for the model than whatever it reads from the user or from tool output, and, anyway, you should already assume that the model can use its tools in arbitrary ways that can be malicious.
Yes, it's pretty disappointing for a seemingly big improvement over SOTA to be commercially licensed compared to the previous version. At least in the press release they're not portraying it as open source just because it's on GitHub/HuggingFace.
This has nothing to do with the newly appointed fellow or Meta Superintelligence Labs; it's work from FAIR that would have gone through a lengthy review process before seeing the light of day. Not fun to see the license change in any case.
I remember DINOv2 was originally a commercial licence. I (along with others) just asked if they could change it on a GitHub issue, and after some time, they did. Might be worth asking