I like the use of a tiny device to generate the images. I was wondering whether the energy consumption per image would be lower, but I did the simple maths and it's not the case.
A Raspberry Pi Zero 2 W seems to use about 6W under load (source: https://www.cnx-software.com/2021/12/09/raspberry-pi-zero-2-... ), so if it takes 3 hours to generate one picture, that's about 18Wh per image.
An Nvidia Tesla or RTX GPU can generate a similar picture very quickly. Assuming one second per image and 350W under load for the whole system, that's on the order of 0.1Wh per image.
Of course, we could consider that a Raspberry Pi Zero requires far fewer resources and far less energy to manufacture and transport.
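For anyone who wants to rerun the numbers with their own hardware figures, here is the back-of-the-envelope calculation as a tiny Python sketch (the 6W/3h and 350W/1s values are just the assumptions above):

    # Back-of-the-envelope energy per image, using the figures assumed above.
    def wh_per_image(power_watts, seconds_per_image):
        return power_watts * seconds_per_image / 3600  # W*s -> Wh

    pi = wh_per_image(power_watts=6, seconds_per_image=3 * 3600)   # ~18 Wh
    gpu = wh_per_image(power_watts=350, seconds_per_image=1)       # ~0.1 Wh
    print(f"Pi Zero 2 W: {pi:.1f} Wh/image, GPU system: {gpu:.3f} Wh/image")
    print(f"the Pi uses about {pi / gpu:.0f}x more energy per image")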
Would an accelerator such as the Intel Neural Compute Stick 2 work with this? It can be plugged into a Pi, but I'm not clear on how VRAM works on the compute stick, or whether it's shared with the host.
For on-prem use, the up-front cost is a lot lower. The A100 that most serious outfits are using runs from thousands to tens of thousands of dollars per unit, with very limited availability. The Pi is typically under $75 USD for any variant.
An RTX 4090 is much better value for Stable Diffusion, but yes, if you start thinking about cost, the Pi wins. If you think about availability, I'm not sure.
The big immediate plus here is that if you live somewhere with limited access to the internet, you can still generate imagery offline on a low-end laptop; think of a protest group in far eastern Europe or other areas. My personal travel laptop only has 8GB of memory, so it's exciting to be able to try out an idea even without high-end hardware.
An RTX 3090 hits the current sweet spot of price/performance for me: half the throughput of the 4090, but at a third of the cost, which works out to roughly 1.5x the throughput per dollar. (I needed the 24GB of VRAM for other LLM projects.)
Incredible! If only there were some cheap hackable eink frame, you could make a fully self-contained artwork from an eink panel + RPi that's (slowly) continuously updating itself..!
Yessss! I looked into building some self-contained "slow tech" generative art using eink a couple of years ago, but it was just impossible on my tiny budget. This is great, thanks!!
Edit: I'm so hyped about this; the example image in TFA takes 2+ hours to generate, but who cares?! I'd love to have a little display that churns away in the background, creates a new variation on my prompt every few hours, and shows the results on an unobtrusive eink screen.
Is it possible to incorporate a personalized "context" into the generator? Weather, market/news sentiment, calendar events, etc., to style the end result.
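Nothing would seem to stop you, since the context would just get folded into the text prompt before each run. A minimal sketch of what I mean, where get_weather and get_sentiment are hypothetical stand-ins for whatever local data sources you'd wire up:

    import random

    # Hypothetical stand-ins for real data sources (a weather API cache,
    # an RSS sentiment score, today's calendar, ...).
    def get_weather():
        return random.choice(["overcast", "sunny", "stormy"])

    def get_sentiment():
        return random.choice(["calm", "anxious", "hopeful"])

    BASE_PROMPT = "a watercolor landscape of rolling hills"

    def build_prompt():
        # Fold the day's context into the style half of the prompt.
        return f"{BASE_PROMPT}, {get_weather()} sky, {get_sentiment()} mood"

    print(build_prompt())  # feed this to the generator instead of a fixed prompt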
Haha I like the idea of walking past, glancing now and then to see if there's something you really love...
but on the other hand I would also love the statement behind something unconnected to the internet that's slowly churning out unique, ephemeral pictures. Yours to enjoy, then gone forever.
Waveshare and Pimoroni have some that work well with Raspberry Pi, if they're in your budget. I built a Waveshare epaper display + Pi Zero into a photo frame for a totally different project. Your idea tempts me.
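For anyone tempted, the glue code is short. A rough sketch of the refresh loop, assuming Waveshare's Python e-Paper driver (pick the epd module matching your panel; generate_image() here is a placeholder for the hours-long diffusion run):

    import time
    from PIL import Image
    from waveshare_epd import epd7in5_V2  # use the module for your panel

    def generate_image():
        # Placeholder: run the (slow) diffusion job and return a PIL image.
        return Image.open("latest_render.png")

    epd = epd7in5_V2.EPD()

    while True:
        img = generate_image().convert("1")        # e-paper wants 1-bit images
        img = img.resize((epd.width, epd.height))
        epd.init()
        epd.display(epd.getbuffer(img))
        epd.sleep()                                # cut panel power between refreshes
        time.sleep(4 * 3600)                       # a fresh picture every few hours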
Incredible! The march to get more models running on the edge continues, much faster than I anticipated. The static quantization and slicing techniques here are pretty cool.
On an Apple M1 with 16GB of RAM, without PyTorch compiled to take advantage of Metal, it can take 12 minutes to generate an image from a tweet-length prompt. With Metal, it takes less than 60 seconds.
And PyTorch on the M1 (without Metal) uses the fast AMX matrix multiplication units (through the Accelerate framework). Matrix multiplication on the M1 is on par with ~10 threads/cores of a Ryzen 5900X.
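If you want to eyeball that on your own machine, here is a quick matmul timing sketch (plain CPU PyTorch; whether it actually routes through Accelerate/AMX depends on how your wheel was built):

    import time
    import torch

    n = 4096
    a = torch.randn(n, n)
    b = torch.randn(n, n)

    torch.mm(a, b)                      # warm-up
    iters = 10
    t0 = time.perf_counter()
    for _ in range(iters):
        torch.mm(a, b)
    dt = (time.perf_counter() - t0) / iters

    # A matmul of two n x n matrices is ~2*n^3 floating point operations.
    print(f"{2 * n**3 / dt / 1e9:.1f} GFLOP/s at n={n}")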
I was using a 6-ish-year-old AMD CPU with 16 gigs of RAM, and generating an image from a prompt would take about half an hour. Which is still massively impressive for what it is.
yes, and if he does it on a paid machine with a better GPU it'll be even faster!
While true, neither your statement nor mine above is germane to the discussion. It wasn't about how long it takes; it's about how cool it is that it can be done on that machine at all.
On 21 April 2023, Google blocked usage of Stable Diffusion with a free account on Colab. You need a paid plan to use it.
Apparently there are ways around it, but I just switched to runpod.io. It's very cheap (around $0.80/h for a 4090, including storage), and having a real terminal is worth it.
Now I'm wondering: could a monkey hitting random keys on a keyboard for an infinite amount of time eventually come up with the right prompts to get GPT-4 to produce code that compiles to a faithful reproduction of Doom?
Probably more easily than you'd think. DOOM is open source[1], and as GP alludes, is probably the most frequently ported game in existence, so its source code almost certainly appears multiple times in GPT-4's training set, likely alongside multiple annotated explanations.
Well, not the most ported; the Z-Machine, with tons of games (even ones legally available from the IF Archive with great programming, such as Curses!, Jigsaw, and Anchorhead), might be. It runs even on the Game Boy, up to v3 games.
Z5 and Z8 games will run fine on a 68020 and beyond.
Now I'm wondering: if there were two monkeys hitting random keys on a keyboard for an infinite amount of time, one typing into the GPT-4 prompt and the other typing straight 0s and 1s, which would produce Doom's code faster?
No, because GPT-4 has finite memory (its context length), and its random number generator for output selection is probably pseudo-random, with finite memory as well.
If the random number generator is pseudo-random, that makes GPT-4 a deterministic finite-state machine, and its output sequence does not necessarily contain all possible subsequences, no matter how many new random keys the monkey types. Put differently, some output subsequences may be inaccessible no matter which keys are input. The same holds if the random number generator is truly random but its value cannot select among all possible output tokens, only the subset offered by the GPT at each step.
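A toy illustration of the pigeonhole argument (this is obviously not GPT-4, just a deterministic generator whose whole state is a K-token context plus a small PRNG; finitely many states means the output stream must eventually cycle, so some token sequences never appear in it):

    # Toy model: entire state is (last K tokens, PRNG state).
    K = 3                   # context window, in tokens
    M = 2**16               # PRNG modulus (finite PRNG state)

    def step(ctx, rng):
        rng = (1103515245 * rng + 12345) % M     # tiny LCG
        token = (sum(ctx) + rng) % 2             # next "token"
        return ctx[1:] + (token,), rng, token

    ctx, rng = (0,) * K, 42
    seen, out = {}, []
    while (ctx, rng) not in seen:                # finite states: must repeat
        seen[(ctx, rng)] = len(out)
        ctx, rng, tok = step(ctx, rng)
        out.append(tok)

    start = seen[(ctx, rng)]
    print(f"output enters a cycle of length {len(out) - start} "
          f"after {start} tokens")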