My use case is to generate subtitles for YouTube videos (downloaded using yt-dlp). Word-level accuracy is also nice to have, because I also translate them using LLMs and edit the subtitles to better fit the translation.
For long videos, I have a script that fetches the transcript using yt-dlp and pipes it to an LLM for keypoints. If the content sounds interesting, I watch it; if not, I save 45 minutes.
Mind sharing the script? It's becoming a big problem for me: people send me links to "must watch videos", but neither the title nor the description nor the subchapter titles tell me what the video is actually about.
In this particular case, it's spending 40 minutes of my life on something that could be explained in 4 sentences.
"It won't be the same if I just tell you, you have to watch it" - my mom on the video of some fake MD selling his miraculous variant of vitamin C that cures everything from cancer to dementia.
While attempting to write my own script, I found that there are many websites which offer YouTube summaries, which are probably an easier solution. For example (not affiliated) https://www.easemate.ai/video-summary It even allows you to ask questions about the transcript.
I also found a Python library for fetching YouTube video transcripts, but some issue mentioned that they got banned, so out of caution, I implemented my summary script as a JavaScript bookmarklet instead. It will probably break on the next YouTube update, so I am not sure how useful it is. Also, you have to set your own API key (and maybe URL). I used Groq (not to be confused with Grok), because it is free and very fast.
> youtube_transcript.py 'https://www.youtube.com/watch?v=DAX2_mPr9W8' | llm 'give me keypoints, ignore promotions'
Certainly! Here are the key points from the detailed discussion about dishwasher detergents, washing cycles, and hot water use:
1. **Dishwasher Detergent Basics:**
- Most dishwashers have a detergent dispenser designed to release detergent in two doses: a smaller pre-wash dose and a larger main wash dose.
- The pre-wash helps remove easily dissolvable food residues before the main wash.
- Oils and fats do not dissolve well in plain water; putting some detergent in the pre-wash water helps emulsify and remove these soils early.
2. **Why Use Loose Powder Detergents:**
- Loose powders allow flexible dosing: users can adjust detergent amounts based on load dirtiness.
- Pre-dosed pods force a single, fixed dose which can be excessive or insufficient depending on the wash.
- The dispenser’s design supports splitting detergent dosing; powders are better aligned with this system than pods.
... you get the point
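For anyone who wants to try this before the original script gets shared, here is a minimal sketch of what a `youtube_transcript.py` could look like. It is not the poster's script. It assumes `yt-dlp` is on your PATH and that the video has English subtitles or auto-captions; the function names `vtt_to_text` and `fetch_transcript` are made up for illustration.

```python
import glob
import os
import re
import subprocess
import sys
import tempfile

# Matches inline WebVTT tags such as <c> or <00:00:01.000>.
TAG_RE = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Reduce a WebVTT subtitle file to plain text."""
    lines = []
    for raw in vtt.splitlines():
        line = TAG_RE.sub("", raw).strip()
        # Skip blank lines and cue timing lines.
        if not line or "-->" in line:
            continue
        # Skip header and metadata lines.
        if line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue
        # Auto-captions repeat the previous line in the next cue.
        if lines and lines[-1] == line:
            continue
        lines.append(line)
    return " ".join(lines)


def fetch_transcript(url: str, lang: str = "en") -> str:
    """Download subtitles only (no video) and return them as text."""
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(
            ["yt-dlp", "--skip-download", "--write-subs", "--write-auto-subs",
             "--sub-langs", lang, "--sub-format", "vtt",
             "-o", os.path.join(tmp, "transcript.%(ext)s"), url],
            check=True,
            stdout=subprocess.DEVNULL,  # keep stdout clean for piping
        )
        files = glob.glob(os.path.join(tmp, "*.vtt"))
        if not files:
            raise RuntimeError("no subtitles found for this video")
        with open(files[0], encoding="utf-8") as f:
            return vtt_to_text(f.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    print(fetch_transcript(sys.argv[1]))
```

The transcript goes to stdout, so it can be piped into whatever summarizer you like, as in the command above.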
If you want to build a quick one, it's yt-dlp to download the video, whisper to transcribe the audio, and Claude Code to summarize the transcript.
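A rough sketch of the first two steps, assuming `yt-dlp` and `ffmpeg` are installed and the `openai-whisper` Python package is available. It grabs only the audio track rather than the full video, since that is all the transcription needs. The helper names and the `audio` file stem are hypothetical, and the summarizing step is left to whatever tool you pipe the output into.

```python
import subprocess
import sys


def audio_download_cmd(url: str, stem: str = "audio") -> list[str]:
    """Build a yt-dlp command that extracts the audio track as MP3."""
    return ["yt-dlp", "-x", "--audio-format", "mp3",
            "-o", f"{stem}.%(ext)s", url]


def transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with whisper and return the text."""
    import whisper  # imported lazily; only needed when transcribing

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Download the audio, then print the transcript to stdout.
    subprocess.run(audio_download_cmd(sys.argv[1]), check=True,
                   stdout=subprocess.DEVNULL)
    print(transcribe("audio.mp3"))
```

Larger whisper models are slower but usually more accurate, so `model_name` is worth experimenting with.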
I'm not at my computer RN but I'll share it later.
"This video explores dishwasher detergent, focusing on a new powder formulation. The creator details the science behind effective dishwashing, including pre-wash cycles and water temperature. Independent testing results comparing the new powder to leading pods are revealed."
The gist you took away is not quite right because of course you didn’t watch it.
Why does everything have to be summarized? If you want to see the content watch the content. Technology Connections videos are interesting, entertaining (to nerds at least) and a lot of effort goes into them.
Watch them at 1.5 speed if you’d like.
Or don’t watch them at all.
But the “give me a transcript because I want to watch but don’t want to watch” thing is so annoying.
YouTube also provides a transcript on the desktop version of the site, by the way. So this entire thread is pointless.
I want my whole life to be optimized, so that I can consume far more but get nothing out of any of my consumption. Anything that requires art, ingenuity, and human effort can be compressed to a simple bullet point summary.
I don't care that Technology Connections is the perfect blend of campy midwestern technical pedantry, substantive detail, great editing, understated humor. It must undergo machine digestion, its humanity stripped, before being fed to me as a flat slurry. This way, I can optimize my consumption of slurry without ever encountering any of that pesky 'human spirit.'
People don't have infinite time. A mindset of never appreciating works on their artistic merits is terrible. But a mindset of appreciating some works, while for others you just want the info please, is a perfectly reasonable way to operate.
Someone that just wants to wash dishes better shouldn't be forced to watch a 40 minute video to learn how. It doesn't mean they want slurry.
Or to put it another way: Imagine you had to watch a video essay to check the weather forecast. It would suck, even if they're good essays. Even more so if you already have other essays you want to watch.
The moment some long form content comes out we are all TikTok kids who want a five second summary.
Never mind the fact that YouTube provides a complete transcription that you can copy/paste and dump into an LLM, making this entire thread, as I mentioned before, pointless.
The people asking for a summary are lazy people who want to be spoon fed trivia dopamine hits.
To some audiences, sure. Obviously the other guy gets value out of the content of the content, not just 'experiencing it'
That's EXACTLY what he's doing, right? Get the transcript, pipe to an LLM, determine if it's worth his time. You're on HN, we like to use scripts to automate those sorts of things.
Nobody is demanding a summary from the OP. The AUDIENCE MEMBER went out of his way to determine if the content is worth his time. It's no different than checking reviews before you watch a movie.
Tangent but it is funny to me that we focus on tiktok but the news is as bad or worse in terms of super fast tidbits interspersed with ads, tragedy, and local weather
I did something similar a while back, but I treat it as "text thumbnails" and kind of replace YT frontpage with this. I don't use it all the time, but sometimes the clickbait is too much.
Also I should add Gemini (the app) is able to access YT transcripts most of the time, so sometimes I'd just paste the link and ask for a tldr. One of the few reasons to go for Gemini app, not google ai studio.
That said, Technology Connections is worth watching just because videos are very pleasant, it's probably my favorite YT subscription right now.
A choice of tech stack can never be enough to prove anything. It only establishes a lower bound on resource usage, but there is never an upper bound as long as while() and malloc() are available.
> high megapixel sensors, pre-capture, and cameras capable of pushing between 30fps and 120fps worth of compressed raws or high quality JPEGs
Surely those are buffered in the RAM first, then flushed to the card. When the buffer is full, cameras either stop recording or have to flush continuously, which reduces the burst rate.
Yes, that's correct. Buffer sizes are also all over the place, so if you want to shoot continuously, you need to pick carefully. Check https://www.fredmiranda.com/forum/topic/1856860/0 for a thorough analysis starting with the Sony A9iii (which can fill the buffer incredibly quickly with its premier feature, the 120fps 14-bit raw output global shutter). Deeper in, the thread compares it to the Nikon Z9.
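The buffering behaviour discussed above can be put into rough numbers. This is a back-of-the-envelope sketch only: the buffer size, frame size and card write speed below are made-up illustrative values, not specs for any real camera, and the model ignores compression variance and controller overhead.

```python
def seconds_until_buffer_full(buffer_mb: float, frame_mb: float,
                              fps: float, write_mb_s: float) -> float:
    """Time a burst lasts before the RAM buffer fills.

    The buffer fills at (fps * frame_mb) and drains to the card at
    write_mb_s. If the card keeps up, the buffer never fills.
    """
    fill_rate = fps * frame_mb - write_mb_s
    if fill_rate <= 0:
        return float("inf")
    return buffer_mb / fill_rate


def sustained_fps(frame_mb: float, write_mb_s: float) -> float:
    """Frame rate once the buffer is full and the card is the bottleneck."""
    return write_mb_s / frame_mb


# Hypothetical numbers: 4500 MB buffer, 50 MB raw frames,
# 120 fps burst, card sustaining 1500 MB/s.
burst = seconds_until_buffer_full(4500, 50, 120, 1500)
steady = sustained_fps(50, 1500)
print(f"burst lasts {burst:.2f} s, then drops to {steady:.0f} fps")
```

With these assumed values the full-speed burst lasts about a second before the rate falls to what the card can absorb, which is why buffer size matters so much for continuous shooting.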
It's an amazing piece of software. If only the code & the configuration were readable. It's overly reliant on 2-3 letter abbreviations, which I'm sure follow a system, but I haven't yet been able to decipher it.