AI has made FFmpeg easily usable to the mere mortal - that alone is a technological revolution.
Can vouch. Had an .mkv that browsers wouldn't play, and asked AI to give me a command line that maximized compatibility so I could stream it from CopyParty without folks on my network having to mount it and stream to VLC, rather than just play in the browser.
This is one of those cases where I couldn't really verify that what it suggested was correct:
I could look up all those flags, but I haven't. But the command did seem to work! I'm sure the HN crowd can critique. =)
And so I'm curious about the concept you're getting at: command line tools that offer natural language interfaces. Maybe the command line can become more broadly accessible, especially if designed with that use case in mind.
It seems that my computer usage is mild. I use ffmpeg to convert albums (which I don't usually do, because I download everything in FLAC), and I've got a couple of convert.sh scripts lying around. My most advanced use of ffmpeg was enabling AAC streaming for gonic (you can't do gapless with MP3, and Opus wouldn't play).
This is just to say that anything requiring more than a couple of flags usually ends up as an alias, a function, or a shell script.
> I could look up all those flags, but I haven't. But the command did seem to work! I'm sure the HN crowd can critique. =)
Not critique, but just highlighting that ffmpeg is a power tool. Most of the intricacies are codec and audio/video knowledge, not ffmpeg itself. You have containers, video codecs, audio codecs, audio channels, media tracks, resolution, bitrate, quality (lossy), compression rate (lossless)... and a bunch of manipulations depending on the media type.
Just so you know, H.264 (the standard) is one of the most widely supported video formats, playable on anything not from the stone age. Its successor is H.265 (which restarted the licensing controversy), mostly used for 4K media on the high seas. Then you need a container (MP4) that can hold both the video and the audio track; MKV is another type of container. yuv420 is how color is represented (chroma subsampling), much better than RGB when you want free compression. faststart lets the media start playing as soon as possible, instead of having to download a good part of the file first. I think PDFs have something like that too.
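To make the remux-vs-transcode decision concrete, here's a minimal shell sketch. The codec names are simulated here; in practice you'd read them from ffprobe, e.g. `ffprobe -v error -select_streams v:0 -show_entries stream=codec_name -of default=nw=1:nk=1 in.mkv`:

```shell
# Hedged sketch: decide whether a cheap remux (no re-encode) is enough.
# vcodec/acodec are simulated; real values would come from ffprobe.
vcodec="h264"
acodec="aac"
if [ "$vcodec" = "h264" ] && [ "$acodec" = "aac" ]; then
  verdict="remux"      # e.g. ffmpeg -i in.mkv -c copy -movflags +faststart out.mp4
else
  verdict="transcode"  # re-encode video and/or audio first
fi
echo "$verdict"        # prints: remux
```

If both streams are already web-friendly, the `-c copy` remux is lossless and near-instant, since no pixels or samples are touched.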
I don’t know about the HN crowd, but my AI sure has things to say about your FFmpeg command:
<< On the ffmpeg command,
• It’s conservative but works. The key bits for web playback are: H.264 video, yuv420p, AAC audio, MP4 container with +faststart. That’s exactly what it ensures.
• Where it’s sub‑optimal in 2025:
• profile/level: baseline, level 3.0 maximizes legacy compatibility but hurts efficiency/quality (no B‑frames, CABAC, etc.). High, level 4.0 (or auto) is widely supported on modern browsers/devices.
• quality control: better to use CRF + preset than implicit defaults. Example: -crf 20 -preset veryfast (or slow if you can wait).
• scaling: forcing even dimensions is fine; you can also just let libx264 pad/scale as needed or do scale=ceil(iw/2)*2:ceil(ih/2)*2 to avoid rounding down.
• redundancy: -pix_fmt yuv420p is good; adding format=yuv420p in -vf is redundant if -pix_fmt is set.
• Practical “ladder” that minimizes work and preserves quality:
1. If codecs already web‑friendly, just remux: ffmpeg -i in.mkv -c copy -movflags +faststart out.mp4 (Works when video is H.264 yuv420p and audio is AAC.)
2. If video is OK but audio isn’t (e.g., AC3/Opus), transcode audio only: ffmpeg -i in.mkv -c:v copy -c:a aac -b:a 160k -movflags +faststart out.mp4
3. If video needs re-encode, use modern defaults: ffmpeg -i in.mkv -c:v libx264 -profile:v high -level 4.0 -pix_fmt yuv420p -crf 20 -preset veryfast -vf "scale=ceil(iw/2)*2:ceil(ih/2)*2" -c:a aac -b:a 160k -movflags +faststart out.mp4
4. If you have GPU/QSV and just need “good enough” fast: ffmpeg -hwaccel auto -i in.mkv -c:v h264_nvenc -preset p5 -rc vbr -cq 23 -b:v 5M -maxrate 8M -bufsize 10M -profile:v high -pix_fmt yuv420p -c:a aac -b:a 160k -movflags +faststart out.mp4
• Quick verification after transcoding: ffprobe -v error -select_streams v:0 -show_entries stream=codec_name,profile,level,pix_fmt,width,height -of default=nw=1 out.mp4
>>
> forcing even dimensions is fine; you can also just let libx264 pad/scale as needed
This part is wrong, because libx264 will reject an input with odd width or height rather than padding or scaling it automatically.
> redundancy: -pix_fmt yuv420p is good; adding format=yuv420p in -vf is redundant if -pix_fmt is set.
This seems to have hallucinated a redundancy that isn't there.
My AI accepts the padding criticism, but complains of unfairness about the other:
• The earlier suggestion that libx264 might “pad/scale as needed” was wrong; explicit filters are required for odd dimensions, and choosing between pad, crop, or scale depends on whether borders or resampling are acceptable.
• Calling -vf “format=yuv420p” redundant was imprecise; it’s often unnecessary in simple cases where -pix_fmt yuv420p suffices, but it is not “hallucinated” to note that both exist—one governs the filtergraph and the other the encoder output—so redundancy depends on the filter chain.
It’s Perplexity on GPT5 reasoning - I get the feeling they water it down somehow…
A lot of slop, and not a single word questioning libx264. Why a 2003 format? We spent years designing better formats, implementing hardware-optimized algorithms, learning psychoacoustics, building actually streaming-friendly formats, adding thousands of CUs to GPUs, and the end user still pulls crappy H.264 out of obsolete tutorials (this time with the date of writing removed).
Yes, AI will add "You are absolutely right!" and "Why it works", and pull h264_nvenc out of thin air, but in the end you streamed crap as the input to the LLM and got second-degree digested crap as the output.
For sure. You downgraded the video to half the size, then blew it back up again, converted the audio, set the Apple MOV headers, and spit that sucker out as an MP4 with probably half the pixel density, but hey - it played.
I would try it again without the pix_fmt flag and the vf flag (and its string). No idea what -level 3.0 is, as it's not in the docs anywhere (hallucination?). The video filter scaling definitely needs to go if you want to stay as close to the original resolution as possible.
Cool part is, it worked. Even with a bad flag or two, ffmpeg said “Hold my beer”
> You downgraded the video to half the size, then blew it back up again
No, that's not what that command does. It performs a single rescaling that resizes the video dimensions to the next lower multiple of 2. e.g. it will resize an 801x601 pixel video to 800x600.
If the video size is already an even number of pixels, it's a no-op and doesn't lose any detail.
If the video size isn't already even, then the rescaling is necessary because H.264 doesn't support odd dimensions.
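The rounding described above can be illustrated in plain shell arithmetic (a toy sketch of what a trunc-based scale filter does to odd dimensions, using the 801x601 example from the comment):

```shell
# Toy sketch: round odd dimensions down to the next even number,
# as a trunc(iw/2)*2-style scale filter would (801x601 -> 800x600).
w=801; h=601
even_w=$(( (w / 2) * 2 ))   # shell integer division truncates: 801 -> 800
even_h=$(( (h / 2) * 2 ))   # 601 -> 600
echo "${even_w}x${even_h}"  # prints: 800x600
```

Note the AI's ceil-based variant rounds *up* instead (801 -> 802), which preserves marginally more area but is the same one-pixel-scale adjustment either way.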
> No idea what -level 3.0 is as it’s not in the docs anywhere (hallucination?).
It's documented here: https://ffmpeg.org/ffmpeg-codecs.html#Options-40
Total AI expenditure justified
But synthesizing FFmpeg commands is not the total gain from “AI expenditure”, is it?
There are infinite similar use cases.
I guess the killer app (for AI coding) will be a framework to successfully structure projects in appropriately sized complexity morsels LLMs can efficiently chew through.
- Has Amazon’s Kiro truly managed this?
- What other efforts are there in this direction?
> I guess the killer app (for AI coding) will be a framework to successfully structure projects in appropriately sized complexity morsels LLMs can efficiently chew through.
AI is a step too late. We already have a solution: they are called SDKs and frameworks. The few things left are the business logic, which you'll gather in meetings. The coding is mostly tabbing (for completion), copy-pasting (you already have something 80% similar), and refactoring (a new integration is needed, and the static configuration isn't going to cut it).
A lot of the coding work is hunting down bugs (because you assumed instead of knowing) and moving things around in a way that won't make the whole thing crash.
That’s not true for me at all:
AI gives me near-instant distillation of tomes of documentation that, before, I would have had to thoroughly understand to make progress. Now that has become instant and near effortless - if this isn’t a technological revolution, then I don’t know what is.
(AND it writes - at least - the boilerplate code for you)
Beyond the internet and the microcomputer (we’re clearly at this level nowadays), I think AI is on its way to become a technological revolution as big as Electricity - watch this space.
Yeah, so basically just a better search engine. Not what the VCs were promised, though.
It may have hit an intelligence wall
But it still may increase productivity as adoption increases and more use-cases are discovered
Either way, not enough data to short the stock or go leveraged long.
More than half of the 2024 links, about 15, appeared between o1-preview’s September launch and a few days after o3’s late-December announcement. That span was arguably the most rapid period of advancement for these models in recent years.
Yeah. At first this tracker sounded like it was meant to be cynical about AI progress, but I found the tweet from the creator when he published this tracker (https://x.com/petergostev/status/1960100559483978016):
> I'm sure you've all noticed the 'AI is slowing down' news stories every few weeks for multiple years now - so I've pulled a tracker together to see who and when wrote these stories.
>
> There is quite a range, some are just outright wrong, others point to a reasonable limitation at the time but missing the bigger arc of progress.
>
> All of these stories were appearing as we were getting reasoning models, open source models, increasing competition from more players and skyrocketing revenue for the labs.
So the tracker seems more intended to poke fun at how ill-timed many of these headlines have been.
ironic this was built with replit
See also:
“Bitcoin Is Dead” https://bitbo.io/dead/
Is it increasing or decreasing? Need some graphs.
What should they graph? It's a list of AI-skeptical articles. I don't think we can conclude anything from that.
that can probably help indicate where we are in the hype cycle
who would have guessed that glorified Markov chains are not the path to AGI
you know what's a glorified Markov chain? The Universe