Video file to subtitles file

Chris Angelico rosuav at gmail.com
Mon Aug 31 02:11:30 EDT 2020


On Mon, Aug 31, 2020 at 3:36 PM Christian Gollwitzer <auriocus at gmx.de> wrote:
>
> Am 30.08.20 um 21:43 schrieb MRAB:
> > On 2020-08-30 18:10, Christian Gollwitzer wrote:
> >> Well, with enough effort it is possible to build a system that is more
> >> useful than "entertaining". Google did that: English YouTube videos can
> >> be annotated with subtitles from speech recognition. For example, try
> >> this video:
> >> https://www.youtube.com/watch?v=lYVLpC_8SQE
> >>
> >>
> > There's not much background noise there; it takes place in a quiet room.
>
> It becomes a bit worse once the background music sets in, but it is still
> usable. Feel free to try any other video. I think it works with any
> video in English.
>
> I think that for "Hollywood"-style movies you will always have a crisp
> sound of the speech. They want the viewers to listen effortlessly -
> background music is typical, "true" noise is rare.
>
> Maybe try with this video:
> https://www.youtube.com/watch?v=nHn4XpKA6vM
>
> As soon as they are up in the air you have the engine sound overlaying
> the speech, and still the transcription is quite good. It sometimes
> mistakes the flapping of the engine as "applause" and misses a word or a
> sentence, but still very good.
>

But remember, the OP specifically does NOT want to use Google or
Amazon services for this. What you're showcasing here has been trained
on the gigantic corpus of YouTube videos, and that's simply not going
to be practical to recreate. When I said the results were
"entertaining", I was talking about what CMU Sphinx is capable of
without any assistance (and also what I've seen from numerous
real-time captioning tools across the internet). Sometimes it's
reasonable... sometimes it just isn't.
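
Whichever offline recogniser ends up doing the transcription, the step
after it is mechanical: turning timed text into a subtitles file. Here is
a minimal sketch of that step only, assuming the recogniser hands back
(start_seconds, end_seconds, text) tuples. That tuple shape and the names
srt_timestamp and to_srt are my own illustration, not part of CMU Sphinx
or any other tool.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Build SRT text from (start, end, text) tuples.

    The tuple layout is an assumption for illustration; adapt it to
    whatever your recogniser actually returns.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue is: sequence number, time range, then the caption text.
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Writing the result out is then just open("movie.srt", "w").write(to_srt(segments)).
The quality of the text inside those cues is the hard part, and this does
nothing to help with that.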

ChrisA

