High-level overview of what's happening in OpenAI Whisper Speaker Diarization:
Using OpenAI's Whisper model to separate audio into segments and generate transcripts.
Then generating a speaker embedding for each segment.
Then using agglomerative clustering on the embeddings to identify the speaker for each segment.
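The clustering step above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from the Colab: the segments and the 3-dimensional embeddings are made-up toy values (a real pipeline would take segments from Whisper and embeddings from a speaker-embedding model), the number of speakers is assumed to be known, and cosine distance with average linkage is one possible choice. The helper names `cosine_distance` and `agglomerative_labels` are hypothetical.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def agglomerative_labels(embeddings, num_speakers):
    """Bottom-up clustering: start with one cluster per segment and
    repeatedly merge the two closest clusters (average linkage)
    until only num_speakers clusters remain."""
    clusters = [[i] for i in range(len(embeddings))]
    while len(clusters) > num_speakers:
        best = None  # (distance, index_a, index_b) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    cosine_distance(embeddings[p], embeddings[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                avg = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    # Number the clusters by their earliest segment so labels are stable.
    clusters.sort(key=min)
    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels

# Toy data (assumed): four transcript segments and one embedding each.
segments = [
    {"start": 0.0, "end": 4.2, "text": "Welcome to the show."},
    {"start": 4.2, "end": 7.9, "text": "Thanks for having me."},
    {"start": 7.9, "end": 12.5, "text": "Let's start with your background."},
    {"start": 12.5, "end": 18.0, "text": "Sure, I studied physics."},
]
embeddings = [
    [0.90, 0.10, 0.00],
    [0.10, 0.95, 0.05],
    [0.85, 0.20, 0.05],
    [0.05, 0.90, 0.10],
]

labels = agglomerative_labels(embeddings, num_speakers=2)
for seg, label in zip(segments, labels):
    print(f"SPEAKER {label + 1} [{seg['start']:.1f}s-{seg['end']:.1f}s]: {seg['text']}")
```

In practice a library implementation such as scikit-learn's `AgglomerativeClustering` would normally replace the hand-written loop; the pure-Python version is used here only to keep the sketch dependency-free.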
Speaker identification (speaker labelling) is very important for podcast transcription and for transcribing conversational audio. This code helps you do that.
Dwarkesh Patel's Tweet Announcement - / 1579672641887408129
Colab - colab.research...
huggingface.co...