Skip to main content

Azure OpenAI Whisper Parser

Azure OpenAI Whisper Parser is a wrapper around the Azure OpenAI Whisper API which utilizes machine learning to transcribe audio files to english text.

The Parser supports .mp3, .mp4, .mpeg, .mpga, .m4a, .wav, and .webm.

The current implementation follows LangChain core principles and can be used with other loaders to handle both audio downloading and parsing. As a result of this the parser will yield an Iterator[Document].

Prerequisites

The service requires Azure credentials, Azure endpoint and Whisper Model deployment, which can be set up by following the guide here. Furthermore, the required dependencies must be installed.

%pip install -Uq  langchain langchain-community openai

Example 1

The AzureOpenAIWhisperParser's method, .lazy_parse, accepts a Blob object as a parameter containing the file path of the file to be transcribed.

from langchain_core.documents.base import Blob

audio_path = "path/to/your/audio/file"
audio_blob = Blob(path=audio_path)
API Reference:Blob
from langchain_community.document_loaders.parsers.audio import AzureOpenAIWhisperParser

endpoint = "<your_endpoint>"
key = "<your_api_key"
version = "<your_api_version>"
name = "<your_deployment_name>"

parser = AzureOpenAIWhisperParser(
api_key=key, azure_endpoint=endpoint, api_version=version, deployment_name=name
)
documents = parser.lazy_parse(blob=audio_blob)
for doc in documents:
print(doc.page_content)

Example 2

The AzureOpenAIWhisperParser can also be used in conjunction with audio loaders, like the YoutubeAudioLoader with a GenericLoader.

from langchain_community.document_loaders.blob_loaders.youtube_audio import (
YoutubeAudioLoader,
)
from langchain_community.document_loaders.generic import GenericLoader
# Must be a list
url = ["www.youtube.url.com"]

save_dir = "save/directory/"
name = "<your_deployment_name>"

loader = GenericLoader(
YoutubeAudioLoader(url, save_dir), AzureOpenAIWhisperParser(deployment_name=name)
)

docs = loader.load()
for doc in documents:
print(doc.page_content)

Was this page helpful?