Node Inputs

  • audio_file: The audio file to convert to text. It must be of type File, for example speech.mp3.
  • model: The AI model to use for the conversion. Currently, “OpenAI Whisper” is supported.

Node Output

  • transcript: The transcribed audio, returned as text.
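
As a rough illustration of how the inputs and output above fit together, the following Python sketch models the node as a mapping from audio_file and model to transcript. The names SpeechToTextNode, run, transcriber, and fake_transcriber are hypothetical and are not the platform's actual API. The real node relies on OpenAI Whisper; here that step is replaced by a pluggable stand-in function so the sketch runs without any external service.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict

# Assumption: mirrors the one model the documentation lists as supported.
SUPPORTED_MODELS = {"OpenAI Whisper"}


@dataclass
class SpeechToTextNode:
    """Hypothetical model of the node: audio_file + model -> transcript."""

    # Assumption: any callable that maps an audio path to text. In the real
    # node this role is played by the OpenAI Whisper model.
    transcriber: Callable[[Path], str]

    def run(self, audio_file: str, model: str = "OpenAI Whisper") -> Dict[str, str]:
        # Reject models the node does not support.
        if model not in SUPPORTED_MODELS:
            raise ValueError(f"Unsupported model: {model!r}")
        path = Path(audio_file)
        # The node's single output is named "transcript".
        return {"transcript": self.transcriber(path)}


def fake_transcriber(path: Path) -> str:
    # Stand-in for a real speech-to-text backend, used only for illustration.
    return f"(transcribed text of {path.name})"


node = SpeechToTextNode(transcriber=fake_transcriber)
result = node.run("speech.mp3", model="OpenAI Whisper")
print(result["transcript"])
```

Injecting the transcriber as a callable keeps the input and output contract separate from whichever speech-to-text backend is actually used.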

Node Functionality

When To Use

Use the “Speech to Text” node when you have audio recordings that need to be converted into written text. Typical cases include transcribing interviews, meetings, lectures, or any other audio content that you want to make searchable or more accessible.

Whether you’re a journalist looking for specific quotes within long recordings, a student who prefers reading lecture content to listening to it, or a professional who needs written records of meetings, this node can save you hours of manual transcription. The node uses the “OpenAI Whisper” model to transcribe your audio to text.
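
Once a transcript is available as plain text, making it searchable can be as simple as a keyword lookup. The sketch below assumes the transcript is an ordinary Python string and uses a hypothetical helper, find_quotes, which is not part of the node itself. The sample transcript is invented for illustration.

```python
import re
from typing import List


def find_quotes(transcript: str, keyword: str) -> List[str]:
    """Return the sentences of a transcript that contain keyword (case-insensitive)."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [s for s in sentences if keyword.lower() in s.lower()]


# Illustrative transcript text, not real node output.
transcript = (
    "We met on Monday. The budget was approved. "
    "Next steps follow the budget review."
)
matches = find_quotes(transcript, "budget")
print(matches)
```

The sentence splitting here is deliberately simple; abbreviations or unpunctuated speech would need a more careful approach.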