POST /stt

Transcribes audio files to text using OpenAI’s Whisper model via Mastra’s voice capabilities.

Request Body

Content-Type: multipart/form-data
audio
File
required
Audio file to transcribe. Supported formats: MP3, WAV, M4A, FLAC, OGG, WEBM, etc.

Request Example

// Using FormData for file upload
async function transcribeAudio(audioFile) {
  const formData = new FormData();
  formData.append('audio', audioFile);

  const response = await fetch('https://oyester.metaphy.live/stt', {
    method: 'POST',
    body: formData
  });

  if (!response.ok) {
    throw new Error(`Transcription failed: ${response.status}`);
  }

  const result = await response.json();
  return result.transcript;
}

Response

transcript
string
The transcribed text from the audio file.
success
boolean
Always true for successful transcriptions.
timestamp
string
ISO timestamp of when transcription was completed.
processingTime
number
Processing time in milliseconds.

Success Response (200)

{
  "transcript": "How can I help you find inner peace today?",
  "success": true,
  "timestamp": "2025-11-21T10:30:00.000Z",
  "processingTime": 2450
}
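When consuming the response, it can help to validate the expected fields before using them. A minimal sketch (the field names match the response shape above; the helper name is illustrative):

```javascript
// Validate an /stt response body and extract its fields.
// Throws if the expected fields are missing or malformed.
function parseSttResponse(body) {
  if (!body || body.success !== true || typeof body.transcript !== 'string') {
    throw new Error('Unexpected /stt response shape');
  }
  return {
    transcript: body.transcript,
    completedAt: new Date(body.timestamp), // ISO timestamp from the server
    processingMs: body.processingTime      // milliseconds
  };
}
```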

Error Responses

Code  Description
400   No audio file provided or invalid file format
500   Audio processing failed or API key issues

Supported Audio Formats

  • MP3
  • WAV
  • M4A
  • FLAC
  • OGG
  • WEBM
  • And other formats supported by OpenAI Whisper

File Size Limits

  • Maximum file size: 25MB (OpenAI API limit)
  • Recommended: Keep files under 10MB for faster processing
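Checking the file size on the client before uploading avoids a wasted round trip for oversized files. A minimal sketch (the limits come from the bullets above; the function name is illustrative):

```javascript
const MAX_BYTES = 25 * 1024 * 1024;         // hard limit (OpenAI API)
const RECOMMENDED_BYTES = 10 * 1024 * 1024; // soft limit for faster processing

// Returns { ok, warning } for a file-like object with a numeric `size` in bytes.
function checkAudioSize(file) {
  if (file.size > MAX_BYTES) {
    return { ok: false, warning: 'File exceeds the 25MB API limit.' };
  }
  if (file.size > RECOMMENDED_BYTES) {
    return { ok: true, warning: 'Files over 10MB may process slowly.' };
  }
  return { ok: true, warning: null };
}
```

Call this with the `File` from the input or drop handler before building the `FormData`.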

Language Support

  • Default: English (en-US)
  • Multi-language: Whisper automatically detects the spoken language
  • Best results: clear audio and pronunciation, especially for non-English speech

Frontend Integration Examples

File Input Handler

// HTML
<input type="file" id="audioInput" accept="audio/*">

// JavaScript
document.getElementById('audioInput').addEventListener('change', async (event) => {
  const file = event.target.files[0];
  if (file) {
    try {
      const transcript = await transcribeAudio(file);
      document.getElementById('transcript').textContent = transcript;
    } catch (error) {
      console.error('Transcription error:', error);
    }
  }
});

Drag and Drop

const dropZone = document.getElementById('dropZone');

dropZone.addEventListener('dragover', (e) => {
  e.preventDefault();
  dropZone.classList.add('dragover');
});

dropZone.addEventListener('dragleave', () => {
  dropZone.classList.remove('dragover');
});

dropZone.addEventListener('drop', async (e) => {
  e.preventDefault();
  dropZone.classList.remove('dragover');

  const files = e.dataTransfer.files;
  if (files.length > 0) {
    const file = files[0];
    if (file.type.startsWith('audio/')) {
      try {
        const transcript = await transcribeAudio(file);
        console.log('Transcription:', transcript);
      } catch (error) {
        console.error('Error:', error);
      }
    }
  }
});

Recording and Transcription

class AudioRecorder {
  constructor() {
    this.mediaRecorder = null;
    this.audioChunks = [];
  }

  async startRecording() {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    this.mediaRecorder = new MediaRecorder(stream);

    this.mediaRecorder.ondataavailable = (event) => {
      this.audioChunks.push(event.data);
    };

    this.mediaRecorder.onstop = async () => {
      // Browsers typically record WebM/Opus, not WAV; use the recorder's actual MIME type
      const mimeType = this.mediaRecorder.mimeType || 'audio/webm';
      const audioBlob = new Blob(this.audioChunks, { type: mimeType });
      const audioFile = new File([audioBlob], 'recording.webm', { type: mimeType });

      try {
        const transcript = await transcribeAudio(audioFile);
        console.log('Live transcription:', transcript);
      } catch (error) {
        console.error('Transcription failed:', error);
      }
    };

    this.audioChunks = [];
    this.mediaRecorder.start();
  }

  stopRecording() {
    if (this.mediaRecorder && this.mediaRecorder.state === 'recording') {
      this.mediaRecorder.stop();
    }
  }
}

// Usage
const recorder = new AudioRecorder();

// Start recording
document.getElementById('startBtn').addEventListener('click', () => {
  recorder.startRecording();
});

// Stop recording
document.getElementById('stopBtn').addEventListener('click', () => {
  recorder.stopRecording();
});

cURL Examples

Basic File Upload

# Using a local audio file
curl -X POST https://oyester.metaphy.live/stt \
  -F "audio=@/path/to/your/audio.mp3"

# Example response
{
  "transcript": "How can I help you today?",
  "success": true,
  "timestamp": "2025-11-21T10:30:00.000Z",
  "processingTime": 2450
}

With Custom Headers

# With custom headers
curl -X POST https://oyester.metaphy.live/stt \
  -H "Authorization: Bearer your-token" \
  -F "audio=@meditation_question.wav"

Error Handling

async function transcribeAudio(audioFile) {
  const formData = new FormData();
  formData.append('audio', audioFile);

  let response;
  try {
    response = await fetch('https://oyester.metaphy.live/stt', {
      method: 'POST',
      body: formData
    });
  } catch (error) {
    // fetch only rejects on network-level failures, never on HTTP error statuses
    console.error('Transcription error:', error);
    throw new Error('Network error. Please check your connection.');
  }

  if (!response.ok) {
    // The error body may not always be JSON, so fall back to statusText
    const errorData = await response.json().catch(() => ({}));
    const detail = errorData.message || response.statusText;

    if (response.status === 400) {
      throw new Error(`Invalid audio file. Please check the format and try again. (${detail})`);
    }
    if (response.status >= 500) {
      throw new Error(`Server error. Please try again later. (${detail})`);
    }
    throw new Error(`Transcription failed (${response.status}): ${detail}`);
  }

  const result = await response.json();
  return result.transcript;
}

Best Practices

File Format: Use MP3 or WAV for best compatibility and smaller file sizes.
Audio Quality: Higher quality audio generally produces better transcriptions.
File Size: Compress audio files when possible to reduce upload time and processing costs.
Error Handling: Always implement proper error handling for network issues and API failures.
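For transient network or 5xx failures, a simple retry with backoff is often worth adding around the upload call. A sketch (the helper and its parameters are illustrative, not part of the API):

```javascript
// Retry an async operation up to `attempts` times, waiting longer between each try.
async function withRetry(operation, attempts = 3, delayMs = 500) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        // Linear backoff: 500ms, 1000ms, ...
        await new Promise((resolve) => setTimeout(resolve, delayMs * (i + 1)));
      }
    }
  }
  throw lastError;
}

// Usage: const transcript = await withRetry(() => transcribeAudio(file));
```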