🎵

Audio

Very Easy

Handle audio file processing, editing, and conversion tasks

Install

Use in AI Agents

Claude Code

# Install Skill (downloads SKILL.md to .claude/skills/)
clawhub install audio

# Then just tell Claude: "use Audio to help me..."

OpenAI Codex / Cursor / Windsurf

# Same install command — works with all SKILL.md-compatible AI coding tools
clawhub install audio

OpenClaw Ecosystem

This Skill is compatible with the OpenClaw standard. After installation, a SKILL.md file is auto-generated, usable by any OpenClaw-compatible AI Agent (Claude Code, Cursor, Windsurf, etc.).

Environment & Dependencies

Fully free & local — no account, no GPU needed

Runs Locally

No internet required — data never leaves your machine

SKILL.md

Requirements

Required:

ffmpeg / ffprobe — core audio processing

Optional (for advanced features):

sox — additional noise reduction
whisper — local transcription (or use API)
demucs — stem separation
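
A quick way to confirm these requirements before processing anything is to look the tools up on PATH. The sketch below is illustrative only: it assumes the executables are named exactly as in the list above (`ffmpeg`, `ffprobe`, `sox`, `whisper`, `demucs`), and the helper names `check_tools` and `missing_required` are hypothetical, not part of the skill.

```python
import shutil

# Tool names are taken from the Requirements list; assuming each one is
# installed under an executable of the same name.
REQUIRED = ("ffmpeg", "ffprobe")
OPTIONAL = ("sox", "whisper", "demucs")


def check_tools(which=shutil.which):
    """Return {tool_name: path_or_None} for every required and optional tool."""
    return {name: which(name) for name in REQUIRED + OPTIONAL}


def missing_required(found):
    """List the required tools whose lookup returned None."""
    return [name for name in REQUIRED if found.get(name) is None]


if __name__ == "__main__":
    found = check_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'not found'}")
    if missing_required(found):
        print("missing required:", ", ".join(missing_required(found)))
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.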

Quick Reference

| Situation | Load |
|-----------|------|
| FFmpeg commands by task | commands.md |
| Loudness standards by platform | loudness.md |
| Podcast production workflow | podcast.md |
| Transcription workflow | transcription.md |

Core Capabilities

| Task | Method |
|------|--------|
| Convert formats | FFmpeg (-acodec) |
| Remove noise | FFmpeg filters or SoX |
| Normalize loudness | ffmpeg-normalize or -af loudnorm |
| Transcribe | Whisper → text, SRT, VTT |
| Separate stems | Demucs (vocals, drums, bass, other) |
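
For the two non-FFmpeg rows, the command lines might look like the following. This is a sketch under assumptions: it uses the `whisper` command-line tool from the openai-whisper package and the `demucs` command-line tool, with the flags as I understand them (`--model`, `--output_format`, `--output_dir` for Whisper; `-o` and `--two-stems` for Demucs). The function names and the `base` model default are hypothetical choices, not something the skill specifies.

```python
def whisper_cmd(src, out_dir, fmt="srt", model="base"):
    """Build a Whisper CLI call that writes a transcript in the given format."""
    if fmt not in ("txt", "srt", "vtt"):
        raise ValueError(f"unsupported transcript format: {fmt}")
    return [
        "whisper", src,
        "--model", model,          # assumed default; pick per accuracy/speed needs
        "--output_format", fmt,
        "--output_dir", out_dir,
    ]


def demucs_cmd(src, out_dir, two_stems=None):
    """Build a Demucs CLI call; two_stems="vocals" splits vocals from the rest."""
    cmd = ["demucs", "-o", out_dir]
    if two_stems:
        cmd += ["--two-stems", two_stems]
    cmd.append(src)
    return cmd


if __name__ == "__main__":
    print(whisper_cmd("episode.mp3", "transcripts", fmt="vtt"))
    print(demucs_cmd("song.wav", "stems", two_stems="vocals"))
```

Both functions only build argument lists; passing the result to `subprocess.run(cmd, check=True)` would execute them.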

Execution Pattern

1. Clarify goal: What format? What loudness? What platform?
2. Analyze source: ffprobe for codec, sample rate, channels, duration
3. Process: FFmpeg/SoX for transformation
4. Verify: Check output plays, meets specs, sounds correct
5. Deliver: Provide file to user
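
The "Analyze source" step can be sketched as an ffprobe call that returns JSON, plus a small parser for the four fields named above. The helper names `ffprobe_cmd` and `summarize` are hypothetical; the parser assumes ffprobe's JSON layout, where `sample_rate` and `duration` arrive as strings.

```python
import json


def ffprobe_cmd(path):
    """Build an ffprobe call that reports the first audio stream as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=codec_name,sample_rate,channels:format=duration",
        "-of", "json",
        path,
    ]


def summarize(probe_json):
    """Extract codec, sample rate, channel count and duration from ffprobe JSON."""
    data = json.loads(probe_json)
    streams = data.get("streams", [])
    if not streams:
        raise ValueError("no audio stream found")
    stream = streams[0]
    duration = data.get("format", {}).get("duration")
    return {
        "codec": stream.get("codec_name"),
        "sample_rate": int(stream["sample_rate"]),
        "channels": int(stream["channels"]),
        "duration": float(duration) if duration is not None else None,
    }
```

In use, the output of `subprocess.run(ffprobe_cmd(path), capture_output=True, text=True, check=True).stdout` would be passed to `summarize`. Running the same probe on the output file is one way to cover part of the "Verify" step.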

Common Requests → Actions

| User says | Agent does |
|-----------|------------|
| "Convert to MP3" | -acodec libmp3lame -q:a 2 |
| "Remove background noise" | Apply highpass/lowpass or dedicated denoiser |
| "Normalize for podcast" | -af loudnorm=I=-16:TP=-1.5:LRA=11 |
| "Transcribe this" | Whisper → output SRT/VTT/TXT |
| "Extract audio from video" | -vn -acodec copy or re-encode |
| "Make it smaller" | Lower bitrate: -b:a 128k or -b:a 96k |
| "Speed up 1.5x" | -af atempo=1.5 |
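
One way to turn the FFmpeg rows of this table into full command lines is a small lookup, sketched below. The function names are hypothetical and the request strings are matched literally, which a real agent would not rely on. `atempo_chain` is an addition not in the table: it assumes that a single `atempo` instance is safest within 0.5 to 2.0 (the hard limit on older FFmpeg builds) and chains instances for factors outside that range.

```python
import shlex


def atempo_chain(factor):
    """Express a tempo factor as chained atempo filters, each within 0.5..2.0."""
    if factor <= 0:
        raise ValueError("tempo factor must be positive")
    parts = []
    while factor > 2.0:
        parts.append(2.0)
        factor /= 2.0
    while factor < 0.5:
        parts.append(0.5)
        factor /= 0.5
    parts.append(factor)
    return ",".join(f"atempo={p:g}" for p in parts)


def request_to_cmd(request, src, dst):
    """Map one of the table's request phrases to an ffmpeg argument list."""
    actions = {
        "convert to mp3": ["-acodec", "libmp3lame", "-q:a", "2"],
        "normalize for podcast": ["-af", "loudnorm=I=-16:TP=-1.5:LRA=11"],
        "extract audio from video": ["-vn", "-acodec", "copy"],
        "make it smaller": ["-b:a", "128k"],
        "speed up 1.5x": ["-af", atempo_chain(1.5)],
    }
    key = request.strip().lower()
    if key not in actions:
        raise ValueError(f"no ffmpeg action for request: {request!r}")
    return ["ffmpeg", "-i", src, *actions[key], dst]


if __name__ == "__main__":
    print(shlex.join(request_to_cmd("Convert to MP3", "input.wav", "output.mp3")))
```

Note that `-acodec copy` only works when the output container can hold the source codec (for example AAC into `.m4a`); otherwise the re-encode path from the table applies.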

Format Quick Reference

| Format | Use Case | Quality |
|--------|----------|---------|
| WAV | Master, editing | Lossless |
| FLAC | Archive, audiophile | Lossless compressed |
| MP3 | Universal sharing | Lossy, 128-320 kbps |
| AAC/M4A | Apple, podcasts | Lossy, efficient |
| OGG/Opus | WhatsApp, Discord | Lossy, very efficient |
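
As an illustration of how the format table could drive a conversion, the sketch below picks an FFmpeg encoder from the output file's extension. Only `libmp3lame` appears in this document; the other encoder names (`pcm_s16le`, `flac`, `aac`, `libopus`, `libvorbis`) are my assumptions about a typical FFmpeg build, and `encoder_for` and `convert_cmd` are hypothetical helpers.

```python
from pathlib import Path

# Extension -> FFmpeg encoder. Entries other than libmp3lame are assumptions
# chosen for illustration; check `ffmpeg -encoders` on the target machine.
ENCODERS = {
    ".wav": "pcm_s16le",
    ".flac": "flac",
    ".mp3": "libmp3lame",
    ".m4a": "aac",
    ".opus": "libopus",
    ".ogg": "libvorbis",
}
LOSSLESS = {".wav", ".flac"}


def encoder_for(dst):
    """Return the encoder name for the output path's extension."""
    suffix = Path(dst).suffix.lower()
    if suffix not in ENCODERS:
        raise ValueError(f"unsupported output format: {suffix or dst}")
    return ENCODERS[suffix]


def convert_cmd(src, dst, bitrate=None):
    """Build an ffmpeg conversion; bitrate is ignored for lossless targets."""
    cmd = ["ffmpeg", "-i", src, "-acodec", encoder_for(dst)]
    if bitrate and Path(dst).suffix.lower() not in LOSSLESS:
        cmd += ["-b:a", bitrate]
    cmd.append(dst)
    return cmd
```

For example, `convert_cmd("talk.wav", "talk.opus", bitrate="64k")` yields an Opus encode, while the same call with a `.flac` target drops the bitrate flag because a bitrate setting has no meaning for lossless output.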

Quality Defaults

Podcast: -16 LUFS (stereo; the level Apple Podcasts recommends), -19 LUFS (common mono equivalent)
Music: -14 LUFS (Spotify), -16 LUFS (Apple Music)
MP3 quality: VBR -q:a 2 (~190 kbps) or CBR -b:a 192k
Sample rate: 44.1 kHz for music, 48 kHz for video sync
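
These defaults can be combined into one normalization command, as in the sketch below. The preset names and helper functions are hypothetical; the `TP=-1.5` and `LRA=11` values are reused from the "Normalize for podcast" action rather than being per-platform requirements. The explicit `-ar` is there on the assumption that single-pass `loudnorm` commonly leaves the output at 192 kHz unless a rate is set.

```python
# Integrated loudness targets in LUFS, taken from the defaults listed above.
TARGET_LUFS = {"podcast": -16, "music_spotify": -14, "music_apple": -16}
# Output sample rates in Hz: 44.1 kHz for music, 48 kHz for video sync.
SAMPLE_RATE = {"music": 44100, "video": 48000}


def loudnorm_filter(target_lufs, true_peak=-1.5, lra=11):
    """Format a loudnorm filter string for the given integrated loudness."""
    return f"loudnorm=I={target_lufs}:TP={true_peak}:LRA={lra}"


def normalize_cmd(src, dst, preset, rate="music"):
    """Build an ffmpeg call that normalizes loudness and pins the sample rate."""
    return [
        "ffmpeg", "-i", src,
        "-af", loudnorm_filter(TARGET_LUFS[preset]),
        "-ar", str(SAMPLE_RATE[rate]),
        dst,
    ]


if __name__ == "__main__":
    print(normalize_cmd("episode.wav", "episode_norm.wav", "podcast", rate="video"))
```

An unknown preset or rate name raises `KeyError`, which is deliberate here: silently falling back to a different loudness target would be harder to notice than a failure.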

Scope

This skill:

Processes audio files the user explicitly provides
Runs FFmpeg commands at the user's request
Does NOT access cloud services without the user's knowledge
Does NOT store files persistently (the user manages their own files)

Code Example

clawhub audio convert input.wav --format mp3 --bitrate 256k output.mp3

