speechall-cli
Install and use the speechall CLI tool for speech-to-text transcription.
Setup & Installation
Install with ClawHub:

```bash
clawhub install atacan/speechall-cli
```

If the CLI is not installed:

```bash
npx clawhub@latest install atacan/speechall-cli
```

Or install with the OpenClaw CLI:

```bash
openclaw skills install atacan/speechall-cli
```

Or paste the repo link into your assistant's chat:

https://github.com/openclaw/skills/tree/main/skills/atacan/speechall-cli

What This Skill Does
CLI tool for transcribing audio and video files to text via the Speechall API. Routes requests through multiple speech-to-text providers from a single interface. Supports speaker diarization, subtitle formats (SRT, VTT), and custom vocabulary.
Provides access to speech-to-text models from OpenAI, Deepgram, AssemblyAI, Google, and others through a single CLI without separate SDKs or account integrations for each provider.
When to Use It
- Transcribing a recorded interview to a text file
- Generating SRT subtitles for a recorded presentation
- Identifying speakers in a multi-person meeting recording
- Processing domain-specific audio with custom terminology
- Listing available STT models to compare provider options
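As a concrete example of the first use case, here is a minimal batch-transcription wrapper. It is a sketch, not part of the CLI: the `transcribe_dir` function and the `./recordings` directory are hypothetical, and it assumes `speechall` is on `PATH` with `SPEECHALL_API_KEY` exported.

```bash
#!/usr/bin/env bash
# Sketch: transcribe every .wav/.mp3 in a directory to sibling .txt files.
# Hypothetical helper, not part of the CLI. Assumes `speechall` is on PATH
# and SPEECHALL_API_KEY is exported.
set -euo pipefail

transcribe_dir() {
  local dir="$1"
  local f out
  for f in "$dir"/*.wav "$dir"/*.mp3; do
    [ -e "$f" ] || continue            # skip unmatched globs
    out="${f%.*}.txt"
    echo "transcribing $f -> $out" >&2 # progress to stderr, like the CLI
    speechall "$f" > "$out"            # stdout carries the transcript
  done
}

transcribe_dir "./recordings"
```

Because the CLI writes errors to stderr, redirecting stdout per file is safe; under `set -e`, a failed transcription stops the loop.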
Original SKILL.md:
# speechall-cli

CLI for speech-to-text transcription via the Speechall API. Supports multiple providers (OpenAI, Deepgram, AssemblyAI, Google, Gemini, Groq, ElevenLabs, Cloudflare, and more).

## Installation

### Homebrew (macOS and Linux)

```bash
brew install Speechall/tap/speechall
```

**Without Homebrew**: Download the binary for your platform from https://github.com/Speechall/speechall-cli/releases and place it on your `PATH`.

### Verify

```bash
speechall --version
```

## Authentication

An API key is required. Provide it via environment variable (preferred) or flag:

```bash
export SPEECHALL_API_KEY="your-key-here"
# or
speechall --api-key "your-key-here" audio.wav
```

You can create an API key at https://speechall.com/console/api-keys

## Commands

### transcribe (default)

Transcribe an audio or video file. This is the default subcommand — `speechall audio.wav` is equivalent to `speechall transcribe audio.wav`.

```bash
speechall <file> [options]
```

**Options:**

| Flag | Description | Default |
|---|---|---|
| `--model <provider.model>` | STT model identifier | `openai.gpt-4o-mini-transcribe` |
| `--language <code>` | Language code (e.g. `en`, `tr`, `de`) | API default (auto-detect) |
| `--output-format <format>` | Output format (`text`, `json`, `verbose_json`, `srt`, `vtt`) | API default |
| `--diarization` | Enable speaker diarization | off |
| `--speakers-expected <n>` | Expected number of speakers (use with `--diarization`) | — |
| `--no-punctuation` | Disable automatic punctuation | — |
| `--temperature <0.0-1.0>` | Model temperature | — |
| `--initial-prompt <text>` | Text prompt to guide model style | — |
| `--custom-vocabulary <term>` | Terms to boost recognition (repeatable) | — |
| `--ruleset-id <uuid>` | Replacement ruleset UUID | — |
| `--api-key <key>` | API key (overrides `SPEECHALL_API_KEY` env var) | — |

**Examples:**

```bash
# Basic transcription
speechall interview.mp3

# Specific model and language
speechall call.wav --model deepgram.nova-2 --language en

# Speaker diarization with SRT output
speechall meeting.wav --diarization --speakers-expected 3 --output-format srt

# Custom vocabulary for domain-specific terms
speechall medical.wav --custom-vocabulary "myocardial" --custom-vocabulary "infarction"

# Transcribe a video file (macOS extracts audio automatically)
speechall presentation.mp4
```

### models

List available speech-to-text models. Outputs JSON to stdout. Filters combine with AND logic.

```bash
speechall models [options]
```

**Filter flags:**

| Flag | Description |
|---|---|
| `--provider <name>` | Filter by provider (e.g. `openai`, `deepgram`) |
| `--language <code>` | Filter by supported language (`tr` matches `tr`, `tr-TR`, `tr-CY`) |
| `--diarization` | Only models supporting speaker diarization |
| `--srt` | Only models supporting SRT output |
| `--vtt` | Only models supporting VTT output |
| `--punctuation` | Only models supporting automatic punctuation |
| `--streamable` | Only models supporting real-time streaming |
| `--vocabulary` | Only models supporting custom vocabulary |

**Examples:**

```bash
# List all available models
speechall models

# Models from a specific provider
speechall models --provider deepgram

# Models that support Turkish and diarization
speechall models --language tr --diarization

# Pipe to jq for specific fields
speechall models --provider openai | jq '.[].identifier'
```

## Tips

- On macOS, video files (`.mp4`, `.mov`, etc.) are automatically converted to audio before upload.
- On Linux, pass audio files directly (`.wav`, `.mp3`, `.m4a`, `.flac`, etc.).
- Output goes to stdout. Redirect to save: `speechall audio.wav > transcript.txt`
- Errors go to stderr, so piping stdout is safe.
- Run `speechall --help`, `speechall transcribe --help`, or `speechall models --help` to see all valid enum values for model identifiers, language codes, and output formats.
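SRT output can be post-processed with standard Unix tools. Below is a minimal sketch: the `srt_to_text` helper is hypothetical and relies only on the SRT format itself (numeric cue lines, `-->` timestamp lines, blank separators), not on anything CLI-specific.

```bash
# srt_to_text: read SRT on stdin, emit only the subtitle text lines.
# Limitation: a subtitle line that is purely numeric would also be dropped.
srt_to_text() {
  grep -vE '^[0-9]+$|-->|^$'
}
```

Usage, assuming the CLI is installed and authenticated: `speechall talk.mp4 --output-format srt | srt_to_text > talk.txt`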
Example Workflow
Here's how your AI assistant might use this skill in practice.
User asks: "Transcribe this recorded interview to a text file."
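The `models` and `transcribe` steps can also be chained: list models matching the job's requirements, then transcribe with the first match. This is a hedged sketch; the function names are hypothetical, it assumes the models JSON exposes an `identifier` field as in the documented jq example, and it uses `sed` only to avoid a jq dependency.

```bash
# pick_model: print the identifier of the first model matching the filters.
# Assumes one "identifier" field per line in the models JSON output.
pick_model() {
  speechall models "$@" \
    | sed -n 's/.*"identifier"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
    | head -n 1
}

# transcribe_with_best: transcribe a file with the first model that
# satisfies the given filter flags, passing the same flags through.
transcribe_with_best() {
  local file="$1"; shift
  local model
  model="$(pick_model "$@")"
  speechall "$file" --model "$model" "$@"
}

# Usage: transcribe_with_best meeting.wav --language tr --diarization
```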