zvukogram

Search & Research
v1.1.4
Benign

Text-to-Speech via Zvukogram API with SSML support.

2471 downloads · 471 installs · by @erview

Setup & Installation

Install command

clawhub install erview/zvukogram

If the CLI is not installed:

npx clawhub@latest install erview/zvukogram

Or install with the OpenClaw CLI:

openclaw skills install erview/zvukogram

Or paste the repo link into your assistant's chat:

https://github.com/openclaw/skills/tree/main/skills/erview/zvukogram

What This Skill Does

Generates speech from text using the Zvukogram API. Supports SSML markup for stress marks, pauses, and speed control. It can also merge multiple audio fragments and transcribe English words so that Russian voices pronounce them correctly.

Per-word stress marks and transcription aliases for English words let you correct pronunciation that a plain-text TTS request would leave to the engine's guess, which matters most for Russian-language output.

When to Use It

  • Converting written articles to audio files
  • Adding voice notifications to automated pipelines
  • Recording podcast narration with multiple voices
  • Voicing news scripts with proper pronunciation control
  • Generating audio from long-form text documents
# Zvukogram TTS

Speech generation via Zvukogram API with SSML markup support.

## Requirements

To use this skill, you need:
- **Zvukogram API token** — get it at https://zvukogram.com/
- **Zvukogram account email**

### Setup

Create the config directory, then save the JSON below as `~/.config/zvukogram/config.json`:
```bash
mkdir -p ~/.config/zvukogram
```

```json
{
  "token": "your_api_token_here",
  "email": "your_email@example.com"
}
```

Or use environment variables:
```bash
export ZVUKOGRAM_TOKEN=your_api_token_here
export ZVUKOGRAM_EMAIL=your_email@example.com
```
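The two setup options above can be combined in one loader. The following is a minimal sketch, not the skill's actual code: the function name `load_credentials` and the precedence (environment variables first, then the config file) are assumptions, since the source does not say which one wins.

```python
import json
import os
from pathlib import Path

# Config location taken from the Setup section above.
CONFIG_PATH = Path.home() / ".config" / "zvukogram" / "config.json"


def load_credentials(config_path=CONFIG_PATH, env=None):
    """Return (token, email). Hypothetical helper; precedence is an assumption."""
    env = os.environ if env is None else env
    token = env.get("ZVUKOGRAM_TOKEN")
    email = env.get("ZVUKOGRAM_EMAIL")
    if token and email:
        return token, email

    path = Path(config_path)
    if path.is_file():
        # Fill in whichever value the environment did not provide.
        data = json.loads(path.read_text(encoding="utf-8"))
        token = token or data.get("token")
        email = email or data.get("email")

    if not token or not email:
        raise RuntimeError(
            "Set ZVUKOGRAM_TOKEN and ZVUKOGRAM_EMAIL or create " + str(path)
        )
    return token, email
```

Passing `env` and `config_path` explicitly keeps the helper easy to test without touching real credentials.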

## Quick Start

```bash
# Simple TTS
python3 scripts/tts.py --text "Hello, world!" --voice Алена --output hello.mp3

# With +20% speed
python3 scripts/tts.py --text "Fast text" --voice Алена --speed 1.2 --output fast.mp3

# Check balance
python3 scripts/balance.py
```
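To call the script from another Python program, the command line can be assembled as a list. This sketch uses only the flags shown in the Quick Start (`--text`, `--voice`, `--speed`, `--output`); the helper names `build_tts_command` and `synthesize` are hypothetical and not part of the skill.

```python
import subprocess
import sys


def build_tts_command(text, voice, output, speed=None, script="scripts/tts.py"):
    """Build the argument list for scripts/tts.py (hypothetical wrapper)."""
    cmd = [sys.executable, script, "--text", text, "--voice", voice, "--output", output]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    return cmd


def synthesize(text, voice, output, speed=None):
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_tts_command(text, voice, output, speed), check=True)
```

Passing the arguments as a list avoids shell quoting problems with Cyrillic voice names and text that contains quotes.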

## Features

- **TTS generation** — text to speech
- **SSML support** — stress marks, pauses, speed
- **Audio merging** — combine fragments via ffmpeg
- **Transcription** — proper pronunciation of English words
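
The audio merging feature relies on ffmpeg. The skill's own merge script is not shown here, so the following is only a sketch of one common approach, ffmpeg's concat demuxer with stream copy. The function names are hypothetical, and stream copy assumes all fragments share the same codec and sample rate.

```python
import subprocess
import tempfile
from pathlib import Path


def concat_list_text(fragments):
    """Build the file list that ffmpeg's concat demuxer reads."""
    lines = []
    for fragment in fragments:
        # Escape single quotes the way the concat list format expects.
        path = str(Path(fragment).resolve()).replace("'", r"'\''")
        lines.append(f"file '{path}'")
    return "\n".join(lines) + "\n"


def merge_fragments(fragments, output):
    """Concatenate audio fragments into one file without re-encoding."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(concat_list_text(fragments))
        list_path = handle.name
    try:
        subprocess.run(
            ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
             "-i", list_path, "-c", "copy", str(output)],
            check=True,
        )
    finally:
        Path(list_path).unlink()
```

If fragments come from different voices with different encoding parameters, re-encoding instead of `-c copy` may be needed.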

## SSML Markup

### Stress Marks
Use `+` before stressed vowel:
```
З+амок — stress on "a"
зам+ок — stress on "o"
```
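When stress positions are known in advance, the marks can be inserted programmatically. This is an illustrative sketch: `mark_stress` is a hypothetical helper that counts vowels from 1 and places `+` before the chosen one, following the rule above.

```python
RUSSIAN_VOWELS = set("аеёиоуыэюяАЕЁИОУЫЭЮЯ")


def mark_stress(word, vowel_number):
    """Insert '+' before the vowel_number-th vowel (1-based) of a Russian word."""
    seen = 0
    for index, char in enumerate(word):
        if char in RUSSIAN_VOWELS:
            seen += 1
            if seen == vowel_number:
                return word[:index] + "+" + word[index:]
    raise ValueError(f"{word!r} has fewer than {vowel_number} vowels")


print(mark_stress("замок", 1))  # з+амок
print(mark_stress("замок", 2))  # зам+ок
```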

### Aliases (Transcription)
```xml
<sub alias="Оупен Эй Ай">OpenAI</sub>
<sub alias="Самсунг">Samsung</sub>
<sub alias="+Альтман">Альтман</sub>
```
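Aliases can be applied from a dictionary before the text is sent. The sketch below wraps whole-word matches in `<sub>` tags; `apply_aliases` and the `ALIASES` table are hypothetical. The surrounding text is passed through unescaped, because the source does not say how the API treats XML entities.

```python
import re
from xml.sax.saxutils import quoteattr

# Sample entries taken from the alias examples above.
ALIASES = {"OpenAI": "Оупен Эй Ай", "Samsung": "Самсунг"}


def apply_aliases(text, aliases=ALIASES):
    """Wrap each known word in <sub alias="...">word</sub>."""
    if not aliases:
        return text
    # Longest keys first so that overlapping names match greedily.
    keys = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    parts = pattern.split(text)
    out = []
    for index, part in enumerate(parts):
        if index % 2:  # odd positions hold the captured alias keys
            out.append(f"<sub alias={quoteattr(aliases[part])}>{part}</sub>")
        else:
            out.append(part)
    return "".join(out)


print(apply_aliases("Компания OpenAI и Samsung"))
```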

### Speed
```xml
<prosody rate="1.2">20% faster</prosody>
<prosody rate="fast">Fast text</prosody>
```

### Pauses
```xml
<break time="500ms"/>
```
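The speed and pause tags are simple enough to generate with small helpers. This is a sketch with hypothetical function names; it only reproduces the tag shapes shown above and does not validate which `rate` values the API accepts.

```python
def prosody(text, rate):
    """Wrap text in a <prosody> tag, e.g. rate=1.2 or rate="fast"."""
    return f'<prosody rate="{rate}">{text}</prosody>'


def pause(milliseconds):
    """Return a <break> tag for a pause of the given length."""
    if milliseconds < 0:
        raise ValueError("pause length must not be negative")
    return f'<break time="{int(milliseconds)}ms"/>'


def join_with_pauses(sentences, milliseconds=500):
    """Join sentences with the same pause between each pair."""
    return pause(milliseconds).join(sentences)


print(join_with_pauses(["Первая новость.", prosody("Вторая новость.", 1.2)]))
```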

## Available Voices

- **Алена** — female, neutral (recommended)
- **Андрей** — male, neutral (recommended)
- **Александра** — female, soft
- **Антон** — male, business

Full list: see [references/VOICES.md](references/VOICES.md)

## Examples

See [references/EXAMPLES.md](references/EXAMPLES.md) for:
- Dialogs and podcasts
- News voiceover
- Voice notifications
- Long texts

## Transcription

See [references/TRANSCRIPTION.md](references/TRANSCRIPTION.md) for proper pronunciation:
- OpenAI → Оупен Эй Ай
- GPT → Джи Пи Ти
- Samsung → Самсунг
- Altman → +Альтман

## SSML Reference

- Full, agent-readable reference (recommended): [references/SSML.md](references/SSML.md)
- Quick lookup: [references/SSML_CHEATSHEET.md](references/SSML_CHEATSHEET.md)
- Official Zvukogram SSML docs: https://zvukogram.com/node/ssml/

## Troubleshooting

See [references/TROUBLESHOOTING.md](references/TROUBLESHOOTING.md) for:
- API errors
- Audio issues
- Diagnostics

## API Limitations

- Max 1000 characters per request (`/text`)
- Up to 1M characters via `/longtext`
- Do not rely on `<voice>` / `<speak>` wrappers for API usage. For multi-voice, generate and merge fragments (one request per voice).
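
To stay under the 1000-character limit of `/text`, long input can be split at sentence boundaries and each chunk sent as its own request. This is a minimal sketch with a hypothetical `split_text` helper. It works on plain text only, so it may cut through SSML tags, and whether markup counts toward the limit is not stated in the source.

```python
import re

MAX_CHARS = 1000  # per-request limit for /text, from the list above


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?…])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

The resulting audio fragments can then be merged in order, which is the same pattern the list above recommends for multi-voice output.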

## Links

- API docs: https://zvukogram.com/node/api/
- Voice rating: https://zvukogram.com/rating/
- Support: https://t.me/zvukogram

Example Workflow

Here's how your AI assistant might use this skill in practice.

INPUT

User asks: Convert a written article to an audio file.

AGENT
  1. Converting written articles to audio files
  2. Adding voice notifications to automated pipelines
  3. Recording podcast narration with multiple voices
  4. Voicing news scripts with proper pronunciation control
  5. Generating audio from long-form text documents
OUTPUT
Text-to-Speech via Zvukogram API with SSML support.

Security Audits

VirusTotal: Benign
OpenClaw: Benign

These signals reflect official OpenClaw status values. A Suspicious status means the skill should be used with extra caution.

Details

Language: Markdown
Last updated: Mar 7, 2026