Vocoflex
VST3 · AU · AAX · Standalone
Voice Morphing · Real-Time

Every voice. Every colour.

Vocoflex is the real-time AI voice transformation plugin from Dreamtonics. Train a voice from a 10-second sample, blend timbres on an X/Y light pad, and morph between any two singers, or invent a new one entirely. Runs locally on your CPU.

$199 perpetual | VST3 · AU · AAX · Standalone | 40 Voice Presets

Import. Map. Transform. In your DAW.

Load Vocoflex as a VST3, AU or AAX plugin on any vocal track, or plug a microphone into the standalone app for live performance. Each ten-second sample becomes a target voice, and the light cursor blends between those voices in real time.

01 — Import

10 seconds is enough

Drop a vocal sample as short as ten seconds of clean audio. Vocoflex analyses the timbre and turns it into a curve — each dot on the curve is a small chunk of voice character that can be placed anywhere on the X/Y pad. Or skip the import and randomise a brand-new voice via the colour-picker hex code.

● 10-sec source · curve analysis · hex-code voices
02 — Map

Voices in light and shadow

Drag voices onto the X/Y pad. Voices under the light cursor get blended into your input; voices in the shadow region get subtracted. Pick gender on one axis, brightness on the other. Map waypoints to MIDI controller knobs and sliders for tactile, performance-grade control.

● X/Y morph · MIDI waypoints · additive + subtractive
03 — Transform

35ms latency, on your CPU

Vocoflex runs entirely offline on your CPU, with no cloud round-trip and no rendering queues. The highest-quality mode runs at 105ms and the lowest-latency mode at 35ms, fast enough for a single performer to switch between several voices during one live performance. Output is automatically watermarked.

● 35–105ms · offline CPU · watermark embedded

Voices that add. Voices that subtract.

Most voice changers give you one knob and one direction — male to female, robotic to human, dark to bright. Vocoflex does something different. Voices placed inside the light region of the X/Y pad get blended additively into your input. Voices placed in the shadow region get attenuated from the output — actively pulling the result away from those characteristics.

That makes the canvas a true two-direction colour wheel for voice. Want bright J-pop minus chest resonance? Drop a bright voice in the light and a chesty one in the shadow. Want a voice that lives between four targets, morphed with DAW automation or a MIDI X/Y pad? The waypoint system handles that natively: record the path, play it back, automate it per phrase.

And because the entire engine runs on your CPU — not in the cloud — you can do all of this during a live performance with 35ms round-trip latency. Sing into a microphone, tap a MIDI pad, transform the voice mid-phrase.
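
To make the light-and-shadow idea concrete, here is a minimal Python sketch of signed blend weights. It is an illustration only, not Dreamtonics' algorithm: the `Voice` class, the `in_shadow` flag, the `radius` parameter and the linear falloff are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Voice:
    name: str
    x: float                 # pad position, 0..1 (hypothetical coordinates)
    y: float
    in_shadow: bool = False  # hypothetical flag: voice sits in the shadow region


def blend_weights(voices, cursor, radius=0.5):
    """Return one signed weight per voice.

    Voices in the light get a positive weight (blended in), voices in the
    shadow get a negative weight (pulled away from). The magnitude falls
    off linearly with distance from the cursor and reaches 0 at `radius`.
    """
    cx, cy = cursor
    weights = {}
    for v in voices:
        distance = math.hypot(v.x - cx, v.y - cy)
        magnitude = max(0.0, 1.0 - distance / radius)
        weights[v.name] = -magnitude if v.in_shadow else magnitude
    return weights


voices = [
    Voice("bright_jpop", 0.5, 0.5),
    Voice("chest_resonance", 0.5, 0.75, in_shadow=True),
]
print(blend_weights(voices, cursor=(0.5, 0.5)))
# {'bright_jpop': 1.0, 'chest_resonance': -0.5}
```

With the cursor directly on the bright voice, that voice contributes fully, while the shadowed voice a quarter of the pad away is subtracted at half strength.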

Built for everyone who works with voice.

Vocoflex is a serious music-production tool — not a meme voice changer. It earns its keep on vocal-heavy sessions, live performances, and any production where you wish you had access to one more singer than you do.

Producers

Build harmony stacks and backing vocals from one lead take. Audition voice options on a demo before booking the session singer.

Songwriters

Demo ideas in any voice without hiring a vocalist. Hear a tune in a soprano, a baritone, a smoky alto — pick the one that lands.

Live performers

35ms latency is low enough that a single performer can switch between several voices on stage, controlled by MIDI pads or a foot controller.

Character designers

Game and animation studios use it for character voice prototyping. Note: it's stronger for musical applications than for dialogue.

A colour picker for voices.

Every feature was built to make voice character into something you can drag around a canvas, sample as a hex code, share with a collaborator, and morph between with a MIDI knob. No prompts. No queue times. No browser tab.

10-second training

Train a custom voice from ten seconds of clean audio.

Drop a phrase. Vocoflex analyses the timbre and adds it to your spectrum as a new draggable voice. No upload, no cloud — the analysis happens locally. Each voice gets a unique hex code that you can copy, save, and share with a collaborator who'll see the same voice appear in their own Vocoflex.

SAMPLE_IN.WAV · 10.4s ● ANALYSING
HEX OUT #9C72FF
DAW Formats

Every DAW. Standalone too.

Ableton, Logic, Pro Tools, Cubase, Studio One — anywhere a plug-in loads. Plus a standalone app for mic input and live performance.

VST3 · AU · AAX · Standalone
Low latency

35ms minimum round-trip.

Lowest-latency mode runs at 35ms. Highest-quality mode at 105ms. Both real-time. Both offline. Both CPU-only — no GPU required.
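
For a sense of scale, the two latency figures can be converted into samples at a given sample rate. The sketch below is plain arithmetic. The 48 kHz session rate and the function name `latency_in_samples` are assumptions, and nothing here describes how Vocoflex reports latency to a host.

```python
def latency_in_samples(latency_ms: float, sample_rate_hz: int) -> int:
    """Convert a latency in milliseconds to a whole number of samples."""
    return round(latency_ms * sample_rate_hz / 1000)


SAMPLE_RATE_HZ = 48_000  # assumed session rate, not taken from the product page

for mode, latency_ms in (("lowest-latency", 35), ("highest-quality", 105)):
    samples = latency_in_samples(latency_ms, SAMPLE_RATE_HZ)
    print(f"{mode}: {latency_ms} ms = {samples} samples")
# lowest-latency: 35 ms = 1680 samples
# highest-quality: 105 ms = 5040 samples
```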

MIDI control

Waypoints to your controller

Map any waypoint on the X/Y pad to a MIDI button, knob or slider. Use a foot pedal during live performance.
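
One way to picture a knob driving the pad is to treat the waypoints as a path and let a 7-bit MIDI CC value (0 to 127) choose a point along it. This is a hypothetical sketch of that idea, not the plugin's actual mapping: `cc_to_pad_position` and the tuple-based waypoint format are invented for illustration.

```python
def cc_to_pad_position(cc_value, waypoints):
    """Map a MIDI CC value (0..127) to an (x, y) point along a waypoint path.

    The path is split into equal segments, one per pair of neighbouring
    waypoints, and the position is linearly interpolated within a segment.
    """
    if not 0 <= cc_value <= 127:
        raise ValueError("MIDI CC values must be in 0..127")
    if not waypoints:
        raise ValueError("at least one waypoint is required")
    if len(waypoints) == 1:
        return waypoints[0]

    t = cc_value / 127 * (len(waypoints) - 1)  # position measured in segments
    i = min(int(t), len(waypoints) - 2)        # clamp so cc=127 uses the last segment
    frac = t - i
    (x0, y0), (x1, y1) = waypoints[i], waypoints[i + 1]
    return (x0 + frac * (x1 - x0), y0 + frac * (y1 - y0))


path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]  # three hypothetical waypoints
print(cc_to_pad_position(0, path))    # (0.0, 0.0)
print(cc_to_pad_position(127, path))  # (1.0, 1.0)
```

A foot pedal sending the same CC would sweep the cursor from the first waypoint to the last, passing through each one in order.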

Ethics built in

KYC + watermark

ID verification at purchase. Tamper-proof inaudible audio watermark embedded in every output — survives mixing and phone compression.

KYC verification
Audio watermark
40 voice presets

A licensed library to start with.

Forty fully-licensed voice presets ship with the plugin — from bright J-pop to smoky baritone, breathy whisper to operatic mezzo. Each one a colour on your spectrum. Add your own on top.

Bright J-Pop · Neutral Tenor · Smoke Alto · Bright Soprano · Growl Baritone · Whisper Femme · Operatic · + 33 more

Where Vocoflex wins — and where it doesn't.

The voice-AI category has gotten crowded. Vocoflex's edge is the morph paradigm, real-time CPU rendering and the Dreamtonics ethics stack. It's not the cheapest, not the most preset-heavy, and not the right pick for sound-design dialogue work.

Feature | Vocoflex | SoundID VoiceAI | Kits AI | Antares Vocodist
Real-time local rendering | 35–105ms · CPU | DAW-based | Cloud only | Plugin only
X/Y morph + light/shadow | Signature feature | No | No | No
Train custom voice | 10-sec clip | Presets only | Cloud train | Synth-based
Voice hex codes | Shareable | No | No | No
MIDI controller mapping | Waypoints | Automation | No | Yes
Watermark + KYC | Built-in | Terms only | Terms only | No
Pitch correction | Timbre only | Companion app | No | Auto-Tune
Pricing model | $199 perpetual | $99 + tokens | Subscription | $179 perpetual
Capability matrix as of June 2026. Vendor positioning per published materials.

What producers, mixers and engineers actually say.

★★★★★
So many useful options for vocal treatment, backing vocals and harmonies when you have limited singers or voice types in a production. The morph paradigm is genuinely novel — I've not used another voice tool that gives you a canvas instead of a list of presets.
Flood
Producer · U2, NIN, Sade
★★★
Honest review: for solo singing vocals it's stunning — gender flips and timbre morphs that hold up in a finished mix. But for sound design work on dialogue or character voices it shows limits. Multi-vocal tracks struggle, artefacts appear on extreme transformations, and there's no pitch correction — that's a Synth V job. Brilliant for music. Workable for film. Buy it for the music.
Cory Choy
Sound designer · postPerspective
★★★★★
The voice generation is on another level. Unlike other AI tools, where artefacts tend to remain, the experience with Vocoflex has been seamless, and the output is top-notch and clean. And honestly, the X/Y light pad is just fun. I keep finding voices I didn't know existed.
Mixing Engineer
Indie label · London

A colour picker, but for human voice.

Vocoflex was built by Dreamtonics, the Tokyo studio behind Synthesizer V — the singing voice synthesiser that became something of a cult product among producers, anime composers and J-pop arrangers. Founded by Kanru Hua, who'd spent years in singing voice conversion research, Dreamtonics had quietly built one of the best vocal-synthesis engines in the industry. Vocoflex is what happened when that engine got pointed at existing vocals rather than generating new ones.

The plugin launched in July 2024 at an introductory price of $159, settling at $199 perpetual a month later. The pricing is deliberate — Dreamtonics doesn't do subscriptions, and there's no token system. You pay once, you own the plugin, and your KYC-verified license travels with you across machines.

The interface is the part most reviewers struggle to describe in words. There's an X/Y pad. There's a glowing light cursor you can drag. There are voice "swatches" — visual representations of voice timbres, each one bearing a hex-style colour code that you can share with collaborators. Voices placed inside the light region of the canvas get blended into your input. Voices placed in the shadow region get subtracted. Dreamtonics describe it as "a colour picker for voices", and after using it for an hour the analogy genuinely fits.

Ethics gets unusual treatment here. Vocoflex requires ID verification through a KYC partner before activation — you can't buy it anonymously. Every output gets an inaudible, tamper-proof watermark embedded in the audio that survives mixing with background music and even lossy compression through a phone call. The watermark encodes a user-unique license ID, so if a voice gets used somewhere it shouldn't, Dreamtonics can trace it back to the creator. This is the kind of safeguard most voice-AI tools talk about and then don't ship.

It's worth being honest about what Vocoflex isn't. It doesn't change pitch, only timbre, so a male-to-female conversion needs Synthesizer V or pitch correction for the final octave shift. It struggles with multi-vocal tracks and works best on a clean solo vocal stem. It's better for music than for sound design: dialogue transformation works, but the plugin's interface assumes you're working on musical phrases, not editing dialogue scene by scene in Pro Tools. And the licensing is strict: you cannot use Vocoflex with Synthesizer V voice databases from Dreamtonics' partners without explicit permission.

For everything else — singing vocals, harmony stacking, demo recording, live performance, character voicing, voice prototyping — it's the most novel and most musical voice-AI tool currently shipping. The morph paradigm is genuinely new, and the offline-CPU rendering means it's the rare AI tool that works on an airplane.

Questions, answered.

How is Vocoflex different from a voice cloning tool?

Voice cloning tools like ElevenLabs or Resemble.ai typically train on minutes-to-hours of source audio and generate new speech in that voice via text-to-speech or speech-to-speech. Vocoflex trains on just ten seconds, doesn't generate speech from text, and operates entirely on existing vocals: it transforms the timbre of a singer or speaker into another voice in real time. It also leaves pitch untouched and changes only timbre. Think of it as a vocal effects processor rather than a voice cloner.

What does Vocoflex cost?

$199 perpetual license. One-time payment, no subscription, no tokens, no rendering credits. The license is tied to your KYC-verified identity, not a single machine, so you can activate it on multiple computers you own. There's no free trial — Dreamtonics offers a demo video walkthrough but not a time-limited install. The pricing was $159 during the introductory period in July–August 2024 and has been $199 since.

Why does Vocoflex require ID verification (KYC)?

Voice AI is powerful enough to be misused — to impersonate, defraud, or create non-consensual content. Dreamtonics requires ID verification through a KYC partner before your activation code unlocks, so that every license is traceable to a verified human. Combined with the audio watermark embedded in every output, this means any audio produced by Vocoflex can in principle be traced back to the licensee — a deterrent against unethical use. This is friction, but it's a deliberate choice to make the plugin auditable.

What is the audio watermark and is it removable?

Every Vocoflex output contains an inaudible, tamper-proof watermark integrated directly into the voice-generation model — not added as a post-process layer that could be stripped. Dreamtonics describe it as resilient against mixing with background music, lossy compression, and even transmission through a telephone-quality channel. The watermark encodes your unique license ID. You cannot remove it; that's the design.

What are the system requirements?

Windows 11 or macOS 11.0+, with at least an Intel Core i5-7300U, AMD Ryzen 3 3300U, or Apple Silicon (M1 or later) CPU. 4 GB RAM minimum. 200 MB of storage. No GPU required — Vocoflex runs entirely on the CPU. An internet connection is needed for initial license verification but not for ongoing use; the plugin runs offline once activated.

Will it change pitch, or just timbre?

Timbre only. Vocoflex transforms voice character (the colour and quality of the sound) but preserves the original pitch and timing. If you sing in C and want the output to sound a fifth higher, you'll need a pitch correction tool or Dreamtonics' sister product Synthesizer V, which generates singing voice from MIDI. This design choice is intentional: changing pitch and timbre simultaneously introduces compounding artefacts. Vocoflex stays in one lane and does it well.

Does it work for sound design and dialogue, not just music?

It can be used for sound design and dialogue, but the design centre is music. Reviewers (notably postPerspective) have noted that the interface assumes you're working on musical phrases — automation curves, MIDI controllers, X/Y morphs that map well to song structure but less obviously to scene-by-scene dialogue editing. Quality on solo speech is good; quality on multi-character dialogue tracks degrades. If your primary use case is post-production dialogue, you may find it useful for one-off voice transformations and less useful as a daily driver.

What are the honest limits I should know about?

Four limits worth naming. One: $199 is real money with no free trial — budget for it. Two: KYC + watermark is great ethics but it's friction; if you want anonymous voice transformation, this isn't it. Three: it doesn't change pitch — only timbre — so cross-octave transformations need pitch correction on top. Four: extreme transformations and multi-vocal tracks can produce artefacts; the official advice is to use subtler settings or solo'd stems. Within those limits, it's the most musical voice-AI plugin shipping right now.

Voices, like colours. Painted in real time.

The voice morphing plugin from Dreamtonics. VST3, AU, AAX and standalone. $199 perpetual. KYC-verified, watermarked, ethically built.