From 0b3a55b9fedb95a5824ae6d85e99c4b829c243e6 Mon Sep 17 00:00:00 2001
From: Arnaud A <arnaud.alcabas@gmail.com>
Date: Sun, 3 Nov 2024 03:15:54 +0100
Subject: [PATCH] docs: Update documentation for text-to-audio feature
 regarding response_format (#4038)

---
 docs/content/docs/features/text-to-audio.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/docs/content/docs/features/text-to-audio.md b/docs/content/docs/features/text-to-audio.md
index 0e82f7f0..f0bd2c0c 100644
--- a/docs/content/docs/features/text-to-audio.md
+++ b/docs/content/docs/features/text-to-audio.md
@@ -201,3 +201,21 @@ curl -L http://localhost:8080/tts \
 "input": "Bonjour, je suis Ana Florence. Comment puis-je vous aider?"
 }' | aplay
 ```
+
+## Response format
+
+To provide some compatibility with the OpenAI API's `response_format` parameter, the generated wav file is converted with ffmpeg before the API returns its response. This requires ffmpeg to be installed (or a Docker image that includes ffmpeg).
+
+Note that this changes the previous behaviour: before this addition, the parameter was ignored and a wav file was always returned, which could cause codec errors later in an integration (for example, a client expecting mp3, the default format used by OpenAI, and trying to decode the wav file as mp3).
+
+The formats supported through ffmpeg are `wav`, `mp3`, `aac`, `flac`, and `opus`. If the format is unknown or not provided, `wav` is used.
+
+```bash
+curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{
+  "input": "Hello world",
+  "model": "tts",
+  "response_format": "mp3"
+}'
+```
+
+If the request specifies a `response_format` other than `wav` and ffmpeg is not available, the call will fail.