Word level timestamp

Nicolas Fournier · Nicolas Fournier · commit 302884de5f00 · 2025-01-15T15:09:48.000+01:00
diff --git a/chapters/live-stt/features.mdx b/chapters/live-stt/features.mdx
@@ -7,6 +7,7 @@ description: "Features overview of Gladia's Real-Time speech-to-text (STT) API."
 |-----------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
 | [Language(s)](/chapters/settings/language-options) | Configure the model languages and/or enable multi-languages transcription. |
 | [Custom Vocabulary](/chapters/settings/vocabulary-spelling) | Enhance the transcription precision of words you know.|
+| [Word-level timestamps](/chapters/settings/word-timestamps) | Know the exact timestamp for each word, giving you a more precise transcription. |
 
 
 | **Audio Intelligence**                                                                       | **Description**                                                                                            |
@@ -23,43 +24,6 @@ description: "Features overview of Gladia's Real-Time speech-to-text (STT) API."
 <Note>All the configuration properties described below are defined in the [POST /v2/live endpoint](/api-reference/v2/live/init).</Note>
 
 
-
-## Word-level timestamps
-
-Instead of just getting timestamps for when utterances begin and end, Gladia's real-time API provides **word-level timestamps**. This lets you know the exact timestamp for each word,  giving you a more precise transcription, facilitating detailed analysis and more accurate synchronization with audio and video files.
-
-To enable it, pass the following configuration:
-
-```json
-{
-  "realtime_processing": {
-    "words_accurate_timestamps": true
-  }
-}
-```
-
-Under each utterance, you'll find a `words` property, like this:
-
-```json
-{
-  // ... other utterance properties
-  "words": [
-    {
-      "word": "Split",
-      "start": 0.21001999999999998,
-      "end": 0.69015,
-      "confidence": 1
-    },
-    {
-      "word": " infinity",
-      "start": 0.91021,
-      "end": 1.55038,
-      "confidence": 0.95
-    },
-  ]
-}
-```
-
 ## Multiple channels
 
 If you have multiple channels in your audio stream, specify the count in the configuration:
diff --git a/chapters/pre-recorded-stt/features.mdx b/chapters/pre-recorded-stt/features.mdx
@@ -16,41 +16,6 @@ Discover our state-of-the-art ASR model [ Whisper Zero now.](https://www.gladia.
  into the transcription process by including extra parameters in the transcription request.
 
 
-
-## Word-level timestamps
-
-Instead of just getting utterances start and end timestamps, **Gladia** Speech-to-text API provides by **default** the
-**Word-level timestamps** feature. It lets you know the exact timestamp for each word and give you a more precise transcription.
- This feature is particularly useful for detailed analysis, as it allows you to pinpoint the exact moment each word is spoken, facilitating
-  a more accurate synchronization with audio or video files.
-
-Under each utterance, you'll find a `words` property like this:
-
-```json
-// other properties...
-"utterances": [
-    {
-      "words": [
-        {
-          "word": "Split",
-          "start": 0.21001999999999998,
-          "end": 0.69015,
-          "confidence": 1
-        },
-        {
-          "word": " infinity",
-          "start": 0.91021,
-          "end": 1.55038,
-          "confidence": 0.95
-        },
-        ...
-      ]
-    }
-  ]
-```
-
-
-
 ## Export SRT or VTT caption files
 
 You can export completed transcripts in both SRT and VTT format, which can be used for subtitles and captions in videos.
diff --git a/chapters/settings/word-timestamps.mdx b/chapters/settings/word-timestamps.mdx
@@ -0,0 +1,59 @@
+---
+title: Word-level timestamps
+description: "Get the exact timestamp for each word in your audio file."
+---
+
+<Icon icon="check" iconType="solid" color="green" size="20" /> **Asynchronous STT** &nbsp; &nbsp; &nbsp;
+<Icon icon="check" iconType="solid" color="green" size="20" /> **Real-Time STT**
+
+Instead of providing only the start and end timestamps of an utterance, the Gladia API delivers a start and end timestamp for each individual word. This is useful for detailed analysis, since you can pinpoint the exact moment each word is spoken, and it makes synchronization with audio or video files more accurate.
+
+## Configuration
+
+<Tabs>
+
+<Tab title='Asynchronous STT'>
+
+Word-level timestamps are always enabled for asynchronous STT.
+
+</Tab>
+
+<Tab title='Real-Time STT'>
+
+Word-level timestamps are configured within the `realtime_processing` object in your transcription request. The API reference is available [here](https://docs.gladia.io/api-reference/v2/live/init#body-realtime-processing-words-accurate-timestamps).
+
+```json
+{
+  "realtime_processing": {
+    "words_accurate_timestamps": true
+  }
+}
+```
+
+</Tab>
+
+</Tabs>
+
+## Results
+
+Each utterance contains a `words` property:
+
+```json
+{
+  // ... other utterance properties
+  "words": [
+    {
+      "word": "Split",
+      "start": 0.21001999999999998,
+      "end": 0.69015,
+      "confidence": 1
+    },
+    {
+      "word": " infinity",
+      "start": 0.91021,
+      "end": 1.55038,
+      "confidence": 0.95
+    }
+  ]
+}
+```
diff --git a/mint.json b/mint.json
@@ -185,7 +185,8 @@
       "pages": [
         "chapters/settings/language-options",
         "chapters/settings/vocabulary-spelling",
-        "chapters/settings/formatting"
+        "chapters/settings/formatting",
+        "chapters/settings/word-timestamps"
       ]
     },
     {

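Review note: the new Results section shows the shape of `words` but not how a client might consume it. The sketch below is one possible way to do that in Python, using only the field names visible in the sample payload (`word`, `start`, `end`, `confidence`). The helper names `summarize_words` and `utterance_text` are hypothetical, and the handling of leading whitespace is an assumption based on the sample value `" infinity"`, not documented API behavior.

```python
from typing import Any

# Sample utterance copied from the Results section of the new page.
utterance: dict[str, Any] = {
    "words": [
        {"word": "Split", "start": 0.21001999999999998, "end": 0.69015, "confidence": 1},
        {"word": " infinity", "start": 0.91021, "end": 1.55038, "confidence": 0.95},
    ]
}


def summarize_words(utt: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one row per word with trimmed text and its duration in seconds.

    Hypothetical helper: assumes each entry carries `word`, `start`, and `end`,
    as in the sample payload.
    """
    rows = []
    for w in utt.get("words", []):
        rows.append(
            {
                "text": w["word"].strip(),  # assumption: words may carry a leading space
                "start": w["start"],
                "end": w["end"],
                "duration": w["end"] - w["start"],
            }
        )
    return rows


def utterance_text(utt: dict[str, Any]) -> str:
    """Rebuild the utterance text by concatenating the raw `word` values.

    Assumption: spacing is encoded in the `word` values themselves, as the
    sample suggests, so no separator is added between words.
    """
    return "".join(w["word"] for w in utt.get("words", [])).strip()


rows = summarize_words(utterance)
text = utterance_text(utterance)
```

Keeping the raw `word` values for reconstruction and trimming only for display avoids guessing where spaces belong. If the API turns out not to encode spacing this way, `utterance_text` would need a different join rule.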