update readme and add plots
Signed-off-by: monica-sekoyan <[email protected]>
- .gitattributes +1 -0
- README.md +9 -9
- plots/asr.png +3 -0
- plots/en_x.png +3 -0
- plots/x_en.png +3 -0
.gitattributes
CHANGED
|
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
canary-1b-v2.nemo filter=lfs diff=lfs merge=lfs -text
|
|
|
|
|
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
canary-1b-v2.nemo filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
*.png filter=lfs diff=lfs merge=lfs -text
|
README.md
CHANGED
|
@@ -42,7 +42,7 @@ We will soon release a comprehensive **Canary-1b-v2 technical report** detailing
|
|
| 42 |
|
| 43 |
### Automatic Speech Recognition (ASR)
|
| 44 |
|
| 45 |
-
 using MUSAN music and noise samples \[16] on the [LibriSpeech Clean test set](https://www.openslr.org/12).
|
| 327 |
**Metric**: Word Error Rate (**WER**)
|
| 328 |
|
| 329 |
-
| **SNR (dB)**
|
| 330 |
-
| ---------------
|
| 331 |
-
| **`Canary-1b-v2`** | 2.
|
| 332 |
|
| 333 |
|
| 334 |
### Hallucination Robustness
|
|
@@ -346,8 +346,8 @@ Number of characters per minute on [MUSAN](https://www.openslr.org/17) \[16] 48
|
|
| 346 |
|
| 347 |
| **Dataset** | **WER ↓** |
|
| 348 |
| ----------------------- | --------- |
|
| 349 |
-
| Earnings-22 |
|
| 350 |
-
| This American Life |
|
| 351 |
|
| 352 |
**Note:** Presented WERs do not include Punctuation and Capitalization errors.
|
| 353 |
|
|
|
|
| 42 |
|
| 43 |
### Automatic Speech Recognition (ASR)
|
| 44 |
|
| 45 |
+

|
| 46 |
|
| 47 |
*Figure 1: ASR WER comparison across different models. This does not include Punctuation and Capitalisation errors.*
|
| 48 |
|
|
|
|
| 52 |
|
| 53 |
#### X → English
|
| 54 |
|
| 55 |
+

|
| 56 |
|
| 57 |
*Figure 2: AST X → En COMET scores comparison across different models*
|
| 58 |
|
| 59 |
#### English → X
|
| 60 |
|
| 61 |
|
| 62 |
+

|
| 63 |
|
| 64 |
*Figure 3: AST En → X COMET scores comparison across different models*
|
| 65 |
|
|
|
|
| 283 |
|
| 284 |
| **WER ↓** | Fleurs-25 Langs | CoVoST-13 Langs | MLS - 6 Langs |
|
| 285 |
| --------------- | -------------------- | -------------------- | ------------------ |
|
| 286 |
+
| **`Canary-1b-v2`** | 8.40% | 8.85% | 7.27% |
|
| 287 |
|
| 288 |
|
| 289 |
**Note:** Presented WERs do not include Punctuation and Capitalization errors.
|
|
|
|
| 326 |
Performance across different Signal-to-Noise Ratios (SNR) using MUSAN music and noise samples \[16] on the [LibriSpeech Clean test set](https://www.openslr.org/12).
|
| 327 |
**Metric**: Word Error Rate (**WER**)
|
| 328 |
|
| 329 |
+
| **SNR (dB)** | 100 | 10 | 5 | 0 | -5 |
|
| 330 |
+
| --------------- | ----- | ----- | ----- | ----- | ----- |
|
| 331 |
+
| **`Canary-1b-v2`** | 2.18% | 2.29% | 2.80% | 5.08% | 19.38% |
|
| 332 |
|
| 333 |
|
| 334 |
### Hallucination Robustness
|
|
|
|
| 346 |
|
| 347 |
| **Dataset** | **WER ↓** |
|
| 348 |
| ----------------------- | --------- |
|
| 349 |
+
| Earnings-22 | 13.51% |
|
| 350 |
+
| This American Life | 8.65% |
|
| 351 |
|
| 352 |
**Note:** Presented WERs do not include Punctuation and Capitalization errors.
|
| 353 |
|
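The README notes in this diff state that the presented WERs do not include punctuation and capitalization errors. As a rough illustration of what that means in practice, the sketch below computes word error rate after lowercasing and stripping ASCII punctuation. The `normalize` and `wer` helpers are hypothetical names introduced here; the normalization rules are an assumption and not necessarily the text normalizer used to produce the numbers in the README.

```python
import string


def normalize(text: str) -> list[str]:
    """Lowercase and drop ASCII punctuation (assumed normalization, for illustration)."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

With this normalization, `wer("Hello, world!", "hello world")` counts no errors, because the only differences are punctuation and casing.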
plots/asr.png
ADDED
|
|
plots/en_x.png
ADDED
|
|
plots/x_en.png
ADDED
|
|