9 Commits

Author SHA1 Message Date
Akash Mahajan
3ec7bfffe0
py : make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files (#1001)
* patch checkpoint convert script to keep compatibility with older hf_transformers whisper tokenizer

* typo fix
2023-06-25 13:50:14 +03:00
AsukaMinato
94aa56f19e
minor : improve C++ and Python style (#768)
* use some STL functions

* use self.field than setattr, use pathlib.Path

* recover some format

* const some iter

* Keep the original

* 2 space
2023-04-29 10:06:25 +03:00
Ivan Gorin
62b51c3070
models : change convert-pt-to-ggml to use .tiktoken tokenizer files (#725)
2023-04-14 19:50:39 +03:00
Georgi Gerganov
00f46dbc1d
models : add usage comments to the HF convert script (#157)
2022-11-23 23:22:40 +02:00
Georgi Gerganov
e70e5c8b53
models : simplify the conversion script
"transformers" dependency is not actually needed
2022-11-16 19:22:32 +02:00
Georgi Gerganov
46a68fb9b5
minor : remove one more redundant line
2022-11-11 18:02:58 +02:00
Georgi Gerganov
ccd56a9c5b
minor : fix double float32 conversion in python script
2022-11-11 17:58:51 +02:00
Joonas Pihlajamaa
4e887dc350
Add enconding parameter to vocab.json opening to fix errors
2022-10-23 11:55:01 +03:00
Georgi Gerganov
6b45e37b2b
Update README.md and finalize the whisper.wasm example
2022-10-22 18:54:01 +03:00