Use AutoProcessor with use_fast=False, since use_fast=True has a bug where whitespace is removed on single-token decodes

Alex Cheema, 1 year ago
Commit: 2d20000964
1 file changed, 2 insertions and 2 deletions
  exo/api/chatgpt_api.py (+2, -2)

exo/api/chatgpt_api.py

@@ -73,8 +73,8 @@ def resolve_tinygrad_tokenizer(model_id: str):
 
 async def resolve_tokenizer(model_id: str):
   try:
-    if DEBUG >= 2: print(f"Trying to AutoProcessor for {model_id}")
-    processor = AutoProcessor.from_pretrained(model_id)
+    if DEBUG >= 2: print(f"Trying AutoProcessor for {model_id}")
+    processor = AutoProcessor.from_pretrained(model_id, use_fast=False)
     processor.eos_token_id = processor.tokenizer.eos_token_id
     processor.encode = processor.tokenizer.encode
     return processor
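
The commit message describes whitespace disappearing when tokens are decoded one at a time. The sketch below is a toy model of that failure mode, not the transformers API and not how exo decodes tokens: `toy_decode`, `stream_per_token`, and `stream_by_prefix` are hypothetical helpers, and the assumption that the decoder marks word starts with a `▁` piece prefix and drops the leading space of its output is made purely for illustration.

```python
# Toy model only: not the transformers API, and not how exo decodes tokens.
WORD_START = "\u2581"  # assumed marker meaning "a space precedes this piece"

def toy_decode(pieces):
    """Hypothetical decoder: join pieces, turn markers into spaces, drop one leading space."""
    text = "".join(pieces).replace(WORD_START, " ")
    return text[1:] if text.startswith(" ") else text

def stream_per_token(pieces):
    """Decode each piece on its own; every chunk loses its leading space."""
    return [toy_decode([p]) for p in pieces]

def stream_by_prefix(pieces):
    """Decode the growing prefix and emit only the newly added text."""
    chunks, emitted = [], ""
    for i in range(1, len(pieces) + 1):
        full = toy_decode(pieces[:i])
        chunks.append(full[len(emitted):])
        emitted = full
    return chunks

pieces = [WORD_START + "Hello", WORD_START + "wor", "ld", WORD_START + "again"]
print("".join(stream_per_token(pieces)))  # Helloworldagain
print("".join(stream_by_prefix(pieces)))  # Hello world again
```

In this toy model, per-token decoding glues the words together, while decoding the growing prefix keeps the spaces. The commit itself takes a different route and simply switches to the slow tokenizer via `use_fast=False`.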