Alex Cheema
|
394935711b
add all chat endpoints without v1 prefix to support ollama / openwebui. related: #175
|
11 months ago |
Alex Cheema
|
70172d7cb9
add /v1/models endpoint and change Content-Type of streamed response to text/event-stream. fixes #175
|
11 months ago |
Alex Cheema
|
d917778e2b
update mlx to 0.17.1 (not sure where 0.17.0 went, it disappeared from PyPI)
|
11 months ago |
Alex Cheema
|
2667c8af44
cleaner download_progress
|
11 months ago |
Alex Cheema
|
f46d077beb
fix font dependencies for tinychat. related: #172
|
11 months ago |
Alex Cheema
|
8a4928f80c
fix gitignore to not ignore tinychat static files
|
11 months ago |
Alex Cheema
|
d515d9efa3
explicitly use absolute paths for tinychat deps
|
11 months ago |
Alex Cheema
|
a386c35fde
script to update tinychat deps
|
11 months ago |
Alex Cheema
|
3791e669a4
download tinychat dependencies all to local dir so we dont need internet
|
11 months ago |
Alex Cheema
|
85bab25ac0
fix local check if dir does not exist
|
11 months ago |
Alex Cheema
|
59c4393d95
first try loading tokenizer from local path instead of always going to the internet first. significant speed ups
|
11 months ago |
Alex Cheema
|
784e6bae21
print traceback on topology collection error
|
11 months ago |
Alex Cheema
|
8cad0e1849
only use_fast tokenizer for Mistral Large until this inconsistency bug is fixed #171
|
11 months ago |
Alex Cheema
|
85279007b3
hotfix edge case where we try to render before tokenizer is set
|
11 months ago |
Alex Cheema
|
09a8468395
upgrade mlx to 0.17.0
|
11 months ago |
Alex Cheema
|
1f9d16ec78
run tokenizers test in ci, run all models available
|
11 months ago |
Alex Cheema
|
6243846eeb
ci logs
|
11 months ago |
Alex Cheema
|
cfe980bdaa
simplify ci
|
11 months ago |
Alex Cheema
|
9513c4fd17
ci tail log files
|
11 months ago |
Alex Cheema
|
7a02acdcd5
fix ci output streaming
|
11 months ago |
Alex Cheema
|
ad695696a5
run on every commit on main, require approval on other branches
|
11 months ago |
Alex Cheema
|
710e5a31e7
TODO for why use_fast=False is giving inconsistent behaviour (no spaces decoding individual tokens) for Mistral-Large-Instruct-2407-4bit
|
11 months ago |
Alex Cheema
|
e17e5f9a41
tests for tokenizers. unfortunately use_fast=False and use_fast=True give different behaviour
|
11 months ago |
Alex Cheema
|
0d218e244e
use fast AutoProcessor fixes #164 tokenizer issues with mistral-large.
|
11 months ago |
Alex Cheema
|
23ae5e92c5
hold circleci tests for approval on non-main branches
|
11 months ago |
Alex Cheema
|
d54944f4ca
stream outputs from chatgpt api integration test
|
11 months ago |
Alex Cheema
|
1133e27ad3
Merge pull request #166 from exo-explore/formatting
|
11 months ago |
Alex Cheema
|
f53056dede
more compact operator formatting
|
11 months ago |
Alex Cheema
|
14f2846a9c
yapf set blank_line_before_nested_class_or_def to false
|
11 months ago |
Alex Cheema
|
ea70c9fb76
reformat with yapf format.py
|
11 months ago |