Alex Cheema
|
84afdbcbe8
add --download-quick-check flag to bypass the hf api calls / remote file checks
|
10 months ago |
Alex Cheema
|
1f736dacd7
Merge pull request #135 from exo-explore/exo_interfaces
|
10 months ago |
Alex Cheema
|
71591d2ebc
display all interfaces web chat and chatgpt api are available on (fixes #134)
|
10 months ago |
Alex Cheema
|
3bd5a116df
ignore files that don't match allow patterns
|
10 months ago |
Alex Cheema
|
3db3e8294c
make download panel slightly larger
|
10 months ago |
Alex Cheema
|
9e78c42b4b
Merge pull request #124 from exo-explore/refactor_model_download
|
10 months ago |
Alex Cheema
|
5112f53a37
trigger ci
|
10 months ago |
Alex Cheema
|
047ef48c1d
use separate hf cache dirs for chatgpt api integration test. it's an unusual setup where we're running 2 exo instances on the same device which share a disk and hf cache
|
10 months ago |
Alex Cheema
|
2be446546f
refactor tinygrad: only load necessary layers for each shard (fixes #128), enable JIT (much faster), prefill all layers not just the first shard (fixes #12), use new ShardDownloader for more robust, parallel downloads
|
10 months ago |
Alex Cheema
|
357331c55f
remove some logs, make get_allow_patterns out of class
|
10 months ago |
Alex Cheema
|
b1eb05ed47
debug level 7 for tests
|
10 months ago |
Alex Cheema
|
09a9abc065
fix inference engine test
|
10 months ago |
Alex Cheema
|
67c269076b
Merge branch 'main' into refactor_model_download
|
10 months ago |
Alex Cheema
|
dd41026c5b
cache completed download paths
|
10 months ago |
Alex Cheema
|
706488732f
disable prefix matching on prompts. causes subsequent requests to fail with cannot be broadcast. hotfix for #130
|
10 months ago |
Alex Cheema
|
35b7042e70
upgrade mlx to 0.16.1
|
10 months ago |
Alex Cheema
|
b181f8aa82
handle writing responses errors
|
10 months ago |
Alex Cheema
|
7ec660bba6
fix shard download
|
10 months ago |
Alex Cheema
|
29f154597c
init active_downloads
|
10 months ago |
Alex Cheema
|
6bddb2a9dc
download edge cases
|
10 months ago |
Alex Cheema
|
f29963f41e
preemptively start downloads when any node starts processing a prompt. this fixes #104
|
10 months ago |
Alex Cheema
|
7a65a96e52
download progress styling
|
10 months ago |
Alex Cheema
|
c59ceab821
viz spacing
|
10 months ago |
Alex Cheema
|
0a588d0443
viz styles
|
10 months ago |
Alex Cheema
|
d9f232b313
cleaner download progress ui
|
10 months ago |
Alex Cheema
|
476a714bbb
make a separate ShardDownloader abstract class w HFShardDownloader. this opens up plugging in different methods of downloading model shards e.g. #79 / #16
|
10 months ago |
Alex Cheema
|
d22ed12e7b
bring tinygrad to parity with mlx on llama models, show progress of each download file
|
10 months ago |
Alex Cheema
|
45142dab26
tests
|
10 months ago |
Alex Cheema
|
545a486ed3
separate hf_helpers, make extra dir with download_hf script, unify downloading so tinygrad uses the same method as mlx and interoperable model formats
|
10 months ago |
Alex Cheema
|
9014efae86
minimal script to download from hf async with progress
|
10 months ago |