Alex Cheema
|
75681a9707
use async for all file ops, cache fetch_file_list, cache commit hash, quickly check file sizes on disk before making requests
|
1 year ago |
Alex Cheema
|
6c1bf127b3
add --max-parallel-downloads flag that limits the number of downloads at a time with asyncio.semaphore
|
1 year ago |
Alex Cheema
|
440fd35ea7
upgrade aiohttp
|
1 year ago |
Alex Cheema
|
8e6414b257
spacing
|
1 year ago |
Alex Cheema
|
e8267e7387
LAPTOP GPU and Laptop GPU prefixes
|
1 year ago |
Alex Cheema
|
31641d1023
tinygrad select model size
|
1 year ago |
Alex Cheema
|
7b704e8d98
Merge pull request #136 from exo-explore/exo_interfaces
|
1 year ago |
Alex Cheema
|
e6902b2fcf
add --download-quick-check flag to bypass the hf api calls / remote file checks
|
1 year ago |
Alex Cheema
|
84afdbcbe8
add --download-quick-check flag to bypass the hf api calls / remote file checks
|
1 year ago |
Alex Cheema
|
1f736dacd7
Merge pull request #135 from exo-explore/exo_interfaces
|
1 year ago |
Alex Cheema
|
71591d2ebc
display all interfaces web chat and chatgpt api are available on fixes #134
|
1 year ago |
Alex Cheema
|
3bd5a116df
ignore files that don't match allow patterns
|
1 year ago |
Alex Cheema
|
3db3e8294c
make download panel slightly larger
|
1 year ago |
Alex Cheema
|
9e78c42b4b
Merge pull request #124 from exo-explore/refactor_model_download
|
1 year ago |
Alex Cheema
|
5112f53a37
trigger ci
|
1 year ago |
Alex Cheema
|
047ef48c1d
use separate hf cache dirs for chatgpt api integration test. it's an unusual setup where we're running 2 exo instances on the same device which share a disk and hf cache
|
1 year ago |
Alex Cheema
|
2be446546f
refactor tinygrad, only load necessary layers for each shard fixes #128, enable JIT (much faster), prefill all layers not just the first shard fixes #12, use new ShardDownloader for more robust, parallel downloads
|
1 year ago |
Alex Cheema
|
357331c55f
remove some logs, make get_allow_patterns out of class
|
1 year ago |
Alex Cheema
|
b1eb05ed47
debug level 7 for tests
|
1 year ago |
Alex Cheema
|
09a9abc065
fix inference engine test
|
1 year ago |
Alex Cheema
|
67c269076b
Merge branch 'main' into refactor_model_download
|
1 year ago |
Alex Cheema
|
dd41026c5b
cache completed download paths
|
1 year ago |
Alex Cheema
|
706488732f
disable prefix matching on prompts. causes subsequent requests to fail with cannot be broadcast. hotfix for #130
|
1 year ago |
Alex Cheema
|
35b7042e70
upgrade mlx to 0.16.1
|
1 year ago |
Alex Cheema
|
b181f8aa82
handle writing responses errors
|
1 year ago |
Alex Cheema
|
7ec660bba6
fix shard download
|
1 year ago |
Alex Cheema
|
29f154597c
init active_downloads
|
1 year ago |
Alex Cheema
|
6bddb2a9dc
download edge cases
|
1 year ago |
Alex Cheema
|
f29963f41e
preemptively start downloads when any node starts processing a prompt. this fixes #104
|
1 year ago |
Alex Cheema
|
7a65a96e52
download progress styling
|
1 year ago |
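
Commit 6c1bf127b3 in the log above mentions limiting the number of simultaneous downloads with an asyncio semaphore. The following is a minimal sketch of that general pattern, not the repository's actual code: the names `download_all`, `download_one`, and `max_parallel_downloads` are hypothetical, and `asyncio.sleep` stands in for a real network transfer. It also tracks the peak number of concurrent downloads so the limit can be observed.

```python
import asyncio


async def download_all(names, max_parallel_downloads=2):
    """Run one fake download per name, at most `max_parallel_downloads` at a time.

    Hypothetical helper for illustration only; returns (results, peak_concurrency).
    """
    semaphore = asyncio.Semaphore(max_parallel_downloads)
    active = 0  # downloads currently inside the semaphore
    peak = 0    # highest value `active` ever reached

    async def download_one(name):
        nonlocal active, peak
        # Only `max_parallel_downloads` coroutines can hold the semaphore at once;
        # the rest wait here until a slot is released.
        async with semaphore:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # placeholder for the real transfer
            active -= 1
            return name

    # gather() schedules every download immediately; the semaphore does the throttling.
    results = await asyncio.gather(*(download_one(n) for n in names))
    return list(results), peak


if __name__ == "__main__":
    files = ["a.safetensors", "b.safetensors", "c.safetensors", "d.json", "e.json"]
    done, peak = asyncio.run(download_all(files, max_parallel_downloads=2))
    print(f"downloaded {len(done)} files, peak concurrency {peak}")
```

Creating the semaphore inside the coroutine keeps it bound to the running event loop, and `asyncio.gather` preserves input order in its results regardless of which download finishes first.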