lqb/exo

Author	SHA1 Message	Date
Alex Cheema	ea70c9fb76 reformat with yapf format.py	10 months ago
Alex Cheema	647ffb94eb increase cli generation timeout	10 months ago
Alex Cheema	dd24e7db1e only ignore CancelledError inside stop	10 months ago
Alex Cheema	2e1233357c ignore CancelledError when stopping the server	10 months ago
Alex Cheema	ae35ada19b fix headless mode with --disable-tui	10 months ago
Alex Cheema	b95916e0b5 show prompts and outputs in tui	10 months ago
Alex Cheema	e84304317c add a cli that can be triggered with --run-model <model> --prompt <prompt>	10 months ago
Alex Cheema	7ddb80e245 f-string expression part cannot include a backslash fixes #142	10 months ago
Alex Cheema	6c1bf127b3 add --max-parallel-downloads flag that limits the number of downloads at a time with asyncio.semaphore	10 months ago
Alex Cheema	e6902b2fcf add --download-quick-check flag to bypass the hf api calls / remote file checks	10 months ago
Alex Cheema	71591d2ebc display all interfaces web chat and chatgpt api are available on fixes #134	10 months ago
Alex Cheema	6bddb2a9dc download edge cases	10 months ago
Alex Cheema	f29963f41e preemptively start downloads when any node starts processing a prompt. this fixes #104	10 months ago
Alex Cheema	476a714bbb make a separate ShardDownloader abstract class w HFShardDownloader. this opens up plugging in different methods of downloading model shards e.g. #79 / #16	10 months ago
Alex Cheema	d22ed12e7b bring tinygrad to parity with mlx on llama models, show progress of each download file	10 months ago
Alex Cheema	545a486ed3 separate hf_helpers, make extra dir with download_hf script, unify downloading so tinygrad uses the same method as mlx and interoperable model formats	10 months ago
Alex Cheema	0bfb8e3b6d sticky node ids #16	11 months ago
Alex Cheema	d6a7e46324 async model downloading with download progress. fixes #102. related: #16 #104	11 months ago
Alex Cheema	57b2f2a4e2 fix ruff lint errors	11 months ago
Alex Cheema	9a373c2bb0 make configurable discovery timeout	11 months ago
Alex Cheema	63a05d5b4f make configurable discovery timeout	11 months ago
Alex Cheema	174cff071e Merge pull request #58 from jakobdylanc/main	11 months ago
Alex Cheema	b0e7dd9d2d add max-generate-tokens flag fixes #54	11 months ago
JakobDylanC	f2f61ccee6 inference engine selection improvements	11 months ago
Alex Cheema	4e46232364 add simple prometheus metrics collection, with a prometheus / grafana instance for live dashboard. related: #22	11 months ago
Alex Cheema	2e419ba211 Merge pull request #48 from itsknk/intel-mac	11 months ago
itsknk	e934664168 implement dynamic inference engine selection	11 months ago
Alec Potluri	db583a863f disable tui flag	11 months ago
Alex Cheema	e49924e1b9 add chatgpt-api-response-timeout-secs flag, set this to 20 mins in test	11 months ago
Alex Cheema	a342e1abd8 add web url and chatgpt api endpoint to panel (fixes #43), fix a rounding error in the partition to shard mapping implementation	11 months ago
