lqb/exo

Author	SHA1 Message	Date
Alex Cheema	f051ebe6e0 remove accidentally added files	1 year ago
Mark Kockerbeck	5eafd5a305 try/except for decode, #75	1 year ago
Mark Kockerbeck	d2fa7b247e Showing the message only if successfully decoded, #75	1 year ago
Mark Kockerbeck	f1cd5ae7a6 Merge branch 'main' of github.com:xeb/exo	1 year ago
Mark Kockerbeck	4f5ab78d9d Addressing issue #75 to avoid decoding binary packets	1 year ago
Alex Cheema	5a23376059 add log_request middleware if DEBUG>=2 to chatgpt api to debug api issues, default always to llama-3.1-8b	1 year ago
Alex Cheema	2084784470 per-request kv cache, remove all explicit reset functionality as it wasnt used. fixes #67	1 year ago
Alex Cheema	dd8c5d63a9 add support for mistral nemo and mistral large	1 year ago
Alex Cheema	03fe7a058c more robust message parsing fixes #81	1 year ago
Alex Cheema	942012577a styling for tinychat model selector	1 year ago
Alex Cheema	5ac6b6a717 clearer documentation on accessing web UI and chatgpt-api	1 year ago
Alex Cheema	9a373c2bb0 make configurable discovery timeout	1 year ago
Alex Cheema	63a05d5b4f make configurable discovery timeout	1 year ago
Alex Cheema	8d2bb819bf add llama-3.1 notice to README	1 year ago
Alex Cheema	7a2fbf22b9 add model selection to tinychat	1 year ago
Alex Cheema	bbfd5adc20 add support for llama3.1 (8b, 70b, 405b). bump mlx up to 0.16.0 and mlx-lm up to 0.16.1. fixes #66	1 year ago
Alex Cheema	5496cd85f5 Revert "smart model downloading for mlx #16"	1 year ago
Alex Cheema	3a230f3b44 smart model downloading for mlx #16	1 year ago
Alex Cheema	174cff071e Merge pull request #58 from jakobdylanc/main	1 year ago
Alex Cheema	b0e7dd9d2d add max-generate-tokens flag fixes #54	1 year ago
JakobDylanC	f2f61ccee6 inference engine selection improvements	1 year ago
Alex Cheema	4e46232364 add simple prometheus metrics collection, with a prometheus / grafana instance for live dashboard. related: #22	1 year ago
Alex Cheema	2e419ba211 Merge pull request #48 from itsknk/intel-mac	1 year ago
itsknk	e934664168 implement dynamic inference engine selection	1 year ago
Alex Cheema	1fcbe18baa fix m2 ultra flops	1 year ago
Alex Cheema	9d9d257eb2 reduce chatgpt api response timeout in test	1 year ago
Alex Cheema	8850187b8a tell the mofo in the workflow to keep responses concise	1 year ago
Alex Cheema	052ee1c7e9 cache isolation per workflow job	1 year ago
Alex Cheema	ce41e653c0 check cached files in workflow	1 year ago
Alex Cheema	3d82338c21 debug cached files in workflow	1 year ago

Commit History
