Author | Commit | Message | Date
Alex Cheema | faa1319470 | disable chatgpt api integration test, github changed something in their mac runners? perhaps time to switch over to circleci like mlx | 1 year ago
Alex Cheema | 67a1aaa823 | check processes in github workflow | 1 year ago
Alex Cheema | 9a3ac273a9 | Merge pull request #77 from Cloud1590/main | 1 year ago
Alex Cheema | 628d8679b0 | force mlx inference engine in github workflow, where it defaults to tinygrad because it's running on 'model': 'Apple Virtual Machine 1', 'chip': 'Apple M1 (Virtual)' | 1 year ago
Alex Cheema | e856d7f7f9 | log chatgpt integration test output from each process on github workflow failure | 1 year ago
Alex Cheema | 5a23376059 | add log_request middleware if DEBUG>=2 to chatgpt api to debug api issues, default always to llama-3.1-8b | 1 year ago
Alex Cheema | 2084784470 | per-request kv cache, remove all explicit reset functionality as it wasnt used. fixes #67 | 1 year ago
Alex Cheema | dd8c5d63a9 | add support for mistral nemo and mistral large | 1 year ago
Alex Cheema | 03fe7a058c | more robust message parsing fixes #81 | 1 year ago
Cloud1590 | 0770c59d5f | Update main.py | 1 year ago
Cloud1590 | e1792e29b9 | chore: Update argparse action for --disable-tui flag | 1 year ago
Cloud1590 | 2c71a4b1ac | Update device_capabilities.py | 1 year ago
Alex Cheema | 942012577a | styling for tinychat model selector | 1 year ago
Alex Cheema | 5ac6b6a717 | clearer documentation on accessing web UI and chatgpt-api | 1 year ago
Alex Cheema | 9a373c2bb0 | make configurable discovery timeout | 1 year ago
Alex Cheema | 63a05d5b4f | make configurable discovery timeout | 1 year ago
Alex Cheema | 8d2bb819bf | add llama-3.1 notice to README | 1 year ago
Alex Cheema | 7a2fbf22b9 | add model selection to tinychat | 1 year ago
Alex Cheema | bbfd5adc20 | add support for llama3.1 (8b, 70b, 405b). bump mlx up to 0.16.0 and mlx-lm up to 0.16.1. fixes #66 | 1 year ago
Alex Cheema | 5496cd85f5 | Revert "smart model downloading for mlx #16" | 1 year ago
Alex Cheema | 3a230f3b44 | smart model downloading for mlx #16 | 1 year ago
Alex Cheema | 174cff071e | Merge pull request #58 from jakobdylanc/main | 1 year ago
Alex Cheema | b0e7dd9d2d | add max-generate-tokens flag fixes #54 | 1 year ago
JakobDylanC | f2f61ccee6 | inference engine selection improvements | 1 year ago
Alex Cheema | 4e46232364 | add simple prometheus metrics collection, with a prometheus / grafana instance for live dashboard. related: #22 | 1 year ago
Alex Cheema | 2e419ba211 | Merge pull request #48 from itsknk/intel-mac | 1 year ago
itsknk | e934664168 | implement dynamic inference engine selection | 1 year ago
Alex Cheema | 1fcbe18baa | fix m2 ultra flops | 1 year ago
Alex Cheema | 9d9d257eb2 | reduce chatgpt api response timeout in test | 1 year ago
Alex Cheema | 8850187b8a | tell the mofo in the workflow to keep responses concise | 1 year ago