lqb/exo

Author	SHA1 Message	Date
Alex Cheema	047ef48c1d use separate hf cache dirs for chatgpt api integration test. its an unusual setup where we're running 2 exo instances on the same device which share a disk and hf cache	1 year ago
Alex Cheema	2be446546f refactor tinygrad, only load necessary layers for each shard fixes #128, enable JIT (much faster), prefill all layers not just the first shard fixes #12, use new ShardDownloader for more robust, parallel downloads	1 year ago
Alex Cheema	357331c55f remove some logs, make get_allow_patterns out of class	1 year ago
Alex Cheema	b1eb05ed47 debug level 7 for tests	1 year ago
Alex Cheema	09a9abc065 fix inference engine test	1 year ago
Alex Cheema	67c269076b Merge branch 'main' into refactor_model_download	1 year ago
Alex Cheema	dd41026c5b cache completed download paths	1 year ago
Alex Cheema	706488732f disable prefix matching on prompts. causes subsequent requests to fail with cannot be broadcast. hotfix for #130	1 year ago
Alex Cheema	35b7042e70 upgrade mlx to 0.16.1	1 year ago
Alex Cheema	b181f8aa82 handle writing responses errors	1 year ago
Alex Cheema	7ec660bba6 fix shard download	1 year ago
Alex Cheema	29f154597c init active_downloadsa	1 year ago
Alex Cheema	6bddb2a9dc download edge cases	1 year ago
Alex Cheema	f29963f41e preemptively start downloads when any node starts processing a prompt. this fixes #104	1 year ago
Alex Cheema	7a65a96e52 download progress styling	1 year ago
Alex Cheema	c59ceab821 viz spacing	1 year ago
Alex Cheema	0a588d0443 viz styles	1 year ago
Alex Cheema	d9f232b313 cleaner download progress ui	1 year ago
Alex Cheema	476a714bbb make a separate ShardDownloader abstract class w HFShardDownloader. this opens up plugging in different methods of downloading model shards e.g. #79 / #16	1 year ago
Alex Cheema	d22ed12e7b bring tinygrad to parity with mlx on llama models, show progress of each download file	1 year ago
Alex Cheema	45142dab26 tests	1 year ago
Alex Cheema	545a486ed3 separate hf_helpers, make extra dir with download_hf script, unify downloading so tinygrad uses the same method as mlx and interoperable model formats	1 year ago
Alex Cheema	9014efae86 minimal script to download from hf async with progress	1 year ago
Alex Cheema	4b4dfb7fd0 Merge pull request #117 from inksong/main	1 year ago
Alex Cheema	55bcad98e3 standardise tinygrad models/tokenizers so it can handle mlx hf	1 year ago
jinke	6b1960baec fix nvidia capabilities	1 year ago
Alex Cheema	4a5c6cc580 t	1 year ago
Alex Cheema	f93ae2b545 disable tinygrad test for now. need a larger runner or smalelr model	1 year ago
Alex Cheema	201996af8a macos	1 year ago
Alex Cheema	0eb5c0c624 mac runners	1 year ago

