lqb/exo

Author	SHA1 Message	Date
Rory Clear	3384fc7294 update tinygrad version	5 months ago
Nel Nibcord	8b71d57da7 Removed inference state entirely	5 months ago
Nel Nibcord	65fdc99ccc Call no longer needs request_id	5 months ago
Nel Nibcord	90518a3bbe Hoisted caching to a wrapper class	5 months ago
Nel Nibcord	8205a5aebc Implemented per-request caching in tinygrad	5 months ago
Nel Nibcord	13572e6a40 Some stability improvements for tinygrad inference	5 months ago
Nel Nibcord	527c7a6e49 Applied new interface to tinygrad and dummy inference engines	5 months ago
Ogden Wells	fbec1d2b10 formatted changes	5 months ago
Ogden Wells	af01b23a07 added rope_scaling and tie_word_embeddings to llama transformer	5 months ago
Alex Cheema	f53056dede more compact operator formatting	8 months ago
Alex Cheema	14f2846a9c yapf set blank_line_before_nested_class_or_def to false	8 months ago
Alex Cheema	ea70c9fb76 reformat with yapf format.py	8 months ago
Alex Cheema	803dffd1c4 always call convert_from_huggingface with tinygrad models. this was broken by shard layer filtering which made the check sometimes fail. fixes #144	8 months ago
Alex Cheema	2be446546f refactor tinygrad, only load necessary layers for each shard fixes #128, enable JIT (much faster), prefill all layers not just the first shard fixes #12, use new ShardDownloader for more robust, parallel downloads	8 months ago
Alex Cheema	55bcad98e3 standardise tinygrad models/tokenizers so it can handle mlx hf	9 months ago
Alex Cheema	4cb36a7f55 increase max line length to 200	9 months ago
Alex Cheema	ce761038ac formatting / linting	9 months ago
Alex Cheema	46d618abed tiny fixes	9 months ago
Alex Cheema	dd8d18128c add an opaque inference_state that inference engines can use to pass around small state to other devices	9 months ago
Alex Cheema	5bbde22a23 move everything under exo module	9 months ago
