Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Rory Clear | 3384fc7294 | update tinygrad version | 5 months ago |
| Nel Nibcord | 8b71d57da7 | Removed inference state entirely | 5 months ago |
| Nel Nibcord | 65fdc99ccc | Call no longer needs request_id | 5 months ago |
| Nel Nibcord | 90518a3bbe | Hoisted caching to a wrapper class | 5 months ago |
| Nel Nibcord | 8205a5aebc | Implemented per-request caching in tinygrad | 5 months ago |
| Nel Nibcord | 13572e6a40 | Some stability improvements for tinygrad inference | 5 months ago |
| Nel Nibcord | 527c7a6e49 | Applied new interface to tinygrad and dummy inference engines | 5 months ago |
| Ogden Wells | fbec1d2b10 | formatted changes | 5 months ago |
| Ogden Wells | af01b23a07 | added rope_scaling and tie_word_embeddings to llama transformer | 5 months ago |
| Alex Cheema | f53056dede | more compact operator formatting | 8 months ago |
| Alex Cheema | 14f2846a9c | yapf set blank_line_before_nested_class_or_def to false | 8 months ago |
| Alex Cheema | ea70c9fb76 | reformat with yapf format.py | 8 months ago |
| Alex Cheema | 803dffd1c4 | always call convert_from_huggingface with tinygrad models. this was broken by shard layer filtering which made the check sometimes fail. fixes #144 | 8 months ago |
| Alex Cheema | 2be446546f | refactor tinygrad, only load necessary layers for each shard fixes #128, enable JIT (much faster), prefill all layers not just the first shard fixes #12, use new ShardDownloader for more robust, parallel downloads | 8 months ago |
| Alex Cheema | 55bcad98e3 | standardise tinygrad models/tokenizers so it can handle mlx hf | 9 months ago |
| Alex Cheema | 4cb36a7f55 | increase max line length to 200 | 9 months ago |
| Alex Cheema | ce761038ac | formatting / linting | 9 months ago |
| Alex Cheema | 46d618abed | tiny fixes | 9 months ago |
| Alex Cheema | dd8d18128c | add an opaque inference_state that inference engines can use to pass around small state to other devices | 9 months ago |
| Alex Cheema | 5bbde22a23 | move everything under exo module | 9 months ago |