Nel Nibcord
|
03924cf9af
Need tokens. Also, for some reason this gets mad if we have non-integral tokens but this isn't a problem elsewhere?
|
8 months ago |
Alex Cheema
|
854a7c22ac
Merge pull request #436 from blindcrone/unit-tests
|
8 months ago |
Nel Nibcord
|
e463cd8196
Ok not sure we're using this but just in case
|
8 months ago |
Nel Nibcord
|
7e3ad9abc8
Missed a spot
|
8 months ago |
Alex Cheema
|
b8b4ea3633
Merge pull request #435 from blindcrone/unit-tests
|
8 months ago |
Nel Nibcord
|
1cd3efbe4c
Fixed unit tests
|
8 months ago |
Alex Cheema
|
b400a442ee
Merge pull request #420 from blindcrone/refactor-inference
|
8 months ago |
Nel Nibcord
|
65fdc99ccc
Call no longer needs request_id
|
8 months ago |
Nel Nibcord
|
90518a3bbe
Hoisted caching to a wrapper class
|
8 months ago |
Nel Nibcord
|
bf33ffde87
This doesn't need to be a tuple really
|
8 months ago |
Nel Nibcord
|
10e9f44a10
one-line output buffering
|
8 months ago |
Nel Nibcord
|
52ef6ee4a3
Made temperature and top_p available to the inference engine sample interfaces
|
8 months ago |
Nel Nibcord
|
8205a5aebc
Implemented per-request caching in tinygrad
|
8 months ago |
Nel Nibcord
|
13572e6a40
Some stability improvements for tinygrad inference
|
8 months ago |
Nel Nibcord
|
aefc0d7c51
I think this is more faithful to how it was originally done
|
8 months ago |
Nel Nibcord
|
c06b5f3b56
Corrected type annotations
|
8 months ago |
Nel Nibcord
|
9b66758b59
Make sure they're np arrays
|
8 months ago |
Nel Nibcord
|
b9d0fb6825
Since infer_prompt is a thin wrapper that works the same for all inference engines, we can de-abstract it
|
8 months ago |
Nel Nibcord
|
527c7a6e49
Applied new interface to tinygrad and dummy inference engines
|
8 months ago |
Nel Nibcord
|
52b91de817
Changed model classname due to the sharding being done elsewhere
|
8 months ago |
Nel Nibcord
|
34019e4608
Forgot an abstractmethod
|
8 months ago |
Nel Nibcord
|
82cce4408e
Some initial inference engine refactors for enabling training
|
8 months ago |
Alex Cheema
|
4713bc5acd
Merge pull request #431 from exo-explore/qwen32b
|
8 months ago |
Alex Cheema
|
e9ba815c21
add qwen2.5 coder 3b,14b,32b
|
8 months ago |
Alex Cheema
|
a0b6adad85
Merge pull request #430 from austinbv/patch-1
|
8 months ago |
Austin
|
5435671cd9
Add 32b Qwen 2.5
|
8 months ago |
Alex Cheema
|
526f8a7ad5
Merge pull request #429 from exo-explore/readme_hf_home
|
8 months ago |
Alex Cheema
|
167e756b31
add documentation of HF_HOME model storage location in README. fixes #427
|
8 months ago |
Alex Cheema
|
b41b7d778a
Merge pull request #426 from exo-explore/tinygrad_ci_test
|
8 months ago |
Alex Cheema
|
9e4366f36b
tinygrad ci
|
8 months ago |