inference | b01f69bb6b add support for multiple concurrent requests with request ids | 1 year ago
networking | e6f387a690 handle is_finished | 1 year ago
orchestration | e6f387a690 handle is_finished | 1 year ago
topology | 36b8456798 collect global topology with local peer visibility, ring memory weighted partitioning strategy | 1 year ago
.gitignore | 850b72d3ea make StatefulShardedModel callable, add some tests for mlx sharded inference | 1 year ago
example_user.py | 36b8456798 collect global topology with local peer visibility, ring memory weighted partitioning strategy | 1 year ago
example_user_2.py | e6f387a690 handle is_finished | 1 year ago
main.py | 36b8456798 collect global topology with local peer visibility, ring memory weighted partitioning strategy | 1 year ago
main_dynamic.py | 7077652c8e graceful node shutdown | 1 year ago
requirements.txt | 3a66a0a4a8 add requirements.txt | 1 year ago