Alex Cheema | b01f69bb6b | add support for multiple concurrent requests with request ids | 1 year ago
Alex Cheema | 445eda156c | dynamically assign shards to nodes deterministically weighted by memory | 1 year ago
Alex Cheema | 36b8456798 | collect global topology with local peer visibility, ring memory weighted partitioning strategy | 1 year ago
Alex Cheema | 6c8c9ee7b1 | topology with partitioning strategy | 1 year ago
Alex Cheema | 563dcb56b0 | mlx sharded implementation with example of distributed inference | 1 year ago
Alex Cheema | a21f59ff45 | scaffolding for networking, inference and orchestration | 1 year ago