@@ -45,6 +45,17 @@ exo will automatically discover other devices using the best method available. Z
exo provides a ChatGPT-compatible API for running models. It's a one-line change in your application to run models on your own hardware using exo.
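For illustration, a minimal sketch of such a request using only the Python standard library. The port (`8000`), model name, and endpoint path are assumptions following the usual ChatGPT-style API convention, not values confirmed by this document — substitute whatever your exo instance actually exposes:

```python
import json
import urllib.request

# Build a ChatGPT-style chat completion request.
# Model name and port are illustrative assumptions.
payload = {
    "model": "llama-3-8b",
    "messages": [{"role": "user", "content": "What is exo?"}],
}

request = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request against a running exo node:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```

Because the API shape matches ChatGPT's, pointing an existing OpenAI-client application at your exo node is typically just a base-URL change.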
+### Device Equality
+
+Unlike other distributed inference frameworks, exo does not use a master-worker architecture. Instead, exo devices connect peer-to-peer: as long as a device is connected somewhere in the network, it can be used to run models.
+
+exo supports different partitioning strategies to split up a model across devices. The default strategy is [ring memory weighted partitioning](topology/ring_memory_weighted_partitioning.py): inference runs in a ring where each device handles a number of model layers proportional to its memory.
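Memory-weighted partitioning can be sketched as follows. This is an illustrative reimplementation, not exo's actual code; the helper name, device names, and memory figures are made up for the example:

```python
# Assign each device a contiguous slice of layers proportional to its
# memory. The last device absorbs any rounding remainder so every layer
# is covered exactly once.
def partition_layers(device_memory: dict[str, int], num_layers: int) -> dict[str, range]:
    total = sum(device_memory.values())
    partitions = {}
    start = 0
    devices = list(device_memory.items())
    for i, (name, mem) in enumerate(devices):
        if i == len(devices) - 1:
            end = num_layers  # last device takes the remainder
        else:
            end = start + round(num_layers * mem / total)
        partitions[name] = range(start, end)
        start = end
    return partitions

# Example: a 16 GB device and two 8 GB devices splitting a 32-layer model.
print(partition_layers({"mac": 16, "pi_a": 8, "pi_b": 8}, 32))
```

With these example figures, the 16 GB device gets half the layers and each 8 GB device gets a quarter, and the ring visits the devices in order.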
+
+<picture>
+ <img alt="ring topology" src="docs/ring-topology.png" width="50%" height="50%">
+</picture>
+
+
## Installation
The current recommended way to install exo is from source.