pycoral.pipeline
pycoral.pipeline.pipelined_model_runner
The pipeline API allows you to run a segmented model across multiple Edge TPUs.
For more information, see Pipeline a model with multiple Edge TPUs.
class pycoral.pipeline.pipelined_model_runner.PipelinedModelRunner(interpreters)
Manages the model pipeline.
To create an instance:
interpreter_a = tflite.Interpreter(model_path=model_segment_a,
                                   experimental_delegates=delegate_a)
interpreter_a.allocate_tensors()

interpreter_b = tflite.Interpreter(model_path=model_segment_b,
                                   experimental_delegates=delegate_b)
interpreter_b.allocate_tensors()

interpreters = [interpreter_a, interpreter_b]
runner = PipelinedModelRunner(interpreters)
Be sure you first call allocate_tensors() on each interpreter.
Parameters
interpreters – A list of tf.lite.Interpreter objects, one for each segment in the pipeline.
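As a variation on the example above, here is a minimal sketch that builds the interpreter list with pycoral.utils.edgetpu.make_interpreter, binding each segment to its own Edge TPU by device index; the segment file names are hypothetical placeholders.

from pycoral.utils import edgetpu
from pycoral.pipeline.pipelined_model_runner import PipelinedModelRunner

# Hypothetical segment files produced by the Edge TPU Compiler.
segment_paths = ['model_segment_0_edgetpu.tflite',
                 'model_segment_1_edgetpu.tflite']

interpreters = []
for i, path in enumerate(segment_paths):
    # Attach each segment to a different Edge TPU (':0', ':1', ...).
    interpreter = edgetpu.make_interpreter(path, device=':%d' % i)
    interpreter.allocate_tensors()
    interpreters.append(interpreter)

runner = PipelinedModelRunner(interpreters)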
pop()
Returns a single inference result.
This function blocks the calling thread until a result is returned.
Returns
A dictionary with string keys and numpy.array values representing the model's output tensors, where the keys are the tensor names. Returns None when push() receives an empty dict input, indicating there are no more output tensors available.
Raises
RuntimeError – Error while retrieving the pipelined model's inference results.
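For instance, a minimal consumer sketch (assuming runner was created as shown above) keeps calling pop() until it returns None:

def consume(runner):
    while True:
        result = runner.pop()
        if result is None:
            # push({}) was called upstream; the pipeline is drained.
            break
        for name, tensor in result.items():
            print(name, tensor.shape)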
push(input_tensors)
Pushes input tensors to trigger inference.
Pushing an empty dict is allowed: it signals the class that no more inputs will be added (the function returns false if inputs are pushed after this special push). This special push allows the pop() consumer to properly drain any unconsumed output tensors.
The caller is blocked if the current input queue size is greater than the maximum queue size (set with set_input_queue_size()). By default, the input queue size is unlimited, in which case calls to push() are non-blocking.
Parameters
input_tensors – A dictionary with string keys and numpy.array values representing the model's input tensors, where the keys are the tensor names.
Raises
RuntimeError – Error while pushing the pipelined model inference request.
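Putting push() and pop() together, here is a sketch of a producer/consumer pattern, assuming the runner from the constructor example; the input tensor name 'input' and the shape of the dummy frames are assumptions, so check interpreter_a.get_input_details() for the real values.

import threading
import numpy as np

NUM_REQUESTS = 10

def produce(runner):
    for _ in range(NUM_REQUESTS):
        frame = np.zeros((224, 224, 3), dtype=np.uint8)  # placeholder input
        runner.push({'input': frame})
    runner.push({})  # signal that no more inputs will arrive

producer = threading.Thread(target=produce, args=(runner,))
producer.start()
while True:
    result = runner.pop()
    if result is None:
        break
    # ... process the output tensors in `result` ...
producer.join()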
set_input_queue_size(size)
Sets the maximum number of inputs that may be queued for inference.
By default, the input queue size is unlimited.
Note: It's OK to change the maximum queue size while PipelinedModelRunner is active. If the new maximum is smaller than the current queue size, pushes to the queue are blocked until the current queue size drops below the new maximum.
Parameters
size (int) – The maximum input queue size.
set_output_queue_size(size)
Sets the maximum number of outputs that may be unconsumed.
By default, the output queue size is unlimited.
Note: It's OK to change the maximum queue size while PipelinedModelRunner is active. If the new maximum is smaller than the current queue size, pushes to the queue are blocked until the current queue size drops below the new maximum.
Parameters
size (int) – The maximum output queue size.
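For example, a minimal sketch that bounds both queues to cap memory use when the producer outruns the consumer (the sizes are arbitrary illustrations):

runner.set_input_queue_size(4)
runner.set_output_queue_size(4)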
API version 2.0