pycoral.pipeline
pycoral.pipeline.pipelined_model_runner
The pipeline API allows you to run a segmented model across multiple Edge TPUs.
For more information, see Pipeline a model with multiple Edge TPUs.
class pycoral.pipeline.pipelined_model_runner.PipelinedModelRunner(interpreters)

Manages the model pipeline.
To create an instance:
interpreter_a = tflite.Interpreter(model_path=model_segment_a,
                                   experimental_delegates=delegate_a)
interpreter_a.allocate_tensors()
interpreter_b = tflite.Interpreter(model_path=model_segment_b,
                                   experimental_delegates=delegate_b)
interpreter_b.allocate_tensors()
interpreters = [interpreter_a, interpreter_b]
runner = PipelinedModelRunner(interpreters)
Be sure you first call allocate_tensors() on each interpreter.

Parameters
    interpreters – A list of tf.lite.Interpreter objects, one for each segment in the pipeline.
interpreters()

Returns the list of interpreters used to construct this PipelinedModelRunner.
pop()

Returns a single inference result. This function blocks the calling thread until a result is returned.

Returns
    A list of numpy.array objects representing the model's output tensors. Returns None after push() receives an empty list, indicating there are no more output tensors available.
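The pop() drain pattern can be sketched without Edge TPU hardware. The FakeRunner class below is an illustrative stand-in (an assumption, not the real implementation) backed by queue.Queue, mimicking how pop() blocks until a result is available and returns None once an empty push has been seen:

```python
# Illustrative sketch only: a minimal stand-in for PipelinedModelRunner
# (which requires Edge TPU hardware), showing the pop() drain-loop pattern.
# FakeRunner and its "echo" inference are assumptions for illustration.
import queue
import numpy as np

class FakeRunner:
    """Mimics pop() semantics: returns results until a None sentinel."""
    def __init__(self):
        self._outputs = queue.Queue()

    def push(self, input_tensors):
        if not input_tensors:
            self._outputs.put(None)  # empty push signals end of stream
        else:
            # A real runner would run inference; here we just double the input.
            self._outputs.put([t * 2 for t in input_tensors])
        return True

    def pop(self):
        return self._outputs.get()  # blocks until a result is available

runner = FakeRunner()
runner.push([np.array([1, 2, 3])])
runner.push([])  # no more inputs

results = []
while (result := runner.pop()) is not None:  # drain until None
    results.append(result)
```

The `while ... is not None` loop is the idiomatic consumer: it keeps blocking on pop() until the empty-push sentinel propagates through as None.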
push(input_tensors)

Pushes input tensors to trigger inference.

Pushing an empty list is allowed: it signals that no more inputs will be added (the function returns False for any inputs pushed after this special push). This special push allows the pop() consumer to properly drain unconsumed output tensors.

The caller is blocked if the current input queue size is greater than the maximum set with set_input_queue_size(). By default, the input queue size is unlimited, and calls to push() are non-blocking.

Parameters
    input_tensors – A list of numpy.array objects as the input for the given model, in the appropriate order.

Returns
    True if the push is successful; False otherwise.
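Because push() and pop() are designed to run on different threads, the typical usage is a producer thread pushing inputs (and finally an empty list) while the main thread pops results. The sketch below uses a queue-backed stand-in (FakeRunner and its trivial "+1 inference" are assumptions, since the real runner needs Edge TPUs) to show the threading shape:

```python
# Sketch of the push/pop producer-consumer pattern. FakeRunner is a
# hardware-free stand-in for PipelinedModelRunner, for illustration only.
import queue
import threading
import numpy as np

class FakeRunner:
    def __init__(self):
        self._q = queue.Queue()

    def push(self, input_tensors):
        # Empty list -> None sentinel; otherwise fake "inference" (add 1).
        self._q.put(None if not input_tensors else [t + 1 for t in input_tensors])
        return True

    def pop(self):
        return self._q.get()

runner = FakeRunner()

def producer(frames):
    for frame in frames:
        runner.push([frame])  # one list of input tensors per inference
    runner.push([])           # sentinel: no more inputs will be added

frames = [np.zeros(2), np.ones(2)]
t = threading.Thread(target=producer, args=(frames,))
t.start()

outputs = []
while (out := runner.pop()) is not None:  # consumer drains all results
    outputs.append(out)
t.join()
```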
set_input_queue_size(size)

Sets the maximum number of inputs that may be queued for inference. By default, the input queue size is unlimited.

Note: It's OK to change the maximum queue size while PipelinedModelRunner is active. If the new maximum is smaller than the current queue size, pushes to the queue are blocked until the current queue size drops below the new maximum.

Parameters
    size (int) – The maximum input queue size.
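The blocking behavior of a bounded input queue can be illustrated with queue.Queue(maxsize=N), used here as an analogy for set_input_queue_size(N) (the actual internal data structure is an assumption): once N items are waiting, a further put() blocks, just as push() would.

```python
# Analogy for a bounded input queue: queue.Queue(maxsize=2) stands in for
# set_input_queue_size(2). This is an illustration, not pycoral internals.
import queue

input_queue = queue.Queue(maxsize=2)  # at most 2 pending inputs
input_queue.put("tensor-batch-1")
input_queue.put("tensor-batch-2")     # queue is now full

try:
    # A further put() would block the caller; put_nowait() raises
    # queue.Full instead, which lets us observe the "would block" state.
    input_queue.put_nowait("tensor-batch-3")
    blocked = False
except queue.Full:
    blocked = True
```

In the real runner, the blocked push simply waits until pop()-driven consumption drops the queue below the maximum again.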
set_output_queue_size(size)

Sets the maximum number of outputs that may be unconsumed. By default, the output queue size is unlimited.

Note: It's OK to change the maximum queue size while PipelinedModelRunner is active. If the new maximum is smaller than the current queue size, pushes to the queue are blocked until the current queue size drops below the new maximum.

Parameters
    size (int) – The maximum output queue size.
API version 1.0