pycoral.pipeline

pycoral.pipeline.pipelined_model_runner

The pipeline API allows you to run a segmented model across multiple Edge TPUs.

For more information, see Pipeline a model with multiple Edge TPUs.

class pycoral.pipeline.pipelined_model_runner.PipelinedModelRunner(interpreters)

Manages the model pipeline.

To create an instance:

import tflite_runtime.interpreter as tflite
from pycoral.pipeline.pipelined_model_runner import PipelinedModelRunner

# Each model segment runs on its own Edge TPU, so each interpreter
# gets its own Edge TPU delegate (experimental_delegates takes a list).
interpreter_a = tflite.Interpreter(model_path=model_segment_a,
                                   experimental_delegates=[delegate_a])
interpreter_a.allocate_tensors()
interpreter_b = tflite.Interpreter(model_path=model_segment_b,
                                   experimental_delegates=[delegate_b])
interpreter_b.allocate_tensors()
interpreters = [interpreter_a, interpreter_b]
runner = PipelinedModelRunner(interpreters)

Be sure you first call allocate_tensors() on each interpreter.
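The delegate_a and delegate_b values above are Edge TPU delegates, one per device. As a minimal sketch of how they might be created (assuming tflite_runtime on Linux, where the Edge TPU delegate library is libedgetpu.so.1; adjust the library name for your platform):

import tflite_runtime.interpreter as tflite

# One delegate per Edge TPU; ':0' and ':1' select the first and second
# Edge TPUs enumerated on the system.
delegate_a = tflite.load_delegate('libedgetpu.so.1', {'device': ':0'})
delegate_b = tflite.load_delegate('libedgetpu.so.1', {'device': ':1'})

Alternatively, pycoral.utils.edgetpu.make_interpreter() accepts a device argument and returns an interpreter with the Edge TPU delegate already attached.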

Parameters

interpreters – A list of tf.lite.Interpreter objects, one for each segment in the pipeline.

interpreters()

Returns the list of interpreters used to construct this PipelinedModelRunner.

pop()

Returns a single inference result.

This function blocks the calling thread until a result is returned.

Returns

A list of numpy.array objects representing the model’s output tensors. Returns None after push() receives an empty list, indicating there are no more output tensors available.
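For example, a consumer loop might call pop() until it returns None (a sketch; handle_result is a hypothetical post-processing function):

def consume(runner):
    while True:
        result = runner.pop()
        if result is None:
            break
        # result is a list of numpy.array output tensors, one per
        # output of the final model segment.
        handle_result(result)  # hypothetical post-processing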

push(input_tensors)

Pushes input tensors to trigger inference.

Pushing an empty list is allowed; this signals that no more inputs will be added (push() returns False for any inputs pushed after this special push). This special push allows the pop() consumer to properly drain unconsumed output tensors.

The caller is blocked if the current input queue size exceeds the maximum set with set_input_queue_size(). By default, the input queue size is unlimited, in which case push() is non-blocking.

Parameters

input_tensors – A list of numpy.array objects as input for the given model, in the order expected by the model.

Returns

True if push is successful; False otherwise.
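Paired with the pop() loop above, a producer might push all inputs from a separate thread and finish with the empty-list sentinel (a sketch; batches is an assumed iterable of input-tensor lists):

import threading

def produce(runner, batches):
    for input_tensors in batches:
        if not runner.push(input_tensors):
            break  # push() failed, e.g. inputs pushed after the sentinel
    runner.push([])  # sentinel: no more inputs; lets pop() drain and return None

threading.Thread(target=produce, args=(runner, batches)).start()

Running the producer and consumer on separate threads lets new inputs queue while earlier results are still being consumed.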

set_input_queue_size(size)

Sets the maximum number of inputs that may be queued for inference.

By default, input queue size is unlimited.

Note: You may change the maximum queue size while the PipelinedModelRunner is active. If the new maximum is smaller than the current queue size, pushes to the queue are blocked until the queue size drops below the new maximum.

Parameters

size (int) – The maximum input queue size.

set_output_queue_size(size)

Sets the maximum number of outputs that may be unconsumed.

By default, output queue size is unlimited.

Note: You may change the maximum queue size while the PipelinedModelRunner is active. If the new maximum is smaller than the current queue size, pushes to the queue are blocked until the queue size drops below the new maximum.

Parameters

size (int) – The maximum output queue size.
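For example, to bound memory use on both ends of the pipeline (a sketch; the sizes are arbitrary):

# push() blocks once more than 8 inputs are waiting, and the pipeline
# stalls once more than 8 results sit unconsumed by pop().
runner.set_input_queue_size(8)
runner.set_output_queue_size(8)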