Guide

Probing

Probe a variable

To get a stream of the value of the variable named a in the function f, pass the selector "f > a" to probing():

from ptera import probing

def f(x, y):
    a = x * x
    b = y * y
    return a + b

with probing("f > a").values() as values:
    f(12, 5)

assert values == [{"a": 144}]

The function f should be visible in the scope of the call to probing (alternatively, you can provide an explicit environment as the env argument).

Probe the return value

To probe the return value of f, use the selector f() as result (you can name the result however you like):

def f(x, y):
    return x + y

with probing("f() as result").values() as values:
    f(2, 5)

assert values == [{"result": 7}]

Probe multiple variables

Ptera is not limited to probing a single variable in a function: it can probe several at the same time (this is different from passing more than one selector to probing).

When probing multiple variables at the same time, it is important to understand the concept of a focus variable. The focus variable, if present, is the variable that triggers the events in the pipeline when it is assigned to (note that parameters are considered to be “assigned to” at the beginning of the function):

  1. probing("f(x) > y"): The focus is y, so this triggers when y is set. (Probe type: Immediate)

  2. probing("f(y) > x"): The focus is x, so this triggers when x is set. (Probe type: Immediate)

  3. probing("f(x, y)"): There is no focus, so this triggers when f returns. (Probe type: Total. These may be less intuitive; see the section on Total probes, but don’t feel obliged to use them.)

To wit:

def f():
    x = 1
    y = 2
    x = 3
    y = 4
    x = 5
    return x

# Case 1: focus on y
with probing("f(x) > y").values() as values:
    f()

assert values == [
    {"x": 1, "y": 2},
    {"x": 3, "y": 4},
]

# Case 2: focus on x
with probing("f(y) > x").values() as values:
    f()

assert values == [
    {"x": 1},  # y is not set yet, so it is not in this entry
    {"x": 3, "y": 2},
    {"x": 5, "y": 4},
]

# Case 3: no focus
# See the section on total probes
with probing("f(x, y)", raw=True).values() as values:
    f()

assert values[0]["x"].values == [1, 3, 5]
assert values[0]["y"].values == [2, 4]

Note

The selector syntax does not necessarily mirror the syntax of actual function calls. For example, f(x) does not necessarily refer to a parameter of f called x. As shown above, you can put any local variable between the parentheses. You can also probe global/closure variables that are used in the body of f.

Note

The selector f(x, !y) is an alternative syntax for f(x) > y. The exclamation mark denotes the focus variable. There can only be one in a selector.
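
The triggering rule for immediate probes can be sketched in plain Python. The model below does not use Ptera: replay and trace are hypothetical names, and trace is simply the sequence of assignments made by f() in the example above. It emits one entry each time the focus variable is assigned, carrying the latest value of every variable seen so far.

```python
# Plain-Python model of an immediate probe; this is not Ptera's implementation.
def replay(assignments, focus):
    """Emit one entry per assignment to `focus`, with the latest known values."""
    latest = {}
    entries = []
    for name, value in assignments:
        latest[name] = value
        if name == focus:
            # Snapshot the current state; variables not yet set are simply absent.
            entries.append(dict(latest))
    return entries

# The assignments performed by f() in the example above, in order.
trace = [("x", 1), ("y", 2), ("x", 3), ("y", 4), ("x", 5)]

focus_y = replay(trace, focus="y")
focus_x = replay(trace, focus="x")
```

Under this model, focus_y and focus_x match the entries asserted in Case 1 and Case 2 above, including the first Case 2 entry that has no y.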

Probe across scopes

Sometimes you would like to get some context about whatever you are probing, and the context might not be in the same scope: it might be, for example, in the caller. Thankfully, Ptera has you covered.

def outer(n):
    x = 0
    for i in range(n):
        x += inner(i)
    return x

def inner(x):
    a = x * x
    return a + 1

with probing("outer(n) > inner > a").values() as values:
    outer(3)

assert values == [
    {"n": 3, "a": 0},
    {"n": 3, "a": 1},
    {"n": 3, "a": 4},
]

As you can see, this probe gives us the context of what the value of n is in the outer scope, and that context is attached to every entry.

Note

The selector outer > inner > a does not require inner to be called directly within outer. The call can be indirect: for example, if outer calls middle and middle calls inner, the selector will still match. This makes it even more practical, since you can easily capture context quite removed from the focus variable.

Probe sibling calls

Now we’re getting into power features that are a bit more niche, but Ptera goes even beyond probing across caller/callee scopes: it can also attach results from sibling calls!

def main(x):
    return negmul(side(3), side(6))

def side(x):
    return x + 1

def negmul(x, y):
    a = x * y
    return -a

with probing("main(x, side(x as x2), negmul(!a))").values() as values:
    main(12)

assert values == [
    {"x": 12, "x2": 6, "a": 28}
]

Here we use the ! notation to indicate the focus variable, but it is not fundamentally different from doing ... > negmul > a. The probe above gives us, all at once:

  • The value of x in the main function.

  • The latest value of x in side (under a different name, to avoid clashing)

  • The value of the local variable a in negmul

Total probes

A probe that does not have a focus variable is a “total” probe. Total probes function differently:

  • Instead of triggering when a specific focus variable is set, they trigger when the outermost function in the selector ends.

  • Instead of providing the latest values of all the variables, they collect all the values the variables have taken (hence the name “total”).

  • Since the default interface of probing assumes there is only one value for each variable in each entry, total probes will fail if multiple values are captured for the same variable in the same entry, unless you pass raw=True to probing. This will cause Capture instances to be provided instead.

For example, if we remove the focus from the previous example (and add raw=True):

def main(x):
    return negmul(side(3), side(6))

def side(x):
    return x + 1

def negmul(x, y):
    a = x * y
    return -a

with probing("main(x, side(x as x2), negmul(a))", raw=True).values() as values:
    main(12)

assert values[0]["x"].values == [12]
assert values[0]["x2"].values == [3, 6]
assert values[0]["a"].values == [28]

In this example, each call to main will produce exactly one event, because main is the outermost call in the selector. You can observe that x2 is associated to two values, because side was called twice.
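
The difference between “latest value” and “all values” can be modelled in plain Python. This sketch does not use Ptera: collect_total and single_values are hypothetical helpers, and the ValueError is only a stand-in for whatever failure the default interface produces when a variable has several values.

```python
from collections import defaultdict

def collect_total(assignments):
    """Gather every value each variable takes, in assignment order."""
    collected = defaultdict(list)
    for name, value in assignments:
        collected[name].append(value)
    return dict(collected)

def single_values(entry):
    """Hypothetical stand-in for the default interface: one value per variable."""
    flat = {}
    for name, values in entry.items():
        if len(values) != 1:
            raise ValueError(f"{name} has {len(values)} values")
        flat[name] = values[0]
    return flat

# What one call to main(12) assigns across main, side and negmul.
trace = [("x", 12), ("x2", 3), ("x2", 6), ("a", 28)]
entry = collect_total(trace)
```

Here entry holds two values for x2, which is why a one-value-per-variable view cannot represent it and raw access is needed.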

Note

You can in fact create a total probe that has a focus with probing(selector, probe_type="total"). In this case, it will essentially duplicate the data for the outer scopes for each value of the focus variable.

Global probes

The global_probe() function can be used to set up a probe that remains active for the rest of the program. Unlike probing, it is not a context manager.

def f(x):
    a = x * x
    return a

gprb = global_probe("f > a")
gprb.print()

f(4)  # prints 16
f(5)  # prints 25

gprb.deactivate()

f(6)  # prints nothing

Note

Probes can only be activated once, so after calling deactivate you will need to make a new probe if you want to reactivate it.

Note

Reduction operators such as min() or sum() are finalized when the probe exits. With probing, that happens at the end of the with block. With global_probe, that happens either when deactivate is called or when the program exits.

Wrapper probe

Warning

This is a less mature feature, use at your own risk.

A wrapper probe is a probe that has two focuses. On the first focus, it generates an opening event, and on the second focus, it generates a closing event. These events can be fed into a context manager or generator using wrap(), kwrap() (subscribers), or wmap() (operator).

The first focus works as normal and can be specified with !. The second focus is specified with !!. In the example below we compute the elapsed time between a = 1 and b = 2:

import time

def main(x):
    for i in range(1, x + 1):
        a = 1
        time.sleep(i * 0.1)  # sleeps 0.1s, 0.2s, 0.3s
        b = 2

def _timeit():
    t0 = time.time()
    yield
    t1 = time.time()
    return t1 - t0

with probing("main(!a, !!b)") as prb:
    times = prb.wmap(_timeit).accum()
    main(3)

print(times)  # Approximately [0.1, 0.2, 0.3]

The wmap method takes a generator that yields exactly once. It is called when the first focus is triggered (captured values may be passed as keyword arguments). Then it must yield and will be resumed when the second focus is triggered (yield returns the captured data). The return value becomes the next value of the resulting stream.
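
The calling convention described above maps onto Python’s generator protocol. The sketch below is a plain-Python model rather than Ptera’s implementation: drive_once, fake_clock and the closing dictionary are hypothetical, and the clock is faked so the result is deterministic.

```python
def drive_once(handler, closing_data):
    """Run a yield-once generator: start it, resume it once, return its result."""
    gen = handler()
    next(gen)  # first focus: run the handler up to its yield
    try:
        gen.send(closing_data)  # second focus: yield evaluates to the captured data
    except StopIteration as stop:
        return stop.value  # the generator's return value
    raise RuntimeError("handler must yield exactly once")

# Deterministic stand-in for time.time().
_ticks = iter([10.0, 10.25])

def fake_clock():
    return next(_ticks)

def _timeit():
    t0 = fake_clock()
    captured = yield
    t1 = fake_clock()
    return {"elapsed": t1 - t0, "captured": captured}

result = drive_once(_timeit, {"b": 2})
```

The value returned by the generator is recovered from StopIteration, which is how a return statement in a generator surfaces to its caller.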

The wrap and kwrap functions are similar, but they do not return streams. They work like subscribe and ksubscribe, but you can pass either a generator that yields once or an arbitrary context manager.

You can use meta-variables if needed:

  • main(!#enter, !!#exit) can be used to wrap the entire function.

  • main(!#loop_i, !!#endloop_i) can be used to wrap each iteration of the for loop that uses an iteration variable named i.

Note

If prb is a stream that contains multiple wrapper probes and you only want to wrap one of them, you can pass the name of the focus variable of its selector as the first argument to wmap.

Important

Wrapper probes work a little like with statements, but not really: if an error occurs between the two focuses, the wrapper probe will not be informed. The second focus will simply not happen and the generator will not be called back (it will just hang somewhere forever, wasting memory).

There is one safe special case: if you use a selector like f(!#enter, #error, !!#exit), it should always complete because the special meta-variable #exit is always emitted when a function ends, even if there is an error. The error, if there is one, will be offered as #error. You can get that from the dictionary returned by yield in the handler you pass to wmap.
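
The guarantee that #exit is always emitted resembles try/finally. The following plain-Python analogy does not use Ptera; call_wrapped, handler and log are hypothetical names, and the "#error" key only mirrors the meta-variable name used in the selector above.

```python
def call_wrapped(fn, handler, *args):
    """Open the handler before the call and always close it afterwards."""
    gen = handler()
    next(gen)  # stands in for the #enter focus
    closing = {}
    try:
        return fn(*args)
    except Exception as exc:
        closing["#error"] = exc  # offered to the handler, then re-raised
        raise
    finally:
        try:
            gen.send(closing)  # stands in for the #exit focus; runs on both paths
        except StopIteration:
            pass

log = []

def handler():
    log.append("open")
    data = yield
    log.append("closed after error" if "#error" in data else "closed normally")

def ok():
    return 1

def boom():
    raise ValueError("boom")

first = call_wrapped(ok, handler)
try:
    call_wrapped(boom, handler)
except ValueError:
    pass
```

In both calls the handler is resumed, so no generator is left suspended.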

Operations

In all of the previous examples, I have used the .values() method to gather all the results into a list. This is a perfectly fine way to use Ptera and it has the upside of being simple and easy to understand. There are however many other ways to interact with the streams produced by probing.

Printing

Use .print(<format>) or .display() to print each element of the stream on its own line.

def f(x):
    y = 0
    for i in range(1, x + 1):
        y = y + i
    return y

with probing("f > y").print("y = {y}"):
    f(3)

# Prints:
# y = 0
# y = 1
# y = 3
# y = 6

If print is given no arguments it will use plain str() to convert the elements to strings. display() displays dictionaries a bit more nicely.

Subscribe

You can, of course, subscribe arbitrary functions to a probe’s stream. You can do so with:

  1. The >> operator

  2. The subscribe method (passes the dictionary as a positional argument)

  3. The ksubscribe method (passes the dictionary as keyword arguments)

For example:

def f(x):
    y = 0
    for i in range(1, x + 1):
        y = y + x
    return y

with probing("f > y") as prb:
    # 1. The >> operator
    prb >> print

    # 2. The subscribe method
    @prb.subscribe
    def _(data):
        print("subscribe", data)

    # 3. The ksubscribe method
    @prb.ksubscribe
    def _(y):
        print("ksubscribe", y)

    f(3)

# Prints:
# {"y": 0}
# subscribe {"y": 0}
# ksubscribe 0
# ...
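
The only difference between the subscribe and ksubscribe styles is how each entry reaches the function. In plain Python terms (no Ptera involved; positional and keyword are hypothetical handlers):

```python
data = {"y": 0}  # one entry of the stream

def positional(entry):
    return f"subscribe {entry}"

def keyword(y):
    return f"ksubscribe {y}"

as_subscribe = positional(data)   # the dictionary is passed as a single argument
as_ksubscribe = keyword(**data)   # the dictionary is unpacked into keyword arguments
```

With the keyword style, the handler’s parameter names have to match the keys of the entry.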

Map, filter, reduce

Let’s say you have a sequence and you want to print out the maximum absolute value. You can do it like this:

def f():
    y = 1
    y = -7
    y = 3
    y = 6
    y = -2

with probing("f > y") as prb:
    maximum = prb["y"].map(abs).max()
    maximum.print("The maximum is {}")

    f()

# Prints: The maximum is 7

  • The [...] notation indexes each element in the stream. You can use it multiple times to get deep into the structure if you’re probing lists or dictionaries. There is also a .getattr() operator if you want to get deep into arbitrary objects.

  • map maps a function to each element, here the absolute value

  • max reduces the stream using the maximum function

Note

map is different from subscribe. The pipelines are lazy, so map might not execute if there is no subscriber down the pipeline.
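
As a loose analogy, Python’s built-in map is lazy in a similar way: the mapping function does not run until something consumes the result. This sketch uses only built-in iterators, which are pull-based, so it illustrates the idea rather than how Ptera’s streams are implemented; noisy_abs and calls are hypothetical names.

```python
calls = []

def noisy_abs(value):
    calls.append(value)  # record that the mapping function actually ran
    return abs(value)

pipeline = map(noisy_abs, [1, -7, 3, 6, -2])
calls_before = list(calls)  # nothing has consumed the pipeline yet
maximum = max(pipeline)     # the consumer drives the mapping
calls_after = list(calls)
```

Before max consumes the pipeline, calls is empty; afterwards it contains every input.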

If the stream interface is getting in your way and you would rather get the maximum value as an integer that you can manipulate normally, you have two (pretty much equivalent) options:

# With values()
with probing("f > y")["y"].map(abs).max().values() as values:
    f()

assert values == [7]

# With accum()
with probing("f > y") as prb:
    maximum = prb["y"].map(abs).max()
    values = maximum.accum()

    f()

assert values == [7]

That same advice goes for pretty much all the other operators.

Overriding values

Using overridable=True, Ptera’s probes are able to override the values of the variables being probed (unless the probe is total; nonlocal variables are also not overridable). For example:

def f(x):
    hidden = 1
    return x + hidden

assert f(10) == 11

with probing("f > hidden", overridable=True) as prb:
    prb.override(2)

    assert f(10) == 12

The argument to override() can also be a function that takes the current value of the stream. Also see koverride().

Warning

override() only overrides the focus variable. Recall that the focus variable is the one to the right of >, or the one prefixed with !.

This is because a Ptera selector is triggered when the focus variable is set, so realistically it is the only one that it makes sense to override.

Be careful, because it is easy to write misleading code:

# THIS WILL SET y = x + 1, NOT x
OverridableProbe("f(x) > y")["x"].override(lambda x: x + 1)

Note

override will only work at the end of a synchronous pipe (map and filter are fine, but sample, for example, is not).

If the focus variable is the return value of a function (as explained in Probe the return value), override will indeed override that return value.

Note

Operations subscribed to probing(selector, overridable=True) happen before those that are subscribed to probing(selector). If you want a probe to see the values after the override, that probe needs to be the non-overridable type, otherwise it will see the values before the override. You can use both probe types at the same time:

def f():
    return 1

with probing("f() as ret", overridable=True) as oprb:
    with probing("f() as ret") as prb:
        oprb.override(2)

        oprb.print()  # will print {"ret": 1} (because concurrent with override)
        prb.print()   # will print {"ret": 2} (because after override)

        print(f())    # will print 2

Asserts

The fail() method can be used to raise an exception. If you put it after a filter, you can effectively fail when certain conditions occur. This can be a way to beef up a test suite.

def median(xs):
    # Don't copy this because it's incorrect if the length is even
    return xs[len(xs) // 2]

with probing("median > xs") as prb:
    prb.kfilter(lambda xs: len(xs) == 0).fail("List is empty!")
    prb.kfilter(lambda xs: list(sorted(xs)) != xs).fail("List is not sorted!")

    median([])               # Fails immediately
    median([1, 2, 5, 3, 4])  # Also fails

Note the use of the kfilter() operator, which receives the data as keyword arguments. Whenever it returns False, the corresponding datum is omitted from the stream. An alternative to using kfilter here would be to simply write prb["xs"].filter(...).

Conditional breakpoints

Interestingly, you can use probes to set conditional breakpoints. Modifying the previous example:

def median(xs):
    return xs[len(xs) // 2]

with probing("median > xs") as prb:
    prb.kfilter(lambda xs: list(sorted(xs)) != xs).breakpoint()

    median([1, 2, 5, 3, 4])  # Enters breakpoint
    median([1, 2, 3, 4])     # Does not enter breakpoint

Using this code, you can set a breakpoint in median that is triggered only if the input list is not sorted. The breakpoint will occur wherever in the function the focus variable is set, in this case the beginning of the function since the focus variable is a parameter.

Selected operators

Here is a classification of available operators.

Filtering

  • filter(): filter with a function

  • kfilter(): filter with a function (keyword arguments)

  • where(): filter based on keys and simple conditions

  • where_any(): filter based on keys

  • keep(): filter based on keys (+drop the rest)

  • distinct(): only emit distinct elements

  • norepeat(): only emit distinct consecutive elements

  • first(): only emit the first element

  • last(): only emit the last element

  • take(): only emit the first n elements

  • take_last(): only emit the last n elements

  • skip(): suppress the first n elements

  • skip_last(): suppress the last n elements
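
To make the distinctions in this list concrete, here is how some of them would behave on a finite list, using only the standard library. This is a model of the descriptions above, not of the operators themselves, which work on streams; n is fixed at 2.

```python
from itertools import groupby

stream = [1, 1, 2, 1, 3, 3, 2]

# distinct: each value appears once, at its first occurrence.
distinct = list(dict.fromkeys(stream))

# norepeat: only consecutive duplicates are collapsed.
norepeat = [key for key, _ in groupby(stream)]

# take / skip and their *_last variants, with n = 2.
take = stream[:2]
skip = stream[2:]
take_last = stream[-2:]
skip_last = stream[:-2]
```

Note that norepeat keeps the second 1 and the second 2, because they do not directly follow an equal element.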

Mapping

  • map(): map with a function

  • kmap(): map with a function (keyword arguments)

  • augment(): add extra keys using a mapping function

  • getitem(): extract value for a specific key

  • sole(): extract value from dict of length 1

  • as_(): wrap as a dict

Reduction

  • reduce(): reduce with a function

  • scan(): emit a result at each reduction step

  • roll(): reduce using overlapping windows

  • kmerge(): merge all dictionaries in the stream

  • kscan(): incremental version of kmerge

Arithmetic reductions

Most of these reductions can be called with the scan argument set to True to use scan instead of reduce. scan can also be set to an integer, in which case roll is used.
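
The three modes can be compared on a plain list with the standard library. This is a sketch of the semantics only: rolling is a hypothetical helper, and it assumes that windows shorter than n are emitted at the start, which the text above does not specify.

```python
from collections import deque
from functools import reduce
from itertools import accumulate

values = [1, 7, 3, 6, 2]

# reduce: a single result once the whole sequence has been seen.
reduced = reduce(max, values)

# scan: the running result after each element.
scanned = list(accumulate(values, max))

def rolling(items, n, fn):
    """Apply fn to a sliding window holding at most the last n items."""
    window = deque(maxlen=n)
    results = []
    for item in items:
        window.append(item)
        results.append(fn(window))
    return results

# roll: the result over overlapping windows of size 2.
rolled = rolling(values, 2, max)
```

The scan output never decreases for max, while the rolling output can, because old elements drop out of the window.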

Wrapping

  • give.wrap(): give a special key at the beginning and end of a block

  • give.wrap_inherit(): give a special key at the beginning and end of a block

  • give.inherit(): add default key/values for every give() in the block

  • given.wrap(): plug a context manager at the location of a give.wrap

  • given.kwrap(): same as wrap, but pass kwargs

Timing

  • debounce(): suppress events that are too close in time

  • sample(): sample an element every n seconds

  • throttle(): emit at most once every n seconds

Debugging

  • breakpoint(): set a breakpoint whenever data comes in. Use this with filters.

  • tag(): assigns a special word to every entry. Use with breakword.

  • breakword(): set a breakpoint on a specific word set by tag, using the BREAKWORD environment variable.

  • print(): print out the stream.

  • display(): print out the stream (pretty).

  • accum(): accumulate into a list.

  • values(): accumulate into a list (context manager).

  • subscribe(): run a task on every element.

  • ksubscribe(): run a task on every element (keyword arguments).

Miscellaneous

Meta-variables

There are a few meta-variables recognized by Ptera that start with a hash sign:

  • #enter is triggered immediately when entering a function. For example, if you want to set a breakpoint at the start of a function with no arguments you can use probing("f > #enter").breakpoint().

  • #value stands in for the return value of a function. f() as x is sugar for f > #value as x.

  • #error stands for the exception raised by the function, if there is one.

  • #exit is triggered when exiting a function, both on a normal return and when there is an error.

  • #yield is triggered whenever a generator yields.

  • #receive stands for the output of yield.

  • #loop_X and #endloop_X are triggered respectively at the beginning and end of each iteration of a for X in ...: loop (the meta-variables are named after the iteration variable). If there are multiple iteration variables, you can use any of them. There is no way to differentiate loops that have the same iteration variables.

The #enter and #receive meta-variables both bear the @enter tag (meaning that they are points at which execution might enter the function). You can therefore refer to both using the selector $x::@enter. Conversely, #exit and #yield bear the @exit tag. You can leverage this feature to compute e.g. how much time is spent inside a function or generator.

Generic variables

It is possible to indiscriminately capture all variables from a function, or all variables that have a certain “tag”. Simply prefix a variable with $ to indicate it is generic. When doing so, you will need to set raw=True if you want to be able to access the variable names. For example:

def f(a):
    b = a + 1
    c = b + 1
    d = c + 1
    return d

with probing("f > $x", raw=True) as prb:
    prb.print("{x.name} is {x.value}")

    f(10)

# Prints:
# a is 10
# b is 11
# c is 12
# d is 13

Note

$x will also pick up global and nonlocal variables, so if for example you use the sum builtin in the function, you will get an entry for sum in the stream. It will not pick up meta-variables such as #value, however.

Selecting based on tags

This feature admittedly clashes with type annotations, but Ptera recognizes a specific kind of annotation on variables:

def f(a):
    b = a + sum([1])
    c: "@Cool" = b + 1
    d: "@Cool & @Hot" = c + 1
    return d

with probing("f > $x:@Cool", raw=True) as prb:
    prb.print("{x.name} is {x.value}")

    f(10)

# Prints:
# c is 12
# d is 13

In the above code, only variables tagged as @Cool will be instrumented. Multiple tags can be combined using the & operator.
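
Python neither evaluates nor stores annotations on local variables, so the tag strings have no effect at runtime and can only be recovered from the source. The sketch below shows one way such tags could be read with the standard ast module; it is an illustration under that assumption, not a description of how Ptera does it, and tagged_variables is a hypothetical helper.

```python
import ast

SOURCE = '''
def f(a):
    b = a + sum([1])
    c: "@Cool" = b + 1
    d: "@Cool & @Hot" = c + 1
    return d
'''

def tagged_variables(source, tag):
    """Return names whose string annotation contains `tag`, in source order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            annotation = node.annotation
            if isinstance(annotation, ast.Constant) and isinstance(annotation.value, str):
                # Tags are combined with "&", as in "@Cool & @Hot".
                tags = {part.strip() for part in annotation.value.split("&")}
                if tag in tags:
                    found.append(node.target.id)
    return found

cool = tagged_variables(SOURCE, "@Cool")
hot = tagged_variables(SOURCE, "@Hot")
```

The untagged variable b is not selected by either query.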

Probe methods

Probing methods works as one would expect. When using a selector such as self.f > x, it will be interpreted as cls.f(self = <self>) > x so that it only triggers when it is called on this particular self.

Absolute references

Ptera inspects the locals and globals of the frame in which probing is called in order to figure out what to instrument. In addition to this system, there is a second system in which each function corresponds to a unique reference. These references always start with /:

global_probe("/xyz.submodule/Klass/method > x")

# is essentially equivalent to:

from xyz.submodule import Klass
global_probe("Klass.method > x")

The slashes represent a physical nesting rather than object attributes. For example, /module.submodule/x/y means:

  • Go in the file that defines module.submodule

  • Enter def x or class x (it will not work if x is imported from elsewhere)

  • Within that definition, enter def y or class y

The helper function refstring() can be used to get the absolute reference for a function.
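
The structure of an absolute reference can be illustrated with a small parser. split_reference is a hypothetical helper written for this explanation, not part of Ptera; it only separates the module name from the chain of nested definitions.

```python
def split_reference(ref):
    """Split "/module.submodule/x/y" into ("module.submodule", ["x", "y"])."""
    if not ref.startswith("/"):
        raise ValueError("absolute references start with '/'")
    module, *nesting = ref[1:].split("/")
    return module, nesting

method_ref = split_reference("/xyz.submodule/Klass/method")
dotted = split_reference("/module.submodule/func")
slashed = split_reference("/module/submodule/func")
```

The last two results differ: the dotted form names the module module.submodule, while the slashed form looks for a definition called submodule inside module.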

Note

  • Unlike the normal notation, the absolute notation bypasses decorators. /module/function will probe the function inside the def function(): ... in module.py, so it will work even if the function was wrapped by a decorator (unless the decorator does not actually call the function).

  • Use /module.submodule/func, not /module/submodule/func. The former roughly corresponds to from module.submodule import func and the latter to from module import submodule; func = submodule.func, which can be different in Python. It’s a bit odd, but it works that way to properly address Python quirks.