Modeling plate reading capabilities

starting another post on capabilities, this one about plate reading. see also: modeling arm capabilities and big architecture discussion on capabilities.

I don’t really have strong thoughts or even detailed proposals on this right now, just raising a couple of points for discussion. This thread is meant for discussing how we model plate reading capabilities in PLR v1. @vcjdeboer and @CamilloMoschner talked about this in different posts/chats, some of which I have summarized below.

Splitting classes

it is obvious we need to split the current plate reader into at least 3 classes:

  • absorbance (spectrophotometer)
  • fluorescence (fluorometer)
  • luminescence (luminometer)

since we already have plate readers in PLR that only support one modality and not all 3 (the Byonoys). This part seems pretty obvious to me, but happy to hear people’s thoughts.

Return types

another point of discussion for this thread is return types. These are difficult to change over time, so we have to design a pattern that will work for any machine and is unlikely to change; if it does change, it should be easy to emit deprecation warnings. This makes me want to use dataclasses, because the attributes can raise warnings if needed in the future.

As Camillo pointed out, it is nice to return the readings keyed by well name rather than having this encoded in the position of a 2D array (add byonoy luminescence and absorbance plate readers by rickwierenga · Pull Request #617 · PyLabRobot/pylabrobot · GitHub). I would like to adopt that suggestion.
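A minimal sketch of what such a dataclass return type could look like. All names here are hypothetical, not existing PLR API; the `data` property just illustrates how a dataclass attribute can be deprecated later:

```python
import warnings
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class AbsorbanceReading:
    """One absorbance read, keyed by well name instead of array position."""
    values: Dict[str, float]  # well name -> OD, e.g. {"A1": 0.123}
    wavelength: float         # nm

    @property
    def data(self) -> Dict[str, float]:
        # If an attribute is ever renamed, the old name can stay behind
        # as a property that emits a DeprecationWarning.
        warnings.warn("'data' is deprecated; use 'values'", DeprecationWarning)
        return self.values

reading = AbsorbanceReading(values={"A1": 0.123, "A2": 0.456}, wavelength=600.0)
od_a1 = reading.values["A1"]  # look up by well name, no index arithmetic
```

Keying by well name also means consumers never have to know the plate's row/column layout to find a well.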

Kinetics / time series

I think we should split single-point reads vs time series into different methods, rather than overloading one “read_absorbance” method to do everything, as is currently the case.

Some plate readers support kinetics at the firmware level, but we could consider just doing that on the front end in PLR so that every imager supports it. Is there a benefit to having it exist at the backend level? For example, thermocyclers do benefit from entire protocols being “uploaded” at once because they do some planning; theoretically you can imagine just calling “set temp” many times, but that does not work as well. I am not sure where plate readers fall on this.

Another point of discussion here is streaming: it would be cool to have callbacks or generators or something to stream data as it comes in.

Plates vs arbitrary positions

I would like to turn @hazlamshamin’s amazing work on turning plate readers into universal imagers (and @CamilloMoschner’s implementation of that on the clariostar) into something that has first class support in PLR.

On the front end, we should obviously still have methods for reading plates since that’s something people often do and we can provide some utilities for it.

On the backend, we should have the smallest set of atomic commands that are easy to implement, to make it easy to add new machines. I am thinking just read_positions will be enough.

[Many plate readers still work on the “plate abstraction”, so on the backend the implementation is necessarily “hacky”, but I think it makes sense to design a nice abstraction in PLR and have backends translate that to a specific machine, rather than adapting the PLR standard to firmware APIs. Balancing those is kind of an art of course, but this is the general principle]

What I am loosely thinking, without really having looked at implementing:

front end:

  • read_plate: read wells
  • read_positions: read coordinates wrt the plate’s lfb (left front bottom)
  • read_grid(size, resolution): read a space at a given resolution

backend:

  • read_positions: read at specific coordinates
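The split above could be sketched roughly like this. All class and method names here are hypothetical, and coordinates are simplified to plain (x, y) tuples for brevity; the point is that the convenience methods all funnel into the single atomic backend command:

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]  # simplification: (x, y) wrt the plate's left front bottom

class PlateReaderBackend(ABC):
    """The smallest atomic command set a new machine would implement."""

    @abstractmethod
    async def read_positions(self, positions: List[Coordinate]) -> List[float]:
        """Read at specific coordinates; everything else builds on this."""

class PlateReaderFrontend:
    """Convenience methods, all funneling into the single backend command."""

    def __init__(self, backend: PlateReaderBackend):
        self.backend = backend

    async def read_plate(self, wells: Dict[str, Coordinate]) -> Dict[str, float]:
        # read_plate is just read_positions over the plate's well coordinates
        names = list(wells)
        values = await self.backend.read_positions([wells[n] for n in names])
        return dict(zip(names, values))  # keyed by well name, per the return-types point

    async def read_grid(self, size: Coordinate, resolution: float) -> List[float]:
        # a grid is also just a set of positions
        xs = [i * resolution for i in range(int(size[0] / resolution))]
        ys = [j * resolution for j in range(int(size[1] / resolution))]
        return await self.backend.read_positions([(x, y) for x in xs for y in ys])
```

A new machine then only has to implement `read_positions` to get plate and grid reading for free.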

Great topic. A few thoughts on kinematics and streaming.

Firmware kinetics matters for fast reads.

Calcium flux, ion channel, and fast enzyme assays need millisecond-resolution reads on a single well right after injection. The firmware parks on one well and runs a tight internal loop capturing hundreds of reads per second. A software loop cannot do that, I think. Same principle as thermocyclers uploading a full protocol, indeed.

For slow plate-mode kinetics, a software loop is fine. But the backend should indeed expose firmware kinetics where available.

On streaming, a signal-based pattern could work nicely.

related to my comment in the plr architecture thread: 3. Signals — alongside Protocols

The idea: a driver declares what it can emit as typed signals, and consumers subscribe before the run starts. Data flows through a buffered queue so nothing is lost.

Here’s a worked-out example. A driver declares its signals as class attributes:

  class SynergyH1Driver(CanReadAbsorbance):
      absorbance = SignalR(unit="OD", dtype="array")
      temperature = SignalR(unit="°C", dtype="number")

      async def start_kinetic(self, n_cycles, interval, wavelength):
          await self._send("start_kinetic", n_cycles, interval, wavelength)
          # Firmware runs the loop — each cycle, data arrives over the wire
          asyncio.create_task(self._consume_cycles(n_cycles))

      async def _consume_cycles(self, n_cycles):
          for i in range(n_cycles):
              raw = await self._read_cycle_result()
              self.absorbance.emit(raw)
              self.temperature.emit(await self._read_temp())

The capability opens a stream before telling the firmware to start — subscribe first, then go:

  class AbsorbanceCapability:
      async def read_kinetic(self, plate, n_cycles, interval, wavelength):
          positions = self._plate_to_positions(plate)

          # Open the gate BEFORE firmware starts — nothing missed
          stream = self.backend.absorbance.stream()
          await self.backend.start_kinetic(n_cycles, interval, wavelength)

          # Each cycle arrives as it's measured
          async for data in stream:
              yield data

User code:

  async for reading in reader.absorbance.read_kinetic(
          plate, n_cycles=10, interval=60, wavelength=600):
      print(f"Mean OD: {reading.mean():.3f}")
      # Arrives every 60s — not batched at the end

The SignalR is simple — just a typed channel with a subscriber queue:

  class SignalR:
      def __init__(self, unit, dtype="number"):
          self.unit = unit
          self.dtype = dtype
          self._subscribers = []

      def emit(self, value):
          for queue in self._subscribers:
              queue.put_nowait(value)

      async def stream(self):
          queue = asyncio.Queue()
          self._subscribers.append(queue)
          try:
              while True:
                  value = await queue.get()
                  yield value
          finally:
              self._subscribers.remove(queue)

This same pattern works across dimensions, the signal doesn’t care what’s advancing. Time series (each cycle emits one plate array), spatial scanning (each position emits one measurement), or wavelength sweep (each λ emits one plate array).
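To make that concrete, here is a runnable sketch of a wavelength sweep driven through the SignalR pattern. The driver coroutine and the None sentinel are hypothetical stand-ins for hardware events; a real driver would emit as data arrives over the wire:

```python
import asyncio

class SignalR:
    # same shape as the SignalR class above: typed channel with subscriber queues
    def __init__(self, unit, dtype="number"):
        self.unit, self.dtype = unit, dtype
        self._subscribers = []

    def emit(self, value):
        for q in self._subscribers:
            q.put_nowait(value)

    async def stream(self):
        q = asyncio.Queue()
        self._subscribers.append(q)
        try:
            while True:
                yield await q.get()
        finally:
            self._subscribers.remove(q)

async def wavelength_sweep():
    absorbance = SignalR(unit="OD", dtype="array")

    async def driver():
        # hypothetical sweep: each wavelength emits one reading, then a sentinel
        for wl in (450, 540, 600):
            absorbance.emit({"wavelength": wl, "A1": 0.1})
        absorbance.emit(None)  # sentinel: sweep done

    stream = absorbance.stream()   # subscribe BEFORE starting the driver
    asyncio.create_task(driver())  # driver only runs once the consumer awaits
    readings = []
    async for value in stream:
        if value is None:
            break
        readings.append(value)
    return readings

readings = asyncio.run(wavelength_sweep())
```

One subtlety: the subscriber queue is only registered when iteration begins, so consumers must start iterating (or the driver must be deferred, as with the task here) before the first emit, or early values are lost.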

This signal pattern is borrowed from bluesky/ophyd, the hardware abstraction layer used at NSLS-II synchrotron beamlines. Their SignalR / SignalRW / SignalX hierarchy solves the same problem: typed, subscribable channels between hardware drivers and experiment orchestration. We’d be adapting it for lab instruments instead of beamline components. In ophyd-async these are primarily used for continuously updating sensors, but the same subscribe-and-iterate mechanism likely works for discrete measurement sequences like kinetic reads; the signal just emits one value per cycle instead of continuously.


Splitting classes

yes this looks good. The split also future-proofs things: modes like TRF, fluorescence polarization, or AlphaScreen would each become their own capability without touching the existing ones.

Return types

+1 on dataclasses for return types, and keyed by well name rather than array position.

One thought on return types (but it might be off topic): if we know the shape and dtype of the result before the read happens, we could have the capability declare a descriptor upfront. The descriptor is the schema of the data about to be produced (shape, dtype, units), declared once before any measurement starts.

  # Endpoint read of a 96-well plate
  {"absorbance": {"dtype": "float", "shape": [8, 12], "unit": "OD"}}

  # Kinetic run — 10 cycles of a 96-well plate
  {"absorbance": {"dtype": "float", "shape": [10, 8, 12], "unit": "OD"}}

  # Spectrum scan — 96-well plate across 24 wavelengths
  {"absorbance": {"dtype": "float", "shape": [24, 8, 12], "unit": "OD"}}

The capability can build this from the plate geometry and the measurement parameters before the read starts. The actual data returned conforms to this schema.
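As a sketch of that, a hypothetical helper could derive the descriptor from the plate geometry and the measurement parameters (the function name and signature are illustrative, not proposed API):

```python
from typing import Dict, List, Optional

def build_descriptor(
    n_rows: int,
    n_cols: int,
    n_cycles: Optional[int] = None,
    wavelengths: Optional[List[int]] = None,
) -> Dict[str, dict]:
    """Derive the data schema before any measurement is produced."""
    shape = [n_rows, n_cols]
    if wavelengths is not None:
        shape = [len(wavelengths)] + shape  # spectrum scan adds a leading axis
    if n_cycles is not None:
        shape = [n_cycles] + shape          # kinetic run adds a leading axis
    return {"absorbance": {"dtype": "float", "shape": shape, "unit": "OD"}}

# Endpoint read of a 96-well plate
endpoint = build_descriptor(8, 12)
# Kinetic run: 10 cycles of a 96-well plate
kinetic = build_descriptor(8, 12, n_cycles=10)
# Spectrum scan: 96-well plate across 24 wavelengths
spectrum = build_descriptor(8, 12, wavelengths=list(range(400, 640, 10)))
```

The returned dicts match the three example descriptors above, so consumers can allocate storage or set up plots from the schema alone.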

Why a descriptor matters:

  • Stable return types: the schema is the contract. If it changes, you know exactly what changed — a shape dimension, a unit, a dtype. Deprecation warnings can target specific fields.
  • Consumers can prepare: storage backends allocate arrays, live plots set up axes, remote clients know what’s coming — all before the first measurement.
  • Kinetic runs declare once: a 100-cycle kinetic run has one descriptor, not 100. Each cycle’s data conforms to the declared shape.
  • Self-documenting: you know what the dimensions mean (rows, columns, time, wavelength) without inspecting the data.

This descriptor pattern is also borrowed from bluesky’s event descriptors, where instruments declare their data schema upfront before producing measurements.

Kinetics / time series

see Modeling plate reading capabilities - #2 by vcjdeboer

Plates vs arbitrary positions

+1 on read_positions as the single backend method, all three frontend methods collapsing into one atomic backend call is clean.

Also a side note: if read_positions receives dataclasses (name + coordinate) rather than live PLR objects, the backend boundary becomes serializable. If we applied that principle across all capabilities — liquid handling, arms, shaking — the entire driver interface would be serializable by design. That would make @koeng protobuf/networking work significantly easier (no custom converters), and it also makes drivers easier to test (just pass in data, no need to construct a full resource tree) and easier to write for new contributors (the driver contract is explicit about what data it needs).
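A minimal sketch of what that serializable boundary could look like. The dataclass name, fields, and coordinate values are made up for illustration:

```python
import json
from dataclasses import dataclass, asdict
from typing import List

@dataclass(frozen=True)
class ReadPosition:
    """Plain data at the driver boundary instead of a live PLR resource object."""
    name: str  # e.g. "A1"
    x: float   # mm, wrt the plate's left front bottom
    y: float
    z: float

positions: List[ReadPosition] = [
    ReadPosition("A1", 14.38, 11.24, 10.0),  # hypothetical coordinates
    ReadPosition("A2", 23.38, 11.24, 10.0),
]

# Because the boundary is plain data, it serializes with no custom converters,
# which is exactly what a protobuf/networking layer wants.
wire = json.dumps([asdict(p) for p in positions])
roundtrip = [ReadPosition(**d) for d in json.loads(wire)]
```

Tests can then construct these directly, with no resource tree needed.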

yes it will be a python type for now as we implement this first in PLR. but over time, yes any language can describe it

yes 100%

will respond in that thread

thanks, good to know

Adding some nuance to my earlier point about firmware vs software kinetics. I said “a software loop cannot do millisecond reads”; that’s true for request-response over USB serial, where round-trip latency (the USB frame interval is already 1 ms on full-speed FTDI, plus OS scheduling jitter) makes 1000 Hz polling unrealistic. But if a plate reader’s firmware supports a streaming mode (one start command, then continuous data push while parked on a single well), Python can receive and process all the data. The per-reading timing precision still comes from the reader’s firmware though, not from Python’s receive clock, since FTDI drivers deliver data in buffered chunks.

So for plate reader kinetics specifically, the architecture should probably support three modes:

  1. Firmware-timed, firmware-stored: the reader runs the full kinetic protocol internally, software fetches the results block afterward

  2. Firmware-timed, software-streamed: the reader pushes per-cycle or per-well data live as it measures, Python receives via something like SignalR (I don’t know if readers like the Synergy H1 or CLARIOstar actually expose this, but the architecture should handle it)

  3. Software-timed: Python sends a read command each cycle, fine for plate-mode kinetics where the mechanical travel between wells is the limiting factor anyway, but not for fast single-well millisecond reads

The SignalR pattern I sketched fits mode 2 well, though emitted values should carry reader-side timestamps rather than relying on when Python processes them.
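One way to model the three modes, purely as a sketch with hypothetical names, is to let each backend declare which timing modes its firmware supports, so the capability layer can pick the best available one:

```python
from enum import Enum, auto

class KineticTiming(Enum):
    # Hypothetical names for the three modes discussed above.
    FIRMWARE_STORED = auto()    # reader runs the full protocol, results fetched after
    FIRMWARE_STREAMED = auto()  # reader pushes per-cycle data live as it measures
    SOFTWARE_TIMED = auto()     # Python sends a read command each cycle

class ExampleReaderBackend:
    # A backend declares what its firmware can do; this example supports
    # stored and software-timed kinetics but not live streaming.
    supported_timings = {KineticTiming.FIRMWARE_STORED, KineticTiming.SOFTWARE_TIMED}

def pick_timing(backend) -> KineticTiming:
    """Prefer streaming, then stored, then software timing as a fallback."""
    for preferred in (KineticTiming.FIRMWARE_STREAMED,
                      KineticTiming.FIRMWARE_STORED,
                      KineticTiming.SOFTWARE_TIMED):
        if preferred in backend.supported_timings:
            return preferred
    raise ValueError("backend supports no kinetic timing mode")
```

Since software timing is always implementable, every backend would support at least that mode.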

in terms of modeling the commands it is clear we need a separate kinetics command.

100%


I think this SignalR pattern is interesting. Queues are a conventional way of doing streaming. In terms of our API, the primary thing we still want is for the read methods to return data directly:

data = await pr.read()

where read just returns data at the end. This is nice and simple. Some people will want streaming, but it should not be something we require by default.

For streaming, I think we will need to have a separate method (which can also be the backend method) so typing is nice:

async for data in pr.stream():
  ...

the backend will only need to do streaming, as it is easy to convert a stream into a final object on the front end (just consume the stream until it ends)


On splitting the classes, I have no objection, and I think separating them would make it more obvious how to add more microplate reader instruments. One question: how are we going to handle a machine that supports many classes? Are we going to completely separate the backends, or will this be under capabilities for PLR v2, i.e. machine.frontend_attribute.capability, e.g. tecan_infinite.absorbance_reader.read_absorbance?

On the time series, I agree this should be supported natively in PLR since it is a core function for many wet labs, so I believe it should be in standard.py. I partially agree that we could just handle the time series on the frontend; however, using your example with the thermocycler, certain microplate readers might also do planning for the whole time series, mainly calibration. Additionally, we need to look carefully at how long each device backend warms up before it actually does the reading. It needs to be efficient and swift enough if we want to do time series via the frontend. Many OEM UI errors during time series come from the time needed to read the defined wells (warm-up, if any, plus the actual reading) taking much longer than the defined time interval. For more transparency, I think it is better to return both the frontend and the backend read time, so developers can use these data however they want for analysis, with less pain than standardizing strictly.
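A hedged sketch of a return record carrying both read times (all names hypothetical, not proposed API):

```python
import time
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class TimedReading:
    """One kinetic cycle, carrying both the machine- and host-side timestamps."""
    values: Dict[str, float]  # well name -> OD
    machine_time: float       # seconds, as reported by the reader's firmware
    client_time: float        # host clock when the data arrived in PLR

reading = TimedReading(
    values={"A1": 0.12},
    machine_time=60.0,     # hypothetical: firmware says this cycle ran at t=60 s
    client_time=time.time(),
)
```

Downstream analysis can then compare the two clocks itself, rather than PLR deciding which one is authoritative.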

Streaming is feasible too I think, depending on the backend (the Tecan Infinite allows it, but currently we pool the reads for multiple wells; if streaming is enabled, this could be an argument, like stream=True, for reading multiple wells too, instead of waiting until the end before getting the reads for all wells, though streaming for one endpoint is less necessary than during many timepoints). One challenge I anticipate is making sure the machine stays alive, especially when the time interval is hour(s), as well as error handling and providing a fallback for the data when the connection abruptly closes mid-experiment.

Additionally, for time series we usually do shaking and temperature control using the machine. How are we going to model these? Will the microplate reader (any of the 3 classes) also be a HeaterShaker or an Incubator? Or will it not, since canonically, in labs, we don’t call them that despite them having these capabilities? To think of it another way, one may use a microplate reader that can shake/incubate as an incubator, but I think this is more like repurposing than its primary purpose. So I think having temperature control and shaking as capabilities under the frontend attribute suffices, instead of also modeling them under less relevant classes.

Lastly, on the arbitrary position reading, thank you for looking more into this, and I agree with the suggested frontend/backend functions. In fact I love this :)

second option, different attributes

It will be possible for what are currently backends (which will become drivers) to implement multiple “capability backends”, but these can also be separate objects that then point to the driver.

See this post for more info: Updating PLR API for machine interfaces discussion - #62 by rickwierenga

whether something exists in standard.py is just a question of whether we want backends to support it. In this case you and @vcjdeboer have made convincing arguments that time series are an atomic command, not just a loop on the client side.

why is frontend/client read time important? assuming backend here means machine reported time for each measurement.

I was thinking two separate methods for stream/read. The backend only does streaming, so:

class AbsorbanceReaderBackend:
  async def stream_absorbance(self) -> AsyncGenerator[dict, None]: ...

and on the front end:

class AbsorbanceReader:
  async def stream_absorbance(self) -> AsyncGenerator[dict, None]: ...
  async def read_absorbance(self) -> List[dict]:
    all_data = []
    async for data in self.stream_absorbance():
      all_data.append(data)
    return all_data

Changing the return type based on a parameter is ugly.

Mostly a topic for the main architecture thread, but will answer briefly here. The idea is the Device (object user creates) will orchestrate different capabilities, so

class TecanInfinite:
  def __init__(self, ...):
    self.driver = TecanInfiniteDriver(...)
    self.absorbance_reader = AbsorbanceReader(backend=InfiniteAbsorbance(driver=self.driver))
    self.temperature_controller = TemperatureController(backend=InfiniteTemperatureController(driver=self.driver))
    self.shaker = Shaker(backend=InfiniteShaker(driver=self.driver))

The capabilities are universal, and mimic the existing front/backend pattern. In this specific example I assume we have different InfiniteAbsorbance, InfiniteTemperatureController, InfiniteShaker objects (I assume we want this for big backends), but we can also have TecanInfiniteDriver implement AbsorbanceReaderBackend etc. directly.