PLR Protocol Library?

Good morning!
I was wondering if there is space for an example protocol library within PLR. I am working with multiple groups on various robots, and I have built, and will continue to build, protocols. Is there any existing SOP for adding general protocols to the project, and is this a wanted feature?

I saw this feature mentioned in this thread, as well as in several others.

I know companies such as OT (Opentrons) have their own libraries, but PLR does not currently have this feature. OT also includes features like a pre-download config menu that customizes the variables in the script before downloading.

As someone who has just started with lab automation, I think it would be extremely helpful to see verified examples of the workflow and capabilities of PLR.

Some questions:

  • Is there space for this within the existing architecture/does this already exist?
  • Which protocols would be the most useful to get operational initially?
  • Are there set standards for what should be part of each script?
    - Specifically, what should each script include beyond the basic machine instructions? Many scripts require more functionality to become useful.
    - Ex: I have a turbidostat script with PLR plate placements & pipetting instructions, but it also has database/graphing and data simulation modules, which are necessary for the script to run.
  • If there are no SOPs, what should be included?
2 Likes

this is definitely something that we have been wanting to do! (see related thread from 2 days ago: Example codes for Hamilton robot experiments).

While it doesn’t already exist, I think we should have an extra header on the docs website next to “user guide” that is “protocol library”.

We have to discuss what the goal of this PL will be:

  • example protocols users can use as inspiration
  • protocols with the intent of being downloadable and instantly usable in the lab

my vision has always been the latter, enabled by the universal API PLR provides. Examples belong in the user guide IMO. But it’s open for discussion.

Whatever people need & want to contribute!

I think this is highly dependent on the specific protocol. For some protocols like sample prep, maybe we can have a quick QC at the end. For more complex ones like turbidostat, some graphs over time would be nice. I think in general:

  • overview of supported/tested hardware
  • logging
  • example deck layout

are nice to have.

But none of these are strict requirements. Everything in PLR is user-contributed, so it comes down to how much time and effort contributors want to put in. I just ensure there is a basic level of quality in each contribution; in terms of breadth it’s entirely up to the contributor.

if you want to quickstart this, why don’t you put up the turbidostat script?

1 Like

Thank you for the feedback, I will make a PR with the turbidostat script. I have a version as an ipynb, as well as a scrubbed .py version that serves as a basis for how to use PLR. I will add each to the folder layout that you described.

Thank you!!

2 Likes

sweet!

i think ipynbs are actually the best format for protocols, so if you wanna keep it in that format that’s definitely fine! (we usually have ipynbs for protocols and small py files for shared stuff like workcell definitions.)
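
e.g. a shared file could look something like this (a minimal sketch; the import paths follow PLR’s docs as i know them, double-check against your installed version):

    # workcell.py: shared workcell definition imported by the protocol notebooks.
    # Import paths follow PLR's docs; verify against your installed version.
    from pylabrobot.liquid_handling import LiquidHandler
    from pylabrobot.resources.hamilton import STARLetDeck

    def build_liquid_handler(backend) -> LiquidHandler:
        # The backend is injected so a notebook can swap real hardware for a simulator.
        return LiquidHandler(backend=backend, deck=STARLetDeck())

each protocol notebook then just does `from workcell import build_liquid_handler` and passes in whatever backend it targets.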

lmk how i can help! happy to (have codex) put a header on the website & do boilerplate stuff etc

1 Like

@aidan-baydush, that is a wonderful idea! Thank you for pushing this forwards; we only wanted to make this new docs section once we had complete protocols to place in there, and you’ve accelerated PLR’s timeline :slight_smile:

Regarding organisation of the PLR Protocol Library:
We will undoubtedly have to iterate through multiple versions to achieve the most intuitive and productive page architecture.
To start with, I propose…

  1. …we start organising the PLR Protocol Library by protocol function:
    i.e.

    • automated Protocol (aP) for normalisation
    • aP for ELISA
    • aP for DNA Prep
    • aP for automated growth assay preparation (and automated measurements?)

    Question: What function classification would you assign to your turbidostat notebook?

  2. …have tags for all the machines used in an aP

  3. …try to have minimal non-PLR-covered library dependencies, AND if Python libraries outside the PLR-installed ones are required, clearly list these extra libraries at the top of the Notebook file (including each library’s version).
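
For example, a first Notebook cell along these lines (the listed libraries are purely illustrative):

    # First cell: declare and verify the extra (non-PLR) dependencies of this notebook.
    # Illustrative extras: pandas (data logging), matplotlib (OD-over-time plots).
    import importlib.metadata as md

    for pkg in ("pandas", "matplotlib"):
        print(f"{pkg}=={md.version(pkg)}")  # raises PackageNotFoundError if missing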

I am sure we will uncover more requirements along the way of community-driven Protocol Library development, and can write an SOP for contributing new aPs in the next couple of months :slight_smile:

2 Likes

I added the notebook in its current configuration in a PR to main; please let me know if I did this correctly. I will work on cleaning it up on its branch, I just wanted to get the ball rolling with this example.

2 Likes

@aidan-baydush, I’m still a bit confused:

What is the aim of this automated protocol (aP) and how would you classify it functionally?

Is this aP functional on the machine?

or better, a requirements.txt in the folder for a certain protocol
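
e.g. something like this (package list and version pins purely illustrative):

    # requirements.txt (illustrative pins)
    pylabrobot
    pandas==2.2.*
    matplotlib==3.8.*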

to expand on what we said above: there is currently no SOP. To just theorize about something without actually using it is difficult. So this protocol will be the one where we develop an SOP alongside it, to be refined for the 2nd protocol, and the 3rd, etc. :slight_smile:

Hey, sorry for not responding today; I was working on other things within the repo. I am still making some readability edits and adding more documentation to the turbidostat notebook, and I will rearrange the folder structure to be more like what you described, @CamilloMoschner.

I will also generate a requirements.txt; that is a great idea!!

The aim of this aP is to hold bacteria at a designated optical density (a measurement of bacterial density) for a prolonged period of time. I would therefore classify this as an aP under the umbrella of bacterial culture.

I have not yet tested this aP on a machine; I will have the capability to test it on a few different platforms within the next 2 weeks.
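
To sketch the core dilution logic (placeholder names only, this is not the actual notebook code):

    # Core turbidostat loop, sketched with placeholder helpers; the real notebook
    # wires read_od / dilute_well to PLR plate-reader and pipetting calls.
    import asyncio

    TARGET_OD = 0.5          # setpoint optical density (illustrative)
    DILUTION_FRACTION = 0.2  # fraction of well volume replaced per dilution

    async def turbidostat_loop(read_od, dilute_well, wells, period_s=300):
        """Hold each well near TARGET_OD: measure, then dilute wells above setpoint."""
        while True:
            for well in wells:
                od = await read_od(well)  # placeholder: plate reader measurement
                if od > TARGET_OD:
                    await dilute_well(well, DILUTION_FRACTION)  # placeholder: pipetting
            await asyncio.sleep(period_s)  # wait before the next measurement cycle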

1 Like

I added this as an idea (included in last commit to PR):

# Protocol Library
A basis and quickstart applications library for the PyLabRobot platform!


## File Structure

### General
    protocol_library/
    ├── protocol_family/
    │  ├── protocol_type/
    │  │  ├── generic_examples/
    │  │  │  ├── example_1/
    │  │  │  │  ├── README.md
    │  │  │  │  ├── protocol_example.py
    │  │  │  │  └── requirements.txt
    │  │  │  ├── example_2/
    │  │  │  │  ├── README.md
    │  │  │  │  ├── protocol_example.py
    │  │  │  │  └── requirements.txt
    │  │  │  └── ...
    │  │  └── plug_n_play/
    │  │     ├── example_1/
    │  │     │  ├── README.md
    │  │     │  ├── protocol_example.py
    │  │     │  └── requirements.txt
    │  │     └── ...
    │  └── protocol_type_2/ ...
    └── protocol_family_2/ ...

### Example
    protocol_library/
    ├── bacterial_culture/
    │  ├── turbidostat/
    │  │  ├── generic_turbidostat/
    │  │  │  ├── turbidostat_with_pumps/
    │  │  │  │  ├── README.md
    │  │  │  │  ├── turbidostat.ipynb
    │  │  │  │  └── requirements.txt
    │  │  │  ├── turbidostat_no_pumps/
    │  │  │  │  ├── README.md
    │  │  │  │  ├── turbidostat.ipynb
    │  │  │  │  └── requirements.txt
    │  │  │  └── ...
    │  │  └── turbidostat_opentrons/
    │  │     ├── README.md
    │  │     ├── protocol_example.py
    │  │     ├── requirements.txt
    │  │     └── ...
    │  └── protocol_type_2/ ...
    └── protocol_family_2/ ...



Other things to think about with the file structure:
- .gitignore for each
- ideas to include in each README.md?
  - troubleshooting guidelines
  - explanation of procedure
  - how to cite
  - system requirements
  - how to configure for systems 
    - ex. for systems with and without pumping troughs


This library is modular and can accommodate user-added plug-n’-play examples while also allowing for some grooming of the general-type protocols. It also allows research institutions to reference other protocols within a local environment to troubleshoot and learn about PLR functionality.

I am concerned that this might seem a bit too complex, but I was thinking of many different protocol types and trying to work out how to organize a large number of them. Please let me know your thoughts.

Ideally every protocol is universal/generic. It will be an interesting test of how well PLR’s universal design principles hold up when multiple people run the same protocol on different machines.

Naturally, the person who first contributes a certain protocol might not have all the hardware. Future users can adapt the protocol to work on their hardware and submit that code controlled by a config flag, e.g. pumps: bool (see the sketch below). Because we will have a validation.json for every tested configuration/setup, it is ensured that the original protocol (and every update after that) will keep working as it did when first submitted.
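
A rough sketch of that flag pattern (all names hypothetical):

    # Hypothetical config-flag pattern: one protocol, two hardware variants.
    import asyncio
    from dataclasses import dataclass

    @dataclass
    class TurbidostatConfig:
        pumps: bool = False   # True on workcells that have pumps for media exchange
        target_od: float = 0.5

    async def pump_based_exchange(well):     # stub: replace with real pump commands
        print(f"pump-based media exchange in {well}")

    async def pipette_based_exchange(well):  # stub: replace with PLR pipetting steps
        print(f"pipette-based media exchange in {well}")

    async def exchange_media(cfg: TurbidostatConfig, well):
        # The flag selects the hardware variant; the protocol logic stays shared.
        handler = pump_based_exchange if cfg.pumps else pipette_based_exchange
        await handler(well)

    asyncio.run(exchange_media(TurbidostatConfig(pumps=False), "A1"))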

I like this

1 Like

We should require using

! pip install nbstripout
! nbstripout --install  # sets up a git filter that strips notebook outputs on commit

for all notebooks.

sometimes the output can actually be helpful, so people reading the documentation know what to expect. for logs, i think it’s better to write the output to a file than to the notebook stdout, so in theory it shouldn’t be needed.
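
e.g. with the standard library (filename illustrative):

    # send protocol logs to a file instead of the notebook's stdout
    import logging

    logging.basicConfig(
        filename="protocol_run.log",  # illustrative filename
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    logging.info("starting turbidostat run")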

1 Like

Hi @aidan-baydush,

I really like this approach (particularly the plug and play approach) but I have 2 general questions:

  1. Do you think this automated Protocol (aP) can only be used for bacterial growth control?
    (could the exact same protocol not also be used for any cell type we want to grow at a specific turbidity/“cloudiness”?)

  2. My initial suggestion for classifying aPs by their function was meant to ask the question: “What is the goal of the aP we want to add to the PLR Protocol Library (PL)?”
    Imagine a more classic software engineering example:
    Case 1 - building an app to achieve a specific goal: a company wants to build a weather app; engineers build a dashboard and let it display weather data to users → goal: information gain for users.
    Case 2 - building a tool: an engineer builds a dashboard, and open-sources the code so that anyone can use the tool however they see fit.

From my point of view (having built micro-chemostats for 5 years and compared them to turbidostats along the way :slight_smile: ), a turbidostat by itself is more of a tool which achieves a single outcome → maintain all cell cultures at the same turbidity.
Leading to the question: What is the goal / product / insight of this?

But it also raises the question of whether we want the PLR-PL to include both tool-type AND complete goal-achieving aPs, or just one of the two.

…this is regarding the type of aP we want to integrate.


On the storage & deployment of aPs side: Do you think Docker Containers would be more efficient here?

Containers typically don’t have access to hardware ports, though with some major tradeoffs they can be configured to allow it.

A major advantage would be reusability (if the aim is to share complete, functional aPs that achieve a particular goal).
A major disadvantage would be plug-and-play of code sections (if the aim is to share code as a tool / inspiration / starting point).

I’m curious to hear your views :slight_smile:

1 Like

Interestingly: I can instantly imagine a dozen applications that would convert this tool-oriented aP into a goals-oriented aP:

e.g. a research technician has to generate competent cells (i.e. treat the cells so they readily take up DNA):
Let’s say they have 96 different cell strains and must make all of them competent.
It is crucial for downstream applications that all cell strains have the same competence, and for this purpose it is important to have the same number of cells in the output (i.e. the wells of competent cells).

The only way to achieve this is with your plate_reading-PID-based multi-wellplate turbidostat!

Summary:

  • goal: high-throughput generation of competent cells
    (from different cell strains! [if you’d want 96 wells of the same cell strain you’d just grow it in a single flask, manually check the OD and then aliquot from the one flask into 96 wells])
  • tool to achieve the goal: plate_reading-PID-based multi-wellplate turbidostat

(Note: this case study assumes that the ODs of different cell cultures correlate equivalently with the number of cells in those cultures. We know that is incorrect, but it’s a pretty universal assumption in plate reading, and it has its utility as long as assumptions are clearly stated.)

1 Like

@CamilloMoschner good afternoon,

Regarding types of aPs,

Thank you for the clarification on what you meant by function, I certainly misinterpreted there.

  1. Theoretically yes; I hadn’t given it much thought as our work is mostly centered around bacteria. This aP would be applicable to anything that grows in liquid culture, such as yeast or other bacteria, but these would require some adaptation of the data simulation/prediction software.

  2. I am starting to understand what you mean by your question.

The turbidostat is specifically a method designed to dilute cultures (of any kind, though this implementation is specifically for bacteria), so the goal depends on the framing, as you exemplify in your second reply. Right now this would still fall under the tool category, but it raises the question of how to categorize aPs.

Overall, I think that a goal-oriented library might be challenging. As you mentioned, the framing for each tool generates different goals, so building and selling solutions to goals would be challenging because they are context-dependent, whereas building a tool library would allow users to customize what they use each tool for.

In my file structure, I accounted for different aP types. I imagined the generic examples being filled with skeleton code or PLR scaffolds that allow users to generate their own structures from the given scaffold. The plug-and-play modules would be more tool-oriented, aimed at helping those who are getting started or looking for rapid, low-code optimizations to their lab workflows. This way the user is able to define their own goals and use the given tools to build their own solution. This approach also helps to keep everything general.

The tools can be organized by the part of the workflow they help with, e.g. an SPRI bead protocol falling under purification.

I think that answers your question; please let me know if this makes sense. I have never worked with libraries of this nature, and I am just thinking out loud, so to speak.

Regarding deployment

Would the validation.json files that @rickwierenga proposed (see PR) solve the problem of reusability, at least for now? Perhaps in the future that would be really helpful for avoiding package issues. I have never worked with containers in this way, but from my understanding they require funding; would funding become an issue?

1 Like

it would solve the problem of:

  • making sure the protocol keeps working as we update PLR
    • thought: if the firmware commands we generate are the same, and we respond in the same way to the machine’s responses, the behavior will literally be the same, no matter how those firmware commands are generated

and moving forward it solves the problem of:

  • users proving their protocol has been run at least once
    • (technically possible to spoof but you gotta set the bar somewhere)
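
to make the idea concrete (the validation.json format is not decided yet, so this is purely illustrative):

    # Illustrative check: compare freshly generated firmware commands against
    # the commands recorded in a validation.json from a tested run.
    import json

    def matches_validated_run(generated_commands: list[str], path: str) -> bool:
        with open(path) as f:
            recorded = json.load(f)["firmware_commands"]  # hypothetical key
        return generated_commands == recorded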

no. just python files. docker is a fundamentally twisted idea. why simulate an entire os just to run a python script. it’s bloatware imo

PLR is quite simple, has a few simple dependencies (the bare minimum is tiny). i think docker is overcomplicating it & not needed. let’s keep it simple.

simple => it’s easy for people to adopt and do whatever they want. if someone wants to release/use protocols in containers they should go ahead, but in plr we can just host the python file; it will work

1 Like

I also don’t think Docker Containers are the right solution here.
Even though they are the most widely used deployment solution, with an estimated 20 million developers using them each month (What Is Docker? | IBM), which proves they are absolutely useful, I do see the port access issue as a major constraint.
It is simply not designed for robotics but for pure software.
But I think it is very important to consider why Docker, this major obstacle aside, might be interesting for the reliable deployment of entire automated Protocols:

What do Containers solve?
Mainly

  • ensuring identical environments during code deployment, leading to truly robust deployment
  • scalability

Robustness

On the host side:
Everyone has different host computers, different OSes, different versions of those OSes, different Python distributions, different Python path configurations, running via venv, conda, or some funkier virtual environment solution, … All of this affects even our purposefully slim PyLabRobot.
E.g. 1: I recently had an issue with an HID machine because I was using Anaconda on a Mac and the necessary Python hid library couldn’t be found on the path; apparently this is not an issue when using venv or a different OS(?)
E.g. 2: to my knowledge, the order in which we install PLR via GitHub clone matters: if I install jupyterlab first and then PLR, the PLR installation fails due to a jsonschema conflict with jupyterlab’s installed jsonschema version. The other way around it doesn’t matter. And this seems to be a hard constraint because Opentrons refuses to upgrade their library’s jsonschema requirement(?).
Do we now have to know this every time we want to run e.g. an ELISA aP?

On the aP/script side:
I’ve never written or seen an aP that ran on PyLabRobot alone; there are always extra dependencies, e.g. numpy, pandas, opencv, pytorch, scikit-learn/image, SQL libraries, …
These extra dependencies need to be correctly installed and their dependency trees have to be carefully managed.
And different aPs have different dependency needs.
Do we expect this dependency management to be redone every time from scratch on every newly installed host PC for every aP?

Both of these considerations, varying host PCs + varying dependencies, are big obstacles for deployment.
But of course, these are not new problems; containers were invented in response to them, with Docker simply being the most popular containerisation solution:
Set up your environment and all dependencies once, package them into a Docker image, and any time you need to run the application (even if the application is executed via a single Python file) you simply spin up a container from your image.
That container recreates your entire OS + environment + dependencies exactly as you set them up… A very elegant solution, but the port access limitation, which is an obstacle for robotics applications (that use port-based machine communication), is a designed isolation feature of containers :sweat_smile:

Validation files do not address these issues; they are powerful tools afterwards, i.e. they solve the issue of how to validate an aP, but they do not help with the setup/deployment of the aP.

Scalability

One example: what if you have multiple workcells (identical or almost identical), you want to run many different aPs, and you need to constantly adjust your throughput? Say you’re a cloud lab and you’re hired to quickly scale up your diagnostics aP.
Ideally you’d just have one functional diagnostics aP image ready for this purpose and send it to all available workcells for execution. This “multi-tenancy” deployment would ensure that each workcell performs the same operations in less than an hour, even if the local control PCs for each workcell differ completely in OS, hardware, …

The alternative would be to install the diagnostics aP’s virtual environment on each control PC separately(?).
But then you might encounter issues with specific setups that interfere with your new installation, and you’d have to spend time debugging why the installation of dependency X worked on workcell 4 but not on workcell 9.

The reason this matters is that we all want to build a PLR Protocol Library in which every added aP can be used robustly by anyone with the necessary machines :slight_smile:

We don’t have to find solutions right away, but I’d recommend keeping this in mind during development.