Reverse Engineering the ClarioStar Plate Reader: A Guide and Case Study
Hey y’all,
I’ve been working on reverse engineering the ClarioStar plate reader, and I’ve seen posts here asking for guidance on similar projects, namely figuring out how to send serial commands to automatable instruments. I want to share some general principles I’ve found incredibly useful, along with a salient example from a recent breakthrough with @Eric, where we pinpointed the exact function and encoding of a single byte in the serial stream.
This particular reverse engineering case has both advantages and disadvantages. Conveniently, the instrument’s proprietary software logs all serial commands sent and received. These logs are easily accessible and can even be “cleaned up” with their built-in log viewer. This significantly simplifies the process by bypassing the need for serial sniffing and manual parsing. However, the parameter space for controlling the plate reader is incredibly vast, with dozens of parameters encoding physical dimensions, categorical actions, numeric settings, and more.
I started this project last December, and it was on hold for about six months while I moved labs and shifted priorities. After many hours of work, I’ve successfully decoded how aspects like plate dimensions, wells to read, reading type, shaker actuation, well scan patterns, and reading modes are encoded, and I’ve learned a lot about reverse engineering serial commands in the process.
My Reverse Engineering Principles
Here’s the approach that has worked for me:
1. Define Your Bare Minimum
Before you dive in, consider what features are absolutely essential for your needs. Do you need a complete wholesale reverse engineering, or would basic patching of a single command suffice for your specific goals? Can you get away with just a dictionary of pre-defined commands? Relatedly, are you developing this for the community, or do you really need it just for your specific use case?
Answering these questions can help you structure your approach and gives you good backups to fall back on if a complete or more thorough reverse engineering isn’t feasible on the timescale you need.
My objective was to enable the use of multiple different protocols used in my lab. These covered a large enough swath of the parameter space to justify a full reverse engineering effort. I kept the full commands for those protocols saved so we could basically patch them in if needed, but work using the plate reader is only now getting back into swing.
2. Isolate Encodings (Iterative Process)
- 2.1 Isolate Encodings: This is often the most monotonous, frustrating, and labor-intensive step (but also potentially automatable). You need to gather enough measurements across various parameters and meticulously record the details of each command, just to get some initial patterns you can home in on.
- 2.2 One Thing at a Time: Once you have some leads, focus on figuring out exactly how things are encoded by changing only one variable at a time. This is usually the best starting point. However, with the ClarioStar’s extensive degrees of freedom, this approach would have taken far too long for the entire plate reader. It was also sadly impossible for the plate dimensions and well encodings, as multiple values changed at once, but thankfully the techniques described below still let me figure out how that information is communicated.
- Techniques for Decoding:
- XOR/Difference Analysis: For “one-difference” commands (where only a single parameter changes), using XOR or direct difference analysis between command pairs can quickly highlight the relevant bytes or bits.
- Direct Numeric Encoding: For numeric parameters, look for direct numeric encoding. Conversely, if a parameter has a small, fixed number of options, it’s likely encoded categorically, with about as many distinct values as there are options. For categorical parameters, I saw both integer-based encodings (0-5 for 6 categories of shaker patterns) and some that started at values such as 120-124, so it may not be consistent for your machine.
- Correlation Analysis: Analyze correlations at the bit, bit pair, nibble, byte, and byte pair levels. This is overkill, but if you set up the output right, it should be easy to filter. This is particularly useful for identifying how larger values or multiple discrete options are packed into a command and can quickly give a head start. Be wary of spurious or confounded correlations, especially with the integer value interpretation.
- Binary Search Approach for Numeric Parameters: For numeric parameters, compare the commands for the minimum and maximum values (e.g., using XOR) and analyze the bits, bit pairs, nibbles (4-bit segments), and bytes. The range of values can guide you: a variable spanning 1 to 1000 will require at least 10 bits (2^{10} = 1024 values). Often, a full 2-byte, 16-bit number (values up to 65,535) would be used… but not always! If it’s still unclear, try a value in the middle and continue the binary search in this way until you’ve nailed it down. You should also develop and test predictions as you go to validate your encodings. Not only does this rigorously assess your encoding approach, but given how frustrating and time-consuming this process can be, the psychological boost and satisfaction it provides are actually quite important in my experience.
For my large-scale reverse engineering, this involved generating extensive reports on correlations at different bit levels, and then parsing these to only examine the strongest and clearest associations.
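As a rough illustration of two of the simpler techniques above (XOR/difference analysis and estimating how many bits a numeric parameter needs), here is a minimal Python sketch. The function names and the example command bytes are made up for illustration; they are not real ClarioStar commands and not the scripts I actually used.

```python
def xor_diff(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (byte_index, xor_value) for each position where two commands differ.

    Assumes both commands have the same length, so positions line up.
    """
    if len(a) != len(b):
        raise ValueError("commands must have equal length for a positional XOR")
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def changed_bits(xor_value: int) -> list[int]:
    """Bit positions (0 = least significant) that are set in one XOR byte."""
    return [bit for bit in range(8) if (xor_value >> bit) & 1]


def bits_needed(n_values: int) -> int:
    """Minimum number of bits needed to distinguish n_values distinct values."""
    if n_values < 1:
        raise ValueError("n_values must be at least 1")
    return max(1, (n_values - 1).bit_length())


# Two made-up commands that differ in a single parameter.
cmd_a = bytes.fromhex("0200700c0001")
cmd_b = bytes.fromhex("0200700c0003")

for index, value in xor_diff(cmd_a, cmd_b):
    print(f"byte {index}: xor=0x{value:02x}, changed bits={changed_bits(value)}")

# A parameter spanning 1 to 1000 has 1000 possible values.
print(bits_needed(1000))
```

Running the same diff over many "one-difference" pairs and tallying which byte indices show up is a cheap way to build a first map of which bytes belong to which parameter.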
3. Convert to an API and Contribute!
Once you’ve deciphered the commands, the next step is to wrap them into an accessible API, making them usable. If you’re contributing to an open-source project like PyLabRobot, here’s how to proceed:
- Post here on the forum: Share your findings and progress with the community.
- Message one of the PyLabRobot developers: Reach out directly to discuss your contributions.
- Refer to the documentation on developing new machines: Follow the established guidelines for integrating new instrument drivers.
- Make a pull request: Submit your code changes for review.
- Incorporate feedback: Be open to suggestions and refine your code based on community and maintainer input.
Case Study: The Command Type Byte
Every message sent to or from the plate reader begins with four message start bytes before the main body of the communication. In hexadecimal, the first two are consistently 0x02 and 0x00 (00000010 00000000), and the last is always 0x0C (00001100). The third byte, however, varies. It indicates the command type: e.g., for fetching EEPROM values it’s 0x0F, for querying status it’s 0x0A, for sending status it’s 0x0E, for stop commands it’s 0x0D, etc. Most of these are discrete categories.
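To make the layout concrete, here is a small Python sketch of building and checking those four start bytes. The byte values come from the description above; the dictionary keys and function names are my own labels for illustration, not anything taken from the vendor software or from PyLabRobot.

```python
# Command type bytes as described above; the key names are illustrative labels.
COMMAND_TYPES = {
    "fetch_eeprom": 0x0F,
    "query_status": 0x0A,
    "send_status": 0x0E,
    "stop": 0x0D,
}


def build_message_start(command_type: int) -> bytes:
    """Build the four message start bytes: 0x02, 0x00, <command type>, 0x0C."""
    if not 0 <= command_type <= 0xFF:
        raise ValueError("command type must fit in one byte")
    return bytes([0x02, 0x00, command_type, 0x0C])


def parse_command_type(message: bytes) -> int:
    """Return the command type byte after checking the fixed start bytes."""
    if len(message) < 4:
        raise ValueError("message is shorter than the four start bytes")
    if message[0] != 0x02 or message[1] != 0x00 or message[3] != 0x0C:
        raise ValueError("message does not begin with the expected start bytes")
    return message[2]


start = build_message_start(COMMAND_TYPES["query_status"])
print(start.hex(" "))
print(hex(parse_command_type(start)))
```

This only covers the message start; the main body and anything that follows it are not modeled here.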
We had initially assumed all “protocol run” commands were flagged with 0x86. But yesterday, while trying to run some commands, we encountered issues. We reverted to the baseline software and tested different protocols. To our surprise, the third byte differed with each measurement type!
To our chagrin, it was also unclear how this information was encoded. We basically performed the “isolate” step of our pipeline, changing a bunch of parameters and observing the third byte. We managed to narrow down the relevant parameters to three aspects: reading type, whether well scan is on, and the number of wavelengths/chromatics. This was achieved through hours of trial and error.
With these parameters isolated, we systematically perturbed individual variables using the method described above. Ultimately, after about 12 measurements, we determined the encoding:
- Fluorescence: Flagged by 112 + (num_chromatics - 1) * 20
- Absorbance: Flagged by 120 + (num_wavelengths - 1) * 2
- Luminescence: Encoded by a base nibble of 0x2 where the first four bits (nibble 1) encode 8 + num_chromatics - 1.
Interestingly, enabling well scan adds 5 to the integer value. Why the engineers chose this flagging method, we have no idea! These values do overlap, and the reading type (Luminescence, Fluorescence, or Absorbance), the well scan setting, and the number of wavelengths/chromatics are also included in the main run command body, which is likely where they are parsed from. As far as we know, the command won’t run properly if these aren’t flagged correctly in this message start (though we haven’t exhaustively tested this new aspect yet, so it’s always possible something else was causing the original issue).
We validated our predictions by changing other parameters and ensuring this byte remained unaffected.
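For anyone who wants to play with the numbers, here is a minimal Python sketch of the fluorescence and absorbance formulas above, with the well scan offset applied. The function name and argument names are hypothetical, and I have deliberately left luminescence out rather than guess at the exact nibble layout. Treat this as our current working hypothesis, not a confirmed spec.

```python
def run_command_type_byte(reading_type: str, count: int, well_scan: bool = False) -> int:
    """Predict the third start byte for a protocol run command.

    reading_type: "fluorescence" or "absorbance" (luminescence is not covered here).
    count: number of chromatics (fluorescence) or wavelengths (absorbance).
    well_scan: whether well scan is enabled, which adds 5 to the value.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if reading_type == "fluorescence":
        value = 112 + (count - 1) * 20
    elif reading_type == "absorbance":
        value = 120 + (count - 1) * 2
    else:
        raise NotImplementedError(f"no formula sketched for {reading_type!r}")
    if well_scan:
        value += 5
    if value > 0xFF:
        raise ValueError("predicted value does not fit in a single byte")
    return value


for kind, n, scan in [("fluorescence", 1, False), ("fluorescence", 2, True), ("absorbance", 3, False)]:
    byte = run_command_type_byte(kind, n, scan)
    print(f"{kind}, n={n}, well_scan={scan}: {byte} (0x{byte:02x})")
```

Writing the prediction down as a function like this also makes it easy to check each new logged command against the hypothesis.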
Future Considerations
I suspect commands for pre-plate reading steps like pump priming and gain adjustment also have their own unique command bytes that will need similar investigation. I also dream of agentic models being able to tackle this kind of work, though I have my doubts about their current capabilities for nuanced reverse engineering, particularly concerning their tendency to need babysitting, drift off task, or confidently assert incorrect information.
All that said, I hope this helps others starting this kind of process. Comments, questions, and additional advice are welcome and encouraged. And may your intuitions about which bytes encode which features be correct!