Modeling arm capabilities

Follow-up post to the “Updating PLR API for machine interfaces” discussion, specifically to discuss what the Arm capability should look like.

the question for this thread is: what are the models we need to model any robotic arm in PLR?

Arm hierarchy

I think we need at least three separate capabilities, each extending the capabilities of the previous one:

  1. Arm
    • Manages picking up and dropping resources in the resource model
    • Has tracking for which resource is picked up
    • Converts resource locations to pick up locations (+rotations)
    • Used by arms that do not support rotation, just pickup
    • Atomic methods:
      • pick_up_at_location(location: Coordinate, plate_width: float)
      • drop_at_location(location: Coordinate, plate_width: float)
      • get_location
    • Example: core grippers
  2. OrientableArm
    • Inherits shared logic from Arm
    • Adds a rotation parameter for pickup/drop operations, so it supports resources picked up in different rotations, like landscape or portrait mode. Theoretically it can be any rotation, but backends can constrain that. For example, iswap can (currently) only deal with resources that are rotated in multiples of 90° around the z-axis.
    • These arms can rotate, but crucially do not have all joints.
    • Atomic methods:
      • pick_up_at_location(location: Coordinate, plate_width: float, rotation: Rotation)
      • drop_at_location(location: Coordinate, plate_width: float, rotation: Rotation)
    • Example: iswap
  3. JointArm
    • In addition to having rotations, these arms consist of a series of well-defined, ordered joints, like SCARA style and I think also including articulated arms.
    • Atomic methods:
      • pick_up_at_location(location: Coordinate, plate_width: float, rotation: Rotation)
      • drop_at_location(location: Coordinate, plate_width: float, rotation: Rotation)
      • pick_up_at_joint_location(location: Dict[int, float], plate_width: float)
      • drop_at_joint_location(location: Dict[int, float], plate_width: float)
    • Example: PF400

Each capability-frontend will have an associated capability-backend for it, like ArmBackend, OrientableArmBackend, JointArmBackend.

As an example, the new STAR class will have an iswap: OrientableArm attribute.
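A minimal sketch of how the three levels and their backends could fit together. Everything here is an assumption for illustration: Coordinate and Rotation are simplified stand-ins for the PLR types, FakeISwapBackend is hypothetical, the drop methods are left out because they mirror the pick-up ones, and holding is only a boolean where real code would track the Resource.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Coordinate:  # simplified stand-in for PLR's Coordinate
    x: float
    y: float
    z: float


@dataclass(frozen=True)
class Rotation:  # simplified stand-in: degrees around the z-axis only
    z: float = 0.0


class ArmBackend(ABC):
    @abstractmethod
    def pick_up_at_location(self, location: Coordinate, plate_width: float) -> None: ...


class OrientableArmBackend(ABC):
    @abstractmethod
    def pick_up_at_location(
        self, location: Coordinate, plate_width: float, rotation: Rotation
    ) -> None: ...


class JointArmBackend(OrientableArmBackend):
    @abstractmethod
    def pick_up_at_joint_location(
        self, location: Dict[int, float], plate_width: float
    ) -> None: ...


class Arm:
    """Base capability: pick up without rotation, track whether something is held."""

    def __init__(self, backend) -> None:
        self.backend = backend
        self.holding = False

    def _check_free(self) -> None:
        if self.holding:
            raise RuntimeError("arm is already holding a resource")

    def pick_up_at_location(self, location: Coordinate, plate_width: float) -> None:
        self._check_free()
        self.backend.pick_up_at_location(location, plate_width)
        self.holding = True


class OrientableArm(Arm):
    """Adds a rotation parameter; the backend may constrain the allowed values."""

    def pick_up_at_location(
        self, location: Coordinate, plate_width: float, rotation: Rotation = Rotation()
    ) -> None:
        self._check_free()
        self.backend.pick_up_at_location(location, plate_width, rotation)
        self.holding = True


class JointArm(OrientableArm):
    """Adds pick-up in joint space on top of the Cartesian interface."""

    def pick_up_at_joint_location(
        self, location: Dict[int, float], plate_width: float
    ) -> None:
        self._check_free()
        self.backend.pick_up_at_joint_location(location, plate_width)
        self.holding = True


class FakeISwapBackend(OrientableArmBackend):
    """Hypothetical backend that records commands and only accepts 90 degree steps."""

    def __init__(self) -> None:
        self.commands: List[Tuple[Coordinate, float, Rotation]] = []

    def pick_up_at_location(
        self, location: Coordinate, plate_width: float, rotation: Rotation
    ) -> None:
        if rotation.z % 90 != 0:
            raise ValueError("only multiples of 90 degrees around z are supported")
        self.commands.append((location, plate_width, rotation))
```

With this shape, the base Arm never exposes a rotation or joint method, so a core-gripper-like arm cannot be asked for something it cannot do.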

Backend responsibility

Currently, the STARBackend, where the iswap functionality lives, has a bunch of math on converting locations and rotations of resources together with pickup rotation to center-coordinates. This can all be moved to the front end, since this logic is the same for all arms. All arms we have come across so far actually work on a CCC-basis (center in xyz). This means we can compute the location of a pickup in shared code and just pass the Coordinate and Rotation to the backend - along with backend kwargs that further specify a resource movement like lefty/righty on pf400. This really simplifies the implementation of new arms since that logic gets quite complex.
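As a rough illustration of the shared front-end math, here is a sketch that turns a resource's left-front-bottom location, size and z-rotation into a center (CCC) coordinate. It assumes the resource is rotated around its own LFB corner about the z-axis; pickup_center and this Coordinate class are hypothetical names, not existing PLR API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:  # simplified stand-in for PLR's Coordinate
    x: float
    y: float
    z: float


def pickup_center(
    lfb: Coordinate,
    size_x: float,
    size_y: float,
    size_z: float,
    rotation_z_deg: float = 0.0,
) -> Coordinate:
    """Return the xyz center of a resource whose LFB corner sits at `lfb`.

    Assumption: the resource is rotated by `rotation_z_deg` around its LFB
    corner, about the z-axis only.
    """
    theta = math.radians(rotation_z_deg)
    half_x, half_y = size_x / 2, size_y / 2
    # Rotate the local center offset, then translate by the LFB location.
    return Coordinate(
        x=lfb.x + half_x * math.cos(theta) - half_y * math.sin(theta),
        y=lfb.y + half_x * math.sin(theta) + half_y * math.cos(theta),
        z=lfb.z + size_z / 2,
    )
```

The backend would then only receive this Coordinate plus a Rotation, and would not need to know about resource sizes at all.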

Addressable space

Each arm will have its own coordinate system. I think the cleanest way to model this is as the coordinate system set by the Arm’s associated resource. I am currently thinking about adding a _reference_resource: Resource attribute to Arm to handle the space this arm is working in. For iswap and core grippers, this will just be the STAR’s deck. For the pf400, it will be a new resource.
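One way to picture the reference resource idea: every location handed to the arm is expressed relative to that resource. The sketch below uses a hypothetical, stripped-down Resource class (translation only, no rotation) and a hypothetical helper location_wrt; it is not the PLR resource model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec = Tuple[float, float, float]


@dataclass
class Resource:  # hypothetical minimal resource: a name, an offset, a parent
    name: str
    location: Vec = (0.0, 0.0, 0.0)  # LFB offset relative to the parent
    parent: Optional["Resource"] = None


def location_wrt(resource: Resource, reference: Resource) -> Vec:
    """Sum parent offsets until `reference` is reached."""
    x = y = z = 0.0
    node: Optional[Resource] = resource
    while node is not reference:
        if node is None:
            raise ValueError(
                f"{resource.name} is not inside the space of {reference.name}"
            )
        x += node.location[0]
        y += node.location[1]
        z += node.location[2]
        node = node.parent
    return (x, y, z)
```

An arm whose reference resource is the deck would refuse (here with a ValueError) to address a resource that does not live somewhere under that deck.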

Teach points

The main interface to working with an arm is calibrating and updating teach points. In PLR this is currently a bit confusing, but I do not have a good proposal for improving it. (see iswap docs)

What is annoying: when you set a resource’s location, that is the location of its left-front-bottom (LFB) corner. This means that when you physically put a resource somewhere with an arm and then request the current arm location, the arm reports that location in CCB, which is not actually the location that the given resource should be set to. You have to figure out what plate you were holding to find the actual LFB of the destination. It also matters where on the resource it was gripped: the bottom of the resource depends on the resource’s pickup height.

When you are just working with coordinates or joint positions, this is not a big deal, but when you teach resources it is awkward.

maybe separate get_gripper_location from get_gripped_resource_location
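A sketch of what that split could look like. The assumptions are mine: the gripper reports the xy center of the held resource at the z height where the fingers grip, the resource is not rotated, and pickup_distance_from_top says how far below the top of the resource it was gripped. The function name follows the suggestion above, the signature is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:  # simplified stand-in for PLR's Coordinate
    x: float
    y: float
    z: float


def get_gripped_resource_location(
    gripper_location: Coordinate,
    size_x: float,
    size_y: float,
    size_z: float,
    pickup_distance_from_top: float,
) -> Coordinate:
    """Convert a gripper location into the LFB location of the held resource."""
    # The fingers sit `pickup_distance_from_top` below the top of the resource,
    # so the bottom is (size_z - pickup_distance_from_top) below the fingers.
    bottom_z = gripper_location.z - (size_z - pickup_distance_from_top)
    return Coordinate(
        x=gripper_location.x - size_x / 2,
        y=gripper_location.y - size_y / 2,
        z=bottom_z,
    )
```

During teaching, get_gripper_location would return the raw value and this conversion would give the value you can assign as the resource location.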

Nice write-up Rick. Clear solutions already. A few things that came to mind:

the word backend confused me. Backend already means STARBackend in “old” PLR, so ArmBackend reads like a separate driver to me. But what you’re describing is more like the contract that defines what the driver has to implement right? What about naming them for what they require, like CanPickUpResource, CanControlJoints? Then the driver just inherits from those:

  class STARDriver(CanPickUpResource):
      ...

  class PF400Driver(CanPickUpResource, CanControlJoints):
      ...

  # just like byonoy
  class ByonoyDriver(CanReadAbsorbance):
      ...

But I am not really sure about this part.

Then about star.iswap; iSWAP and CoRe gripper are not really separate capabilities right? They both move resources around. It’s just two ways of doing the same thing. If we put iswap on the device we’re putting hardware names where capabilities should be.

What if it’s just star.arm and then the user picks the mechanism through a kwarg like use_arm="iswap" when they need to? That is already close to what it is in the existing code.

On the naming Arm → JointArm; this one is more of a feeling but: a real arm already has joints. The PF400 IS an arm. The CoRe gripper is really just an XYZ gantry with a tool attached, not really an “arm” in the physical sense. So the hierarchy goes from the thing that isn’t really an arm to the thing that actually is one. Not sure what to do with that but it felt off.


kind of yes, I will respond in more detail in the thread about architecture. (I want to keep this thread to just talking about what arms are)

In my first post I explained why I think they are separate capabilities: an iswap arm can rotate (“OrientableArm”) whereas the core grippers cannot (proposing to make just a base Arm for them).

I will also respond to “if it’s just star.arm and then the user picks the mechanism through a kwarg like use_arm="iswap"” in the other thread

yes this is the crux of what I want to discuss here

With “arm” I meant something that moves resources around in an arbitrary space. In that sense, a core gripper is an arm, even though it does not have joints. The iswap is kind of an interesting case since it has some joints, but also works on a x/y/z gantry rather than being a real scara / articulated arm. That is why I thought we should have the middle layer of OrientableArm. Do you think these levels make sense?

The naming is somewhat separate, and I agree that it is a little confusing right now. Maybe we should just call the base capability like ResourceHandler or something? And then have OrientableResourceHandler with JointArm as a subclass of that?

Tangential: @CamilloMoschner and I also talked about something called an “automated retrieval system”, like a cytomat or liconic, that has these functions:

  • move resource from storage to loading tray (singular loading tray)
  • move resource from loading tray to storage (many storage locations, they are carriers)

but these systems never rotate resources or have free/arbitrary/random movement; it’s really just “storage + retrieval”, and so I want to model them separately from an arbitrary arm / resource handler.
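To make the contrast with an arm concrete, a storage + retrieval capability could be as small as the sketch below. The class and method names are hypothetical, and resources are represented by plain strings for brevity.

```python
from typing import List, Optional


class AutomatedRetrievalSystem:
    """Hypothetical capability: one loading tray, many storage sites, no free movement."""

    def __init__(self, num_storage_sites: int) -> None:
        self.storage: List[Optional[str]] = [None] * num_storage_sites
        self.loading_tray: Optional[str] = None

    def retrieve(self, site: int) -> None:
        """Move a resource from a storage site to the loading tray."""
        if self.loading_tray is not None:
            raise RuntimeError("loading tray is occupied")
        if self.storage[site] is None:
            raise ValueError(f"storage site {site} is empty")
        self.loading_tray, self.storage[site] = self.storage[site], None

    def store(self, site: int) -> None:
        """Move a resource from the loading tray to a storage site."""
        if self.loading_tray is None:
            raise ValueError("loading tray is empty")
        if self.storage[site] is not None:
            raise RuntimeError(f"storage site {site} is occupied")
        self.storage[site], self.loading_tray = self.loading_tray, None
```

There is no coordinate, rotation or joint parameter anywhere in this interface, which is the reason for keeping it separate from the arm hierarchy.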

Very interesting discussion here :slight_smile:

First, I thought we already have a proposal for modelling Arms in PLR:

or do you mean here not the “resource model” modelling but instead building the driver inheritance system?

I do find the terms OrientableArm and JointArm very confusing because every arm has to be oriented in 3D/4D space(time) and has joints (even a gantry robot, like the Hamilton STAR CORE gripper system, technically has prismatic joints aka rails).

I found the hierarchy we already have for arms in PLR more useful for the following important arm characteristics that we should have written down somewhere in the docs for people who want to get into arms (it’s essential for manufacturing and CAD too :slight_smile:)

  • Every arm has joints and links in between them.

  • an object being transported can have up to 6 degrees of freedom depending on the robot’s total degrees of freedom:

    • Movement in X, Y, Z
    • Rotation around all 3 axes
      • Roll (X-axis) — tilting/spinning side to side
      • Pitch (Y-axis) — nodding up and down
      • Yaw (Z-axis) — turning left and right
  • the main 3 arm types are defined by the degrees of freedom of the object being transported:

    • gantry arm → x, y, z (example: Hamilton STAR CORE gripper)
    • SCARA (Selective Compliance Assembly Robot Arm) → x, y, z + yaw (example: PF400, KX2)
    • articulated arm → x, y, z + roll + pitch + yaw (example: UR5, Kuka KR)
  • each arm brings with it 2 space systems:

    • a cartesian coordinate space (the origin of that space depends on the arm: for the PF400 it is at the x-y center of its first joint, for the KX2 we found it to be in the middle of seemingly nowhere and the KX2 can apparently not reach it :sweat_smile: )
    • a “pose space”: the list of each joint’s relative motor position (usually queryable via an absolute encoder)
  • all joints have physical limits: since cables have to run through them, many arms limit the number of rotations in any direction, providing hard constraints in “pose space”, though some joints can have limitless rotations enabled by slip rings (important knowledge for modelling the allowed rotations)
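The joint limit point could be modelled roughly like this: each joint gets a lower and upper bound in pose space, and a joint with a slip ring simply has no bounds. JointLimit and violated_joints are hypothetical names, and the numbers in any usage are made up.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class JointLimit:
    # Defaults describe an unlimited joint, e.g. one with a slip ring.
    lower: float = -math.inf
    upper: float = math.inf


def violated_joints(
    pose: Dict[int, float], limits: Dict[int, JointLimit]
) -> Dict[int, float]:
    """Return the joints (and their requested values) that fall outside their limits.

    Every joint in `pose` must have an entry in `limits`; a missing entry
    raises KeyError rather than silently allowing the move.
    """
    return {
        joint: value
        for joint, value in pose.items()
        if not (limits[joint].lower <= value <= limits[joint].upper)
    }
```

A pose is only safe to send when violated_joints returns an empty dict.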

Based on the space characteristic:
I think of arms the same way as I think of liquid handlers: devices that come with their own coordinate system and in PLR we have to define a larger coordinate system to make it easier for programming:
This could be that one device with a coordinate system (e.g. the STAR/EVO) is used to start the “workcell”, and the arm placed next to it extends the workcell space?
→ this then defines the addressable space of everything (and devices that move in the workcell space get helpers to convert workcell space into their local spaces)
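A sketch of such a helper, assuming each device frame differs from the workcell frame only by a translation and a yaw (rotation around z). DeviceFrame is a hypothetical name and the model is deliberately simplified.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec = Tuple[float, float, float]


@dataclass(frozen=True)
class DeviceFrame:
    origin: Vec            # device origin expressed in workcell coordinates
    yaw_deg: float = 0.0   # rotation of the device frame around the workcell z-axis

    def to_local(self, point: Vec) -> Vec:
        """Convert a workcell coordinate into this device's local coordinate."""
        dx = point[0] - self.origin[0]
        dy = point[1] - self.origin[1]
        dz = point[2] - self.origin[2]
        t = math.radians(self.yaw_deg)
        # Inverse rotation: undo the device's yaw.
        return (
            dx * math.cos(t) + dy * math.sin(t),
            -dx * math.sin(t) + dy * math.cos(t),
            dz,
        )

    def to_workcell(self, point: Vec) -> Vec:
        """Convert a local device coordinate into a workcell coordinate."""
        t = math.radians(self.yaw_deg)
        return (
            self.origin[0] + point[0] * math.cos(t) - point[1] * math.sin(t),
            self.origin[1] + point[0] * math.sin(t) + point[1] * math.cos(t),
            self.origin[2] + point[2],
        )
```

The device that starts the workcell would have origin (0, 0, 0) and yaw 0, and every device added next to it would get its own frame.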

I agree, the “backend” wording does confuse me here as well, and I hope we can smoothly transition to the term “driver” in its place :slight_smile:

I agree that this is the main interface to working right now… but that is not what I would suggest we base PLR’s arm development on → I would like PLR to be fully understandable to an agent.
Basing arm movement on human teaching points is incompatible with this objective imo.

Instead we should fully support human teaching, always, but ultimately work towards 2 “path finding & delivery” systems:

  1. purely based on PLR’s resource model / digital twin → if the model knows where everything is it can plan its own route
  2. based on both the resource model AND spatial fiducial markers (aruco, april tags) → in reality things move, through the expansion of metal in hot environments (like sunshine) or people knocking into the workcell (happens :eyes: ); having arms with computer vision/cameras integrated (like the PF400 and KX2, or even Formulatrix’ STACK shelving system) enables self correction.
    PLR should be built from the ground up to fully utilise this

I don’t have an answer for this one yet, partly because it depends so much on the end-effector/gripper (which is changing very easily and quickly) … but I agree that this is very very annoying: this is what a “teaching session” of the iSWAP looks like atm for me (the docs do not have an example on how to do this yet :/)

(nothing proprietary here :slight_smile: )

I agree with this:

But I believe the issue is the term “capabilities” here → you are right that the capability is “move a resource”

But the device “features” that achieve this “capability” are different: iSWAP vs CORE gripper system

@rickwierenga on ResourceHandler: it’s nicely parallel to LiquidHandler and that’s not a coincidence. On the STAR the same gantry with a tip is a liquid handler, with the CoRe gripper it becomes a resource handler. Same mover, different tool, different capability. But in the lab we just say “the arm.” I’d stick with Arm.

On the levels, here’s what I keep coming back to. When a user writes arm.move(plate, to=reader) or arm.transfer(plate, to=reader), the capability knows both the source and destination. Both have orientations in the resource model (resource.get_absolute_rotation() already exists). So the capability can figure out if rotation is needed and whether this arm supports it.

But pick_up(plate) alone doesn’t know the destination. It can’t know if it will need to rotate. So the rotation question is really a move() question, not a pick_up() question. That makes me wonder if OrientableArm as a separate class is solving the right problem, the information it needs only exists at move() time, not at the type level.
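To illustrate the idea of deciding at move() time, here is a sketch that compares the z-rotation of source and destination and only complains if a rotation is needed and the arm cannot do it. check_move and its parameters are hypothetical; in PLR the two angles would presumably come from resource.get_absolute_rotation().

```python
def check_move(
    arm_supports_rotation: bool,
    source_rotation_z: float,
    destination_rotation_z: float,
) -> float:
    """Return the z-rotation (degrees, in [0, 360)) the move requires.

    Raises ValueError if a rotation is required but the arm cannot rotate.
    """
    delta = (destination_rotation_z - source_rotation_z) % 360.0
    # Treat values within a small tolerance of a full turn as "no rotation".
    needs_rotation = min(delta, 360.0 - delta) > 1e-6
    if needs_rotation and not arm_supports_rotation:
        raise ValueError(
            f"move requires a {delta:g} degree rotation, "
            "but this arm cannot rotate resources"
        )
    return delta if needs_rotation else 0.0
```

This only covers the rotation of the resource itself; it says nothing about which side of the resource the arm should grip from.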

@CamilloMoschner driver is great! and in relation to the arm classification:

The question I have: does the user need to declare upfront whether they’re working with a gantry, a SCARA or a UR5? Or should the capability handle that? If arm.transfer(plate, to=reader) works regardless of what’s underneath, does the classification need to be in the code at all?

I am not sure I fully understand your question @vcjdeboer, but let me answer both of the interpretations I have for it:

  1. the user (a Python-capable automator) would not need to declare during runtime or programming whether they are working with a gantry or SCARA.
    In the case of the STAR the driver would just have a feature called STAR.iswap: (GantryArm, SCARA)* and one STAR.core_grippers: GantryArm - but, importantly, the user only ever needs to call STAR.iswap or STAR.core_grippers not the type.
    I think since these are functionally very different and have different safety concerns this makes sense (?), but happy to discuss whether it does or whether I am just too blindsided by doing this every day :sweat_smile:
  2. the user (PLR developer) who integrates the device will have to declare the type: for the device UR5.arm: ArticulatedArm (could also be a welding machine), PF400.arm: SCARA (could also use its computer vision system/barcode reader like PF400.barcode_reader)
  • core grippers do not. it is on a cartesian chassis
  • the iswap has two joints, but also runs on a chassis
  • I am guessing iswap?

for purely Cartesian arms like the core grippers, these spaces are actually the same …

(there might still be a translation and scaling wrt PLR of course, but such an arm does not have a dedicated joint space - its dimensions are always cartesian)

yes, of course we should support these things

but the reality is that unless you are using spatial fiducial markers, you will need to teach=calibrate locations. Currently for the pf400 we just do it in joint space since it’s the best it supports, and it’s kind of a pain, but it’s super fast. The annoying part is offsets in xy are not intuitive (z is a joint so it is intuitive, handy for grip height and lidding and such…) The point I was making is that teaching positions should be easy, and that there is a difference between gripper-location and gripped-resource-location.

they do, linked in my original post

but what does that mean :sweat_smile:

arm.move(plate, to=reader) or arm.transfer(plate, to=reader)

while that would be easiest, some arms need to know where on the resource to grip it. Arms that have rotating end effectors need to be told to grip the plate from the front/right/back/left; it is a required parameter. But this parameter has no meaning for cartesian/gantry arms.

the resource’s rotation, yes clearly

but I am talking about arm wrt resource (or arm wrt its base)

it can’t work like that, see above

also: the base arm class can’t have a get_joint_positions method since some arms do not have joints. It is possible for them to just raise errors in that case, but this is the pattern I am trying to move away from. Capabilities should not declare methods that can’t actually run.

the distinction between SCARA, Cartesian and articulated does not make sense on the capability level. See my original post for that. Articulated is essentially SCARA in what it can accept in terms of parameters (only xy rotations need to be forced to 0); there could be an argument to split them tbh, but it is slightly weaker. Importantly, there is a level in between SCARA and Cartesian. Cartesian = gantry arm does make sense, it is the base case.

I found it helpful to imagine an arm like core grippers that is mounted on a 3d gantry but able to rotate its gripper. Imagine we come across such an arm in the future, how to model it?

and then you realize the iswap is essentially that.

correct on who thinks about type of arm

although for the user, some functions they write will need to take in particular subclasses of an arm in case that function requires specific functionalities. Imagine a rotate function that uses an arm, passing a core gripper should give a typing error for that. That is because some methods of some arms will have more parameters than others, like rotation or get_joint_position.
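A small sketch of that typing argument, with hypothetical, heavily simplified classes that only return strings. The point is the signature of pick_up_rotated: it asks for an OrientableArm, so a static type checker such as mypy would reject a plain Arm.

```python
from typing import Dict


class Arm:
    def pick_up(self, location: str) -> str:
        return f"pick up at {location}"


class OrientableArm(Arm):
    def pick_up(self, location: str, rotation_z: float = 0.0) -> str:
        return f"pick up at {location}, rotated {rotation_z} degrees"


class JointArm(OrientableArm):
    def get_joint_position(self) -> Dict[int, float]:
        # Only arms that actually have joints declare this method.
        return {1: 0.0, 2: 0.0}


def pick_up_rotated(arm: OrientableArm, location: str) -> str:
    """User-level helper that needs rotation, so it requires an OrientableArm."""
    return arm.pick_up(location, rotation_z=90.0)


# pick_up_rotated(Arm(), "plate_site")
# A type checker flags the line above: Arm is not an OrientableArm.
# At runtime it would fail too, because Arm.pick_up has no rotation_z parameter.
```

Because the base Arm never declares get_joint_position, there is also no method that exists only to raise an error.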

Though my statement here was oversimplified, the “CORE gripper system” does have joints: linear rails are prismatic joints (that is actually what defines a gantry robot).

The iSWAP is a combination of 2 different arms:

  1. a gantry arm (enabling linear movement in x-y of the iSWAP channel)
    AND

  2. a SCARA with 2 revolute joints AND an end-effector that is a gripper (enabling linear movement in z, and rotational movement in x-y)

This is almost identical to the PF400 (the SCARA) + its linear rail (a simplified gantry robot that moves the SCARA linearly in 1 dimension).


But another common robotics concept that I believe is needed here and has so far not been used here is the clear distinction between “end-effectors” and “arms”:

End-effectors are the parts of the robot that directly interact with the world/objects (grippers, suction cups, tools like welding torches, etc.).

Arms (or more broadly, the kinematic chain/chassis) are responsible purely for positioning and orienting the end-effector in space.

This distinction matters because:

  • The CORE gripper system’s gripper is the end-effector, the gantry robot/Cartesian chassis is the arm
  • The iSWAP’s gripper is the end-effector, the gantry + SCARA combination is the arm
  • The PF400’s gripper is the end-effector, the SCARA + linear rail is the arm

Another example on a familiar device to give some context of end effectors:
Hamilton STAR’s Lid Tool Picking (which I would just call a “suction gripper/end-effector”)

This is an arm too:
feature = gantry arm + end-effector | capability = move resource from source to destination location, just that the end-effector is not 2 channels with grippers squeezing the transfer-object but instead the end-effector is 1 channel with a suction adapter that creates negative pressure inside its piston chamber to pick up the transfer-object.
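The arm / end-effector split could be written down as plain data, for example as in the sketch below. The enum and class names are hypothetical; the four entries just restate the examples from this post.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ArmKind(Enum):
    GANTRY = "gantry"
    SCARA = "scara"
    ARTICULATED = "articulated"


class EndEffectorKind(Enum):
    GRIPPER = "gripper"
    SUCTION = "suction"


@dataclass(frozen=True)
class ResourceMovingFeature:
    """A device feature: a kinematic chain (one or more arms) plus an end-effector."""

    name: str
    arm: Tuple[ArmKind, ...]
    end_effector: EndEffectorKind


FEATURES = [
    ResourceMovingFeature("CORE gripper system", (ArmKind.GANTRY,), EndEffectorKind.GRIPPER),
    ResourceMovingFeature("iSWAP", (ArmKind.GANTRY, ArmKind.SCARA), EndEffectorKind.GRIPPER),
    # The PF400's linear rail is treated here as a simplified gantry.
    ResourceMovingFeature("PF400", (ArmKind.SCARA, ArmKind.GANTRY), EndEffectorKind.GRIPPER),
    ResourceMovingFeature("STAR lid tool", (ArmKind.GANTRY,), EndEffectorKind.SUCTION),
]
```

All four entries provide the same capability (move a resource from source to destination); only the arm and the end-effector differ.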

This then also answers this question:

With a separation between arm and end-effector. I believe you are talking about the Fluent’s robotic gripper arm (RGA), which is exactly that: a gantry arm (x-y-z movement) + a rotating end-effector (Yaw / z-rotation).

This is what the ISO 8373:2021(en) Robotics Vocabulary was created for; we should use the decades of work they have already done for us. :slight_smile:
