Company Raises $675M to Develop Humanoid Robot

Today, Figure is announcing an astonishing US $675 million Series B raise, which values the company at an even more astonishing $2.6 billion. Figure is one of the companies working towards a multi or general purpose (depending on who you ask) bipedal or humanoid (depending on who you ask) robot. The astonishing thing about this valuation is that Figure’s robot is still very much in the development phase—although they’re making rapid progress, which they demonstrate in a new video posted this week.

This round of funding comes from Microsoft, OpenAI Startup Fund, Nvidia, Jeff Bezos (through Bezos Expeditions), Parkway Venture Capital, Intel Capital, Align Ventures, and ARK Invest. Figure says that they’re going to use this new capital “for scaling up AI training, robot manufacturing, expanding engineering headcount, and advancing commercial deployment efforts.” In addition, Figure and OpenAI will be collaborating on the development of “next generation AI models for humanoid robots” which will “help accelerate Figure’s commercial timeline by enhancing the capabilities of humanoid robots to process and reason from language.”

As far as that commercial timeline goes, here’s the most recent update:

Figure

And to understand everything that’s going on here, we sent a whole bunch of questions to Jenna Reher, Senior Robotics/AI Engineer at Figure.

What does “fully autonomous” mean, exactly?

Jenna Reher: In this case, we simply put the robot on the ground and hit go on the task with no other user input. What you see is using a learned vision model for bin detection that allows us to localize the robot relative to the target bin and get the bin pose. The robot can then navigate itself to within reach of the bin, determine grasp points based on the bin pose, and detect grasp success through the measured forces on the hands. Once the robot turns and sees the conveyor the rest of the task rolls out in a similar manner. By doing things in this way we can move the bins and conveyor around in the test space or start the robot from a different position and still complete the task successfully.

How many takes did it take to get this take?

Reher: We’ve been running this use case consistently for some time now as part of our work in the lab, so we didn’t really have to change much for the filming here. We did two or three practice runs in the morning and then three filming takes. All of the takes were successful, so the extras were to make sure we got the cleanest one to show.

What’s back in the Advanced Actuator Lab?

Reher: We have an awesome team of folks working on some exciting custom actuator designs for our future robots, as well as supporting and characterizing the actuators that went into our current robots.

That’s a very specific number for “speed vs human.” Which human did you measure the robot’s speed against?

Reher: We timed Brett [Adcock, founder of Figure] and a few poor engineers doing the task and took the average to get a rough baseline. If you are observant, that seemingly over-specific number is just saying we’re at 1/6 human speed. The main point that we’re trying to make here is that we are aware we are currently below human speed, and it’s an important metric to track as we improve.

What’s the tether for?

Reher: For this task we currently process the camera data off-robot while all of the behavior planning and control happens onboard in the computer that’s in the torso. Our robots should be fully tetherless in the near future as we finish packaging all of that onboard. We’ve been developing behaviors quickly in the lab here at Figure in parallel to all of the other systems engineering and integration efforts happening, so hopefully folks notice all of these subtle parallel threads converging as we try to release regular updates.

How the heck do you keep your robotics lab so clean?

Reher: Everything we’ve filmed so far is in our large robot test lab, so it’s a lot easier to keep the area clean when people’s desks aren’t intruding in the space. Definitely no guarantees on that level of cleanliness if the camera were pointed in the other direction!

Is the robot in the background doing okay?

Reher: Yes! The other robot was patiently standing there in the background, waiting for the filming to finish up so that our manipulation team could get back to training it to do more manipulation tasks. We hope we can share some more developments with that robot as the main star in the near future.

What would happen if I put a single bowling ball into that tote?

Reher: A bowling ball is particularly menacing to this task primarily due to the moving mass, in addition to the impact if you are throwing it in. The robot would in all likelihood end up dropping the tote, stay standing, and abort the task. With what you see here, we assume that the mass of the tote is known a-priori so that our whole body controller can compensate for the external forces while tracking the manipulation task. Reacting to and estimating larger unknown disturbances such as this is a challenging problem, but we’re definitely working on it.

Tell me more about that very zen arm and hand pose that the robot adopts after putting the tote on the conveyor.

Reher: It does look kind of zen! If you re-watch our coffee video you’ll notice the same pose after the robot gets things brewing. This is a reset pose that our controller will go into between manipulation tasks while the robot is awaiting commands to execute either an engineered behavior or a learned policy.

Are the fingers less fragile than they look?

Reher: They are more robust than they look, but not impervious to damage by any means. The design is pretty modular which is great, meaning that if we damage one or two fingers there is a small number of parts to swap to get everything back up and running. The current fingers won’t necessarily survive a direct impact from a bad fall, but can pick up totes and do manipulation tasks all day without issues.

Is the Figure logo footsteps?

Reher: One of the reasons I really like the figure logo is that it has a bunch of different interpretations depending on how you look at it. In some cases it’s just an F that looks like a footstep plan rollout, while some of the logo animations we have look like active stepping. One other possible interpretation could be an occupancy grid.