Senior Software Engineer

Senior Software Engineer - CUDA

No longer accepting applications

Overview

Job Type: On-site

Experience: 5 years

Job Position: Embedded

Updated: Apr 14, 2026

Location: Herzliya

Salary: N/A

Skills

C++ (5 years)
Python
Docker
CUDA
Embedded Linux
GPU-accelerated services
NVIDIA Jetson
NVIDIA Nsight
ONNX Runtime
ROS2
TensorRT
Triton

Mentee Robotics is redefining humanoid automation with an AI-first approach. We integrate cutting-edge perception, reasoning, and dexterous manipulation into a fully autonomous humanoid robot that continuously adapts and learns. Our flagship product, Menteebot v3, is designed to perform complex tasks with human-like adaptability across industrial, logistics, and retail environments.

We are looking for a Senior Software Engineer to join our software team. In this role, you will be responsible for the high-performance software layer that bridges advanced AI models with physical robotic execution. Your work will focus on designing and implementing the core services responsible for real-time edge AI inference, ensuring that our systems process sensor data and execute commands with minimal latency and maximum reliability.

What You Will Do

Design & Optimize: Develop production-grade software in C++ and Python, specifically tailored for real-time inference and low-latency execution.
Edge AI Orchestration: Build and maintain the services that deploy and run neural networks directly on the robot’s edge hardware.
Sensor Integration: Develop robust pipelines to process high-frequency sensor data streams for real-time robotic perception.
Architect for Reliability: Create modular, well-architected components that ensure the robot remains stable and maintainable in complex, dynamic environments.
Cross-Functional Collaboration: Partner with AI researchers and hardware engineers to deploy and accelerate deep learning models on the edge.

Requirements:

5+ years of Software Engineering experience, with a strict focus on modern C++.
Proven, hands-on expertise in writing, profiling, and optimizing CUDA code for high-performance edge computing.
Deep understanding of modern C++ standards, memory management, concurrency, and parallelism. Extensive knowledge of Python is also a strict requirement.
Deep knowledge of developing, debugging, and profiling within embedded Linux environments.
Experience building highly reliable, production-grade software.

Advantages:

Familiarity with inference frameworks like Triton, TensorRT, or ONNX Runtime.
Experience using NVIDIA Nsight to analyze performance in depth and pinpoint execution bottlenecks.
Practical experience with the ROS2 ecosystem.
Expertise in GPU-accelerated services and zero-copy mechanisms to minimize data transfer overhead.
Experience with NVIDIA Jetson or similar embedded edge compute modules.
Experience with containerization (Docker) tailored for embedded environments.
