AI Robot Vision
Perception software for warehouse automation, built on deep learning and classical computer vision.
Introduced AI Vision System for Robotic Depalletization @Mujin
- Trained, optimized, and deployed an instance segmentation model, increasing the production rate by 50%. The model exceeded the mAP of Mujin’s existing vision system by 60% and its detection speed by 800%.
- Collected, distilled, and annotated a large warehouse dataset of 20,000 images, using CVAT, Datumaro, SAM, and Voxel51 to curate the data and refine the annotations.
- Developed auto-annotation software that automatically generates ground-truth segmentation datasets with 94% mAP.
Package Condition Sensitive Point Cloud Filtering
- Implemented a package condition (damaged or tilted) estimation algorithm for cardboard boxes and packs of cans.
- Developed 3D point cloud filtering algorithms for damaged and tilted items using OpenCV and PCL.
- Integrated the vision algorithms into robotic systems at Walmart warehouses, enabling the handling of damaged and tilted items.
Skills/Tools: PyTorch, OpenCV, PCL, Open3D, ONNX, CVAT, Python, Linux, Git, Voxel51, Datumaro.
I joined Mujin-US in 2023 as the first Computer Vision Engineer. For the decade before I joined, Mujin had been a 100% geometric computer vision company: no machine learning, no deep learning. Previous attempts at using deep learning for vision had been unsuccessful. However, it had become abundantly clear to Mujin that geometric computer vision alone was struggling to keep up with the endless complexities and chaotic realities of warehouses.


I am an avid follower of Dr. Andrew Ng, one of the pioneers of deep learning. Equipped with his teachings from the Deep Learning Specialization and Machine Learning in Production courses, I was well positioned at Mujin. Within a few months, I introduced the first AI vision system to outperform Mujin’s geometric vision system of the past decade in both speed and detection accuracy. The new system was nearly 99% accurate while being nearly 10x faster than the existing vision system. Currently, I am leading the R&D of a hybrid vision system that takes advantage of the unique strengths of both AI and geometric vision.


