A Master's thesis entitled “Stereo Pointclouds for Safety Monitoring of Port Environments” has been completed, as part of the MOSES project, by Mrs. Femke Middelhoek within the Programme of Mechanical Engineering, Vehicle Engineering, Cognitive Robotics at Delft University of Technology.

The file is accessible here.


The MOSES project develops an autonomous vessel equipped with an autonomous crane to optimise the supply chain of short sea shipping. This study focuses on monitoring the safety of the port environment using stereo camera data from sensors mounted on the crane at a height of 15 m and oriented 45° downward. The objective is to detect individuals and estimate their motion. Semi-Global Block Matching is implemented to generate stereo pointclouds (pointclouds derived from the disparity image and the stereo camera calibration information). Voxel-averaged downsampling of the stereo pointcloud is performed to make the data more compatible with CenterPoint. Background subtraction is implemented with Gaussian Mixture Models (GMMs). The study proposes a novel implementation that fits a GMM to per-point 3D spatial (xyz) and color information for enhanced background-foreground segmentation of the stereo pointclouds. 3D object detection and velocity prediction are based on CenterPoint, customised to take color features into account. The result is a robust detection pipeline with a top performance of 81.5% mAP, 4% Average Orientation Error and 9.4% Average Velocity Error on a simulated dense port environment dataset. Background subtraction is implemented to improve cross-environment generalisation, an important feature for MOSES considering the mobile nature of the vessel and the likelihood that it will operate in unseen environments. Voxel-averaged downsampling of the stereo pointcloud supports this further by creating a uniform data structure, which facilitates the transfer of learnt features to previously unobserved scenes. Including color information of the current frame reduces the impact of spatial uncertainty of the stereo pointcloud. It improves detection performance, particularly when excluding the color information of the temporal reference frames included for velocity prediction.
The transferability of the pipeline developed in simulation to reality is demonstrated on a basic real-world scenario.
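
To illustrate the voxel-averaged downsampling step mentioned in the abstract, the following is a minimal sketch in Python and not the implementation used in the thesis. It assumes each point is a tuple (x, y, z, r, g, b). The function name `voxel_average`, the point layout and the voxel size are hypothetical choices made for this example. All points falling into the same cubic voxel are replaced by a single point carrying their mean position and mean color, which yields at most one point per occupied voxel.

```python
from collections import defaultdict
from math import floor


def voxel_average(points, voxel_size):
    """Downsample a colored pointcloud by averaging all points per voxel.

    points: iterable of (x, y, z, r, g, b) tuples (assumed layout).
    voxel_size: edge length of a cubic voxel, in the same unit as x, y, z.
    Returns a dict mapping the integer voxel index (i, j, k) to the
    averaged (x, y, z, r, g, b) tuple of the points inside that voxel.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    # Per voxel: running sums of the six features plus a point counter.
    buckets = defaultdict(lambda: [0.0] * 6 + [0])
    for p in points:
        # Integer voxel index from the spatial coordinates only.
        key = tuple(floor(c / voxel_size) for c in p[:3])
        acc = buckets[key]
        for i in range(6):
            acc[i] += p[i]
        acc[6] += 1

    # Divide the sums by the point count to get the per-voxel mean.
    return {
        key: tuple(total / acc[6] for total in acc[:6])
        for key, acc in buckets.items()
    }
```

For example, with a hypothetical voxel size of 0.5 m, two nearby points such as (0.1, 0.1, 0.1) and (0.3, 0.3, 0.3) fall into the same voxel and are merged into one point at their mean position with their mean color, while a point at (1.2, 0.1, 0.1) stays in its own voxel.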