Computer vision

Innovation to enhance the computer’s comprehension of the visual world

How is computer vision going to help us?

A construction site is a complex system where materials, people and logistics interact, making the site an extremely noisy and cluttered environment. By its nature, a construction site can be categorized as neither an indoor nor an outdoor environment, so it is affected by the peculiarities of both. Bad illumination, lack of texture and weather conditions are just some of the artifacts that any perceptual system needs to contend with on a construction site. This makes the visual data captured on site incredibly challenging to deal with. We utilize advanced computer vision algorithms that leverage both texture and motion cues to extract semantic and geometric signals from a construction site, allowing us to filter out the noise in the data.

Noise and clutter in a typical construction environment

Due to the inherent complexity of the construction environment, at Naska.AI we use the 4D BIM as the reference signal for the data captured onsite. The 4D BIM can be discretized along the temporal axis to extract 3D models that provide individual reference geometries for onsite observations. Onsite observations are recorded through SLAM- or SfM-based reality capture devices.
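
As a rough illustration of what discretizing a 4D model along the time axis can mean, the sketch below selects the elements that a schedule expects to exist on a given date. The `ScheduledElement` record, its fields, the sample data and the `snapshot` function are all hypothetical names chosen for this example; they are not Naska.AI's data model or the API of any BIM library.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ScheduledElement:
    """Hypothetical record: one model element plus its planned dates."""
    element_id: str
    installed_on: date                 # planned date the element is in place
    removed_on: Optional[date] = None  # only set for temporary works


def snapshot(elements, when):
    """Return the ids of elements expected to be present on `when`."""
    return [
        e.element_id
        for e in elements
        if e.installed_on <= when
        and (e.removed_on is None or when < e.removed_on)
    ]


# Made-up schedule used only to exercise the function.
model = [
    ScheduledElement("slab-L1", date(2024, 1, 20)),
    ScheduledElement("wall-L1-A", date(2024, 3, 1)),
    ScheduledElement("scaffold-1", date(2024, 2, 15), date(2024, 4, 10)),
]

for day in (date(2024, 2, 1), date(2024, 3, 15), date(2024, 5, 1)):
    print(day, snapshot(model, day))
```

Each snapshot is the subset of the model whose geometry would serve as the reference for a capture taken on that date.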

Computer vision research at Naska.AI

At Naska.AI we utilize various advanced computer vision and graphics algorithms to deal with the noise inherent to construction reality capture data. Photogrammetry techniques like SLAM and structure from motion (SfM) help us build accurate representations of the built environment. Combined with inverse rendering algorithms, these techniques allow us to accurately distinguish between the various artifacts observed on site, such as clutter, temporary works, moving objects and transient weather, and the built structure itself.

Colored 3D reconstruction of a building

Understanding the semantics of the built environment allows us to efficiently distinguish between signal and noise in the captured data. One of the advantages of working in a construction setting is that a subset of these semantics is readily available in the Building Information Model (BIM). Utilizing advances in 2D and 3D semantic segmentation, we efficiently combine onsite observations with the modeled semantics to filter out noise from the data.
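
One simple way to picture combining observations with modeled semantics is nearest-neighbor label transfer: each observed 3D point takes the label of the closest reference point sampled from the model, and points with no reference within a tolerance are treated as noise. This is a minimal sketch under that assumption, not a description of the production pipeline; `label_points`, the sample coordinates and the 5 cm tolerance are illustrative.

```python
import math


def label_points(points, reference, tolerance):
    """Assign each observed point the label of its nearest reference point.

    `reference` is a list of ((x, y, z), label) pairs sampled from the model.
    Points farther than `tolerance` from every reference point get None,
    which this sketch interprets as clutter or noise.
    """
    labels = []
    for p in points:
        best_label, best_dist = None, tolerance
        for q, label in reference:
            d = math.dist(p, q)
            if d <= best_dist:
                best_label, best_dist = label, d
        labels.append(best_label)
    return labels


# Made-up data: two model samples and three observed points (metres).
reference = [((0.0, 0.0, 0.0), "slab"), ((1.0, 0.0, 0.0), "wall")]
observed = [(0.0, 0.0, 0.01), (1.02, 0.0, 0.0), (5.0, 5.0, 5.0)]

labels = label_points(observed, reference, tolerance=0.05)
print(labels)
```

The brute-force search is quadratic; a real point cloud would need a spatial index such as a k-d tree.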

Semantic segmentation

Assigning a semantic label to each pixel of an image for a detailed understanding of its contents.

3D reconstruction

Creating 3D models from 2D images or point clouds by algorithmically extracting geometric information.

SLAM


Real-time mapping and localization in unknown environments combining data from various sources.

Ongoing projects

SLAM


Robust localization and mapping capabilities are fundamental on construction sites, which are cluttered, feature-poor environments whose appearance changes significantly over time.

We approach this problem with two odometry systems that can operate independently: visual-inertial and LiDAR-inertial. These are loosely coupled through a pose graph representation that enables late fusion of pose measurements, making the system resilient to visual and geometric degeneracy.
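
To give an intuition for late fusion, consider a deliberately simplified case: a one-dimensional trajectory, no loops, and independent Gaussian errors. Under those assumptions the least-squares solution of the pose graph reduces to an inverse-variance weighted mean of the two odometry increments on each edge. The function names and the numbers below are illustrative only; a real system works with full 6-DoF poses and covariance matrices.

```python
def fuse_increment(measurements):
    """Fuse (delta, variance) pairs with inverse-variance weighting.

    Returns the fused delta and its variance.
    """
    weights = [1.0 / var for _, var in measurements]
    total = sum(weights)
    fused = sum(w * d for w, (d, _) in zip(weights, measurements)) / total
    return fused, 1.0 / total


def integrate(edges, start=0.0):
    """Chain fused increments into a 1D trajectory starting at `start`."""
    poses = [start]
    for measurements in edges:
        delta, _ = fuse_increment(measurements)
        poses.append(poses[-1] + delta)
    return poses


# Each edge holds (delta, variance) from two hypothetical odometry sources.
edges = [
    [(1.0, 0.01), (1.2, 0.01)],   # both sources healthy: plain average
    [(1.0, 0.01), (3.0, 100.0)],  # second source degenerate: nearly ignored
]

poses = integrate(edges)
print(poses)
```

Because a degenerate source reports a large variance, its measurement contributes almost nothing to the fused increment, which is the behavior the loose coupling is meant to provide.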

By combining our deep multi-modal feature extraction with a global optimization system that reduces long-term localization errors, we are able to achieve state-of-the-art performance.
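
The effect of global optimization on long-term error can be sketched with a one-dimensional loop closure. Assume n odometry increments with equal variance and one loop-closure measurement of the total displacement. Minimizing the weighted squared residuals then shifts every increment by the same amount, -r * odom_var / (n * odom_var + loop_var), where r is the gap between the summed odometry and the loop measurement. The function and values below are a toy model under those assumptions, not the optimizer described above.

```python
def close_loop(odometry, loop_measurement, odom_var, loop_var):
    """Redistribute drift over a 1D trajectory using one loop closure.

    odometry: list of increments, all with variance `odom_var`.
    loop_measurement: measured displacement from first to last pose,
    with variance `loop_var`.
    Returns the corrected poses, starting at 0.0.
    """
    n = len(odometry)
    residual = sum(odometry) - loop_measurement
    correction = -residual * odom_var / (n * odom_var + loop_var)
    poses = [0.0]
    for delta in odometry:
        poses.append(poses[-1] + delta + correction)
    return poses


# Made-up walk out and back: the device returns to its start, but
# dead reckoning has accumulated 0.2 m of drift.
odometry = [1.0, 1.0, -0.9, -0.9]
drifted_end = sum(odometry)
poses = close_loop(odometry, loop_measurement=0.0,
                   odom_var=0.01, loop_var=1e-6)
print(drifted_end, poses[-1])
```

With a confident loop closure, the end pose is pulled back to almost exactly the start, and the correction is spread evenly over the trajectory rather than applied at the last pose.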

Semantic Segmentation

Semantic segmentation allows us to understand the built structure without prior knowledge. Combined with information from the BIM, it allows us to build accurate representations of the built environment in the presence of noise and clutter.
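
As a small, hedged example of the idea, the sketch below turns per-pixel class scores into labels by taking the highest-scoring class, then compares them with the labels the BIM would predict for the same view. The class names, the scores and the assumption that a BIM-rendered label image is available are all illustrative; no particular segmentation network or renderer is implied.

```python
def argmax_labels(scores, class_names):
    """Pick the highest-scoring class for every pixel.

    scores[r][c] is a list with one score per entry in `class_names`.
    """
    return [
        [class_names[max(range(len(px)), key=px.__getitem__)] for px in row]
        for row in scores
    ]


def agreement_mask(predicted, expected):
    """True where the predicted label matches the label expected from the BIM."""
    return [
        [p == e for p, e in zip(p_row, e_row)]
        for p_row, e_row in zip(predicted, expected)
    ]


class_names = ["wall", "floor", "clutter"]

# Made-up 2x2 image of per-class scores.
scores = [
    [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
    [[0.2, 0.6, 0.2], [0.1, 0.8, 0.1]],
]
# Hypothetical labels rendered from the model for the same camera pose.
expected = [["wall", "wall"], ["floor", "floor"]]

predicted = argmax_labels(scores, class_names)
mask = agreement_mask(predicted, expected)
print(predicted)
print(mask)
```

Pixels where the mask is False are candidates for clutter, temporary works, or work that deviates from the model, and can be handled separately from the agreed-upon structure.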