Monocular Vision
Location Tracking and Positioning Technology Using Monocular Vision
Monocular Vision is a computer vision technology that acquires and analyzes images using a single camera. Stereo Vision, in contrast, uses two cameras to measure depth and uses that depth information for positioning.
What is Monocular Vision?
Monocular vision obtains 2D image information from a single camera. Because it works with an ordinary camera, it is simple to implement and is widely used across industries, in fields such as object recognition and tracking, autonomous driving, and robotics. In object recognition, for example, deep learning algorithms that recognize and track objects across monocular image sequences have become common. In autonomous driving, monocular vision is used to recognize lanes, traffic lights, signs, and pedestrians on the road, and to estimate the position and speed of surrounding vehicles for safe operation.
Features of RTLS Using Monocular Vision
Because Monocular Vision technology uses a single camera, the cost of building the system is relatively low. Monocular cameras have a lower unit price than stereo cameras, which makes RTLS more affordable to deploy in large-scale installations or across wide areas. Combined with machine learning, the system estimates location based on previously learned data, which supports high accuracy, and recent deep learning techniques allow training even with small amounts of data, reducing construction and operational costs. RTLS using Monocular Vision also does not require tags, resulting in significant cost savings compared to traditional RTLS. Without separate tag terminals, installation and operation are simple and compatibility with existing facilities is high, making adoption easy. In addition, no equipment (anchors) has to be installed to receive signals from tags, which frees up space and can improve the efficiency of the facility.
How does positioning using Monocular Vision work?
Object Detection
Vision-based RTLS uses deep learning object recognition to track targets in images. The recognition model is an AI algorithm trained to identify and classify the objects within an image. It currently recognizes approximately 80 types of objects, and additional training can be conducted upon customer request to recognize more.
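As a minimal sketch of how detection results might feed into positioning, the example below filters detections by class and confidence and takes the bottom center of each bounding box as the point where the object touches the ground. The `Detection` structure, the `ground_points` helper, the 0.5 confidence threshold, and the bottom-center convention are all assumptions made for illustration; the detector itself is not shown, and this is not a description of ORBRO's implementation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector output (hypothetical structure, pixel coordinates)."""
    label: str
    confidence: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def ground_points(detections, wanted_labels, min_confidence=0.5):
    """Return (label, (u, v)) for each kept detection.

    (u, v) is the bottom center of the bounding box, used here as an
    assumed approximation of where the object meets the floor.
    """
    points = []
    for det in detections:
        if det.label not in wanted_labels:
            continue
        if det.confidence < min_confidence:
            continue
        u = (det.x_min + det.x_max) / 2.0
        v = det.y_max
        points.append((det.label, (u, v)))
    return points


# Made-up detections for a single frame.
detections = [
    Detection("person", 0.91, 100.0, 50.0, 140.0, 210.0),
    Detection("car", 0.88, 300.0, 120.0, 460.0, 220.0),
    Detection("person", 0.30, 500.0, 60.0, 530.0, 200.0),
]

print(ground_points(detections, {"person"}))
```

Only the first person is kept: the car is not a requested class, and the second person falls below the confidence threshold.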
Perspective Transform
A perspective transform converts specific points on a 2D image into real-world coordinates. To achieve this, a camera matrix is constructed from the camera's intrinsic (internal) and extrinsic (external) parameters, and this matrix is used to transform points from the 2D image into points in actual space. Because a single image carries no depth information, the transform is typically defined for points on a known plane, such as the floor.
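The transform can be sketched as follows, assuming the tracked point lies on a flat floor (world Z = 0) and the lens has no distortion. Under those assumptions the camera matrix reduces to a 3x3 homography H = K [r1 r2 t], and inverting it maps a pixel back to floor coordinates. The names `K`, `R`, `t`, the helper functions, and every numeric value below are illustrative assumptions (a camera looking straight down from 3 m), not calibration data from the source. Plain Python is used so the sketch runs without dependencies.

```python
def mat_mul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_3x3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def floor_homography(K, R, t):
    """Homography taking floor points (X, Y, 1) to homogeneous pixels.

    Valid only for points with world Z = 0: the third column of R drops out.
    """
    r1_r2_t = [[R[row][0], R[row][1], t[row]] for row in range(3)]
    return mat_mul(K, r1_r2_t)


def pixel_to_floor(H, u, v):
    """Map pixel (u, v) to floor coordinates (X, Y) in world units."""
    H_inv = inverse_3x3(H)
    x, y, w = (H_inv[r][0] * u + H_inv[r][1] * v + H_inv[r][2] for r in range(3))
    return (x / w, y / w)


# Assumed intrinsics: 800 px focal length, principal point at (640, 360).
K = [[800.0, 0.0, 640.0],
     [0.0, 800.0, 360.0],
     [0.0, 0.0, 1.0]]
# Assumed extrinsics: camera 3 m above the floor, looking straight down.
R = [[1.0, 0.0, 0.0],
     [0.0, -1.0, 0.0],
     [0.0, 0.0, -1.0]]
t = [0.0, 0.0, 3.0]

H = floor_homography(K, R, t)
print(pixel_to_floor(H, 1040.0, 360.0))
```

With these assumed parameters, a pixel 400 px to the right of the image center corresponds to a floor point about 1.5 m from the spot directly below the camera. In a real deployment the homography would come from calibration rather than hand-written values.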
How does Monocular Vision-based RTLS differ from Stereo Vision-based RTLS?
Monocular Vision and Stereo Vision operate in fundamentally different ways. Monocular Vision tracks an object's position with a single camera, while Stereo Vision uses two cameras and tracks the object's position from the parallax between their two views. Monocular Vision can be implemented at a relatively low cost and is easy to install and maintain. However, because it tracks positions using only the 2D images provided by the camera, its accuracy and reliability may be relatively lower. Stereo Vision, on the other hand, has higher overall costs, including adoption and installation, because of its hardware complexity, and its software processing is also more complicated. Even for the same scene, Stereo Vision RTLS requires more computing resources than Monocular Vision RTLS because it must calculate parallax. The choice between Monocular Vision and Stereo Vision for an RTLS implementation should therefore take the intended use and budget into account.
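To make the parallax step concrete, the sketch below applies the standard relation for a rectified stereo pair, depth = focal length x baseline / disparity. The function name and the example values (800 px focal length, 0.12 m baseline, 32 px disparity) are assumptions for illustration only. Finding the disparity for every pixel requires matching the two images against each other, which is the extra computation that a monocular system avoids.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in meters of a point seen by a rectified stereo pair.

    disparity_px is the horizontal shift of the same point between the
    left and right images; it must be positive.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Assumed example values, not taken from any specific camera.
print(depth_from_disparity(800.0, 0.12, 32.0))
```

With these numbers the point is roughly 3 m away. Halving the disparity doubles the estimated depth, which is why stereo depth becomes less precise for distant objects.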
Key Advantages of Monocular Vision
Relatively Inexpensive
Since Monocular Vision uses only one camera, the cost of hardware configuration is low. Furthermore, because it uses a single image sensor, software processing is relatively simple, and it can be implemented with low-spec hardware due to lower computing power requirements.
Fewer Installation Constraints
Monocular Vision can track the position and movement of objects with few constraints on where the camera is installed. As a result, it applies to a diverse range of fields, including indoor location tracking, pedestrian detection and tracking for autonomous vehicles, and player tracking in sports matches.
Diverse Applications
Since this technology can track objects with just one camera, installation and maintenance are simple and costs are relatively low. Consequently, it can be applied to smart homes, robots, and automobiles used in daily life.
ORBRO Solutions Implemented with Monocular Vision
Precisely connect real-time location data of assets and personnel with Monocular Vision-based RTLS, and explore representative solutions for each site.

Manage parking lot traffic flow in real time to reduce congestion
Use AI technology optimized for parking lots to monitor vehicle locations and space efficiency in real time and adjust parking flow.
Detect pedestrian movement in real time to prevent intersection accidents
Identify pedestrians around crosswalks with AI-based detection technology and automatically warn vehicles of dangerous situations.
Automatically count steel asset inventory and analyze the flow
Automatically record the hourly inventory of coils and plates with AI and visually analyze asset flow.
Detect dangerous behavior in real time for immediate school safety response
Detect signs of violence or falls among students in real time through AI video analysis and send immediate alerts to administrators.