Computer Vision 
Computer vision is the science and technology of machines that see. As a scientific discipline, computer vision is concerned with the theory for building artificial systems that obtain information from images. The image data can take many forms, such as a video sequence or views from multiple cameras.
The organization of a computer vision system is highly application dependent. Some systems are stand-alone applications that solve a specific measurement or detection problem, while others constitute a sub-system of a larger design which, for example, also contains sub-systems for control of mechanical actuators, planning, information databases, man-machine interfaces, etc. The specific implementation of a computer vision system also depends on whether its functionality is pre-specified or whether some part of it can be learned or modified during operation. There are, however, typical functions that are found in many computer vision systems.
- Image acquisition: A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (grey images or colour images).
- Pose estimation: Estimating the position or orientation of a specific object relative to the camera. An example application of this technique is assisting a robot arm in retrieving objects from a conveyor belt on an assembly line.
- Pre-processing: Before a computer vision method can be applied to image data to extract some specific piece of information, it is usually necessary to process the data to ensure that it satisfies certain assumptions implied by the method. Examples are:
  - Re-sampling to ensure that the image coordinate system is correct.
  - Noise reduction to ensure that sensor noise does not introduce false information.
  - Contrast enhancement to ensure that relevant information can be detected.
  - Scale-space representation to enhance image structures at locally appropriate scales.
- Feature extraction: Image features at various levels of complexity are extracted from the image data.
- Detection/Segmentation: At some point in the processing, a decision is made about which image points or regions of the image are relevant for further processing. Examples are:
  - Selection of a specific set of interest points.
  - Segmentation of one or multiple image regions which contain a specific object of interest.
- High-level processing: At this step the input is typically a small set of data, for example a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example:
  - Verification that the data satisfy model-based and application-specific assumptions.
  - Estimation of application-specific parameters, such as object pose or object size.
  - Classification of a detected object into one of several categories.
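
The typical functions listed above can be illustrated with a small, dependency-free Python sketch. It is only one possible arrangement under stated assumptions, not a reference implementation: the image is a synthetic list of rows, noise reduction is a 3x3 mean filter, contrast enhancement is a linear stretch, segmentation is a fixed threshold of 0.6, and "pose" is reduced to the 2D position and orientation of the object in the image plane (pose relative to the camera would additionally need camera calibration). All function names and parameter values are hypothetical and chosen for this illustration.

```python
import math

def make_bar_image(size=12):
    """Image acquisition stand-in: a synthetic grey image with a bright
    horizontal bar and one isolated noise pixel (assumed test data)."""
    img = [[0.0] * size for _ in range(size)]
    for y in (5, 6):
        for x in range(2, 10):
            img[y][x] = 1.0
    img[1][10] = 1.0  # isolated noise pixel at x=10, y=1
    return img

def mean_filter(img):
    """Pre-processing (noise reduction): 3x3 mean filter.
    Border pixels reuse the nearest valid coordinate."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out

def stretch_contrast(img):
    """Pre-processing (contrast enhancement): rescale values to [0, 1]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [[0.0] * len(row) for row in img]
    return [[(v - lo) / (hi - lo) for v in row] for row in img]

def segment(img, threshold=0.6):
    """Detection/segmentation: (x, y) coordinates of pixels above threshold."""
    return [(x, y)
            for y, row in enumerate(img)
            for x, v in enumerate(row)
            if v > threshold]

def size_and_pose(points):
    """High-level processing: object size (pixel count), position (centroid)
    and in-plane orientation from second-order central moments."""
    n = len(points)
    if n == 0:
        return None
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return {"area": n, "centroid": (cx, cy), "angle_deg": math.degrees(angle)}

def analyse(img):
    """Chain the stages: pre-processing, segmentation, high-level estimation."""
    cleaned = stretch_contrast(mean_filter(img))
    return size_and_pose(segment(cleaned))

print(analyse(make_bar_image()))
```

In this sketch the isolated noise pixel would pass the threshold on the raw image, but after mean filtering its value is spread over nine pixels and falls below it, which is the role the list assigns to noise reduction. A practical system would normally use an array or imaging library rather than nested Python lists.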



