Part 1- Exploring the Data Types in OpenCV4: A Comprehensive Guide

Jegathesan Shanmugam
Feb 11, 2023

OpenCV combines template-based and specialised data types to balance convenience and flexibility in computer vision programming. The template-based types allow for generality and customization, while the specialised types, most of which are aliases for common template instantiations, provide a more convenient and intuitive way of handling common operations. This combination keeps OpenCV approachable for newcomers while leaving experienced developers room to customize.

The data types can be conveniently grouped into three categories. The first category is made up of basic data types that are constructed directly from C++ primitives, such as integers and floating-point numbers. These types include simple vectors and matrices, as well as representations of basic geometric concepts like points, rectangles, sizes, and others.

The second category consists of helper objects, which represent more abstract concepts such as garbage-collecting pointers, range objects for slicing, and abstractions such as termination criteria.

The third category is the large array types, which are objects designed to contain arrays or other collections of primitives or basic data types. The most prominent example of this category is the cv::Mat class, which is used to represent multi-dimensional arrays containing a variety of elements and is frequently used for representing images.

Basic Types:

cv::Vec:
It is a template class that serves as a container for primitives, making it a fundamental part of the basic data types in OpenCV. OpenCV uses cv::Vec instead of STL containers because cv::Vec is designed for small vectors whose dimensions are known at compile time, so no dynamic allocation is needed and common operations can be implemented efficiently. Aliases follow the pattern cv::Vec{2,3,4,6}{b,s,w,i,f,d}, for example cv::Vec3b for three unsigned bytes or cv::Vec2f for two floats.


+----------------------+--------------------------------------------------+
| Operation            | Example                                          |
+----------------------+--------------------------------------------------+
| Default constructor  | Vec2s v2s; Vec6f v6f; // etc...                  |
| Value constructors   | Vec2f v2f(x0,x1); Vec6d v6d(x0,x1,x2,x3,x4,x5);  |
| Member access        | v4f[ i ]; v3w( j ); // operator[] and operator() |
| Vector cross-product | v3f.cross( u3f );                                |
+----------------------+--------------------------------------------------+

cv::Point:
cv::Point represents a 2D point with integer coordinates; it is an alias for cv::Point2i and has two members, x and y. Floating-point and 3D variants such as cv::Point2f and cv::Point3f follow the same pattern.

Points support arithmetic. For example, given two cv::Point objects p1 and p2 initialized with the coordinates (10, 20) and (30, 40), adding their x and y values gives a new cv::Point with the coordinates (40, 60).
+----------------------+----------------------------+
| Operation            | Example                    |
+----------------------+----------------------------+
| Default constructors | cv::Point2i p;             |
|                      | cv::Point3f p;             |
| Value constructors   | cv::Point2i p(x0, x1);     |
|                      | cv::Point3f p(x0, x1, x2); |
| Member access        | p.x; p.y; p.z;             |
+----------------------+----------------------------+

cv::Scalar:
It is a 4-element vector of doubles that can be used to represent a scalar value or a color. The type cv::Scalar is widely used in OpenCV to pass pixel values; elements that are not given to the constructor are set to zero.

For example, three cv::Scalar objects with equal values in the first three elements can represent the colors white (255), black (0), and gray (an intermediate value such as 128).
+---------------------+-------------------------------+
| Operation           | Example                       |
+---------------------+-------------------------------+
| Default constructor | cv::Scalar s;                 |
| Value constructors  | cv::Scalar s(x0);             |
|                     | cv::Scalar s(x0, x1, x2, x3); |
+---------------------+-------------------------------+

cv::Size:
This class represents the size of a 2D object, with width and height members. cv::Size is an alias for cv::Size2i; cv::Size2f is the floating-point variant.

A common use is to specify the target size when resizing an image. The cv::resize function takes the original image, the output image, and the new size given as a cv::Size object, followed by optional scale factors and an interpolation method.

+---------------------+----------------------+
| Operation           | Example              |
+---------------------+----------------------+
| Default constructor | cv::Size sz;         |
|                     | cv::Size2i sz;       |
| Value constructor   | cv::Size sz(w, h);   |
| Member access       | sz.width; sz.height; |
+---------------------+----------------------+

cv::Rect:
This data type represents an axis-aligned rectangle. It stores the coordinates of the top-left corner together with the width and height of the rectangle. It is typically used for object detection bounding boxes, image cropping, and other image processing tasks.

For example, after reading an image such as "example_image.jpg" with the cv::imread function, a cv::Rect with its top-left corner at (100, 100) and a width and height of 200 can be used to extract a region of interest (ROI) from the image into a cv::Mat object called roi. The ROI shares pixel data with the original image rather than copying it.

+-----------------------------+------------------------------+
| Operation                   | Example                      |
+-----------------------------+------------------------------+
| Default constructor         | cv::Rect r;                  |
| Value constructor           | cv::Rect r(x, y, w, h);      |
| Member access               | r.x; r.y; r.width; r.height; |
| Compute area                | r.area();                    |
| Extract upper-left corner   | r.tl();                      |
| Extract bottom-right corner | r.br();                      |
+-----------------------------+------------------------------+

cv::RotatedRect:
It represents a rotated rectangle in the 2D image plane, defined by its center point, size, and rotation angle. The size is given by the width and height, and the angle is the clockwise rotation of the rectangle in degrees. This class is useful in various computer vision applications, such as object detection and recognition and image segmentation.

For example, a cv::RotatedRect can be created with cv::Point2f(100, 100) as the center, cv::Size2f(200, 100) as the size, and 30 degrees as the rotation angle.

+----------------------------------------+-------------------------------------+
| Operation                              | Example                             |
+----------------------------------------+-------------------------------------+
| Default constructor                    | cv::RotatedRect rr;                 |
| Value constructor (point, size, angle) | cv::RotatedRect rr( p, sz, theta ); |
| Member access                          | rr.center; rr.size; rr.angle;       |
+----------------------------------------+-------------------------------------+

cv::Matx:
The fixed matrix classes in OpenCV are intended for small matrices whose dimensions are known at compile time. Their data lives inside the object itself, so they need no dynamic memory allocation and their operations are fast. These classes serve as the foundation for many other basic data types in the OpenCV C++ interface: the fixed vector class cv::Vec derives from cv::Matx, and other classes rely on casting to the fixed vector class for certain operations.

A typical application is projecting a 3D point from the world coordinate system to 2D image coordinates using a camera model described by its intrinsic and extrinsic parameters. The intrinsic matrix and rotation matrix can be filled from a source such as a sensor_msgs::CameraInfo message. The 3D point is first multiplied by the transposed rotation matrix, which converts it from world to camera coordinates. The rotated point is then multiplied by the intrinsic matrix, and the result is divided by its Z-coordinate (the perspective division) to obtain the 2D image coordinates. Depending on the axis conventions in use, the X and Y image axes may finally need to be flipped to match the camera coordinate system.

+----------------------------------------------------+-----------------------------------------------------------------------+
| Operation                                          | Example                                                               |
+----------------------------------------------------+-----------------------------------------------------------------------+
| Default constructor                                | cv::Matx33f m33f; cv::Matx43d m43d;                                   |
| Value constructors                                 | cv::Matx21f m(x0,x1);                                                 |
|                                                    | cv::Matx44d m(x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15); |
| Matrix of identical elements                       | m33f = cv::Matx33f::all( x );                                         |
| Matrix of zeros                                    | m23d = cv::Matx23d::zeros();                                          |
| Matrix of ones                                     | m16f = cv::Matx16f::ones();                                           |
| Create a unit matrix                               | m33f = cv::Matx33f::eye();                                            |
| Create a diagonal matrix from a vector             | m33f = cv::Matx33f::diag( m31f );                                     |
| Create a matrix with uniformly distributed entries | m33f = cv::Matx33f::randu( min, max );                                |
| Create a matrix with normally distributed entries  | m33f = cv::Matx33f::randn( mean, stddev );                            |
| Member access                                      | m( i, j ); m( i ); // m( i ) for one-dimensional matrices only        |
| Matrix algebra                                     | m1 = m0; m0 * m1; m0 + m1; m0 - m1;                                   |
| Singleton algebra                                  | m * a; a * m; m / a;                                                  |
| Comparison                                         | m1 == m2; m1 != m2;                                                   |
| Dot product                                        | m1.dot( m2 ); // sum of element-wise products, precision of m         |
| Dot product                                        | m1.ddot( m2 ); // sum of element-wise products, double precision      |
| Reshape a matrix                                   | m91f = m33f.reshape<9,1>();                                           |
| Cast operators                                     | m44f = (Matx44f) m44d;                                                |
| Extract 2 x 2 submatrix at (i, j)                  | m44f.get_minor<2, 2>( i, j );                                         |
| Extract row i                                      | m14f = m44f.row( i );                                                 |
| Extract column j                                   | m41f = m44f.col( j );                                                 |
| Extract matrix diagonal                            | m41f = m44f.diag();                                                   |
| Compute transpose                                  | n44f = m44f.t();                                                      |
| Invert matrix                                      | n44f = m44f.inv( method ); // default method is cv::DECOMP_LU         |
| Solve linear system                                | m31f = m33f.solve( rhs31f, method );                                  |
|                                                    | m32f = m33f.solve<2>( rhs32f, method ); // template form              |
| Per-element multiplication                         | m1.mul( m2 );                                                         |
+----------------------------------------------------+-----------------------------------------------------------------------+

Helper Objects:

In addition to the basic data types and large data containers, there is a range of helper objects that are designed to aid in controlling different algorithms or perform operations on data containers. These helper objects include termination criteria objects, range objects, and slice objects.

cv::TermCriteria:
This object type is used to specify the termination criteria for iterative algorithms, such as iterative optimization routines. It lets you specify the maximum number of iterations, the desired accuracy, or both, and so controls when the algorithm is considered to have converged and should stop.

A common example is the stopping criterion for the k-means clustering algorithm. The criteria object is constructed with three parameters. The first is the COUNT + EPS flag, which tells the algorithm to stop either when a given number of iterations has been performed or when the change in the error metric falls below a given threshold, whichever comes first. The second is the maximum number of iterations, here 30. The third is the threshold for the change in the error metric, here 0.1.

cv::Range:
This class represents a contiguous range of indices for use with arrays, matrices, and other data structures. A range is defined by two integers, start and end, where start is included and end is excluded; cv::Range::all() selects the whole extent. A range defines the portion of an array or matrix used in an operation, so you can act on a subset of the data instead of the entire dataset.

For example, cv::Range(1, 6) used as rowRange selects rows 1 to 5 of a matrix, and cv::Range(2, 9) used as colRange selects columns 2 to 8. The expression matrix(rowRange, colRange) returns a sub-matrix that refers to the selected rows and columns, and assigning the scalar value cv::Scalar(255) to this sub-matrix changes the corresponding elements of the original matrix.

cv::Ptr:
cv::Ptr is OpenCV's reference-counted smart pointer, similar to boost::shared_ptr and std::shared_ptr: it manages a dynamically allocated object and deletes it automatically when the reference count reaches zero. In OpenCV 4, cv::Ptr is a thin wrapper built on std::shared_ptr, whereas earlier versions shipped their own implementation.

In general, std::shared_ptr is the natural choice in modern C++ code, as it is part of the C++ standard library and is widely supported by modern compilers. When working with OpenCV, however, cv::Ptr is usually the better option because the library's own factory functions return it, so it works seamlessly with other OpenCV classes and functions.

How is this different from cv::Mat allocation?
cv::Mat is the large array type that represents a matrix or image; it is discussed later. A cv::Mat object is a small header that can live on the stack or on the heap, while the pixel data it refers to is allocated separately and is itself reference counted. Copying a cv::Mat copies only the header, so both copies share the same data, and the data is released when the last header referring to it is destroyed.

A cv::Mat declared on the stack has automatic storage duration, so it is destroyed when the scope it was declared in ends. Manual management is needed only when a cv::Mat is itself created on the heap with the new operator: it then has dynamic storage duration and persists until it is explicitly deleted, which can leak memory if the programmer forgets the delete or if an exception is thrown before it is reached.

cv::Ptr removes that risk for heap-allocated objects. When a cv::Ptr manages an object, the object is deleted automatically when the reference count reaches zero, so the programmer no longer has to manage its lifetime by hand.

In summary, cv::Mat already manages its own pixel data through reference counting, and cv::Ptr provides the same automatic lifetime management for arbitrary heap-allocated objects. Both make it easier to write correct, exception-safe code.

cv::Exception:
cv::Exception is the class OpenCV uses to report errors from its functions; it derives from std::exception. A cv::Exception object contains information about the error, such as the error code, the error description, and the function, file, and line number where the error occurred.

cv::InputArray and cv::OutputArray:
cv::InputArray and cv::OutputArray are proxy types that represent an input or output, respectively, of an OpenCV function. They provide a uniform interface for functions that can accept different kinds of data (e.g., cv::Mat, cv::UMat, cv::Matx, or std::vector). An InputArray can wrap an input of any number of dimensions without copying it and is treated as read-only. An OutputArray wraps the same kinds of objects, is treated as write-only, and allocates memory for the result if necessary.

There are four common ways to write an image-processing function with these types.

In Method 1, the function takes a reference to a Mat object and returns the processed image.

In Method 2, the function takes an InputArray instead of a reference to a Mat object. This allows the caller to pass in a Mat, a UMat, or even a std::vector as the argument. Using InputArray also has the benefit of making the input image read-only.

In Method 3, the function takes an InputArray for the input image and an OutputArray for the output image. This lets the caller decide where the processed image is stored, either in a new Mat object or in the original Mat object.

In Method 4, the function takes a single InputOutputArray. The image is processed in place and no new Mat object is created.

cv::alignPtr:
cv::alignPtr is a utility function that rounds a pointer up to the next address that is a multiple of a given boundary, such as a cache line size. Aligned data can be loaded more efficiently, particularly by SIMD instructions, which matters in computer vision and image processing where the same operation runs over large buffers.

The function takes a pointer as its first argument and an optional alignment in bytes as its second. The alignment must be a power of two; if it is omitted, it defaults to the size of the pointed-to type. The function returns the aligned pointer, which may lie a few bytes past the original one, so the buffer must be allocated with enough spare room.

A typical pattern is to allocate a raw block with cv::fastMalloc, OpenCV's own allocation function, asking for the required size plus the alignment. The raw pointer is then passed to cv::alignPtr with a boundary of 32 bytes, and the returned aligned pointer is used to process the image data. Finally, the memory is freed with cv::fastFree, passing the original raw pointer (not the aligned one) as the argument.

cv::alignSize:
cv::alignSize() is a function that rounds a buffer size up to the nearest multiple of a given alignment. It takes two arguments: the size of the data to be stored and the desired alignment, which must be a power of two. It returns the smallest size that is greater than or equal to the requested size and divisible by the alignment.

For example, for data of size 100 and an alignment of 16, the function returns 112.

The returned value can be used to allocate a memory buffer whose size is a whole number of aligned blocks, which can improve the performance of some computer vision algorithms that require aligned data.

cv::fastAtan2:
cv::fastAtan2 is a function that calculates the full-range angle of the 2D vector (x, y) in degrees. Unlike the standard std::atan2 function, which returns radians in the range [-π, π], it returns a value between 0 and 360 degrees. It trades precision for speed by using an approximation, with an accuracy of about 0.3 degrees. Note that the arguments are given in the order (y, x).

This function is useful in computer vision algorithms, such as gradient orientation computation or object tracking, that need to estimate many orientations quickly and can tolerate a small error.

cv::cubeRoot:
cv::cubeRoot is a function that calculates the cube root of a given floating-point number. It is available in the opencv_core module and handles negative arguments correctly, returning a negative root.

CV_Assert:
CV_Assert is a macro that checks a condition at run time. If the condition evaluates to false, it raises an error by throwing a cv::Exception whose message contains the failed expression and its location. The macro is often used to ensure that the inputs to a function have the correct size, type, and format, or to check that an intermediate result has the expected value.

CV_DbgAssert:
It is a macro that checks a run-time condition in debug builds only. It behaves like CV_Assert when the code is compiled in a debug build. In a release build, the macro expands to nothing, so the condition is not evaluated and the code continues to run without the check.

The main difference between CV_Assert and CV_DbgAssert is that CV_Assert triggers an error in both debug and release builds, whereas CV_DbgAssert only triggers an error in debug builds. This makes CV_DbgAssert useful for debug-only checks in performance-critical code: it detects errors during development but does not slow down or raise errors in a release build.

Because the condition disappears in release builds, it must not have side effects, and CV_DbgAssert should not be the only protection against invalid input. In production code, use other error handling mechanisms, such as return codes or exceptions, to ensure correct operation.

CV_Error:
CV_Error() is a macro that raises an error unconditionally. It takes an error code and an error message as arguments; the function name, file, and line number are filled in automatically. It throws a cv::Exception with the specified error code and message, so it is the usual way to report a failure from your own OpenCV-style code.

In contrast, CV_Assert raises an error only when a condition fails, in both debug and release builds, and CV_DbgAssert does the same in debug builds only.

cv::error:
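It's the function that the error macros call to report an error. It takes an error code, an error message, and the name of the function, the source file, and the line number where the error occurred. The error code is an integer value representing the type of error, and the error message is a string describing the error in more detail. By default the function throws a cv::Exception. In most code it is more convenient to use CV_Error, which supplies the location arguments automatically.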

cv::fastMalloc and cv::fastFree:
cv::fastMalloc allocates a raw memory buffer of the requested size and returns a pointer to it; the buffer is aligned so that it is suitable for optimized code. Like malloc, it does not call constructors and does not track ownership, so every buffer must be released by hand and is leaked if that step is missed. For image data it is therefore better to use cv::Mat, which provides automatic memory management and is easier to use.

cv::fastFree releases a buffer obtained from cv::fastMalloc. It plays the role of the standard C function free, but the two are not interchangeable: memory from cv::fastMalloc must be released with cv::fastFree. Passing a null pointer does nothing.

When a raw buffer is needed anyway, it can be wrapped in a cv::Mat header that refers to the memory without owning it.

cvFloor:
cvFloor is a function that rounds a floating-point number down to the largest integer not greater than it and returns the result as an int. It behaves like the floor function in the standard C library, which returns a floating-point value instead.

For example, if x is a floating-point number with a value of 3.14, cvFloor(x) returns 3. For a negative value such as -3.14 it returns -4.

cv::getCPUTickCount and cv::getTickFrequency():
cv::getCPUTickCount returns the current value of a CPU tick counter; on x86 processors this is typically read from the time-stamp counter (TSC) register. The value is implementation-dependent, and because the tick rate can vary with the CPU frequency and the counter may differ between cores, it is best suited to very fine-grained measurements that help identify performance bottlenecks.

cv::getTickFrequency() returns the number of ticks per second of the counter read by cv::getTickCount(), not of the CPU tick counter. To measure the duration of an operation, call cv::getTickCount() before and after it, then divide the difference by cv::getTickFrequency() to obtain the duration in seconds.

cv::useOptimized and cv::setUseOptimized():
cv::useOptimized() returns the current state of the optimized-code flag: true if optimized code paths are enabled and false otherwise. It takes no arguments and does not change the setting.

cv::setUseOptimized() enables or disables the use of optimized functions. By default, the library uses optimized functions for performance. In some cases, however, you might want the non-optimized version for debugging or testing purposes. Calling cv::setUseOptimized(false) makes all functions that have an optimized version fall back to the plain implementation, and cv::setUseOptimized(true) re-enables the optimized versions.

A simple way to see the two functions together is to print the value of cv::useOptimized(), call cv::setUseOptimized(false) and print the value again, then set the flag back to true and print it once more.

Large array types like cv::Mat will be covered in more detail in the next post. It’s crucial to be familiar with the various basic and utility types offered by OpenCV and to use them appropriately in computer vision applications, rather than limiting oneself to only a few of them.

I hope you found this information helpful. If you have any questions or comments, please don't hesitate to let me know. I appreciate your feedback and am happy to help answer any questions you may have. Thank you for taking the time to read this.

I always share interesting articles and updates on LinkedIn, so if you want to stay informed, feel free to follow me on the platform.

Reference:
https://docs.opencv.org/4.x/
https://github.com/OATOMO/book/blob/master/opencv/Learning%20OpenCV%203%20Computer%20vision%20in%20C%2B%2B.pdf
