Image Processing Software
What is it?
3DCube is an optical metrology system for acquiring measurements from image data. The objective of this project is to quickly determine the volumetric data of an object using a high-resolution camera with as little operator intervention as possible. Currently the system will detect and measure most objects with a single click; however, the program may be adjusted to compensate for the environment and to improve speed, selectivity, and accuracy.
Currently the only supported input format is .png files. The software may be easily adapted to accept images directly from high resolution cameras or video sources as required.
Background clutter, writing, and the like will not affect detection so long as there are no other rectangles in the image larger than the one being measured. By default the rectangles can be at any angle between -30 and 30 degrees and still be detected accurately. The box must be roughly centered in the image, but a large offset is allowed: so long as each edge of the box is in a different quadrant of the screen, measurements will not be affected.
The images to the right are some examples of output.
The 3DCube software consists of completely original code written specifically for the project. Only the standard Win32 and LibPNG libraries were used.
How does it work?
In the most basic sense, the system first identifies all straight edges within the image using various filters and the well-documented Hough transform, which can detect lines even in otherwise cluttered images. Next it finds the largest rectangle formed by these lines, and from that the size of the box can be determined in pixels. In future versions, this can be combined with height information obtained via lasers, stereoscopic cameras, or multiple light sources and shadows to obtain volumetric information.
The process actually involves about two dozen individual procedures. At each step a variety of settings may be adjusted, or the step may even be performed manually, in order to allow less or more perfect rectangles to be found. In this way accuracy can be tuned to the application. Some of these steps, such as the binary conversion of the image, may be observed while the program is running.
For purposes of discussion these procedures can be grouped into these major steps:
The process begins by obtaining an edge-enhanced binary image (one where the pixels are either totally black or totally white) using 3x3 kernel-based blur and high-pass filters. This results in an image where the edges of objects are clearly visible.
At this point it is possible to apply additional filters such as thinning, erosion, and median filtering to further enhance the desired edges or compensate for noise and clutter in the images. Some of these filters are provided in 3DCube but are not at present used by default; they may, however, be applied manually in order to experiment with various processing techniques.
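As a rough illustration of the blur, high-pass, and binary conversion described above, here is a minimal Python sketch. It is not 3DCube's code: the kernel weights (a box blur and a Laplacian-style high-pass), the clamped border handling, the threshold value, and the names `convolve3x3` and `edge_binary` are all assumptions chosen for the example.

```python
def convolve3x3(img, kernel):
    """Apply a 3x3 kernel to a grayscale image given as a list of rows.
    Pixels outside the image are replaced by the nearest border pixel."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    yy = min(max(y + ky, 0), h - 1)
                    xx = min(max(x + kx, 0), w - 1)
                    acc += img[yy][xx] * kernel[ky + 1][kx + 1]
            out[y][x] = acc
    return out

# Assumed kernels: a simple box blur and a Laplacian-style high-pass.
BLUR = [[1 / 9] * 3 for _ in range(3)]
HIGH_PASS = [[-1, -1, -1],
             [-1,  8, -1],
             [-1, -1, -1]]

def edge_binary(img, blur_passes=1, threshold=40):
    """Blur, high-pass, then convert to a black (0) / white (255) image."""
    for _ in range(blur_passes):
        img = convolve3x3(img, BLUR)
    edges = convolve3x3(img, HIGH_PASS)
    return [[255 if abs(v) > threshold else 0 for v in row] for row in edges]
```

Flat regions produce a high-pass response near zero and come out black, while pixels near a brightness step come out white. More blur passes suppress fine texture at the cost of thicker, less precisely placed edges.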
Next the image is translated into Hough space, as shown in the image on the right. This is done by calculating all possible lines passing through each white pixel in the binary source image. Lines are represented using a rho (distance), theta (angle) parameterization: the perpendicular dropped onto the line from the top left corner of the source image has length rho and angle theta. In this way every possible line may be represented by only two parameters. In the Hough space accumulator image, the X coordinate is the angle (theta) and the Y coordinate is the distance (rho). Each possible line through each white pixel gets a 'vote' in the accumulator. As the white pixels along a straight line are processed, the line common to all of them accumulates more votes. These can be seen in the image on the right as the brighter spots, called local maxima.
With the Rubik's Cube image it's very easy to see the 4 bright spots representing lines at 71 degrees and the 5 bright spots representing lines at 161 degrees (at a right angle to the 71 degree lines). There are 5 spots at 161 degrees because of the shadow cast by the flash.
Now that the Hough space image contains the line vote information, it too may be processed as an image to further enhance the votes.
Next 3DCube must decide which lines are important and which are random clutter. It does this using a variety of proprietary procedures that determine the threshold for an accepted vote, detect related lines, and finally determine which rectangle formed by the lines is most likely the target to be measured.
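The voting scheme can be sketched in a few lines of Python. This is a textbook Hough accumulator written for illustration, not 3DCube's implementation; the function name `hough_accumulate`, the one-degree default angular resolution, and the rounding of rho to whole pixels are assumptions.

```python
import math

def hough_accumulate(binary, theta_steps=180):
    """Vote in (rho, theta) space for every white pixel of a binary image.

    Returns (acc, max_rho). acc[rho + max_rho][t] holds the votes for the
    line at distance rho and angle pi * t / theta_steps, measured from the
    top left corner of the image."""
    h, w = len(binary), len(binary[0])
    max_rho = int(math.ceil(math.hypot(w, h)))
    # Rows are rho (shifted by max_rho so negative distances fit),
    # columns are theta, matching the accumulator image layout in the text.
    acc = [[0] * theta_steps for _ in range(2 * max_rho + 1)]
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            # One vote per candidate angle for this white pixel.
            for t in range(theta_steps):
                theta = math.pi * t / theta_steps
                rho = x * math.cos(theta) + y * math.sin(theta)
                acc[int(round(rho)) + max_rho][t] += 1
    return acc, max_rho
```

For a vertical line of white pixels at x = 5, every pixel votes for the cell at theta = 0, rho = 5, so that cell collects one vote per pixel on the line.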
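3DCube's own selection procedures are proprietary and are not shown here. Purely as an illustration of the general idea of accepting only strong votes, the sketch below uses a common generic approach: keep accumulator cells that are local maxima and hold at least a given fraction of the strongest cell's votes. The function name `find_peaks` and the `fraction` parameter are hypothetical.

```python
def find_peaks(acc, fraction=0.5):
    """Return (votes, rho_index, theta_index) tuples, strongest first, for
    cells that are 3x3 local maxima and reach `fraction` of the top vote."""
    rows, cols = len(acc), len(acc[0])
    strongest = max(max(row) for row in acc)
    cutoff = strongest * fraction
    peaks = []
    for r in range(rows):
        for t in range(cols):
            votes = acc[r][t]
            if votes == 0 or votes < cutoff:
                continue
            neighbours = [
                acc[rr][tt]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for tt in range(max(t - 1, 0), min(t + 2, cols))
                if (rr, tt) != (r, t)
            ]
            # Ties count as maxima, so a flat plateau yields several peaks.
            if all(votes >= n for n in neighbours):
                peaks.append((votes, r, t))
    return sorted(peaks, reverse=True)
```

Raising `fraction` keeps only the most strongly supported lines; lowering it admits fainter edges along with more clutter.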
Finally the coordinates of each corner of the box are located using the matrix equation below, which finds the intersection of two lines described as rho/theta (r1, θ1 and r2, θ2):
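\[
A X = b, \qquad
A = \begin{bmatrix} \cos\theta_1 & \sin\theta_1 \\ \cos\theta_2 & \sin\theta_2 \end{bmatrix}, \qquad
b = \begin{bmatrix} r_1 \\ r_2 \end{bmatrix}, \qquad
X = \begin{bmatrix} x \\ y \end{bmatrix}
\]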
From two corner points on opposite sides, (x1, y1) and (x2, y2), the distance between them can be calculated using the standard Euclidean distance:
\[
d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}
\]
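The corner and distance calculations described above can be sketched as follows. This is an illustrative example rather than 3DCube's code: it solves the 2x2 system with Cramer's rule, assumes angles are given in radians, and the names `intersect` and `distance` are hypothetical.

```python
import math

def intersect(r1, theta1, r2, theta2):
    """Intersection (x, y) of two lines given in rho/theta form.
    Returns None when the lines are parallel (no unique solution)."""
    a, b = math.cos(theta1), math.sin(theta1)
    c, d = math.cos(theta2), math.sin(theta2)
    det = a * d - b * c
    if abs(det) < 1e-9:
        return None
    # Cramer's rule applied to [a b; c d] [x; y] = [r1; r2].
    x = (r1 * d - b * r2) / det
    y = (a * r2 - r1 * c) / det
    return x, y

def distance(p, q):
    """Euclidean distance in pixels between points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# The line x = 3 (theta = 0) meets the line y = 4 (theta = 90 degrees).
corner = intersect(3, 0.0, 4, math.pi / 2)
side = distance((0, 0), (3, 4))
```

The parallel check matters in practice because opposite sides of a rectangle have nearly equal theta, so only pairs of roughly perpendicular lines yield usable corners.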
Can it find other shapes?
With some reprogramming it is possible to accept shapes with any number of sides, even circles. In some cases this may slow the process down.
The images on the right show how other shapes may be detected.
3DCube is written for Windows but runs fine under Wine on Linux, where it was in fact developed. It is currently offered in executable format only. The archive contains the .exe as well as several images for testing and evaluation. Any image in the PNG file format may be used.
The 3DCube executable may be downloaded here in RAR archive format.
Since the file contains several test images, the total size is 57MB. Please allow some time for the download to complete.
Simply unrar the archive and run win32application.exe. To open an image, click the Open Image button; to process it, click the button labeled Automagic.
The Operations Menu
The operations menu also offers many filters, many of which should be familiar to users of image editing programs, including:
- Subtract ~ Displays the difference between two equally sized images (Images 1 and 2)
- Revert ~ Restores the original image
- Histogram EQ ~ Auto adjust image brightness
- Amplify ~ Adjust brightness and contrast of image
- High-pass ~ This is essentially an edge enhancement tool, typically applied after one or more blur cycles. Results in a binary image.
- Erode ~ Finds minimum value in a 3x3 neighborhood. Shrinks objects
- Median ~ Finds the median value of pixels in a 3x3 neighborhood. Good for speckle and noise reduction.
- Dilate ~ Finds maximum value in a 3x3 neighborhood. Expands objects.
- Random ~ Randomly selects a pixel out of a 3x3 neighborhood
- Threshold ~ Convert to binary based on pixel intensity alone.
- Thin ~ Reduces binary lines to smallest possible width
- Hough Transform ~ Converts image into Hough space, and back.
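The three 3x3 neighborhood filters in the list (Erode, Median, Dilate) differ only in how they reduce each pixel's neighborhood to a single value, which the following Python sketch makes explicit. It is an illustration, not 3DCube's code: the function names are hypothetical, and shrinking the window at the image border is an assumption.

```python
from statistics import median_low

def neighbourhood_filter(img, reducer):
    """Replace each pixel with reducer(values in its 3x3 neighbourhood).
    At the image border the window simply shrinks to the pixels that exist."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(y - 1, 0), min(y + 2, h))
                for xx in range(max(x - 1, 0), min(x + 2, w))
            ]
            out[y][x] = reducer(window)
    return out

def erode(img):
    return neighbourhood_filter(img, min)          # shrinks bright objects

def dilate(img):
    return neighbourhood_filter(img, max)          # expands bright objects

def median3x3(img):
    return neighbourhood_filter(img, median_low)   # removes isolated speckles
```

On an all-black image containing a single white pixel, `erode` and `median3x3` remove the pixel, while `dilate` grows it into a 3x3 white block.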
By default the settings are configured to find even irregularly shaped rectangles as quickly as possible. The controls in the Hough dialog allow these parameters to be tuned for many different situations.
It is also possible to have all found lines, or all found corners, displayed after conversion, which is useful when adjusting settings.