create_shape_model_3d
— Prepare a 3D object model for matching.
create_shape_model_3d( : : ObjectModel3D, CamParam, RefRotX, RefRotY, RefRotZ, OrderOfRotation, LongitudeMin, LongitudeMax, LatitudeMin, LatitudeMax, CamRollMin, CamRollMax, DistMin, DistMax, MinContrast, GenParamName, GenParamValue : ShapeModel3DID)
The operator create_shape_model_3d prepares a 3D object model, which is
passed in ObjectModel3D, as a 3D shape model used for matching. The 3D
object model must previously have been read from a file by using
read_object_model_3d.
The 3D shape model is generated by computing different views of the
3D object model within a user-specified pose range. The views are
automatically generated by placing virtual cameras around the 3D
object model and projecting the 3D object model into the image plane
of each virtual camera position. For each such obtained view a 2D
shape representation is computed. Thus, for the generation of the 3D shape
model, no images of the object are used but only the 3D object model,
which is passed in ObjectModel3D. The shape representations
of all views are stored in the 3D shape model, which is returned in
ShapeModel3DID. During the matching process with
find_shape_model_3d, the shape representations are used to
find the best-matching view, from which the pose is subsequently
refined and returned.
In order to create the model views correctly, the camera parameters
of the camera that will be used for the matching must be passed in
CamParam. The camera parameters are necessary, for example,
to determine the scale of the projections by using the actual focal
length of the camera. Furthermore, they are used to treat radial
distortions of the lens correctly. Consequently, it is essential
to calibrate the camera by using calibrate_cameras before
creating the 3D shape model. On the one hand, this is necessary to
obtain accurate poses from find_shape_model_3d. On the other
hand, this makes the 3D matching applicable even when using lenses
with significant radial distortions.
The pose range within which the model views are generated can be
specified by the parameters RefRotX, RefRotY, RefRotZ,
OrderOfRotation, LongitudeMin, LongitudeMax, LatitudeMin,
LatitudeMax, CamRollMin, CamRollMax, DistMin, and DistMax.
Note that the model will only be recognized during
the matching if it appears within the specified pose range.
The parameters are described in the following:
Before computing the views, the origin of the coordinate system of
the 3D object model is moved to the reference point of the 3D object
model, which is the center of the smallest enclosing axis-parallel
cuboid and can be queried by using get_object_model_3d_params.
The virtual cameras, which are
used to create the views, are arranged around the 3D object model in
such a way that they all look at the origin of the coordinate
system, i.e., the z axes of the cameras pass through the origin. The
pose range can then be specified by restricting the views to a
certain quadrilateral on the sphere around the origin. This
naturally leads to the use of the spherical coordinates longitude,
latitude, and radius. The definition of the spherical coordinate
system is chosen such that the equatorial plane corresponds to the
xz plane of the Cartesian coordinate system with the y axis pointing
to the south pole (negative latitude) and the negative z axis
pointing in the direction of the zero meridian (see
convert_point_3d_spher_to_cart or convert_point_3d_cart_to_spher
for further details about the
conversion between Cartesian and spherical coordinates). The
advantage of this definition is that a camera with the pose
[0,0,z,0,0,0,0] has its optical center at longitude=0, latitude=0,
and radius=z. In this case, the radius represents the distance of
the optical center of the camera to the reference point of the 3D
object model.
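The spherical convention described above can be made concrete with a small sketch. It is not HALCON code; it only mirrors the stated convention (equatorial plane = xz plane, y axis toward the south pole, negative z axis toward the zero meridian). The function name is hypothetical, and the direction in which longitude increases (toward +x, giving a right-handed system) is an assumption that should be checked against convert_point_3d_spher_to_cart.

```python
import math


def camera_center_in_reference_frame(longitude, latitude, radius):
    """Position of a virtual camera's optical center in reference coordinates.

    Convention from the text: the equator lies in the xz plane, +y points to
    the south pole (negative latitude), and -z points to the zero meridian.
    ASSUMPTION: longitude increases toward +x (right-handed system).
    """
    x = radius * math.cos(latitude) * math.sin(longitude)
    y = -radius * math.sin(latitude)
    z = -radius * math.cos(latitude) * math.cos(longitude)
    return (x, y, z)


# Longitude 0, latitude 0: the camera sits on the negative z axis.
center = camera_center_in_reference_frame(0.0, 0.0, 0.4)

# Latitude +90 degrees: the camera sits at the north pole, i.e. on the -y axis.
north = camera_center_in_reference_frame(0.0, math.pi / 2, 0.4)
```

With longitude=0 and latitude=0 the camera ends up at (0, 0, -radius) and looks along its z axis toward the origin, which is consistent with the object lying at distance z in front of a camera with the pose [0,0,z,0,0,0,0].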
The longitude range for which views are to be generated can be
specified by LongitudeMin and LongitudeMax, both
given in radians. Accordingly, the latitude range can be
specified by LatitudeMin and LatitudeMax, also given
in radians. LongitudeMin and LongitudeMax are adjusted to maintain a range of 360°.
If an adjustment is possible, LongitudeMin and the range are
preserved. The minimum and maximum distances between the camera center
and the model reference point are specified by DistMin and DistMax.
Note that the model origin is in the center of the smallest enclosing cuboid
and does not necessarily coincide with the origin of the CAD coordinate
system.
The unit of the distance must be meters
(assuming that the parameter Scale has been correctly set
when reading the CAD file with read_object_model_3d).
Finally, the minimum and the maximum camera roll angle can be
specified in CamRollMin and CamRollMax. This
interval specifies the allowable camera rotation around its z axis
with respect to the 3D object model. If the image plane is parallel
to the plane on which the objects reside and if it is known that the
object may rotate in this plane only in a restricted range, then it
is reasonable to specify this range in CamRollMin and
CamRollMax. In all other cases the interpretation of the
camera roll angle is difficult, and hence, it is recommended to set
this interval to [-π, +π]. Note that the larger
the specified pose range is chosen, the more memory the model will
consume (except for the range of the camera roll angle) and the
slower the matching will be.
The orientation of the coordinate system of the 3D object model is
defined by the coordinates within the CAD file that was read by
using read_object_model_3d. Therefore, it is reasonable
to previously rotate the 3D object model into a reference
orientation such that the view that corresponds to longitude=0 and
latitude=0 is approximately at the center of the pose range. This
can be achieved by passing appropriate values for the reference
orientation in RefRotX, RefRotY, RefRotZ, and OrderOfRotation.
The rotation is performed around the
axes of the 3D object model, whose origin was set to the reference
point. The longitude and latitude range can then be interpreted as a
variation of the 3D object model pose around the reference
orientation. There are two possible ways to specify the reference
orientation. The first possibility is to specify three rotation
angles in RefRotX, RefRotY, and RefRotZ
and the order in which the three rotations are to be applied in
OrderOfRotation, which can either be 'gba' or
'abg'. The second possibility is to specify the three
components of the Rodriguez rotation vector in RefRotX,
RefRotY, and RefRotZ. In this case,
OrderOfRotation must be set to 'rodriguez' (see
create_pose for detailed information about the order of the
rotations and the definition of the Rodriguez vector).
Thus, two transformations are applied to the 3D object model before computing the model views within the pose range. The first transformation is the translation of the origin of the coordinate systems to the reference point. The second transformation is the rotation of the 3D object model to the desired reference orientation around the axes of the reference coordinate system. By combining both transformations one obtains the reference pose of the 3D shape model. The reference pose of the 3D shape model thus describes the pose of the reference coordinate system with respect to the coordinate system of the 3D object model defined by the CAD file. Let t = (x,y,z)' be the coordinates of the reference point of the 3D object model and R be the rotation matrix containing the reference orientation. Then, a point given in the 3D object model coordinate system can be transformed to a point in the reference coordinate system of the 3D shape model by applying the following formula:
\[ p_r = R \cdot (p_m - t) \]
Here, p_m denotes the point in the coordinate system of the 3D object model and p_r the corresponding point in the reference coordinate system of the 3D shape model.
This transformation can be expressed by a homogeneous 3D
transformation matrix or alternatively in terms of a 3D pose. The
latter can be queried by passing 'reference_pose' for the
parameter GenParamName of the operator
get_shape_model_3d_params. The above formula can best be
imagined as a pose of pose type 8, 10, or 12, depending on the value
that was chosen for OrderOfRotation (see
create_pose for detailed information about the different
pose types). Note, however, that get_shape_model_3d_params
always returns the pose using the pose type 0. Finally, poses that
are given in one of the two coordinate systems can be transformed to
the other coordinate system by using trans_pose_shape_model_3d.
Furthermore, it should be noted that the reference coordinate system
is introduced only to specify the pose range in a convenient
way. The pose resulting from the 3D matching that is performed with
find_shape_model_3d always refers to the original 3D object
model coordinate system used in the CAD file.
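As a rough illustration of the two-step reference transformation (translate to the reference point, then rotate into the reference orientation), the following sketch maps a point from object model coordinates into reference coordinates. It assumes the transformation has the form p_r = R (p_m - t), which follows the order of operations described in the text. The helper names and the example rotation about the z axis are hypothetical and do not reproduce HALCON's OrderOfRotation conventions.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a 3x3 matrix (tuple of rows) with a 3-vector."""
    return tuple(sum(matrix[i][j] * vector[j] for j in range(3))
                 for i in range(3))


def to_reference_frame(point_model, ref_point, rotation):
    """Apply p_r = R * (p_m - t): translate first, then rotate."""
    shifted = tuple(p - t for p, t in zip(point_model, ref_point))
    return mat_vec(rotation, shifted)


def rot_z(angle):
    """Rotation matrix about the z axis (illustrative reference orientation)."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


t = (0.10, 0.20, 0.05)   # hypothetical reference point, in meters
R = rot_z(math.pi / 2)   # hypothetical reference orientation

# The reference point itself becomes the origin of the reference frame.
origin_in_ref = to_reference_frame(t, t, R)

# A point 1 cm along the model's x axis ends up 1 cm along the reference y axis.
p_ref = to_reference_frame((0.11, 0.20, 0.05), t, R)
```

In an actual application the pose returned for 'reference_pose' by get_shape_model_3d_params, or the operator trans_pose_shape_model_3d, would be used instead of hand-written matrices.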
With MinContrast, it can be determined which edge contrast
the model must at least have in the recognition performed by
find_shape_model_3d. In other words, this parameter
separates the model from the noise in the image. Therefore, a good
choice is the range of gray value changes caused by the noise in the
image. If, for example, the gray values fluctuate within a range of
10 gray levels, MinContrast should be set to 10. If
multichannel images are used for the search images, the noise in one
channel must be multiplied by the square root of the number of
channels to determine MinContrast. If, for example, the
gray values fluctuate within a range of 10 gray levels in a single
channel and the image is a three-channel image, MinContrast
should be set to 17. If the model should be recognized in very low
contrast images, MinContrast must be set to a
correspondingly small value. If the model should be recognized even
if it is severely occluded, MinContrast should be slightly
larger than the range of gray value fluctuations created by noise in
order to ensure that the pose of the model is extracted robustly and
accurately by find_shape_model_3d.
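The rule of thumb for MinContrast can be written down as a short calculation. The sketch below is only an illustration of the arithmetic in the text. The function name is hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the value 17 given for three channels.

```python
import math


def suggested_min_contrast(noise_range, num_channels=1):
    """Noise range of one channel, scaled by sqrt(number of channels).

    ASSUMPTION: the result is rounded to the nearest integer.
    """
    return round(noise_range * math.sqrt(num_channels))


single_channel = suggested_min_contrast(10)      # 10 gray levels of noise
three_channel = suggested_min_contrast(10, 3)    # 10 * sqrt(3) = 17.32...
```

For the example from the text, this yields 10 for a single-channel image and 17 for a three-channel image.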
The parameters described above are application-dependent and must
always be specified when creating a 3D shape model. In addition, there
are some generic parameters that can optionally be used to influence
the model creation. For most applications these parameters need not
be specified but can be left at their default values. If desired,
these parameters and their corresponding values can be specified by
using GenParamName and GenParamValue,
respectively. The following values for GenParamName are
possible:
'num_levels': For efficiency reasons
the model views are generated on multiple pyramid levels. On
higher levels fewer views are generated than on lower levels. With
the parameter 'num_levels' the number of pyramid levels
on which model views are generated can be specified. It should be
chosen as large as possible because by this the time necessary to
find the model is significantly reduced. On the other hand, the
number of levels must be chosen such that the shape
representations of the views on the highest pyramid level are
still recognizable and contain a sufficient number of points (at
least four). If not enough model points are generated for a
certain view, the view is deleted from the model and replaced by a
view on a lower pyramid level. If for all views on a pyramid
level not enough model points are generated, the number of levels
is reduced internally until for at least one view enough model
points are found on the highest pyramid level. If this procedure
would lead to a model with no pyramid levels, i.e., if the number
of model points is too small for all views already on the lowest
pyramid level, create_shape_model_3d returns an error
message. If 'num_levels' is set to 'auto'
(default value), create_shape_model_3d determines the
number of pyramid levels automatically. In this case all model
views on all pyramid levels are automatically checked to determine whether
their shape representations are still recognizable. If the shape
representation of a certain view is found to be not recognizable,
the view is deleted from the model and replaced by a view on a
lower pyramid level. Note that if 'num_levels' is set to
'auto', the number of pyramid levels can be different for
different views. In rare cases, it might happen that
create_shape_model_3d determines a value for the number of
pyramid levels that is too large or too small. If the number of
pyramid levels is chosen too large, the model may not be
recognized in the image or it may be necessary to select very low
parameters for MinScore or Greediness in
find_shape_model_3d in order to find the model. If the
number of pyramid levels is chosen too small, the time required to
find the model in find_shape_model_3d may increase. In
these cases, the views on the pyramid levels should be checked by
using the output of get_shape_model_3d_contours.
Suggested values: 'auto' , 3, 4, 5, 6
Default value: 'auto'
'fast_pose_refinement': The parameter
specifies whether the pose refinement during the search with
find_shape_model_3d is sped up. If
'fast_pose_refinement' is set to 'false' , for
complex models with a large number of faces the pose refinement
step might amount to a significant part of the overall computation
time. If 'fast_pose_refinement' is set to
'true' , some of the calculations that are necessary
during the pose refinement are already performed during the model
generation and stored in the model. Consequently, the pose
refinement during the search will be faster. Please note, however,
that in this case the memory consumption of the model may increase
significantly (typically by less than 30 percent).
Further note that the resulting poses that are returned by
find_shape_model_3d might slightly differ depending on the
value of 'fast_pose_refinement', because internally the
pose refinement is approximated if the parameter is set to
'true'.
List of values: 'true' , 'false'
Default value: 'true'
'lowest_model_level': In some cases the model
generation process might be very time consuming and the memory
consumption of the model might be very high. The reason for this
is that in these cases the number of views, which must be computed
and stored in the model, is very high. The larger the pose range
is chosen and the larger the objects appear in the image (measured
in pixels) the more views are necessary. Consequently, especially
the use of large images (e.g., images exceeding a size of
640×480) can result in very large models.
Because the number of views is highest on lower pyramid levels,
the parameter 'lowest_model_level' can be used to exclude
the lower pyramid levels from the generation of views. The value
that is passed for 'lowest_model_level' determines the
lowest pyramid level down to which views are generated and stored
in the 3D shape model. If, for example, a value of 2 is
passed for large models, the time to generate the model as well as
the size of the resulting model is reduced to approximately one
third of the original values. If 'lowest_model_level' is
not passed, views are generated for all pyramid levels, which
corresponds to the behavior when passing a value of 1 for
'lowest_model_level'. If for
'lowest_model_level' a value larger than 1 is
passed, in find_shape_model_3d the tracking of matches
through the pyramid will be stopped at this level. However, if in
find_shape_model_3d a least-squares adjustment is chosen
for pose refinement, the matches are refined on the lowest pyramid
level using the least-squares adjustment. Note that for different
values for 'lowest_model_level' different matches might
be found during the search. Furthermore, the score of the matches
depends on the chosen method for pose refinement. Also note that
the higher 'lowest_model_level' is chosen, the higher the
portion of the refinement step with respect to the overall
runtime of find_shape_model_3d will be. As a consequence,
for higher values of 'lowest_model_level' the influence
of the generic parameter 'fast_pose_refinement' (see
above) on the runtime will increase. A large value for
'lowest_model_level' on the one hand may lead to long
computation times of find_shape_model_3d if
'fast_pose_refinement' is switched off
('false'). On the other hand it may lead to a decreased
accuracy if 'fast_pose_refinement' is switched on
('true') because in this mode the pose refinement is only
approximated. Therefore, the value for
'lowest_model_level' should be chosen as small as
possible. Furthermore, 'lowest_model_level' should be
chosen small enough such that the edges of the 3D object model
are still observable on this level.
Suggested values: 1, 2, 3
Default value: 1
'optimization': For models with particularly large model views, it may be useful
to reduce the number of model points by setting
'optimization' to a value different from 'none'.
If 'optimization' = 'none', all model points
are stored. In all other cases, the number of points is reduced
according to the value of 'optimization'. If the number
of points is reduced, it may be necessary in
find_shape_model_3d to set the parameter
Greediness to a smaller value, e.g., 0.7 or 0.8. For
models with small model views, the reduction of the number of
model points does not result in a speed-up of the search because
in this case usually significantly more potential instances of the
model must be examined. If 'optimization' is set to
'auto', create_shape_model_3d automatically
determines the reduction of the number of model points for each
model view.
List of values: 'auto' , 'none' , 'point_reduction_low' , 'point_reduction_medium' , 'point_reduction_high'
Default value: 'auto'
'metric': This parameter determines the conditions
under which the model is recognized in the image. If
'metric' = 'ignore_part_polarity', the
contrast polarity is allowed to change only between different
parts of the model, whereas the polarity of model points that are
within the same model part must not change. Please note that the
term 'ignore_part_polarity' can easily be
misunderstood. It means that polarity changes between
neighboring model parts do not influence the score, and hence
are ignored. Appropriate model
parts are automatically determined. The size of the parts can be
controlled by the generic parameter 'part_size', which is
described below. Note that this metric only works for one-channel
images. Consequently, if the model is created by using this
metric and searched in a multi-channel image by using
find_shape_model_3d, an error will be returned. If
'metric' = 'ignore_local_polarity', the model
is found even if the contrast polarity changes for each individual
model point. This metric works for one-channel images as well as
for multi-channel images. The metric
'ignore_part_polarity' should be used if the images
contain strongly textured backgrounds or clutter objects, which
might result in wrong matches. Note that in general the scores of
the matches that are returned by find_shape_model_3d are
lower for 'ignore_part_polarity' than for
'ignore_local_polarity'. This should be kept in mind when
choosing the right value for the parameter MinScore of
find_shape_model_3d.
List of values: 'ignore_local_polarity' , 'ignore_part_polarity'
Default value: 'ignore_local_polarity'
'part_size': This parameter determines the size of the model parts that is
used when 'metric' is set to 'ignore_part_polarity' (see above). The size
must be specified in pixels and should be approximately twice as large as
the size of the background texture in the image. For example, if an object
should be found in front of a chessboard with black and white squares of
size 5×5 pixels, 'part_size' should be set to 10. Note that higher values
of 'part_size' might also decrease the scores of correct instances,
especially when searching for objects with shiny or reflective surfaces.
Therefore, the risk of missing correct instances might increase if
'part_size' is set to a higher value. If 'metric' is set to
'ignore_local_polarity', the value of 'part_size' is ignored.
Suggested values: 2, 3, 4, 6, 8, 10
Default value: 4
'min_face_angle': 3D edges are only
included in the shape representations of the views if the angle
between the two 3D faces that are incident with the 3D object
model edge is at least 'min_face_angle'. If
'min_face_angle' is set to 0.0, all edges are
included. If 'min_face_angle' is set to π
(equivalent to 180 degrees), only the silhouette of the 3D object
model is included. This parameter can be used to suppress edges
within curved surfaces, e.g., the surface of a cylinder or
cone. Curved surfaces are approximated by multiple planar
faces. The edges between such neighboring planar faces should not
be included in the shape representation because they also do not
appear in real images of the model. Thus,
'min_face_angle' should be set sufficiently high to
suppress these edges. The effect of different values for
'min_face_angle' can be inspected by using
project_object_model_3d before calling
create_shape_model_3d. Note that if edges that are not
visible in the search image are included in the shape
representation, the performance (robustness and speed) of the
matching may decrease considerably.
Suggested values: 'rad(10)' , 'rad(20)' , 'rad(30)' , 'rad(45)'
Default value: 'rad(30)'
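The effect of 'min_face_angle' can be illustrated with a minimal sketch that decides whether an edge between two planar faces would be kept. It assumes that the face angle is measured between the normals of the two incident faces, so that coplanar faces have an angle of 0. The function names are hypothetical and the sketch does not model the view-dependent silhouette.

```python
import math


def face_angle(normal_a, normal_b):
    """Angle in radians between two face normals (0 for coplanar faces)."""
    dot = sum(a * b for a, b in zip(normal_a, normal_b))
    norm = (math.sqrt(sum(a * a for a in normal_a))
            * math.sqrt(sum(b * b for b in normal_b)))
    # Clamp against rounding errors before calling acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def keep_edge(normal_a, normal_b, min_face_angle):
    """An edge is kept if its face angle is at least min_face_angle."""
    return face_angle(normal_a, normal_b) >= min_face_angle


min_face_angle = math.radians(30)   # corresponds to the default 'rad(30)'

# Edge of a cube: the two faces meet at 90 degrees, so the edge is kept.
cube_edge_kept = keep_edge((1, 0, 0), (0, 1, 0), min_face_angle)

# Cylinder approximated by 36 facets: neighboring facets differ by 10 degrees,
# so the artificial edge between them is suppressed.
step = math.radians(10)
facet_edge_kept = keep_edge((1, 0, 0),
                            (math.cos(step), math.sin(step), 0),
                            min_face_angle)
```

With the default of 30 degrees, the cube edge is kept while the facet edge of the approximated cylinder is suppressed, which is the behavior the parameter is meant to provide for curved surfaces.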
'min_size': This value determines a threshold for the selection of significant model components based on the size of the components, i.e., connected components that have fewer points than the specified minimum size are suppressed. This threshold for the minimum size is divided by two for each successive pyramid level.
Suggested values: 'auto' , 0, 3, 5, 10, 20
Default value: 'auto'
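The halving of the 'min_size' threshold per pyramid level can be sketched as follows. The function name is hypothetical, and the text does not state how fractional thresholds are rounded, so the values are left as floating-point numbers here.

```python
def min_size_per_level(min_size, num_levels):
    """Threshold on each pyramid level, halved for each successive level.

    Index 0 corresponds to the lowest pyramid level (level 1).
    ASSUMPTION: no rounding is applied to fractional thresholds.
    """
    return [min_size / 2 ** level for level in range(num_levels)]


thresholds = min_size_per_level(20, 4)
```

For 'min_size' = 20 and four pyramid levels this gives thresholds of 20, 10, 5, and 2.5 points.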
'model_tolerance': The parameter
specifies the tolerance of the projected 3D object model edges in
the image, given in pixels. The higher the value is chosen, the
fewer views need to be generated. Consequently, a higher value
results in models that are less memory consuming and faster to
find with find_shape_model_3d. On the other hand, if the
value is chosen too high, the robustness of the matching will
decrease. Therefore, this parameter should only be modified with
care. For most applications, a good compromise between speed and
robustness is obtained when setting 'model_tolerance' to 1.
Suggested values: 0, 1, 2
Default value: 1
'union_adjacent_contours': This parameter specifies whether
adjacent projected contours should be joined by the operator
project_shape_model_3d.
Activating this option is equivalent to calling
union_adjacent_contours_xld afterwards, but significantly
faster.
List of values: 'true' , 'false'
Default value: 'false'
If the system variable 'opengl_hidden_surface_removal_enable' (see set_system)
is set to 'true' (which is the default if it is available), the graphics card is used to accelerate
the computation of the visible faces in the model views. Depending on the
graphics card, this is significantly faster than the analytic visibility
computation.
If 'fast_pose_refinement' is set to 'true', the
precomputations necessary for the pose refinement step in
find_shape_model_3d are also performed on the graphics card.
Be aware that the results of the OpenGL projection are slightly different
compared to the analytic projection.
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.
ObjectModel3D
(input_control) object_model_3d →
(handle)
Handle of the 3D object model.
CamParam
(input_control) campar →
(real / integer / string)
Internal camera parameters.
RefRotX
(input_control) angle.rad →
(real)
Reference orientation: Rotation around x-axis or x component of the Rodriguez vector (in radians or without unit).
Default: 0
Suggested values: -1.57, -0.78, -0.17, 0., 0.17, 0.78, 1.57
RefRotY
(input_control) angle.rad →
(real)
Reference orientation: Rotation around y-axis or y component of the Rodriguez vector (in radians or without unit).
Default: 0
Suggested values: -1.57, -0.78, -0.17, 0., 0.17, 0.78, 1.57
RefRotZ
(input_control) angle.rad →
(real)
Reference orientation: Rotation around z-axis or z component of the Rodriguez vector (in radians or without unit).
Default: 0
Suggested values: -1.57, -0.78, -0.17, 0., 0.17, 0.78, 1.57
OrderOfRotation
(input_control) string →
(string)
Meaning of the rotation values of the reference orientation.
Default: 'gba'
List of values: 'abg' , 'gba' , 'rodriguez'
LongitudeMin
(input_control) angle.rad →
(real)
Minimum longitude of the model views.
Default: -0.35
Suggested values: -0.78, -0.35, -0.17
LongitudeMax
(input_control) angle.rad →
(real)
Maximum longitude of the model views.
Default: 0.35
Suggested values: 0.17, 0.35, 0.78
Restriction:
LongitudeMax >= LongitudeMin
LatitudeMin
(input_control) angle.rad →
(real)
Minimum latitude of the model views.
Default: -0.35
Suggested values: -0.78, -0.35, -0.17
Restriction:
- pi / 2 <= LatitudeMin && LatitudeMin <= pi / 2
LatitudeMax
(input_control) angle.rad →
(real)
Maximum latitude of the model views.
Default: 0.35
Suggested values: 0.17, 0.35, 0.78
Restriction:
- pi / 2 <= LatitudeMax && LatitudeMax <= pi / 2 && LatitudeMax >= LatitudeMin
CamRollMin
(input_control) angle.rad →
(real)
Minimum camera roll angle of the model views.
Default: -3.1416
Suggested values: -3.14, -1.57, -0.39, 0.0, 0.39, 1.57, 3.14
CamRollMax
(input_control) angle.rad →
(real)
Maximum camera roll angle of the model views.
Default: 3.1416
Suggested values: -3.14, -1.57, -0.39, 0.0, 0.39, 1.57, 3.14
Restriction:
CamRollMax >= CamRollMin
DistMin
(input_control) number →
(real)
Minimum camera-object-distance of the model views.
Default: 0.3
Suggested values: 0.05, 0.1, 0.2, 0.5
Restriction:
DistMin > 0
DistMax
(input_control) number →
(real)
Maximum camera-object-distance of the model views.
Default: 0.4
Suggested values: 0.1, 0.2, 0.5, 1.0
Restriction:
DistMax >= DistMin
MinContrast
(input_control) number →
(integer)
Minimum contrast of the objects in the search images.
Default: 10
Suggested values: 1, 2, 3, 5, 7, 10, 20, 30, 1000, 2000, 5000
GenParamName
(input_control) attribute.name(-array) →
(string)
Names of (optional) parameters for controlling the behavior of the operator.
Default: []
List of values: 'fast_pose_refinement' , 'lowest_model_level' , 'metric' , 'min_face_angle' , 'min_size' , 'model_tolerance' , 'num_levels' , 'optimization' , 'part_size' , 'union_adjacent_contours'
GenParamValue
(input_control) attribute.name(-array) →
(integer / real / string)
Values of the optional generic parameters.
Default: []
Suggested values: 0, 1, 2, 3, 4, 6, 8, 10, 'auto' , 'none' , 'point_reduction_low' , 'point_reduction_medium' , 'point_reduction_high' , 0.1, 0.2, 0.3, 'ignore_local_polarity' , 'ignore_part_polarity' , 'true' , 'false'
ShapeModel3DID
(output_control) shape_model_3d →
(handle)
Handle of the 3D shape model.
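The restrictions listed for the pose range parameters can be checked before calling the operator. The following sketch collects them in one place. The function name and the message strings are hypothetical, and the sketch covers only the restrictions stated in the parameter list above.

```python
import math


def check_pose_range(longitude_min, longitude_max, latitude_min, latitude_max,
                     cam_roll_min, cam_roll_max, dist_min, dist_max):
    """Return a list of violated restrictions (empty if all are satisfied)."""
    problems = []
    half_pi = math.pi / 2
    if longitude_max < longitude_min:
        problems.append("LongitudeMax must be >= LongitudeMin")
    if not -half_pi <= latitude_min <= half_pi:
        problems.append("LatitudeMin must lie in [-pi/2, pi/2]")
    if not -half_pi <= latitude_max <= half_pi:
        problems.append("LatitudeMax must lie in [-pi/2, pi/2]")
    if latitude_max < latitude_min:
        problems.append("LatitudeMax must be >= LatitudeMin")
    if cam_roll_max < cam_roll_min:
        problems.append("CamRollMax must be >= CamRollMin")
    if dist_min <= 0:
        problems.append("DistMin must be > 0")
    if dist_max < dist_min:
        problems.append("DistMax must be >= DistMin")
    return problems


# The documented default values satisfy all restrictions.
default_problems = check_pose_range(-0.35, 0.35, -0.35, 0.35,
                                    -3.1416, 3.1416, 0.3, 0.4)

# A latitude beyond the pole and a zero distance are both rejected.
bad_problems = check_pose_range(-0.35, 0.35, -0.35, 2.0,
                                -3.1416, 3.1416, 0.0, 0.4)
```

Such a check only validates the ranges formally; whether the object actually appears within the chosen pose range still has to be verified for the application.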
If the parameters are valid, the operator
create_shape_model_3d returns the value 2 (H_MSG_TRUE). If necessary,
an exception is raised. If the parameters are chosen such that all
model views contain too few points, the error 8510 is raised. In the
case that the projected model is bigger than twice the image size in
at least one model view, the error 8910 is raised.
Possible predecessors: read_object_model_3d, project_object_model_3d, get_object_model_3d_params
Possible successors: find_shape_model_3d, write_shape_model_3d, project_shape_model_3d, get_shape_model_3d_params, get_shape_model_3d_contours
See also: convert_point_3d_cart_to_spher, convert_point_3d_spher_to_cart, create_cam_pose_look_at_point, trans_pose_shape_model_3d
Markus Ulrich, Christian Wiedemann, Carsten Steger, “Combining Scale-Space and Similarity-Based Aspect Graphs for Fast 3D Object Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1902-1914, Oct., 2012.
3D Metrology