Home > Enterprise >  How to compare two poly lines for equality?
How to compare two poly lines for equality?

Time:01-25

I have two poly lines (paths), each represented by an array of 2D points. I'd like to compute a distance or similarity score between the two. There may be a different number of points in each array. If you were to plot the poly lines and they were directly on top of each other, the distance should be zero.

p1 = np.array([[0,0], [5,5], [9,9]])
p2 = np.array([[0,0], [3,3], [6,6], [9,9]])
p3 = np.array([[0,0], [3,4], [6,7], [9,9]])
p4 = np.array([[0,0], [0,9]])

polyline_dist(p1, p2) # Should be 0 since the plots are identical
polyline_dist(p1, p3) # Should be small since the plots are close
polyline_dist(p1, p4) # Should be larger since the plots are much different

I tried one approach where I calculated the distance from each point in array 1 to the line segments from array 2 and took the minimum distance, then took the average over all the points. This worked, but got very slow for longer arrays with hundreds of points.

Any suggestions would be welcome!

CodePudding user response:

You could try finding the area each plot casts onto the x and y axis, then comparing the intersection of that area. Similar to with calculus where you have two curves f(x) and g(x), you can find the area between the curves using the integral from the lower bound to the upper bound of (f(x) - g(x)) dx. If your lines don't have overlapping domains/codomains, you may need to add some penalty and start at the maximum of p1[0] and p2[0] and end at the minimum of p1[-1] and p2[-1]. Polylines that are the same would have 0 distance between each other and thus the area between each polyline would be 0.You would check both the x and y axis for cases where polylines area heavily vertical. I wrote the following code minus the sample_at_point part due to it getting late, all that part needs to do is find the lower and upper bounding points of p in p1 for the given axis, then define a line equation and find where the passed point p lies on that line equation and return the value. I'll edit this answer tomorrow but figured I would post what I have right now. This solution would work for the polylines you show above however it would fail if the polylines are not similar to basic curves (ex: a circle/any curve where f(x) could have two different values).

def polydist(p1, p2):
    x_area = area_between_squared(p1, p2, axis=0, value=1)
    y_area = area_between_squared(p1, p2, axis=1, value=0)

    return (x_area   y_area)**0.5


def sample_at_point(p1, p, axis, value):
    """
    # There should be a fast way to do this,
    # Finnd the bounding points that p is between,
    # the bounding points are the max point < p and the min point > p
    # these points form a line,
    # find the value of p applied to that line formula.
    """


def area_between_squared(p1, p2, axis, value):
    start = max(p1[0][axis], p2[0][axis])
    start_penalty = start - min(p1[0][axis], p2[0][axis])
    end = min(p1[-1][axis], p2[-1][axis])
    end_penalty = max(p1[-1][axis], p2[-1][axis]) - end

    # Getting late, may edit this later tomorrow with the code for this to work but feel free to implement the following idea below:
    axis_points = [x[axis] for x in p1]
    axis_points.extend([x[axis] for x in p2])
    axis_points.sort()

    prev_p1_v = sample_at_point(p1, axis_points[0], axis, value)
    prev_p2_v = sample_at_point(p2, axis_points[0], axis, value)
    prev_axis_point = axis_points[0]

    axis_points.pop(0) # remove the first item

    sum = 0

    for e in axis_points:
        p1_v = sample_at_point(p1, e, axis, value)
        p2_v = sample_at_point(p2, e, axis, value)

        point_c = p1_v - p2_v
        point_b = prev_p1_v - prev_p2_v

        area = 0.5 * (point_c   point_b) * (e - prev_axis_point)
        sum  = area

        prev_p1_v = p1_v
        prev_p2_v = p2_v
        prev_axis_point = e

    sum *= sum # square the result value, add the penalties since this difference should be positive
    sum  = start_penalty   end_penalty

    return sum

CodePudding user response:

Thanks for the comments and answers. They actually led me to a good solution that used the OpenCV function pointPolygonTest, which is optimized much more than my poorly written numpy code.

The approach was to turn the polyline into a contour by appending the points in reverse order, then walk through the points of the first array and call pointPolygonTest to get the distance. I found squaring the distance helped amplify smaller differences. I then take the average over all the points.

Here's the code:

import cv2
import numpy as np

def point_to_polyline_dist(p, polyline):
    cnt = np.concatenate((polyline, polyline[::-1]))
    return np.abs(cv2.pointPolygonTest(cnt, (int(p[0]),int(p[1])), True))

def polyline_dist(polyline1, polyline2):
    return np.mean([point_to_polyline_dist(p, polyline2)**2 for p in polyline1])


p1 = np.array([[0,0], [5,5], [9,9]])
p2 = np.array([[0,0], [3,3], [6,6], [9,9]])
p3 = np.array([[0,0], [3,4], [6,7], [9,9]])
p4 = np.array([[0,0], [0,9]])

print(polyline_dist(p1, p2)) # 0.0
print(polyline_dist(p1, p3)) # 0.1666666666666667
print(polyline_dist(p1, p4)) # 35.333333333333336

This works well in my case because my poly lines have lots of short line segments. It might be more robust to compute the distance from both arrays to each other and add the distances, like this:

def polyline_dist_better(polyline1, polyline2):
    dist_1to2 = np.mean([point_to_polyline_dist(p, polyline2)**2 for p in polyline1])
    dist_2to2 = np.mean([point_to_polyline_dist(p, polyline1)**2 for p in polyline2])
    return dist_1to2   dist_2to1

Hope that helps someone out there!

  •  Tags:  
  • Related