I would like to find an area in about 1,5K images which are all in a similar format. They all are scans of painted or photographed images of persons. They all feature the same color card. The color cards may be placed on either side of the image (see sample image below).
The result should be an image, only containing the person's portrait.
I am able to find the color card with opencv template matching:
import cv2
import numpy as np
method = cv2.TM_SQDIFF_NORMED
# Read the images from the file
img_rgb = cv2.imread('./imgs/test_portrait.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('./portraet_color_card.png', 0)
w, h = template.shape[::-1]
result = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = .97
loc = np.where(result >= threshold)
for pt in zip(*loc[::-1]):
print("Found:", pt)
cv2.rectangle(img_rgb, pt, (pt[0] w, pt[1] h), (0,255,255), 2)
cv2.imwrite('result.png',img_rgb)
Output:
Found: (17, 303)
Found: (18, 303)
Found: (17, 304)
Found: (18, 304)
With the coordinates and the image dimensions, I am able to determine if the image is left or right and can crop the image. The result is far from perfect, as the borders still are there.
Is there a better way to extract the portraits from the images? I would prefer to work with python and opencv but I am open to other suggestions on how to solve this problem for a larger number of images.
CodePudding user response:
This solution assumes that there are mainly two main objects in the image:
- The portrait (supposed to be larger than the pattern, and the largest object in the image).
- The pattern.

Solution Steps in order:
Classical Image processing to obtain the important features from the image:
- Conversion to Gray level.
- Gaussian Blur to reduce noise and smooth the image.
- Edged Detection, using
Cannyin my case. - Morphological Dilation to group the features into two main patterns.
- Largest Connected components Detection (credit to an old
Two segmented cells provide us with a pretty simple validation approach (compare dimensions, relative location, etc). At this point we can easily find color card background (it is better to do multiple measurements near identified color cells):
As you can see some noise, lossy compression artefacts affect the result, but it is still good enough (yet another validation possibility: compare cell and card size). At this point, we can do additional measurements in order to find colors of the background.
Let's review simple cases first: results seem to be good enough, so final crop and small correctness can be easily implemented:
Some cases will not be that straightforward:
I'd suggest investing more time in validation rules and processing all tricky cases manually, but with some additional time "common trickiness" can be also addressed.
Anyway, here is a brief summary:
- use key colors to reliably identify the color card (and do initial validation)
- do multiple measurements to find color card background (so you can use a smaller threshold)
- do multiple measurements to define image background
- validate strategy is a must, so it will be easier to process some small amount of leftovers manually

PS: white on white is fun, but Kazimir Malevich did that quite a while ago, no need to repeat :)













