
I have a dataset of images that all look like this one.

sample image

The task is to crop away as much of the surrounding white space as possible and return an image with less white border around the object:

import numpy as np

def crop_object(img):
    cols = []
    # hold min and max of width (columns containing non-white pixels)
    for i in range(img.shape[1]):
        r = img[:,i,0]
        g = img[:,i,1]
        b = img[:,i,2]

        if (np.min(r) != 255) or (np.min(g) != 255) or (np.min(b) != 255):
            cols.append(i)
    a1 = min(cols)
    a2 = max(cols)

    rows = []
    # hold min and max of height (rows containing non-white pixels)
    for i in range(img.shape[0]):
        r = img[i,:,0]
        g = img[i,:,1]
        b = img[i,:,2]

        if (np.min(r) != 255) or (np.min(g) != 255) or (np.min(b) != 255):
            rows.append(i)
    a3 = min(rows)
    a4 = max(rows)

    # slice ends are exclusive, so add 1 to keep the last row and column
    return img[a3:a4+1, a1:a2+1, :]

I want a more Pythonic way to handle this: less code and a faster run.

Can you help me guys?


4 Answers

3

Inspired by Crop black border of image using NumPy, here are two ways of cropping -

# I. Crop to remove all white rows and columns across entire image
def crop_image(img):
    mask = img!=255
    mask = mask.any(2)
    mask0,mask1 = mask.any(0),mask.any(1)
    return img[np.ix_(mask1,mask0)]

# II. Crop the white border while keeping any inner all-white rows or columns
def crop_image_v2(img):
    mask = img!=255
    mask = mask.any(2)
    mask0,mask1 = mask.any(0),mask.any(1)
    # end indices are exclusive: len - (number of trailing all-white columns/rows)
    colstart, colend = mask0.argmax(), len(mask0)-mask0[::-1].argmax()
    rowstart, rowend = mask1.argmax(), len(mask1)-mask1[::-1].argmax()
    return img[rowstart:rowend, colstart:colend]

Using a tolerance

As mentioned in that linked post, we might want to use some tolerance. In that case, the mask creation step would change to -

tol = 255 # tolerance value
mask = img<tol
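
Put together, a tolerance-aware crop might look like the sketch below. The name crop_image_tol and the default tol=240 are illustrative assumptions, not taken from the linked post. It treats a pixel as content when any channel falls below tol, and returns the image unchanged when no such pixel exists.

```python
import numpy as np

def crop_image_tol(img, tol=240):
    """Crop near-white borders from an H x W x 3 uint8 image (hypothetical helper)."""
    # A pixel counts as content if any of its channels is darker than tol
    mask = (img < tol).any(axis=2)
    if not mask.any():
        return img  # nothing but background: leave the image as is
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    # +1 because slice ends are exclusive
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Synthetic check: white canvas, one off-white border pixel, one dark block
demo = np.full((10, 12, 3), 255, dtype=np.uint8)
demo[0, 0] = 250      # JPG-like off-white pixel in the border
demo[4:7, 5:9] = 30   # the "object", 3 rows by 4 columns
print(crop_image_tol(demo).shape)  # (3, 4, 3)
```

With tol=255 the off-white pixel at (0, 0) counts as content, so the crop would extend to the top-left corner; lowering the tolerance is what makes the border noise disappear.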

Timings -

# Read in given image
In [119]: img = cv2.imread('9Aplg.jpg')

# With original soln
In [120]: %timeit crop_object(img)
5.46 ms ± 401 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [121]: %timeit crop_image(img)
923 µs ± 4.96 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [122]: %timeit crop_image_v2(img)
672 µs ± 53.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1
  • Just what I needed. so simple, so clean and speed performance comparison made it perfect. Commented Sep 24, 2020 at 15:43
1

Here is a similar approach that relies more on OpenCV. Three run times on my Mac Mini are shown at the bottom. Note that your image is a JPG, so the white is not pure white, especially near the object, because of JPG compression. So I used cv2.inRange() to do color thresholding. Alternatively, one could convert to grayscale and apply a simple threshold at 220; my timings for that variant were similar but slightly longer.

import cv2
import numpy as np
import time

start = time.time()

# load image
img = cv2.imread("object2crop.jpg")

# get color bounds of white background
lower =(220,220,220) # lower bound for each channel
upper = (255,255,255) # upper bound for each channel

# create the mask
mask = cv2.inRange(img, lower, upper)

# get bounds of black pixels
black = np.where(mask==0)
xmin, ymin, xmax, ymax = np.min(black[1]), np.min(black[0]), np.max(black[1]), np.max(black[0])
print(xmin,xmax,ymin,ymax)

# crop the image at the bounds (slice ends are exclusive, so add 1)
crop = img[ymin:ymax+1, xmin:xmax+1]

# write result to disk
cv2.imwrite("object2crop_cropped.jpg", crop)

end = time.time()
elapsed_time = end - start
print("time:",elapsed_time)

# display it
cv2.imshow("mask", mask)
cv2.imshow("crop", crop)
cv2.waitKey(0)

# time: 0.0021338462829589844
# time: 0.002237081527709961
# time: 0.0021467208862304688
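
The bounds step in the script above can be isolated into a small helper that also copes with an image that contains no object at all. This is only a sketch using NumPy; mask_bounds is a hypothetical name, and the mask follows the cv2.inRange() convention used above (255 for background, 0 for object).

```python
import numpy as np

def mask_bounds(mask):
    """Return inclusive (ymin, ymax, xmin, xmax) of the zero pixels in a 2-D mask.

    Hypothetical helper; returns None when the mask has no zero pixels.
    """
    ys, xs = np.nonzero(mask == 0)
    if ys.size == 0:
        return None
    return ys.min(), ys.max(), xs.min(), xs.max()

# Synthetic mask: 255 = background, 0 = object
mask = np.full((8, 8), 255, dtype=np.uint8)
mask[2:5, 3:7] = 0
ymin, ymax, xmin, xmax = mask_bounds(mask)
# The bounds are inclusive but slice ends are exclusive, hence the +1
crop = mask[ymin:ymax + 1, xmin:xmax + 1]
print(crop.shape)  # (3, 4)
```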
1

This method is just slightly faster than my first one. It uses more OpenCV. Here I threshold, take the largest contour, and then get its bounding box. If the background were not JPG compressed, finding the largest contour would be unnecessary, since no extraneous pixels would be left after thresholding and there would be only one external contour.

import cv2
import numpy as np
import time

start = time.time()

# load image
img = cv2.imread("object2crop.jpg")

# get color bounds of white background
lower =(220,220,220) # lower bound for each channel
upper = (255,255,255) # upper bound for each channel

# create the mask
mask = cv2.inRange(img, lower, upper)
mask = cv2.bitwise_not(mask)

# get the largest contour
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
big_contour = max(contours, key=cv2.contourArea)

# get bounding box
x,y,w,h = cv2.boundingRect(big_contour)

# crop the image at the bounds
crop = img[y:y+h, x:x+w]

# write result to disk
cv2.imwrite("object2crop_cropped3.jpg", crop)

end = time.time()
elapsed_time = end - start
print("time:",elapsed_time)

# display it
cv2.imshow("mask", mask)
cv2.imshow("crop", crop)
cv2.waitKey(0)

# time: 0.002028942108154297
# time: 0.0019147396087646484
# time: 0.0021567344665527344
0

We can use the splitImageAtXvalues() function I wrote a few days ago. It takes an image and n x-values as input and returns n+1 subimages. For example, if xvals=[20] and your image is 40 pixels wide, it returns two subimages, one covering columns 0 to 19 and the other covering columns 20 to 39. So for your case, we just have to find the x-values where the non-white pixels start from the left (x1) and from the right (x2), and then take the middle subimage returned by splitImageAtXvalues. I included the threshold as a parameter since in your case there are some not purely white pixels around the image's content.

def splitImageAtXvalues(img, xvals):
    subimages = []
    # x-values index columns, so the last bound is the image width (shape[1])
    xvals = [0] + xvals + [img.shape[1]]
    for j in range(len(xvals)-1):
        subimg= img[:, xvals[j]:xvals[j+1]]
        subimages.append(subimg)
    return subimages

import cv2

def crop_object_by_whitespace(img, threshold):
    x1 = None
    x2 = None
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # convert to grayscale
    img_b = cv2.threshold(img_gray,threshold,255,cv2.THRESH_BINARY)[1] # convert to binary
    width = img_b.shape[1] # number of columns
    # Loop from left:
    for i in range(width):
        column = img_b[:,i]
        if (column != 255).any(): # column contains non-white pixels
            x1 = i
            break
    # Loop from right:
    for i in range(width,0,-1):
        column = img_b[:,i-1]
        if (column != 255).any(): # column contains non-white pixels
            x2 = i
            break
    if x1 is None: # image is entirely white: nothing to crop
        return img
    return splitImageAtXvalues(img, [x1, x2])[1]

crop_object_by_whitespace(img, 240) # 240 seems to work well for your image

Result with threshold = 240

Result

Result with threshold = 254

Result
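
As a side note, NumPy's np.split already does this kind of splitting when given a list of cut points and axis=1, which may help if shorter code is the goal. A minimal sketch on a synthetic array; the cut points x1 and x2 below are made-up values, not bounds computed from the sample image.

```python
import numpy as np

# Stand-in for an image: 5 rows, 40 columns, 3 channels
img = np.zeros((5, 40, 3), dtype=np.uint8)

# One cut point at x = 20 gives two pieces: columns 0-19 and columns 20-39
left, right = np.split(img, [20], axis=1)
print(left.shape, right.shape)  # (5, 20, 3) (5, 20, 3)

# Two cut points give three pieces; the middle one is the cropped region
x1, x2 = 8, 30  # hypothetical bounds
middle = np.split(img, [x1, x2], axis=1)[1]
print(middle.shape)  # (5, 22, 3)
```

For a plain crop, the slice img[:, x1:x2] gives the same middle piece directly.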

1
  • Good idea. but what I really need is to have more pythonic and of course shorter code, but this answer is longer than my original approach. However I learned a lot from the answer. thanks Commented Sep 24, 2020 at 15:21
