OpenCV (short for Open Source Computer Vision) is a library for computer vision, machine learning, and image processing. It can be used to identify patterns in an image, extract features, and perform mathematical operations on it. The first step is to process the puzzle screenshot using OpenCV, so let's have a quick refresher on the basics of OpenCV image processing.
Install the OpenCV Python package
pip install opencv-python
How to load an image
cv.imread reads an image file and converts it to an OpenCV matrix (a NumPy array in Python). If the image cannot be read, because the file is missing or in a format that OpenCV can't understand, the Python binding returns None instead of raising an error (the C++ API returns an empty matrix). The matrix can be displayed in a window using the cv.imshow function.
import cv2 as cv
import numpy as np

# Reading an image
original = cv.imread("<file_name>")
cv.imshow("original", original)
cv.waitKey(0)  # keep the window open until a key is pressed
How to draw a line, circle, rectangle, text on the same image
Once we detect the grid, we need to recreate it using lines and place Q using text. Let's look at a short snippet that draws a line, circle, rectangle, and text on the matrix read above. Note that these functions draw in place on the array they are given and return that same array.
# Drawing a line
line = cv.line(original, (original.shape[1]//2, original.shape[0]//2), (0,0) , (0,255,0), thickness=2)
cv.imshow("line", line)

# Drawing other shapes
circle = cv.circle(line, (line.shape[1]//2, line.shape[0]//2), 50, (0,0,255), thickness=2)
rect = cv.rectangle(circle, (10,10), (circle.shape[1]//2, circle.shape[0]//2), (255,0,0), thickness=2)
text = cv.putText(rect, "Hi", (rect.shape[1]//2, rect.shape[0]//2), cv.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), thickness=2)
cv.imshow("all shapes", text)
How to detect contours
A contour is a curve joining all the continuous points along a boundary that share the same color or intensity. Contours are useful for shape detection and object outline analysis. We will draw our puzzle grid by detecting the individual cells.
# It's best to convert the image to grayscale
# and add a bit of blur for better contour detection;
# since our image is mostly a grid, we don't need blur.
# By default OpenCV reads images as BGR,
# as opposed to the traditional RGB
gray = cv.cvtColor(original, cv.COLOR_BGR2GRAY)
contours, _ = cv.findContours(gray, cv.RETR_TREE, cv.CHAIN_APPROX_NONE)
By default, OpenCV reads images as BGR, as opposed to traditional RGB
Cropping an image
Once we've detected our contours, we crop the image to eliminate unnecessary areas of the screenshot and reduce noise.
# Cropping is essentially selecting the pixels we need from the entire image.
# NumPy indexes rows (height, shape[0]) first, then columns (width, shape[1])
cropped = original[0:original.shape[0]//2, 0:original.shape[1]//2]
cv.imshow("cropped", cropped)
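The index order is easy to get backwards, because OpenCV points are written (x, y) while array slices are [rows, columns], that is [y, x]. A small sketch on a synthetic array, with dimensions chosen only for illustration, makes the order visible:

```python
import numpy as np

# Stand-in for a screenshot: 400 rows (height), 600 columns (width), 3 channels
image = np.zeros((400, 600, 3), dtype=np.uint8)
height, width = image.shape[:2]

# Rows (y) come first, columns (x) second
top_left_quarter = image[0:height // 2, 0:width // 2]
print(top_left_quarter.shape)  # (200, 300, 3)

# Slicing returns a view into the same memory; copy() gives an independent crop
crop = top_left_quarter.copy()
```

The copy matters if the crop will be drawn on later, since drawing on a view would also change the source image.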
First, we load the image into memory and convert it to grayscale. This is a common preprocessing step because it reduces the image's complexity and simplifies contour detection. Next, we find the contours and sort them by area, largest first. Typically, the largest contour is the bounding box of the whole image, so we use the second-largest contour to isolate the puzzle grid. Then we crop the image to keep the grid and nothing else. We find contours again; with the noise reduced, the grid cells are detected more reliably. Finally, we determine the number of cells in the grid, iterate over each cell, take its average color, and assign a number to each color, which gives us the 2D array of our puzzle.
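import math  # needed for math.sqrt below

# Read the input image and save the original
# (cv.imwrite does not create directories, so solution/ must already exist)
original = cv.imread(file_name)
cv.imwrite("solution/original.png", original)

# Convert the image to grayscale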
gray = cv.cvtColor(original, cv.COLOR_BGR2GRAY)
# Find contours in the grayscale image and sort them by area
contours, _ = cv.findContours(gray, cv.RETR_TREE, cv.CHAIN_APPROX_NONE)
contours = sorted(contours, key=cv.contourArea, reverse=True)
# Extract the bounding box of the puzzle grid (using the second largest contour)
x, y, w, h = cv.boundingRect(contours[1])
# Crop the grid area from the original image
grid = original[y:y+h, x:x+w]
cv.imwrite("solution/grid.png", grid)
# Convert the cropped grid to grayscale
gray = cv.cvtColor(grid, cv.COLOR_BGR2GRAY)
cv.imwrite("solution/gray-grid.png", gray)
# Find contours again in the cropped grayscale grid
contours, _ = cv.findContours(gray, cv.RETR_TREE, cv.CHAIN_APPROX_NONE)
contours = sorted(contours, key=cv.contourArea)
# Determine the total number of cells in the grid
total_cells = len(contours) - 2
grid_size = int(math.sqrt(total_cells))
# Check if the detected cells form a complete square grid
if total_cells != grid_size**2:
    print("Unable to detect full grid! Aborting")
    raise SystemExit(1)
# Calculate individual cell dimensions
cell_width = w // grid_size
cell_height = h // grid_size
# Initialize color mappings and board representation
colors = []
board = []
color_index = 1
color_map = {}
reverse_color_map = {}
padding = 15  # pixels trimmed from each cell edge to skip the grid lines
# Iterate through each cell in the grid
for i in range(grid_size):
    row = []
    for j in range(grid_size):
        # Calculate cell coordinates with padding
        cell_x = j * cell_width
        cell_y = i * cell_height
        cell = grid[cell_y+padding:cell_y+cell_height-padding, cell_x+padding:cell_x+cell_width-padding]
        # Get the average color of the cell
        avg_color = cell.mean(axis=0).mean(axis=0)
        avg_color = avg_color.astype(int)
        avg_color = tuple(avg_color)
        # Map the color to a unique index if not already mapped
        if avg_color not in color_map:
            color_map[avg_color] = str(color_index)
            reverse_color_map[str(color_index)] = avg_color
            color_index += 1
        # Add the color index to the row
        row.append(color_map[avg_color])
    # Add the row to the board
    board.append(row)
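To see the color-to-index step in isolation, it can be run on a tiny synthetic grid with no contour detection involved. The sketch below is a condensed variant of the loop above, not a replacement for it. build_board and its parameters are hypothetical names, and it uses math.isqrt (Python 3.8+) for an exact integer square root where the code above uses int(math.sqrt(...)):

```python
import math
import numpy as np


def build_board(grid, total_cells, padding=2):
    """Map each cell's average color to a string index, row by row."""
    grid_size = math.isqrt(total_cells)
    if grid_size ** 2 != total_cells:
        raise ValueError("cell count is not a perfect square")
    cell_h = grid.shape[0] // grid_size
    cell_w = grid.shape[1] // grid_size
    color_map, board = {}, []
    for i in range(grid_size):
        row = []
        for j in range(grid_size):
            # Trim `padding` pixels from every edge to avoid the cell borders
            cell = grid[i * cell_h + padding:(i + 1) * cell_h - padding,
                        j * cell_w + padding:(j + 1) * cell_w - padding]
            avg = tuple(int(c) for c in cell.reshape(-1, 3).mean(axis=0))
            # setdefault assigns the next index the first time a color appears
            row.append(color_map.setdefault(avg, str(len(color_map) + 1)))
        board.append(row)
    return board


# Synthetic 2x2 grid of 20-pixel cells: left column one color, right column another
grid = np.zeros((40, 40, 3), dtype=np.uint8)
grid[:, :20] = (255, 0, 0)
grid[:, 20:] = (0, 0, 255)
print(build_board(grid, total_cells=4))  # [['1', '2'], ['1', '2']]
```

Raising an exception when the cell count is not a perfect square stops the run before any cell dimensions are computed from a wrong grid size.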