Today we’d like to share the principles and implementation of the long screenshot feature in Screen Capture.

Long (scrolling) screenshots are common on mobile, but on the PC side they have long been subject to many limitations. Taking the WeCom interface as an example, the main differences and difficulties are as follows:

image1.png

On the mobile side, the interface layout is usually vertical (that is, top, middle, and bottom sections). The scrollable area and the screenshot area sit in the middle and are largely fixed, so they are relatively simple to handle.

On the PC side, by contrast, the layout is much more complicated (left, right, top, bottom, and middle sections), with panels that cross and overlap, no clear boundary for the scrollable area, and a screenshot region that is freely selected by the user and therefore uncontrollable. As a result, the scrolling distance is unpredictable.

The essence of a scrolling screenshot is to capture the screen multiple times during scrolling and stitch the captures together in order, filtering out the overlapping parts. The main process is as follows:

流程图1.png

1. On the main thread, screenshots are triggered by mouse-wheel scrolling (in either automatic or manual scrolling mode) and saved into a queue.

2. On the sub-thread, screenshots taken from the queue are processed: the fixed top and bottom parts are identified and cropped off, the remaining parts are stitched together, and finally the cropped top and bottom parts are restored to the top and bottom of the final scrolling screenshot respectively.

In other words, the core process of the scrolling screenshot is “image stitching” on the sub-thread.

We use the stitching algorithm based on template matching to realize image stitching. The basic principle is as follows:

Take a region of the overlapping area (ROI) in screenshot B as the template and match it against screenshot A to find the best matching position. From that position, calculate the pixel offset that B must be shifted in the Y direction (the stitching is vertical, so only the ordinate needs to be calculated), then copy screenshot B's data into screenshot A starting at position (0, Y).

图片拼接示意图.png

The OpenCV library provides algorithms based on template matching. The detailed process is as follows:

  1. Convert screenshots A and B to grayscale images;

  2. Take the first 50 lines of data in screenshot B as a template image;

  3. Match the template image with screenshot A;

  4. Threshold the matching result matrix;

  5. If the matching value is greater than the threshold, the match is considered successful, and the images are stitched according to the matching coordinates.

OpenCV provides six matching algorithms:

  1. CV_TM_SQDIFF: the squared difference matching method; a perfect match yields 0 and bad matches yield large values.

  2. CV_TM_SQDIFF_NORMED: the normalized squared difference matching method.

  3. CV_TM_CCORR: the correlation matching method, which multiplicatively matches the template against the image, so a perfect match is large and bad matches are small or 0.

  4. CV_TM_CCORR_NORMED: the normalized correlation matching method.

  5. CV_TM_CCOEFF: the correlation coefficient matching method; a perfect match yields 1 and a perfect mismatch yields -1.

  6. CV_TM_CCOEFF_NORMED: the normalized correlation coefficient matching method.

流程图2.png

We obtain more accurate matches (at the cost of more computation) as we move from the simpler measures (squared difference) to the more sophisticated ones (correlation coefficient).

After testing all of these methods, we chose CV_TM_CCOEFF_NORMED for image stitching, as it offers the best trade-off between accuracy and speed.

The normalized correlation coefficient matching method is the most complex similarity algorithm supported by OpenCV, based on the correlation coefficient from mathematical statistics. Specifically, the pixel values of the two images have their respective means subtracted and are then divided by their respective standard deviations.

After these two steps, the image under test and the template are standardized, which ensures the result is unaffected even when the brightness of the image and the template vary independently. The resulting correlation coefficient lies in the interval [-1, 1]: 1 means the two are identical, -1 means their brightness is exactly opposite, and 0 means there is no linear relationship between them.

The key codes of the flow are as follows:

Mat ImageMerger::getMerageImageBasedOnTemplate1(const Mat &image1, const Mat &image2)
{
    // Convert to grayscale images
    Mat image1_gray, image2_gray;
    cvtColor(image1, image1_gray, CV_BGR2GRAY);
    cvtColor(image2, image2_gray, CV_BGR2GRAY);

    // Take the first 50 rows of image 2 as the template
    Mat temp = image2_gray(Range(0, 50), Range::all());

    // Size and data type of the result matrix
    Mat res(image1_gray.rows - temp.rows + 1, image1_gray.cols - temp.cols + 1, CV_32FC1);

    // Match the template with the normalized correlation coefficient method
    matchTemplate(image1_gray, temp, res, CV_TM_CCOEFF_NORMED);

    // Threshold the result matrix
    threshold(res, res, 0.8, 1, CV_THRESH_TOZERO);

    double minVal, maxVal, thresholdv = 0.8;
    Point minLoc, maxLoc;
    minMaxLoc(res, &minVal, &maxVal, &minLoc, &maxLoc);

    // Image stitching
    Mat temp1, result;
    // The match is deemed successful only if the measure exceeds the threshold.
    if (maxVal >= thresholdv)
    {
        // result: the stitched image
        result = Mat::zeros(cvSize(image1.cols, maxLoc.y + image2.rows), image1.type());
        // temp1: the part of image 1 above the match position
        temp1 = image1(Rect(0, 0, image1.cols, maxLoc.y));
        // Copy the non-overlapping part of image 1 and all of image 2 into result
        temp1.copyTo(Mat(result, Rect(0, 0, image1.cols, maxLoc.y)));
        image2.copyTo(Mat(result, Rect(0, maxLoc.y, image2.cols, image2.rows)));
    }
    // Save the merged image
    imwrite("merge.jpg", result);
    return result;
}

The current long screenshot feature meets the requirements of most scenarios, but it still has the following limitations:

  1. There should be no areas at the stitching joints that interfere with the algorithm, such as large blank areas or repeated areas.

  2. There should be no dynamically changing video or animation in the selected area.

We will continue to investigate and optimize the existing algorithms to improve the long screenshot feature. You are also welcome to leave us a message.
