How to recognize the Total on a customer check

    February 28, 2017

    Our goal is to extract the “TOTAL” line from a scan or photo of a check. For this project we need two libraries:

    • OpenCV for image processing;
    • Tesseract for text recognition.

    First, we need to collect checks or photos of them so that we can tune the parameters of the Canny edge detector until it finds check edges reliably. So you need to visit a lot of different shops or cafes. Or ask your girlfriend; she will do it better. Our program must do two things:

    • detect the check in the photo;
    • recognize the text on the check.

    Check detection can be done with the OpenCV library. Whether you program in Python or C++, the steps are:

    1. load the image with OpenCV;
    2. convert the image from BGR to grayscale;
    3. blur the image with a filter;
    4. detect edges with the Canny algorithm;
    5. find contours and crop the image;
    6. rotate it, if needed.

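    To load the image with OpenCV and display it:

    import cv2
    # imread returns the image as a BGR array (or None if the file cannot be read)
    img = cv2.imread('1.jpg')
    # show the loaded image and wait for a key press
    cv2.imshow("Image", img)
    cv2.waitKey(0)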

    After that, we convert the image to grayscale, smooth it with a median blur, and detect edges with the Canny algorithm:

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 13)
    edged = cv2.Canny(gray, 10, 120)

    The last step is to find the contours (we keep a contour only if it approximates to at least four vertices, because a check is a rectangle), crop the image, and save it in the right format (TIFF):

    # Close small gaps in the edge map so the check outline forms one contour.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 15))
    closed = cv2.morphologyEx(edged, cv2.MORPH_CLOSE, kernel)
    # [-2] selects the contour list in OpenCV 2.x, 3.x and 4.x alike.
    cnts = cv2.findContours(closed.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    for c in cnts:
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        if len(approx) >= 4:
            # Estimate the top-left (x1, y1) and bottom-right (x2, y2) corners.
            if approx[0, 0, 0] < approx[1, 0, 0]:
                x1 = approx[0, 0, 0]
                y1 = approx[0, 0, 1]
            else:
                x1 = approx[1, 0, 0]
                y1 = approx[0, 0, 1]
            if approx[2, 0, 0] > approx[3, 0, 0]:
                x2 = approx[2, 0, 0]
                y2 = approx[2, 0, 1]
            else:
                x2 = approx[3, 0, 0]
                y2 = approx[2, 0, 1]
            cropped = img[y1:y2, x1:x2]
            if cropped.size == 0:
                continue  # skip contours that give an empty crop
            height, width = cropped.shape[:2]
            if width > height:
                # Landscape crop: turn it upright. cv2.rotate swaps the
                # dimensions, so nothing is clipped (unlike warpAffine
                # with the original width and height).
                rotated = cv2.rotate(cropped, cv2.ROTATE_90_CLOCKWISE)
            else:
                rotated = cropped
            # Binarize for OCR: dark text on a white background.
            gray = cv2.cvtColor(rotated, cv2.COLOR_BGR2GRAY)
            blur = cv2.GaussianBlur(gray, (3, 3), 5)
            thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                           cv2.THRESH_BINARY_INV, 11, 2)
            # Separate name so the source image is not overwritten inside the loop.
            result = cv2.bitwise_not(thresh)
            cv2.imwrite('1.tif', result)
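
    The crop above reads the corners from the first four vertices, so it depends on the order in which approxPolyDP happens to return them. A minimal sketch of an alternative, assuming an axis-aligned crop is good enough: take the minimum and maximum over all vertices. The helper name bounding_box is hypothetical and not part of the original code; OpenCV's cv2.boundingRect is a comparable built-in that returns (x, y, width, height).

```python
import numpy as np


def bounding_box(approx):
    """Return (x1, y1, x2, y2) enclosing all vertices of a contour.

    `approx` is expected in the (N, 1, 2) shape produced by cv2.approxPolyDP.
    """
    points = np.asarray(approx).reshape(-1, 2)  # flatten to N rows of (x, y)
    x1, y1 = points.min(axis=0)  # smallest x and smallest y
    x2, y2 = points.max(axis=0)  # largest x and largest y
    return int(x1), int(y1), int(x2), int(y2)


# Example with four made-up vertices listed in arbitrary order.
approx = np.array([[[50, 20]], [[10, 25]], [[12, 200]], [[55, 190]]])
x1, y1, x2, y2 = bounding_box(approx)
print(x1, y1, x2, y2)  # 10 20 55 200
```

    The crop then stays the same as before: img[y1:y2, x1:x2].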

    At each step you can choose different parameters to get better results than mine, because each image calls for different filter and blur settings. Experiment with other functions to improve the algorithm, or just for fun.
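
    One way to avoid hand-picking the Canny thresholds for every photo is to derive them from the image itself. The sketch below uses a common heuristic that is not part of the original code: place the two thresholds a fixed fraction below and above the median pixel intensity. The function name canny_thresholds and the default sigma of 0.33 are assumptions to tune on your own checks.

```python
import numpy as np


def canny_thresholds(gray, sigma=0.33):
    """Suggest (lower, upper) Canny thresholds from the median intensity.

    `gray` is a single-channel 8-bit image; `sigma` controls how far the
    thresholds sit from the median (a tunable assumption).
    """
    median = float(np.median(gray))
    lower = int(round(max(0.0, (1.0 - sigma) * median)))
    upper = int(round(min(255.0, (1.0 + sigma) * median)))
    return lower, upper


# Example on a uniform synthetic image with intensity 100.
gray_demo = np.full((4, 4), 100, dtype=np.uint8)
print(canny_thresholds(gray_demo))  # (67, 133)
```

    With OpenCV this would be used as lower, upper = canny_thresholds(gray), followed by edged = cv2.Canny(gray, lower, upper).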

    Recognizing the text on the cropped image

    To solve this task, we need to install PyTesseract (a Python wrapper that also requires the Tesseract engine itself). Once it is installed, we pass it the prepared .tif image:

    from pytesseract import image_to_string
    from PIL import Image
    # lang='rus' selects the Russian language data
    total = image_to_string(Image.open('1.tif'), lang='rus')
    print(total)

    By default, PyTesseract uses English. I passed lang='rus' to recognize the Russian text on the check. To use a non-default language, download its trained data file from https://github.com/tesseract-ocr/tessdata (for Russian, rus.traineddata) and put it in Tesseract's tessdata folder. image_to_string then returns the recognized text as a string.

    The full code is available here: https://github.com/andrewdemchenkodeveloper/ChequeRecognition
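
    The OCR output is the whole check as plain text, so the total still has to be picked out of it. A minimal sketch of that last step, assuming the total sits on a line that starts with a keyword followed by an amount: the keywords (“TOTAL”, “ИТОГ”, “ИТОГО”), the helper name find_total, and the amount format are all assumptions, and real checks may need a different pattern (for example, amounts with thousands separators).

```python
import re

# Assumed layout: a line starting with the keyword, then the amount on the same line.
TOTAL_RE = re.compile(
    r"^[ \t]*(?:TOTAL|ИТОГО?)\b[^\d\n]*(\d+(?:[.,]\d{1,2})?)",
    re.IGNORECASE | re.MULTILINE,
)


def find_total(ocr_text):
    """Return the amount on the first 'total' line as a float, or None."""
    match = TOTAL_RE.search(ocr_text)
    if match is None:
        return None
    # Accept both "12.50" and "12,50" as decimal notation.
    return float(match.group(1).replace(",", "."))


# Example on made-up OCR output; "SUBTOTAL" is skipped because the
# keyword must start the line.
sample = "SUBTOTAL 10.00\nTOTAL 12.50"
print(find_total(sample))  # 12.5
```

    In the pipeline above this would be called on the string returned by image_to_string.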

    I hope you enjoyed this intro to image processing and computer vision. And now you know how much money your girlfriend spent 🙂 I am sure there will be more posts like this in the future. If you have any comments or questions, please contact us.

    • #Image processing
    • #Opencv
    • #Pytesseract
    • #Python
    • #Text Recognizing
