[项目准备] 论使用深度学习进行卡片识别如何轻松省力获取大量数据用四张卡片生成大量数据集

Author： MrXiao
发布时间：January 25, 2019
2325views
No comments
7964 words
Categories：深度学习旧文归档

之前用OpenCV实现了《德国心脏病》桌游的卡片识别，但效果并不是很好：视频流识别时对光线条件的要求极其苛刻，基本上是被机器玩而不是玩机器了。这让我有点恼火，CV刚刚入门的我也没啥好解决办法，只好大炮打蚊子上目标检测了，可又有问题了——深度学习要大量训练集，而这四种水果有啥好训练的？无非就是适应一下光线环境和背景，顺便变换一下角度。跟之前的无人机高空识别不同，这次的卡片识别根本没有数据集支持（想到这里就想给VisDrone团队点一万个赞，暴风感谢！）

转念一想呀，咱能不能就拍四张照片或者四段视频，写段脚本让它自己旋转跳跃，顺便还能加上多样的背景使得模型能在复杂背景中表现极佳？

项目地址：Github传送门

参考项目：playing-card-detection

准备工作

必要的包：OpenCV和Numpy，还有以下不常用的：

imgaug : https://github.com/aleju/imgaug

用于图片放大。

shapely : https://github.com/Toblerity/Shapely

用于笛卡尔坐标系下的图像变换与几何图形分析。用来确认一张卡片是否被另一张卡片覆盖。

tqdm : https://github.com/tqdm/tqdm

进度条工具。

Win10下Python3.5安装shapely可能报错：

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      …………
        lgeos = CDLL("geos_c.dll")
      File "g:\python\pyhton36\lib\ctypes\__init__.py", line 348, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: [WinError 126] 找不到指定的模块。

可以进入https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely手动下载对应Windows和Python版本的whl，pip install *.whl手动安装。

四段以纯色为背景的视频，需要我们依次拍摄以四种水果在中心的卡片，最好持续20s以上以获得最佳效果。

下载数据集：Describable Textures Dataset (DTD).
数据集中有多种多样的背景可供选择。

思路

其实一想到考虑的问题如此之多，我打算放弃用工具直接手动拍个几百来张然后手动标注来着。功夫不负有心人，我找到了playing-card-detection这个用于针对单张扑克牌数据集生成的项目。

看到这张图我简直抑制不住激动的心情，你是匹配角落里的花色和数字，我是匹配中心的图案，需要修改的只是几个定位参数而已：

# 单位：mm
cardW = 57
cardH = 87
cornerXmin = 17
cornerXmax = 39
cornerYmin = 31
cornerYmax = 60
# 以上数据由手工根据我手中的卡牌大小测量，可适当修改或根据随后debug结果酌情调整

随后的操作基本就是依葫芦画瓢了，有改动的地方我会额外指出。

看着大佬的代码，我们又能学到另一种从单调背景中提取卡片的方法：

def extract_card(img, output_fn=None, min_focus=20, debug=False):

    imgwarp = None

    # Check the image is not too blurry
    focus = varianceOfLaplacian(img)
    if focus < min_focus:
        if debug: print("Focus too low :", focus)
        return False, None

    # Convert in gray color
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Noise-reducing and edge-preserving filter
    gray = cv2.bilateralFilter(gray, 11, 17, 17)

    # Edge extraction
    edge = cv2.Canny(gray, 30, 200)

    # Find the contours in the edged image
    _, cnts, _ = cv2.findContours(edge.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # We suppose that the contour with largest area corresponds to the contour delimiting the card
    cnt = sorted(cnts, key=cv2.contourArea, reverse=True)[0]

    # We want to check that 'cnt' is the contour of a rectangular shape
    # First, determine 'box', the minimum area bounding rectangle of 'cnt'
    # Then compare area of 'cnt' and area of 'box'
    # Both areas sould be very close
    rect = cv2.minAreaRect(cnt)
    box = cv2.boxPoints(rect)
    box = np.int0(box)
    areaCnt = cv2.contourArea(cnt)
    areaBox = cv2.contourArea(box)
    valid = areaCnt / areaBox > 0.95

    if valid:
        # We want transform the zone inside the contour into the reference rectangle of dimensions (cardW,cardH)
        ((xr, yr), (wr, hr), thetar) = rect
        # Determine 'Mp' the transformation that transforms 'box' into the reference rectangle
        if wr > hr:
            Mp = cv2.getPerspectiveTransform(np.float32(box), refCard)
        else:
            Mp = cv2.getPerspectiveTransform(np.float32(box), refCardRot)
        # Determine the warped image by applying the transformation to the image
        imgwarp = cv2.warpPerspective(img, Mp, (cardW, cardH))
        # Add alpha layer
        imgwarp = cv2.cvtColor(imgwarp, cv2.COLOR_BGR2BGRA)

        # Shape of 'cnt' is (n,1,2), type=int with n = number of points
        # We reshape into (1,n,2), type=float32, before feeding to perspectiveTransform
        cnta = cnt.reshape(1, -1, 2).astype(np.float32)
        # Apply the transformation 'Mp' to the contour
        cntwarp = cv2.perspectiveTransform(cnta, Mp)
        cntwarp = cntwarp.astype(np.int)

        # We build the alpha channel so that we have transparency on the
        # external border of the card
        # First, initialize alpha channel fully transparent
        alphachannel = np.zeros(imgwarp.shape[:2], dtype=np.uint8)
        # Then fill in the contour to make opaque this zone of the card
        cv2.drawContours(alphachannel, cntwarp, 0, 255, -1)

        # Apply the alphamask onto the alpha channel to clean it
        alphachannel = cv2.bitwise_and(alphachannel, alphamask)

        # Add the alphachannel to the warped image
        imgwarp[:, :, 3] = alphachannel

        # Save the image to file
        if output_fn is not None:
            cv2.imwrite(output_fn, imgwarp)

    if debug:
        cv2.imshow("Gray", gray)
        cv2.imshow("Canny", edge)
        edge_bgr = cv2.cvtColor(edge, cv2.COLOR_GRAY2BGR)
        cv2.drawContours(edge_bgr, [box], 0, (0, 0, 255), 3)
        cv2.drawContours(edge_bgr, [cnt], 0, (0, 255, 0), -1)
        cv2.imshow("Contour with biggest area", edge_bgr)
        if valid:
            cv2.imshow("Alphachannel", alphachannel)
            cv2.imshow("Extracted card", imgwarp)

    return valid, imgwarp

与之前项目单纯的我相比，大佬首先检查了图片是否清晰（因为可能从视频中提取的关键帧是模糊的），然后多了降噪和边缘检测部分，专业玩CV的跟我们这种入门爱好者还真是不一样，看来之前那个基于OpenCV检测卡片的项目还能再优化一下。

基于原项目的修改如下：

调整上述定位参数。
修改了上述卡片检测的阈值min_focus，从120调至20，在我实践过程中清晰的图片focus在50左右波动。
删除了扑克列表，加入了如下水果列表。card_fruits = ['Bananas','Lemons','Strawberries','Grapes']
在所有调用了matplotlib.pyplot库的函数末尾加入plt.show()以便在Pycharm等IDE中即时显示。
findHull函数使用多边形逼近水果的具体形状，由于角落的扑克标记与水果大小显然不一致，此函数中标注“TWEAK”参数均可根据需要调整，我在此仅调整了` min_hull_area = 2000 # TWEAK, deck and 'zoom' dependant
max_hull_area = 8000 # TWEAK, deck and 'zoom' dependant`
删除了原项目三卡片叠加的操作，因为三卡片叠加时完全会挡住卡片中心的图案，同时修改了叠加卡片的逻辑：

            for i in range(1, 3):
                # smallPoly1: shapely polygon of one of the hull of card 1
                smallPoly1 = kps_to_polygon(self.lkps1[i].keypoints[:])
                a = smallPoly1.area
                # We calculate area of the intersection of card 1 corner with card 2
                intersect = mainPoly2.intersection(smallPoly1)
                ai = intersect.area  # Center intersection area
                if ai != 0:  # Require two center parts can not intersect with each other， even partially
                    invalid = True
                    break
                else:
                    self.listbba.append(BBA(self.bbs1[i - 1], class1))
                '''
                # If intersection area is small enough, we accept card 2
                if (a - ai) / a > 1 - intersect_ratio:
                    self.listbba.append(BBA(self.bbs1[i - 1], class1))
                # If intersectio area is not small, but also not big enough, we want apply new transform to card 2
                elif (a - ai) / a > intersect_ratio:
                    invalid = True
                    break
                '''

被注释掉的部分是原逻辑，显然我们需要的是图案不能有一丝重叠，而不是“轻微的重叠是可以接受的”。

使用方法

由于这份代码比较特殊，原本应该在控制台一步步执行的，我在此项目中将它用py文件包装了，你需要逐个搜索“STEP1”“STEP2”这样我事先添加的注释，去掉它周围的三个单引号注释逐一执行，TEST也是如法炮制。有的STEP只需要执行一次，我也给出了标注。

注意全程需要大量内存，大概像这样，所以至少使用8GB内存的设备执行本代码或者自行删减背景数据集，同时关闭不必要的应用。

总结

看大佬的代码总是愉快的，但如果一直只会看大佬的代码那就永远摆脱不了面对GitHub编程的名号，因此需要不断在大佬代码中学习精华，用到自己的程序中去。

今天累了，学了一整天的MLP还得做笔记，明天可能要做做船模换换脑子。船模的主要配件明天就到，程序都写好了就等实际运行咯！

Last modification：January 25th, 2019 at 11:46 pm

If you think my article is useful to you, please feel free to appreciate

[项目准备] 论使用深度学习进行卡片识别如何轻松省力获取大量数据用四张卡片生成大量数据集

MrXiao • 2019 年 01 月 25 日

项目地址：Github传送门

参考项目：playing-card-detection

准备工作

必要的包：OpenCV和Numpy，还有以下不常用的：

imgaug : https://github.com/aleju/imgaug

用于图片放大。

shapely : https://github.com/Toblerity/Shapely

用于笛卡尔坐标系下的图像变换与几何图形分析。用来确认一张卡片是否被另一张卡片覆盖。

tqdm : https://github.com/tqdm/tqdm

进度条工具。

Win10下Python3.5安装shapely可能报错：

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      …………
        lgeos = CDLL("geos_c.dll")
      File "g:\python\pyhton36\lib\ctypes\__init__.py", line 348, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: [WinError 126] 找不到指定的模块。

可以进入https://www.lfd.uci.edu/~gohlke/pythonlibs/#shapely手动下载对应Windows和Python版本的whl，pip install *.whl手动安装。

四段以纯色为背景的视频，需要我们依次拍摄以四种水果在中心的卡片，最好持续20s以上以获得最佳效果。

下载数据集：Describable Textures Dataset (DTD).
数据集中有多种多样的背景可供选择。

思路

看到这张图我简直抑制不住激动的心情，你是匹配角落里的花色和数字，我是匹配中心的图案，需要修改的只是几个定位参数而已：

# 单位：mm
cardW = 57
cardH = 87
cornerXmin = 17
cornerXmax = 39
cornerYmin = 31
cornerYmax = 60
# 以上数据由手工根据我手中的卡牌大小测量，可适当修改或根据随后debug结果酌情调整

随后的操作基本就是依葫芦画瓢了，有改动的地方我会额外指出。

看着大佬的代码，我们又能学到另一种从单调背景中提取卡片的方法：

def extract_card(img, output_fn=None, min_focus=20, debug=False):

    imgwarp = None

    # Check the image is not too blurry
    focus = varianceOfLaplacian(img)
    if focus < min_focus:
        if debug: print("Focus too low :", focus)
        return False, None

    # Convert in gray color
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Noise-reducing and edge-preserving filter
    gray = cv2.bilateralFilter(gray, 11, 17, 17)

    # Edge extraction
    edge = cv2.Canny(gray, 30, 200)

    # Find the contours in the edged image
    _, cnts, _ = cv2.findContours(edge.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # We suppose that the contour with largest area corresponds to the contour delimiting the card
    cnt = sorted(cnts, key=cv2.contourArea, reverse=True)[0]

    # We want to check that 'cnt' is the contour of a rectangular shape
    # First, determine 'box', the minimum area bounding rectangle of 'cnt'
    # Then compare area of 'cnt' and area of 'box'
    # Both areas sould be very close
    rect = cv2.minAreaRect(cnt)
    box = cv2.boxPoints(rect)
    box = np.int0(box)
    areaCnt = cv2.contourArea(cnt)
    areaBox = cv2.contourArea(box)
    valid = areaCnt / areaBox > 0.95

    if valid:
        # We want transform the zone inside the contour into the reference rectangle of dimensions (cardW,cardH)
        ((xr, yr), (wr, hr), thetar) = rect
        # Determine 'Mp' the transformation that transforms 'box' into the reference rectangle
        if wr > hr:
            Mp = cv2.getPerspectiveTransform(np.float32(box), refCard)
        else:
            Mp = cv2.getPerspectiveTransform(np.float32(box), refCardRot)
        # Determine the warped image by applying the transformation to the image
        imgwarp = cv2.warpPerspective(img, Mp, (cardW, cardH))
        # Add alpha layer
        imgwarp = cv2.cvtColor(imgwarp, cv2.COLOR_BGR2BGRA)

        # Shape of 'cnt' is (n,1,2), type=int with n = number of points
        # We reshape into (1,n,2), type=float32, before feeding to perspectiveTransform
        cnta = cnt.reshape(1, -1, 2).astype(np.float32)
        # Apply the transformation 'Mp' to the contour
        cntwarp = cv2.perspectiveTransform(cnta, Mp)
        cntwarp = cntwarp.astype(np.int)

        # We build the alpha channel so that we have transparency on the
        # external border of the card
        # First, initialize alpha channel fully transparent
        alphachannel = np.zeros(imgwarp.shape[:2], dtype=np.uint8)
        # Then fill in the contour to make opaque this zone of the card
        cv2.drawContours(alphachannel, cntwarp, 0, 255, -1)

        # Apply the alphamask onto the alpha channel to clean it
        alphachannel = cv2.bitwise_and(alphachannel, alphamask)

        # Add the alphachannel to the warped image
        imgwarp[:, :, 3] = alphachannel

        # Save the image to file
        if output_fn is not None:
            cv2.imwrite(output_fn, imgwarp)

    if debug:
        cv2.imshow("Gray", gray)
        cv2.imshow("Canny", edge)
        edge_bgr = cv2.cvtColor(edge, cv2.COLOR_GRAY2BGR)
        cv2.drawContours(edge_bgr, [box], 0, (0, 0, 255), 3)
        cv2.drawContours(edge_bgr, [cnt], 0, (0, 255, 0), -1)
        cv2.imshow("Contour with biggest area", edge_bgr)
        if valid:
            cv2.imshow("Alphachannel", alphachannel)
            cv2.imshow("Extracted card", imgwarp)

    return valid, imgwarp

基于原项目的修改如下：

调整上述定位参数。
修改了上述卡片检测的阈值min_focus，从120调至20，在我实践过程中清晰的图片focus在50左右波动。
删除了扑克列表，加入了如下水果列表。card_fruits = ['Bananas','Lemons','Strawberries','Grapes']
在所有调用了matplotlib.pyplot库的函数末尾加入plt.show()以便在Pycharm等IDE中即时显示。
findHull函数使用多边形逼近水果的具体形状，由于角落的扑克标记与水果大小显然不一致，此函数中标注“TWEAK”参数均可根据需要调整，我在此仅调整了` min_hull_area = 2000 # TWEAK, deck and 'zoom' dependant
max_hull_area = 8000 # TWEAK, deck and 'zoom' dependant`
删除了原项目三卡片叠加的操作，因为三卡片叠加时完全会挡住卡片中心的图案，同时修改了叠加卡片的逻辑：

            for i in range(1, 3):
                # smallPoly1: shapely polygon of one of the hull of card 1
                smallPoly1 = kps_to_polygon(self.lkps1[i].keypoints[:])
                a = smallPoly1.area
                # We calculate area of the intersection of card 1 corner with card 2
                intersect = mainPoly2.intersection(smallPoly1)
                ai = intersect.area  # Center intersection area
                if ai != 0:  # Require two center parts can not intersect with each other， even partially
                    invalid = True
                    break
                else:
                    self.listbba.append(BBA(self.bbs1[i - 1], class1))
                '''
                # If intersection area is small enough, we accept card 2
                if (a - ai) / a > 1 - intersect_ratio:
                    self.listbba.append(BBA(self.bbs1[i - 1], class1))
                # If intersectio area is not small, but also not big enough, we want apply new transform to card 2
                elif (a - ai) / a > intersect_ratio:
                    invalid = True
                    break
                '''

被注释掉的部分是原逻辑，显然我们需要的是图案不能有一丝重叠，而不是“轻微的重叠是可以接受的”。

使用方法

注意全程需要大量内存，大概像这样，所以至少使用8GB内存的设备执行本代码或者自行删减背景数据集，同时关闭不必要的应用。

总结

今天累了，学了一整天的MLP还得做笔记，明天可能要做做船模换换脑子。船模的主要配件明天就到，程序都写好了就等实际运行咯！

[项目准备] 论使用深度学习进行卡片识别如何轻松省力获取大量数据用四张卡片生成大量数据集

准备工作

思路

使用方法

总结

Leave a Comment Cancel reply

在中国，当亲友遭遇非法集资型诈骗，与警察打交道时要注意些什么？（第一次报警纪实&经验总结）

每日自动打新可转债你也想半年白捡1600块吗？ Auto-Make-Money项目手记

PyTorch DistributedDataParallel 单机多卡训练踩坑记录

Windows服务器断开远程桌面RDP后保持桌面活跃(active)的方法

MMDetection v1.x 升级至 v2.x：模型与配置文件升级的踩坑记录

深刻理解激活函数与反向传播（Back Propagation）以及代码实现

位运算在程序设计中的奇淫技巧

树莓派3 到手后一定要做的几件事

此内容被密码保护

[C++] 当复制构造函数遇上对象数组时参数表中const关键字带来的疑问

[项目准备] 论使用深度学习进行卡片识别如何轻松省力获取大量数据用四张卡片生成大量数据集

准备工作

思路

使用方法

总结

准备工作

思路

使用方法

总结

Leave a Comment Cancel reply

[项目准备] 论使用深度学习进行卡片识别如何轻松省力获取大量数据 用四张卡片生成大量数据集

准备工作

思路

使用方法

总结

[项目准备] 论使用深度学习进行卡片识别如何轻松省力获取大量数据用四张卡片生成大量数据集