[Python] Korean OCR with Naver Cloud

Data Science LAB

ㅅ ㅜ ㅔ ㅇ 2022. 10. 18. 00:52


In this post I use the OCR API offered by Naver CLOVA, the AI platform developed by Naver, to detect text regions in an image and recognize the text in them (OCR).

Naver Cloud Platform

https://www.ncloud.com/


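1. Creating a domain

If you do not have a Naver Cloud Platform account, sign up and log in.

First, register a payment method.

Click My Page -> Payment Method Management.

New sign-ups receive 100,000 KRW in credit.

Once the payment method is registered, click Services.

Click AI Service -> CLOVA OCR.

Apply to use the service.

When CLOVA OCR appears on the left side of the console, the application is complete.

Apply to use API Gateway, which is needed to generate the Invoke URL.

Services -> Application Services -> API Gateway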

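Adding a domain

Go to Naver Cloud Platform and click CLOVA OCR -> Domain -> Create Domain.

Enter a domain name and code, select Korean as the supported language, and choose General as the service type.

Set the recognition model to Basic.

Click the Demo button to try the domain out.

Drag an image into the demo page to test it.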

2. Developing with the Naver CLOVA OCR API

Dataset layout

- train.csv and test.csv contain the path of each image

- the train and test folders contain the images themselves

e.g. train_00001.png
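As a rough illustration of how a path stored in the CSV maps onto a file on disk, here is a minimal sketch. It assumes the CSV paths are relative and start with "./" (the loop later in this post drops the first two characters with i_path[2:]), and that the data lives under '../data'. The helper name resolve_image_path is hypothetical.

```python
import os


def resolve_image_path(data_dir, rel_path):
    """Join a CSV path such as './train/train_00001.png' onto data_dir."""
    # normpath removes the leading './' so it does not end up inside the joined path.
    return os.path.join(data_dir, os.path.normpath(rel_path))


# Assumed example values; adjust to your own directory layout.
print(resolve_image_path('../data', './train/train_00001.png'))
```

Unlike slicing with [2:], this also works when a path has no "./" prefix.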

CLOVA OCR is called through a REST API.
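To make the request format concrete, the sketch below builds the JSON message that accompanies each image. The field names (images, requestId, version, timestamp) are the ones used in the request code later in this post. The function name build_ocr_message is hypothetical, and the sketch does not send anything over the network.

```python
import json
import time
import uuid


def build_ocr_message(image_format, name='demo'):
    """Return the JSON string that goes into the 'message' form field."""
    request_json = {
        'images': [{'format': image_format, 'name': name}],
        'requestId': str(uuid.uuid4()),               # unique id per request
        'version': 'V2',
        'timestamp': int(round(time.time() * 1000)),  # current time in milliseconds
    }
    return json.dumps(request_json)


print(build_ocr_message('png'))
```

Keeping this in a small function makes it easy to pass the real image format instead of a fixed value.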

import json
import os
import platform
import time
import uuid

import cv2
import numpy as np
import pandas as pd
import requests
from matplotlib import pyplot as plt
from PIL import ImageFont, ImageDraw, Image
from tqdm import tqdm

Assign the Invoke URL and the secret key that were generated when you created the domain to api_url and secret_key, respectively.

api_url = '<YOUR_API_URL>'
secret_key = '<YOUR_SECRET_KEY>'
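If you would rather not paste the key into a notebook, one option is to read both values from environment variables. This is only a sketch: the variable names CLOVA_OCR_API_URL and CLOVA_OCR_SECRET_KEY and the helper load_ocr_credentials are made up for this example and are not defined by Naver Cloud.

```python
import os


def load_ocr_credentials(env=None):
    """Return (api_url, secret_key) read from environment variables."""
    env = os.environ if env is None else env
    try:
        # Hypothetical variable names; pick whatever fits your setup.
        return env['CLOVA_OCR_API_URL'], env['CLOVA_OCR_SECRET_KEY']
    except KeyError as missing:
        raise RuntimeError(f'Set the environment variable {missing} first') from None


# Usage, once both variables are set in your shell:
# api_url, secret_key = load_ocr_credentials()
```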

Create a function for displaying the output images

def plt_imshow(title='image', img=None, figsize=(8, 5)):
    """Show one image, or a list of images side by side, with matplotlib."""
    plt.figure(figsize=figsize)

    if isinstance(img, list):
        # Reuse a single title for every image unless a list of titles is given.
        titles = title if isinstance(title, list) else [title] * len(img)

        for i, image in enumerate(img):
            # OpenCV images are grayscale or BGR; matplotlib expects RGB.
            if image.ndim <= 2:
                rgb_img = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB)
            else:
                rgb_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

            plt.subplot(1, len(img), i + 1)
            plt.imshow(rgb_img)
            plt.title(titles[i])
            plt.xticks([])
            plt.yticks([])
        plt.show()
    else:
        if img.ndim < 3:
            rgb_img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
        else:
            rgb_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        plt.imshow(rgb_img)
        plt.title(title)
        plt.xticks([])
        plt.yticks([])
        plt.show()

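Create a function for drawing Korean text

OpenCV's cv2.putText cannot render Hangul, so the text is drawn with Pillow instead.

def put_text(image, text, x, y, color=(0, 255, 0), font_size=22):
    """Draw text (including Hangul) on an OpenCV image using Pillow."""
    if isinstance(image, np.ndarray):
        # OpenCV stores BGR; Pillow expects RGB.
        color_converted = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = Image.fromarray(color_converted)

    # Pick a font with Hangul glyphs for the current OS.
    if platform.system() == 'Darwin':
        font = 'AppleGothic.ttf'
    elif platform.system() == 'Windows':
        font = 'malgun.ttf'
    else:
        # Assumption: a Korean font such as NanumGothic is installed (e.g. on Linux).
        font = 'NanumGothic.ttf'

    try:
        image_font = ImageFont.truetype(font, font_size)
    except OSError:
        # Font file not found: fall back to Pillow's default font,
        # which may not be able to render Hangul.
        image_font = ImageFont.load_default()

    draw = ImageDraw.Draw(image)
    draw.text((x, y), text, font=image_font, fill=color)

    # Convert back to an OpenCV BGR array.
    numpy_image = np.array(image)
    return cv2.cvtColor(numpy_image, cv2.COLOR_RGB2BGR)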

Test

# `path` is the image file and `result` is the parsed JSON response for it,
# obtained with the request code shown later in this post.
img = cv2.imread(path)
roi_img = img.copy()

for field in result['images'][0]['fields']:
    text = field['inferText']
    vertices_list = field['boundingPoly']['vertices']
    # Each vertex is a dict; read x and y by key instead of relying on value order.
    pts = [(int(v['x']), int(v['y'])) for v in vertices_list]
    top_left, top_right, bottom_right, bottom_left = pts[:4]

    # Draw the four edges of the detected text box in green.
    cv2.line(roi_img, top_left, top_right, (0, 255, 0), 2)
    cv2.line(roi_img, top_right, bottom_right, (0, 255, 0), 2)
    cv2.line(roi_img, bottom_right, bottom_left, (0, 255, 0), 2)
    cv2.line(roi_img, bottom_left, top_left, (0, 255, 0), 2)
    # Write the recognized text just above the box.
    roi_img = put_text(roi_img, text, top_left[0], top_left[1] - 10, font_size=30)

    print(text)

plt_imshow("Original", img, figsize=(16, 10))
plt_imshow("ROI", roi_img, figsize=(16, 10))
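To separate parsing from drawing, the response can first be flattened into (text, box) pairs. The sketch below assumes the response shape used in the loop above (images[0].fields, each with inferText and boundingPoly.vertices holding x and y). The sample dictionary is hand-written for illustration and is not real API output. extract_fields is a hypothetical helper name.

```python
def extract_fields(result):
    """Return a list of (text, [(x, y), ...]) pairs from an OCR response dict."""
    fields = []
    for field in result['images'][0].get('fields', []):
        vertices = field['boundingPoly']['vertices']
        # Coordinates may arrive as floats; OpenCV drawing calls need ints.
        box = [(int(v['x']), int(v['y'])) for v in vertices]
        fields.append((field['inferText'], box))
    return fields


# Hand-written sample that mimics the keys used in this post.
sample = {
    'images': [{
        'fields': [{
            'inferText': '안녕',
            'boundingPoly': {'vertices': [
                {'x': 10.0, 'y': 20.0}, {'x': 90.0, 'y': 20.0},
                {'x': 90.0, 'y': 50.0}, {'x': 10.0, 'y': 50.0},
            ]},
        }],
    }],
}

for text, box in extract_fields(sample):
    print(text, box)
```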

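Extracting results

test_df = pd.read_csv('../data/test.csv')

results = []
for i_path in tqdm(test_df['img_path']):
    # i_path[2:] drops the first two characters of the CSV path (presumably a leading './').
    path = os.path.join('../data', i_path[2:])
    # Declare the actual file type (e.g. 'png') instead of hard-coding 'jpg'.
    image_format = os.path.splitext(path)[1].lstrip('.').lower()
    request_json = {
        'images': [{'format': image_format, 'name': 'demo'}],
        'requestId': str(uuid.uuid4()),
        'version': 'V2',
        'timestamp': int(round(time.time() * 1000)),
    }
    payload = {'message': json.dumps(request_json).encode('UTF-8')}
    headers = {'X-OCR-SECRET': secret_key}

    # Open the image in a with-block so the file handle is closed after the request.
    with open(path, 'rb') as f:
        response = requests.post(api_url, headers=headers, data=payload, files=[('file', f)])
    result = response.json()

    text = ''
    try:
        # Concatenate every recognized field into one string for this image.
        text = ''.join(field['inferText'] for field in result['images'][0]['fields'])
        print(text)
    except (KeyError, IndexError, TypeError):
        # No usable fields in the response: keep the empty string.
        pass
    results.append(text)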

Because the for loop sends one request at a time, this approach is slow, so I recommend it only when the dataset is small.
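One possible way to reduce the waiting time, not used in this post, is to issue several requests concurrently with a thread pool. The sketch below only shows the pattern: ocr_many and fake_ocr are hypothetical names, and fake_ocr stands in for a function that would post a single image and return its text. Whether the service tolerates concurrent calls depends on the limits of your plan, which this post does not cover.

```python
from concurrent.futures import ThreadPoolExecutor


def ocr_many(paths, ocr_fn, max_workers=4):
    """Apply ocr_fn to every path concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as `paths`.
        return list(executor.map(ocr_fn, paths))


def fake_ocr(path):
    # Stand-in for a function that posts one image and returns the recognized text.
    return f'text for {path}'


print(ocr_many(['a.png', 'b.png'], fake_ocr))
```

Threads suit this case because each call spends most of its time waiting on the network rather than using the CPU.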
