Süperpiksel tabanlı satır bölütleme = Superpixel based text-line segmentation /

Demir, Ali Alper, 1989- author 196857
Özkaya, Ufuk, 1981- thesis advisor 23746
Süleyman Demirel Üniversitesi. Fen Bilimleri Enstitüsü. Elektronik ve Haberleşme Mühendisliği Anabilim Dalı. 9124 issuing body

dc.creator	Demir, Ali Alper, 1989- author 196857
dc.creator	Özkaya, Ufuk, 1981- thesis advisor 23746
dc.creator	Süleyman Demirel Üniversitesi. Fen Bilimleri Enstitüsü. Elektronik ve Haberleşme Mühendisliği Anabilim Dalı. 9124 issuing body
dc.date	2019.
dc.identifier	http://tez.sdu.edu.tr/Tezler/TF04249.pdf
dc.description	Tarihçilerin ve araştırmacıların tarihi veya el yazısı dokümanlar üzerinde araştırma yapabilmesi için ilgili dokümanı teker teker incelemesi gerekmektedir. Tarihi belgeler bilgisayar ortamına aktarıldıktan sonra araştırılmak istenen kelime girilerek belgedeki kelimenin geçtiği kısmın bulunması araştırmacı için büyük bir kolaylık sağlayacaktır. Bunun için el yazısı karakterlerin tanınması gerekmektedir. Doküman analizi uygulamalarından biri olan optik karakter tanıma (OCR) sistemlerinde bölütleme aşaması en önemli aşamalardan biridir. Satır bölütleme işlemi optik karakter tanıma sistemlerindeki aşamalar arasında ilk sıralarda yer aldığından dolayı daha iyi bir tanıma başarısı elde edilmesi için satırların yüksek doğrulukta bölütlenmesi gerekmektedir. Böylece devam eden diğer işlemlere daha doğru bir giriş verisi sağlanmış olur. Matbu belgeler için satır bölütleme işlemi başarılı bir şekilde yapılabilmektedir. El yazısı belgeler için satır bölütleme işlemi metin satırlarının eğik, eğri, dalgalı olması, satırlar arası boşlukların darlığı, örtüşen ve temas eden bileşenlerden dolayı hala zorlayıcı bir problemdir. Bu tez çalışmasında, el yazısı dokümanlar için süperpiksel tabanlı bir satır bölütleme yöntemi önerilmiştir. Önerilen yöntem Arapça, Çince ve İngilizce el yazısı doküman imgelerinden oluşan veri seti üzerinde uygulanıp performans metrikleri elde edilmiştir. Ayrıca tez kapsamında önerilen yöntem 853 adet Çince el yazısı doküman imgesi içeren HIT-MW veri seti üzerinde uygulanarak, % 98.03 tespit oranı, % 97.66 tanıma doğruluğu elde edilmiştir. Önerilen yöntem literatürde bulunan diğer yöntemlerle karşılaştırılmıştır. Anahtar Kelimeler: Süperpiksel, Satır Bölütleme, El Yazısı Belge, Doküman İmgesi Analizi, Metin Satırı Çıkarımı
dc.description	Historians and researchers who study historical or handwritten documents must examine each document one by one. Once historical documents have been digitized, being able to enter a search word and locate where it occurs in a document is a great convenience for the researcher. This requires recognizing handwritten characters. In optical character recognition (OCR) systems, one application of document analysis, segmentation is one of the most important stages. Because text-line segmentation is among the first steps in an OCR system, lines must be segmented with high accuracy to achieve good recognition performance; this gives the subsequent stages more accurate input. For printed documents, text-line segmentation can be performed successfully. For handwritten documents, it remains a challenging problem because text lines are skewed, curved, or wavy, the gaps between lines are narrow, and components overlap and touch. This thesis proposes a superpixel-based text-line segmentation method for handwritten documents. The proposed method was applied to a dataset of Arabic, Chinese, and English handwritten document images, and performance metrics were obtained. In addition, the method was applied to the HIT-MW dataset, which contains 853 Chinese handwritten document images, yielding a detection rate of 98.03% and a recognition accuracy of 97.66%. The proposed method was compared with other methods in the literature. Keywords: Superpixels, Text Line Segmentation, Handwritten Document, Document Image Analysis, Text-line Extraction
dc.description	Thesis (Master's) - Süleyman Demirel Üniversitesi, Fen Bilimleri Enstitüsü, Elektronik ve Haberleşme Mühendisliği Anabilim Dalı, 2019.
dc.description	Includes bibliography.
dc.language	tur
dc.publisher	Isparta : Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü,
dc.subject	Süleyman Demirel Üniversitesi
dc.title	Süperpiksel tabanlı satır bölütleme = Superpixel based text-line segmentation /
dc.type	text

Files in this item

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following Collection(s)

Fen Bilimleri Enstitüsü
Contains the Fen Bilimleri Enstitüsü collections.
