DSpace Repository

A Comparison of Different Approaches to Document Representation in Turkish Language


dc.creator YILDIRIM, Savaş
dc.creator YILDIZ, Tuğba
dc.date 2018-08-15T00:00:00Z
dc.date.accessioned 2019-07-09T12:00:50Z
dc.date.available 2019-07-09T12:00:50Z
dc.identifier http://dergipark.org.tr/sdufenbed/issue/38975/456349
dc.identifier.uri http://acikerisim.sdu.edu.tr/xmlui/handle/123456789/46944
dc.description Recently, deep learning methods have demonstrated state-of-the-art performance in numerous complex Natural Language Processing (NLP) problems. Easy access to high-performance computing resources and open-source libraries makes Artificial Intelligence (AI) approaches more applicable for researchers. This sudden growth of available techniques has shaped and improved standards in the field of NLP. The range of open-source libraries and the large body of research now available give us an opportunity to compare different approaches to document representation. We evaluate four different paradigms for representing documents: traditional bag-of-words approaches, topic modeling, embedding-based approaches, and deep learning. As the main contribution of this article, we evaluate all of these representation approaches with suitable machine learning algorithms for the document categorization problem in the Turkish language. The supervised architecture uses a benchmark dataset specifically prepared for this language. Within the architecture, we evaluate the representation approaches with corresponding machine learning algorithms such as Support Vector Machine (SVM), multinomial Naive Bayes (m-NB), and so forth. We conduct a variety of experiments and present successful results for Turkish document categorization. We also observe that traditional approaches still achieve results comparable to those of neural network models in document classification.
dc.format application/pdf
dc.publisher Süleyman Demirel University
dc.publisher Süleyman Demirel Üniversitesi
dc.relation http://dergipark.org.tr/download/article-file/528955
dc.source Volume: 22, Issue: 2, 569-576
dc.source 1308-6529
dc.subject Document representation; Deep learning; Natural language processing
dc.title A Comparison of Different Approaches to Document Representation in Turkish Language
dc.type info:eu-repo/semantics/article

