Aggregate Functions in Categorical Data Skyline Search (CDSS) for Multi-keyword Document Search

Bibliographic Details
Main Authors: Mardiah, Mardiah; annisa, annisa; Neyman, Shelvie Nidya
Format: UMS Journal (OJS)
Language: eng
Published: Department of Informatics, Universitas Muhammadiyah Surakarta, Indonesia 2023
Subjects:
Online Access: https://journals.ums.ac.id/index.php/khif/article/view/18127
collection OJS
description A literature review is the first step in research and builds a deep understanding of the research topic. However, finding literature relevant to a research interest is difficult and time-consuming. A skyline query is a method that can be used for filtering: it returns the objects that are not dominated by any other object. An object p is said to dominate an object q if p is at least as good as q on all attributes and strictly better than q on at least one attribute. Categorical Data Skyline Search (CDSS) is an algorithm that filters skyline objects in categorical data such as documents. CDSS uses Extended Distance Wu and Palmer (DEWP) to calculate the distance between the user query and document keywords. The document keywords and user queries are represented as nodes in the ACM CCS ontology, and each document is assumed to be represented by a single keyword. This study extends CDSS to search for skyline documents represented by more than one keyword by adding an aggregate function (average, minimum, or maximum) to the DEWP calculation. The study used thesis documents from the IPB University computer science department. Document keywords were extracted using the Term Frequency-Inverse Document Frequency (TF-IDF) method. The collected keywords were mapped onto a mixed ontology tree based on the Association for Computing Machinery Computing Classification System 2012 (ACM CCS 2012) and the Computer Science Ontology (CSO), both ontology standards in computer science. Block Nested Loop (BNL) was used as the skyline query algorithm for determining skyline documents. The evaluation used the skyline ratio of each aggregate function in CDSS. Based on the ratio value, CDSS with the maximum DEWP produced the most relevant skyline results, compared with the average DEWP and minimum DEWP.
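The approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: the ontology is a tiny hypothetical tree, the distance is the plain Wu and Palmer measure (1 minus similarity) standing in for DEWP, whose exact formula this record does not give, each query term is treated as one skyline dimension, and the Block Nested Loop is reduced to a single in-memory window. All function names and sample documents are invented for illustration.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical toy ontology (child -> parent); the root has parent None.
PARENT: Dict[str, Optional[str]] = {
    "computing": None,
    "information systems": "computing",
    "information retrieval": "information systems",
    "data mining": "information systems",
    "security": "computing",
    "cryptography": "security",
}

# The three aggregate functions named in the abstract.
AGGREGATES = {"average": mean, "minimum": min, "maximum": max}


def path_to_root(node: str) -> List[str]:
    """Return the nodes from `node` up to the root, `node` first."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def wu_palmer_distance(a: str, b: str) -> float:
    """Plain Wu-Palmer distance (assumed stand-in for DEWP); root depth is 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    lca = next(n for n in path_a if n in ancestors_b)  # deepest common ancestor
    depth_lca = len(path_to_root(lca))
    return 1.0 - 2.0 * depth_lca / (len(path_a) + len(path_b))


def distance_vector(query: Sequence[str], keywords: Sequence[str],
                    aggregate: str) -> Tuple[float, ...]:
    """One aggregated distance per query term, taken over all document keywords."""
    agg = AGGREGATES[aggregate]
    return tuple(agg([wu_palmer_distance(t, k) for k in keywords]) for t in query)


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q: no larger distance anywhere, strictly smaller somewhere."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline_documents(docs: Dict[str, List[str]], query: Sequence[str],
                      aggregate: str) -> List[str]:
    """Simplified in-memory BNL: keep a window of non-dominated documents."""
    vectors = {d: distance_vector(query, kws, aggregate) for d, kws in docs.items()}
    window: List[str] = []
    for doc_id, vec in vectors.items():
        if any(dominates(vectors[w], vec) for w in window):
            continue  # dominated by a document already in the window
        window = [w for w in window if not dominates(vec, vectors[w])]
        window.append(doc_id)
    return window


# Hypothetical sample data: document id -> extracted keywords.
DOCS = {
    "d1": ["information retrieval", "data mining"],
    "d2": ["cryptography"],
    "d3": ["data mining"],
    "d4": ["information retrieval"],
}
QUERY = ["information retrieval", "data mining"]

if __name__ == "__main__":
    for name in AGGREGATES:
        print(name, skyline_documents(DOCS, QUERY, name))
```

On this toy data the three aggregates select different skyline sets, which shows why the choice of aggregate function matters. The toy outcome says nothing about the relevance results reported in the study.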
id oai:ojs2.journals.ums.ac.id:article-18127
institution Universitas Muhammadiyah Surakarta
record_format ojs
spelling oai:ojs2.journals.ums.ac.id:article-18127 Department of Informatics, Universitas Muhammadiyah Surakarta, Indonesia 2023-04-10 info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion application/pdf https://journals.ums.ac.id/index.php/khif/article/view/18127 10.23917/khif.v9i1.18127 Khazanah Informatika : Jurnal Ilmu Komputer dan Informatika; Vol. 9 No. 1 April 2023 2477-698X 2621-038X eng https://journals.ums.ac.id/index.php/khif/article/view/18127/8357 Copyright (c) 2023 Mardiah Mardiah, annisa annisa, Shelvie Nidya Neyman https://creativecommons.org/licenses/by/4.0
title Aggregate Functions in Categorical Data Skyline Search (CDSS) for Multi-keyword Document Search
topic categorical data skyline search, aggregate function, ontology, skyline query, term frequency inverse term frequency