Aggregate Functions in Categorical Data Skyline Search (CDSS) for Multi-keyword Document Search
A literature review is the first step in research and builds a deep understanding of the research interest. However, finding literature relevant to that interest is difficult and time-consuming. A skyline query is a method that can be used for filtering. An object p is said to dominate object q if p...
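The dominance test and the skyline filter mentioned in the summary can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each document has already been reduced to a vector of distances to the query (smaller is better), uses the standard definition of dominance (no worse on every attribute, strictly better on at least one), and keeps the whole Block Nested Loop window in memory. The document names and distance values are hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q, where smaller values are better.

    p must be no worse than q on every attribute and strictly better
    on at least one attribute.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def bnl_skyline(points: Dict[str, Sequence[float]]) -> List[str]:
    """In-memory Block Nested Loop: keep a window of non-dominated candidates."""
    window: List[str] = []
    for name, vec in points.items():
        # Discard the incoming point if any window entry dominates it.
        if any(dominates(points[w], vec) for w in window):
            continue
        # Otherwise evict the window entries that the incoming point dominates.
        window = [w for w in window if not dominates(vec, points[w])]
        window.append(name)
    return window


# Hypothetical distance vectors: one distance per query keyword.
docs = {
    "doc_a": (0.2, 0.6),
    "doc_b": (0.5, 0.3),
    "doc_c": (0.6, 0.7),
    "doc_d": (0.2, 0.8),
}
skyline = bnl_skyline(docs)
print(skyline)
```

Here `doc_c` and `doc_d` are both dominated by `doc_a`, while `doc_a` and `doc_b` each win on one attribute, so neither dominates the other.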
Main Authors: | Mardiah, Mardiah; annisa, annisa; Neyman, Shelvie Nidya |
---|---|
Format: | UMS Journal (OJS) |
Language: | eng |
Published: | Department of Informatics, Universitas Muhammadiyah Surakarta, Indonesia, 2023 |
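Subjects: | categorical data skyline search, aggregate function, ontology, skyline query, term frequency inverse term frequency |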
Online Access: | https://journals.ums.ac.id/index.php/khif/article/view/18127 |
_version_ | 1805342481566400512 |
---|---|
author | Mardiah, Mardiah; annisa, annisa; Neyman, Shelvie Nidya |
collection | OJS |
description | A literature review is the first step in research and builds a deep understanding of the research interest. However, finding literature relevant to that interest is difficult and time-consuming. A skyline query is a method that can be used for filtering. An object p is said to dominate an object q if p is at least as good as q on all attributes and strictly better than q on at least one attribute. Categorical Data Skyline Search (CDSS) is an algorithm that can filter skyline objects in categorical data such as documents. CDSS uses Extended Distance Wu and Palmer (DEWP) to calculate the distance between the user query and document keywords. The document keywords and user queries are represented as nodes in the ACM CCS ontology, and each document is assumed to be represented by a single keyword. This study extends the CDSS algorithm to search for skyline documents represented by more than one keyword by adding an aggregate function (average, minimum, or maximum) to the DEWP calculation. The study used thesis documents from the IPB University computer science department. Document keywords were extracted using the Term Frequency-Inverse Document Frequency (TF-IDF) method. The collected keywords were mapped onto a mixed ontology tree based on the Association for Computing Machinery Computing Classification System 2012 (ACM CCS 2012) and the Computer Science Ontology (CSO), the ontology standards in computer science. The skyline documents were determined with the Block Nested Loop (BNL) skyline query algorithm. The evaluation uses the skyline ratio of each aggregate function in CDSS. Based on the ratio value, CDSS using the maximum DEWP yields the most relevant skyline results compared with the average DEWP and minimum DEWP. |
format | UMS Journal (OJS) |
id | oai:ojs2.journals.ums.ac.id:article-18127 |
institution | Universitas Muhammadiyah Surakarta |
language | eng |
publishDate | 2023 |
publisher | Department of Informatics, Universitas Muhammadiyah Surakarta, Indonesia |
record_format | ojs |
spelling | oai:ojs2.journals.ums.ac.id:article-18127; Department of Informatics, Universitas Muhammadiyah Surakarta, Indonesia; 2023-04-10; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; application/pdf; https://journals.ums.ac.id/index.php/khif/article/view/18127; 10.23917/khif.v9i1.18127; Khazanah Informatika : Jurnal Ilmu Komputer dan Informatika; Vol. 9 No. 1 April 2023; 2477-698X; 2621-038X; eng; https://journals.ums.ac.id/index.php/khif/article/view/18127/8357; Copyright (c) 2023 Mardiah Mardiah, annisa annisa, Shelvie Nidya Neyman; https://creativecommons.org/licenses/by/4.0 |
title | Aggregate Functions in Categorical Data Skyline Search (CDSS) for Multi-keyword Document Search |
topic | categorical data skyline search, aggregate function, ontology, skyline query, term frequency inverse term frequency |
url | https://journals.ums.ac.id/index.php/khif/article/view/18127 |
work_keys_str_mv | AT mardiahmardiah aggregatefunctionsincategoricaldataskylinesearchcdssformultikeyworddocumentsearch AT annisaannisa aggregatefunctionsincategoricaldataskylinesearchcdssformultikeyworddocumentsearch AT neymanshelvienidya aggregatefunctionsincategoricaldataskylinesearchcdssformultikeyworddocumentsearch |
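
The description field describes aggregating keyword distances when a document has more than one keyword. The sketch below illustrates that idea under explicit assumptions. It uses the standard Wu and Palmer similarity, turned into a distance as 1 minus similarity, rather than the extended DEWP measure, whose formula is not given in this record. The taxonomy is a hypothetical toy tree, not the real ACM CCS or CSO hierarchy. Applying the aggregate per query keyword across all of a document's keywords is one plausible reading of the description, not a confirmed detail of the method.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical toy taxonomy (child -> parent); NOT the real ACM CCS / CSO tree.
PARENT: Dict[str, Optional[str]] = {
    "computing": None,
    "information systems": "computing",
    "theory": "computing",
    "information retrieval": "information systems",
    "databases": "information systems",
    "query processing": "databases",
    "algorithms": "theory",
}


def path_to_root(node: str) -> List[str]:
    """Return the node followed by its ancestors, ending at the root."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node: str) -> int:
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wu_palmer_distance(a: str, b: str) -> float:
    """Standard Wu-Palmer distance: 1 - 2*depth(lcs) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # The first shared node on the path from a to the root is the deepest common ancestor.
    lcs = next(n for n in path_to_root(a) if n in ancestors_b)
    return 1 - 2 * depth(lcs) / (depth(a) + depth(b))


# The three aggregate functions named in the description.
AGGREGATES: Dict[str, Callable[[Iterable[float]], float]] = {
    "minimum": min,
    "average": mean,
    "maximum": max,
}


def document_vector(query_terms: List[str], doc_keywords: List[str],
                    aggregate: Callable[[Iterable[float]], float]) -> List[float]:
    """One aggregated distance per query keyword (assumed aggregation scheme)."""
    return [
        aggregate([wu_palmer_distance(q, k) for k in doc_keywords])
        for q in query_terms
    ]


query = ["information retrieval", "algorithms"]
keywords = ["query processing", "databases"]
vectors = {name: document_vector(query, keywords, fn) for name, fn in AGGREGATES.items()}
for name, vec in vectors.items():
    print(name, [round(v, 3) for v in vec])
```

The resulting vectors have one entry per query keyword, so they can be fed directly into a skyline filter such as Block Nested Loop. The choice of aggregate changes which documents dominate others.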