2024 Vol.12, Issue 4

Research Article

31 December 2024. pp. 93-105
Abstract
This study proposes a method for automatically generating sameAs RDF triples, which state that entities identified by different URIs in the Linked Open Data (LOD) cloud are in fact the same. A sameAs link is directed from a source entity to a target entity, so a search on the source entity can be extended with the target entity's content, yielding richer search results. Generating a sameAs link requires deciding whether the source and target entities are identical. To this end, the study proposes Entropy-based Pseudo Identifier Construction and Entity Sameness Evaluation (EPIC&ESE), which introduces the concepts of the pseudo-identifier and entropy. Predicate properties with high entropy are selected to construct pseudo-identifiers, and the similarity of the source and target object values connected to those properties is evaluated to judge whether the source and target entities are sufficiently identical. Experiments confirmed that EPIC&ESE generates sets of sameAs links reliably and scalably.
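
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the entropy formula, the number of predicates selected, the similarity measure, or the decision threshold. The sketch assumes Shannon entropy over each predicate's object values, a top-k selection, `difflib.SequenceMatcher` as the string similarity, and a fixed threshold. All function names, URIs, and sample values are hypothetical.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def predicate_entropy(entities, predicate):
    """Shannon entropy (in bits) of the object values one predicate takes."""
    values = [e[predicate] for e in entities.values() if predicate in e]
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def build_pseudo_identifier(entities, k=2):
    """Assumption: the pseudo-identifier is the k highest-entropy predicates."""
    predicates = {p for e in entities.values() for p in e}
    ranked = sorted(predicates, key=lambda p: (-predicate_entropy(entities, p), p))
    return ranked[:k]


def sameness_score(source, target, pseudo_id):
    """Mean string similarity over the pseudo-identifier; a missing value scores 0."""
    if not pseudo_id:
        return 0.0
    scores = []
    for p in pseudo_id:
        if p in source and p in target:
            a, b = source[p].lower(), target[p].lower()
            scores.append(SequenceMatcher(None, a, b).ratio())
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


def generate_same_as(sources, targets, k=2, threshold=0.85):
    """Emit directed (source, owl:sameAs, target) triples above the threshold."""
    pseudo_id = build_pseudo_identifier(sources, k)
    links = []
    for s_uri, s in sources.items():
        for t_uri, t in targets.items():
            if sameness_score(s, t, pseudo_id) >= threshold:
                links.append((s_uri, OWL_SAME_AS, t_uri))
    return links


# Hypothetical sample data: predicate -> object value for each entity URI.
sources = {
    "http://example.org/a/1": {"name": "Seoul National University", "country": "KR", "founded": "1946"},
    "http://example.org/a/2": {"name": "Korea University", "country": "KR", "founded": "1905"},
    "http://example.org/a/3": {"name": "Yonsei University", "country": "KR", "founded": "1885"},
}
targets = {
    "http://example.org/b/x": {"name": "Seoul National University", "country": "KR", "founded": "1946"},
    "http://example.org/b/y": {"name": "Yonsei Univ.", "country": "KR", "founded": "1885"},
}

for triple in generate_same_as(sources, targets):
    print(" ".join(f"<{term}>" for term in triple), ".")
```

In this sample, `country` takes the same value for every source entity, so its entropy is zero and it is left out of the pseudo-identifier, while `name` and `founded` distinguish entities well and are selected. The nested loop compares every source with every target, which is adequate for illustration only; a scalable variant would need some form of candidate pruning.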

Information
  • Publisher: The Society of Convergence Knowledge
  • Publisher (Ko): 융복합지식학회
  • Journal Title: The Society of Convergence Knowledge Transactions
  • Journal Title (Ko): 융복합지식학회논문지
  • Volume: 12
  • No: 4
  • Pages: 93-105