ORCID
https://orcid.org/0000-0002-9362-6461
Date of Award
Spring 2025
Language
English
Embargo Period
11-30-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
College/School/Department
Department of Computer Science
Program
Computer Science
First Advisor
Charalampos Chelmis
Committee Members
Amir Masoumzadeh, Petko Bogdanov, Jeong-Hyon Hwang
Keywords
Knowledge Graphs, RDF, Distributed Computing, Big Data Infrastructure, Ensemble Learning, Link Prediction
Subject Categories
Artificial Intelligence and Robotics | Computer Sciences | Data Science | Other Computer Sciences | Software Engineering | Systems Architecture | Theory and Algorithms
Abstract
Knowledge graphs (KGs) have become popular across various fields, providing convenient access to web-based knowledge while storing and formalizing domain-specific information. Analyzing KGs reveals patterns, connections, and dependencies across different data sources, enabling the inference of new knowledge from given facts. As the use of KGs has expanded, modern KGs have grown to the point where they can no longer be processed within the main memory of a single computer. Distributed computing offers a viable solution to this challenge by leveraging the combined capabilities of multiple servers within a cluster. This thesis explores how distributed computing can be used effectively to perform machine learning over large KGs, with a focus on facilitating scalable analytics and accelerating downstream tasks. Specifically, it addresses three core challenges: (i) how to facilitate analytics over KGs in Apache Spark, (ii) how to compute KG embeddings at scale, and (iii) how to accelerate and enhance a representative downstream task.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Gergin, Bedirhan, "Large Scale Machine Learning over Knowledge Graphs" (2025). Electronic Theses & Dissertations (2024 - present). 190.
https://scholarsarchive.library.albany.edu/etd/190
Included in
Artificial Intelligence and Robotics Commons, Data Science Commons, Other Computer Sciences Commons, Software Engineering Commons, Systems Architecture Commons, Theory and Algorithms Commons