Determining the Impact of Eric Clapton on Music Using RDF Graphs
Selected Challenges of Semantics Across and Within Datasets
Ronald P. Reck
Kenneth B. Sall
Principal Systems Engineer/XML Data Analyst
Ken Sall Consulting
Wendy A. Swanbeck
As music is a topic of interest to many, it is no surprise that developers have applied web and semantic technology to provide various RDF datasets for describing relationships among musical artists, albums, songs, genres, and more. As avid fans of blues and rock music, we wondered if we could construct SPARQL queries to examine properties of and relationships between performers in order to answer global questions such as "Who has had the greatest impact on rock music?" Our primary focus was Eric Clapton, a musical artist with a decades-spanning career who has enjoyed a very successful solo career and has also performed in several world-renowned bands.
The application of semantic technology to a public dataset can provide useful insights into how similar approaches can be applied to realistic domain problems, such as finding relationships between persons of interest. A clear understanding of the semantics of the RDF properties available in a dataset is crucial, but achieving it is a substantial challenge, especially when leveraging information from similar yet different data sources.
This paper explores the DBpedia and MusicBrainz data sources using OpenLink Virtuoso Universal Server with a Drupal frontend. Much attention is given to the challenges we encountered, especially with relatively large, community-entered open datasets of varying quality, and to the strategies we employed or recommend to overcome them.