Silk Framework

The Web of Data is built upon two simple ideas: first, to employ the RDF data model to publish structured data on the Web; second, to set explicit RDF links between data items in different data sources. Background information about the Web of Data can be found on the wiki pages of the W3C Linking Open Data community effort, in the overview article "Linked Data - The Story So Far", and in the tutorial "How to Publish Linked Data on the Web".

The Silk Link Discovery Framework supports data publishers in accomplishing the second task. Using the declarative Silk - Link Specification Language (Silk-LSL), developers can specify which types of RDF links should be discovered between data sources, as well as which conditions data items must fulfill in order to be interlinked. These link conditions may combine various similarity metrics and can take the graph around a data item into account; that surrounding graph is addressed using an RDF path language. Silk accesses the data sources to be interlinked via the SPARQL protocol and can thus be used against local as well as remote SPARQL endpoints.
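To make the idea of a link condition that combines several similarity metrics more concrete, here is a minimal Python sketch. It is not Silk-LSL and does not use Silk's API; the function names, the property names (`label`, `population`), the weights, and the threshold are all assumptions chosen purely for illustration.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def numeric_similarity(a: float, b: float, max_diff: float) -> float:
    """1.0 for equal values, falling linearly to 0.0 at a difference of max_diff."""
    return max(0.0, 1.0 - abs(a - b) / max_diff)


def link_confidence(source: dict, target: dict) -> float:
    """Weighted average of two metrics; the weights 2 and 1 are illustrative."""
    label = string_similarity(source["label"], target["label"])
    population = numeric_similarity(
        source["population"], target["population"], max_diff=100_000
    )
    return (2 * label + 1 * population) / 3


def should_link(source: dict, target: dict, threshold: float = 0.9) -> bool:
    """Propose a link (for example owl:sameAs) when confidence reaches the threshold."""
    return link_confidence(source, target) >= threshold


# Hypothetical data items from two different sources.
city_a = {"label": "Berlin", "population": 3_500_000}
city_b = {"label": "berlin", "population": 3_500_000}
city_c = {"label": "Paris", "population": 2_100_000}
```

With these hypothetical items, `should_link(city_a, city_b)` accepts the pair and `should_link(city_a, city_c)` rejects it. In Silk itself, such a condition would be declared in a link specification rather than coded by hand.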

The main features of the Silk link discovery engine are:

  • Open source link discovery framework, running on all major platforms
  • Support of RDF link generation (owl:sameAs links as well as other types)
  • Flexible, declarative language for specifying link conditions
  • Employment in distributed environments (by accessing local and remote SPARQL endpoints)
  • Usable in situations where terms from different vocabularies are mixed and where no consistent RDFS or OWL schemata exist
  • Scalability and high performance through efficient data handling (speedup factor of 20 compared to Silk 0.2):
    • Reduction of network load by caching and reusing SPARQL result sets
    • Multi-threaded computation of the data item comparisons (3 million comparisons per minute on a Core2 Duo)
    • Optional blocking of data items
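The blocking feature listed above can be illustrated with a short, generic Python sketch. It shows the general technique only and is not Silk's implementation: the blocking key (first character of a label) and all function names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import product


def blocking_key(item: dict) -> str:
    """Hypothetical blocking key: first character of the lowercased label."""
    return item["label"][:1].lower()


def build_blocks(items):
    """Group data items by their blocking key."""
    blocks = defaultdict(list)
    for item in items:
        blocks[blocking_key(item)].append(item)
    return blocks


def candidate_pairs(sources, targets):
    """Yield only those (source, target) pairs that share a blocking key."""
    source_blocks = build_blocks(sources)
    target_blocks = build_blocks(targets)
    for key, block in source_blocks.items():
        yield from product(block, target_blocks.get(key, []))


sources = [{"label": "Berlin"}, {"label": "Paris"}, {"label": "Bonn"}]
targets = [{"label": "berlin"}, {"label": "Prague"}, {"label": "Rome"}]

pairs = list(candidate_pairs(sources, targets))
full_comparison_count = len(sources) * len(targets)
```

For this toy input, blocking leaves 3 candidate pairs instead of the 9 pairs of a full cross product. The trade-off is that items assigned to different blocks are never compared, so a poorly chosen key can cause links to be missed.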

Silk is implemented in Scala and runs on the Java Virtual Machine. To run Silk, developers need to:

  1. Have SPARQL access to the datasets that should be interlinked.
  2. Write a link specification as described in the Silk - User Manual.
  3. Install the Silk framework as described in the Installation and Usage section of the manual.
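As a quick way to check the first prerequisite, one can issue a SPARQL query against the endpoint by hand. The sketch below builds such a request with the Python standard library, following the SPARQL protocol's convention of passing the query in a `query` parameter. The endpoint URL and the query are hypothetical placeholders, and the request is only constructed here, not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Hypothetical local endpoint; replace with the endpoint of your dataset.
ENDPOINT = "http://localhost:8890/sparql"
QUERY = (
    "SELECT ?s ?label WHERE { "
    "?s <http://www.w3.org/2000/01/rdf-schema#label> ?label "
    "} LIMIT 10"
)

request = build_sparql_request(ENDPOINT, QUERY)
# Sending it would be: urllib.request.urlopen(request)
```

If the endpoint answers such a request with a result set, Silk should be able to reach it in the same way, assuming no additional authentication is required.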
