Linked Data

Linked data is structured data that can be queried in a semantic way in combination with a broader body of existing data. In effect, attached metadata with well-defined properties lets computers read and interpret the data autonomously. Several web technologies facilitate the development of linked data, which, when made available online, can be regarded as part of the semantic web. The primary technologies are:

The Resource Description Framework (RDF) is a data model for metadata. It expresses metadata as a directed graph in which each edge is a ‘triple’. A triple consists of the subject node (which the edge leaves), the predicate (the edge joining the two nodes) and the object node (which the edge enters). Each part can be identified by a URI; the object may instead be a literal value. An RDF graph can be serialized in several formats: the Terse RDF Triple Language (Turtle) is common, and RDF/XML and JSON-LD are also used. RDFa can be used to embed RDF in HTML. Associated technologies include SPARQL, a query language for RDF graphs, and RDF Schema (RDFS) and the Web Ontology Language (OWL), which allow semantic descriptions of RDF data. A triplestore is a type of database that holds RDF graphs as triples. The Shapes Constraint Language (SHACL) is used to describe and validate RDF graphs.
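The triple model described above can be sketched in plain Python. This is a toy illustration rather than a real RDF implementation: the `Literal` class and the `match` and `to_ntriples` helpers are made up for this example, the `example.org` URIs are placeholders, and only the FOAF property URIs are real. It serializes to N-Triples, a simple line-based format that is a subset of Turtle. In practice a dedicated RDF library and SPARQL would do this work.

```python
from dataclasses import dataclass

FOAF = "http://xmlns.com/foaf/0.1/"   # real FOAF namespace
EX = "http://example.org/"            # placeholder namespace for this sketch


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper so literal values can be told apart from URIs."""
    value: str


# An RDF graph is a set of (subject, predicate, object) triples.
graph = {
    (EX + "alice", FOAF + "name", Literal("Alice")),
    (EX + "alice", FOAF + "knows", EX + "bob"),
    (EX + "bob", FOAF + "name", Literal("Bob")),
}


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard.

    This mimics a single SPARQL triple pattern, nothing more.
    """
    hits = (
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )
    return sorted(hits, key=str)


def to_ntriples(graph):
    """Serialize the graph as N-Triples, one triple per line."""
    def term(t):
        if isinstance(t, Literal):
            escaped = (
                t.value.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
            )
            return f'"{escaped}"'
        return f"<{t}>"

    return "\n".join(
        f"{term(s)} {term(p)} {term(o)} ." for s, p, o in sorted(graph, key=str)
    )


# All foaf:name values in the graph, then the serialized graph.
names = [o.value for _, _, o in match(graph, p=FOAF + "name")]
print(names)
print(to_ntriples(graph))
```

Each edge of the graph is one tuple, so querying reduces to filtering on the three positions and serialization to writing one line per triple.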

Ontologies in the context of linked data are a way to describe classes of objects (individuals), via nouns, and relations (property assertions) between the objects, via verbs. They are constructed so as to allow autonomous formal logical reasoning based on the relations they describe (OWL, for example, is grounded in description logics). That is, new information about an object or relationship can be generated through inference based on links described by an ontology. OWL allows for authoring of ontologies.
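As a minimal sketch of inference, the example below applies two RDFS entailment rules (rdfs9 and rdfs11, covering class membership and subclass transitivity) to a handful of triples until nothing new can be derived. The `infer` function and the `example.org` classes are invented for illustration. A real reasoner supports far more of RDFS and OWL than this.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # placeholder namespace for this sketch

# Facts stated explicitly.
asserted = {
    (EX + "Dog", SUBCLASS_OF, EX + "Mammal"),
    (EX + "Mammal", SUBCLASS_OF, EX + "Animal"),
    (EX + "rex", RDF_TYPE, EX + "Dog"),
}


def infer(triples):
    """Apply two RDFS rules repeatedly until no new triples appear."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                # Only pairs where the second triple says "o is a subclass of o2".
                if p2 != SUBCLASS_OF or o != s2:
                    continue
                if p == SUBCLASS_OF:
                    # rdfs11: subClassOf is transitive.
                    new.add((s, SUBCLASS_OF, o2))
                elif p == RDF_TYPE:
                    # rdfs9: an instance of a class is an instance of its superclasses.
                    new.add((s, RDF_TYPE, o2))
        if new <= closure:
            return closure
        closure |= new


# Triples that were derived rather than stated.
inferred = infer(asserted) - asserted
for triple in sorted(inferred):
    print(triple)
```

Nothing in the input says that `rex` is an `Animal`. That triple follows only from the subclass links, which is the kind of new information an ontology makes derivable.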

Vocabularies and Ontologies

Vocabularies can be used in RDF graphs, for example through RDFS or OWL. Persistent URLs (PURLs) are often used to publish vocabularies, so that they continue to resolve even if the hosting address changes. Services such as purl.org are commonly used for vocabulary and ontology hosting.
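The indirection behind a PURL can be sketched as a lookup table. This is only a conceptual model: a real PURL service answers with an HTTP redirect, and the table, the `resolve` helper and all URLs here are hypothetical.

```python
# Hypothetical redirect table kept by a PURL service:
# persistent URL -> current hosting location.
purl_table = {
    "https://purl.example.org/vocab/sensors": "https://old-host.example.com/sensors.ttl",
}

VOCAB = "https://purl.example.org/vocab/sensors"


def resolve(purl, table):
    """Return the current location for a persistent URL (stands in for a redirect)."""
    return table[purl]


before = resolve(VOCAB, purl_table)

# The vocabulary moves to a new host: only the table entry changes,
# while the identifier used in published RDF graphs stays the same.
purl_table[VOCAB] = "https://new-host.example.net/vocab/sensors.ttl"
after = resolve(VOCAB, purl_table)

print(before)
print(after)
```

Data that refers to the vocabulary by its PURL does not need to change when the vocabulary moves.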

Data and Datasets

Collections:

  • schema.org: types cover creative works (Book, Movie, etc.), Person and Organization

  • EuroVoc: EU Vocabularies
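As an illustration of using one of these vocabularies, the sketch below builds a small JSON-LD document with schema.org's `Person` and `Organization` types. The names are invented, and the snippet only produces the JSON text. Interpreting it as RDF would need a JSON-LD processor.

```python
import json

# A schema.org Person who works for an Organization.
# "@context" tells a JSON-LD processor to read the keys as schema.org terms.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Ada Example",  # made-up person
    "worksFor": {
        "@type": "Organization",
        "name": "Example Labs",  # made-up organization
    },
}

doc = json.dumps(person, indent=2)
print(doc)
```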

Software

Python:

Java:

Research Objects

  • https://wf4ever.github.io/ro/2016-01-28/

  • https://www.researchobject.org/specifications/bundle/

Scientific Applications

The Infrastructure for Spatial Information in Europe (INSPIRE) is an EU initiative for sharing spatial data across Europe. It is established by the INSPIRE Directive, whose implementing rules aim to ensure that published data is interoperable. Some INSPIRE resources:

Further Reading