Get Started

PyLOD works with Python 2.x and 3.x. The instructions below cover installation and basic usage.

Installation

PyLOD requires SPARQLWrapper to be installed. There are two installation options.

Either manually:

  1. Install SPARQLWrapper by following its installation instructions.

  2. Copy PyLOD.py to your project's directory.

Or use pip to install the package (and its requirement) from PyPI:

              
    pip install PyLOD

Usage

  1. Import the PyLOD class and create an instance.

    					  
    from PyLOD import PyLOD
    pylod = PyLOD()

  2. Provide a dictionary of desired namespaces.

    					  
    my_namespaces={
        "dbo": "http://dbpedia.org/ontology/",
        "dbp": "http://dbpedia.org/property/"
    }
    
    pylod.namespaces.set_namespaces(my_namespaces)
    					

    This step is optional, since PyLOD already incorporates a set of known namespaces. To get the list of defined namespaces, use this:

    					  
    print(pylod.namespaces.get_namespaces())
    					
  3. Define a dictionary of SPARQL endpoints to be queried:

    					  
    my_endpoints={
        "DBpedia": "http://dbpedia.org/sparql",
        "GeoLinkedData": "http://linkedgeodata.org/sparql"
    }
    
    pylod.endpoints.set_endpoints(my_endpoints)
    					

    If no endpoints are defined, PyLOD will use a pre-defined set of known endpoints. To get the list of these endpoints, do this:

    					  
    print(pylod.endpoints.get_endpoints())
    					
  4. Use PyLOD's expose functions to retrieve structured data from the endpoints. Set the optional argument limit_per_endpoint to limit the results per endpoint. For example:

    					  
    # Get entities of type owl:Class
    classes = pylod.expose.classes(limit_per_endpoint=100)
    
    # Get the sub-classes of a specific class 
    sub_classes = pylod.expose.sub_classes(super_class="dbo:Artist")
    
    # Get instances of a specific class 
    instances = pylod.expose.instances_of_class(cls="dbo:Artist", include_subclasses=True, limit_per_endpoint=50)
    
    # Execute a custom SPARQL SELECT query against all endpoints
    results = pylod.sparql.execute_select_to_all_endpoints(query="SELECT * WHERE {?s ?p ?o}")
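
A prefixed name such as dbo:Artist only means something to an endpoint if the query declares that prefix, which is what the namespace dictionary from step 2 is for. The sketch below shows one way such a dictionary could be rendered as SPARQL PREFIX declarations. build_prefix_header is a hypothetical helper written for illustration; it is not part of PyLOD, and PyLOD's own handling may differ.

```python
def build_prefix_header(namespaces):
    """Render a {prefix: uri} dictionary as SPARQL PREFIX declarations."""
    lines = []
    # Sort by prefix so the output is stable across Python versions.
    for prefix, uri in sorted(namespaces.items()):
        lines.append("PREFIX {0}: <{1}>".format(prefix, uri))
    return "\n".join(lines)


my_namespaces = {
    "dbo": "http://dbpedia.org/ontology/",
    "dbp": "http://dbpedia.org/property/",
}

# Prepend the declarations to a query that uses the dbo: prefix.
query = build_prefix_header(my_namespaces) + "\nSELECT ?s WHERE { ?s a dbo:Artist } LIMIT 5"
print(query)
```

A query built this way could, in principle, be passed to the custom SPARQL functions described further down.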
    
    					

Expose functions

PyLOD's nested class Expose provides a set of functions to retrieve LOD with predefined SPARQL queries. All functions accept the optional argument limit_per_endpoint, which caps the number of results retrieved from each endpoint. Keep in mind, however, that certain endpoints also limit the returned results on the server side.

  • classes(int limit_per_endpoint=None)

    Retrieves the URIs of entities that are of type owl:Class.

  • sub_classes(str super_class, int limit_per_endpoint=None)

    Retrieves the URIs of classes declared as rdfs:subClassOf the given super class, that is, the subjects of rdfs:subClassOf triples whose object is the given super class.

  • super_classes(str sub_class, int limit_per_endpoint=None)

    Retrieves the URIs of classes that the given sub class is declared rdfs:subClassOf, that is, the objects of rdfs:subClassOf triples whose subject is the given sub class.

  • equivalent_classes(str cls, int limit_per_endpoint=None)

    Retrieves the URIs of classes that are related to the given class with property owl:equivalentClass.

  • disjoint_classes(str cls, int limit_per_endpoint=None)

    Retrieves the URIs of classes that are related to the given class with property owl:disjointWith.

  • sub_properties(str super_property, int limit_per_endpoint=None)

    Retrieves the URIs of properties declared as rdfs:subPropertyOf the given super property, that is, the subjects of rdfs:subPropertyOf triples whose object is the given super property.

  • super_properties(str sub_property, int limit_per_endpoint=None)

    Retrieves the URIs of properties that the given sub-property is declared rdfs:subPropertyOf, that is, the objects of rdfs:subPropertyOf triples whose subject is the given sub-property.

  • triples(str subject=None, str predicate=None, str object=None, int limit_per_endpoint=None)

    Exposes triples with the given subject and/or predicate and/or object, within the scope of the triple pattern Subject-Predicate-Object. Any argument (subject, predicate, object) left undefined (None) acts as a variable in the query.

  • subjects(str predicate, str object, int limit_per_endpoint=None)

    Exposes entities found as subjects with the given predicate and object, within the scope of the triple pattern Subject-Predicate-Object.

  • predicates(str subject, str object, int limit_per_endpoint=None)

    Exposes entities found as predicates with the given subject and object, within the scope of the triple pattern Subject-Predicate-Object.

  • objects(str subject, str predicate, int limit_per_endpoint=None)

    Exposes entities found as objects with the given subject and predicate, within the scope of the triple pattern Subject-Predicate-Object.

  • instances_of_class(str cls, bool include_subclasses=False, int limit_per_endpoint=None)

    Retrieves instances of the given class and (optionally) its subclasses.

  • labels(str entity, str language=None, int limit_per_endpoint=None)

    Retrieves the labels (rdfs:label) of the given entity. The optional language argument restricts the labels to the given language tag, as defined in BCP 47.

All arguments that correspond to entities (classes, properties, subjects, predicates, objects, etc.) may be provided either as URIs (e.g. "http://dbpedia.org/ontology/Artist") or as prefixed names (e.g. "dbo:Artist"). PyLOD checks whether each argument is a URI and adjusts the query accordingly. If a prefix is used but not defined (see section Usage above), the endpoint may return an error or empty results, depending on its policy.
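
The behaviour described for triples() and for URI versus prefixed arguments can be pictured with a small query builder. This is a minimal sketch under stated assumptions, not PyLOD's actual code: to_term and build_triples_query are hypothetical names, the URI check is a simple "starts with http:// or https://" heuristic, and literal objects are not handled.

```python
def to_term(value, variable):
    """Return a SPARQL term for value, or a variable when value is None."""
    if value is None:
        return "?" + variable          # undefined argument acts as a variable
    if value.startswith("http://") or value.startswith("https://"):
        return "<" + value + ">"       # full URI: wrap in angle brackets
    return value                       # assumed to be a prefixed name, e.g. dbo:Artist


def build_triples_query(subject=None, predicate=None, obj=None, limit=None):
    """Build a SELECT query for the pattern Subject-Predicate-Object."""
    pattern = "%s %s %s" % (
        to_term(subject, "s"),
        to_term(predicate, "p"),
        to_term(obj, "o"),
    )
    query = "SELECT * WHERE { %s }" % pattern
    if limit is not None:
        query += " LIMIT %d" % limit   # mirrors the idea of limit_per_endpoint
    return query


print(build_triples_query(predicate="rdfs:label", limit=10))
print(build_triples_query(subject="http://dbpedia.org/ontology/Artist"))
```

The sketch names the third parameter obj rather than object to avoid shadowing the Python built-in. A prefixed name passed through unchanged still needs a matching PREFIX declaration in the final query, which is why an undefined prefix can lead to an error or to empty results.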


SPARQL functions

Apart from the Expose functions, PyLOD incorporates a few functions for performing custom SPARQL select queries, based on SPARQLWrapper's features.

  • execute_select(str endpoint_url, str query, int limit=None)

    Executes the given select query against the provided endpoint.

  • execute_select_to_all_endpoints(str query, int limit_per_endpoint=None)

    Executes the given select query against all the defined endpoints in the PyLOD instance (see section Usage above).

  • is_active_endpoint(str endpoint_url)

    Checks whether the given URL corresponds to an active SPARQL endpoint.
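
To make the idea of running one query against several endpoints concrete, the following sketch fans a SELECT query out over a dictionary of endpoints using SPARQLWrapper directly. It is an illustration only: select_from_all and sparqlwrapper_select are hypothetical helpers, the shape of the returned dictionary is an assumption and not PyLOD's documented return format, and the limit is appended naively, so the query must not already contain a LIMIT clause.

```python
def sparqlwrapper_select(endpoint_url, query):
    """Run a SELECT query with SPARQLWrapper and return the result bindings."""
    # Imported lazily so the rest of the sketch loads without the library.
    from SPARQLWrapper import SPARQLWrapper, JSON

    client = SPARQLWrapper(endpoint_url)
    client.setQuery(query)
    client.setReturnFormat(JSON)
    return client.query().convert()["results"]["bindings"]


def select_from_all(endpoints, query, limit_per_endpoint=None,
                    execute=sparqlwrapper_select):
    """Run query on every endpoint; return {endpoint name: bindings or error}."""
    if limit_per_endpoint is not None:
        # Naive: assumes the query has no LIMIT clause of its own.
        query = "%s LIMIT %d" % (query, limit_per_endpoint)
    results = {}
    for name, url in endpoints.items():
        try:
            results[name] = execute(url, query)
        except Exception as error:
            # One unreachable endpoint should not abort the others.
            results[name] = error
    return results
```

The execute parameter lets a stand-in function replace the network call, which is convenient for trying the loop without contacting a real endpoint.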