Wikidata:WikiProject Datasets/Data Structure/DCAT - Wikidata - Schema.org mapping

This is a working draft intended to document a mapping exercise between DCAT, Schema.org and Wikidata.

The mappings defined here refer to the following documents:

Important:

  • Not all classes have yet been mapped. As for the classes specified by the DCAT application profile, only mandatory and recommended classes have been mapped. For the properties, a mapping has been done for Dataset and Distribution.
  • Not all properties marked with [NEW PROPERTY] below have yet been proposed. Properties should only be proposed if they are actually going to be used. So feel free to propose them if you need these properties in a certain context.

DCAT Application Profile Classes

Mandatory Classes

DCAT Class name | Usage note for the DCAT Application Profile | DCAT URI | Reference | Schema.org | Representation in Wikidata
Agent | An entity that is associated with Catalogues and/or Datasets. If the Agent is an organisation, the use of the Organization Ontology is recommended. See section 7 for a discussion on Agent roles. | foaf:Agent | http://xmlns.com/foaf/spec/#term_Agent , http://www.w3.org/TR/vocab-org/ | schema:Agent | agent (Q24229398)
Catalogue | A catalogue or repository that hosts the Datasets being described. | dcat:Catalog | http://www.w3.org/TR/2013/WD-vocab-dcat-20130312/#class-catalog | schema:DataCatalog | catalog (Q29937289)
Dataset | A conceptual entity that represents the information published. | dcat:Dataset | http://www.w3.org/TR/2013/WD-vocab-dcat-20130312/#class-dataset | schema:Dataset | data set (Q1172284)
Literal | A literal value such as a string or integer; Literals may be typed, e.g. as a date according to xsd:date. Literals that contain human-readable text have an optional language tag as defined by BCP 47. | rdfs:Literal | http://www.w3.org/TR/rdf-concepts/#section-Literals | schema:PropertyValue | Literal (Q121572)
Resource | Anything described by RDF. | rdfs:Resource | http://www.w3.org/TR/rdf-schema/#ch_resource | schema:DataDownload | web resource (Q3427877)
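
The mandatory-class mapping above can also be held as a small lookup table. The following is only a minimal sketch in Python: the dictionary layout and the function name `map_class` are assumptions made for illustration, and the values are copied from the table as it currently stands in this draft.

```python
# Illustrative lookup table built from the mandatory-class rows above.
# The structure and the names CLASS_MAPPING / map_class are hypothetical.
CLASS_MAPPING = {
    "foaf:Agent": {"schema_org": "schema:Agent", "wikidata": "Q24229398"},
    "dcat:Catalog": {"schema_org": "schema:DataCatalog", "wikidata": "Q29937289"},
    "dcat:Dataset": {"schema_org": "schema:Dataset", "wikidata": "Q1172284"},
    "rdfs:Literal": {"schema_org": "schema:PropertyValue", "wikidata": "Q121572"},
    "rdfs:Resource": {"schema_org": "schema:DataDownload", "wikidata": "Q3427877"},
}


def map_class(dcat_uri: str) -> dict:
    """Return the mapped Schema.org type and Wikidata item for a DCAT class URI."""
    try:
        return CLASS_MAPPING[dcat_uri]
    except KeyError:
        # Classes that are not yet mapped are reported explicitly.
        raise KeyError(f"No mapping recorded for {dcat_uri!r}") from None


print(map_class("dcat:Dataset"))
```

Keeping the table as data rather than code makes it easy to update when the draft mapping changes.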

Recommended Classes

DCAT Class name | Usage note for the DCAT Application Profile | DCAT URI | Reference | Schema.org | Representation in Wikidata
Category | A subject of a Dataset. | skos:Concept | http://www.w3.org/TR/2013/WD-vocab-dcat-20130312/#class-category-and-category-scheme | pending.schema:CategoryCode (defined in the pending.schema.org extension) | ??
Category scheme | A concept collection (e.g. controlled vocabulary) in which the Category is defined. | skos:ConceptScheme | http://www.w3.org/TR/2013/WD-vocab-dcat-20130312/#class-category-and-category-scheme | pending.schema:CategoryCodeSet (defined in the pending.schema.org extension) | ??
Distribution | A physical embodiment of the Dataset in a particular format. | dcat:Distribution | http://www.w3.org/TR/2013/WD-vocab-dcat-20130312/#class-distribution | schema:DataDownload; schema:Dataset (as described at http://schema.org/distribution) | digital distribution (Q269415) (to be discussed)
Licence document | A legal document giving official permission to do something with a resource. | dct:LicenseDocument | http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#LicenseDocument | schema:CreativeWork; schema:URL (as described at schema:license) | license (Q79719)

Optional Classes

[Contribution needed] Add mappings for the remaining classes here if you need them.

DCAT Class name | Usage note for the DCAT Application Profile | DCAT URI | Reference | Schema.org | Representation in Wikidata
TBD | TBD | TBD | TBD | TBD | TBD

Properties per Class

Dataset

Mandatory properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
description | dct:description | rdfs:Literal | This property contains a free-text account of the Dataset. This property can be repeated for parallel language versions of the description. | 1..n | schema:description | description
title | dct:title | rdfs:Literal | This property contains a name given to the Dataset. This property can be repeated for parallel language versions of the name. | 1..n | schema:name | label

Recommended properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
contact point | dcat:contactPoint | vcard:Kind | This property contains contact information that can be used for sending comments about the Dataset. | 0..n | schema:contactPoint | 1st proposal (Not done); 2nd proposal (Not done)
dataset distribution | dcat:distribution | dcat:Distribution | This property links the Dataset to an available Distribution. | 0..n | schema:distribution | dataset distribution (P2702)
keyword/tag | dcat:keyword | rdfs:Literal | This property contains a keyword or tag describing the Dataset. | 0..n | schema:keywords | can be omitted or be represented by specific properties
publisher | dct:publisher | foaf:Agent | This property refers to an entity (organisation) responsible for making the Dataset available. | 0..1 | schema:publisher | publisher (P123)
theme/category | dcat:theme, subproperty of dct:subject | skos:Concept | This property refers to a category of the Dataset. A Dataset may be associated with multiple themes. | 0..n | schema:about | main subject (P921); facet of (P1269)
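
To show how the Schema.org column of the mandatory and recommended Dataset properties might be applied, here is a minimal Python sketch that turns a DCAT-style record into a Schema.org JSON-LD document. The input record layout (a plain dictionary keyed by DCAT property) and the function name `dataset_to_jsonld` are assumptions for illustration only; properties without an entry in the mapping are simply skipped.

```python
import json

# Schema.org equivalents taken from the Dataset tables above
# (mandatory and recommended properties only).
DATASET_TO_SCHEMA = {
    "dct:title": "name",
    "dct:description": "description",
    "dcat:contactPoint": "contactPoint",
    "dcat:distribution": "distribution",
    "dcat:keyword": "keywords",
    "dct:publisher": "publisher",
    "dcat:theme": "about",
}


def dataset_to_jsonld(record: dict) -> dict:
    """Build a Schema.org Dataset document from a DCAT-style record (hypothetical layout)."""
    doc = {"@context": "https://schema.org", "@type": "Dataset"}
    for dcat_prop, value in record.items():
        schema_prop = DATASET_TO_SCHEMA.get(dcat_prop)
        if schema_prop is not None:  # unmapped properties are ignored
            doc[schema_prop] = value
    return doc


# Made-up example record; the values carry no real-world meaning.
example = {
    "dct:title": "Example dataset",
    "dct:description": "A dataset used only to illustrate the mapping.",
    "dcat:keyword": ["example", "mapping"],
    "dct:provenance": "not mapped, so it is dropped",
}
print(json.dumps(dataset_to_jsonld(example), indent=2))
```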

Optional properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
access rights | dct:accessRights | dct:RightsStatement | This property refers to information that indicates whether the Dataset is open data, has access restrictions or is not public. A controlled vocabulary with three members (:public, :restricted, :non-public) will be created and maintained by the Publications Office of the EU. | 0..1 | ?? | [NEW PROPERTY]
conforms to | dct:conformsTo | dct:Standard | This property refers to an implementing rule or other specification. | 0..n | ?? | [NEW PROPERTY]
documentation | foaf:page | foaf:Document | This property refers to a page or document about this Dataset. | 0..n | schema:url | URL (P2699) with qualifier of (P642) pointing to documentation (Q788790)
frequency | dct:accrualPeriodicity | dct:Frequency | This property refers to the frequency at which the Dataset is updated. | 0..1 | ?? | publication interval (P2896)
has version | dct:hasVersion | dcat:Dataset | This property refers to a related Dataset that is a version, edition, or adaptation of the described Dataset. | 0..n | ?? | followed by (P156)
identifier | dct:identifier | rdfs:Literal | This property contains the main identifier for the Dataset, e.g. the URI or other unique identifier in the context of the Catalogue. | 0..n | schema:identifier | catalog code (P528), optionally with qualifier catalog (P972)
is version of | dct:isVersionOf | dcat:Dataset | This property refers to a related Dataset of which the described Dataset is a version, edition, or adaptation. | 0..n | ?? | follows (P155)
landing page | dcat:landingPage | foaf:Document | This property refers to a web page that provides access to the Dataset, its Distributions and/or additional information. It is intended to point to a landing page at the original data provider, not to a page on a site of a third party, such as an aggregator. | 0..n | schema:url | official website (P856)
language | dct:language | dct:LinguisticSystem | This property refers to a language of the Dataset. This property can be repeated if there are multiple languages in the Dataset. | 0..n | schema:inLanguage | language of work or name (P407)
other identifier | adms:identifier | adms:Identifier | This property refers to a secondary identifier of the Dataset, such as MAST/ADS, DataCite, DOI, EZID or W3ID. | 0..n | schema:identifier | specific id property if needed
provenance | dct:provenance | dct:ProvenanceStatement | This property contains a statement about the lineage of a Dataset. | 0..n | ?? | [NEW PROPERTY]
related resource | dct:relation | rdfs:Resource | This property refers to a related resource. | 0..n | schema:url | URL (P2699) with qualifier of (P642) pointing to documentation (Q788790)
release date | dct:issued | rdfs:Literal typed as xsd:date or xsd:dateTime | This property contains the date of formal issuance (e.g., publication) of the Dataset. | 0..1 | schema:datePublished | publication date (P577)
sample | adms:sample | dcat:Distribution | This property refers to a sample distribution of the dataset. | 0..n | schema:distribution | dataset distribution (P2702) with qualifier of (P642) with value example (Q14944328)
source | dct:source | dcat:Dataset | This property refers to a related Dataset from which the described Dataset is derived. | 0..n | schema:Dataset | tbd. Note: imported from Wikimedia project (P143) should probably not be used given its definition and scope of use.
spatial/geographical coverage | dct:spatial | dct:Location | This property refers to a geographic region that is covered by the Dataset. | 0..n | schema:spatial | location (P276) (as there is an example with the value "worldwide" in the Wikidata property example)
temporal coverage | dct:temporal | dct:PeriodOfTime | This property refers to a temporal period that the Dataset covers. | 0..n | schema:datasetTimeInterval | According to their descriptions, start of covered period (P7103) and end of covered period (P7104) are appropriate. "4.11.1 Optional properties for Period of Time" in DCAT-AP v1.1 gives schema:startDate and schema:endDate as the corresponding properties, and start time (P580) and end time (P582) are declared equivalent to those. (These relationships imply that start of covered period (P7103) should be a sub-property of start time (P580).)
type | dct:type | skos:Concept | This property refers to the type of the Dataset. A controlled vocabulary for the values has not been established. | 0..1 | ?? | specific property if needed
update/modification date | dct:modified | rdfs:Literal typed as xsd:date or xsd:dateTime | This property contains the most recent date on which the Dataset was changed or modified. | 0..1 | schema:dateModified | significant event (P793) < reduction of the property >; point in time (P585) < date > or new property changed, depending on the outcome of the property proposal. Note: The usefulness of this property in a non-automated environment would need to be discussed.
version | owl:versionInfo | rdfs:Literal | This property contains a version number or other version designation of the Dataset. | 0..1 | schema:version | [NEW PROPERTY]
version notes | adms:versionNotes | rdfs:Literal | This property contains a description of the differences between this version and a previous version of the Dataset. This property can be repeated for parallel language versions of the version notes. | 0..n | ?? | [NEW PROPERTY]
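
As a rough illustration of the Wikidata column for the optional Dataset properties, the sketch below converts a DCAT-style record into a flat list of (property ID, value) pairs. It is not a Wikidata API client: the record layout, the representation of temporal coverage as a (start, end) tuple, and the function name `dataset_to_statements` are assumptions. Rows marked [NEW PROPERTY], or mapped only with qualifiers, are left out.

```python
# Wikidata property IDs taken from the optional Dataset properties table above.
DATASET_TO_WIKIDATA = {
    "dct:accrualPeriodicity": "P2896",  # publication interval
    "dct:hasVersion": "P156",           # followed by
    "dct:isVersionOf": "P155",          # follows
    "dcat:landingPage": "P856",         # official website
    "dct:language": "P407",             # language of work or name
    "dct:issued": "P577",               # publication date
    "dct:spatial": "P276",              # location
}


def dataset_to_statements(record: dict) -> list:
    """Return (property ID, value) pairs for the mapped properties of a record."""
    statements = []
    for prop, value in record.items():
        if prop == "dct:temporal":
            # Assumed input shape: a (start, end) tuple of date strings.
            start, end = value
            statements.append(("P7103", start))  # start of covered period
            statements.append(("P7104", end))    # end of covered period
        elif prop in DATASET_TO_WIKIDATA:
            values = value if isinstance(value, list) else [value]
            statements.extend((DATASET_TO_WIKIDATA[prop], v) for v in values)
        # Anything else (e.g. dct:provenance) has no Wikidata property yet.
    return statements


# Made-up example record.
example = {
    "dcat:landingPage": "https://example.org/dataset",
    "dct:issued": "2020-01-01",
    "dct:temporal": ("2010-01-01", "2019-12-31"),
    "dct:provenance": "no mapping yet",
}
for statement in dataset_to_statements(example):
    print(statement)
```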

Distribution

Mandatory properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
access URL | dcat:accessURL | rdfs:Resource | This property contains a URL that gives access to a Distribution of the Dataset. The resource at the access URL may contain information about how to get the Dataset. | 1..n | schema:url | URL (P2699)

Recommended properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
description | dct:description | rdfs:Literal | This property contains a free-text account of the Distribution. This property can be repeated for parallel language versions of the description. | 0..n | schema:description | description
format | dct:format | dct:MediaTypeOrExtent | This property refers to the file format of the Distribution. | 0..1 | schema:fileFormat | file format (P2701)
licence | dct:license | dct:LicenseDocument | This property refers to the licence under which the Distribution is made available. | 0..1 | schema:license | license (P275)

Optional properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
byte size | dcat:byteSize | rdfs:Literal typed as xsd:decimal | This property contains the size of a Distribution in bytes. | 0..1 | schema:contentSize | data size (P3575)
checksum | spdx:checksum | spdx:Checksum | This property provides a mechanism that can be used to verify that the contents of a distribution have not changed. | 0..1 | ?? | [NEW PROPERTY] "checksum" with qualifier "algorithm"
documentation | foaf:page | foaf:Document | This property refers to a page or document about this Distribution. | 0..n | schema:url | URL (P2699) with qualifier of (P642) pointing to technical documentation
download URL | dcat:downloadURL | rdfs:Resource | This property contains a URL that is a direct link to a downloadable file in a given format. | 0..n | schema:contentUrl | URL (P2699) with qualifier of (P642) pointing to download (Q7126717)
language | dct:language | dct:LinguisticSystem | This property refers to a language used in the Distribution. This property can be repeated if the metadata is provided in multiple languages. | 0..n | schema:inLanguage | language of work or name (P407)
linked schemas | dct:conformsTo | dct:Standard | This property refers to an established schema to which the described Distribution conforms. | 0..n | ?? | [NEW PROPERTY]
media type | dcat:mediaType, subproperty of dct:format | dct:MediaTypeOrExtent | This property refers to the media type of the Distribution as defined in the official register of media types managed by IANA. | 0..1 | schema:fileFormat | media type (P1163)
release date | dct:issued | rdfs:Literal typed as xsd:date or xsd:dateTime | This property contains the date of formal issuance (e.g., publication) of the Distribution. | 0..1 | schema:datePublished | publication date (P577)
rights | dct:rights | dct:RightsStatement | This property refers to a statement that specifies rights associated with the Distribution. | 0..1 | ?? | [NEW PROPERTY]
status | adms:status | skos:Concept | This property refers to the maturity of the Distribution. | 0..1 | ?? | [NEW PROPERTY]
title | dct:title | rdfs:Literal | This property contains a name given to the Distribution. This property can be repeated for parallel language versions of the description. | 0..n | schema:name | label
update/modification date | dct:modified | rdfs:Literal typed as xsd:date or xsd:dateTime | This property contains the most recent date on which the Distribution was changed or modified. | 0..1 | schema:dateModified | significant event (P793) < reduction of the property >; point in time (P585) < date > or new property changed, depending on the outcome of the property proposal
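
Several Distribution rows rely on a qualifier to distinguish different uses of URL (P2699). The following Python sketch shows one possible in-memory shape for such statements, with access URLs stored plainly and download URLs carrying the of (P642) qualifier pointing to download (Q7126717). The dictionary-based statement structure and the names `distribution_to_statements` and `as_list` are assumptions for illustration and do not reflect the Wikidata data model or API in detail.

```python
DOWNLOAD_ITEM = "Q7126717"  # "download", used as the value of the P642 qualifier

# Distribution properties that map to a plain Wikidata property (no qualifier),
# taken from the Distribution tables above.
SIMPLE_PROPERTIES = {
    "dct:format": "P2701",      # file format
    "dct:license": "P275",      # license
    "dcat:byteSize": "P3575",   # data size
    "dcat:mediaType": "P1163",  # media type
    "dct:language": "P407",     # language of work or name
    "dct:issued": "P577",       # publication date
}


def as_list(value):
    """Wrap a single value in a list so repeated properties are handled uniformly."""
    return value if isinstance(value, list) else [value]


def distribution_to_statements(dist: dict) -> list:
    """Build simplified statement dictionaries for a DCAT-style distribution record."""
    statements = []
    for url in as_list(dist.get("dcat:accessURL", [])):
        statements.append({"property": "P2699", "value": url, "qualifiers": {}})
    for url in as_list(dist.get("dcat:downloadURL", [])):
        statements.append(
            {"property": "P2699", "value": url, "qualifiers": {"P642": DOWNLOAD_ITEM}}
        )
    for dcat_prop, pid in SIMPLE_PROPERTIES.items():
        for value in as_list(dist.get(dcat_prop, [])):
            statements.append({"property": pid, "value": value, "qualifiers": {}})
    return statements


# Made-up example record.
example = {
    "dcat:accessURL": "https://example.org/data",
    "dcat:downloadURL": ["https://example.org/data.csv"],
    "dcat:byteSize": 1024,
}
for statement in distribution_to_statements(example):
    print(statement)
```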

Agent

Mandatory properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
name | foaf:name | rdfs:Literal | This property contains a name of the agent. This property can be repeated for different versions of the name (e.g. the name in different languages). | 1..n | schema:name | label

Recommended properties

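DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
type | dct:type | skos:Concept | This property refers to a type of the agent that makes the Catalogue or Dataset available. | 0..1 | schema:potentialAction | subject has role (P2868)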

Category

Mandatory properties

DCAT Property name | DCAT URI | Range | Usage note | Card. | Schema.org | Representation in Wikidata
preferred label | skos:prefLabel | rdfs:Literal | This property contains a preferred label of the category. This property can be repeated for parallel language versions of the label. | 1..n | schema:name | label

Properties of the Optional Classes

[Contribution needed] Add mappings for the properties of the remaining classes here if you need them.