The PROV Family of Documents provides an interoperable way to interchange provenance information in heterogeneous environments such as the Web. PROV was deliberately kept generic and extensible, so that it can accommodate a wide range of use cases. This document describes an extension to PROV that enables the modelling of the provenance of information diffusion on social media. More specifically, it introduces a number of new attribute values to extend [[PROV-DM]] and [[PROV-CONSTRAINTS]], structured in an ontology for information diffusion on social media.

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This specification describes an extension to the PROV Family of Documents with the goal of enabling the modelling of provenance of information diffusion in the context of social media. The editors welcome comments, implementations and suggestions.

Introduction

Traditionally, research on information diffusion in social media has focused mainly on how information will spread from a certain point, and not on the provenance of this information. The former aspect is of interest for use cases such as the influence maximization problem, while the latter contributes to use cases such as online journalism, where the trustworthiness and quality of information sources need to be judged. One important point where both aspects overlap is the influence that users have on each other on social media, through either interactions or social connections. The methods used to describe this influence vary across applications, and are typically tailored to one specific goal. Furthermore, no interoperable model to describe the evolution of the social graph and interaction graph is currently available.

The figure below shows an example of such a social and interaction graph.

Example of the influences of social media messages.

In this document, we therefore specify an extension to the W3C [[PROV-DM]] data model, specifically to model information diffusion on social media in a generic and interoperable way. This includes the evolution of the social graph and interaction graph, and the lineage of messages. All extensions are directly usable with the original PROV model and preserve PROV validity. Furthermore, all current PROV serializations are supported.

Namespaces

The following namespaces and prefixes are used throughout this document.

The PROV namespace URI is http://www.w3.org/ns/prov# and has prefix prov:.

The PROV-SAID namespace URI is http://semweb.datasciencelab.be/ns/prov-said/ and has prefix prov-said:.

Conceptual Extensions to PROV-DM

In this section, we introduce a number of new values for the attributes prov:type and prov:role, to be used with various [[PROV-DM]] concepts.

The figure below shows a generic overview of the model.

High-level overview of the PROV-SAID model.

Messages

At the core of any social network is the ability for users to emit messages. In this section, we introduce the necessary attribute values to model this behaviour.

Message Types

In general, social media have three basic types of messages. First and foremost, there are original messages, which are not based on any other message. Additionally, messages can be copied or revised when they are re-emitted. Beyond these three, we also distinguish replies, mentions, and expressions of emotion.

prov-said:Message is a subtype of prov:Entity. It denotes a content fragment, which was emitted in the context of a social network.

prov-said:OriginalMessage is a subtype of prov-said:Message. It denotes a prov-said:Message that was constructed without using any other prov-said:Message.

prov-said:CopiedMessage is a subtype of prov-said:Message. It denotes a prov-said:Message that was constructed by copying the content of another prov-said:Message.

prov-said:RevisedMessage is a subtype of prov-said:Message. It denotes a prov-said:Message that was constructed by altering another prov-said:Message.

prov-said:ReplyMessage is a subtype of prov-said:Message. It denotes a prov-said:Message that is a reply to another prov-said:Message.

prov-said:MentionMessage is a subtype of prov-said:Message. It denotes a prov-said:Message that includes a mention of another prov:Agent.

prov-said:EmotionMessage is a subtype of prov-said:Message. It denotes a prov-said:Message that conveys an emotion with regard to another prov-said:Message (such as a like, favorite, etc.).

Note that these message types are not mutually exclusive. For example, a prov-said:ReplyMessage could also contain a mention, and thus be considered a prov-said:MentionMessage.

  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>
  prefix it-status <http://twitter.com/itaxidou/status/>
  
  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // User @RubenVerborgh modified and re-emitted the "Hello, world!" message
  prov:entity(RV-status:23456, [prov:type='prov-said:RevisedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  // User @itaxidou re-tweeted the revised message
  prov:entity(it-status:67891, [prov:type='prov-said:CopiedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
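
Because the message types are not mutually exclusive, a single entity can carry several prov:type attributes. The following sketch, written in the same notation as the example above, describes a reply that also mentions its addressee. The status identifier 78912 and the message text are hypothetical and serve only as an illustration.

  prefix it-status <http://twitter.com/itaxidou/status/>

  // Hypothetical: user @itaxidou replied to a message and mentioned @TomDeNies in the reply
  prov:entity(it-status:78912, [prov:type='prov-said:ReplyMessage', prov:type='prov-said:MentionMessage', prov:label="@TomDeNies Hello to you too!"])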
            

Message Attribution

Like any other prov:Entity, a prov-said:Message can be attributed to a prov:Agent. In this case, the prov:Agent represents the user who emitted the prov-said:Message.

  prefix twitter <http://twitter.com/>
  prefix TDN-status <http://twitter.com/TomDeNies/status/>

  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  prov:agent(twitter:TomDeNies)
  // TDN-status:12345 was emitted by twitter:TomDeNies
  prov:wasAttributedTo(TDN-status:12345, twitter:TomDeNies)
            

Message Emission

As the constraints make clear, the type of a message depends on whether its emission used other messages. To model this, prov-said:EmitMessage is defined as a subtype of prov:Activity.

prov-said:EmitMessage is a subtype of prov:Activity. It denotes the emission of a prov-said:Message, which is generated by the prov-said:EmitMessage.

  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>
  prefix it-status <http://twitter.com/itaxidou/status/>

  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // User @RubenVerborgh modified and re-emitted the "Hello, world!" message
  prov:entity(RV-status:23456, [prov:type='prov-said:RevisedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  // User @itaxidou re-tweeted the revised message
  prov:entity(it-status:67891, [prov:type='prov-said:CopiedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  
  // TDN-status:12345 was generated by the activity emit-12345
  prov:activity(emit-12345, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(TDN-status:12345, emit-12345)

  // RV-status:23456 was generated by the activity emit-23456, which used TDN-status:12345
  prov:activity(emit-23456, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(RV-status:23456, emit-23456)
  prov:used(emit-23456, TDN-status:12345)

  // it-status:67891 was generated by the activity emit-67891, which used RV-status:23456
  prov:activity(emit-67891, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(it-status:67891, emit-67891)
  prov:used(emit-67891, RV-status:23456)

            

Message Derivation

Whereas a prov-said:OriginalMessage does not depend on any other prov-said:Message, messages of type prov-said:CopiedMessage, prov-said:RevisedMessage, prov-said:ReplyMessage, or prov-said:EmotionMessage cannot exist on their own: each can be traced back, through derivation, to the message it builds on. [[PROV-DM]] already provides most of the concepts needed to model this for copied and revised messages, in the form of prov:Quotation, prov:Revision, and prov:PrimarySource, as illustrated by the example below.

  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>
  prefix it-status <http://twitter.com/itaxidou/status/>

  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // User @RubenVerborgh modified and re-emitted the "Hello, world!" message
  prov:entity(RV-status:23456, [prov:type='prov-said:RevisedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  // User @itaxidou re-tweeted the revised message
  prov:entity(it-status:67891, [prov:type='prov-said:CopiedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])

  // TDN-status:12345 was generated by the activity emit-12345
  prov:activity(emit-12345, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(TDN-status:12345, emit-12345)

  // RV-status:23456 was generated by the activity emit-23456, which used TDN-status:12345
  prov:activity(emit-23456, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(gen-23456; RV-status:23456, emit-23456)
  prov:used(use-12345; emit-23456, TDN-status:12345)

  // it-status:67891 was generated by the activity emit-67891, which used RV-status:23456
  prov:activity(emit-67891, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(gen-67891; it-status:67891, emit-67891)
  prov:used(use-23456; emit-67891, RV-status:23456)
  
  // RV-status:23456 was derived from TDN-status:12345,
  // which is also its primary source (at least in the context of Twitter)
  prov:wasDerivedFrom(RV-status:23456, TDN-status:12345, emit-23456, gen-23456, use-12345, [prov:type='prov:Revision', prov:type='prov:PrimarySource'])

  // it-status:67891 was quoted from RV-status:23456 (which is not its primary source)
  prov:wasDerivedFrom(it-status:67891, RV-status:23456, emit-67891, gen-67891, use-23456, [prov:type='prov:Quotation'])
            

Some applications that model information diffusion can also detect possible indirect dependencies between messages. In [[PROV-DM]], this can be modeled by asserting a derivation without specifying the activity, generation, or usage. Here, however, we provide a concept that indicates such an indirect connection explicitly: prov-said:IndirectDerivation.

prov-said:IndirectDerivation is a subtype of prov:Derivation. It denotes the transformation of a prov-said:Message into another, subject to an unknown number of prov:Quotation and/or prov:Revision.

  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>
  prefix it-status <http://twitter.com/itaxidou/status/>

  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // User @RubenVerborgh modified and re-emitted the "Hello, world!" message
  prov:entity(RV-status:23456, [prov:type='prov-said:RevisedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  // User @itaxidou re-tweeted the revised message
  prov:entity(it-status:67891, [prov:type='prov-said:CopiedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])

  // it-status:67891 was indirectly derived from TDN-status:12345
  prov:wasDerivedFrom(it-status:67891, TDN-status:12345, [prov:type='prov-said:IndirectDerivation'])
            

Replies and expressions of emotions about messages are specific types of derivation, since they cannot exist on their own. As these types are not modeled explicitly in [[PROV-DM]], we introduce two new subtypes of prov:Derivation to model this behaviour.

prov-said:Reply is a subtype of prov:Derivation. It denotes the generation of a prov-said:ReplyMessage by replying to a prov-said:Message.

prov-said:Emotion is a subtype of prov:Derivation. It denotes the generation of a prov-said:EmotionMessage by expressing an emotion about a prov-said:Message.

Note that a prov-said:MentionMessage can exist on its own, without being derived from an existing message. Therefore, no Derivation subtype is defined for mentions.
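
As a sketch of how these two derivation subtypes might be asserted, the following statements continue the running example with a reply to, and a like of, the original message. The status identifiers 34567 and 45678, the message texts, and the choice of emitting users are hypothetical.

  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>
  prefix it-status <http://twitter.com/itaxidou/status/>

  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // Hypothetical: user @RubenVerborgh replied to the original message
  prov:entity(RV-status:34567, [prov:type='prov-said:ReplyMessage', prov:label="Hello back!"])
  // Hypothetical: user @itaxidou liked the original message
  prov:entity(it-status:45678, [prov:type='prov-said:EmotionMessage', prov:label="like"])

  // RV-status:34567 was derived from TDN-status:12345 by replying to it
  prov:wasDerivedFrom(RV-status:34567, TDN-status:12345, [prov:type='prov-said:Reply'])
  // it-status:45678 was derived from TDN-status:12345 by expressing an emotion about it
  prov:wasDerivedFrom(it-status:45678, TDN-status:12345, [prov:type='prov-said:Emotion'])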

Influence

Another important aspect of social media is the influence to which users are subjected. We model this influence using influence types, influence activities, and influence roles.

Influence Types

In this section, we introduce extensions to prov:Influence: one generic type, prov-said:InfluenceRelationship, which denotes an influence between agents on social media, and four subtypes. prov-said:FollowRelationship denotes that an agent was influenced by another agent by establishing a unidirectional relationship to the latter; in most social media, this means subscribing to the messages of the other agent. prov-said:InteractionInfluence denotes that an agent was influenced through one or more interactions with another agent. prov-said:SelfInfluence and prov-said:ExternalInfluence denote influence by the agent itself and by an external source, respectively.

prov-said:InfluenceRelationship is a subtype of prov:Influence. It denotes an influence between agents in the context of a social network.

prov-said:FollowRelationship is a subtype of prov-said:InfluenceRelationship. It denotes that an agent was influenced by another agent, by explicitly subscribing to the messages emitted by the latter.

  prefix twitter <http://twitter.com/>

  prov:agent(twitter:TomDeNies)
  prov:agent(twitter:itaxidou)
 
  // User @TomDeNies followed user @itaxidou, so a prov-said:FollowRelationship existed between them
  prov:wasInfluencedBy(twitter:TomDeNies, twitter:itaxidou, [prov:type='prov-said:FollowRelationship'])
            

prov-said:InteractionInfluence is a subtype of prov-said:InfluenceRelationship. It denotes that an agent was influenced by another agent, by having done one or more of the following:

  • mentioned the latter agent;
  • quoted, revised, replied to, and/or expressed an emotion about at least one message of the latter agent.

prov-said:SelfInfluence is a subtype of prov-said:InfluenceRelationship. It denotes that an agent was influenced by him- or herself (e.g., while promoting or editing their own content).

prov-said:ExternalInfluence is a subtype of prov-said:InfluenceRelationship. It denotes that an agent was influenced by some external, possibly unknown entity (e.g., an event, a public speaker, etc.).

  prefix twitter <http://twitter.com/>
  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>

  prov:agent(twitter:TomDeNies)
  prov:agent(twitter:RubenVerborgh)

  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // User @RubenVerborgh modified and re-emitted the "Hello, world!" message
  prov:entity(RV-status:23456, [prov:type='prov-said:RevisedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  
  // User @RubenVerborgh revised a message from @TomDeNies, so a prov-said:InteractionInfluence existed between them
  prov:wasInfluencedBy(twitter:RubenVerborgh, twitter:TomDeNies, [prov:type='prov-said:InteractionInfluence'])
            

Note that although [[PROV-DM]] recommends using more specific relations than influence, the influence relationship has its own merit in the context of modeling social media information diffusion. By providing the prov-said:InfluenceRelationship type and its subtypes, it becomes possible to use provenance to reconstruct the social graph and interaction graph at a certain moment in time.

Influence Activities

Most traditional models for social influence graphs model only the direct relationships between agents. However, for in-depth analysis scenarios, it is also useful to provide more details about these relationships, such as when they started and ended, and what triggered them. For these purposes, we introduce the generic prov-said:InfluenceActivity, a subtype of prov:Activity, together with four subtypes of its own: prov-said:FollowActivity, prov-said:InteractionInfluenceActivity, prov-said:SelfInfluenceActivity, and prov-said:ExternalInfluenceActivity.

prov-said:InfluenceActivity is a subtype of prov:Activity. It denotes the activity of one agent influencing another.

prov-said:FollowActivity is a subtype of prov-said:InfluenceActivity. It denotes the activity of one agent following another.

  prefix twitter <http://twitter.com/>

  prov:agent(twitter:TomDeNies)
  prov:agent(twitter:itaxidou)

  // User @TomDeNies followed user @itaxidou, so a prov-said:FollowRelationship existed between them
  prov:wasInfluencedBy(twitter:TomDeNies, twitter:itaxidou, [prov:type='prov-said:FollowRelationship'])
  
  // A prov-said:FollowActivity was started at the moment user @TomDeNies followed user @itaxidou.
  // Since @TomDeNies was still following @itaxidou at the time of assertion, there is no end time for the activity.
  activity(tomdenies-follows-itaxidou, 2015-01-09T13:00:00, - , [ prov:type='prov-said:FollowActivity' ])
            

prov-said:InteractionInfluenceActivity is a subtype of prov-said:InfluenceActivity. It denotes the activity of one agent influencing another by interacting with the latter.

  prefix twitter <http://twitter.com/>
  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>

  prov:agent(twitter:TomDeNies)
  prov:agent(twitter:RubenVerborgh)

  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // User @RubenVerborgh modified and re-emitted the "Hello, world!" message
  prov:entity(RV-status:23456, [prov:type='prov-said:RevisedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  // RV-status:23456 was generated by the activity emit-23456, which used TDN-status:12345
  prov:activity(emit-23456, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(gen-23456; RV-status:23456, emit-23456)
  prov:used(use-12345; emit-23456, TDN-status:12345)
  
  // A prov-said:InteractionInfluenceActivity was started and ended at the moment user @RubenVerborgh modified and re-emitted the message.
  activity(rubenverborgh-influencedby-tomdenies, 2015-01-09T13:05:00, 2015-01-09T13:05:00 , [ prov:type='prov-said:InteractionInfluenceActivity' ])
  wasStartedBy(rubenverborgh-influencedby-tomdenies, RV-status:23456, emit-23456, 2015-01-09T13:05:00)
  wasEndedBy(rubenverborgh-influencedby-tomdenies, RV-status:23456, emit-23456, 2015-01-09T13:05:00)
  
  // User @RubenVerborgh revised a message from @TomDeNies, so a prov-said:InteractionInfluence existed between them
  prov:wasInfluencedBy(twitter:RubenVerborgh, twitter:TomDeNies, [prov:type='prov-said:InteractionInfluence'])
            

Note that a prov-said:InteractionInfluenceActivity is instantaneous, and thus has the same start and end time. This means that a new prov-said:InteractionInfluenceActivity is asserted for every interaction. However, it is not necessary to re-assert a wasInfluencedBy(a1, a2, [prov:type='prov-said:InfluenceRelationship']) statement, in which agent a1 was influenced by agent a2, for every prov-said:InteractionInfluenceActivity that is associated with a1 and uses a2. Multiple interactions between the same agents in the same direction do not change the fact that one agent was influenced by the other through interaction at some point, which is all that prov-said:InteractionInfluence signifies.

prov-said:SelfInfluenceActivity is a subtype of prov-said:InfluenceActivity. It denotes the activity of one agent influencing him- or herself.

prov-said:ExternalInfluenceActivity is a subtype of prov-said:InfluenceActivity. It denotes the activity of one agent being influenced by an external, possibly unknown entity.

Influence Roles

Finally, to clarify the roles of the agents involved in a prov-said:InfluenceRelationship, we define six values for the prov:role attribute, to be used with prov:Association and prov:Usage.

prov-said:Influencer is used as the value of a prov:role attribute, in the context of a prov:Usage of a prov:Agent by a prov-said:InfluenceActivity. It denotes that the used prov:Agent influences the prov:Agent associated with the prov-said:InfluenceActivity.

prov-said:Influencee is used as the value of a prov:role attribute, in the context of a prov:Association of a prov-said:InfluenceActivity with a prov:Agent. It denotes that the associated prov:Agent is influenced by the prov:Agent used by the prov-said:InfluenceActivity.

  prefix twitter <http://twitter.com/>

  prov:agent(twitter:TomDeNies)
  prov:agent(twitter:RubenVerborgh)

  // User @RubenVerborgh was influenced by @TomDeNies
  prov:wasInfluencedBy(twitter:RubenVerborgh, twitter:TomDeNies, [prov:type='prov-said:InfluenceRelationship'])

  // A prov-said:InfluenceActivity is used to model the influence
  activity(rubenverborgh-influencedby-tomdenies, 2015-01-09T13:05:00, - , [ prov:type='prov-said:InfluenceActivity' ])
  used(rubenverborgh-influencedby-tomdenies, twitter:TomDeNies, [ prov:role='prov-said:Influencer' ])
  wasAssociatedWith(rubenverborgh-influencedby-tomdenies, twitter:RubenVerborgh, [ prov:role='prov-said:Influencee' ])
            

Note that in the case of prov-said:SelfInfluence, the prov-said:Influencer and prov-said:Influencee are the same, and that for prov-said:ExternalInfluence, the prov-said:Influencer can be unknown.
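
The self-influence case might be written as in the sketch below, where the same agent appears in both the usage and the association. The scenario, the activity identifier, and the start time are hypothetical. Typing the activity as prov-said:SelfInfluenceActivity and using the generic roles are assumptions based on the definitions above.

  prefix twitter <http://twitter.com/>

  prov:agent(twitter:TomDeNies)

  // Hypothetical: user @TomDeNies was influenced by himself, e.g., while promoting his own content
  prov:wasInfluencedBy(twitter:TomDeNies, twitter:TomDeNies, [prov:type='prov-said:SelfInfluence'])

  // The same agent is both used (as influencer) and associated (as influencee)
  activity(tomdenies-influencedby-tomdenies, 2015-01-09T14:00:00, - , [ prov:type='prov-said:SelfInfluenceActivity' ])
  used(tomdenies-influencedby-tomdenies, twitter:TomDeNies, [ prov:role='prov-said:Influencer' ])
  wasAssociatedWith(tomdenies-influencedby-tomdenies, twitter:TomDeNies, [ prov:role='prov-said:Influencee' ])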

prov-said:Followee is a subtype of prov-said:Influencer and is used as the value of a prov:role attribute, in the context of a prov:Usage of a prov:Agent by a prov-said:FollowActivity. It denotes that the used prov:Agent is followed by the prov:Agent associated with the prov-said:FollowActivity.

prov-said:Follower is a subtype of prov-said:Influencee and is used as the value of a prov:role attribute, in the context of a prov:Association of a prov-said:FollowActivity with a prov:Agent. It denotes that the associated prov:Agent follows the prov:Agent used by the prov-said:FollowActivity.

  prefix twitter <http://twitter.com/>

  prov:agent(twitter:TomDeNies)
  prov:agent(twitter:itaxidou)

  // User @TomDeNies followed user @itaxidou, so a prov-said:FollowRelationship existed between them
  prov:wasInfluencedBy(twitter:TomDeNies, twitter:itaxidou, [prov:type='prov-said:FollowRelationship'])

  // A prov-said:FollowActivity was started at the moment user @TomDeNies followed user @itaxidou.
  activity(tomdenies-follows-itaxidou, 2015-01-09T13:00:00, - , [ prov:type='prov-said:FollowActivity' ])
  used(tomdenies-follows-itaxidou, twitter:itaxidou, [ prov:role='prov-said:Followee' ])
  wasAssociatedWith(tomdenies-follows-itaxidou, twitter:TomDeNies, [ prov:role='prov-said:Follower' ])
            

prov-said:InteractionInfluencer is a subtype of prov-said:Influencer and is used as the value of a prov:role attribute, in the context of a prov:Usage of a prov:Agent by a prov-said:InteractionInfluenceActivity. It denotes that the used prov:Agent influences the prov:Agent associated with the prov-said:InteractionInfluenceActivity through an interaction.

prov-said:InteractionInfluencee is a subtype of prov-said:Influencee and is used as the value of a prov:role attribute, in the context of a prov:Association of a prov-said:InteractionInfluenceActivity with a prov:Agent. It denotes that the associated prov:Agent is influenced by the prov:Agent used by the prov-said:InteractionInfluenceActivity through an interaction.

  prefix twitter <http://twitter.com/>
  prefix TDN-status <http://twitter.com/TomDeNies/status/>
  prefix RV-status <http://twitter.com/RubenVerborgh/status/>

  prov:agent(twitter:TomDeNies)
  prov:agent(twitter:RubenVerborgh)

  // User @RubenVerborgh was influenced by @TomDeNies
  prov:wasInfluencedBy(twitter:RubenVerborgh, twitter:TomDeNies, [prov:type='prov-said:InteractionInfluence'])

  // User @TomDeNies tweeted a message "Hello, world!"
  prov:entity(TDN-status:12345, [prov:type='prov-said:OriginalMessage', prov:label="Hello, world!"])
  // User @RubenVerborgh modified and re-emitted the "Hello, world!" message
  prov:entity(RV-status:23456, [prov:type='prov-said:RevisedMessage', prov:label="Hello from me too! MT @TomDeNies: Hello, world!"])
  // RV-status:23456 was generated by the activity emit-23456, which used TDN-status:12345
  prov:activity(emit-23456, [prov:type='prov-said:EmitMessage'])
  prov:wasGeneratedBy(gen-23456; RV-status:23456, emit-23456)
  prov:used(use-12345; emit-23456, TDN-status:12345)

  // A prov-said:InteractionInfluenceActivity was started and ended at the moment user @RubenVerborgh modified and re-emitted the message.
  activity(rubenverborgh-influencedby-tomdenies, 2015-01-09T13:05:00, 2015-01-09T13:05:00 , [ prov:type='prov-said:InteractionInfluenceActivity' ])
  wasStartedBy(rubenverborgh-influencedby-tomdenies, RV-status:23456, emit-23456, 2015-01-09T13:05:00)
  wasEndedBy(rubenverborgh-influencedby-tomdenies, RV-status:23456, emit-23456, 2015-01-09T13:05:00)
 
  used(rubenverborgh-influencedby-tomdenies, twitter:TomDeNies, [ prov:role='prov-said:InteractionInfluencer' ])
  wasAssociatedWith(rubenverborgh-influencedby-tomdenies, twitter:RubenVerborgh, [ prov:role='prov-said:InteractionInfluencee' ])

            

Note that in the case of prov-said:Followee and prov-said:Follower, the -ee and -er suffixes are reversed when compared to the other roles. While this might seem counter-intuitive at first, it is in fact correct. The best way to avoid mistakes is to remember that the used agent exerts the influence and the associated agent is being influenced. Indeed, in the case of a prov-said:FollowActivity, the used prov-said:Followee influences the associated prov-said:Follower. However, in the case of prov-said:InteractionInfluenceActivity, the used prov-said:InteractionInfluencer influences the associated prov-said:InteractionInfluencee.

Extensions to PROV-Constraints

In this section, we describe the constraints and inferences that govern the use of the concepts described above. These constraints and inferences are put in place to ensure correct use and semantics of the PROV-SAID concepts. Since all PROV-SAID concepts are attributes, their use in combination with [[PROV-DM]] will always result in a valid PROV instance, as long as the PROV instance itself (without the PROV-SAID attributes) is valid. Additionally, compliance with the constraints and inferences in this section constitutes a valid PROV-SAID instance. In order to fully understand this section and its notational conventions such as the use of PROV-N and underscores, we highly recommend reading the conventions, basic concepts and definitions of the original [[PROV-CONSTRAINTS]] as well.

Inferences

A prov-said:Message is always generated by a prov-said:EmitMessage.

Inference 1 (message-generation)

IF prov:entity(m1, [prov:type='prov-said:Message']) THEN there exists an a1 for which prov:wasGeneratedBy(m1, a1) and 'prov-said:EmitMessage' ∈ typeOf(a1) hold.

A prov-said:CopiedMessage, prov-said:RevisedMessage, prov-said:ReplyMessage, or prov-said:EmotionMessage is always generated by a prov-said:EmitMessage that uses another prov-said:Message.

Inference 2 (message-generation-usage)
  1. IF prov:entity(m1, [prov:type='prov-said:CopiedMessage']) and prov:activity(a1, [prov:type='prov-said:EmitMessage']) and prov:wasGeneratedBy(m1, a1) THEN there exists an m2 for which prov:used(a1, m2) holds.
  2. IF prov:entity(m3, [prov:type='prov-said:RevisedMessage']) and prov:activity(a2, [prov:type='prov-said:EmitMessage']) and prov:wasGeneratedBy(m3, a2) THEN there exists an m4 for which prov:used(a2, m4) holds.
  3. IF prov:entity(m3, [prov:type='prov-said:ReplyMessage']) and prov:activity(a2, [prov:type='prov-said:EmitMessage']) and prov:wasGeneratedBy(m3, a2) THEN there exists an m4 for which prov:used(a2, m4) holds.
  4. IF prov:entity(m3, [prov:type='prov-said:EmotionMessage']) and prov:activity(a2, [prov:type='prov-said:EmitMessage']) and prov:wasGeneratedBy(m3, a2) THEN there exists an m4 for which prov:used(a2, m4) holds.

A prov-said:Message must always be attributed to a prov:Agent.

Inference 3 (message-attribution)

IF 'prov-said:Message' ∈ typeOf(e1) THEN there exists an a1 for which prov:wasAttributedTo(e1, a1) holds.

Generation of a prov-said:CopiedMessage m2 and usage of a prov-said:Message m1 by a prov-said:EmitMessage implies that m2 was derived from m1 by means of prov:Quotation. Analogously, generation of a prov-said:RevisedMessage m4 and usage of a prov-said:Message m3 by a prov-said:EmitMessage implies that m4 was derived from m3 by means of prov:Revision.

Inference 4 (copied-revised-message-implies-derivation)
  1. IF prov:activity(a1, [prov:type='prov-said:EmitMessage']) and prov:entity(m1, [prov:type='prov-said:Message']) and prov:entity(m2, [prov:type='prov-said:CopiedMessage']) and prov:wasGeneratedBy(g1; m2, a1) and prov:used(u1; a1, m1) THEN prov:wasDerivedFrom(m2, m1, a1, g1, u1, [prov:type='prov:Quotation'])
  2. IF prov:activity(a2, [prov:type='prov-said:EmitMessage']) and prov:entity(m3, [prov:type='prov-said:Message']) and prov:entity(m4, [prov:type='prov-said:RevisedMessage']) and prov:wasGeneratedBy(g2; m4, a2) and prov:used(u2; a2, m3) THEN prov:wasDerivedFrom(m4, m3, a2, g2, u2, [prov:type='prov:Revision'])
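
As an illustration, applying Inference 4 to the generation and usage statements of the running example from the Message Derivation section would yield the derivations below. This is a sketch of the expected result under that example's identifiers, not the output of an actual reasoner.

  // Given prov:wasGeneratedBy(gen-23456; RV-status:23456, emit-23456)
  // and prov:used(use-12345; emit-23456, TDN-status:12345),
  // where RV-status:23456 is a prov-said:RevisedMessage, Inference 4.2 yields:
  prov:wasDerivedFrom(RV-status:23456, TDN-status:12345, emit-23456, gen-23456, use-12345, [prov:type='prov:Revision'])

  // Given prov:wasGeneratedBy(gen-67891; it-status:67891, emit-67891)
  // and prov:used(use-23456; emit-67891, RV-status:23456),
  // where it-status:67891 is a prov-said:CopiedMessage, Inference 4.1 yields:
  prov:wasDerivedFrom(it-status:67891, RV-status:23456, emit-67891, gen-67891, use-23456, [prov:type='prov:Quotation'])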

To allow for future extensions of prov-said:InfluenceRelationship, we ensure that it always implies a prov-said:InfluenceActivity, a prov:Usage, and a prov:Association.
These inferences also specify the prov:role that the prov:Usage and prov:Association of a prov-said:InfluenceActivity should have.

Inference 5 (influence-activity-association-usage)
  1. IF prov:wasInfluencedBy(ag1, ag2, [prov:type='prov-said:InfluenceRelationship']) THEN there exists an a1, as1 and u1 for which prov:activity(a1, [prov:type='prov-said:InfluenceActivity']) and prov:wasAssociatedWith(as1; a1, ag1, [prov:role='prov-said:Influencee']) and prov:used(u1; a1, ag2, [prov:role='prov-said:Influencer']) hold.
  2. IF prov:wasInfluencedBy(ag3, ag4, [prov:type='prov-said:FollowRelationship']) THEN there exists an a2, as2 and u2 for which prov:activity(a2, [prov:type='prov-said:FollowActivity']) and prov:wasAssociatedWith(as2; a2, ag3, [prov:role='prov-said:Follower']) and prov:used(u2; a2, ag4, [prov:role='prov-said:Followee']) hold.
  3. IF prov:wasInfluencedBy(ag5, ag6, [prov:type='prov-said:InteractionInfluence']) THEN there exists an a3, as3 and u3 for which prov:activity(a3, [prov:type='prov-said:InteractionInfluenceActivity']) and prov:wasAssociatedWith(as3; a3, ag5, [prov:role='prov-said:InteractionInfluencee']) and prov:used(u3; a3, ag6, [prov:role='prov-said:InteractionInfluencer']) hold.
  4. IF prov:wasInfluencedBy(ag7, ag7, [prov:type='prov-said:SelfInfluence']) THEN there exists an a4, as4 and u4 for which prov:activity(a4, [prov:type='prov-said:SelfInfluenceActivity']) and prov:wasAssociatedWith(as4; a4, ag7, [prov:role='prov-said:Influencee']) and prov:used(u4; a4, ag7, [prov:role='prov-said:Influencer']) hold.
  5. IF prov:wasInfluencedBy(ag8, -, [prov:type='prov-said:ExternalInfluence']) THEN there exists an a5 and as5 for which prov:activity(a5, [prov:type='prov-said:ExternalInfluenceActivity']) and prov:wasAssociatedWith(as5; a5, ag8, [prov:role='prov-said:Influencee']) hold.

A prov-said:InteractionInfluenceActivity must always start (and end, since it is instantaneous) with the emission of a prov-said:CopiedMessage or prov-said:RevisedMessage.

Inference 6 (interactioninfluenceactivity-start)

IF prov:activity(a1, [prov:type='prov-said:InteractionInfluenceActivity']) THEN there exists an e2, _a2 and _t2 for which prov:wasStartedBy(a1, e2, _a2, _t2) and prov:wasEndedBy(a1, e2, _a2, _t2) and ('prov-said:CopiedMessage' ∈ typeOf(e2) or 'prov-said:RevisedMessage' ∈ typeOf(e2)) hold.

The roles prov-said:Followee and prov-said:InteractionInfluencer imply prov-said:Influencer.
The roles prov-said:Follower and prov-said:InteractionInfluencee imply prov-said:Influencee.

Inference 7 (influencer-influencee-subtypes)
  1. IF prov:used(_a1, _ag1, [prov:role='prov-said:Followee']) THEN prov:used(_a1, _ag1, [prov:role='prov-said:Influencer'])
  2. IF prov:used(_a2, _ag2, [prov:role='prov-said:InteractionInfluencer']) THEN prov:used(_a2, _ag2, [prov:role='prov-said:Influencer'])
  3. IF prov:used(_a3, _ag3, [prov:role='prov-said:Follower']) THEN prov:used(_a3, _ag3, [prov:role='prov-said:Influencee'])
  4. IF prov:used(_a4, _ag4, [prov:role='prov-said:InteractionInfluencee']) THEN prov:used(_a4, _ag4, [prov:role='prov-said:Influencee'])
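
The role implications can be sketched as a simple mapping. The sketch below uses the role names from the prose statement above (prov-said:Influencer and prov-said:Influencee); the function name `with_implied_roles` is an assumption made for illustration.

```python
# Specific role -> general role it implies, per Inference 7.
IMPLIED_ROLE = {
    "prov-said:Followee": "prov-said:Influencer",
    "prov-said:InteractionInfluencer": "prov-said:Influencer",
    "prov-said:Follower": "prov-said:Influencee",
    "prov-said:InteractionInfluencee": "prov-said:Influencee",
}


def with_implied_roles(roles):
    """Return the given set of roles plus every general role they imply."""
    return set(roles) | {IMPLIED_ROLE[r] for r in roles if r in IMPLIED_ROLE}
```

Roles that are not covered by the inference are passed through unchanged.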

Constraints

A prov-said:EmitMessage that generated a prov-said:OriginalMessage may not use a prov-said:Message.

Constraint 1 (originalmessage-generation-usage)

IF prov:entity(m2, [prov:type='prov-said:OriginalMessage']) and prov:activity(a1, [prov:type='prov-said:EmitMessage']) and prov:entity(m1, [prov:type='prov-said:Message']) and prov:wasGeneratedBy(m2, a1) and prov:used(a1, m1) THEN INVALID
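
A minimal sketch of a check for Constraint 1, assuming that generation and usage statements are given as sets of pairs and that `type_of` already contains all inferred types. These data structures and the function name are hypothetical and serve only as illustration.

```python
def violates_constraint_1(generated_by, used, type_of):
    """Return (message, activity, used message) triples that violate Constraint 1.

    generated_by: set of (entity, activity) pairs
    used:         set of (activity, entity) pairs
    type_of:      dict from identifier to its set of types
    """
    violations = []
    for message, activity in sorted(generated_by):
        if "prov-said:OriginalMessage" not in type_of.get(message, set()):
            continue
        if "prov-said:EmitMessage" not in type_of.get(activity, set()):
            continue
        # The emitting activity must not use any prov-said:Message.
        for user, source in sorted(used):
            if user == activity and "prov-said:Message" in type_of.get(source, set()):
                violations.append((message, activity, source))
    return violations
```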

A prov-said:OriginalMessage may not be derived from a prov-said:Message.

Constraint 2 (originalmessage-derivation)

IF prov:entity(m2, [prov:type='prov-said:OriginalMessage']) and prov:entity(m1, [prov:type='prov-said:Message']) and prov:wasDerivedFrom(m2, m1) THEN INVALID

Note that this does not mean that a prov-said:OriginalMessage cannot be derived from anything. Constraints 1 and 2 apply only to activities of type prov-said:EmitMessage and entities of type prov-said:Message. For example, a prov-said:OriginalMessage can still be derived from an external source, such as a news article.
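
The distinction can be illustrated with a small sketch of Constraint 2. A derivation from another message is flagged, while a derivation from an entity that is not a prov-said:Message, such as a news article, is accepted. The identifiers and the pair-based encoding are assumptions of this sketch.

```python
def violates_constraint_2(derived_from, type_of):
    """Return the (derived, source) pairs that violate Constraint 2.

    derived_from: set of (derived entity, source entity) pairs
    type_of:      dict from identifier to its set of types
    """
    return [
        (derived, source)
        for derived, source in sorted(derived_from)
        if "prov-said:OriginalMessage" in type_of.get(derived, set())
        and "prov-said:Message" in type_of.get(source, set())
    ]


# Hypothetical identifiers: ex:article is an external source, not a message.
types = {
    "ex:m2": {"prov-said:OriginalMessage", "prov-said:Message"},
    "ex:m1": {"prov-said:Message"},
    "ex:article": {"prov:Entity"},
}
derivations = {("ex:m2", "ex:article"), ("ex:m2", "ex:m1")}
print(violates_constraint_2(derivations, types))
# prints [('ex:m2', 'ex:m1')]
```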

A prov-said:InteractionInfluenceActivity is instantaneous.

Constraint 3 (interactioninfluenceactivity-instantaneous)

IF prov:activity(a1, t1, t2, [prov:type='prov-said:InteractionInfluenceActivity']) THEN t1 = t2
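
Reading Constraint 3 in the style of a PROV-CONSTRAINTS uniqueness constraint, two different concrete times cannot be unified, so such an activity makes the instance invalid. A minimal sketch, assuming activities are stored as (start, end, types) tuples keyed by identifier, which is a hypothetical encoding:

```python
from datetime import datetime

INSTANT_TYPE = "prov-said:InteractionInfluenceActivity"


def non_instantaneous(activities):
    """Return ids of InteractionInfluenceActivity records whose start and end differ.

    activities: dict from id to (start, end, set of types)
    """
    return sorted(
        ident
        for ident, (start, end, types) in activities.items()
        if INSTANT_TYPE in types and start != end
    )
```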

A prov-said:SelfInfluence implies that the prov:Agents that influence each other are the same.

Constraint 4 (selfinfluence)

IF prov:wasInfluencedBy(ag1, ag2, [prov:type='prov-said:SelfInfluence']) and ag1 ≠ ag2 THEN INVALID

Type Constraints

The types prov-said:OriginalMessage, prov-said:CopiedMessage, prov-said:RevisedMessage, prov-said:ReplyMessage, prov-said:EmotionMessage, and prov-said:MentionMessage are subtypes of prov-said:Message and prov:Entity.

Constraint 4 (message-subtypes)
  1. IF 'prov-said:Message' ∈ typeOf(id) THEN 'prov:Entity' ∈ typeOf(id).
  2. IF 'prov-said:OriginalMessage' ∈ typeOf(id) THEN 'prov-said:Message' ∈ typeOf(id).
  3. IF 'prov-said:CopiedMessage' ∈ typeOf(id) THEN 'prov-said:Message' ∈ typeOf(id).
  4. IF 'prov-said:RevisedMessage' ∈ typeOf(id) THEN 'prov-said:Message' ∈ typeOf(id).
  5. IF 'prov-said:ReplyMessage' ∈ typeOf(id) THEN 'prov-said:Message' ∈ typeOf(id).
  6. IF 'prov-said:EmotionMessage' ∈ typeOf(id) THEN 'prov-said:Message' ∈ typeOf(id).
  7. IF 'prov-said:MentionMessage' ∈ typeOf(id) THEN 'prov-said:Message' ∈ typeOf(id).
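
Subtype rules of this kind can be applied by computing the closure of an identifier's types. The following sketch transcribes the message subtype rules above into a parent table; the function name `type_closure` and the single-parent table are assumptions of the sketch.

```python
# Each type maps to its direct supertype, per the message subtype rules above.
SUPERTYPE = {
    "prov-said:Message": "prov:Entity",
    "prov-said:OriginalMessage": "prov-said:Message",
    "prov-said:CopiedMessage": "prov-said:Message",
    "prov-said:RevisedMessage": "prov-said:Message",
    "prov-said:ReplyMessage": "prov-said:Message",
    "prov-said:EmotionMessage": "prov-said:Message",
    "prov-said:MentionMessage": "prov-said:Message",
}


def type_closure(types):
    """Return the given types plus all supertypes implied by the rules above."""
    closed = set(types)
    pending = list(closed)
    while pending:
        parent = SUPERTYPE.get(pending.pop())
        if parent is not None and parent not in closed:
            closed.add(parent)
            pending.append(parent)
    return closed
```

The same approach would extend to the other subtype constraints in this section by adding their rules to the table.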

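The types prov-said:OriginalMessage, prov-said:CopiedMessage, and prov-said:RevisedMessage are pairwise disjoint. In addition, prov-said:OriginalMessage is disjoint with prov-said:ReplyMessage and prov-said:EmotionMessage.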

Constraint 5 (messagetypes-disjoint)
  1. IF 'prov-said:OriginalMessage' ∈ typeOf(id) and 'prov-said:CopiedMessage' ∈ typeOf(id) THEN INVALID.
  2. IF 'prov-said:CopiedMessage' ∈ typeOf(id) and 'prov-said:RevisedMessage' ∈ typeOf(id) THEN INVALID.
  3. IF 'prov-said:RevisedMessage' ∈ typeOf(id) and 'prov-said:OriginalMessage' ∈ typeOf(id) THEN INVALID.
  4. IF 'prov-said:OriginalMessage' ∈ typeOf(id) and 'prov-said:ReplyMessage' ∈ typeOf(id) THEN INVALID.
  5. IF 'prov-said:OriginalMessage' ∈ typeOf(id) and 'prov-said:EmotionMessage' ∈ typeOf(id) THEN INVALID.
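
A disjointness check of this kind reduces to testing whether both members of a forbidden pair occur among the types of one identifier. The sketch below lists the five pairs of the constraint above; the function name is hypothetical.

```python
# The five forbidden type combinations, in the order of the constraint above.
DISJOINT_PAIRS = [
    ("prov-said:OriginalMessage", "prov-said:CopiedMessage"),
    ("prov-said:CopiedMessage", "prov-said:RevisedMessage"),
    ("prov-said:RevisedMessage", "prov-said:OriginalMessage"),
    ("prov-said:OriginalMessage", "prov-said:ReplyMessage"),
    ("prov-said:OriginalMessage", "prov-said:EmotionMessage"),
]


def disjointness_violations(types):
    """Return every forbidden pair that is fully contained in `types`."""
    types = set(types)
    return [pair for pair in DISJOINT_PAIRS if types.issuperset(pair)]
```

Note that the pairs do not forbid, for example, a message that is both a prov-said:CopiedMessage and a prov-said:ReplyMessage.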

The type prov-said:EmitMessage is a subtype of prov:Activity.

Constraint 6 (emission-subtype-activity)

IF 'prov-said:EmitMessage' ∈ typeOf(id) THEN 'prov:Activity' ∈ typeOf(id).

The types prov-said:IndirectDerivation, prov-said:Reply, and prov-said:Emotion are subtypes of prov:Derivation.

Constraint 7 (derivation-subtypes)
  1. IF 'prov-said:IndirectDerivation' ∈ typeOf(id) THEN 'prov:Derivation' ∈ typeOf(id).
  2. IF 'prov-said:Reply' ∈ typeOf(id) THEN 'prov:Derivation' ∈ typeOf(id).
  3. IF 'prov-said:Emotion' ∈ typeOf(id) THEN 'prov:Derivation' ∈ typeOf(id).

The types prov-said:FollowRelationship, prov-said:InteractionInfluence, prov-said:ExternalInfluence, and prov-said:SelfInfluence are subtypes of prov-said:InfluenceRelationship and prov:Influence.

Constraint 8 (influence-subtypes)
  1. IF 'prov-said:InfluenceRelationship' ∈ typeOf(id) THEN 'prov:Influence' ∈ typeOf(id).
  2. IF 'prov-said:FollowRelationship' ∈ typeOf(id) THEN 'prov-said:InfluenceRelationship' ∈ typeOf(id).
  3. IF 'prov-said:InteractionInfluence' ∈ typeOf(id) THEN 'prov-said:InfluenceRelationship' ∈ typeOf(id).
  4. IF 'prov-said:ExternalInfluence' ∈ typeOf(id) THEN 'prov-said:InfluenceRelationship' ∈ typeOf(id).
  5. IF 'prov-said:SelfInfluence' ∈ typeOf(id) THEN 'prov-said:InfluenceRelationship' ∈ typeOf(id).

The types prov-said:FollowActivity, prov-said:InteractionInfluenceActivity, prov-said:ExternalInfluenceActivity, and prov-said:SelfInfluenceActivity are subtypes of prov-said:InfluenceActivity and prov:Activity.

Constraint 9 (influenceactivity-subtypes)
  1. IF 'prov-said:InfluenceActivity' ∈ typeOf(id) THEN 'prov:Activity' ∈ typeOf(id).
  2. IF 'prov-said:FollowActivity' ∈ typeOf(id) THEN 'prov-said:InfluenceActivity' ∈ typeOf(id).
  3. IF 'prov-said:InteractionInfluenceActivity' ∈ typeOf(id) THEN 'prov-said:InfluenceActivity' ∈ typeOf(id).
  4. IF 'prov-said:ExternalInfluenceActivity' ∈ typeOf(id) THEN 'prov-said:InfluenceActivity' ∈ typeOf(id).
  5. IF 'prov-said:SelfInfluenceActivity' ∈ typeOf(id) THEN 'prov-said:InfluenceActivity' ∈ typeOf(id).

Serializations

Since all our extensions to PROV are in the form of attributes and values, no changes have to be made in the current serializations available for PROV. The values of prov:type and prov:role are asserted in the same way as in the serialization specifications [[PROV-N]], [[PROV-O]], and [[PROV-XML]].
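
As a non-normative illustration, the sketch below renders PROV-N statements that carry extension values in the standard prov:type attribute. The helper `provn_statement` and the identifiers `ex:m1` and `ex:a1` are hypothetical; only the attribute mechanism itself comes from PROV.

```python
def provn_statement(name, args, attributes):
    """Render one PROV-N statement whose attribute values are qualified names."""
    attrs = ", ".join(f"{key}='{value}'" for key, value in attributes.items())
    return f"{name}({', '.join(args)}, [{attrs}])"


print(provn_statement("entity", ["ex:m1"],
                      {"prov:type": "prov-said:OriginalMessage"}))
# prints entity(ex:m1, [prov:type='prov-said:OriginalMessage'])

print(provn_statement("activity", ["ex:a1"],
                      {"prov:type": "prov-said:EmitMessage"}))
# prints activity(ex:a1, [prov:type='prov-said:EmitMessage'])
```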

Acknowledgements

This document was produced in the context of research activities that were funded by Ghent University, iMinds, the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT), the Fund for Scientific Research-Flanders (FWO-Flanders), and the European Union.

The editors also thank Ben De Meester, Pieter Heyvaert, Ruben Verborgh, Anastasia Dimou and Peter Fischer for their suggestions and reviews.