Saturday, March 13, 2010

On events versus data

The word "data" always reminds me of the android from Star Trek: The Next Generation, whose name was Data. In computing, the word "data" is typically very general and refers to anything that is represented on digital media; the picture of Data above is also a piece of data, like many other things. The word "event" is also a broad term: it means something that happened.

Recently Paul Vincent wondered in his blog about the difference between events and data, as some people think that events are footnotes to data. Since, by the definitions above, events and data are obviously not the same thing, I'll talk about the touch points between them, since those are the source of the misconceptions.

There are various touch points between events and data:

  1. Event representation contains data. An event is represented in the computing domain by an "event object" or "event message", which is usually also called an "event" for short. This event representation includes information about the event: what the event type is, where it happened, when it happened, what happened, who the players were, and so on. Example: the event is "entering the building", and the event's payload contains information that answers questions such as: which building? who entered? when? and maybe more. The payload of the event is data; it may be stored (see event store), or just pass through the system.
  2. A data store can store historical events. Event representations can be accumulated and stored in a data store for further use. There are large data stores that collect weather events. Note that in order to navigate through historical events, these events may be stored in a temporal database, an area that I've dealt with in the past; if the events are spatial, they sometimes have to be stored in a spatiotemporal database.
  3. A database can be an event producer. In active databases the events were database operations: insert, modify, delete and retrieve. In this case the fact that some data element has been updated or accessed is the "something that happens" (which may or may not reflect something that happens in reality), and the database acts as an event producer and emits events for processing by an event processing network. Note that actually every event producer contains some data that is turned into events; one example is transaction instrumentation, like what IBM has done with CICS as an event producer.
  4. Derived events as database updates. An event processing application takes events from somewhere as input, does something, creates derived events, and sends them somewhere; this is all of event processing in one sentence. A derived event created in this process may go to an event consumer, and the event consumer may be a DBMS or another type of consumer whose action is to update some data store.
  5. Event enrichment by data during event processing. During event processing operations, enrichment of events is sometimes required. Let's return to the event of a person entering a building: the event processing application deals with security access control and needs to know the person's security clearance. This information is not provided with the event, which provides only the identification of the person, so there needs to be an enrichment process in which an enrichment event processing agent accesses some global store, in this case reference data, to extract the clearance value and put it inside the event for further processing.
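
Several of these touch points can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any particular event processing product works: the names `Event`, `CLEARANCES`, `REQUIRED`, `enrich`, `derive_violation`, `process` and `event_store` are all made up for the example, the clearance check is a plain equality test, and a simple list stands in for the event store.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Event:
    """An event object: a type, an occurrence time, and a payload of data."""
    event_type: str
    occurred_at: datetime
    payload: dict


# Hypothetical reference data: person id -> security clearance.
CLEARANCES = {"alice": "secret", "bob": "none"}

# Hypothetical policy data: building -> clearance required to enter.
REQUIRED = {"lab": "secret"}

# Stands in for a data store that accumulates historical events.
event_store = []


def enrich(event: Event, clearances: dict) -> Event:
    """Enrichment agent: return a copy of the event with the clearance added."""
    person = event.payload["person"]
    clearance = clearances.get(person, "unknown")
    return Event(event.event_type, event.occurred_at,
                 {**event.payload, "clearance": clearance})


def derive_violation(event: Event, required: dict) -> Optional[Event]:
    """Create a derived event when the clearance differs from the requirement."""
    needed = required.get(event.payload["building"])
    if needed is None or event.payload["clearance"] == needed:
        return None
    return Event("access_violation", event.occurred_at,
                 {**event.payload, "needed": needed})


def process(event: Event) -> Optional[Event]:
    """Enrich the input event, store it, and store any derived event too."""
    enriched = enrich(event, CLEARANCES)
    event_store.append(enriched)
    derived = derive_violation(enriched, REQUIRED)
    if derived is not None:
        event_store.append(derived)  # the data store acts as an event consumer
    return derived


entered = Event("building_entered", datetime(2010, 3, 13, 9, 0),
                {"person": "bob", "building": "lab"})
violation = process(entered)
print(violation)
```

The original event carries only the person's identification; the clearance appears in the payload only after the enrichment step, and the derived event ends up as an update to the store.
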
Thus the main issue is not the "versus" issue but the various relationships between the two terms.
