r6 - 27 Oct 2006 - 09:58:16 - RandyLetness

Cosmo 0.6 Sharing

Introduction

One of the most important features in Cosmo 0.6 is support for Chandler’s rich sharing format. In Cosmo 0.5, Chandler accomplishes calendar sharing using CalDAV and WebDAV, by creating separate event (.ics) and application-specific (.xml) resources inside DAV collections. Each pair of resources represents one calendar event item in Chandler. The goal for 0.6 is to support sharing any type of collection, without the need to manage multiple resources per item (.ics and .xml).

EIM

Chandler has the notion of an external information model (EIM), which is an intermediate layer between internal objects and their externalized representations. The idea is that internal objects are first converted to EIM form, and then serialized to an external format. The reverse is that external data is first converted to EIM, and then into internal objects.

EIM can be thought of as a collection of records. Each record has a record type denoted by a unique namespace, and a set of values (fields). Each value or field in the record type has a data type associated with it. The data type is one of a number of primitive data types (string, integer, boolean, etc).
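
As a rough illustration of that structure, here is a minimal Java sketch of a record as a namespace plus an ordered set of typed fields. The names EimRecordSketch, EimRecord, EimField and EimType, the three-value type list, and the example namespace and "starred" field are all assumptions made for the example; they are not taken from Chandler or Cosmo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EimRecordSketch {

    // Hypothetical subset of primitive types; the notes leave the final list open.
    enum EimType { TEXT, INTEGER, BOOLEAN }

    // One typed value inside a record; a null value models "N/A".
    static final class EimField {
        final EimType type;
        final Object value;

        EimField(EimType type, Object value) {
            this.type = type;
            this.value = value;
        }
    }

    // A record: the namespace names the record type, the map holds its fields in order.
    static final class EimRecord {
        final String namespace;
        final Map<String, EimField> fields = new LinkedHashMap<>();

        EimRecord(String namespace) {
            this.namespace = namespace;
        }

        EimRecord with(String name, EimType type, Object value) {
            fields.put(name, new EimField(type, value));
            return this;
        }
    }

    public static void main(String[] args) {
        // Namespace and field names below are made up for the demonstration.
        EimRecord record = new EimRecord("http://example.org/hypothetical/recordtype")
                .with("uuid", EimType.TEXT, "1")
                .with("title", EimType.TEXT, "Example note")
                .with("starred", EimType.BOOLEAN, Boolean.FALSE);
        record.fields.forEach((name, field) ->
                System.out.println(name + " : " + field.type + " = " + field.value));
    }
}
```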

In Chandler, a “Sharing Schema” is responsible for converting internal objects to and from EIM form. A sharing schema will be provided that covers all out-of-box object types, but anyone can define their own sharing schema by writing export/import code to/from EIM. As long as Cosmo provides a way to store/access data in EIM form, then it will be possible to share any data in Chandler with other Chandler users.

For Cosmo 0.6, we have to implement the bi-directional arrow labeled “Morse Code”. “Morse Code” can be thought of as everything involved in getting EIM data to Cosmo and back. Cosmo also needs to continue to support getting and putting data using other protocols such as CalDAV, WebDAV, Atom, etc. For example, if a Chandler client shares a calendar collection, then that collection should be viewable/updatable using CalDAV and vice-versa.

Another goal for 0.6 is for Cosmo to support sharing of new types of data. That is, if someone decides to develop an add-on for Chandler, which uses a new item type, as long as they write schema export/import code in Chandler that converts the new data to EIM, they should be able to share the data using Cosmo. More importantly, new data types defined in Chandler should not require any changes to Cosmo code to enable sharing of that data (Chandler-to-Chandler sharing). The only case where this needs to be done is if you wanted Cosmo to do something intelligent with the new data type, such as providing access to the data using a different protocol.

A Closer Look at EIM

This is an example taken from ExternalInformationModel

Assume we have a Note, 2 Tags, and a Contact.

Note (UUID 1)

  • displayName = "Example note"
  • body = "Example body"
  • createdOn = 2006-07-13 12:26:00-07:00
  • tags = ref collection of (UUID 2, UUID 3)
  • lastModifiedBy = UUID 4

Tag (UUID 2)

  • displayName = "Work"
  • createdOn = 2006-07-13 12:28:00-07:00
  • items = ref collection of (UUID 1)
  • lastModifiedBy = UUID 4

Tag (UUID 3)

  • displayName = "Sharing"
  • createdOn = 2006-07-13 12:29:00-07:00
  • items = ref collection of (UUID 1)
  • lastModifiedBy = UUID 4

Contact (UUID 4)

  • createdOn = 2006-07-13 12:30:00-07:00
  • emailAddress = "morgen@example.com"
  • contactName = UUID 5
  • itemsLastModified = ref collection of (UUID 1, UUID 2, UUID 3, UUID 4, UUID 5)
  • lastModifiedBy = UUID 4

ContactName (UUID 5)

  • createdOn = 2006-07-13 12:31:00-07:00
  • firstName = "Morgen"
  • lastName = "Sagen"
  • lastModifiedBy = UUID 4

Example items in EIM form

In EIM form, these Chandler items are represented by 3 record types, shown here as tables:

Table 1: namespace http://schemas.osafoundation.org/pim/contentitem

# (Item UUID, title, body, createdOn, description, lastModifiedBy UUID)

  • (1, "Example note", "Example body", "2006-07-13 12:26:00-07:00", N/A, 4)
  • (2, "Work", N/A, "2006-07-13 12:28:00-07:00", N/A, 4)
  • (3, "Sharing", N/A, "2006-07-13 12:29:00-07:00", N/A, 4)
  • (4, N/A, N/A, "2006-07-13 12:30:00-07:00", N/A, 4)

Table 2: namespace http://schemas.osafoundation.org/pim/contentitem/tags

# (Item UUID, Tag UUID)

  • (1, 2)
  • (1, 3)

Table 3: namespace http://schemas.osafoundation.org/pim/contact

# (Contact UUID, emailAddress, firstName, lastName)

  • (4, "morgen@example.com", "Morgen", "Sagen")

Notes

  • In Chandler, a content item is a Note by default, which is why there is no "note" recordtype.

Cosmo Data Model

In Cosmo 0.5, everything in Cosmo is an Item. Items have a UUID, a name, a parent, and a set of Attribute objects. Attributes have a name and a value. There are a number of Attribute types including:

Attribute type             Value type
StringAttribute            String
IntegerAttribute           Integer
BooleanAttribute           Boolean
DateAttribute              Date
BinaryAttribute            byte[]
MultiValueStringAttribute  Set
DictionaryAttribute        Map<String, String>

Currently there is no support for namespaces in attributes. We get around this for DAV properties by appending the namespace to the attribute name, but for 0.6 we need to add namespace support.

Next, we need to come up with a set of attribute types. There should be an attribute type that corresponds to each primitive data type in the EIM.

EIM Type                                                             Cosmo Type
text                                                                 StringAttribute
integer                                                              IntegerAttribute
Lob(encoding, mimeType, lob)                                         LobAttribute
DateTime (precision in seconds, offset from GMT, optional timezone)  DateTimeAttribute
bytes                                                                BinaryAttribute

TODO: finalize primitive types and mapping

Cosmo Item Hierarchy

In 0.5 there are content items and collection items. Content items have a piece of content (data) associated with them (which is stored as a blob), along with other properties that describe the data (length, mimetype, encoding, language). Cosmo is able to support WebDAV and CalDAV by storing and retrieving resources using ContentItems (WebDAV) and CalendarEventItems (CalDAV). This works for regular DAV resources and events (.ics resources), but what about Notes, Tags, Contacts, or any other object that Chandler wants to share?

Question: Will Chandler still use .ics as the event format, or will event properties be stored in EIM records? It's easy to create an .ics from EIM (assuming you know about the record types), but the reverse is harder (what do you do with the extra data?).

Where does the EIM fit into the Cosmo data model? Does it fit? EIM is all about records. Cosmo revolves around Items. In EIM, records have any number of primitive values. In Cosmo, Items have any number of primitive attributes. Seems to match up pretty well, right? The one difference is that in EIM, many different records can describe a single item in Chandler’s world. For example, a Contact object in Chandler is decomposed into a content item record and a contact record. And if there are any associations (such as annotations), each association is also a record.

So that means representing every record from the EIM as an item doesn’t make sense. We have to take a set of related records and represent them as an item, and then given an item, be able to decompose it into a set of records. This can be done using Attributes scoped by namespace. Each record has a namespace associated with it, and a set of values. Each value in a record can be represented as an Item Attribute. The Attribute would have a namespace equal to the namespace of the record. The Item can be decomposed to records by grouping all the attributes by namespace.

For example, consider the following EIM records:

namespace http://schemas.osafoundation.org/pim/contentitem

#(Item UUID, title, body, createdOn, description, lastModifiedBy UUID)

(4, N/A, N/A, "2006-07-13 12:30:00-07:00", N/A, 4)

namespace http://schemas.osafoundation.org/pim/contact

#(Contact UUID, emailAddress, firstName, lastName)

(4, "morgen@example.com", "Morgen", "Sagen")

There are two records representing a Contact item in Chandler. In Cosmo these records are represented as Item attributes:

Attribute type     Namespace                                         Attribute name  Value
StringAttribute    http://schemas.osafoundation.org/pim/contentitem  UUID            4
StringAttribute    http://schemas.osafoundation.org/pim/contentitem  title           null
LobAttribute       http://schemas.osafoundation.org/pim/contentitem  body            null
DateTimeAttribute  http://schemas.osafoundation.org/pim/contentitem  createdOn       2006-07-13 12:30:00-07:00
StringAttribute    http://schemas.osafoundation.org/pim/contentitem  description     null
StringAttribute    http://schemas.osafoundation.org/pim/contentitem  lastModifiedBy  4
StringAttribute    http://schemas.osafoundation.org/pim/contact      UUID            4
StringAttribute    http://schemas.osafoundation.org/pim/contact      emailAddress    morgen@example.com
StringAttribute    http://schemas.osafoundation.org/pim/contact      firstName       Morgen
StringAttribute    http://schemas.osafoundation.org/pim/contact      lastName        Sagen
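
The decomposition step (grouping an Item's attributes by namespace to get the records back) could be sketched in Java roughly as follows. The Attr class is a hypothetical stand-in for a namespaced Cosmo Attribute, and all values are kept as strings for brevity; none of this is existing Cosmo code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NamespaceGroupingSketch {

    // Hypothetical stand-in for a namespaced Cosmo Attribute (string values only).
    static final class Attr {
        final String namespace;
        final String name;
        final String value;

        Attr(String namespace, String name, String value) {
            this.namespace = namespace;
            this.name = name;
            this.value = value;
        }
    }

    // Groups an item's attributes by namespace; each group corresponds to one EIM record.
    static Map<String, Map<String, String>> toRecords(List<Attr> attributes) {
        Map<String, Map<String, String>> records = new LinkedHashMap<>();
        for (Attr a : attributes) {
            records.computeIfAbsent(a.namespace, ns -> new LinkedHashMap<>())
                   .put(a.name, a.value);
        }
        return records;
    }

    public static void main(String[] args) {
        String content = "http://schemas.osafoundation.org/pim/contentitem";
        String contact = "http://schemas.osafoundation.org/pim/contact";
        List<Attr> item = new ArrayList<>();
        item.add(new Attr(content, "UUID", "4"));
        item.add(new Attr(content, "createdOn", "2006-07-13 12:30:00-07:00"));
        item.add(new Attr(contact, "UUID", "4"));
        item.add(new Attr(contact, "firstName", "Morgen"));
        // Prints one line per reconstructed record.
        toRecords(item).forEach((ns, fields) -> System.out.println(ns + " -> " + fields));
    }
}
```

Note that this only works while an Item holds at most one record per namespace, which is exactly the limitation discussed for association records.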

Questions:

  1. In Cosmo, an Item has a UUID. In EIM there is really no way to determine what this should be. How do we decide what the UUID is? Can we assume the first field of a record is the item UUID?
  2. There is duplication in the above example. The item’s UUID is stored twice. Is this preventable? If we assume that the first field in all EIM records is the item UUID, then we wouldn’t even have to store that Attribute.

In order to support updating and diffing, we need to be able to determine the identity of an EIM record. One or more fields in the record determine the primary key and there has to be a way to determine this. This is currently under discussion.

Association Records in EIM

Some record types in EIM don’t describe an item but rather an association between items. For example, consider item tags. The association between items and tags is represented as a record containing an item UUID and a tag UUID. There is one such record for every tag associated with an item, so there can be any number of them. How do these records fit into the Cosmo model? We can’t store them as Attributes because there can be multiple records in the same namespace, and this isn’t supported on a single Item, as the Attribute names would collide.

One way to support this is to change the Cosmo Item to have a Set of Attribute Maps, scoped by namespace. A Map of Attributes can be thought of as an EIM record. The namespace is the record type, and each entry in the map corresponds to a record field.

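          import java.util.Map;

          // One EIM record stored on an Item; Attribute is the existing Cosmo class.
          public abstract class AttributeSet {
              String namespace;                   // record type
              Map<String, Attribute> attributes;  // field name -> field value
          }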

The problem with this is that it's no longer easy to get a specific Attribute by name, as you have to navigate through sets of attributes. And what if you have multiple sets with the same namespace?

The object model looks something like:

It turns out some EIM records can't be mapped to items, because they describe associations between items. Take for example a many-to-many relationship. These records can't be stored in the item. Instead, we can store these records in the Cosmo Collection Item. The same goes for any record types that Cosmo doesn't understand. For 0.6 Cosmo will understand Calendar Events and maybe Contacts, but any other record types will be stored in the collection item.
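
A rough Java sketch of that routing, under two assumptions that the notes do not settle: item-describing records carry the item UUID in a field named "uuid", and the server is configured with the set of namespaces it understands as item records. The class names (RecordRoutingSketch, Rec, SharedCollection) are illustrative only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RecordRoutingSketch {

    // Minimal record: a namespace plus string-valued fields.
    static final class Rec {
        final String namespace;
        final Map<String, String> fields;

        Rec(String namespace, Map<String, String> fields) {
            this.namespace = namespace;
            this.fields = fields;
        }
    }

    // Hypothetical collection: per-item attribute groups plus records kept as-is.
    static final class SharedCollection {
        // item UUID -> namespace -> fields
        final Map<String, Map<String, Map<String, String>>> items = new LinkedHashMap<>();
        // association records and record types the server does not understand
        final List<Rec> collectionRecords = new ArrayList<>();
    }

    // Records in a known item namespace (with a "uuid" field, an assumption) go to
    // their item; everything else is stored on the collection untouched.
    static SharedCollection route(List<Rec> records, Set<String> itemNamespaces) {
        SharedCollection collection = new SharedCollection();
        for (Rec r : records) {
            String uuid = r.fields.get("uuid");
            if (itemNamespaces.contains(r.namespace) && uuid != null) {
                collection.items.computeIfAbsent(uuid, k -> new LinkedHashMap<>())
                                .put(r.namespace, r.fields);
            } else {
                collection.collectionRecords.add(r);
            }
        }
        return collection;
    }

    public static void main(String[] args) {
        String content = "http://schemas.osafoundation.org/pim/contentitem";
        String tags = "http://schemas.osafoundation.org/pim/contentitem/tags";
        List<Rec> records = new ArrayList<>();
        records.add(new Rec(content, Map.of("uuid", "1", "title", "Example note")));
        records.add(new Rec(tags, Map.of("item", "1", "target", "2")));
        SharedCollection c = route(records, Set.of(content));
        System.out.println("items: " + c.items.keySet());
        System.out.println("records kept on the collection: " + c.collectionRecords.size());
    }
}
```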

The process of translating EIM records to Cosmo Items looks something like:

EIM and other Protocols

Now that we have a way to store EIM data in Cosmo, we have to figure out how Cosmo uses that data to support other protocols like CalDAV. In order to support CalDAV, Cosmo must know that an item is a calendar resource, and have access to the .ics data, which it uses to index data for time range and other calendar resource queries. This means that Cosmo needs to be aware of calendar resource EIM record types. What happens when a CalDAV client updates a calendar collection with a new event? Cosmo needs to be able to convert that event to EIM to be able to share with Chandler. Cosmo will have to know about the EIM schema to do this. For 0.6, Cosmo will have to provide a way to convert a CalendarItem created with CalDAV/internal APIs into Chandler EIM form.

The process of sharing an event using Chandler and accessing that event using a CalDAV client looks like:

And the reverse:

Notes

  • If the schema for known object types such as calendar events changes, then Cosmo will require changes if Cosmo needs to do anything special with the schema change. For example, if the "description" field is moved to a different recordtype, then Cosmo would require changes to make sure that the "description" field is correctly shared using CalDAV.

EIM and Schema Versions

Chandler supports having multiple sharing schemas. A sharing schema can be thought of as a collection of record types (namespaces) and the import/export logic for those record types. Is the goal for Cosmo to support multiple sharing schemas? If Chandler A shares a collection using version 2 of a sharing schema and Chandler B (which doesn’t support version 2) wants to subscribe, should Cosmo be able to serve the data using another sharing schema? It seems like the sharing schema version needs to be stored somewhere in the Cosmo Item.

Sharing Representation

After Chandler data gets converted to EIM, it needs to be serialized into some external representation and transported to Cosmo. The logical choice is XML.

An example of an XML representation of EIM records found in ExternalInformationModel is:

<records xmlns="http://schemas.osafoundation.org/sharingformat/1"
    xmlns:con="http://schemas.osafoundation.org/pim/contentitem"
    xmlns:tag="http://schemas.osafoundation.org/pim/contentitem/tags"
    xmlns:cta="http://schemas.osafoundation.org/pim/contact" >
    <con:record>
        <uuid>1</uuid> <!-- In this example I used simple uuid values for readability, 
           but real uuids will be in RFC 4122 form, e.g. 611fcf54-296e-11db-b36c-bc8a258a92d5 -->
        <con:title>Example note</con:title>
        <con:body mimetype="text/plain" encoding="utf-8">VGgZ2V2tlbg==</con:body> <!-- Lob -->
        <con:createdOn>2006-08-08 9:50:58.432510 US/Pacific</con:createdOn>
        <con:lastModifiedBy>4</con:lastModifiedBy>
    </con:record>
    <con:record>
        <uuid>2</uuid>
        <con:title>Work</con:title>
        <con:createdOn>2006-08-08 9:50:58.432510 US/Pacific</con:createdOn>
        <con:lastModifiedBy>4</con:lastModifiedBy>
    </con:record>
    <con:record>
        <uuid>3</uuid>
        <con:title>Sharing</con:title>
        <con:createdOn>2006-08-08 9:50:58.432510 US/Pacific</con:createdOn>
        <con:lastModifiedBy>4</con:lastModifiedBy>
    </con:record>
    <con:record>
        <uuid>4</uuid>
        <con:createdOn>2006-08-08 9:50:58.432510 US/Pacific</con:createdOn>
        <con:lastModifiedBy>4</con:lastModifiedBy>
    </con:record>
    <tag:record>
        <tag:item>1</tag:item>
        <tag:target>2</tag:target>
    </tag:record>
    <tag:record>
        <tag:item>1</tag:item>
        <tag:target>3</tag:target>
    </tag:record>
    <cta:record>
        <uuid>4</uuid>
        <cta:emailAddress>morgen@example.com</cta:emailAddress>
        <cta:firstName>Morgen</cta:firstName>
        <cta:lastName>Sagen</cta:lastName>
    </cta:record>
</records>

This makes sense as each EIM record is an XML element. A collection is represented as all the EIM records for all the items in the collection.
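
To illustrate how a server might read such a document back into records, here is a minimal Java sketch using the JDK's namespace-aware DOM parser. It treats every child element named "record" as one record and uses the element's namespace URI as the record type. The Rec class and the shortened SAMPLE document are assumptions for the example, not a proposed Cosmo implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class EimXmlSketch {

    // Minimal record: the element namespace is the record type; values stay as text.
    static final class Rec {
        final String namespace;
        final Map<String, String> fields = new LinkedHashMap<>();

        Rec(String namespace) {
            this.namespace = namespace;
        }
    }

    // Shortened version of the example document, with well-formed <uuid> elements.
    static final String SAMPLE =
            "<records xmlns=\"http://schemas.osafoundation.org/sharingformat/1\""
            + " xmlns:con=\"http://schemas.osafoundation.org/pim/contentitem\""
            + " xmlns:tag=\"http://schemas.osafoundation.org/pim/contentitem/tags\">"
            + "<con:record><uuid>1</uuid><con:title>Example note</con:title></con:record>"
            + "<tag:record><tag:item>1</tag:item><tag:target>2</tag:target></tag:record>"
            + "</records>";

    static List<Rec> parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI/getLocalName
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<Rec> records = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE || !"record".equals(n.getLocalName())) {
                    continue; // skip whitespace, comments and unrelated elements
                }
                Rec rec = new Rec(n.getNamespaceURI());
                for (Node f = n.getFirstChild(); f != null; f = f.getNextSibling()) {
                    if (f.getNodeType() == Node.ELEMENT_NODE) {
                        rec.fields.put(f.getLocalName(), f.getTextContent().trim());
                    }
                }
                records.add(rec);
            }
            return records;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse EIM XML", e);
        }
    }

    public static void main(String[] args) {
        for (Rec r : parse(SAMPLE)) {
            System.out.println(r.namespace + " " + r.fields);
        }
    }
}
```

Every value comes back as text here, which is the typing problem raised in the questions that follow.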

Question: Where is item kind represented in EIM?

It would seem Chandler would have to do some extra work after the data is converted to EIM to group the data together like this. It would have to know something about the data (such as the associations and the type).

Question: Is the goal for sharing to be able to convert Chandler data to EIM, and then share using EIM? If so, I don’t see how grouping EIM records into items like the above example would work with additional item types without custom code after the EIM conversion occurs.

Another question, how is Cosmo going to take something like:

<records xmlns="http://schemas.osafoundation.org/sharingformat/1"
    xmlns:con="http://schemas.osafoundation.org/pim/contentitem"
    xmlns:tag="http://schemas.osafoundation.org/pim/contentitem/tags"
    xmlns:cta="http://schemas.osafoundation.org/pim/contact" >
    <con:record>
        <uuid>1</uuid> <!-- In this example I used simple uuid values for readability, 
           but real uuids will be in RFC 4122 form, e.g. 611fcf54-296e-11db-b36c-bc8a258a92d5 -->
        <con:title>Example note</con:title>
        <con:body mimetype="text/plain" encoding="utf-8">VGgZ2V2tlbg==</con:body> <!-- Lob -->
        <con:createdOn>2006-08-08 9:50:58.432510 US/Pacific</con:createdOn>
        <con:lastModifiedBy>4</con:lastModifiedBy>
    </con:record>
. 
.
.
</records>

and realize that con:title is a String, or Date, or Boolean, etc.? There has to be some kind of mapping from namespace:attribute_name to data type. How is this specified? Is it stored in the DB? In a configuration file? Passed as part of the request? Included in the XML (like <string>val</string>)?

Answer: There is discussion about having meta-data record types, that is, record types that describe other record types, including their field types and primary key. TODO: Figure this out.
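
Whatever form the meta-data ends up taking, the server side boils down to a lookup from (namespace, field name) to a data type. A minimal Java sketch of such a lookup, assuming the declarations have already been loaded from somewhere; the class FieldTypeRegistrySketch, its three-value type list, the fallback to text, and the "count" field are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class FieldTypeRegistrySketch {

    // Hypothetical subset of primitive types.
    enum FieldType { TEXT, INTEGER, BOOLEAN }

    private final Map<String, FieldType> types = new HashMap<>();

    // Combines namespace and field name into one lookup key.
    private static String key(String namespace, String field) {
        return "{" + namespace + "}" + field;
    }

    void declare(String namespace, String field, FieldType type) {
        types.put(key(namespace, field), type);
    }

    // Converts the serialized text of a field into a typed value.
    // Undeclared fields fall back to TEXT (a design choice of this sketch).
    Object convert(String namespace, String field, String raw) {
        FieldType type = types.getOrDefault(key(namespace, field), FieldType.TEXT);
        switch (type) {
            case INTEGER:
                return Long.valueOf(raw.trim());
            case BOOLEAN:
                return Boolean.valueOf(raw.trim());
            default:
                return raw;
        }
    }

    public static void main(String[] args) {
        String con = "http://schemas.osafoundation.org/pim/contentitem";
        FieldTypeRegistrySketch registry = new FieldTypeRegistrySketch();
        registry.declare(con, "title", FieldType.TEXT);
        registry.declare(con, "count", FieldType.INTEGER); // "count" is a made-up field
        Object title = registry.convert(con, "title", "Example note");
        Object count = registry.convert(con, "count", "42");
        System.out.println(title.getClass().getSimpleName() + ", " + count.getClass().getSimpleName());
    }
}
```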

Sharing Protocol

Once we decide on how the data is represented, we can come up with a protocol for sharing with Cosmo. Because Chandler is using WebDAV and CalDAV now, it makes sense to come up with a RESTful protocol using an XML representation of the EIM. The protocol will support publishing/subscribing/updating collections. The protocol will also have to support some type of synchronization, so as to only transmit changed data. In WebDAV and CalDAV, this was accomplished using ETags. The synchronization process in 0.5 is something like:

  1. do a propfind on collection to get etags for all resources
  2. compare list of resources from server to local cache
  3. for any item with a different etag, re-fetch the item
  4. for any item not in cache, fetch the item from the server
  5. for any item in cache that is not in the list from the server, do a put
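
The comparison in steps 2 through 5 can be sketched in Java as follows, assuming the PROPFIND result and the local cache are both available as maps from resource name to ETag. The class and method names are illustrative, and no network calls are made.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EtagSyncSketch {

    // Outcome of comparing the server listing with the local cache.
    static final class Plan {
        final List<String> fetch = new ArrayList<>(); // changed or new on the server (steps 3 and 4)
        final List<String> put = new ArrayList<>();   // only in the local cache (step 5)
    }

    static Plan plan(Map<String, String> serverEtags, Map<String, String> cachedEtags) {
        Plan plan = new Plan();
        for (Map.Entry<String, String> e : serverEtags.entrySet()) {
            String cached = cachedEtags.get(e.getKey());
            // Missing from the cache, or present with a different ETag: fetch it.
            if (cached == null || !cached.equals(e.getValue())) {
                plan.fetch.add(e.getKey());
            }
        }
        for (String name : cachedEtags.keySet()) {
            if (!serverEtags.containsKey(name)) {
                plan.put.add(name);
            }
        }
        // Sort so the result does not depend on map iteration order.
        Collections.sort(plan.fetch);
        Collections.sort(plan.put);
        return plan;
    }

    public static void main(String[] args) {
        Map<String, String> server = new HashMap<>();
        server.put("a.ics", "etag-1");
        server.put("b.ics", "etag-2");
        server.put("c.ics", "etag-3");
        Map<String, String> cache = new HashMap<>();
        cache.put("a.ics", "etag-1");   // unchanged
        cache.put("b.ics", "etag-old"); // changed on the server
        cache.put("d.ics", "etag-4");   // created locally
        Plan p = plan(server, cache);
        System.out.println("fetch " + p.fetch + ", put " + p.put);
    }
}
```

The cost of this approach is that the full ETag listing has to be transferred on every synchronization, even when nothing has changed.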

For 0.6, we should look into providing a synch method similar to the addrbk-synch in CardDAV. That is, provide a synch method that takes in a synchronization token and returns only changes since that token was generated. A new token is returned as part of the response, which the client can use on successive synchs.
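
One simple way to back such a token, sketched in Java below, is a per-collection change counter: every change is stamped with the next sequence number, and the token handed to the client is the highest number it has seen. This is only an assumption about how it could work; it ignores deletions and concurrency, and the names SyncTokenSketch, recordChange and sync are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SyncTokenSketch {

    // What the server returns: changed item UUIDs plus the token for the next call.
    static final class Result {
        final List<String> changedUuids;
        final long newToken;

        Result(List<String> changedUuids, long newToken) {
            this.changedUuids = changedUuids;
            this.newToken = newToken;
        }
    }

    // item UUID -> sequence number of its most recent change
    private final Map<String, Long> lastChange = new HashMap<>();
    private long sequence = 0;

    // Called whenever an item in the collection is created or updated.
    void recordChange(String uuid) {
        lastChange.put(uuid, ++sequence);
    }

    // Returns the items changed after the given token; token 0 means "everything".
    Result sync(long token) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastChange.entrySet()) {
            if (e.getValue() > token) {
                changed.add(e.getKey());
            }
        }
        Collections.sort(changed); // stable order for readability
        return new Result(changed, sequence);
    }

    public static void main(String[] args) {
        SyncTokenSketch server = new SyncTokenSketch();
        server.recordChange("1");
        server.recordChange("2");
        Result first = server.sync(0);
        server.recordChange("2");
        server.recordChange("3");
        Result second = server.sync(first.newToken);
        System.out.println("first: " + first.changedUuids + ", second: " + second.changedUuids);
    }
}
```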

-- RandyLetness - 23 Oct 2006

Topic attachments
Attachment                Size    Date                 Who           Comment
CosmoSharingBasics.jpeg   18.1 K  23 Oct 2006 - 10:25  RandyLetness  Cosmo sharing basics
ItemHierarchy.jpeg        16.8 K  23 Oct 2006 - 10:46  RandyLetness  Item Hierarchy
BasicCosmoSharing.jpeg    24.3 K  25 Oct 2006 - 10:43  RandyLetness  basic Cosmo sharing diagram 2
EIMToCosmoItem.jpeg       41.1 K  25 Oct 2006 - 13:48  RandyLetness  diagram of how EIM records are stored in cosmo
CalDAVToEIM.jpeg          36.7 K  25 Oct 2006 - 13:50  RandyLetness  flow of how CalDAV write gets to Chandler
ChandlerToCalDAV.jpeg     46.0 K  25 Oct 2006 - 13:50  RandyLetness  flow of how Chandler shared item gets to CalDAV client
AttributeDiagram.gif      7.9 K   26 Oct 2006 - 09:40  RandyLetness  Attribute UML diagram
BasicCosmoSharing-2.jpeg  36.4 K  27 Oct 2006 - 09:25  RandyLetness  basic Cosmo sharing diagram 2
ChandlerToCalDAV-2.jpeg   53.7 K  27 Oct 2006 - 09:26  RandyLetness  flow of how Chandler shared item gets to CalDAV client
EIMToCosmoItem-2.jpeg     57.3 K  27 Oct 2006 - 09:27  RandyLetness  diagram of how EIM records are stored in cosmo