Conor's Blog
What is a Caregraf?
A Caregraf is
a patient’s structured data published according to Linked-Data principles
Yes, it’s a mouthful, but if you’re reading this, then you understand structured patient data – the likes of problem lists, prescriptions, procedures, consultations ordered and their results. You’re aware that machine-processable descriptions of a patient’s health start off in EHRs and all too often remain locked there. But what’s Linked-Data?
Let’s start with something we all know, the Web of Documents. In this web, information is marked up as HTML pages, addressed with URIs (http://www.google.com) and transmitted from server to browser over HTTP. Right now, you’re in the Web of Documents, reading an HTML-marked-up document.
Linked-Data, also called the Web of Data, refers to a set of best practices for publishing and connecting structured data on the Web. The same people who conceived the web of documents took its best features (identify with URIs, transmit with HTTP) but for HTML, which is made for document markup, they substituted a way to specify structured data called RDF. In RDF, data comes out of its silos and can become a native citizen of the web.
Here’s a piece of patient data: there once was an exam …
Dr Fred Jones performed a chest X-Ray on Joe Smith on the 2nd January, 2011
It’s a bunch of information but it all reduces to a set of very simple statements or triples, each with a subject, a predicate (or verb) and an object …
The exam patient was Joe Smith
The exam procedure was a Chest X-Ray
The exam was performed by Dr Fred Jones
The exam was performed on 2nd January, 2011
You can keep forgetting your grammar class: there are no subordinate clauses, cases or moods with triples. This is a land of “Jane saw Mary” but it supplies all the nuance we need: all statements – medical or otherwise – can be reduced to one or more triples.
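To make the subject, predicate, object idea concrete, here is a minimal Python sketch of the four exam statements as plain tuples. The `Triple` type and the `describe` helper are names made up for this illustration; they are not part of RDF or of any Caregraf software.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One simple statement: subject, predicate (verb), object."""
    subject: str
    predicate: str
    obj: str


# The exam sentence, broken into its four simple statements.
exam_triples = [
    Triple("the exam", "patient", "Joe Smith"),
    Triple("the exam", "procedure", "Chest X-Ray"),
    Triple("the exam", "performed by", "Dr Fred Jones"),
    Triple("the exam", "performed on", "2011-01-02"),
]


def describe(subject, triples):
    """Gather everything said about one subject as predicate -> object."""
    return {t.predicate: t.obj for t in triples if t.subject == subject}


exam = describe("the exam", exam_triples)
print(exam["performed by"])
```

Nothing here is web-aware yet: the subjects and verbs are just strings, which is exactly the gap RDF fills.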
Now for RDF. RDF lets us mark up triples using the web’s tried and true: just as every document in the web of documents gets a URI, so does every subject, verb and object in Linked-Data. Every time we want to refer to Dr Jones, let’s use http://hospital.caregraf.com/personnel/10, for this exam, http://hospital.caregraf.com/data/9999, for the verb, performed by, http://hospital.caregraf.com/schema/performer, so that …
<http://hospital.caregraf.com/data/9999>
<http://hospital.caregraf.com/schema/performer> <http://hospital.caregraf.com/personnel/10> .
says “The exam was performed by Dr Fred Jones”. Using a shorthand for predicates, all four triples are expressed with …
<http://hospital.caregraf.com/data/9999>
cghs:performer <http://hospital.caregraf.com/personnel/10> ;
cghs:patient <http://hospital.caregraf.com/patient/100> ;
cghs:procedure <http://datasets.caregraf.org/cpt/71010> ;
cghs:when "2011-01-02" .
And we can go on making statements, but this time let’s reuse widely-adopted verbs: sharing terms makes data easier to exchange. foaf:name is a widely-used verb for declaring a person’s name and rdfs:label is the standard way to label.
<http://hospital.caregraf.com/personnel/10> foaf:name "Fred Jones" .
<http://hospital.caregraf.com/patient/100> foaf:name "Joe Smith" .
<http://datasets.caregraf.org/cpt/71010> rdfs:label "CHEST X-RAY" .
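The shorthand is only an abbreviation: each prefixed verb stands for a full URI. The Python sketch below expands prefixes and writes statements out one per line, fully spelled out. The `cghs` namespace is inferred from the performer URI used earlier in this post, the `foaf` and `rdfs` namespaces are the published ones, and `expand` and `to_ntriple` are hypothetical helper names. Literal escaping is deliberately left out, so treat it as an illustration rather than a real serializer.

```python
PREFIXES = {
    # Assumption: inferred from http://hospital.caregraf.com/schema/performer
    "cghs": "http://hospital.caregraf.com/schema/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(term):
    """Turn 'prefix:local' into a full URI; leave full URIs untouched."""
    if term.startswith(("http://", "https://")):
        return term
    prefix, local = term.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriple(subject, predicate, obj, literal=False):
    """Write one statement with every URI spelled out (no escaping of literals)."""
    obj_text = '"%s"' % obj if literal else "<%s>" % expand(obj)
    return "<%s> <%s> %s ." % (expand(subject), expand(predicate), obj_text)


EXAM = "http://hospital.caregraf.com/data/9999"
DOCTOR = "http://hospital.caregraf.com/personnel/10"

lines = [
    to_ntriple(EXAM, "cghs:performer", DOCTOR),
    to_ntriple(DOCTOR, "foaf:name", "Fred Jones", literal=True),
]
print("\n".join(lines))
```

In practice you would hand this job to an RDF library; the point is only that a prefix is string substitution and nothing more.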
Now while that’s it for RDF’ing our patient data, we’re not quite finished. Linked-Data isn’t any old RDF: every URI in that RDF must resolve as readily and in the same way as the oh-so-clicked http://www.google.com. Linked-Data is resolvable RDF on the web and, before you ask, yes, every URI in our RDF resolves: click on http://hospital.caregraf.com/data/9999, http://datasets.caregraf.org/cpt/71010 …
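“Resolves” means a program can do what your browser does: issue an HTTP GET on the URI and get a description back. Here is a hedged sketch using only Python’s standard library. Asking for `text/turtle` through the Accept header is an assumption about what the server offers, and `rdf_request` and `fetch_rdf` are names invented for this example.

```python
from urllib.request import Request, urlopen


def rdf_request(uri, media_type="text/turtle"):
    """Build a GET request that asks for an RDF serialization of the resource.

    Assumption: the server honours this Accept header; it may well return
    HTML or another RDF format instead.
    """
    return Request(uri, headers={"Accept": media_type})


def fetch_rdf(uri, media_type="text/turtle", timeout=10):
    """Dereference the URI and return the response body as text."""
    with urlopen(rdf_request(uri, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; call fetch_rdf() to send it.
request = rdf_request("http://hospital.caregraf.com/data/9999")
print(request.get_method(), request.full_url)
```

The same call works for any of the URIs above, which is the whole point: one addressing scheme and one transport for documents and data alike.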
“Web publish patient data! Madness!”, you cry? But remember, the web has tiers of privacy. It hosts the very public, like a bank’s contact information, along with the secured, like the particulars of your bank account.
Health-data has this split too: there’s public information like ICD-9 terminology definitions or a doctor’s publicly available state registration and then there’s the stuff of Caregrafs, patient-specific information that must be secured. This same breakdown is reflected in web-based Personal Health Record (PHR) systems: yes, a patient must log in to see the specifics of their record and yet that record links out to the public web for general information.
Let me restate my definition of a Caregraf to emphasize security…
a patient’s structured data published, securely, according to Linked-Data principles
I’m going to end with this claim: a Caregraf is
the best medium for representing, securely exchanging and interpreting patient data.
and I’ll explain why in coming posts.
