Caregraf

Link Health-Data Now

Conor's Blog

What is a Caregraf?

A Caregraf is

a patient’s structured data published according to Linked-Data principles

Yes, it’s a mouthful, but if you’re reading this, then you understand structured patient data – the likes of problem lists, prescriptions, procedures, consultations ordered and their results. You’re aware that machine-processable descriptions of a patient’s health start off in EHRs and all too often remain locked there. But what’s Linked-Data?

Let’s start with something we all know, the Web of Documents. In this web, information is marked up as HTML pages, addressed with URIs (such as http://www.google.com) and transmitted from server to browser over HTTP. Right now, you’re in the Web of Documents, reading an HTML-marked-up document.

Linked-Data, also called the Web of Data, refers to a set of best practices for publishing and connecting structured data on the Web. The same people who conceived the Web of Documents kept its best features (identify with URIs, transmit with HTTP) but replaced HTML, which is made for document markup, with RDF, a way to specify structured data. With RDF, data comes out of its silos and can become a native citizen of the web.

Here’s a piece of patient data: there once was an exam …

Dr Fred Jones performed a chest X-Ray on Joe Smith on the 2nd January, 2011

It’s a bunch of information but it all reduces to a set of very simple statements or triples, each with a subject, a predicate (or verb) and an object …

The exam patient was Joe Smith
The exam procedure was a chest X-Ray
The exam was performed by Dr Fred Jones
The exam was performed on 2nd January, 2011

You can keep forgetting your grammar class: there are no subordinate clauses, cases or moods with triples. This is a land of Jane saw Mary but it supplies all the nuance we need: all statements – medical or otherwise – can be reduced to one or more triples.
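
To make the idea concrete, here is a minimal sketch of the four statements as plain (subject, predicate, object) tuples. The `Triple` type and the `objects_of` helper are names I made up for illustration; nothing here is RDF yet, just the shape of the data.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate (verb), object."""
    subject: str
    predicate: str
    obj: str


# The four statements about the exam, as plain strings for now.
EXAM = "the exam"
triples = [
    Triple(EXAM, "patient", "Joe Smith"),
    Triple(EXAM, "procedure", "chest X-Ray"),
    Triple(EXAM, "performed by", "Dr Fred Jones"),
    Triple(EXAM, "performed on", "2011-01-02"),
]


def objects_of(statements, subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [t.obj for t in statements
            if t.subject == subject and t.predicate == predicate]


print(objects_of(triples, EXAM, "performed by"))  # ['Dr Fred Jones']
```

Asking a question of the data is just a filter over the list, which is why such a flat representation is easy to merge and exchange.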

Now for RDF. RDF lets us mark up triples using the web’s tried-and-true machinery: just as every document in the Web of Documents gets a URI, so does every subject, verb and object in Linked-Data. Every time we want to refer to Dr Jones, let’s use http://hospital.caregraf.com/personnel/10; for this exam, http://hospital.caregraf.com/data/9999; and for the verb, performed by, http://hospital.caregraf.com/schema/performer, so that …

<http://hospital.caregraf.com/data/9999> 
      <http://hospital.caregraf.com/schema/performer> <http://hospital.caregraf.com/personnel/10>

says The exam was performed by Dr Fred Jones. Using a shorthand for predicates, all four triples are expressed with …

@prefix cghs: <http://hospital.caregraf.com/schema/> .

<http://hospital.caregraf.com/data/9999>
    cghs:performer <http://hospital.caregraf.com/personnel/10> ;
    cghs:patient <http://hospital.caregraf.com/patient/100> ;
    cghs:procedure <http://datasets.caregraf.org/cpt/71010> ;
    cghs:when "2011-01-02" .
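
The shorthand is only a convenience: each prefixed verb stands for a full URI. Here is a rough sketch of that expansion, written out as one triple per line (the N-Triples style). I am assuming the cghs prefix stands for http://hospital.caregraf.com/schema/, which is what the full performer URI above suggests; `expand` and `to_ntriples` are illustrative helpers, not part of any RDF library, and literal escaping is deliberately left out.

```python
# Assumption: cghs: abbreviates the schema namespace seen in the full
# performer URI above.
PREFIXES = {"cghs": "http://hospital.caregraf.com/schema/"}

EXAM = "http://hospital.caregraf.com/data/9999"


def expand(term, prefixes=PREFIXES):
    """Turn 'cghs:performer' into a full URI; leave full URIs alone."""
    if term.startswith(("http://", "https://")):
        return term
    prefix, _, local = term.partition(":")
    return prefixes[prefix] + local


def to_ntriples(subject, predicate, obj, obj_is_literal=False):
    """Format one triple as a line; no escaping of quotes in literals."""
    o = f'"{obj}"' if obj_is_literal else f"<{expand(obj)}>"
    return f"<{expand(subject)}> <{expand(predicate)}> {o} ."


lines = [
    to_ntriples(EXAM, "cghs:performer", "http://hospital.caregraf.com/personnel/10"),
    to_ntriples(EXAM, "cghs:patient", "http://hospital.caregraf.com/patient/100"),
    to_ntriples(EXAM, "cghs:procedure", "http://datasets.caregraf.org/cpt/71010"),
    to_ntriples(EXAM, "cghs:when", "2011-01-02", obj_is_literal=True),
]
print("\n".join(lines))
```

In practice an RDF toolkit would do this parsing and serializing for you; the point is only that nothing is hidden behind the shorthand.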

And we can go on making statements but this time let’s reuse widely-adopted verbs: sharing terms makes data easier to exchange. foaf:name is a widely-used verb for declaring a person’s name and rdfs:label is the standard way to label.

<http://hospital.caregraf.com/personnel/10> foaf:name "Fred Jones" . 
<http://hospital.caregraf.com/patient/100> foaf:name "Joe Smith" .
<http://datasets.caregraf.org/cpt/71010> rdfs:label "CHEST X-RAY".
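
With names and labels in place, the original sentence can be rebuilt by following links from the exam. The sketch below holds all seven triples in a plain Python list and walks them; `value` and `describe_exam` are hypothetical helpers, the cghs namespace is assumed to be http://hospital.caregraf.com/schema/, and the foaf and rdfs URIs are the standard ones for those vocabularies.

```python
CGHS = "http://hospital.caregraf.com/schema/"  # assumed namespace for cghs:
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

EXAM = "http://hospital.caregraf.com/data/9999"
DOCTOR = "http://hospital.caregraf.com/personnel/10"
PATIENT = "http://hospital.caregraf.com/patient/100"
XRAY = "http://datasets.caregraf.org/cpt/71010"

graph = [
    (EXAM, CGHS + "performer", DOCTOR),
    (EXAM, CGHS + "patient", PATIENT),
    (EXAM, CGHS + "procedure", XRAY),
    (EXAM, CGHS + "when", "2011-01-02"),
    (DOCTOR, FOAF_NAME, "Fred Jones"),
    (PATIENT, FOAF_NAME, "Joe Smith"),
    (XRAY, RDFS_LABEL, "CHEST X-RAY"),
]


def value(triples, subject, predicate):
    """Return the first object for subject + predicate, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def describe_exam(triples, exam):
    """Follow links from the exam to names and labels, then make a sentence."""
    doctor = value(triples, value(triples, exam, CGHS + "performer"), FOAF_NAME)
    patient = value(triples, value(triples, exam, CGHS + "patient"), FOAF_NAME)
    procedure = value(triples, value(triples, exam, CGHS + "procedure"), RDFS_LABEL)
    when = value(triples, exam, CGHS + "when")
    return f"{doctor} performed {procedure} on {patient} on {when}"


print(describe_exam(graph, EXAM))
# Fred Jones performed CHEST X-RAY on Joe Smith on 2011-01-02
```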

Now while that’s it for RDF’ing our patient data, we’re not quite finished. Linked-Data isn’t any old RDF: every URI in that RDF must resolve as readily and in the same way as the oh-so-clicked http://www.google.com. Linked-Data is resolvable RDF on the web and, before you ask, yes, every URI in our RDF resolves: click on http://hospital.caregraf.com/data/9999 or http://datasets.caregraf.org/cpt/71010.
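
A program resolves a URI the same way a browser does, with an HTTP GET; the usual Linked-Data convention is to send an Accept header asking for an RDF format rather than HTML. Here is a minimal sketch using only Python’s standard library. That these particular servers honor `text/turtle`, or still respond at all, is an assumption on my part, and `build_rdf_request` and `fetch_description` are illustrative names.

```python
import urllib.request


def build_rdf_request(uri, accept="text/turtle"):
    """Prepare a GET for a URI, asking for RDF instead of an HTML page."""
    return urllib.request.Request(uri, headers={"Accept": accept})


def fetch_description(uri, timeout=10):
    """Dereference the URI and return the response body as text.

    Raises urllib.error.URLError if the host cannot be reached.
    """
    request = build_rdf_request(uri)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request touches no network; only fetch_description does.
request = build_rdf_request("http://hospital.caregraf.com/data/9999")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

The same exam URI serves people and programs alike; only the requested format differs.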

Publish patient data on the web? Madness, you cry! But remember, the web has tiers of privacy. It hosts the very public, like a bank’s contact information, along with the secured, like the particulars of your bank account.

Health-data has this split too: there’s public information like ICD-9 terminology definitions or a doctor’s publicly available state registration, and then there’s the stuff of Caregrafs, patient-specific information that must be secured. This same breakdown is reflected in web-based Personal Health Record (PHR) systems: yes, a patient must log in to see the specifics of their record and yet that record links out to the public web for general information.

Let me restate my definition of a Caregraf to emphasize security…

a patient’s structured data published, securely, according to Linked-Data principles

I’m going to end with this claim: a Caregraf is

the best medium for representing, securely exchanging and interpreting patient data.

and I’ll explain why in coming posts.


Conor At Caregraf

About this Blog

It's a ramble around patient-data representation and analysis, following the mantra: work through examples, don't just talk principles. Here are the topics ...

About Conor

Conor Dowling is the CTO of Caregraf.

What strikes me is the match: on one side is Linked-Data, the most powerful way to exchange diverse data, and, waiting on the other, too-long unattended, is the volume and diversity of health-data. All they need is a push.