HIPAA and Time-based Data
At the risk of glossing over details, it is fair to say that, given current HIPAA regulatory restrictions and the criminal penalties surrounding the release of protected health information (PHI), Date and DateTime fields within a database need to be hidden from users without PHI privileges. The rationale behind HIPAA's date and datetime restrictions is that someone who knows a date and very little else can pinpoint the identity of a patient within your medical study with relative ease. Dates of birth are the most obviously useful piece of information. Anyone calling into a health clinic these days is asked for their name and DOB (and little else). If that person passes that simple telephone authentication exercise, they have access to all kinds of personal health information to which they should have no access whatsoever. Rightfully, HIPAA requirements cast dates of birth strictly into the PHI category. For slightly more subtle reasons, HIPAA places most other time-related data into this category as well. Even an appointment date at a clinic can, with a little nefarious detective work, lead to personal health information.
So, if dates are PHI, how do you enable analysis of time-related medical information when that data cannot be seen by the members of a study team who are chartered with statistical analysis? Do you simply open patient dates up to a wider group of study team members? If you do, what legal risks are you exposing your study team, your institution and yourself to over the entire duration of the study? HIPAA penalties are significant, and not just for institutions; individuals can be criminally prosecuted under these statutes as well.
How can statistical analysis be done if access to dates is so narrowly restricted?
Let's first look at what is important about dates in a medical/clinical context. The fact that the onset of a symptom or a baseline examination occurs on a particular date is not as significant as one might, at first, believe. What is often more significant is the duration between two dates: the distance in time between entry into an ER and the onset of secondary symptoms, or the age of a patient at the onset of a particular illness rather than that patient's birth date. The medical significance of a thirty-two-year-old contracting breast cancer is higher than for an eighty-six-year-old, but their actual dates of birth are relatively insignificant. It is the age of the patient at onset that can be significant to analysis.
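To make the distinction concrete, here is a minimal Python sketch of the two derived values mentioned above: age at onset and the number of days between two events. The function names are hypothetical and chosen only for illustration; they are not part of any particular EDC product.

```python
from datetime import date


def age_in_years(birth: date, event: date) -> int:
    """Whole years elapsed between a birth date and a later event date."""
    years = event.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the event year.
    if (event.month, event.day) < (birth.month, birth.day):
        years -= 1
    return years


def days_between(start: date, end: date) -> int:
    """Number of days from start to end (negative if end precedes start)."""
    return (end - start).days
```

An analyst who sees only the values these functions return learns the patient's age at onset or the length of an interval, but not the underlying calendar dates.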
What impact does this have on the kinds of data stored within a database? Or more particularly - what can the presentation layer (EDC forms, reports, data extracts and the like) do to make access to time-related PHI appropriate for subsets of users within a study team?
The first step is to identify PHI-related time-based information within your study's database. One obvious way to do this would be to cast a wide net over all fields defined within your database with the data type of DATE or DATETIME, but that may not gather a long enough list of PHI candidates, because dates and datetime values are sometimes stored within character fields that the system sees simply as data type CHARACTER or TEXT. It therefore helps to be able to assess PHI risk automatically by also looking at the actual field names within the database's data dictionary or similar metadata repository (see above screenshot), searching for words such as 'date', 'time', or 'DOB' and marking the matching fields automatically as PHI data. A capability built into the application development layer, available during the creation and maintenance of your study's database, that makes it easy to set such institutional standards turns this otherwise onerous process into a nearly automatic one, which is of obvious benefit. Unfortunately, off-the-shelf systems built for general database storage and maintenance do not offer such capabilities.
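The screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data dictionary is represented as a list of (field name, data type) pairs, and the set of date types and the name pattern are hypothetical defaults that an institution would tune to its own standards.

```python
import re

# Assumed set of native date/time types; adjust to the database in use.
DATE_TYPES = {"DATE", "DATETIME", "TIMESTAMP"}

# Words from the text above that hint at time-based content in a field name.
NAME_HINTS = re.compile(r"date|time|dob", re.IGNORECASE)


def is_phi_candidate(field_name: str, data_type: str) -> bool:
    """Flag a field if its type is a date type or its name hints at a date."""
    return data_type.upper() in DATE_TYPES or bool(NAME_HINTS.search(field_name))


def flag_phi_candidates(dictionary):
    """Return the names of all fields that should be reviewed as PHI."""
    return [name for name, dtype in dictionary if is_phi_candidate(name, dtype)]
```

Plain substring matching deliberately over-flags (a field named "update_count" would match on "date"), so the result is best treated as a list of candidates for human review rather than a final classification.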
The next step is to have mechanisms built into the application toolset maintaining your study's data that restrict access only to pre-authorized users. From EDC forms to data extract mechanisms, to graphic reports - all outwardly facing interfaces to the system must be able to shield PHI from prying eyes.
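One way to picture such a mechanism is a single filtering function that every outward-facing interface calls before showing a record. The Python sketch below assumes records are plain dictionaries and that the caller already knows whether the user holds PHI privileges; the function and parameter names are hypothetical.

```python
from typing import Any, Dict, Set


def visible_record(
    record: Dict[str, Any], phi_fields: Set[str], has_phi_privilege: bool
) -> Dict[str, Any]:
    """Return a copy of the record with PHI fields removed for unprivileged users."""
    if has_phi_privilege:
        return dict(record)
    return {name: value for name, value in record.items() if name not in phi_fields}
```

Routing forms, extracts and reports through the same function means a field marked as PHI is hidden everywhere at once, rather than interface by interface.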
The final step is to be able to represent PHI in a non-PHI fashion to users who do not have PHI privileges. One conceptually simple way to do this is to define calculated fields that stand "next" to your PHI time-related fields and distill what is analytically relevant to the study, leaving the PHI data hidden, encapsulated beneath and within the calculation. These calculated fields must be flexible enough to allow arbitrarily complex mathematical calculation, database lookup, traversal and analysis during the calculation, and to allow customized presentation of the result to meet a wide variety of potential study requirements. Finally, these fields must be defined and maintained in a central data dictionary so that all elements responsible for data access, maintenance and presentation comply with institutional/study-wide standards.
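As a rough illustration of calculated fields kept in a central dictionary, the Python sketch below derives an age and an interval from hidden date fields. Everything named here (CalculatedField, DATA_DICTIONARY, the field names) is a hypothetical stand-in, not the design of any specific system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class CalculatedField:
    name: str
    compute: Callable[[Dict[str, Any]], Any]


def age_at_onset(rec: Dict[str, Any]) -> int:
    """Whole years between birth_date and onset_date."""
    birth, onset = rec["birth_date"], rec["onset_date"]
    before_birthday = (onset.month, onset.day) < (birth.month, birth.day)
    return onset.year - birth.year - int(before_birthday)


# Central definition: which fields are PHI and which calculations stand next to them.
DATA_DICTIONARY = {
    "phi_fields": {"birth_date", "er_admit_date", "onset_date"},
    "calculated": [
        CalculatedField("age_at_onset_years", age_at_onset),
        CalculatedField(
            "days_er_to_onset",
            lambda r: (r["onset_date"] - r["er_admit_date"]).days,
        ),
    ],
}


def non_phi_view(rec: Dict[str, Any], dictionary=DATA_DICTIONARY) -> Dict[str, Any]:
    """Drop PHI fields and add the calculated fields derived from them."""
    view = {k: v for k, v in rec.items() if k not in dictionary["phi_fields"]}
    for field in dictionary["calculated"]:
        view[field.name] = field.compute(rec)
    return view
```

Because both the PHI markings and the calculations live in one dictionary, a form, a report and a data extract that all call non_phi_view present the same derived values and hide the same dates.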
Bill Hedge, QuesGen Systems, Inc.