I have read with great interest an article posted on ClinPage – The Future of ODM, SDTM and CDISC. The discussion relates primarily to the proposed requirement from the FDA for data submissions to be made in XML format rather than SAS Transport file format. I don’t think we will see many arguments around this point – XML is now the accepted extensible method of describing combined data and metadata. What is more contentious is the request that data be provided in the HL7 v3 Message format. FDA Docket No. FDA-2008-N-0428 from August 2008 elaborates on where the FDA is in the process.
In addition to the move to an HL7 Message format rather than SAS XPT, there is commentary suggesting that a move to ODM rather than SDTM might be considered. This point is also put forward by Jozef Aerts of xml4pharma.
I would like to comment on a comparison of SDTM versus ODM.
Operational Data Model
ODM was the first CDISC standard to successfully go through the authoring process. It was designed as a means to represent data in the context of data capture. Data is indexed to Visits and Forms. The syntax was designed to describe data not as an effective storage format, but as a source-to-destination transfer format: you can get data from System A by Visit and Form to System B by Visit and Form. This is great where the presentation of the data has importance and meaning.
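To make the Visit and Form indexing concrete, here is a minimal sketch that walks a simplified ODM-style fragment and lists each value with the visit and form it was captured on. The element names follow my understanding of ODM's ClinicalData hierarchy, but the fragment leaves out the namespace and most attributes a real file would carry, and every OID and value in it is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified ODM-style fragment: no namespace, invented OIDs and values.
ODM_FRAGMENT = """
<ODM>
  <ClinicalData StudyOID="ST.DEMO">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SE.SCREENING">
        <FormData FormOID="F.DEMOG">
          <ItemGroupData ItemGroupOID="IG.DEMOG">
            <ItemData ItemOID="I.SEX" Value="F"/>
            <ItemData ItemOID="I.BRTHDT" Value="1970-04-12"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
      <StudyEventData StudyEventOID="SE.WEEK1">
        <FormData FormOID="F.VITALS">
          <ItemGroupData ItemGroupOID="IG.VITALS">
            <ItemData ItemOID="I.SYSBP" Value="120"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""


def flatten_odm(xml_text):
    """Return one (subject, visit, form, item, value) tuple per ItemData."""
    root = ET.fromstring(xml_text)
    rows = []
    for subject in root.iter("SubjectData"):
        for event in subject.iter("StudyEventData"):
            for form in event.iter("FormData"):
                for item in form.iter("ItemData"):
                    rows.append((
                        subject.get("SubjectKey"),
                        event.get("StudyEventOID"),
                        form.get("FormOID"),
                        item.get("ItemOID"),
                        item.get("Value"),
                    ))
    return rows


rows = flatten_odm(ODM_FRAGMENT)
for row in rows:
    print(row)
```

Every value carries its visit and form with it, which is exactly what you want when moving data between two capture systems, and rather more than you need when all you care about is the value itself.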
Study Data Tabulation Model
SDTM, unlike ODM, groups data not by CRF form but by the use of the data. All Demographics information for a subject appears on the same record, for example. The SDTM structure has now also become the basis for data delivery and storage within many organizations. A number of large PharmaBio companies have based internal cross-company standards on SDTM.
Modelling from Data Captured
The format of data will differ depending on the medium used to capture the data. Some form factors might have 30 questions on a form; others, such as Patient Diaries, might only have 1 or 2 questions per form. In addition, when designing a CRF for ease of use, it may not make sense to apply the content of each SDTM domain as the basis for deciding what does and does not go onto a single form. Whether the data appeared on one form or across many forms is not important when it comes to the value of the data. Many EDC vendors have gone down the route of designing the database for data capture according to EAV (Entity-Attribute-Value) rules, where each value captured on any form is dropped into a single table. Once captured, data is then re-modelled into a relational structure that may or may not mirror the layout of the page. (XForms is a generic technology touted as being a potential means of addressing this challenge – I will leave further discussion on this to a later article.)
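As a rough illustration of that re-modelling step, the sketch below collapses EAV rows into one record per subject. The row layout, form names and values are all hypothetical; a real EDC database would have its own schema and would need to handle repeating data, which this ignores.

```python
from collections import defaultdict

# Hypothetical EAV rows as an EDC system might store them:
# (subject, visit, form, attribute, value)
EAV_ROWS = [
    ("001", "SCREENING", "DEMOG_PAGE_1", "SEX", "F"),
    ("001", "SCREENING", "DEMOG_PAGE_2", "BRTHDTC", "1970-04-12"),
    ("002", "SCREENING", "DEMOG_PAGE_1", "SEX", "M"),
    ("002", "SCREENING", "DEMOG_PAGE_2", "BRTHDTC", "1965-11-30"),
]


def pivot_by_subject(eav_rows):
    """Collapse EAV rows into one record per subject, ignoring the form."""
    records = defaultdict(dict)
    for subject, _visit, _form, attribute, value in eav_rows:
        records[subject][attribute] = value
    return dict(records)


demographics = pivot_by_subject(EAV_ROWS)
print(demographics["001"])
```

Note that the two demographics questions sat on different pages, yet the resulting record is the same as if they had shared one. The page layout simply drops out.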
Based on the above, it would seem logical that SDTM is of greater value when used as the method of delivery of data for submission or analysis than ODM.
However, that is not the only reason why SDTM makes sense over ODM when developing and executing eClinical studies. The primary reason relates to metadata re-use.
ODM is not a suitable format for modelling studies because it does not lend itself to ensuring that similar studies are able to effectively re-use metadata. Sure – I can take a study, copy the metadata, and I have another study… easy… but what about changes? What if I remove a few fields, add a few fields, change the visit structure? That will of course change the format of the data outputs if ODM is the format (an issue in itself, see above) but, more importantly, it will greatly impact any rules that might exist on the forms. Rules that use some form of wildcarding mechanism may or may not work. Anyway, this is not a posting on metadata architecture, so I will leave it at that.
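For anyone who wants the rule problem spelled out, here is a small sketch. The path syntax, the rule target and the field names are all invented for illustration and do not come from any particular EDC product; the point is only that a rule addressed by visit and form stops matching once a field moves, whereas one addressed by variable name does not.

```python
from fnmatch import fnmatchcase

# Hypothetical edit check, addressed by visit/form/field path.
RULE_TARGET = "VISIT*/VITALS/SYSBP"

# Field paths in the original study, and in a copy where the blood
# pressure question has moved to a different form.
STUDY_A_FIELDS = ["VISIT1/VITALS/SYSBP", "VISIT2/VITALS/SYSBP"]
STUDY_B_FIELDS = ["VISIT1/PHYSEXAM/SYSBP", "VISIT2/PHYSEXAM/SYSBP"]


def fields_covered(rule_target, field_paths):
    """Return the field paths that the wildcard rule target matches."""
    return [path for path in field_paths if fnmatchcase(path, rule_target)]


def fields_covered_by_name(variable, field_paths):
    """Alternative: match on the variable name only, ignoring layout."""
    return [path for path in field_paths if path.rsplit("/", 1)[-1] == variable]


print(fields_covered(RULE_TARGET, STUDY_A_FIELDS))      # rule applies
print(fields_covered(RULE_TARGET, STUDY_B_FIELDS))      # rule silently applies to nothing
print(fields_covered_by_name("SYSBP", STUDY_B_FIELDS))  # layout-independent match
```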
Bringing together SDTM and HL7 v3
So back to SDTM and HL7. Is this the right way to go? I can understand the logic behind this. Being able to bring EHR and Clinical Trial data together within a common standard could be very useful. However, at what cost?
I am not aware of any eClinical application that automatically creates SDTM-compliant data sets, regardless of transport layer. The mapping of proprietary metadata to SDTM is quite involved, with varying degrees of software development required from the various system vendors. Typically, either SAS macro transformations are used, or some form of ETL (Extract, Transform and Load) tool. This is all complicated enough. Creating a tool that creates SDTM datasets in HL7 v3 is considerably more complicated. Even for large companies it will be a major development undertaking. The complexity is such that smaller companies will simply fail to deliver the data in a cost-effective way.
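To give a flavour of what even the simplest such mapping involves, here is a minimal sketch of a transform step for a Demographics record. The source field names, the value recoding and the way USUBJID is built are assumptions made for the example; the target variable names (STUDYID, DOMAIN, USUBJID, SUBJID, SEX, BRTHDTC) are standard SDTM DM variables. A real mapping would also have to deal with controlled terminology, dates, derived variables and many more domains.

```python
# Hypothetical mapping from proprietary EDC field names to SDTM DM variables.
DM_FIELD_MAP = {
    "subj_no": "SUBJID",
    "gender": "SEX",
    "dob": "BRTHDTC",
}

# Hypothetical recoding of source values into the submitted terms.
SEX_CODES = {"Female": "F", "Male": "M"}


def to_dm_record(study_id, source_row):
    """Build one Demographics (DM) style record from a proprietary row."""
    record = {"STUDYID": study_id, "DOMAIN": "DM"}
    for source_name, sdtm_name in DM_FIELD_MAP.items():
        record[sdtm_name] = source_row.get(source_name, "")
    # Recode the sex value if we know it; otherwise pass it through unchanged.
    record["SEX"] = SEX_CODES.get(record["SEX"], record["SEX"])
    # Joining study and subject number is one simple convention for a
    # unique subject identifier, assumed here for illustration only.
    record["USUBJID"] = f"{study_id}-{record['SUBJID']}"
    return record


dm = to_dm_record(
    "DEMO01",
    {"subj_no": "001", "gender": "Female", "dob": "1970-04-12"},
)
print(dm)
```

Three fields, one recode and one derived identifier already need a mapping table and a function. Multiply that across a full study and then wrap the result in an HL7 v3 message, and the scale of the undertaking becomes clearer.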
Tool providers may step in: they may offer a means to convert a basic SDTM ASCII file, with additional information, into an SDTM HL7 v3 file. XML4Pharma, judging by their recent critique of the approach, do not appear keen to jump into supporting this but, if this becomes a mandate, some companies will.
Playing the other side of the argument – one of the principles of XML is that the data is also human readable. In reality, once you add all of the ‘overhead’, especially with a complicated syntax such as HL7 v3, you end up with something that is only readable by technical gurus. But then, maybe it shouldn’t be people who interpret these files; maybe the complexity has got to the point where it only makes sense for a computer application to interpret the files and then present the appropriate information to the user. Modern eClinical systems offer views on data. Maybe the presentation of the Submission data should be managed in the same way, through an application that presents a view based on purpose.