Sunday, October 12, 2008

Give ’em what they want

I’ve worked in life science informatics for many years across a variety of companies. In the life sciences, things at the bench can change very quickly. This often means that existing systems and applications are unsuitable for capturing the data from the latest experiments or techniques. It has always been a goal of mine to provide truly flexible systems, but I have to admit that I have never succeeded. Whether it is a rigid database or UI, the process workflow, or the fact that the data is completely different from anything we have seen before, things just don’t seem to work out. Even when we try to provide a specific system for the scientists, it takes time to implement, may be out of date before it is delivered, and is often rejected by the users.

However, having said all of that, the scientists always seem to manage on their own and are often more content with their own solutions than with those provided by the techies. This is largely down to Microsoft Excel, which has established itself as the firm favourite for data capture amongst the scientists. It doesn’t seem to matter to them that everyone is doing their own thing, using different formats and storing the files all over the network. They just seem to get by. I think there is a very good reason for this: Excel does everything they need it to do, and it does not restrict them to a particular workflow or constrain them to a predefined UI.

However, mention the words Microsoft Excel to informatics people and you will get a very different response, usually consisting of expletives. For informatics, Excel is definitely not considered a data capture environment, and it is a pain to integrate, especially when the same data is captured in many formats. If and when informatics come to add this data to newly devised systems, there is often a huge overhead in reformatting potentially hundreds of files into a standard that can be imported. It’s partly our own fault, as we should have worked with the scientists to define templates in the first place, at least as an interim solution, but for one reason or another, and I can think of many, that doesn’t happen.

I have seen how the Semantic Web aids with search and query and how it provides a flexible UI experience. However, I have struggled to see how it could help with a flexible UI for data entry, especially one that requires little training or understanding on the user’s part. That was until recently, when I saw a demo by Lee Feigenbaum of Cambridge Semantics of their Anzo for Excel product. It was really cool: simply map the data in your spreadsheet to an ontology and, as if by magic, it’s in the database. Not only that, another user could connect to the database from Excel and format the same data in a completely different way for their own purposes. Even more impressive was that a web page could be generated from the data. Best of all, the data in the spreadsheets and the web page was live, and the database could be updated through either.
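To make the mapping idea a little more concrete, here is a rough sketch in Python of what "map the columns to an ontology" could look like in principle. I should stress that this is not how Anzo works internally, which I have not seen. The column headers, the `ex:` predicate names and the `rows_to_triples` helper are all made up for illustration, and the "spreadsheet" is just a few lines of CSV text.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column headers to ontology predicates.
# Both the headers and the "ex:" names are invented for this illustration.
COLUMN_TO_PREDICATE = {
    "Sample": "ex:sampleId",
    "Conc (uM)": "ex:concentrationMicromolar",
    "Response": "ex:response",
}


def rows_to_triples(csv_text, mapping, subject_prefix="ex:row"):
    """Turn each spreadsheet row into (subject, predicate, value) triples."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader, start=1):
        subject = f"{subject_prefix}{index}"
        for column, predicate in mapping.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty or missing cells
                triples.append((subject, predicate, value))
    return triples


# A tiny stand-in for a scientist's spreadsheet, exported as CSV.
sheet = "Sample,Conc (uM),Response\nS1,10,0.82\nS2,,0.40\n"
triples = rows_to_triples(sheet, COLUMN_TO_PREDICATE)
for triple in triples:
    print(triple)
```

The point of the sketch is that the scientist’s layout stays exactly as it is; only the mapping needs to know which column means what. A second spreadsheet with different headers would just need its own mapping onto the same predicates.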

It is easy to see how a product like Anzo could revolutionise data capture using simple tools like Excel. After all, it’s already on everyone’s desktop, no one needs training to use it, and our scientists have already shown us that it can work for them. The only issue might be getting our informatics people to embrace Excel as a key tool in our box rather than dismissing it out of prior prejudice.
