Thursday, April 17, 2008

Data Warehouse / BI Security

Peter O'Donnell and I are currently supervising an honours student, Justin, who is looking at the issue of data warehouse security, with a view to surveying DW security practices in Australian companies. It's still early days, but one of the things that Justin has found is that there is very little literature (academic or otherwise) on the issue, either highlighting problems or outlining best practice. This is both good and bad news: it means that Justin will be making a real contribution, but he's going to have trouble writing the literature review section of his thesis!

To give you some idea of where our thinking is at, here's a generic architecture for the flow of information through a data warehouse:

Each component of the diagram above is a potential security problem. The ETL process alone, for example, involves massive amounts of data moving around a network, taken out of what is presumably an initially secure environment. We've found very little that talks about securing the individual components of the architecture, or about taking a holistic view and securing the whole process end-to-end. On the flip side, security often creates problems of its own for functionality and performance: what can we do to make the whole thing as responsive and functional as possible while still protecting an important organisational asset?

Any thoughts, war stories, pointers to resources or comments would be appreciated!
