by Anthony Dina

Advancements in healthcare seem to be occurring faster than at any other time in history, from the way we treat diseases such as cancer and diabetes to the development of new medications. Big Data plays a crucial role in these developments.

According to Health Affairs, a California Healthcare Foundation publication, the mining of Big Data is poised to help healthcare institutions improve patient care and outcomes, while reducing costs. (You can read the article in its entirety here.)  But healthcare institutions must educate and protect themselves to avoid security risks due to their use of Big Data solutions.  

There are options to help limit security risks when using Big Data solutions. For example, Intel has been a major contributor to integrating enterprise-class security features into Hadoop through Project Rhino, an open-source effort to improve the data protection capabilities of the Hadoop ecosystem.

Cloudera has also made significant progress in security with Project Sentry. Apache Sentry is a unified authorization mechanism that lets customers store sensitive data in Hadoop. It is a fully integrated component of CDH and provides fine-grained authorization and role-based access control, all through a single system.

Combined with Dell servers and networking, these security features enable customers to build end-to-end, enterprise-ready, secure Hadoop solutions.

In this posting, we'll look at other steps that can help to limit potential risks.

Have Vigilant, Repeatable Big Data Governance Processes in Place 

Protected health information (PHI) is covered by the Health Insurance Portability and Accountability Act (HIPAA) for very important reasons. A breach in security can lead to identity theft, insurance fraud, medication theft, and even discrimination.

Having vigilant and repeatable governance processes in place for accessing PHI Big Data is imperative to protecting that information. Limiting access and creating specific, uniform procedures for gaining access are key. Yet far too many organizations allow individual groups to circumvent set procedures, placing the entire organization at risk of a security breach. (You can read more about Big Data governance programs here.)

As we mentioned earlier, one tool that can help with governance is Cloudera CDH with Sentry. It enables the granularity required to secure access to data within the Hadoop cluster for the majority of SQL, BI, and search tools and use cases - all through a single system.

Sentry includes:

  • Improved Regulatory Compliance – Customers can utilize Hadoop while aligning with regulatory mandates like HIPAA, SOX, and PCI
  • Role-Based Administration – Administrators can meet key role-based access control (RBAC) requirements and define what users can do with data within a server, database, table, view, or search index
  • Data Classification – Content producers and owners can intersperse sensitive data with non-sensitive data in the same data set through fine-grained controls
  • Expanded User Base – Operations staff can extend the power of Hadoop to more users through a central administration group - allowing different departments to access different
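
To make the role-based model above more concrete, here is a minimal Python sketch of the general idea: privileges are granted to roles, roles are assigned to groups, and a request is allowed only if one of the user's groups holds a role that covers it. This is not Sentry's API or policy syntax. The class names, role names, group names, and tables (such as `staffing_analyst` and `shift_schedule`) are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Privilege:
    """One permitted action on a database, optionally narrowed to a table."""
    action: str        # e.g. "SELECT", "INSERT", or "ALL"
    database: str
    table: str = "*"   # "*" means every table in the database

    def covers(self, action, database, table):
        return (
            self.action in (action, "ALL")
            and self.database == database
            and self.table in ("*", table)
        )


@dataclass
class Policy:
    """Maps groups to roles and roles to privileges (hypothetical model)."""
    role_privileges: dict = field(default_factory=dict)  # role -> set of Privilege
    group_roles: dict = field(default_factory=dict)      # group -> set of role names

    def grant(self, role, privilege):
        self.role_privileges.setdefault(role, set()).add(privilege)

    def assign(self, group, role):
        self.group_roles.setdefault(group, set()).add(role)

    def is_allowed(self, groups, action, database, table):
        # Deny by default; allow only if some role held by some group covers the request.
        for group in groups:
            for role in self.group_roles.get(group, ()):
                for privilege in self.role_privileges.get(role, ()):
                    if privilege.covers(action, database, table):
                        return True
        return False


# Hypothetical setup: two departments with different access.
policy = Policy()
policy.grant("staffing_analyst", Privilege("SELECT", "hospital", "shift_schedule"))
policy.grant("billing_admin", Privilege("ALL", "billing"))
policy.assign("operations", "staffing_analyst")
policy.assign("finance", "billing_admin")
```

With this setup, a member of the hypothetical `operations` group can read `hospital.shift_schedule` but is denied access to any other table, while `finance` has full access to the `billing` database only. The deny-by-default check is the important design choice: access exists only where a role explicitly grants it.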

Make the Information Anonymous When Possible

Depending on the type of data being mined, removing any identifying information greatly reduces the risk of PHI being compromised. For example, hospitals can access their data to positively impact everything from staffing needs to asset allocation without requiring any identifying personal information.   
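
As a rough illustration of what removing identifying information can look like in practice, the Python sketch below drops direct identifiers from a record, replaces the patient ID with a keyed hash so records can still be linked without exposing the original ID, and reduces the birth date to a year. The field names, the sample record, and the key are hypothetical, and this sketch alone should not be assumed to satisfy HIPAA's de-identification requirements.

```python
import hashlib
import hmac

# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "ssn", "address", "phone"}


def deidentify(record, secret_key):
    """Return a copy of the record with identifying fields removed or generalized."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Replace the patient ID with a keyed hash (HMAC-SHA256) so the same
    # patient maps to the same token without revealing the original ID.
    patient_id = cleaned.pop("patient_id", None)
    if patient_id is not None:
        digest = hmac.new(secret_key, str(patient_id).encode("utf-8"), hashlib.sha256)
        cleaned["patient_token"] = digest.hexdigest()

    # Generalize the birth date (assumed to be "YYYY-MM-DD") to the year only.
    birth_date = cleaned.pop("birth_date", None)
    if birth_date is not None:
        cleaned["birth_year"] = int(str(birth_date)[:4])

    return cleaned


# Hypothetical sample record and key, for illustration only.
sample_record = {
    "patient_id": 10482,
    "name": "Jane Example",
    "ssn": "000-00-0000",
    "birth_date": "1957-03-14",
    "department": "cardiology",
    "length_of_stay_days": 4,
}
sample_key = b"replace-with-a-securely-stored-key"
safe_record = deidentify(sample_record, sample_key)
```

The resulting record keeps the fields useful for questions like staffing or asset allocation (department, length of stay) while the name, Social Security number, exact birth date, and raw patient ID are gone. The key must be stored securely and separately from the data, since anyone holding it could recompute the tokens from known patient IDs.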

Get an Annual Check Up 

Conduct regular risk assessments and audits. They help ensure:

  • Set processes are being followed
  • No breach, even a minor one, has occurred
  • Any potential risks are identified

Big Data offers a treasure trove of possibilities for the healthcare industry. Keeping the mined information secure helps ensure those possibilities can become reality.