BioBox Data Security and Compliance

This document summarizes the data security and compliance policies at BioBox.

  1. Access Controls
    1. RBAC and ABAC on the BioBox Platform

      On the BioBox Platform, users have a set of permissions defined by static roles within an Organization. More specific attribute policies, applied within a principle-of-least-privilege framework, can be attached to individual resources in the application to allow or deny specific actions by specific users.

      Within an Organization, roles are ordered in a hierarchy: Member < Owner < Admin. Each role may promote other members up to their own role. The role of a user determines whether they can view, create, or edit resources within the Organization.
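
      The ordering and promotion rule above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the names Role and can_promote are hypothetical, and the sketch assumes that a promotion must raise the target's role and may not exceed the promoting user's own role.

```python
from enum import IntEnum


class Role(IntEnum):
    # Ordering follows the hierarchy stated above: Member < Owner < Admin.
    MEMBER = 1
    OWNER = 2
    ADMIN = 3


def can_promote(actor: Role, target_current: Role, target_new: Role) -> bool:
    """Return True if `actor` may move a user from `target_current` to `target_new`.

    Assumption (not stated in the source): a promotion must raise the
    target's role, and the new role may not exceed the actor's own role.
    """
    return target_current < target_new <= actor
```

      Under these assumptions an Owner can promote a Member to Owner but not to Admin, and a Member cannot promote anyone.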

      Beyond the organizational roles of Admin, Owner, and Member, member-specific access controls can be implemented by an Admin or Member within the Library System. Individual members can be assigned read or edit privileges on Entities, within a principle-of-least-privilege framework. This permits workflows in which administrators have read/write permissions on specific pieces of metadata while other users cannot tell whether that metadata exists at all.
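
      A deny-by-default check of this kind might look like the sketch below. The Entity fields, the privilege strings "read" and "edit", and the function names are assumptions made for illustration; the source does not describe the actual data model or policy engine.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Entity:
    display_name: str
    # Hypothetical grant table: user ID -> privileges held on this Entity.
    grants: Dict[str, Set[str]] = field(default_factory=dict)


def is_allowed(entity: Entity, user_id: str, action: str) -> bool:
    """Deny by default: an action is allowed only if it was explicitly granted."""
    return action in entity.grants.get(user_id, set())


def visible_entities(entities: List[Entity], user_id: str) -> List[Entity]:
    """List only Entities the user may read, so ungranted ones stay undiscoverable."""
    return [e for e in entities if is_allowed(e, user_id, "read")]
```

      Filtering the listing, rather than returning an "access denied" entry, is what keeps a user from learning that a piece of metadata exists.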

      For more detailed information on RBAC and ABAC on the BioBox Platform, see Library Model and Entity Security.
    2. Access Control Implementation and Infrastructure

      Access control infrastructure to support the RBAC and ABAC systems as described above is implemented via a policy execution framework which runs behind the internal network boundary. These policies are thoroughly tested for all cases upon each release of the BioBox Platform.
    3. User Authentication

      The BioBox Platform does not manage user identities; instead, it uses a third-party identity provider. The identity provider has ISO 27001, SOC 2 Type II, ISO 27018, and HIPAA BAA compliance. The identity provider is responsible for storing the email and password pair of each user and for hosting the OAuth 2.0 login infrastructure. The BioBox Platform stores a copy of the user's email address and of the opaque user ID issued by the identity provider. The BioBox Platform DOES NOT store the user password. The password policy requires at least 8 characters, including at least one lowercase letter, one uppercase letter, one number, and one special character.
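
      The password rule can be expressed as a short check. The real enforcement happens at the identity provider, so this is only an illustration; the function name is hypothetical, and treating "special characters" as ASCII punctuation is an assumption.

```python
import string

MIN_LENGTH = 8  # minimum length stated in the policy above


def meets_password_policy(password: str) -> bool:
    """Check a candidate password against the stated policy.

    Assumption: "special characters" means ASCII punctuation; the identity
    provider's exact character set is not given in the source.
    """
    return (
        len(password) >= MIN_LENGTH
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )
```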

      A user session lasts a maximum of 24 hours, after which the user is required to re-enter their credentials with the identity provider. The user can voluntarily log out of their session at any time.

      The credentials used to maintain a user session are browser cookies that are not accessible to JavaScript and that contain opaque tokens. Login is performed via an OAuth 2.0 flow. The OIDC token returned from this flow, a JWT, is stored only on the backend within our internal network perimeter. Each subsequent request is authenticated by associating the opaque cookie token with an active session. The JWT does not leave the backend.
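
      The session scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the platform's code: the in-memory store, the cookie name "session", and the Secure and SameSite attributes are hypothetical. Only the HttpOnly attribute (the standard way to hide a cookie from JavaScript), the opaque token, and the 24-hour limit come from the text.

```python
import secrets
import time
from typing import Dict, Optional, Tuple

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24-hour maximum stated above

# Hypothetical in-memory store: opaque token -> (user ID, expiry timestamp).
# The JWT from the OIDC flow would be kept server-side next to this entry.
_sessions: Dict[str, Tuple[str, float]] = {}


def create_session(user_id: str, now: Optional[float] = None) -> str:
    """Create a session and return the opaque token to place in the cookie."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # random, carries no user information
    _sessions[token] = (user_id, now + SESSION_TTL_SECONDS)
    return token


def authenticate(token: str, now: Optional[float] = None) -> Optional[str]:
    """Map a cookie token to a user ID, or None if unknown or expired."""
    now = time.time() if now is None else now
    entry = _sessions.get(token)
    if entry is None:
        return None
    user_id, expires_at = entry
    if now >= expires_at:
        del _sessions[token]  # expired: force a new login
        return None
    return user_id


def cookie_header(token: str) -> str:
    """Build a Set-Cookie value; HttpOnly hides the cookie from JavaScript."""
    return f"session={token}; HttpOnly; Secure; SameSite=Lax; Path=/"
```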
  2. Audit and Logging
    1. Log Storage and Retrieval

      Application logs for every API request are stored on our infrastructure in Google Cloud Platform in the northamerica-northeast1 region. Application logs are stripped of user-identifying information, including the JWT retrieved during the authentication process. Logs from our database, API gateway software, and other supporting platform software are collected as well. Access controls implemented with IAM limit internal access to logs. Application and system logs are stored indefinitely.
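
      One way to strip identifying fields before a record is written is sketched below. The field names in SENSITIVE_FIELDS are hypothetical, since the source does not list which fields are removed, and the function name is an assumption.

```python
from typing import Any, Dict

# Hypothetical field names; the source does not enumerate the removed fields.
SENSITIVE_FIELDS = {"email", "user_id", "authorization", "cookie", "jwt"}


def scrub(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a log record with sensitive keys removed, recursively."""
    clean: Dict[str, Any] = {}
    for key, value in record.items():
        if key.lower() in SENSITIVE_FIELDS:
            continue  # drop the field entirely instead of masking it
        clean[key] = scrub(value) if isinstance(value, dict) else value
    return clean
```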
  3. Data
    1. Physical Location of Data Storage

      When users create an Organization, a Google Cloud Storage bucket is provisioned in either the northamerica-northeast1 (Montréal) or the europe-west2 (London) region, depending on the location they associate with the Organization. User-initiated workflows run on ephemeral compute infrastructure on Google Cloud Platform in the northamerica-northeast1 region.

      Our database and infrastructure for the BioBox Platform run on Google Cloud Platform in the northamerica-northeast1 region.
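
      The placement rules in this section can be summarized as a small lookup. The location keys "montreal" and "london" and the function name are hypothetical; only the region identifiers come from the text.

```python
# Region identifiers are from the text; the location keys are assumptions.
BUCKET_REGIONS = {
    "montreal": "northamerica-northeast1",
    "london": "europe-west2",
}

# Database, platform infrastructure, and workflow compute all run here.
PLATFORM_REGION = "northamerica-northeast1"


def bucket_region_for(location: str) -> str:
    """Return the storage region for an Organization's chosen location."""
    try:
        return BUCKET_REGIONS[location.lower()]
    except KeyError:
        raise ValueError(f"unsupported Organization location: {location!r}") from None
```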
    2. Transfer of Data from Google Cloud Platform to the User

      All AJAX requests made by the BioBox Platform MUST be authenticated via the browser cookie and JWT as described above. The only exceptions are static site content, such as CSS and fonts, and content that the user retrieves directly from Google Cloud Storage buckets, described below.

      The BioBox Platform makes API requests on behalf of the user to obtain signed URLs for downloading content from Google Cloud Storage. The credentials used to create signed URLs live only within the backend inside our network perimeter. These signed URLs are valid only for a limited time.
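
      The idea behind an expiring signed URL can be shown with a generic HMAC sketch. This is not Google Cloud Storage's signing scheme and not the platform's code; the function names, query parameter names, and example URL are hypothetical. It only illustrates why a holder of the URL can download without credentials while the signing secret never leaves the backend.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def _sign(base_url: str, expires: int, secret: bytes) -> str:
    # The signature covers both the resource and its expiry time.
    payload = f"{base_url}\n{expires}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def sign_url(base_url: str, secret: bytes, ttl_seconds: int,
             now: Optional[float] = None) -> str:
    """Return `base_url` with an expiry and a signature appended."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(base_url, expires, secret)})
    return f"{base_url}?{query}"


def verify_url(url: str, secret: bytes, now: Optional[float] = None) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    now = time.time() if now is None else now
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    expected = _sign(base_url, expires, secret)
    return hmac.compare_digest(signature, expected) and now < expires
```

      Because the expiry is part of the signed payload, extending the lifetime by editing the query string invalidates the signature.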
    3. Transfer of Data from Google Cloud Platform to the User’s Local System

      Content requested by the BioBox Platform on behalf of the user is not transferred to the user’s local system; it lives only within the JavaScript heap memory of the browser session. The single exception is when a user requests to download a file, which is retrieved from Google Cloud Storage using temporary signed URLs as described above.
    4. Backup and Restore Procedures, Disaster Recovery

      Nightly backups are performed against our database. The restore process requires an operator with sufficient credentials to trigger an automated process, which requires approval from another privileged operator. Backup files are kept within Google Cloud Storage in the northamerica-northeast1 region.
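
      The two-operator rule for restores can be sketched as a small approval gate. The class and field names are hypothetical, and the check that both operators are sufficiently privileged is omitted; only the requirement that the approver differ from the requester comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestoreRequest:
    backup_name: str
    requested_by: str
    approved_by: Optional[str] = None

    def approve(self, operator: str) -> None:
        """Record approval; the approver must be a different operator."""
        if operator == self.requested_by:
            raise PermissionError("approver must differ from the requester")
        self.approved_by = operator

    @property
    def may_run(self) -> bool:
        """The automated restore may start only after a second operator approves."""
        return self.approved_by is not None
```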

      In the event of a disaster, an internal document formalizes the process to restore the database and alert affected parties within 24 hours.
    5. Data Retention Policy

      Once a user initiates the deletion of an Organization, all Organization data will be deleted, including the Google Cloud Storage bucket associated with that Organization.

      When a user requests their account to be deleted, it will be removed from the BioBox Platform as well as the identity provider.

    6. Data Export

      On client request, all client data can be exported within 2 weeks.
  4. Encryption
    1. Data Encryption At Rest
      1. Object Storage

        Users may upload files into a Google Cloud Storage bucket located in either northamerica-northeast1 or europe-west2, as described above. Pipelines run on the platform may produce additional files, which are stored within this bucket. Every Organization has its own bucket. The buckets are encrypted at rest with AES-256 encryption. Data encryption keys are managed by Google Cloud Platform.
      2. Database Storage

        The persistent disk backing our database is encrypted at rest with AES-256 encryption. Data encryption keys are managed by Google Cloud Platform.
    2. Data Encryption in Transit
      1. External

        All network requests made by the BioBox Platform over the public internet are encrypted with TLS. A minimum version of TLS 1.2 is enforced for clients. TLS certificates are obtained through the Let's Encrypt service.
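
        For illustration, a TLS 1.2 floor can be set in a few lines with Python's standard ssl module. The source does not say where TLS is terminated or how it is configured, so this is an assumption-laden sketch and not the platform's setup; the function name is hypothetical, and loading the certificate with load_cert_chain is left out.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context that refuses anything below TLS 1.2."""
    # Purpose.CLIENT_AUTH yields a context suitable for server-side sockets.
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # A real server would also call context.load_cert_chain(...) here.
    return context
```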
  5. Glossary
BioBox Platform

The web application accessible via https://biobox.app.

Organization

A group of users on the BioBox Platform with one or more Admins, any number of Owners, and any number of Members.

Role

A tiered list of identifying strings which map to a set of permissions.

Admin

An organization-level Role.

Owner

An organization-level Role.

Member

An organization-level Role.

Library System

The portion of the BioBox Platform where custom metadata field Properties can be assigned to Entities.

Entity

A collection of Properties and a display name.

RBAC

Role-Based Access Control

ABAC

Attribute-Based Access Control

IAM

Identity and Access Management

OIDC

OpenID Connect

AJAX

Asynchronous JavaScript and XML