Cytobank has released version 5.5.0 with enhancements to our API, enabling more flexible and functional workflows that leverage Cytobank’s secure infrastructure and cloud-based compute and storage. Among the enhancements are new API endpoints for viSNE, CITRUS, SPADE, sharing, Sample Tags, and compensation.
At Cytobank, we’ve seen emerging needs among scientists and research organizations the world over that are driving the development of our API. These needs often demand functionality beyond what basic browser-based analysis sessions provide. Common themes include connecting the Cytobank platform directly to other information systems, batch processing and chaining of native functionality, and pulling and pushing data, configurations, statistics, and attachments to and from Cytobank to support external pipelines, algorithms, and studies.
In this article, we present a variety of workflows highlighting how the Cytobank API can increase the efficiency and velocity of research efforts. Illustrated workflows include:
- Pulling clinical data and programmatically applying Sample Tags in Cytobank, as well as batching numerous CITRUS runs based on these data
- Automatically applying bubbles to SPADE trees
- Broad-scope analysis across different analyzed datasets to gather information for comparing analysis methods and training a cell identity classifier
- Running PCA on data in Cytobank
Workflow 1: Fetch Clinical Research Data, Apply Sample Tags, and Batch CITRUS Runs
Researchers are often interested in the association between single cell analysis results stored in Cytobank and multiple clinical variables, including those that weren’t the primary endpoints of their study. Key information on these clinical variables is usually stored separately from the single cell analysis results in Cytobank, and the cross-referencing of this information can be burdensome. This burden increases when consolidating analytical results and findings in other information systems for compliance.
Even once the clinical data and specimen data have been combined, the analysis itself presents another set of challenges. In a dataset involving many participants with a variety of clinical variables and outcomes, combing through the data to find signatures correlated with each clinical variable can be tedious and time-consuming.
Using the Cytobank API, you can transfer structured clinical information to Cytobank as Sample Tags to provide context for analysis. You can then automatically batch numerous CITRUS analyses that evaluate each combination of clinically related sample groupings and quickly surface any significant correlations within the data.
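The batching step can be sketched in a few lines of Python. This is a minimal illustration and not Cytobank's implementation: the record layout, the tag names, the sample values, and the function names are all assumptions. The calls that would upload Sample Tags and submit each CITRUS run through the API are deliberately left out.

```python
from itertools import combinations


def group_samples_by_tag(samples, variable):
    """Map each value of one clinical variable to the sample IDs that carry it."""
    groups = {}
    for sample in samples:
        value = sample["tags"].get(variable)
        if value is not None:  # skip samples with no value recorded
            groups.setdefault(value, []).append(sample["id"])
    return groups


def plan_citrus_runs(samples, variables, min_group_size=2):
    """Build one run description per pair of groups within each variable."""
    runs = []
    for variable in variables:
        groups = group_samples_by_tag(samples, variable)
        # Drop groups that are too small to compare meaningfully.
        usable = {v: ids for v, ids in groups.items() if len(ids) >= min_group_size}
        for a, b in combinations(sorted(usable), 2):
            runs.append({
                "name": f"CITRUS {variable}: {a} vs {b}",
                "groups": {a: usable[a], b: usable[b]},
            })
    return runs


# Made-up records standing in for clinical data pulled from another system.
samples = [
    {"id": 1, "tags": {"response": "responder", "stage": "I"}},
    {"id": 2, "tags": {"response": "responder", "stage": "I"}},
    {"id": 3, "tags": {"response": "responder", "stage": "II"}},
    {"id": 4, "tags": {"response": "non-responder", "stage": "II"}},
    {"id": 5, "tags": {"response": "non-responder", "stage": "III"}},
    {"id": 6, "tags": {"response": "non-responder"}},
]

runs = plan_citrus_runs(samples, ["response", "stage"])
for run in runs:
    print(run["name"], run["groups"])
```

Each entry in `runs` names one comparison and lists the samples in each group, which is the information a script would need before asking the API to create and start the corresponding CITRUS analysis.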
Workflow 2: Automatically Bubble a SPADE Tree
SPADE is a useful way to analyze and visualize data. One drawback of the method, however, is that each SPADE tree must be categorized into biological populations from scratch for every run of the algorithm, in a process termed “bubbling.” This creates a bottleneck for analysis. Automating the categorization of SPADE clusters into phenotypic groups would greatly accelerate the interpretation of SPADE results.
How best to categorize clustered or single cell data into discrete populations is an oft-debated topic. However, for users who have defined their own criteria for assigning cellular identity, Cytobank offers an API endpoint for setting node-to-bubble relationships. A script can readily be written to read the lightweight cluster metadata for a SPADE run, establish a biological identity for each cluster, and then post these relationships to Cytobank as bubbles. Researchers can then interact with the categorizations and further explore the dataset without the time burden of bubbling.
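As a rough sketch of the middle step, the following Python assigns each SPADE node to a bubble from its median marker values. The marker names, the cutoff, the rule table, and the function names are hypothetical stand-ins for a lab's own criteria. Reading the cluster metadata and posting the result to the node-to-bubble endpoint are not shown.

```python
# Hypothetical signature rules: True means the marker must be at or above
# the cutoff, False means it must be below it. Rules are tried in order.
RULES = [
    ("CD4 T cells", {"CD3": True, "CD4": True}),
    ("CD8 T cells", {"CD3": True, "CD8": True}),
    ("B cells", {"CD3": False, "CD19": True}),
]
CUTOFF = 1.0  # assumed threshold on transformed median intensity


def assign_population(medians, rules=RULES, cutoff=CUTOFF):
    """Return the first population whose signature matches the node's medians."""
    for population, signature in rules:
        if all((medians.get(marker, 0.0) >= cutoff) == expected
               for marker, expected in signature.items()):
            return population
    return "Unassigned"


def build_bubbles(clusters, rules=RULES, cutoff=CUTOFF):
    """Group node IDs by assigned population: {bubble name: [node IDs]}."""
    bubbles = {}
    for node_id, medians in clusters.items():
        name = assign_population(medians, rules, cutoff)
        bubbles.setdefault(name, []).append(node_id)
    return bubbles


# Made-up cluster metadata: node ID -> median value per marker.
clusters = {
    1: {"CD3": 2.5, "CD4": 1.8, "CD8": 0.1, "CD19": 0.0},
    2: {"CD3": 2.1, "CD4": 0.2, "CD8": 2.0, "CD19": 0.0},
    3: {"CD3": 0.1, "CD4": 0.0, "CD8": 0.0, "CD19": 3.0},
    4: {"CD3": 0.2, "CD4": 0.1, "CD8": 0.1, "CD19": 0.1},
}

bubbles = build_bubbles(clusters)
print(bubbles)
```

Keeping an explicit "Unassigned" bubble makes it easy to see which nodes the rules did not cover, so a researcher can review those by hand.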
Workflow 3: Meta-Analysis of Analyzed Datasets
As the number of analyses executed by the many researchers using Cytobank grows over time, a rich repository of information and knowledge accumulates. Using search tools on Cytobank, years of existing research can be polled for data relevant to current questions, be it for a particular biomarker, disease area, therapeutic compound, or any scientific variable. With operations via the Cytobank API, more sophisticated analyses that combine data from multiple experiments can be orchestrated to ask deeper questions and extract value from large swaths of the centralized, structured data on Cytobank.
One example of a meta-analysis you might want to perform using the Cytobank API would be to train a classifier that can assign population identities to single cell data. The large number of datasets that have already been analyzed and labeled with biological context by human experts on Cytobank can be used as training and validation sets. Regardless of the method used to categorize cells into populations (sequential gating, SPADE, viSNE, CITRUS, etc.), an identical core set of statistical metadata to describe these populations can be obtained.
Using the API, all of this summary classification data from a variety of experiments can be extracted simply into a standard format. This information can then be combined and mined for patterns that inform the automatic classification of future single cell data, controlling for the categorization method that was used. Alternatively, it can be used to make comparisons between categorization methods.
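To make the idea concrete, here is a minimal sketch of one very simple classifier, a nearest-centroid model, trained on population summaries pooled from several experiments. The row format, field names, marker names, and values are assumptions for illustration. A real effort would likely use a richer model and would account for the categorization method recorded in each row.

```python
import math
from collections import defaultdict


def train_centroids(rows, markers):
    """Average the marker medians of all rows that share a population label."""
    sums = defaultdict(lambda: [0.0] * len(markers))
    counts = defaultdict(int)
    for row in rows:
        label = row["population"]
        for i, marker in enumerate(markers):
            sums[label][i] += row["medians"][marker]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(medians, centroids, markers):
    """Return the label whose centroid is closest in Euclidean distance."""
    point = [medians[m] for m in markers]
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


# Made-up summary rows in one standard format, pooled across experiments
# and categorization methods.
rows = [
    {"experiment": 101, "method": "gating", "population": "T cells",
     "medians": {"CD3": 2.0, "CD19": 0.0}},
    {"experiment": 102, "method": "SPADE", "population": "T cells",
     "medians": {"CD3": 3.0, "CD19": 0.2}},
    {"experiment": 101, "method": "gating", "population": "B cells",
     "medians": {"CD3": 0.0, "CD19": 2.0}},
    {"experiment": 103, "method": "viSNE", "population": "B cells",
     "medians": {"CD3": 0.2, "CD19": 3.0}},
]
markers = ["CD3", "CD19"]

centroids = train_centroids(rows, markers)
predicted = classify({"CD3": 2.2, "CD19": 0.3}, centroids, markers)
print(predicted)
```

Because every row carries its `method`, the same table can also be split by method to compare how gating, SPADE, viSNE, and CITRUS describe the same population.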
Workflow 4: Run PCA on Data in Cytobank
The number of useful analytical methods applicable to cytometry data has grown rapidly in recent years. Cytobank doesn’t natively support many of the methods being developed in the field, nor does it currently allow custom scripts to be executed inside our platform. However, we are developing a middle ground in which a researcher installs a software package such as the R software environment (a simple process akin to installing any computer application), and Cytobank provides the more challenging scaffolding needed to connect data on Cytobank to published algorithms with a single command. The experience will be as similar as possible to setting up a native algorithm within Cytobank and will free the researcher from grappling with the complexities of using the underlying analytical packages.
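For readers unfamiliar with what the external algorithm does once the data has left Cytobank, the sketch below computes the first principal component of a small table of events using only the Python standard library. It is an illustration of PCA itself, not of the Cytobank integration described above; the data and function names are made up, and a production workflow would normally rely on an established numerical package instead.

```python
import math


def center(rows):
    """Subtract each column's mean so every channel has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Power iteration: leading eigenvector and eigenvalue of a symmetric matrix.

    The fixed all-ones start vector fails in the rare case that it is exactly
    orthogonal to the leading eigenvector; a random start would avoid that.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


def project(centered, component):
    """Score of each event along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]


# Made-up events with two channels, where the second is twice the first.
events = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]

centered = center(events)
component, variance = first_component(covariance(centered))
scores = project(centered, component)
print(component, variance, scores)
```

The resulting scores are one new value per event, which is the kind of derived channel a script could attach back to the original data for plotting or gating.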
Get Started Using the Cytobank API
Our API opens up the Cytobank platform to workflows and creative use far beyond the native functionality of Cytobank. The resources below will help you get started:
- Check out the Cytobank API Endpoint Reference for documentation on available endpoints.
- Install the CytobankAPI R package to make interaction with the Cytobank API a breeze in the R programming language.
- Get in touch with Cytobank Support for help using the API or to discuss possible workflows.
- Contact Cytobank Research and Analytical Services to discuss advanced API integrations or your hardest problems in data production, analysis, and visualization.
Modern efforts in basic and clinical research routinely bring together teams of many people with different skill sets investigating biology across different data types and geographies. Beyond the immediate challenges of doing high quality science (producing and analyzing data) lies an equally challenging problem of coordination, logistics, and centralization, so that research programs can produce results in a timely and reproducible fashion while being assured of the security, privacy, and storage fidelity of data and results. This latter problem is often tackled with systems poorly suited to the job, and the resulting overhead diminishes productivity.
The Cytobank platform was designed with these needs in mind to make modern research programs efficient and productive. Our integrated suite of tools handles basic analysis, advanced algorithms and visualization, organization and searchable archiving, and permissioned sharing. A secured and redundant cloud infrastructure scalably powers the platform and allows access from anywhere in the world on a web-enabled device.