Annotate gene names in cluster file

As you have gone through the tutorial till this stage, you would have noticed that the gene symbols are used as the field when plotting tSNE, UMAP and PCA plots. We prefer the gene names in the final cluster file for ease of use, so we don’t have to re-check what symbols match which genes. The AnnData format we use in this workflow already has the gene names stored in the object, which we need to extract. We will use a tool to extract the specific table with the gene annotation information and then use join and cut tools to obtain the final table with cluster information for each generated cluster.

For users running the workflow -

  • Scroll down to the three tools “Inspect AnnData”, “Join two datasets” and “Cut”

  • You don’t need to change any values for the parameters. You are now ready to run the workflow

  • Scroll to the top of the page and click “Run Workflow”

For users running each step -

The first step is to obtain the annotation information from the AnnData object

  • Open up the tool “Inspect AnnData” and select the output of “Scanpy FindMarkers” as the input object

  • Under “What to inspect”, select “Key-indexed annotation of variables/features (var)”

  • Click on “Execute”

The second step is to join the table from above with the table from “Scanpy FindMarkers”

  • Search for the tool “Join two Datasets”

  • Under “Join”, select the tabular output from the “Scanpy FindMarkers” tool

  • Under “using column”, enter 4

  • Under “with”, select the tabular output from the “Inspect AnnData” tool

  • Under “and column”, enter 2

  • Under “Keep lines of first input that do not join with second input”, select Yes

  • Under “Keep lines of first input that are incomplete”, select Yes

  • Under “Fill empty columns”, select No

  • Under “Keep the header lines”, select Yes

  • Click on “Execute”

The third step is to cut unnecessary columns and only retain the ones we need and in the right order

  • Open the tool “Cut columns from a table” from the search toolbar

  • Under “Cut columns”, enter “c1,c2,c3,c4,c11,c5,c6,c7,c8”

  • Under “Delimited by”, select Tab

  • Provide the dataset from the “Join two Datasets”

  • Click on “Execute”

You are now done with Single-cell tertiary analysis and ready to look at your clusters and markers within them.