scGRNdb Tutorials
Welcome to the tutorials page. Here you will find detailed guides for using scGRNdb's analysis pipelines.
Overview
scGRNdb provides 3 core functions:
- Explore Networks: Browse available cell type GRNs and query your genes of interest.
- Module-Pathway Enrichment Pipeline: Identify modular components in your own network and functionally annotate them with pathway and disease databases.
- Network Prioritization Pipeline: Query your genes of interest against the scGRNdb database to find cell type GRNs and driver genes that best model your data.
Explore Networks
This section allows you to browse and visualize available gene regulatory networks in scGRNdb. You can also compare networks for similar network structures.
Step 1: Browse Available Networks
The network table lists all the GRNs available in scGRNdb. You can filter networks by species (human or mouse), tissue, and cell type, as well as by the scRNAseq data atlas used to generate the network. Once you have selected your search criteria, click on up to 2 table entries, and the input forms will automatically populate with your choices.
After selecting the networks, choose how you would like to visualize the network under "Network Scope". "Direct Neighbors" allows you to input a list or file of genes to visualize in the networks. "Full Network" allows you to visualize the entire network at the module level, where you can explore the functional pathways of those modules.
Example Input Gene File
GENE1
GENE2
GENE3
GENE4
GENE5
GENE6
GENE7
GENE8
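A gene file like the example above can be checked locally before uploading. The following is a minimal sketch, assuming the file holds one gene symbol per line; the helper name `read_gene_list` and the file name `genes.txt` are illustrative and not part of scGRNdb.

```python
from pathlib import Path
import tempfile


def read_gene_list(path):
    """Read one gene symbol per line, dropping blank lines and duplicates.

    The order of first appearance is preserved.
    """
    seen = {}
    for line in Path(path).read_text().splitlines():
        gene = line.strip()
        if gene:
            seen.setdefault(gene, None)
    return list(seen)


# Write a small example file (hypothetical name), then read it back.
with tempfile.TemporaryDirectory() as tmp:
    gene_file = Path(tmp) / "genes.txt"
    gene_file.write_text("GENE1\nGENE2\n\nGENE3\nGENE2\n")
    genes = read_gene_list(gene_file)

print(genes)
```

Removing blank lines and repeated symbols up front makes it easier to see how many distinct genes are actually being queried.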
Step 2: Visualize the Network
Direct Neighbors
If you selected "Direct Neighbors" in the previous step, you will be able to visualize the network of your input genes. The size of the nodes (genes) will be proportional to their degree in the network, and the edges will be weighted by the edge weight.
Several styling options are available:
- Adjust node size based on degree.
- Filter edges by weight.
- Refresh layout whenever you adjust the filtering parameters.
- Expand the search depth from your input genes network.
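The options above correspond to simple graph operations. The sketch below is an illustration only, not the code scGRNdb runs: the function names are hypothetical, the edge list reuses placeholder gene names, and the neighborhood search assumes that edge direction is ignored when expanding the search depth.

```python
from collections import defaultdict, deque

# (source, target, weight) triples with placeholder gene names.
edges = [("GENE1", "GENE2", 0.1), ("GENE1", "GENE3", 1.0), ("GENE4", "GENE3", 0.75)]


def filter_edges(edges, min_weight):
    """Keep only edges whose weight is at least min_weight."""
    return [(h, t, w) for h, t, w in edges if w >= min_weight]


def node_degrees(edges):
    """Count how many edges touch each gene (in-degree plus out-degree)."""
    degree = defaultdict(int)
    for head, tail, _ in edges:
        degree[head] += 1
        degree[tail] += 1
    return dict(degree)


def neighborhood(edges, seeds, depth=1):
    """Return genes within `depth` steps of the seed genes (direction ignored)."""
    adjacency = defaultdict(set)
    for head, tail, _ in edges:
        adjacency[head].add(tail)
        adjacency[tail].add(head)
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        gene, dist = frontier.popleft()
        if dist == depth:
            continue
        for neighbor in adjacency[gene]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return seen


kept = filter_edges(edges, min_weight=0.5)
print(node_degrees(kept))
print(sorted(neighborhood(edges, ["GENE2"], depth=2)))
```

Raising the weight threshold removes edges and therefore lowers node degrees, which is why refreshing the layout after filtering is useful.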
Full Network
If you selected "Full Network" in the previous step, you will be able to visualize the entire network at the module level. The size of the nodes (modules) will be proportional to the number of genes in the module, and the edges will be weighted by the number of outgoing edges from the module.
You can search for your genes of interest by supplying a list or txt file of genes in the same format as the Step 1 input genes. Modules that contain your genes will be highlighted in the network, and you can click them to explore in more detail. Double-click on any module to view the genes in it, as well as its associated functional pathways and diseases. This will display a second network visualization below, similar to the Direct Neighbors visualization.
Several styling options are available:
- Adjust node size based on the number of genes in the module.
- Filter edges by weight.
- Refresh layout whenever you adjust the filtering parameters.
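To make the module-level view concrete, the sketch below collapses a gene-level network into modules: node size is the number of genes per module, and each module-to-module edge counts the gene-level edges leaving the source module. The gene-to-module assignment and the edges are made-up placeholders, and counting only edges that cross module boundaries is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical module assignment and directed gene-level edges.
gene_to_module = {"GENE1": "M1", "GENE2": "M1", "GENE3": "M2", "GENE4": "M2"}
edges = [("GENE1", "GENE2"), ("GENE1", "GENE3"), ("GENE2", "GENE3"), ("GENE4", "GENE1")]

# Node size: number of genes in each module.
module_sizes = Counter(gene_to_module.values())

# Edge weight: number of gene-level edges going from one module to another.
module_edges = Counter()
for head, tail in edges:
    source, target = gene_to_module[head], gene_to_module[tail]
    if source != target:  # edges inside a module do not appear at module level
        module_edges[(source, target)] += 1

print(dict(module_sizes))
print(dict(module_edges))
```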
Compare Networks
When 2 networks are selected in either the "Direct Neighbors" or "Full Network" visualization, you can compare their network structures. Clicking the "Combine Networks" button merges both networks into one view. Nodes and edges found only in the first network are colored red, those found only in the second are colored blue, and those shared by both are colored green. The same styling options are available as in the "Direct Neighbors" visualization.
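The coloring rule can be expressed with set operations. This is a minimal sketch with a hypothetical function name and placeholder edges; it treats an edge as shared only when both source and target match, which is an assumption.

```python
def color_combined_edges(first, second):
    """Assign red (first only), blue (second only), or green (shared) to each edge."""
    first, second = set(first), set(second)
    colors = {}
    for edge in first | second:
        if edge in first and edge in second:
            colors[edge] = "green"
        elif edge in first:
            colors[edge] = "red"
        else:
            colors[edge] = "blue"
    return colors


network_a = [("GENE1", "GENE2"), ("GENE2", "GENE3")]
network_b = [("GENE2", "GENE3"), ("GENE3", "GENE4")]
colors = color_combined_edges(network_a, network_b)
for edge in sorted(colors):
    print(edge, colors[edge])
```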
Module-Pathway Enrichment Pipeline
This pipeline helps you identify functional modules in your gene network and connect them to biological pathways and disease gene signatures.
Step 1: Select a Network
The main input for the analysis is a network file. It can be a network that you generated or one of the sample networks provided. The network file should have the following columns:
- HEAD - Source gene (HGNC/MGI symbol)
- TAIL - Target gene (HGNC/MGI symbol)
- WEIGHT - Edge weight (numeric value)
Example Network File
HEAD | TAIL | WEIGHT |
---|---|---|
GENE1 | GENE2 | 0.1 |
GENE1 | GENE3 | 1 |
GENE4 | GENE3 | 0.75 |
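Before uploading, it can help to confirm that a network file has the three required columns and numeric weights. The sketch below assumes a tab-delimited file with a header row; the delimiter is not stated in this tutorial, so adjust the `delimiter` argument if your file differs. The function name `read_network` is illustrative.

```python
import csv
import io

REQUIRED_COLUMNS = ["HEAD", "TAIL", "WEIGHT"]


def read_network(handle, delimiter="\t"):
    """Parse a network file into (head, tail, weight) tuples, validating as it goes."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    edges = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            weight = float(row["WEIGHT"])
        except (TypeError, ValueError):
            raise ValueError(f"line {line_no}: WEIGHT is not numeric: {row['WEIGHT']!r}") from None
        edges.append((row["HEAD"], row["TAIL"], weight))
    return edges


# The example network from this section, written as tab-delimited text.
example = "HEAD\tTAIL\tWEIGHT\nGENE1\tGENE2\t0.1\nGENE1\tGENE3\t1\nGENE4\tGENE3\t0.75\n"
edges = read_network(io.StringIO(example))
print(edges)
```

For a file on disk, pass an open file handle instead of the `io.StringIO` object, for example `open("network.txt", newline="")`.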
Step 2: Select Module Parameters
The module detection algorithm will find the densely connected subgraphs in the network. It is based on Leiden clustering, which detects communities in the network by optimizing a modularity score. Since we want to analyze the function of the genes within the modules, we need to control the number of genes in the modules to provide interpretable pathway enrichment results. To do this, you can set the following parameters:
- Minimum module size (recommended default: 10 genes)
- Maximum module size (recommended default: 300 genes)
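The effect of the two size parameters can be illustrated with a simple filter over module assignments. This sketch only drops modules outside the size range; how the pipeline itself treats modules that are too small or too large (for example, whether it splits large ones) is not described here, so treat the behavior below as an assumption. The function name is hypothetical.

```python
from collections import defaultdict


def modules_within_size(gene_to_module, min_size=10, max_size=300):
    """Group genes by module and keep modules whose size is within the bounds."""
    members = defaultdict(list)
    for gene, module in gene_to_module.items():
        members[module].append(gene)
    return {
        module: genes
        for module, genes in members.items()
        if min_size <= len(genes) <= max_size
    }


# Toy assignment: M1 has 3 genes, M2 has 1 gene, M3 has 5 genes.
assignment = {f"GENE{i}": "M1" for i in range(1, 4)}
assignment["GENE4"] = "M2"
assignment.update({f"GENE{i}": "M3" for i in range(5, 10)})

kept = modules_within_size(assignment, min_size=2, max_size=4)
print(sorted(kept))
```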
Step 3: Select Species and Pathway Databases
The next step is to select the species (human or mouse) and pathway databases for enrichment analysis. We have collected a list of pathway databases for each species, and you can select one or more of them. For sample data, we recommend starting with GO Biological Process and DisGeNET.
Pathway Databases
GO Biological Process | KEGG |
GO Cellular Component | Reactome |
GO Molecular Function | Biocarta |
DisGeNET | GWAS Catalog |
Step 4 (Optional): Provide your Email
If you provide your email, you will receive an email notification when the analysis is complete. If you do not provide an email, remember to save your sessionID to retrieve your results later.
Step 5: Submit and Monitor
Click Submit to start the analysis. You can monitor the progress of the analysis in the Review Files tab. If you provided an email, you will receive an email notification when the analysis is complete.
Step 6: Explore Results
Downloads
Once the analysis is complete, you can download the results in the downloads table:
- Modules - A .txt file listing all genes and their associated modules
- Pathway Enrichment - A .txt file with the full enrichment analysis results
Pathway Enrichment Table
The Pathway Enrichment file is also displayed as a table below the downloads table. Here is a description of the columns:
- ID - Unique ID
- MODULE ID - Identifier for each module
- PATHWAY - Name of the enriched pathway
- PATHWAY SOURCE - Database used for pathway enrichment
- P - P-value calculated using the hypergeometric test
- FDR - False Discovery Rate
- RISK RATIO - Enrichment score
- module_size - Number of genes in the module
- pathway_size - Number of genes in the pathway
- overlap - Number of overlapping genes between the module and the pathway
You can filter the results by module ID, size, or overlap. You can also adjust the number of rows displayed per page and download the entire table.
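For readers who want to see how the P column relates to module_size, pathway_size, and overlap, here is a minimal sketch of a one-sided hypergeometric test. The background size (the total number of genes considered) is an assumption of this sketch, since the tutorial does not state which background is used. The Benjamini-Hochberg adjustment shown for FDR is likewise a common choice assumed here, not a method confirmed by this page.

```python
from math import comb


def hypergeom_pvalue(overlap, module_size, pathway_size, background_size):
    """P(X >= overlap) when module_size genes are drawn from the background.

    The background contains pathway_size pathway genes.
    """
    upper = min(module_size, pathway_size)
    total = comb(background_size, module_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy numbers: 10 background genes, a 3-gene pathway, a 2-gene module, 2 overlapping.
p = hypergeom_pvalue(overlap=2, module_size=2, pathway_size=3, background_size=10)
print(round(p, 4))
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```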
Step 7: Key Driver Analysis
Review the results table and choose pathways for further Key Driver Analysis (KDA). You can select as many pathways as you want. When you click "Prepare KDA", you will be redirected to the KDA tab, where you can review pathways you selected. Then, you can click Run KDA, which will take you to the KDA analysis page on Mergeomics Web Server. Your session will carry over to the Mergeomics Web Server with all input files and recommended parameters already set. All you will need to do is provide your email and click submit.
We recommend the default parameters for KDA. More details about the KDA parameters can be found on the Mergeomics tutorial page.
Network Prioritization Pipeline
This pipeline allows you to model your gene set against cell type GRNs in scGRNdb and identify the cell type specific mechanisms that best explain your data.
Step 1: Prepare Your Gene Set File
The main input for the analysis is one or more gene sets. If you have one gene set, you can provide it as a comma-separated list of genes or as a txt file. If you have multiple gene sets, provide them in a single txt file. A txt file should have the following columns:
- genes - Gene names (HGNC/MGI symbol)
- module - Gene set name
Example Gene Set File
genes | module |
---|---|
GENE1 | module1 |
GENE2 | module1 |
GENE3 | module1 |
GENE1 | module2 |
GENE4 | module2 |
GENE3 | module3 |
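The two-column layout maps naturally onto a dictionary from gene set name to gene list. The sketch below assumes a tab-delimited file with a header row, which is not specified in this tutorial; the function name `read_gene_sets` is illustrative.

```python
import csv
import io
from collections import defaultdict


def read_gene_sets(handle, delimiter="\t"):
    """Group the `genes` column by the `module` column."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    gene_sets = defaultdict(list)
    for row in reader:
        gene_sets[row["module"]].append(row["genes"])
    return dict(gene_sets)


# The example gene set file from this section, written as tab-delimited text.
example = (
    "genes\tmodule\n"
    "GENE1\tmodule1\n"
    "GENE2\tmodule1\n"
    "GENE3\tmodule1\n"
    "GENE1\tmodule2\n"
    "GENE4\tmodule2\n"
    "GENE3\tmodule3\n"
)
gene_sets = read_gene_sets(io.StringIO(example))
for name, genes in gene_sets.items():
    print(name, genes)
```

Note that a gene may belong to more than one gene set, as GENE1 and GENE3 do in the example.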
Step 2: Select Species and Atlas
The next step is to select the species (human or mouse) and the corresponding scRNAseq atlases used to generate the GRNs. You can select as many atlases as you want. For brain tissues, we recommend selecting any of the Allen Brain Atlases. For other tissues, we recommend Tabula Sapiens and GTEx for human and Tabula Muris for mouse.
scRNAseq Data Atlases
Human | Mouse |
---|---|
Allen Brain Atlas (10X) | Allen Brain Atlas (10X) |
Allen Brain Atlas (SmartSeq) | Allen Brain Atlas (SmartSeq) |
Tabula Sapiens | Tabula Muris (10X) |
Human Cell Landscape | Tabula Muris (SmartSeq) |
GTEx | Tabula Muris Senis (10X) |
 | Tabula Muris Senis (SmartSeq) |
 | Mouse Cell Atlas |
Step 3 (Optional): Provide your Email
If you provide your email, you will receive an email notification when the analysis is complete. If you do not provide an email, remember to save your sessionID to retrieve your results later.
Step 4: Submit and Monitor
Click Submit to start the analysis. You can monitor the progress of the analysis in the Review Files tab. If you provided an email, you will receive an email notification when the analysis is complete.
Step 5: Explore Results
Downloads
Once the analysis is complete, you can download the results in the downloads table:
- Enriched Networks - A .txt file listing each gene set and its enriched networks.
Enriched Networks Table
The Enriched Networks file is also displayed as a table below the downloads table. Here is a description of the columns:
- ID - Unique ID
- GENESET - Name of gene set
- NETWORK TISSUE - Tissue of the enriched GRN
- NETWORK CELLTYPE - Cell Type of the enriched GRN
- NETWORK MODULE - GRN subnetwork with enrichment for gene set
- NETWORK CELLTYPE - scGRNdb cell atlas of the enriched GRN
- P - P-value calculated using the hypergeometric test
- FDR - False Discovery Rate
- RISK RATIO - Enrichment score
- GENESET SIZE - Number of genes in the gene set
- NETWORK MODULE SIZE - Number of genes in the GRN module
- OVERLAP - Number of overlapping genes between the GRN module and the gene set
You can filter the results by any column. A typical workflow is to filter to your gene set, sort by the FDR column, narrow the results to any tissue or cell type of interest, and then select your networks.
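The same filtering can be reproduced offline on the downloaded Enriched Networks file. The sketch below works on rows already loaded as dictionaries keyed by the column names above. All tissue and cell type values and FDR numbers are made-up placeholders, and the 0.05 FDR cutoff is an assumption rather than a recommendation from this tutorial.

```python
# Placeholder rows; real values come from the downloaded results file.
rows = [
    {"GENESET": "module1", "NETWORK TISSUE": "tissueA", "NETWORK CELLTYPE": "celltypeX", "FDR": 0.20},
    {"GENESET": "module1", "NETWORK TISSUE": "tissueA", "NETWORK CELLTYPE": "celltypeY", "FDR": 0.001},
    {"GENESET": "module1", "NETWORK TISSUE": "tissueB", "NETWORK CELLTYPE": "celltypeX", "FDR": 0.01},
    {"GENESET": "module2", "NETWORK TISSUE": "tissueA", "NETWORK CELLTYPE": "celltypeX", "FDR": 0.0005},
]


def top_networks(rows, geneset, tissue=None, max_fdr=0.05):
    """Filter to one gene set (and optionally one tissue), then sort by FDR."""
    hits = [
        row
        for row in rows
        if row["GENESET"] == geneset
        and row["FDR"] <= max_fdr
        and (tissue is None or row["NETWORK TISSUE"] == tissue)
    ]
    return sorted(hits, key=lambda row: row["FDR"])


for row in top_networks(rows, "module1"):
    print(row["NETWORK TISSUE"], row["NETWORK CELLTYPE"], row["FDR"])
```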
Step 6: Key Driver Analysis
Review the results table and choose networks for Key Driver Analysis (KDA). You can select as many networks as you want. When you click "Prepare KDA", you will be redirected to the KDA tab, where you can review networks you selected. Each unique network will have its own KDA run, and their mapped genesets will be combined into one file. Then, you can click Run KDA, which will take you to the KDA analysis page on Mergeomics Web Server. Your session will carry over to the Mergeomics Web Server with all input files and recommended parameters already set. All you will need to do is provide your email and click submit.
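The grouping described above (one KDA run per unique network, with its mapped gene sets combined) can be pictured as follows. This is a sketch only: it assumes a network is identified by its tissue and cell type, which may not match how scGRNdb keys networks internally, and the row values are placeholders.

```python
from collections import defaultdict


def group_for_kda(selected_rows):
    """Map each unique network to the sorted gene sets selected for it."""
    runs = defaultdict(set)
    for row in selected_rows:
        network = (row["NETWORK TISSUE"], row["NETWORK CELLTYPE"])
        runs[network].add(row["GENESET"])
    return {network: sorted(genesets) for network, genesets in runs.items()}


# Placeholder selection: two gene sets hit the same network, one hits another.
selected = [
    {"GENESET": "module1", "NETWORK TISSUE": "tissueA", "NETWORK CELLTYPE": "celltypeX"},
    {"GENESET": "module2", "NETWORK TISSUE": "tissueA", "NETWORK CELLTYPE": "celltypeX"},
    {"GENESET": "module1", "NETWORK TISSUE": "tissueB", "NETWORK CELLTYPE": "celltypeY"},
]
kda_runs = group_for_kda(selected)
for network, genesets in kda_runs.items():
    print(network, genesets)
```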
We recommend the default parameters for KDA. More details about the KDA parameters can be found on the Mergeomics tutorial page.