Post-hoc analysis for scRNA-seq and scATAC-seq Data Analysis DREAM Challenge
Clone the repo:
git clone https://github.com/Sage-Bionetworks-Challenges/Multi-seq-Data-Analysis-Post-Analysis
cd Multi-seq-Data-Analysis-Post-Analysis
Create a conda environment using Python 3.9:
conda create --name synapse python=3.9 -y
conda activate synapse
Install Python dependencies:
python -m pip install challengeutils==4.2.0
Check that challengeutils and synapseclient (pulled in as its dependency) are installed:
synapse --version
challengeutils -v
Install R dependencies:
R -e 'source("install.R")'
Note: The Task 2 analysis uses the bedr package, which has two prerequisites, bedops and tabix, that need to be installed as well.
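Whether the two prerequisites are already available can be checked before running the Task 2 analysis. The following is a minimal sketch, assuming the executables are named `bedops` and `tabix` and are expected on `PATH`; `missing_tools` is a hypothetical helper, not part of this repo.

```shell
# Report which of the given command-line tools are missing from PATH.
# Prints one missing tool name per line; returns 0 only if all are found.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example (binary names are assumptions):
#   missing_tools bedops tabix || echo "install the tools listed above"
```

The helper only reports what is missing; how the tools are installed (conda, a system package manager, or from source) is left to the environment.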
Set up Synapse credentials via the CLI, or manually store the credentials in ~/.synapseConfig (see the Synapse client documentation for details):
synapse login --rememberMe
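For the manual route, the sketch below writes a minimal config file. It assumes the `[authentication]` section with an `authtoken` key, which recent synapseclient releases read; check the documentation for the installed version before relying on it. `write_synapse_config` is a hypothetical helper, and the token value is a placeholder.

```shell
# Write a minimal Synapse config file, refusing to overwrite an existing one.
# Usage: write_synapse_config <path> <personal-access-token>
write_synapse_config() {
    config_path=$1
    token=$2
    if [ -e "$config_path" ]; then
        echo "refusing to overwrite $config_path" >&2
        return 1
    fi
    # Restrict permissions to the owner before the secret is written.
    ( umask 077 && printf '[authentication]\nauthtoken = %s\n' "$token" > "$config_path" )
}

# Example (placeholder token):
#   write_synapse_config "$HOME/.synapseConfig" "<your-personal-access-token>"
```

Passing the token as an argument leaves it in the shell history, so editing the file by hand may be preferable on shared machines.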
Download all final submission results and each individual test case's scores to the data/ folder:
Rscript submission/get_submissions.R
The script writes the following files:
final_submissions_{task}.rds: Essential information about the final submissions, e.g. submission ID, team, and ranks.
final_scores_{task}.rds: All test case scores from the final submissions, consisting of the test case name and the scores of the primary and secondary metrics.
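To see what the downloaded files contain, their structure can be printed from the shell. This is a sketch only: it assumes `{task}` expands to `task1` or `task2` and that the files land directly under `data/`; `inspect_rds` is a hypothetical helper that relies on base R's `readRDS()` and `str()`.

```shell
# Print the structure of one downloaded .rds file using R's readRDS() and str().
# Usage: inspect_rds <path-to-rds>
inspect_rds() {
    rds_path=$1
    if [ ! -f "$rds_path" ]; then
        echo "not found: $rds_path (run submission/get_submissions.R first)" >&2
        return 1
    fi
    # Pass the path as a trailing argument so it needs no quoting inside the R code.
    Rscript -e 'str(readRDS(commandArgs(trailingOnly = TRUE)[1]))' "$rds_path"
}

# Example (file names assume {task} = task1):
#   inspect_rds data/final_submissions_task1.rds
#   inspect_rds data/final_scores_task1.rds
```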
Download the output files (imputed gene expression / called peaks) of all final submissions to data/model_output/:
# replace {task} with 'task1' or 'task2'
Rscript submission/get_predictions_{task}.R
Warning: For Task 1, the output (imputation) of each submission is large (~30 GB). Make sure enough disk space is available before downloading.
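Given the size of the Task 1 outputs, it may help to check free space before starting the download. The sketch below assumes a POSIX `df`; `has_free_gb` is a hypothetical helper, and the number of submissions in the example is a made-up value to be replaced with the real count.

```shell
# Succeed only if the filesystem holding <dir> has at least <gb> GiB available.
# Usage: has_free_gb <dir> <gb>
has_free_gb() {
    dir=$1
    required_gb=$2
    # df -P prints one POSIX-format line per filesystem; -k reports 1 KiB blocks.
    # Column 4 of the data line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] || return 1
    [ "$avail_kb" -ge $((required_gb * 1024 * 1024)) ]
}

# Example: n_submissions is a hypothetical count; 30 is the approximate
# per-submission size quoted in the warning.
#   n_submissions=10
#   has_free_gb data $((n_submissions * 30)) && Rscript submission/get_predictions_task1.R
```

The function returns a failure status if the directory does not exist, so the download in the example is skipped rather than started on an unknown filesystem.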
Report statistics about the submissions:
Rscript -e 'rmarkdown::render("stats/get_submission_stats.rmd")'