Week 1 Goals
• Set up your environment (we will be working in Python .ipynb files)
o Use whichever IDE you prefer (though I use VSCode)
o Set up your virtual environment in the folder where you will be doing all your work (you’ll use the venv command, e.g. python -m venv .venv).
o Once that is set up, start installing necessary libraries (pip install, or conda install from CLI). Some libraries you will need are:
numpy
matplotlib
scanpy
pandas
scikit-learn
scikit-image
celltypist
seaborn
ipykernel
o Make sure to also create your Jupyter kernel from this environment so the notebook uses the libraries you installed
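Once everything is installed, a quick sanity check from inside the notebook can confirm that the kernel actually sees the libraries. The sketch below is only one way to do this; the helper name check_packages is made up for illustration. Note that two packages are imported under a different name than the one you install: scikit-learn is imported as sklearn and scikit-image as skimage.

```python
import importlib.util

# pip/conda package name -> module name used in `import` statements
PACKAGES = {
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scanpy": "scanpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "scikit-image": "skimage",
    "celltypist": "celltypist",
    "seaborn": "seaborn",
    "ipykernel": "ipykernel",
}


def check_packages(packages=PACKAGES):
    """Return {package_name: bool}, True if the module can be found by this kernel."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in packages.items()
    }


status = check_packages()
for name, ok in status.items():
    print(f"{name:15s} {'ok' if ok else 'MISSING'}")
```

If anything shows up as MISSING, the notebook is most likely running on a different kernel than the virtual environment you installed into.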
• Add your project folder to GitHub and make sure it’s in your local environment so you can push/pull updates
• Load in the data that I will send you and create an AnnData object (this is the data structure used by the scanpy library, and it’s the object we will use for data manipulation throughout the single-cell project)
• Explore the dataset and get comfortable with the structure
• Perform some QC on the cells based on gene expression:
o Look at how many genes are in each cell
o How many cells express each gene
o Clean the data up to make sure we are only keeping healthy cells. You can find some documentation on standards, but we want cells that express at least 300 genes and genes that are expressed in at least 3 cells.
o Make some plots to show your findings
• Deliverables for next week:
o .ipynb Jupyter notebook file with the start of your work
o Summary of what you have done, things you’ve learned, and questions for me in PDF format, uploaded to Slack prior to our Wednesday meeting