Software and data required for the course

Important

The course instructions assume you are using a Linux environment.
If you’re using a Mac or Windows computer, you might find it easier to set up a Linux virtual machine using software like VirtualBox (instructions below).
However, all of the software used should also be installable on Mac or Windows computers.

Assumed file system structure

All of the practical sessions refer to data in a root directory called /course. If you’re using a Virtual Machine, you can just make this directory (sudo mkdir /course) and put the data there. If you’re using your own computer and put the data elsewhere, such as somewhere in your home folder, you’ll need to modify the paths in the course instructions accordingly.
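If you keep the data outside /course, a small helper can do the path translation for you. This is only a sketch: COURSE_ROOT and course_path are hypothetical names, not something the course materials define.

```shell
# COURSE_ROOT is a hypothetical variable; set it to wherever you put the course data.
COURSE_ROOT="${COURSE_ROOT:-/course}"

# Print the location of a /course path under the chosen root.
course_path() {
  case "$1" in
    /course)   printf '%s\n' "${COURSE_ROOT%/}" ;;
    /course/*) printf '%s/%s\n' "${COURSE_ROOT%/}" "${1#/course/}" ;;
    *)         printf '%s\n' "$1" ;;  # not a course path: leave unchanged
  esac
}
```

With COURSE_ROOT set to a folder in your home directory, something like cd "$(course_path /course/docs)" would then take you to the matching location there.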

Setting up a Linux virtual machine for the course

Follow the Ubuntu instructions for creating an Ubuntu VM.
You’ll need to allocate at least 8GB of memory to the VM to run every step of the course.
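To see how much memory the VM actually has, you can read it from inside the guest. The sketch below assumes a Linux guest that exposes /proc/meminfo; mem_total_gib is a hypothetical helper name. The guest usually reports slightly less than the amount you allocated, so treat the number as a rough check.

```shell
# Print total memory in GiB, read from a meminfo-style file (default: /proc/meminfo).
mem_total_gib() {
  # MemTotal is reported in kB; 1048576 kB = 1 GiB.
  LC_ALL=C awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}
```

Run mem_total_gib with no arguments inside the VM to print the value.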

Installing software for the course

Docker

Docker allows you to run “containers”: reproducible builds of certain tools. Install Docker Desktop (or alternatives like Podman).

Anaconda

Conda allows you to create “environments”: sets of tools and libraries that depend on each other. Install the Anaconda distribution.

Sirius

Sirius is a tool for analysing metabolite data. Install Sirius 4.

MZmine

MZmine is a tool for processing mass-spectrometry data. Install MZmine 3.

Gemma

Gemma is a tool for working with genome-wide association studies. Install Gemma 0.98.3.

Bedtools

Bedtools is a set of tools for genomic analysis. Install Bedtools 2.30.0.
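Once the tools are installed, a quick check that the command-line ones are on your PATH can save time later. This is a minimal sketch; missing_tools is a hypothetical helper, and the command names you pass it are assumptions that may differ depending on how each tool was installed.

```shell
# Print each named command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example call (command names are assumptions):
#   missing_tools docker conda sirius gemma bedtools
```

Anything printed is a command your shell could not find; no output means every name was found.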

Dependencies

cd /course (assuming you are using a Virtual Machine, see notes above)

This fetches the course notes, some code notebooks, and various dependencies and datasets:

git clone https://github.com/ebi-metagenomics/holofood-course.git docs

This creates Conda environments with the dependencies required for the practical sessions:

cd docs/sessions/Metabolomics/

conda env create -f Metabolomics.yml

cd /course/docs/sessions/metagenomics/notebooks/

conda create --name jupyter -c conda-forge jupyterlab

conda activate jupyter

pip install -r requirements.txt

conda create --name r --channel conda-forge "r-base>=4.0.3" r-devtools

conda activate r

conda install -c conda-forge r-reshape2 r-ggplot2
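To confirm that the environments were created, you can look for their names in the output of conda env list. The sketch below uses a hypothetical helper, env_listed, that reads that listing from standard input. The name of the environment defined by Metabolomics.yml is not stated here, so check the name field in that file before testing for it.

```shell
# Succeed if the environment name given as $1 appears in `conda env list` output on stdin.
env_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example calls:
#   conda env list | env_listed jupyter && echo "jupyter environment found"
#   conda env list | env_listed r && echo "r environment found"
```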

Copying data for the course

For the MAG generation practical

Download all of the data from this EBI-hosted FTP site.

Unzip any of the .tar.gz files, using e.g. tar -xzf eukaryotes.tar.gz.
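If there are several archives, a loop saves running tar once per file. This is a sketch under the assumption that all of the .tar.gz files sit directly in one directory; extract_all is a hypothetical helper name.

```shell
# Extract every .tar.gz archive found directly inside the given directory (default: current).
extract_all() {
  dir="${1:-.}"
  for archive in "$dir"/*.tar.gz; do
    [ -e "$archive" ] || continue   # no archives matched the pattern
    tar -xzf "$archive" -C "$dir" || return 1
  done
}
```

Calling extract_all with no arguments unpacks the archives in the current directory, next to the archives themselves.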

For the multi-kingdom metagenomics practical

wget http://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_courses/biata_2021/virify_tutorial.tar.gz
or
rsync -av --partial --progress rsync://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_courses/biata_2021/virify_tutorial.tar.gz .

Once downloaded, extract the files from the tarball:

tar -xzvf virify_tutorial.tar.gz

Now change into the virify_tutorial directory and set up the environment by running the following commands in your current terminal session:

cd virify_tutorial
docker load --input docker/virify.tar
docker run --rm -it -v $(pwd)/data:/opt/data virify
mkdir obs_results