AlphaFold2
Warning
An AlphaFold2 run is split into two separate jobs because the sequence alignment step does not use the GPU at all.
Running the alignment and the prediction as separate jobs is strongly advised, so that a GPU is not reserved while only CPU work is being done.
Run the sequence alignment on either the CPU or the BigData partition, depending on the length of the protein sequence.
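Since the choice of partition depends on the sequence length, it can help to check the length before submitting. The helper below is a minimal sketch, not part of ParallelFold: the function name `fasta_length` and the example file name are hypothetical. It counts the residues in a FASTA file and skips the header lines. No length threshold is implied here; which partition suits which length is a site-specific decision.

```shell
#!/bin/bash
# Print the total number of residues in a FASTA file.
# Header lines (starting with '>') are skipped; spaces, tabs and
# carriage returns inside sequence lines are not counted.
# For a multi-sequence FASTA this is the sum over all sequences.
fasta_length() {
    awk '!/^>/ { gsub(/[ \t\r]/, ""); n += length($0) } END { print n + 0 }' "$1"
}

# Example (hypothetical file name):
#   fasta_length my_protein.fasta
```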
Sequence alignment
#!/bin/bash
#SBATCH -A {account-name}
#SBATCH --job-name={name}
#SBATCH --partition={cpu OR bigdata}
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --output=%x_%j.out
source /opt/software/packages/miniconda/3/etc/profile.d/conda.sh
conda activate parafold
export LD_LIBRARY_PATH=/opt/software/packages/alphafold/ParallelFold/nvidialib/:$LD_LIBRARY_PATH
/opt/software/packages/alphafold/ParallelFold/run_alphafold.sh -d /opt/software/packages/alphafold/alphafold-db -o {output} -p monomer_ptm -i {fastafile.fasta} -t 1800-01-01 -m model_1,model_2,model_3,model_4,model_5 -f
Note
This runs the featurization step and writes a features.pkl file. Once it has finished, the structure prediction can be run.
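Before submitting the prediction job, it may be worth confirming that the alignment job really produced its feature file. The following is a small sketch under assumptions: `require_features` is a hypothetical helper, and the location of the file inside the output directory is not stated here, so the path is passed in explicitly.

```shell
#!/bin/bash
# Succeed only if the given feature file exists and is non-empty.
require_features() {
    local features_file="$1"
    if [ -s "$features_file" ]; then
        echo "Found features: $features_file"
    else
        echo "Missing or empty features file: $features_file" >&2
        return 1
    fi
}

# Example: set FEATURES to the feature file written by the alignment job,
# then submit the prediction script (hypothetical name) only if it exists:
#   require_features "$FEATURES" && sbatch prediction.sh
```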
Running the prediction on the GPU partition
#!/bin/bash
#SBATCH -A {account-name}
#SBATCH --job-name={name}
#SBATCH --partition={gpu OR ai}
#SBATCH --gres=gpu:1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=32
#SBATCH --output=%x_%j.out
source /opt/software/packages/miniconda/3/etc/profile.d/conda.sh
conda activate parafold
export LD_LIBRARY_PATH=/opt/software/packages/alphafold/ParallelFold/nvidialib/:$LD_LIBRARY_PATH
/opt/software/packages/alphafold/ParallelFold/run_alphafold.sh -d /opt/software/packages/alphafold/alphafold-db -o {output} -p monomer_ptm -i {fastafile.fasta} -t 1800-01-01 -m model_1,model_2,model_3,model_4,model_5 -g -u $CUDA_VISIBLE_DEVICES
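The two jobs can also be chained so that the prediction starts only after the alignment has finished successfully. The sketch below uses the standard Slurm options `--parsable` and `--dependency=afterok`; the function name `submit_alphafold_chain` and the script file names are hypothetical, and it assumes the two scripts above have been saved as separate files.

```shell
#!/bin/bash
# Submit the alignment job, then a prediction job that waits for it.
submit_alphafold_chain() {
    local align_script="$1" predict_script="$2" align_id
    # --parsable makes sbatch print only the job ID
    # (optionally followed by ";cluster_name").
    align_id=$(sbatch --parsable "$align_script") || return 1
    align_id=${align_id%%;*}
    # afterok: start only if the alignment job exits with code 0.
    sbatch --dependency=afterok:"$align_id" "$predict_script"
}

# Example (hypothetical script names):
#   submit_alphafold_chain alignment.sh prediction.sh
```

If the alignment job fails, the dependent job never starts and stays pending with an unsatisfied dependency until it is cancelled with `scancel`.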
Multimer runs
Warning
For multimer runs, the default -t 1800-01-01 does not work: no template structures can be found for that date, which produces NaN coordinates, and the run then fails at energy minimization with:
simtk.openmm.OpenMMException: Particle coordinate is nan
For multimer runs, use a later date, e.g. -t 2022-01-01.
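To find out whether a failed run hit this problem, the job's output file can be searched for the error message quoted above. This is a minimal sketch: `has_nan_failure` is a hypothetical helper, and the log file name in the example only illustrates the %x_%j.out pattern used in the job scripts.

```shell
#!/bin/bash
# Return 0 if the given job output file contains the OpenMM NaN error.
has_nan_failure() {
    grep -q "Particle coordinate is nan" "$1"
}

# Example (hypothetical log name following the %x_%j.out pattern):
#   has_nan_failure myjob_123456.out && echo "NaN failure: check the -t date"
```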
Multimer sequence alignment
#!/bin/bash
#SBATCH -A {account-name}
#SBATCH --job-name={name}
#SBATCH --partition={cpu OR bigdata}
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --output=%x_%j.out
source /opt/software/packages/miniconda/3/etc/profile.d/conda.sh
conda activate parafold
export LD_LIBRARY_PATH=/opt/software/packages/alphafold/ParallelFold/nvidialib/:$LD_LIBRARY_PATH
/opt/software/packages/alphafold/ParallelFold/run_alphafold.sh -d /opt/software/packages/alphafold/alphafold-db -o {output} -p multimer -i {fastafile.fasta} -t 2022-01-01 -m model_1_multimer_v3,model_2_multimer_v3,model_3_multimer_v3,model_4_multimer_v3,model_5_multimer_v3 -f
Note
This runs the featurization step and writes a features.pkl file. Once it has finished, the structure prediction can be run.
Multimer GPU prediction
#!/bin/bash
#SBATCH -A {account-name}
#SBATCH --job-name={name}
#SBATCH --partition={gpu OR ai}
#SBATCH --gres=gpu:1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=32
#SBATCH --output=%x_%j.out
source /opt/software/packages/miniconda/3/etc/profile.d/conda.sh
conda activate parafold
export LD_LIBRARY_PATH=/opt/software/packages/alphafold/ParallelFold/nvidialib/:$LD_LIBRARY_PATH
/opt/software/packages/alphafold/ParallelFold/run_alphafold.sh -d /opt/software/packages/alphafold/alphafold-db -o {output} -p multimer -i {fastafile.fasta} -t 2022-01-01 -m model_1_multimer_v3,model_2_multimer_v3,model_3_multimer_v3,model_4_multimer_v3,model_5_multimer_v3 -g -u $CUDA_VISIBLE_DEVICES