.. _binder_design: Binder Design ============= We aren't BindCraft. But we can help you build your own binder design pipeline, just wait until we tell you how! In this tutorial, we are going to design binders for the protein Epidermal Growth Factor Receptor (EGFR). To achieve this goal, we need several protein design tools: - `RFdiffusion `_ (backbone generation) - `LigandMPNN `_ (threading sequences onto backbones) - `Rosetta `_ (energy-based backbone optimization) - `ESMFold `_ (structure prediction) - `ColabFold `_ (complex prediction) Make sure these are installed on your local machine and configured in the ProtFlow config file! .. note:: If you are having trouble setting up ProtFlow, please check out the :doc:`Quickstart guide `. Introduction ------------ We are going to create a pipeline that requires the receptor structure (EGFR) as input. Luckily, this structure is available in the Protein Data Bank (PDB code: 6aru). Using RFdiffusion, we will create de novo binders that should bind to certain regions on the receptor (so-called hotspot residues). Next, we are going to generate sequences that fit to the diffused backbones with LigandMPNN, optimize the structures with Rosetta, and predict the sequences with ESMFold and Colabfold. The complete pipeline including the input structure can be found in `this repository `_. This tutorial assumes you are already familiar with how to set up ProtFlow runners and metrics. If you are not, please check out one of our other tutorials. Hotspot residues ---------------- .. figure:: ../../assets/binder_tutorial/egfr_hotspot_residues.png :align: right :figwidth: 200px Structure of truncated EGFR. Hotspot residues are shown in magenta. We identified 5 residues on EGFR that we want to use as hotspot residues: L325, H346, L348, A415, and I438. These residues are located on a hydrophobic patch of EGFR in a region where cetuximab, a monoclonal antibody, binds to EGFR. We are defining these residues as ResidueSelections, a special ProtFlow class for selection of protein residues. We modified our input PDB slightly, so that the target EGFR is on chain B. In addition, we truncated the EGFR structure to 162 residues to decrease runtime, as RFdiffusion inference time depends on the size of the input structures. We assign each pose a ResidueSelection containing the hotspot residues twice: once to preserve the original hotspot residues, and in another column to update the selection post-diffusion as the residue numbers change after structure generation with RFdiffusion. .. code-block:: python from protflow.poses import Poses from protflow.residues import residue_selection receptor_pdb = "path/to/egfr_truncated.pdb" output_dir = "path/to/output_dir" # import all PDB files in input dir as poses poses = Poses(poses=receptor_pdb, work_dir=output_dir) hotspot_residues = "B18,B39,B41,B108,B131" # residue numbers are different in the truncated EGFR! hotspot_residue_selection = residue_selection(hotspot_residues, delim=",") # add hotspot residues to poses df # assign each pose the residue selection of the hotspot residues poses.df["hotspot_residues_original"] = [hotspot_residues for _ in poses.poses_list()] # this residue selection will be updated post-diffusion (to track changing residue numbers) poses.df["hotspot_residues_postdiffusion"] = [hotspot_residues for _ in poses.poses_list()] RFdiffusion ----------- RFdiffusion comes with a special potential for binder generation with hotspot residues. We are also going to use the "Beta" model to increase topology diversity of diffused structures. Our binder should have a length of 80 residues. Before we start diffusing, we extract sequence information of the receptor. .. note:: For further details on RFdiffusion options, the hotspot potential and Beta model, please check out the `RFdiffusion documentation `_. .. code-block:: python from protflow.utils.biopython_tools import load_structure_from_pdbfile, get_sequence_from_pose from protflow.jobstarters import SbatchArrayJobstarter from protflow.tools.rfdiffusion import RFdiffusion # extract the receptor sequence receptor_structure = load_structure_from_pdbfile(receptor_pdb) receptor_sequence = get_sequence_from_pose(receptor_structure) # set up jobstarters gpu_jobstarter = SbatchArrayJobstarter(max_cores=5, gpus=1) # set up a jobstarter that can use GPUs rfdiffusion = RFdiffusion(jobstarter=gpu_jobstarter) # set up rfdiffusion # define diffusion contig (B1-162 is the receptor EGFR, binder should be 80 residues) contig = "B1-162/0 80-80" # define path to Beta model (comes with RFdiffusion) beta_model_path = "/path/to/Complex_beta_ckpt.pt" # rfdiffusion options, same as when running rfdiffusion on the cmd line diff_opts = f"diffuser.T=50 'contigmap.contigs=[{contig}]' 'ppi.hotspot_res=[{hotspot_residues}]' inference.ckpt_override_path={beta_model_path}" # run rfdiffusion rfdiffusion.run( poses=poses, prefix='rfdiff', num_diffusions=100, multiplex_poses=5, options=diff_opts, fail_on_missing_output_poses=False, update_motifs=['hotspot_residues_postdiffusion'], ) We are multiplexing (or copying) our input pose (egfr_truncated.pdb) 5 times and running 100 diffusions, creating 500 output structures. Multiplexing is used to parallelize inference, as we defined 5 cores with one GPU each in our jobstarter. We are also updating our hotspot ResidueSelection, as RFdiffusion outputs structures with continuous residue numbers over both chains. Depending on your available computing power, it might take a while until diffusion is completed. The poses dataframe will now contain all diffused structures and the respective scores. Post-diffusion analysis ----------------------- We want to filter out all low-quality diffused structures. Our criteria are: - globular binder (low radius of gyration) - defined secondary structure (stability of binder) - high number of contacts to hotspot residues (to ensure binder binds at the correct position) .. code-block:: python from protflow.metrics.dssp import DSSP from protflow.metrics.ligand import LigandContacts from protflow.metrics.generic_metric_runner import GenericMetric # set up a jobstarter that runs just on CPUs small_cpu_jobstarter = SbatchArrayJobstarter(max_cores=10) # set up metrics to calculate scores contacts = LigandContacts(ligand_chain="B", min_dist=0, max_dist=10, atoms=['CA'], jobstarter=small_cpu_jobstarter) # calculates number of C-alpha atoms within 10 Angstrom of any atom that is on chain B # the generic metric runner can run any function that takes a pose as input using a jobstarter rog_calculator = GenericMetric(module="protflow.utils.metrics", function="calc_rog_of_pdb", options={"chain": "A"}, jobstarter=small_cpu_jobstarter) # calculates radius of gyration for chain A rescontacts_calculator = GenericMetric(module="protflow.utils.metrics", function="residue_contacts", jobstarter=small_cpu_jobstarter) # calculates number of atoms/residues that are within a certain distance from a target atom or residue dssp = DSSP(jobstarter=small_cpu_jobstarter) # calculates secondary structure content # retrieve updated hotspot residues (same for all diffused backbones, as length of diffused backbones is always 80 residues) hotspot_residues_postdiffusion = poses.df["hotspot_residues_postdiffusion"].iloc[0] # calculate rog, general contacts and hotspot contacts rog_calculator.run(poses=poses, prefix="rfdiff_rog") # calculate ROG contacts.run(poses=poses, prefix="rfdiff_contacts", normalize_by_num_atoms=False) # calculate number of contacts between chain A and B dssp.run(poses=poses, prefix="dssp") # determine secondary structure content # calculate number of contacts for each hotspot residue for res in hotspot_residues_postdiffusion.to_list(): # iterate over hotspot residues chain, resnum = [res[0], res[1:]] # extract chain and residue number rescontact_opts={"max_distance": 12, "target_chain": chain, "partner_chain": "A", "target_resnum": int(resnum), "target_atom_names": ["CA"], "partner_atom_names": ["CA"]} # define options rescontacts_calculator.run(poses=poses, prefix=f"hotspot_{res}_contacts", options=rescontact_opts) # calculate overall hotspot contacts poses.df["hotspot_contacts"] = sum([poses.df[f"hotspot_{res}_contacts_data"] for res in hotspot_residues_postdiffusion.to_list()]) # sum up all contacts to hotspot residues After we calculated all scores, we can visualize the data: .. code-block:: python from protflow.utils.plotting import violinplot_multiple_cols # make a plot of the hotspot_contacts cols_contact = ["rfdiff_plddt" , "rfdiff_rog_data", "hotspot_contacts"] + [f"hotspot_{res}_contacts_data" for res in hotspot_residues_postdiffusion.to_list()] # show contacts for each individual hotspot residue violinplot_multiple_cols(dataframe=poses.df, cols=cols_contact, y_labels=cols_contact, out_path=os.path.join(poses.plots_dir, "diff_contacts.png")) # make a plot of the secondary structure content ss_content_cols = [col for col in poses.df.columns if col.startswith("dssp") and col.endswith("content")] # show content for each secondary structure element violinplot_multiple_cols(dataframe=poses.df, cols=ss_content_cols, y_labels=ss_content_cols, out_path=os.path.join(poses.plots_dir, "diff_ss_content.png")) .. figure:: ../../assets/binder_tutorial/egfr_diffusion_scores_part1.png :align: center :figwidth: 700px Rfdiffusion pLDDT, radius of gyration, total number of hotspot contacts and per-hotspot contacts for 500 diffused backbones. .. figure:: ../../assets/binder_tutorial/egfr_diffusion_scores_part2.png :align: center :figwidth: 700px Secondary structure content for 500 diffused backbones. H = helix, B = residue in beta-bridge, E = extended strand, G = 3-helix, I = 5-helix, T = hydrogen-bonded turn, S = bend, L = loop. Looking at the plots, we notice that some of our backbones have a high radius of gyration (rfdiff_rog_data) or a low number of hotspot contacts (meaning they do not bind at the intended position). While most are predominantely helical (dssp_H_content) or contain beta-sheets (dssp_E_content), some have significant amount of unordered regions (dssp_L_content). We want to remove all of these poses using the Poses filtering options before we optimize the rest. .. code-block:: python poses.filter_poses_by_value(score_col="rfdiff_rog_data", value=18, operator="<=", prefix="rfdiff_rog", plot=True) # remove all poses with ROG higher than 20 poses.filter_poses_by_value(score_col="hotspot_contacts", value=20, operator=">=", prefix="rfdiff_hotspots_contacts", plot=True) # remove all poses with fewer total contacts to hotspot residues than the set cutoff poses.filter_poses_by_value(score_col="dssp_L_content", value = 0.25, operator="<", prefix = "L_content", plot = True) # remove all poses with more than 25% unordered (L) regions .. figure:: ../../assets/binder_tutorial/egfr_filters.png :align: left :figwidth: 700px Each filter will reduce the number of poses. To further narrow down our poses, we can calculate a composite score comprised of all scores that interest us, and filter for the 100 best poses according to this score. We assign a different importance (weight) to each scoreterm, and the sign of the weight depends on if higher or lower numbers indicate a good score. .. code-block:: python # calculate composite score diffusion_comp_score_scoreterms = ["rfdiff_rog_data", "hotspot_contacts", "dssp_L_content"] diffusion_comp_score_weights = [1, -2, 1] poses.calculate_composite_score(name="diffusion_comp_score", scoreterms=diffusion_comp_score_scoreterms, weights=diffusion_comp_score_weights, plot=True) # filter for the top 100 poses poses.filter_poses_by_rank(score_col="comp_score_before_opt", n=100, prefix="comp_score", plot=True, plot_cols=diffusion_comp_score_scoreterms) .. figure:: ../../assets/binder_tutorial/egfr_diff_comp_score_filter.png :align: center :figwidth: 700px Now that we filtered down our backbones sufficiently, we can continue with sequence generation and iterative optimization to maximize in silico binding affinities. Iterative Optimization ---------------------- RFdiffusion only generates backbones but not the corresponding sequences. For this, we are going to use a combination of LigandMPNN and Rosetta. To test whether our designed structures look as expected, we are going to predict the corresponding sequences and compare them to the design models. In addition, we are going to predict the binder-receptor complex and evaluate binding affinities. The best structures will be passed on for another cycle of LigandMPNN-Rosetta-LigandMPNN and structure prediction, hopefully improving binding affinities with each cycle. .. code-block:: python from protflow.tools.ligandmpnn import LigandMPNN ligandmpnn = LigandMPNN(jobstarter=gpu_jobstarter) cycle = 1 # rfdiffusion outputs continous residue numbers over chains, so our 162-AA receptor ranges from residue 151 to 313 fixed_residues = ' '.join([f'B{i}' for i in range(151, 314)]) mpnn_opts = f"--fixed_residues {fixed_residues}" # we create 5 sequenes per pose, and return the PDBs with the sequences threaded on the structures instead of the .fasta files # the receptor chain residues are kept fixed (mpnn_opts) ligandmpnn.run(poses=poses, prefix=f"cycle_{cycle}_seq_thread", nseq=5, model_type="soluble_mpnn", options=mpnn_opts, return_seq_threaded_pdbs_as_pose=True) LigandMPNN created structures with amino acid sequences out of our backbones. To improve our backbones, we are going to employ Rosetta Relax, a specialized Rosetta protocol that optimizes protein structures by minimizing energies via introduction of small movements. .. code-block:: xml .. note:: For further information on Rosetta, RosettaScripts and Rosetta-XML-protocols, please check out the `RosettaCommons `_. .. code-block:: python from protflow.tools.rosetta import Rosetta # define a cpu jobstarter for more demanding tasks cpu_jobstarter = SbatchArrayJobstarter(max_cores=1000) rosetta = Rosetta(jobstarter=cpu_jobstarter, fail_on_missing_output_poses=True) # relax poses relax_protocol = "path/to/fastrelax_interaction.xml" fr_options = f"-parser:protocol {relax_protocol} -beta" # define options for rosetta relax runs (beta weights, and path to relax xml) rosetta.run(poses=poses, prefix=f"cycle_{cycle}_rlx", nstruct=3, options=fr_options, rosetta_application="rosetta_scripts.default.linuxgccrelease") # 3 relax trajectories per pose # calculate composite score containing surface aggregation propensity, total score and interaction energy between binder and receptor (all defined in Rosetta relax xml) relax_comp_scoreterms = [ f"cycle_{cycle}_rlx_sap_score", f"cycle_{cycle}_rlx_total_score", f"cycle_{cycle}_rlx_intE_interaction_energy", ] relax_comp_weights = [1, 2, 2] poses.calculate_composite_score(name=f"cycle_{cycle}_threading_comp_score", scoreterms=relax_comp_scoreterms, weights=relax_comp_weights, plot=True) # apply filter to get best structure for each rfdiffusion output using group_col poses.filter_poses_by_rank(n=1, score_col=f"cycle_{cycle}_threading_comp_score", group_col="rfdiff_location") # generate sequences for relaxed poses ligandmpnn.run(poses=poses, prefix=f"cycle_{cycle}_mpnn", nseq=30, model_type="soluble_mpnn", options=mpnn_opts, return_seq_threaded_pdbs_as_pose=True) For each diffused backbone, we generated 5 sequences with LigandMPNN and relaxed each one 3 times. After relaxing our poses, we filtered for the best structure of each diffused pose according to a combination of Surface Aggregation Propensity, total score an interaction energy between binder and receptor. Now that we have energy-optimized backbones, we again create sequences using LigandMPNN. This combination of LigandMPNN-Rosetta-LigandMPNN improves the quality of the generated sequences for a given backbone. Next, we need to evaluate if our sequences fold into the design models. For this, we are going to use ESMFold because it offers faster inference time compared to ColabFold. Since ESMFold can only predict monomers, we have to remove the receptor first. The custom ProtFlow class ChainRemover is suited for this task. Since we are only interested in predictions with high confidence, we again filter our poses. The top 200 poses according to pLDDT that agree with the design model are passed on to the next step. .. code-block:: python from protflow.metrics.tmscore import TMalign from protflow.tools.esmfold import ESMFold from protflow.tools.protein_edits import ChainRemover chain_remover = ChainRemover(jobstarter=small_cpu_jobstarter) esmfold = ESMFold(jobstarter=gpu_jobstarter) # remove receptor chain chain_remover.run(poses=poses, prefix=f"cycle_{cycle}_rm_target", chains=["B"]) # write .fasta files without target poses.convert_pdb_to_fasta(prefix=f"cycle_{cycle}_fasta", update_poses=True) # predict sequences with ESMFOLD esmfold.run(poses=poses, prefix=f"cycle_{cycle}_esm") # filter for prediction confidence (plddt) poses.filter_poses_by_value(score_col=f"cycle_{cycle}_esm_plddt", value=70, operator=">", prefix=f"cycle_{cycle}_esm_plddt", plot=True) # calculate tm score between prediction and relaxed pose tm_score_calculator.run(poses=poses, prefix=f"cycle_{cycle}_tm", ref_col=f"cycle_{cycle}_rlx_location") # remove predictions that don't look like relaxed pose poses.filter_poses_by_value(score_col=f"cycle_{cycle}_tm_TM_score_ref", value=0.9, operator=">", prefix=f"cycle_{cycle}_tm_score", plot=True) # calculate composite score poses.calculate_composite_score(name=f"cycle_{cycle}_esm_composite_score", scoreterms=[f"cycle_{cycle}_tm_TM_score_ref", f"cycle_{cycle}_esm_plddt"], weights=[-1,-2], plot=True) # filter to cycle input poses (max 10 poses per optimization cycle input pose) poses.filter_poses_by_rank(n=10, score_col=f"cycle_{cycle}_esm_composite_score", group_col="rfdiff_location", plot=True, prefix=f"cycle_{cycle}_esm_comp_per_bb") # filter for maximum number of input poses for colabfold poses.filter_poses_by_rank(n=200, score_col=f"cycle_{cycle}_esm_composite_score", prefix=f"cycle_{cycle}_esm_comp", plot=True) .. figure:: ../../assets/binder_tutorial/egfr_af2_input_filter.png :align: center :figwidth: 700px Next, we want to evaluate if our binder actually binds to the target using Colabfold (an AlphaFold2 implementation). First, we have to add the target sequence to our poses. Our poses need to be in .fasta format. .. code-block:: python from protflow.tools.protein_edits import SequenceAdder from protflow.tools.colabfold import Colabfold, calculate_poses_interaction_pae # set up sequence adder and colabfold seq_adder = SequenceAdder(jobstarter=small_cpu_jobstarter) colabfold = Colabfold(jobstarter=gpu_jobstarter) # convert pdb to fasta poses.convert_pdb_to_fasta(prefix=f"cycle_{cycle}_complex_fasta", update_poses=True) # add target sequence seq_adder.run(prefix=f"cycle_{cycle}_target_seq", sequence=receptor_sequence) # reusing the sequence extracted pre-diffusion and adding a chain separator # define colabfold options and run it colabfold_opts = "--num-models 3 --num-recycle 3" colabfold.run(poses=poses, prefix=f"cycle_{cycle}_af2", options=colabfold_opts) # filter for high confidence predictions poses.filter_poses_by_value(score_col=f"cycle_{cycle}_af2_plddt", value=af2_plddt_cutoff, operator=">", prefix=f"cycle_{cycle}_af2_plddt", plot=True) # filter for high agreement between design model and prediction tm_score_calculator.run(poses=poses, prefix=f"cycle_{cycle}_af2_tm", ref_col=f"cycle_{cycle}_rlx_location") poses.filter_poses_by_value(score_col=f"cycle_{cycle}_af2_tm_TM_score_ref", value=0.9, operator=">", prefix=f"cycle_{cycle}_af2_tm_score", plot=True) for res in hotspot_residue_selection.to_list(): resnum, chain = get_resnum_chain(res) logging.info(f"{resnum}, {chain}, {type(chain)}") tmp.append([chain, resnum]) rescontact_opts={"max_distance": 12, "target_chain": "B", "partner_chain": "A", "target_resnum": int(res[1:]), "target_atom_names": ["CA"], "partner_atom_names": ["CA"]} rescontacts_calculator.run(poses=poses, prefix=f"cycle_{cycle}_hotspot_{chain+str(resnum)}_contacts", options=rescontact_opts) # calculate overall hotspot contacts poses.df[f"cycle_{cycle}_hotspot_contacts"] = sum([poses.df[f"cycle_{cycle}_hotspot_{res}_contacts_data"] for res in hotspot_residue_selection.to_list()]) # filter out all poses where the contact between target and binder is below the cutoff poses.filter_poses_by_value(score_col=f"cycle_{cycle}_hotspot_contacts", value=20, operator=">", prefix=f"cycle_{cycle}_hotspots_contacts", plot=True) calculate_poses_interaction_pae( poses=poses, prefix=f"cycle_{cycle}", pae_list_col=f"cycle_{cycle}_af2_pae_list", binder_start=0, # first residue of binder (0-indexed) binder_end=79, # last residue of binder target_start=80, # first residue of receptor target_end=241) # last residue of receptor (binder length + receptor length) # define colabfold composite scoreterms and weights colabfold_comp_cols = [ f"cycle_{cycle}_hotspot_contacts", f"cycle_{cycle}_af2_tm_TM_score_ref", f"cycle_{cycle}_af2_plddt", f"cycle_{cycle}_af2_iptm", f"cycle_{cycle}_pae_interaction"] colabfold_comp_weigths = [-1, -1, -2, -3, 4] # calculate a composite score of colabfold metrics poses.calculate_composite_score( name = f"cycle_{cycle}_opt_composite_score", scoreterms = colabfold_comp_cols, weights = colabfold_comp_weigths, plot = True ) #filter the poses: poses.filter_poses_by_rank( n = 5, # output poses per unique diffusion backbone score_col = f"cycle_{cycle}_opt_composite_score", prefix = f"cycle_{cycle}_opt_composite_score", plot = True, plot_cols=[ f"cycle_{cycle}_hotspot_contacts", f"cycle_{cycle}_af2_tm_TM_score_ref", f"cycle_{cycle}_af2_plddt", f"cycle_{cycle}_af2_iptm", f"cycle_{cycle}_pae_interaction" ], group_col = "rfdiff_location" ) # for checking the ouput poses.save_poses(os.path.join(poses.work_dir, f"cycle_{cycle}_output")) poses.save_scores(os.path.join(poses.work_dir, f"cycle_{cycle}_scores.json")) .. figure:: ../../assets/binder_tutorial/egfr_cycle1_results.png :align: center :figwidth: 700px We filtered our poses to the best 5 structures per unique diffusion backbone according to our composite score scoreterms, which are comprised of the number of hotspot contacts, the TM score between design model and prediction, the ColabFold pLDDT, the interface predicted template modelling score (iPTM), and the predicted aligned error (PAE). .. note:: Further information on ColabFold/AlphaFold2 pLDDT, iPTM and PAE scores can be found `here `_. Looking at the poses, you will notice that their names are quite long. This is because each tool added an index layer to the pose name. We can remove these suffixes for a cleaner look. .. code-block:: python poses.reindex_poses(prefix=f"cycle_{cycle}_reindex", force_reindex=True, group_col="rfdiff_location") # reset the number of index layers This will reset the names to the ones specified in group_col, then add another suffix layer (_0001, _0002, etc) to ensure each pose name is unique. This completes our first iterative optimization cycle. We can continue with the next cycle by moving all code inside a loop: .. code-block:: python num_cycles = 3 for cycle in range(1, num_cycles+1): # run all steps of the iterative optimization inside this loop ... .. figure:: ../../assets/binder_tutorial/egfr_cycle_results.png :align: center :figwidth: 700px AF2 Metrics for output poses of each cycle. Iterative refinement improved scores in our binder design pipeline. Final analysis -------------- After the final cycle is completed, we run Rosetta again to calculate some scores: .. code-block:: python # relax all optimized structures & calculate scores rosetta.run(poses=poses, prefix="final_rlx", nstruct=3, options=fr_options, rosetta_application="rosetta_scripts.default.linuxgccrelease") # filter for pose with lowest total score for each pose poses.filter_poses_by_rank( n = 1, score_col = "final_rlx_total_score", prefix = "final_rlx_total_score", plot = True, remove_layers = 1 # subtract one index layer then group, same result as group_col="final_rlx_location" ) # calculate a final composite score: final_comp_cols = colabfold_comp_cols + ["final_rlx_sap_score", "final_rlx_total_score", "final_rlx_interaction_score_interaction_energy"] final_comp_weights = colabfold_comp_weigths + [1, 1, 1] poses.calculate_composite_score(name="final_comp_score", scoreterms=final_comp_cols, weights=final_comp_weights, plot=True) # filter for the final top 20 poses poses.filter_poses_by_rank( n = 20, score_col = "final_comp_score", prefix = "final_comp_score", plot = True, plot_cols=final_comp_cols, ) # save poses poses.save_poses(os.path.join(poses.work_dir, "results")) # plot final scores violinplot_multiple_cols(dataframe=poses.df, cols=final_comp_cols, y_labels=final_comp_cols, out_path=os.path.join(results, "scores.png")) .. figure:: ../../assets/binder_tutorial/egfr_binders.png :align: center :figwidth: 700px Structures of truncated EGFR (green) in complex with de novo binders (magenta). .. figure:: ../../assets/binder_tutorial/egfr_final_scores.png :align: center :figwidth: 700px Metrics for top 20 output poses. This completes the binder design tutorial. The complete binder design script can be found on `GitHub `_.