1  Introduction to MoBPS

1.1 Learning Objectives

By the end of this chapter, you will:

  • Understand what MoBPS is and what it can do
  • Know how to install MoBPS and its dependencies
  • Be able to load the package and verify installation
  • Understand the basic workflow of a MoBPS simulation

1.2 What is MoBPS?

MoBPS (Modular Breeding Program Simulator) is an R package for stochastic simulation of breeding programs (Pook et al. 2020). It provides a flexible framework to:

  • Create realistic founder populations with complex genetic architectures
  • Simulate multiple generations of breeding actions (selection, mating, culling)
  • Analyze genetic gain, diversity, inbreeding, and population structure
  • Compare different breeding strategies
  • Optimize breeding programs for specific goals

1.2.1 Why Simulate Breeding Programs?

Breeding program simulation allows you to:

  1. Test strategies before implementing them in reality
  2. Understand trade-offs between genetic gain and diversity
  3. Optimize selection and mating decisions
  4. Predict long-term outcomes over many generations
  5. Train future breeders in a risk-free environment

1.2.2 Key Features

MoBPS offers exceptional flexibility:

  • ✅ Works with both plant and animal breeding
  • ✅ Supports single or multiple traits
  • ✅ Handles additive, dominance, and epistatic effects
  • ✅ Simulates realistic genomic data or uses real data
  • ✅ Integrates with external software for breeding value prediction (BLUP, GBLUP, Bayesian methods)
  • ✅ Provides rich visualization and analysis tools
  • ✅ Scales from small experiments to large commercial programs

1.3 Installation

1.3.1 System Requirements

MoBPS requires:

  • R version 3.0 or higher (R 4.0+ recommended)
  • For large-scale simulations: RandomFieldsUtils and miraculix (optional but highly recommended)
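
The version requirement can be checked from within R before installing anything. The following is a minimal sketch using base R only; the thresholds simply mirror the versions stated above.

```r
# Compare the running R version against the stated minimum and recommended versions
r_version <- getRversion()
meets_minimum <- r_version >= "3.0.0"
meets_recommended <- r_version >= "4.0.0"

message("R version: ", as.character(r_version))
if (!meets_minimum) {
  stop("MoBPS requires R 3.0 or higher.")
} else if (!meets_recommended) {
  message("R 4.0 or higher is recommended.")
}
```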

1.3.2 Step 1: Platform-Specific Prerequisites

1.3.2.1 Windows Users

Some Windows systems require Rtools:

  1. Download from: https://cran.r-project.org/bin/windows/Rtools/
  2. Install the version matching your R version
  3. Verify with: pkgbuild::find_rtools() or Sys.which("make")

If Rtools is installed, Sys.which("make") returns a path such as:

"C:\\rtools44\\usr\\bin\\make.exe"

Remember: you do NOT load Rtools with library() like an R package. Rtools is a set of compilers and build tools that R uses on your machine.

To further test whether Rtools is ready to go, run the following in the R console:

pkgbuild::check_build_tools(debug = TRUE)

If this check fails, add Rtools to your PATH for the current session with:

Sys.setenv(PATH = paste(
  "C:/rtools44/usr/bin",
  Sys.getenv("PATH"),
  sep = ";"
))

Replace rtools44 with the directory for your Rtools version.

To make this change permanent on Windows, add the following to your .Rprofile:

# Make Rtools available
if (dir.exists("C:/rtools44/usr/bin")) {
  Sys.setenv(PATH = paste(
    "C:/rtools44/usr/bin",
    Sys.getenv("PATH"),
    sep = ";"
  ))
}

Again, substitute the directory for your Rtools version (for example rtools45 or a later release).
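
Because .Rprofile runs at every startup and may be sourced more than once in a session, it can help to avoid adding the same directory to the PATH repeatedly. The sketch below is one way to do that with base R; prepend_to_path() is a hypothetical helper written for this example, not part of Rtools or MoBPS.

```r
# Hypothetical helper: prepend a directory to PATH only if it exists
# and is not already listed, so repeated calls do not create duplicates.
prepend_to_path <- function(dir) {
  if (!dir.exists(dir)) {
    return(invisible(FALSE))
  }
  sep <- .Platform$path.sep                      # ";" on Windows, ":" elsewhere
  current <- strsplit(Sys.getenv("PATH"), sep, fixed = TRUE)[[1]]
  already_listed <- normalizePath(dir, winslash = "/", mustWork = FALSE) %in%
    normalizePath(current, winslash = "/", mustWork = FALSE)
  if (!already_listed) {
    Sys.setenv(PATH = paste(c(dir, current), collapse = sep))
  }
  invisible(TRUE)
}

# Adjust the directory to your Rtools version
prepend_to_path("C:/rtools44/usr/bin")
```

The function returns FALSE (invisibly) when the directory does not exist, so the same line is harmless on machines without Rtools.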

1.3.2.2 Mac Users

Mac users need Xcode Command Line Tools and gfortran (the Mac equivalent of Rtools). The easiest path is via Homebrew.

Step 1 — Install Xcode Command Line Tools:

xcode-select --install

A dialog will appear asking you to install. Click “Install” and wait for it to complete. Verify with:

xcode-select -p
# Should print: /Library/Developer/CommandLineTools

Step 2 — Install Homebrew (if not already installed):

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

For Apple Silicon (M1/M2/M3) Macs, follow any post-install instructions to add Homebrew to your PATH:

echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"

Step 3 — Install gfortran via Homebrew:

brew install gcc

This installs gfortran (bundled with GCC), which R needs to compile packages with Fortran code (including miraculix). Verify:

gfortran --version
# Should print: GNU Fortran (Homebrew GCC ...) ...

Step 4 — Optional: Install R via Homebrew (if not already installed):

brew install --cask r

Or download the .pkg installer from https://cran.r-project.org/bin/macosx/ if you prefer.

1.3.2.3 Linux Users

Linux setup varies depending on whether you have admin (sudo) privileges or are working on a shared HPC/cluster system.


1.3.2.3.1 Option 1: Desktop/Server with sudo Access

Install build tools, gfortran, and supporting libraries:

# Ubuntu / Debian
sudo apt-get update
sudo apt-get install build-essential gfortran liblapack-dev libblas-dev \
  libcurl4-openssl-dev libssl-dev libxml2-dev

# Fedora / RHEL / Rocky Linux / AlmaLinux
sudo dnf install gcc gcc-gfortran gcc-c++ lapack-devel blas-devel \
  libcurl-devel openssl-devel libxml2-devel make

# openSUSE / SLES
sudo zypper install gcc gcc-fortran gcc-c++ lapack-devel blas-devel \
  libcurl-devel libopenssl-devel libxml2-devel

Install R (if not already installed):

# Ubuntu / Debian (add the CRAN repository first if you need the latest R)
sudo apt-get install --no-install-recommends r-base r-base-dev

# Fedora / RHEL
sudo dnf install R
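
Once the build tools are installed, a quick check from within R can confirm that they are visible on the PATH. This is a minimal sketch using base R only; the tool names are the ones installed above, and an empty path means the tool was not found.

```r
# Look up the build tools on the PATH; an empty string means "not found"
tools <- c("make", "gcc", "gfortran")
found <- Sys.which(tools)

status <- data.frame(
  tool = tools,
  path = unname(found),
  available = unname(nzchar(found)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(status)

if (!all(status$available)) {
  message("Missing: ", paste(status$tool[!status$available], collapse = ", "))
}
```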

1.3.2.3.2 Option 2: HPC / Cluster (module system, no sudo required)

Most HPC systems use Environment Modules (module command). Load R and the required compiler toolchain before starting R or submitting jobs.

Check what’s available:

module avail R
module avail gcc
module avail intel   # Intel compilers also work

Load the modules (adjust version numbers to what your cluster provides):

module load R/4.3.1          # or whatever version is available
module load gcc/12.2.0       # provides gfortran
module load openblas         # BLAS/LAPACK for linear algebra performance

Make it persistent — add to your ~/.bashrc or ~/.bash_profile so modules load automatically:

echo 'module load R/4.3.1' >> ~/.bashrc
echo 'module load gcc/12.2.0' >> ~/.bashrc
source ~/.bashrc

In SLURM job scripts, load modules at the top of the script:

#!/bin/bash
#SBATCH --job-name=mobps_sim
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00

module load R/4.3.1
module load gcc/12.2.0

Rscript my_simulation.R
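
Inside the R script, the resources requested from SLURM can be read from environment variables rather than hard-coded. The following is a sketch of how the top of my_simulation.R might look. Using SLURM_ARRAY_TASK_ID as a seed is an assumption added for illustration (the job script above is not an array job), and the fallbacks let the script run outside SLURM as well.

```r
# Sketch of the top of my_simulation.R (contents are illustrative)

# SLURM sets SLURM_CPUS_PER_TASK inside a job; fall back to 1 elsewhere
n_cores <- suppressWarnings(
  as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
)
if (is.na(n_cores) || n_cores < 1L) {
  n_cores <- 1L
}

# Assumption: use the array task ID (if any) to give each job its own seed
task_id <- suppressWarnings(
  as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID", unset = "1"))
)
if (is.na(task_id)) {
  task_id <- 1L
}
set.seed(task_id)

message("Using ", n_cores, " core(s), seed ", task_id)
```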

1.3.2.3.3 Option 3: HPC / Cluster (no sudo, no module system)

If your cluster lacks the needed modules, use Conda/Mamba to install R and compilers entirely in your home directory — no admin privileges needed.

Install Miniforge (lightweight Conda with conda-forge defaults):

# Download and install Miniforge
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh -b -p ~/miniforge3
~/miniforge3/bin/conda init bash
source ~/.bashrc

Create an environment with R and compilers:

conda create -n mobps -c conda-forge r-base r-devtools gfortran_linux-64 \
  openblas lapack -y
conda activate mobps

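Install R packages from an R session inside the environment (same as normal, no sudo needed):

# devtools is already provided by r-devtools above; install it only if missing
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")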
devtools::install_github("tpook92/MoBPS", subdir="pkg")

Set a personal R library path (useful even outside Conda, e.g. if R is already on the cluster but you want to install your own packages):

# Create the library directory once
mkdir -p ~/R/library

# Add this line to ~/.bashrc
export R_LIBS_USER=~/R/library

Then in R, verify it’s on your search path:

.libPaths()
# ~/R/library should appear first
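
The same setup can also be done from within an R session, which is useful when you cannot edit your shell configuration. This is a sketch using base R; use_personal_library() is a hypothetical helper, and the change lasts only for the current session.

```r
# Hypothetical helper: create a library directory if needed and put it
# first on the library search path for the current session
use_personal_library <- function(path) {
  path <- path.expand(path)
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(path, .libPaths()))
  invisible(.libPaths()[1])
}

# Example call, mirroring the R_LIBS_USER value above:
# use_personal_library("~/R/library")
```

After the call, install.packages() installs into the first entry of .libPaths() by default.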

HPC Tip: Ask Your Sysadmin

If MoBPS or its dependencies fail to compile, ask your HPC support team to load or install gcc with Fortran support (gfortran). Most clusters already have this — it may just need a module load gcc to activate it.

1.3.4 Step 3: Install MoBPS

Install the latest MoBPS from GitHub (recommended over CRAN for latest features):

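# Install devtools first if it is missing
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
# Install MoBPS from GitHub
devtools::install_github("tpook92/MoBPS", subdir = "pkg")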

Why GitHub over CRAN?

  • GitHub has the latest features and bug fixes
  • Regular updates and improvements
  • CRAN version may be outdated (last update: November 2021)

1.3.5 Step 4: Install Genetic Maps (Optional)

For working with real species data:

# Install MoBPS maps package (includes Ensembl maps)
devtools::install_github("tpook92/MoBPS", subdir="pkg-maps")

This package includes genetic maps for common species from Ensembl.

1.3.6 Step 5: Verify Installation

Load the package and check version:

# Load MoBPS
library(MoBPS)

# Check version
packageVersion("MoBPS")

# Should see something like: '1.13.0' or higher
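
If you prefer a scripted check, for example at the top of a simulation script, the comparison can be automated. The sketch below uses base R only; has_package_version() is a hypothetical helper, and the threshold is just the example version shown above.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer
has_package_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  utils::packageVersion(pkg) >= min_version
}

# The version threshold is the example value shown above
if (has_package_version("MoBPS", "1.13.0")) {
  message("MoBPS is installed and up to date.")
} else {
  message("MoBPS is missing or older than expected; reinstall from GitHub.")
}
```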

1.3.7 Additional Packages

MoBPS can integrate with various prediction software. These are optional and only needed for specific analyses:

# For genomic prediction
install.packages("rrBLUP")    # GBLUP
install.packages("BGLR")      # Bayesian methods
install.packages("sommer")    # Mixed models

# For visualization
install.packages("ggplot2")   # Advanced plotting
install.packages("viridis")   # Color palettes
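
To avoid reinstalling packages you already have, you can first check which of the optional packages are missing. This is a minimal sketch in base R; the package list is taken from the block above, and the install call is left commented out so nothing is installed without your say-so.

```r
optional_pkgs <- c("rrBLUP", "BGLR", "sommer", "ggplot2", "viridis")

# Keep only the packages that cannot be found in any library on .libPaths()
is_installed <- vapply(optional_pkgs, function(pkg) {
  nzchar(system.file(package = pkg))
}, logical(1))
missing_pkgs <- optional_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)   # uncomment to install them
} else {
  message("All optional packages are installed.")
}
```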

1.4 Basic Workflow Overview

Every MoBPS simulation follows the same basic structure:

1.4.1 1. Create a Founder Population

Use creating.diploid() to initialize your population:

population <- creating.diploid(
  nsnp = 1000,          # Number of SNP markers
  nindi = 100,          # Number of individuals
  n.additive = 100,     # Number of QTLs
  mean.target = 100     # Mean breeding value
)

1.4.2 2. Simulate Breeding Actions

Use breeding.diploid() to perform selection, mating, and other actions:

population <- breeding.diploid(
  population,
  selection.size = c(10, 10),  # 10 males, 10 females
  breeding.size = c(50, 50)     # Generate 50 male, 50 female offspring
)
# Generate phenotypes for generation 2
population <- breeding.diploid(
  population,
  phenotyping.gen = 2
)
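
Since each call returns the updated population, several breeding cycles can be run in a loop. The sketch below simply repeats the calls shown above and runs only if MoBPS is installed. It assumes that breeding.diploid() draws parents from the most recent generation when none is specified; n_cycles and the seed are illustrative values.

```r
n_cycles <- 3   # illustrative number of breeding cycles
completed <- 0

if (requireNamespace("MoBPS", quietly = TRUE)) {
  library(MoBPS)
  set.seed(42)  # set R's random seed before simulating

  population <- creating.diploid(
    nsnp = 1000, nindi = 100, n.additive = 100, mean.target = 100
  )

  for (cycle in seq_len(n_cycles)) {
    # Same call as above: 10 males and 10 females produce 100 offspring
    population <- breeding.diploid(
      population,
      selection.size = c(10, 10),
      breeding.size = c(50, 50)
    )
    completed <- completed + 1
  }
}
message("Breeding cycles completed: ", completed)
```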

1.4.3 3. Extract and Analyze Results

Use get.*() functions to extract information:

# Get breeding values of gen 2
bv <- get.bv(population, gen = 2)

# Get phenotypes of gen 2
pheno <- get.pheno(population, gen = 2)

# Calculate inbreeding for gen 2
inbreeding <- inbreeding.exp(population, gen = 2)
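
Extracted values are ordinary R objects, so they can be summarized with base R. The sketch below computes the change in mean breeding value between two groups. It assumes the breeding values arrive as a matrix with one row per trait and one column per individual. genetic_gain() is a hypothetical helper, and the numbers are toy stand-ins rather than simulation output.

```r
# Hypothetical helper: change in mean breeding value per trait between two
# groups. Assumes matrices with one row per trait, one column per individual.
genetic_gain <- function(bv_old, bv_new) {
  stopifnot(is.matrix(bv_old), is.matrix(bv_new), nrow(bv_old) == nrow(bv_new))
  rowMeans(bv_new) - rowMeans(bv_old)
}

# Toy stand-in values (not simulation output): one trait, three individuals each
bv_gen1 <- matrix(c(98, 100, 102), nrow = 1, dimnames = list("Trait 1", NULL))
bv_gen2 <- matrix(c(101, 103, 105), nrow = 1, dimnames = list("Trait 1", NULL))

genetic_gain(bv_gen1, bv_gen2)
```

With a real population object, the two inputs would be the results of get.bv(population, gen = 1) and get.bv(population, gen = 2).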

1.4.4 4. Visualize Results

Use built-in plotting functions:

# Track genetic gain
bv.development(population)

# PCA plot
get.pca(population, gen = 1:2)

# Kinship development
kinship.development(population, gen = 1:2)

1.5 The Population Object

The core of MoBPS is the population object (usually called population or pop):

  • It’s an R list containing all simulation data
  • Stores genotypes, phenotypes, pedigrees, trait architectures
  • Updated by each breeding.diploid() call
  • Can be saved/loaded for reproducibility

Think of it as a complete database of your breeding program.
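
Because the population object is a regular R list, base R's saveRDS() and readRDS() are one way to save and reload it. The sketch below uses a small stand-in list so that it runs without MoBPS; the element names and the file name are placeholders and do not reflect the actual internal layout.

```r
# Stand-in list; in practice this would be the object returned by
# creating.diploid() or breeding.diploid()
population <- list(part_a = list(note = "placeholder"), part_b = 1:5)

# Save to disk and restore; the file name is illustrative
path <- file.path(tempdir(), "population_gen2.rds")
saveRDS(population, path)
restored <- readRDS(path)

# Check that the round trip preserved the list
identical(population, restored)

# Show only the top-level structure of the list
str(restored, max.level = 1)
```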

1.6 Getting Help

1.6.1 Within R

# Function documentation
?creating.diploid
?breeding.diploid

# See all MoBPS functions
help(package = "MoBPS")
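
To narrow the function list down, for example to the get.*() extractors mentioned earlier, you can filter the exported names by pattern. This is a sketch in base R; find_functions() is a hypothetical helper and returns an empty vector if the package is not installed.

```r
# Hypothetical helper: exported functions of an installed package whose
# names match a regular expression
find_functions <- function(pkg, pattern) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  sort(grep(pattern, getNamespaceExports(pkg), value = TRUE))
}

# List the get.*() extractor functions (empty if MoBPS is not installed)
find_functions("MoBPS", "^get\\.")
```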

1.6.2 Online Resources

1.6.3 Important Notes

Warning: Take Simulation Results with Caution!

MoBPS won’t stop you from choosing biologically impossible settings (e.g., collecting milk records from bulls). Always check that your simulation setup makes biological sense!

Tip: Don’t Get Overwhelmed!

MoBPS has many parameters, but you only need a few for most simulations. Start simple, then add complexity as needed. Use Ctrl+F to search the documentation for specific features.

1.7 What’s Next?

In the next chapter, we’ll dive into core concepts that are essential for understanding how MoBPS works:

  • Individual grouping (generations, databases, cohorts)
  • The population list structure
  • How traits are represented
  • Time flow in simulations

Let’s continue to Chapter 2: Core Concepts!

1.8 Summary

  • MoBPS is a flexible R package for breeding program simulation
  • Install from GitHub for latest features
  • Optional performance packages (miraculix) are highly recommended
  • Basic workflow: Create → Breed → Analyze → Visualize
  • The population object stores all your simulation data
  • Help is available through documentation and direct contact

Pook, Torsten, Martin Schlather, and Henner Simianer. 2020. “MoBPS—Modular Breeding Program Simulator.” G3: Genes, Genomes, Genetics 10 (6): 1915–18. https://doi.org/10.1534/g3.120.401193.