2  Core Concepts

2.1 Learning Objectives

By the end of this chapter, you will understand:

  • The gen/database/cohorts system for grouping individuals
  • How sex is handled in MoBPS
  • The structure of the population object
  • How time flows through generations
  • Key terminology and concepts

2.2 Individual Grouping

One of the biggest challenges in breeding simulation is having the flexibility to perform operations on specific groups of individuals. MoBPS provides three powerful ways to select groups:

  1. Generations (gen) - Select all individuals from specific generation(s)
  2. Cohorts (cohorts) - Select named groups with specific characteristics
  3. Database (database) - Precise selection by generation, sex, and individual range

2.2.1 Understanding Generations (gen)

Every time you create offspring with breeding.diploid(), they are assigned to a new generation. Generations are numbered sequentially starting from 1.

# Create founder population (generation 1)
population <- creating.diploid(
                nsnp = 1000,      # 1000 total SNP
                nindi = 100,      # 100 individuals in First Cohort, 50 Males + 50 Females
                mean.target = 0,  # target mean TBV
                var.target = 1,   # target variance of TBV
                n.additive = 100, # 100 QTL
                name.cohort = "Founders"
                )

# Create generation 2
# select 10 males and 10 females from gen 1
# mate them to create 50 new males and 50 new females
population <- breeding.diploid(population,
                               selection.size = c(10, 10), # select 10 males and 10 females as parents
                               breeding.size = c(50, 50)   # generate new cohort, 50 males and 50 females
                               )

# extract true breeding values (TBV) for generation 2 - returns a matrix (one row per trait, one column per individual)
bv_gen2 <- get.bv(population, gen = 2)

# extract True Breeding Value (BV) from both generations
bv_multi <- get.bv(population, gen = 1:2)  # Generations 1 and 2

Key points:

  • Generations track when individuals were born
  • Useful for age-structured populations
  • Easy to analyze trends over time
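
To make the "trends over time" point concrete, here is a minimal base-R sketch of a per-generation summary. It does not call MoBPS: the list bv_by_gen is a stand-in for the results of get.bv(population, gen = 1) and get.bv(population, gen = 2), filled with illustrative random numbers, and it assumes one row per trait and one column per individual.

```r
set.seed(42)

# Stand-ins for get.bv(population, gen = g); the values are illustrative only.
bv_by_gen <- list(
  gen1 = matrix(rnorm(100, mean = 0.0), nrow = 1),
  gen2 = matrix(rnorm(100, mean = 0.5), nrow = 1)
)

# Mean true breeding value per generation.
mean_bv <- sapply(bv_by_gen, mean)

# Change in the mean from one generation to the next.
gain <- diff(mean_bv)

print(round(mean_bv, 2))
print(round(gain, 2))
```

In a real simulation you would replace the stand-in matrices with the get.bv() calls shown above and keep the sapply() and diff() steps unchanged.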

2.2.2 Named Groups: Cohorts

Cohorts are named groups of individuals you define. They’re incredibly useful for tracking:

  • Different selection lines (e.g., “HighYield”, “LowFat”)
  • Breeding groups (e.g., “NucleusHerd”, “CommercialLine”)
  • Treatment groups (e.g., “Tested”, “Controls”)

# Create offspring and assign to named cohort
population <- breeding.diploid(
  population,
  selection.size = c(10, 10),  # select 10 males and 10 females
  breeding.size = c(50, 50),   # generate 50 males and 50 females
  name.cohort = "SelectedLine"  # Give this group (cohort) a name
)

# Later, extract results such as the TBV by name
bv_selected <- get.bv(population, cohorts = "SelectedLine")

# Select multiple cohorts
multi_cohorts <- get.bv(population,
                         cohorts = c("Founders", "SelectedLine"))

Best practices:

  • Use descriptive names: “TopSires”, “TestGroup1”, “Line_A”
  • Keep naming consistent across simulations
  • Use cohorts when generation number alone isn’t enough

2.2.3 Precise Selection: Database

The database parameter gives you surgical precision. It’s a matrix where each row specifies:

  1. Generation number
  2. Sex (1 = male, 2 = female)
  3. First individual to include (optional)
  4. Last individual to include (optional)

# create 1 by 4 matrix with info from above
database <- matrix(c(1, 1, 1, 20), ncol = 4)

# Extract TBV of males 1-20 from generation 1
males <- get.bv(population, database = database)

# Select a specific range (4-column: gen, sex, first, last)
db_sub <- rbind(
  c(1, 1, 1, 20),    # Males 1-20 from gen 1
  c(1, 2, 5, 15)     # Females 5-15 from gen 1
)
bv_db_sub <- get.bv(population, database = db_sub)

To select all animals of a given generation and sex, pass a 1x2 matrix (generation, sex) as database.

# Select ALL individuals of a sex/gen: pass only 2 columns (gen, sex)
# MoBPS auto-fills start=1 and end=N for that group
db_all_males <- matrix(c(2, 1), ncol = 2)  # All males from gen 2
bv_all_males <- get.bv(population, database = db_all_males)

To select males 1-20 from gen 1 together with all males from gen 2, use the following code.

# When mixing full-range and specific-range rows, write all four columns explicitly:
# rbind() recycles a 2-element row (c(2, 1) becomes c(2, 1, 2, 1)), silently giving a wrong range
db_combined <- rbind(
  c(1, 1, 1, 20),                                    # Males 1-20 from gen 1
  c(2, 1, 1, population$info$size[2, 1])             # All males from gen 2
)
bv_combined <- get.bv(population, database = db_combined)
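
The row-building rule above can be checked in base R without MoBPS. The sketch below first shows what rbind() does with rows of unequal length, then uses a hypothetical helper, db_row(), that always returns four columns. The size matrix is an illustrative stand-in for population$info$size (rows = generation, columns = sex).

```r
# Stand-in for population$info$size; the counts are illustrative.
size <- matrix(c(50, 50,
                 50, 50), ncol = 2, byrow = TRUE)

# rbind() recycles the shorter vector instead of filling in a range.
recycled <- rbind(c(1, 1, 1, 20), c(2, 1))
print(recycled[2, ])  # 2 1 2 1, i.e. "individuals 2 to 1", not "all males of gen 2"

# Hypothetical helper (not part of MoBPS): always build a complete 4-column row.
# If first/last are omitted, the row covers the whole gen + sex group.
db_row <- function(size, gen, sex, first = 1, last = size[gen, sex]) {
  c(gen, sex, first, last)
}

db_combined <- rbind(
  db_row(size, gen = 1, sex = 1, first = 1, last = 20),  # males 1-20 from gen 1
  db_row(size, gen = 2, sex = 1)                         # all males from gen 2
)
print(db_combined)
```

A helper like this keeps every row complete, so rows with and without an explicit range can be stacked safely.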

Remember: individual numbering restarts at 1 within each generation and sex combination, so each database row covers individuals 1 to n of its own group.

To see this, run the following code. get.*() functions can be nested like this to extract the database entries of all cohorts.

get.database(population, cohorts=get.cohorts(population))

When to use database:

  • Need specific individual ranges
  • Complex selection criteria
  • Combining multiple precise selections

2.2.4 Combining Extraction Methods

You can combine gen, database, and cohorts in a single call:

# add 3 generations for example
population <- breeding.diploid(population, selection.size = c(10, 10), breeding.size = c(50, 50))
population <- breeding.diploid(population, selection.size = c(10, 10), breeding.size = c(50, 50))
population <- breeding.diploid(population, selection.size = c(10, 10), breeding.size = c(50, 50))

# Select from generations 4-5, specific males from gen 3, AND a cohort
database <- matrix(c(3, 1, 21, 50), ncol = 4)

# extract TBV vector based on gen + database + cohorts
bv <- get.bv(population,
             gen = 4:5,              # All of gen 4 & 5
             database = database,     # Males 21-50 from gen 3
             cohorts = "Founders")    # Plus the Founders cohort

This gives you incredible flexibility to work with exactly the individuals you need!

2.3 Sex Handling

MoBPS has a flexible approach to sex:

2.3.1 Traditional Two-Sex Systems

By default, each individual is assigned a sex code: 1 = male, 2 = female.

Use the sex.quota argument to set the proportion of females:

# Control sex ratio in founders
population <- creating.diploid(
  nsnp = 1000,
  nindi = 100,
  sex.quota = 0.2  # 20% female (or 20 females in this example)
)

The sex.s argument takes a vector of 1s and 2s, one entry per individual, to set each sex explicitly:

# Or specify exactly
population <- creating.diploid(
  nsnp = 1000,
  nindi = 100,
  sex.s = c(rep(1, 40), rep(2, 60))  # 40 males, 60 females
)
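
A sex.s vector can be built and sanity-checked in base R before it is passed to creating.diploid(). The sketch below does not call MoBPS. The variable names nindi and n_female are illustrative, and it assumes the coding stated above (1 = male, 2 = female).

```r
nindi    <- 100  # should match the nindi passed to creating.diploid()
n_female <- 60   # illustrative count

# 40 males followed by 60 females
sex_s <- c(rep(1, nindi - n_female), rep(2, n_female))

# Checks worth running before calling creating.diploid(sex.s = sex_s)
stopifnot(length(sex_s) == nindi)    # one entry per individual
stopifnot(all(sex_s %in% c(1, 2)))   # only valid sex codes

print(table(sex_s))
```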

2.3.2 Flexible Sex Usage

Important: Sex assignments are not binding for breeding operations!

  • An individual stored as “female” can be used as father
  • Useful for plant breeding where sex may not be fixed
  • Useful for modeling hermaphrodites or aquaculture

2.3.3 One-Sex Mode

For organisms without meaningful sex distinctions:

# Deactivate two-sex system
population <- creating.diploid(
  nsnp = 1000,
  nindi = 100,
  one.sex.mode = TRUE  # All individuals in "sex 1"
)

This automatically adjusts breeding.size, selection.size, etc. to work with a single group.

2.3.4 Using Sex as Structure

Even in plants, you can use “sex” to organize populations:

  • Sex 1 = Gene pool A
  • Sex 2 = Gene pool B

This provides convenient structure for tracking different groups!

2.4 The Population Object

The population object is the heart of MoBPS. It’s an R list containing everything about your breeding program.

2.4.1 Main Components

The population object has two major sections:

# Examine top-level structure
names(population)
# [1] "info"     "breeding"

# General information
names(population$info)
# QTL effects, genetic maps, trait info, etc.

# $breeding is a nested, unnamed list indexed by position: [[gen]][[sex]][[individual]]
# names() returns NULL because elements are addressed by integer index, not by name.
# To inspect it, use:

length(population$breeding)         # number of generations stored
population$info$size                # matrix of individual counts: rows=gen, cols=sex (1=M, 2=F)
str(population$breeding, max.level = 2)  # overview of nesting depth
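
If the [[gen]][[sex]][[individual]] nesting is unfamiliar, a toy list in base R shows the same indexing pattern. This is a deliberately simplified stand-in, not the real layout: actual MoBPS individuals hold many fields, and $breeding may hold further entries per generation. The names toy_breeding and toy_size are hypothetical.

```r
# Toy stand-in: 2 generations x 2 sexes; each individual is an empty placeholder list.
toy_breeding <- list(
  list(replicate(3, list(), simplify = FALSE),   # gen 1, sex 1: 3 individuals
       replicate(2, list(), simplify = FALSE)),  # gen 1, sex 2: 2 individuals
  list(replicate(4, list(), simplify = FALSE),   # gen 2, sex 1: 4 individuals
       replicate(4, list(), simplify = FALSE))   # gen 2, sex 2: 4 individuals
)

print(names(toy_breeding))   # NULL: positions only, no names
print(length(toy_breeding))  # 2 generations

# Rebuild a size matrix (rows = generation, columns = sex) from the nesting.
toy_size <- t(sapply(toy_breeding, function(gen) sapply(gen, length)))
print(toy_size)

# Positional access: individual 2 of sex 1 in generation 2.
ind <- toy_breeding[[2]][[1]][[2]]
```

The rebuilt toy_size plays the same role here that population$info$size plays for a real population object.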

2.4.2 What’s Stored?

In $info (population-level):

  • Genetic map (chromosome structure, marker positions)
  • QTL effects and locations
  • Trait names and architectures
  • Correlation structures
  • Fixed effects

In $breeding (individual-level):

  • Genotypes/haplotypes for each individual
  • Breeding values (true genetic values)
  • Phenotypes (observed values)
  • Pedigree information
  • Genotyping status
  • Cohort assignments

2.4.3 Accessing the Population Object

While you can access elements directly, it’s better to use get.*() functions:

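# DON'T DO THIS (fragile, depends on the internal layout):
# returns the raw internal record of male 6 from generation 2
ind_raw <- population$breeding[[2]][[1]][[6]]

# DO THIS INSTEAD (clear, robust):
# true breeding value of male 6 from generation 2
bv <- get.bv(population, database = matrix(c(2, 1, 6, 6), ncol = 4))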

Available getter functions:

  • get.bv() - Breeding values
  • get.pheno() - Phenotypes
  • get.geno() - Genotypes
  • get.pedigree() - Pedigree
  • get.map() - Genetic map
  • Many more! (See Chapter 17)

2.5 Classes: Advanced Grouping

Classes provide another layer of organization:

# Assign individuals to class 1
population <- set.class(population,
                        gen = 2,
                        new.class = 1)

# Only phenotype class 1
population <- breeding.diploid(
  population,
  phenotyping.gen = 2,
  phenotyping.class = 1  # Only these get phenotyped
)

Special classes:

  • Class 0 (default): Normal active individuals
  • Class -1: Culled/dead animals (automatically assigned)
  • Class 1+: User-defined groups

Use classes for:

  • Active vs. culled animals
  • Test groups vs. controls
  • Multiple herds/flocks with different management

2.6 Time and Generation Flow

Understanding how time works in MoBPS is crucial:

2.6.1 Sequential Generations

Each breeding.diploid() call creates the next generation:

# Gen 1: Founders
pop <- creating.diploid(nsnp = 1000, nindi = 100)

# Gen 2: First offspring
pop <- breeding.diploid(pop,
                        selection.size = c(10, 10),
                        breeding.size = c(50, 50))

# Gen 3: Second offspring generation
pop <- breeding.diploid(pop,
                        selection.size = c(10, 10),
                        breeding.size = c(50, 50))

Generation numbers are automatic and sequential.

2.6.2 Overlapping Generations

You can have overlapping generations by selecting parents from multiple generations:

# Use individuals from generation 2 AND 3 as parents
pop <- breeding.diploid(pop,
                        selection.m.gen = 2,      # select Males from gen 2
                        selection.f.gen = 3,      # select Females from gen 3
                        breeding.size = c(50, 50)) # generate 50 new males and 50 new females

This creates generation 4 from a mix of gen 2 and gen 3 parents, mirroring the overlapping-generation response framework described by Hill (1974).

2.6.3 Age Structure

MoBPS doesn’t explicitly track age, but you can model it:

  • Use generations as age cohorts
  • Use cohorts to track birth years
  • Select parents from specific generations to control age
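
One way to apply the last point is to compute which generations are old enough, but not too old, to serve as parents. The helper below is a hypothetical base-R sketch and not a MoBPS function. It treats one generation as one unit of age, and its result could be passed to arguments such as selection.m.gen or selection.f.gen.

```r
# Hypothetical helper: generations whose age (in generations) lies between
# min_age and max_age when generation `current_gen` is being created.
eligible_parent_gens <- function(current_gen, min_age = 1, max_age = 2) {
  gens <- (current_gen - max_age):(current_gen - min_age)
  gens[gens >= 1]  # drop generations that do not exist
}

print(eligible_parent_gens(4))               # 2 3
print(eligible_parent_gens(2))               # 1
print(eligible_parent_gens(6, max_age = 3))  # 3 4 5
```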

See Section 6.20 in the full manual for detailed age structure examples.

2.7 Genetic Pools and Crossbreeding

Founder pools track the origin of genome segments:

# Create two founder populations
pop1 <- creating.diploid(nsnp = 1000, nindi = 50,
                         founder.pool = 1)  # Pool 1

pop2 <- creating.diploid(nsnp = 1000, nindi = 50,
                         founder.pool = 2)  # Pool 2

Why use pools?

  • Track breed composition in crossbreeding
  • Assign breed-specific QTL effects
  • Analyze admixture and introgression
  • Model heterosis and breed complementarity

You can later query which parts of the genome came from which pool using get.pool().

2.8 Key Terminology Recap

Term               Definition
Generation         Time point when individuals were born (sequential)
Cohort             Named group of individuals with similar characteristics
Database           Precise selection by gen, sex, and individual range
Class              Category for management actions (0=active, -1=culled, 1+=custom)
Sex                Male (1) or female (2), flexibly used
Pool               Founder population origin for tracking breed composition
Population object  R list containing all simulation data

2.9 Practical Tips

Tip: Start Simple

Don’t try to use all features at once! Start with just gen for selecting individuals, then add cohorts and classes as needed.

Tip: Naming Conventions

Use consistent naming:

  • Cohorts: “Line_A”, “Test_2024”, “HighYield”
  • Variables: pop or population for the population object
  • Clear generation references in comments

Warning: Don’t Lose Your Population Object!

Always save important population objects:

saveRDS(population, "my_population_gen10.rds")
# Later:
population <- readRDS("my_population_gen10.rds")
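
It can be worth confirming that a saved object reloads unchanged before you rely on it. The base-R sketch below uses a small stand-in list instead of a real population object, writes to a temporary directory, and builds the file name from a generation counter. The names toy_population, gen, and path are illustrative.

```r
# Small stand-in for a population object (a real one is much larger).
toy_population <- list(
  info     = list(size = matrix(c(50, 50), ncol = 2)),
  breeding = list()
)

gen  <- 10
path <- file.path(tempdir(), sprintf("my_population_gen%02d.rds", gen))

saveRDS(toy_population, path)
restored <- readRDS(path)

# The round trip should return an identical object.
stopifnot(identical(restored, toy_population))
print(basename(path))  # my_population_gen10.rds
```

Putting the generation number in the file name makes it easier to keep several checkpoints of the same simulation side by side.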

2.10 Summary

  • Three selection systems: gen (generations), cohorts (named groups), database (precise)
  • Flexible sex: Can be biological sex, gene pools, or organizational structure
  • Population object: Central data structure storing all simulation information
  • Classes: Additional grouping for management (0=active, -1=culled, custom)
  • Generations flow sequentially: Each breeding.diploid() creates the next generation
  • Pools track origins: Useful for crossbreeding and admixture

2.11 What’s Next?

Now that you understand the core concepts, let’s put them into practice!

In Chapter 3: Your First Simulation, you’ll create a complete breeding program from start to finish.

Hill, William G. 1974. “Prediction and Evaluation of Response to Selection with Overlapping Generations.” Animal Science 18 (2): 117–39. https://doi.org/10.1017/S0003356100017372.