Building the First Model

From Biocellion
Revision as of 00:58, 10 November 2017 by Simon Kahan (talk | contribs) (Model Descriptions)

Jump to: navigation, search

Introduction

Biocellion is a high-performance software framework written in C++ for modeling biological systems. In this tutorial, we will provide a complete walk through of the model building process in Biocellion. Biocellion has a steep learning curve, so we recommend reading the introductory chapters of the Biocellion manual to get an overview now; then using it as a reference while working through the tutorials.

The tutorial source code is maintained independently of the Biocellion simulation framework. However, it is available in the same Google Drive directory made available to you when you licensed Biocellion. You will need to download TutorialScripts.3.30.17.tgz from that directory. Use

       tar zxvf TutorialScripts.3.30.17.tgz

to create the directory tutorial_models and move its contents into your biocellion-user directory:

       mv tutorial_models/* <...directory path...>/biocellion-user

If you have any difficulties accessing the tutorial source code or anywhere else throughout the tutorials, do not hesitate to post to the Biocellion Forum asking for help: the community welcomes new members and their questions.

For the first tutorial, we will build a simple model, where we place a number of cells in the middle of a specified simulation domain. The cells will take random walks, and we will visualize the model outputs in Paraview. Not all the model_routine .cpp files (and functions) will be covered in this tutorial. They are left blank for now. However, we will progressively incorporate all 5 model_routine .cpp during the tutorial series.

This tutorial is organized in the following manner:

Model Descriptions explains four of the six required files for running this model in Biocellion:

  • tutorial.xml
  • model_define.h
  • model_routine_config.cpp
  • model_routine_agent.cpp
  • model_routine_output.cpp

Compilation describes the configuration of the corresponding model XML file, and compiling the first model.

Visualization with Paraview demonstrates how to visualize .pvtu output data.

Model Descriptions

If you have followed the instructions above, you will find the files for this tutorial in biocellion-user/1_building_the_first_model .

model_define.h

In this first model, we will place a number of cells of the same type in a cubic Simulation Domain. To do so, we will need to define all the global variables and enumerations that will be used in the model_define.h header file. Two coding conventions are to capitalize global variables and to end enumerations with “_e”. We start with the following:

       cell_type_e
       INI_N_CELLS
       A_CELL_RADIUS
       grid_summary_real_e
       model_rng_type_e
       IF_GRID_SPACING
       BASELINE_TIME_STEP_DURATION
       NUM_STATE_AND_GRID_TIME_STEPS_PER_BASELINE

First, we need to know what type of cells we will be using, how many of these cell types there are, and the sizes of the cells. We start with cell type, cell_type_e. Notice that CELL_TYPE_A is an enumerator in the enumeration declaration, cell_type_e that is used to define A_INI_N_CELLS and A_CELL_RADIUS. This is generally a good practice in Biocellion, (for example, should you choose to add additional cell types to the model, the indexing scheme keeps the model specifics organized). For this example, we will place a hundred cells, thus A_INI_N_CELLS = {100}. All the cells have fixed radius size of 2.0. Users may want to experiment with different values.

grid_summary_real_e is the enumeration declaration for GRID_SUMMARY_REAL_LIVE_CELLS, which is a Summary Output variable we will use to print the number of live cells at specific time-step interval.

MODEL_RNG_UNIFORM is the declaration for Biocellion’s Uniform Random Number Generator that will be used later. Do not use C++’s built-in rand() function, as it is not thread-safe.

IF_GRID_SPACING is the Interface Grid Spacing size. In Biocellion, the Simulation Domain is a three-dimensional grid comprised of unit boxes, and each unit box has a grid spacing size. In this example, IF_GRID_SPACING is 10.0.

BASELINE_TIME_STEP_DURATION is the the duration of biggest time-step size, “Baseline Time Steps” used in model simulation. Baseline Time Step can be further broken down to smaller time-step sizes, “State and Grid Time Steps,” for activities such as cell secretion that require smaller step-size than say, physico-mechanical interactions. NUM_STATE_AND_GRID_TIME_STEPS_PER_BASELINE is the ratio between “Baseline Time Step” and “State and Grid Time Steps.”

model_routine_config.cpp

In model_routine_config.cpp, we handle all simulation “setup.” Most functions are not necessary for this model, so they will be omitted from the tutorial. Interested users may want to read over the Biocellion 1.1 manual for explanations of all functions.

First, we define the Interface Grid Spacing size that will be used throughout the model in updateIfGridSpacing(). Although we have already defined IF_GRID_SPACING in model_define.h, it must be initialized in updateIfGridSpacing() for Biocellion to run.

In updateDomainBdryType(), we set the boundary types for the 3 simulation directions (x, y, and z). There are only two Boundary Types: DOMAIN_BDRY_TYPE_PERIODIC and DOMAIN_BDRY_TYPE_NONPERIODIC_HARD_WALL.

updatePDEBufferBdryType() sets the boundary type between an interface grid partition and PDE buffer partition. We will leave it as PDE_BUFFER_TYPE_HARD_WALL for this example.

We initialize the BASELINE_TIME_STEP_DURATION and NUM_STATE_AND_GRID_TIME_STEPS_PER_BASELINE from model_define.h in updateTimeStepInfo().

In updateSpAgentInfo(), vector v_spAgentInfo holds all agent specific information for each NUM_CELL_TYPES. This is an important step in Biocellion for defining the initial states of each cell. For our simple model, we will only use info.dMax, which is the maximum radius of interaction between a pair of cells. It is set at IF_GRID_SPACING. Users may choose to specify any other variables in the spAgentInfo class.

updateRNGInfo() sets the random number generator. RNGInfo class currently offers uniform, Gaussian, exponential, and gamma probability distributions. We initialize a uniform and Gaussian random number generators with appropriate parameters. Each probability distribution has different definitions for the rngInfo class parameters, so refer to Section 2.4.14 of the Biocellion manual for the parameter specifics.

updateFileOutputInfo() sets the model fileOutputInfo class. When fileOutputInfo.particleOutput is set ‘true’, Biocellion outputs discrete agent data. Only the particleOuput is needed in our case.

updateSummaryOutputInfo() outputs user-defined summaries during the simulation. This is an essential tool for observing what is happening with the model during simulation. The frequency of the summaries that are output in the simulation is defined in the simulation XML file. updateSummaryOutputInfo() is used in conjunction with model_routine_output.cpp’s updateSummaryVar(). In our model case, we want to see the total number of live cells as a summary output. The total number of live cells is computed in model_routine_output.cpp’s updateSummaryVar(), and stored in vector v_realVal, which is used in summary OutputInfo class of updateSummaryOutputInfo(). Refer to model_routine_output.cpp section for more information.

model_routine_agent.cpp

model_routine_agent.cpp is where “true” modeling begins. In this model routine, we can define most of a cell’s morphogenesis. In this simple example where we only care about placing cells in the Simulation Domain, only the function addSpAgents() is considered. Utilizations of other functions for more realistic modeling of morphogenesis (cell growth, death, division, secretion, etc…) will follow in subsequent tutorials.

Before beginning the discussion on addSpAgents(), it is worth noting Biocellion’s indexing scheme. As described in 1. model_define.h, Biocellion’s Simulation Domain comprises of unit boxes with size of ifGridSpacing (additionally, the Simulation Domain can be gridded into larger grid spacings in the same domain using Biocellion’s Adaptive Mesh Refinement feature to improve computational time). A specific location in the Simulation Domain can be determined by the index of one of these unit boxes and the position offsets from the center of the unit box. Each unit box is indexed in the x, y, and z directions (as a tuple) with the class, VIdx. Within each unit box, the location is the positional distance (with values of REAL data type) from the center <0.0 , 0.0, 0.0 >, with the class VReal. The offset value from the center of the unit box can not be greater than the half of ifGridSpacing.

addSpAgents() has return-by-reference vectors of v_spAgentVIdx, v_spAgentOffset, and v_spAgentState. We store locations of the cells we want to place in the v_spAgentVIdx and v_spAgentOffset with classes VIdx and VReal. We store the states of the cells in v_spAgentState with class SpAgentState.

We initialize values of agent locations using nested for-loops of NUM_CELL_TYPES (= 1) and INI_N_CELLS[0] (= 100). The locations of all cells are equal in our example. They are located in unit box index of <regionSize[0]/2 - 1, regionSize[1]/2 - 1, regionSize[2]/2 - 1>, which is the unit box located in the middle of simulation domain. The positional offsets for the x, y, z directions are 0.5 * IF_GRID_SPACING. This places the cells in the “upper right-hand corner” of the unit box, <regionSize[0]/2 - 1, regionSize[1]/2 - 1, regionSize[2]/2 - 1>. This ensures that the cells are “truly” in the middle of the simulation domain.

We store the VIdx and VReal values in v_spAgentVIdx and v_spAgentOffset by using push_back(). We also initialize the cell states (cell type and radius by default) in v_spAgentState.

The random walk of the cells is implemented in adjustSpAgent(). We adjust the position of each cell in x, y, z directions by giving a Gaussian distribution to the mean displacement of random walks of the diffusion equation: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle \bar{X}_i(t+\Delta t) = \bar{X}_i(t)=\sqrt{2D_i\Delta t}\bar{\xi}(t)} , where Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle D_i} is A_CELL_DIFFUSION_COEFF, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle t} is the BASELINE_TIME_STEP_DURATION, and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle \bar{\xi}} is a vector with components sampled from a Gaussian distribution, RNG_TYPE_GAUSSIAN, with a mean and standard deviation of 0 and 1 respectively.

model_routine_output.cpp

As mentioned in model_routine_config.cpp, we define the Summary Output (for simulation run time) in model_routine_output.cpp as well as the specifics for Agent Output (for post simulation).

updateSpAgentOutput() is used for outputting model data for external use. It can produce .data and .pvtu files, which in our case, will be used later for visualization in Paraview.

In updateSummaryVar(), we count the total number of live cells in the simulation at set intervals, specified in the XML file. We do this by invoking the class, UBAgentData. The vector v_realVal is returned.

Compilation

XML Setup

In order to run a biocellion model, a corresponding XML file (in the case of provided XML file, tutorial_part1.xml) is required to run the simulation. The XML file has two components: required, optional.

In the required portion of the XML file, set how many baseline time-step to run the simulation in <time_step>, the Simulation Domain size in <domain>, the partition size in <init_data>, and the output result settings in <output>. The output path can be set to wherever you would like to save simulation results. The output data is in the .pvtu (Paraview Parallel VTK Unstructured Data) for model visualization. For this example, the domain size for the Paraview is set exactly the same as Simulation Domain size <size_x = “64” size_y = “64” size_z = “64”> .

The optional portion of the XML can be used for a variety of use, including adding model specific parameters (<model>), changing the verbosity of simulation details (<stdout>), and improving the performance by taking advantage of Biocellion’s High-Performance Computing. For most users, the use of num_threads may be of the most interest to improve computational time. You can check the cpu specifics in Linux by using the command, ‘lscpu.’

biocellion.DP.SPAGENT.OPT

With the model header file, model routine files, and the XML all set, we are ready to compile our first model. In order for a model to compile, the header file (model_define.h) and the .cpp files have to be placed under the model directory in biocellion/libmodel/model, along with the pre-built Makefile. Biocellion compiles whatever files that exist in the biocellion/libmodel/model directory.

Command, “make” in the /model directory compiles the model. When everything is compiled without error (you can turn on debugging in Makefile.model in biocellion-user directory for debug support), execute the model simulation using the configurations from XML file by typing,

       /LOCATION_OF_BIOCELLION_ROOT/framework/main/DP.SPAGENT.OPT tutorial.xml

If you're running the tutorial scripts in the VirtualBox or gcloud images, the DP.SPAGENT.OPT path is already set as a variable in ~/.bashrc, so you can just type,

       $biocellion tutorial.xml

Check that the model has successfully finished the simulation.

Visualization with Paraview

When the simulation has successfully finished, in the directory where you set the model <output> path, there should be a series of .data, .pvtu, and .vtu files. Open the .pvtu file in Paraview. Apply the properties.

Select “Glyph” from Filters > Alphabetical > Glyph.

From Glyph Properties, set Glyph Type to “Sphere,” Glyph Radius to “1.0,” Scalar to “radius,” Scale Mode to “scalar,” Scale Factor to 1.0, and Maximum Number of Sample Points to a number greater than the number of cells. Apply.

Zoom to Data and click Play. The resulting animation should resemble the provided video.

(Optionally, color cells of the simulation in Paraview according to a scaled distance from the center, using the "Calculator" tool to match the model video.)

Next Tutorial

Cell Growth and Cell Death