本文对Emergent simplicity in microbial community assembly——中使用的交叉互养模型的代码分析

原始文献Goldford J E , Lu N , Bajic D , et al. Emergent Simplicity in Microbial Community Assembly[J]. Science, 2018, 361(6401):469-474.
原始代码链接https://github.com/jgoldford/mcrm.

文章说明和系列内容

本系列更新主要是关于2018年一篇群落结构组成的文献进行阅读的分析和总结,将全文的过程进行分解。主要分为下面几个篇章。本系列的全部内容都为原创,由于个人水平有限,整理过程可能对一些理解不到位甚至错误,请辩证的阅读。

本文字数共(11703)大于阅读5分钟

  • 论文全过程详细阅读整理
  • 全文分析
  • 文献模型过程推导
  • 代码整理过程注释

本文代码运行说明

 matlab2018b(原始代码环境matlab2015a)
 本文对原始代码进行整理,加入了一下自己的思考。适合初学者了解文献中实现模型的过程。
 对原始文献部分地方进行了调整(具体见代码部分)

代码过程

下图是对文献中的代码流程的一个汇总绘制,其中关于涉及的具体公式参考文献内部和本系列数学推导过程在这里插入图片描述

对文献中实现过程的思考

  • 持续更新 2020年11月25日
    文献在对群落中代谢过程的分析可以从采样模型进行改进,采取傅里叶变化的方式对代谢规律进行分析,观察群落中不同代谢物质在群落持续变化对资源的利用效果。分析在不同科水平中加入一下生化相关的信息。

代码结果展示

文献具体图在上一篇文章有上传在这里插入图片描述
结果输出:在这里插入图片描述

代码过程

主函数部分

%% 时间 2020年11月25日08:55:26
%% 版本 matlab 2018b
%% 注释  尚帝
%% 原始代码地址 https://github.com/jgoldford/mcrm.
%% 整理邮箱: 尚帝@163.com
%% %%
%% 设置群落的参数属性
% define hyperparamters
% the total number of species in your "world"

Total_Number_Of_Species = 1000;

% the number of species in your ecosystem (i.e. inocula)
numSpecies = 100;

% the total number of metabolic families
numGroups = 5;

% the total number of resources (must equal or exceed numGroups)
numResource = 10; %资源总数
% x 获取资源数目
% dirichlet hypoerparameter (controls the variablity of the family
% metabolism)
Total = 100;
%% 生成分泌代谢矩阵
% get a secretion matrix for everyone %分泌矩阵
D = getMetabolism(numResource,'rand',1/numResource);

% determine how much of the community you want to be specialists vs.
% generalists
% more generalist, set to 1/M, and more specialist set to 0.9
specialist = 0.5;

specialistVariation = 0.01;
T=10;  %外源物质转化效率/摄取资源量
% 每个科水平中 每个物质的消耗水平
priors = arrayfun(@(x) getConsumerPriors(T,x,specialist,specialistVariation,numResource)',1:numGroups,'uni',0);
%每个菌属的代谢矩阵
[out,~]=makePhyloConsumers(Total_Number_Of_Species,numResource,numGroups,priors,Total,true);

% if you don't want community struture, use the function
% getRandomConsumerMatrix


% randomly sample a sub population:
k = randsample(1:Total_Number_Of_Species,numSpecies);
% construct an ecosystem
params = mcrm_params(1,numSpecies,out.C(k,:),D,'eye','');

% run ODE solver for ecosystem
r = run_mcrm(params);
figure();
subplot(1,3,1);
plot(r.species);
ylabel('species abundance');



g = out.group(k((r.communityStruct > 1e-4)));
% x = r.communityStruct(r.communityStruct > 1e-4);
% 
% subplot(1,3,2);
% pie(x)
%% 探究群落中丰富度对稳定性的影响,分析群落中物种的丢失率
% compute the coarse grained structure of the community
x = coarseGrainCommunityStructure(r.communityStruct,out.group(k),numGroups);
%figure();
subplot(1,3,2) 
x(x<1e-10) = 0;
pie(x+1e-20)
legend(arrayfun(@(x) ['Family ',num2str(x)],1:numGroups,'uni',0));

% how  stable is the community to species loss?
% this function reduces the parameter structure to only include species
% that survived in the prior simulation
p_new = removeExtinctSpecies(params,r,1e-4);


% knockout a species (i.e. set it's consumption rates to zero)

I = zeros(p_new.num_species);

for ko_species = 1:p_new.num_species
    params_knockout = knockoutSpecies(p_new,ko_species);
    ko_sim = run_mcrm(params_knockout);
    
    % compute the number of species that went co-extinct
    extinct = find(ko_sim.communityStruct < 1e-4);%寻找物种丰富度低的
    I(ko_species,extinct) = 1;
    %extinct = setdiff(extinct,ko_species);
    
end

%compute average species loss 
avgLoss = (sum(sum(I)) - length(I))/length(I);
fractionalLoss = avgLoss/p_new.num_species;
disp(strcat('平均损失的物种数目',num2str(fractionalLoss)))
%基于傅里叶变化分析代谢的种类差异 非原始代码
subplot(1,3,3) 
Y = fft(r.resources,[],2);
P2 = abs(Y/(size(Y,1)*size(Y,2)));
P1 = P2(1:size(r.resources,2)); %以样本长度读取代谢信号
P1(2:end-1) = 2*P1(2:end-1); %去两倍信号延拓
plot([1:length(P1)],P1)
title('代谢的能力的傅里叶变化')
xlabel('10种资源')
ylabel('资源代谢频率分布')
% surf(r.species)

功能实现函数(非运行顺序)

function [coarsedGrainedAbundanceVec] = coarseGrainCommunityStructure(abundanceVector,groupVector,maxGroups)
% function sums the total abundance within each group (groupVector conains the group id for each species)
coarsedGrainedAbundanceVec = arrayfun(@(y) sum(abundanceVector(groupVector==y)),1:maxGroups);
end
function r = drchrnd(a,n)
p = length(a);
r = gamrnd(repmat(a,n,1),1,n,p);
r = r ./ repmat(sum(r,2),1,p);
end

微生物群落随机部分更新

function [params_out] = evolve_member(params,percentChange)
%EVOLVE_MEMBER takes a parameter structure, copies a random member and
%"mutates" the consumer vector by a max. percent change.  This new member
%is appended into the mcrm structure

%   Detailed explanation goes here
species = randsample(1:params.num_species,1);
total = sum(params.C(species,:));
uptake = params.C(species,:)./total;
dir = binornd(1,[0.5]);
resource = randsample(1:params.num_resources,1);

if dir > 0
    uptake(resource) = uptake(resource) + percentChange;
else
    uptake(resource) = uptake(resource) - percentChange;
end
uptake(uptake<0) = 0;
uptake = uptake./sum(uptake);

% now there is a small chance of changing the total enzyme capacity through
% some global mutation
p_change = 0;
q = binornd(1,p_change);
if q == 1
   total = total + total.*normrnd(0,0.01);
end

c = uptake.*total;

d = 1e-2.*params.x0(params.varIdx.species(species));

% append a new member to the community structure
params_out = appendMember(params,c,d,0);


end


设定群落中微生物的能量代谢古城,相关代谢矩阵的属性

function [R] = getConsumerPriors(T,r,specialist,specialistVar,numResources)
% function generates a single consumer vector R, that represents the total
% uptake capacity for a group of organisms

% the specialist paramaters controls the propotion of total uptake capacity
% (T) that is dedicated to uptake resource "r"

% input
%     T = total uptake capacity (e.g. controlled by fixed amount of enzymes, transporters)
%     specialist, specialistVar = mean and std to randomly draw the
%     fraction of uptake capcity dedicated to consumer resource "r"
%     r = index of resource for which a consumer will be a "specialist" on
%     numResources = total number of resources 

% initialize resource consumer vector
R = zeros(numResources,1);
% pick a dominate resource
if isempty(r)
    r = randsample(1:numResources,1);
end
%draw proportion of fixed resources 固定资源获取比例
f = normrnd(specialist,specialistVar); %正态分布的一随机数,(平均获取能力,均差)
% ensure f falls between 0 and 1
if f<0
    f = 0;
elseif f>1
    f = 1;
end


% choose other resource consumption proportions  from a uniform
% distribution
R = rand(numResources,1);
% (T-f*T) 剩余资源  除了自身以外的消耗速率百分比R(setdiff(1:numResources,r))./sum(R(setdiff(1:numResources,r)))'
R(setdiff(1:numResources,r)) = (T-f*T).*R(setdiff(1:numResources,r))./sum(R(setdiff(1:numResources,r)))';

% compute fraction of total uptake capacity dedicated to consuming resource
% "r"
R(r) = f*T;

end

初始物种丰富度

function [params] = knockoutSpecies(params,speciesIdx)
% Set species abundance to zero
%
params.C(speciesIdx,:) = zeros(length(speciesIdx),params.num_resources);

end

按科水平进行对微生物的代谢资源种类

function [out,priors] = makePhyloConsumers(num_species,num_resources,num_groups,priors,Total,nonNormal)
% function to make sets of consumer matricies drawn from priors,
% representing "families" of consumers.


% if no prior family-level consumer vectors, make some
if isempty(priors)
priors = arrayfun(@(x) Total.*rand(1,num_resources),1:num_groups,'uni',0);
else
end

% draw consumption vectors from dirichlet distribution
% 每次运行返回一个群落数目{(物种数目×资源数目)}的元细胞
C = cellfun(@(y) drchrnd(y,num_species), priors,'uni',0);

% assign wich consumers are part of which family
group_labels = arrayfun(@(y) y.*ones(1,num_species), 1:num_groups,'uni',0);
group_labels = cell2mat(group_labels);

C_total = cell2mat(C');
% randomly sample a sub-population
idx = randsample(1:size(C_total,1),num_species);
C = C_total(idx,:);
group_labels = group_labels(idx);


% breaks the constraint that there has to be a "fixed enzyme budget per
% species"

if nonNormal %运行有随机的浮动代谢
    % allow for a ~2 percent change in the total uptake capacity
    u = normrnd(1,0.01,[num_species,1]);
    C = diag(u)*C;
    
    
%     for i = 1:max(group_labels)
%         u = normrnd(1,0.01);
%         C(group_labels==i,:) = C(group_labels==i,:).*u;
%         uptakeFactor(i) = u;
%     end
   % out.uptakeFactor = uptakeFactor;
      
end

% construct output structure
out.C = C;
out.group = group_labels;
[~,out.k]=sort(out.group);



end

物种随机抽样

function [C] = getRandomConsumerMatrix(numSpecies,numResources,ctype)
% function to greate random consumer matricies
% if ctype is normalized, then make sure sum_alpha c_i,alpha = 1

if ~isfield(ctype,'normalized')
    normalizedBool = true;
else
    normalizedBool = false;
end
    


% randomly sample resource utilization matrix (C) from uniform distribution
C = rand(numSpecies,numResources);
% normalize distribution for each species
if normalizedBool
    C = cell2mat(cellfun(@(x) x./sum(x), num2cell(C,2),'uni',0));
end


end

设定代谢物质矩阵的生成方式(文献中为随机)

function [ D ] = getMetabolism(numResources,flag,p)
%GETMETABOLISM Function used to generate random stoichiometric matricies


switch flag
    case 'thermo'
        w = [numResources:-1:1];
        % construct D matrix
        d = fliplr(w)./numResources;
        D = cell2mat(arrayfun(@(x) [zeros(1,x),d(1:end-x)],1:numResources,'uni',0)');
        D = D';
        % rescale D s.t. W > D'W
        D = 0.3*D;
    case 'sparse'
        w = [numResources:-1:1];
        % construct D matrix
        d = fliplr(w)./numResources;
        D = cell2mat(arrayfun(@(x) [zeros(1,x),d(1:end-x)],1:numResources,'uni',0)');
        D = D';
        % rescale D s.t. W > D'W
        D = 0.3*D;
        
        M = binornd(1,p,[numResources,numResources]);
        D = M.*D;
    case 'rand'
        D = rand(numResources);
        scale = p;
        D = D.*scale;
    case'zeros'
        D = ones(numResources);
end
        

        
end


研究丰富度对群落的影响,去除不能生存的微生物(代谢的物质不足,或者丰富度低于预设值)

function [ params ] = removeExtinctSpecies(params,simulation,minAbundance)
%REVMOVEEXTINCTSPECIES removes species at the end of the simulation that
%are zero
%  

x = simulation.species(end,:);
x(x<0) = 0;
k = x > minAbundance;
params.num_species = sum(k);
params.varIdx.species = 1:(params.num_species);
params.varIdx.resources = (params.num_species+1):(params.num_resources + params.num_species);
x0 = [x(k)';simulation.resources(end,:)'];
params.x0 = x0;
params.C = params.C(k,:);
params.death_rate = params.death_rate(k);


end


资源竞争模型计算模拟

%% 资源竞争模型计算模拟
function [output] = run_mcrm(params)

% parse input
num_species = params.num_species;
num_resources = params.num_resources;
varIdx = params.varIdx;
C = params.C;
D = params.D;
%Q = params.Q;
W = params.W;
B = params.B;
death_rate = params.death_rate;
mu = params.mu;
T = params.T;
tau = params.tau;
x0 = params.x0;
timeSteps = params.timeStep;
alpha = params.alpha;

% create a function handle for dynamics
y = @(t,x) population_dynamics(t,x,num_species,C,D,W,T,mu,alpha,death_rate,B,tau,varIdx);

% solve ODE
[T,Y] = ode15s(y,1:timeSteps,x0);

% parse  species, resource abundanes and consumer_matrix into output structure
output.species = Y(:,varIdx.species);
output.resources = Y(:,varIdx.resources);
%output.consumer_matrix = C;
%output.weights = W;
%output.production_matrix = D;
output.communityStruct = output.species(end,:);
output.environmentalStruct = output.resources(end,:);



end
%%求解微分方程
function [dx] = population_dynamics(t,x,num_species,C,D,W,T,mu,alpha,death_rate,B,tau,varIdx)
    dx = zeros(sum(length(varIdx.resources)+num_species),1);
    % uptake, consumption matrix
    %C = update_consumption_matrix(C,x(varIdx.resources),diag(W));

    % calculate growth rate multiplier ([sum_j w_j * c_ij * r_j - T])
    growth_rate_multiplier = C*W*x(varIdx.resources) - T;
    % calculate consumption rate of resources
    consumption = (C*diag(x(varIdx.resources)))'*x(varIdx.species);
    % compute production rate 
    production = D*consumption;
    
    % calculate differential
    dx(varIdx.species) = x(varIdx.species)*diag(mu).*growth_rate_multiplier - death_rate.*(x(varIdx.species));
    dx(varIdx.resources) =  (alpha-x(varIdx.resources))./tau - consumption + production + B.*sum(death_rate.*x(varIdx.species));
    %dx = dx';
        
end

群落属性更新(忘了具体干啥的了。。。。)

function [params] = appendMember(params,c,abundance,death_rate)
%APPENDMEMBER takes a parmater structure and appends a new member (consumer vector) onto this structure
%   inputs:
%		 params : mcrm paramater structure
%	     c : consumer vector (1XM dim)
%	     abundance : initial abundance of this consumer
%		 dealth rate: rate that this species dies relative to per capita growth (usually set to 0)
 

params.x0 = [params.x0(params.varIdx.species);abundance;params.x0(params.varIdx.resources)];
params.varIdx.species = [params.varIdx.species,params.varIdx.species(end)+1];
N = length(params.varIdx.species);
params.num_species = N;
params.varIdx.resources = (N+1):N+length(params.varIdx.resources);

params.C = [params.C;c];
params.death_rate = [params.death_rate;death_rate];

end

总结

2020年11月25日
。。。。下次再说
Logo

瓜分20万奖金 获得内推名额 丰厚实物奖励 易参与易上手

更多推荐