micro biology question

wormer

Science

15 Mar '17 15:49

wormer
Joined
08 Sep '06
Moves
24735
15 Mar '17 15:49
With age, somatic cells are thought to accumulate genomic scars as a result of the inaccurate repair of double-stranded breaks by NHEJ. Estimates based on the frequency of breaks in primary human fibroblasts suggest that by the age of 70 each human somatic cell carries some 2000 NHEJ-induced mutations due to inaccurate repair. If these mutations were distributed randomly around the genome, how many genes would you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
wormer
Joined
08 Sep '06
Moves
24735
15 Mar '17 15:56
I'm not quite sure how I'm supposed to go about answering this question. I need help forming a plan for the calculations.
twhitehead
Cape Town
Joined
14 Apr '05
Moves
52945
15 Mar '17 17:10 (2 edits)
Originally posted by wormer
If these mutations were distributed randomly around the genome, how many genes would you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
If one mutation occurs, it has a 2.5% chance of being in a 'crucial region'.
Next you need to figure out the probability of at least one of the total number of mutations being in the crucial region. I am afraid I don't know the formula, but this site might help:

http://www.mathgoodies.com/lessons/vol6/independent_events.html

An equivalent would be picking a single ball from a bag multiple times and always putting it back. If 2.5% of the balls are black, what is the probability of picking at least one black ball in 2000 attempts?

[edit] Actually you asked how many black balls one would expect to pick. So rather more complicated.
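
For reference, the "at least one" probability for the ball-and-bag picture follows from the complement rule for independent events. The snippet below is only a minimal sketch, assuming every pick is independent with the same chance p; the function name prob_at_least_one is made up for illustration, and 0.025 and 2000 are the figures from the question as posted.

```python
def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent trials succeeds.

    Assumption: each trial succeeds independently with the same probability p
    (the ball is put back in the bag after every pick).
    """
    # Complement rule: P(at least one) = 1 - P(none) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Figures as posted in the question: 2.5% crucial, 2000 mutations.
    print(prob_at_least_one(0.025, 2000))
```

With these figures the result is extremely close to 1, which is why the more interesting quantity is the expected number of hits rather than whether any hit occurs.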
twhitehead
Cape Town
Joined
14 Apr '05
Moves
52945
15 Mar '17 17:16
Could it really be this simple?
The probability that each mutation is in a critical region is 2.5%. Therefore 2.5% of the mutations will be in critical regions.
Therefore the answer is 2.5% of 2000 or 4.5
wormer
Joined
08 Sep '06
Moves
24735
15 Mar '17 17:37
Originally posted by wormer
With age, somatic cells are thought to accumulate genomic scars as a result of the inaccurate repair of double-stranded breaks by NHEJ. Estimates based on the frequency of breaks in primary human fibroblasts suggest that by the age of 70 each human somatic cell carries some 2000 NHEJ-induced mutations due to inaccurate repair. If these mutations were distributed ra ...[text shortened]... you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
Correction: 2% of the genome (1.5% coding and 0.5% regulatory).
wormer
Joined
08 Sep '06
Moves
24735
15 Mar '17 18:00
Originally posted by twhitehead
Could it really be this simple?
The probability that each mutation is in a critical region is 2.5%. Therefore 2.5% of the mutations will be in critical regions.
Therefore the answer is 2.5% of 2000 or 4.5
These numbers make no sense.
twhitehead
Cape Town
Joined
14 Apr '05
Moves
52945
15 Mar '17 19:44
Originally posted by wormer
These numbers make no sense.
Follow the logic, not the numbers. Yes, I got the numbers wrong.
So with your correction, 2% of the mutations hit 'critical regions'.
2% of 2000 = 40
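
A quick way to sanity-check the figure of 40 is to simulate it. This is just a rough sketch, assuming each of the 2000 mutations independently lands in a critical region with probability 0.02; the function name simulate_mean_hits, the number of simulated cells and the seed are arbitrary choices for illustration.

```python
import random


def simulate_mean_hits(n_mutations: int, p_hit: float, n_cells: int, seed: int = 0) -> float:
    """Average number of critical-region hits per cell over n_cells simulated cells."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_hits = 0
    for _ in range(n_cells):
        # Each mutation is an independent trial that hits with probability p_hit.
        total_hits += sum(1 for _ in range(n_mutations) if rng.random() < p_hit)
    return total_hits / n_cells


if __name__ == "__main__":
    # 2000 mutations, 2% critical, averaged over 500 simulated cells.
    print(simulate_mean_hits(2000, 0.02, 500))
```

The average should come out near 40, with individual cells scattering a few either side of that.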
DeepThought
Losing the Thread
Quarantined World
Joined
27 Oct '04
Moves
87415
16 Mar '17 03:06
Originally posted by wormer
With age, somatic cells are thought to accumulate genomic scars as a result of the inaccurate repair of double-stranded breaks by NHEJ. Estimates based on the frequency of breaks in primary human fibroblasts suggest that by the age of 70 each human somatic cell carries some 2000 NHEJ-induced mutations due to inaccurate repair. If these mutations were distributed ra ...[text shortened]... you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
Suppose the probability of a coding mutation is x (which you seem to be saying is 2%, so 0.02). There are N mutations in total (N = 2000). Then the average number of coding mutations is:

<n> = 0 * probability of all non-coding mutations + 1 * probability of exactly 1 coding mutation + 2 * probability of exactly 2 coding mutations + ... + N * probability that all mutations are in coding DNA.

Let's look at the typical term: we need to know the probability of exactly n coding mutations. The probability of getting n coding mutations in a row is x^n (x to the power of n). The probability of then getting (N - n) non-coding mutations is (1 - x)^(N - n). We have to take into account that we can get our n coding mutations and (N - n) non-coding mutations in any order. The number of such orderings is given by the binomial coefficient (which I'll write C(N, n)). So the typical term in the above sum is:

n * C(N, n) * x^n * (1 - x)^(N - n)

To sum this we need a new variable y = x / (1-x), and we can rewrite the typical term as:

n * C(N, n) * y^n * (1 - x)^N

So the average number of coding mutations is now:

<n> = (1 - x)^N * sum(n = 0 ... N) n * C(N, n) * y^n

We can use d/dy y^n = n y^(n - 1), so that n y^n = y * d/dy y^n, to do the sum:

<n> = y*(1 - x)^N * d/dy sum(n = 0 ... N) C(N, n) * y^n

The sum is now straightforward:

<n> = y * (1 - x)^N * d/dy (1 + y)^N = y * (1 - x)^N * [N * (1+y)^(N - 1)]

1 + y = 1/(1 - x) so that:

<n> = [x/(1 - x)] * [(1 - x)^N] * N * [1/(1 - x)]^(N - 1) = Nx = 2000 * 0.02 = 40

So twhitehead got the right answer.
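
If you don't trust the algebra, the sum can also be added up term by term. The snippet below is a minimal sketch, not part of the derivation itself: the function name binomial_mean_by_sum is invented for illustration, and it builds each term from the previous one using P(n + 1) = P(n) * (N - n)/(n + 1) * y, with y = x/(1 - x) as above, so that the huge binomial coefficients never have to be computed directly.

```python
def binomial_mean_by_sum(N: int, x: float) -> float:
    """Evaluate <n> = sum over n of n * C(N, n) * x^n * (1 - x)^(N - n).

    Assumptions: 0 < x < 1, and (1 - x)^N is large enough not to
    underflow to zero in floating point (true for N = 2000, x = 0.02).
    """
    y = x / (1.0 - x)
    term = (1.0 - x) ** N  # probability of n = 0 coding mutations
    mean = 0.0
    for n in range(N):
        # Step from P(n) to P(n + 1) without forming C(N, n) explicitly.
        term *= (N - n) / (n + 1) * y
        mean += (n + 1) * term
    return mean


if __name__ == "__main__":
    print(binomial_mean_by_sum(2000, 0.02))  # should agree with N * x = 40
```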

The only catch is if we have to take into account the possibility that a coding mutation is in a critical gene which produces a highly conserved protein and the mutation kills the cell. Some of these mutations might kill the organism, for example if it is on the PrP gene causing CJD before age 70. So we need to factor out mutations that kill cells or the entire organism. If there are m coding bases in total of which p are critical coding bases and the genome is length l, then where x was m/l we'd need to replace it with (m - p)/(l - p). If p is small compared with m then don't worry about it.
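
To get a feel for how much the (m - p)/(l - p) correction matters, here is a small sketch. Every number in it is hypothetical and chosen only for illustration: a round genome length of 3e9 bases, 2% of it coding or regulatory as in the thread, and an assumed 1% of those bases being lethal when hit. The function name adjusted_coding_fraction is likewise made up.

```python
def adjusted_coding_fraction(m: float, p: float, l: float) -> float:
    """Return (m - p) / (l - p), the coding fraction once the p lethal bases are excluded."""
    if not (0 <= p <= m <= l) or p == l:
        raise ValueError("expected 0 <= p <= m <= l and p < l")
    return (m - p) / (l - p)


if __name__ == "__main__":
    # All values below are illustrative assumptions, not measured figures.
    l = 3.0e9       # genome length in bases (assumed round number)
    m = 0.02 * l    # coding plus regulatory bases, 2% as in the thread
    p = 0.01 * m    # assumed: 1% of those bases are lethal when mutated
    N = 2000
    print(N * m / l)                              # uncorrected expectation
    print(N * adjusted_coding_fraction(m, p, l))  # corrected expectation
```

With p this small compared with m the two expectations differ by well under one mutation, which is the sense in which it can be ignored.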
twhitehead
Cape Town
Joined
14 Apr '05
Moves
52945
16 Mar '17 16:50
Originally posted by DeepThought
The only catch is if we have to take into account the possibility that a coding mutation is in a critical gene which produces a highly conserved protein and the mutation kills the cell. Some of these mutations might kill the organism, for example if it is on the PrP gene causing CJD before age 70. So we need to factor out mutations that kill cells or t ...[text shortened]... ed to replace it with (m - p)/(l - p). If p is small compared with m then don't worry about it.
A good point about evolution. It's not quite clear to me, though, what you are calculating.
The question states that there are 2000 mutations at age 70 - which means the cells involved (and the organism) survived to age 70, so the reality is that there were likely more mutations, some of which occurred in super critical regions but were weeded out by evolution (cell or organism death).
DeepThought
Losing the Thread
Quarantined World
Joined
27 Oct '04
Moves
87415
16 Mar '17 17:50
Originally posted by twhitehead
A good point about evolution. It's not quite clear to me, though, what you are calculating.
The question states that there are 2000 mutations at age 70 - which means the cells involved (and the organism) survived to age 70, so the reality is that there were likely more mutations, some of which occurred in super critical regions but were weeded out by evolution (cell or organism death).
I assume you mean the (m - p)/(l - p) bit. That's just a way of excluding mutations that kill the cell (or the entire organism) from the calculation. As an analogy, imagine shuffling a pack of cards and turning over the top card. Drawing a joker corresponds to a coding mutation; if the card turned over is the bridge scoring card, that ends the game. If one does this twelve times (say) then, since we've specified that the bridge scoring card has not been drawn, I think that one gets the right probability if one just does the calculation for a normal pack with two jokers and with the bridge scoring card absent.
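
The card analogy can be checked exactly with fractions. This is only a sketch under stated assumptions: the pack is 52 ordinary cards, 2 jokers and 1 scoring card, and it is reshuffled before every turn so the twelve draws are independent. The function names are made up for illustration.

```python
from fractions import Fraction
from math import comb


def conditional_joker_pmf(draws: int, jokers: int, others: int) -> list:
    """P(k jokers | scoring card never drawn) for k = 0..draws.

    The pack holds `jokers` jokers, `others` ordinary cards and 1 scoring card.
    """
    total = jokers + others + 1
    p_no_score = Fraction(jokers + others, total) ** draws
    pmf = []
    for k in range(draws + 1):
        # Joint probability of exactly k jokers and no scoring card at all.
        joint = (comb(draws, k)
                 * Fraction(jokers, total) ** k
                 * Fraction(others, total) ** (draws - k))
        pmf.append(joint / p_no_score)
    return pmf


def reduced_pack_pmf(draws: int, jokers: int, others: int) -> list:
    """Plain binomial distribution for a pack with the scoring card removed."""
    p = Fraction(jokers, jokers + others)
    return [comb(draws, k) * p ** k * (1 - p) ** (draws - k) for k in range(draws + 1)]


if __name__ == "__main__":
    print(conditional_joker_pmf(12, 2, 52) == reduced_pack_pmf(12, 2, 52))
```

The two distributions agree term by term, which is the same cancellation that turns m/l into (m - p)/(l - p).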
twhitehead
Cape Town
Joined
14 Apr '05
Moves
52945
16 Mar '17 18:40
Originally posted by DeepThought
If one does this twelve times (say) then, since we've specified that the bridge scoring card has not been drawn, I think that one gets the right probability if one just does the calculation for a normal pack with two jokers and with the bridge scoring card absent.
I agree.
