Correlation functions and power spectra
We're going to create fake random data that produces a noise signal as a function of time. Call the noise at time t, x(t). This could be many things: noise coming out of an electrical circuit as a function of time, fluctuations in light intensity, pressure fluctuations, or the position of a small sphere moving in a constrained fluid environment. There are general principles for analyzing noise, and we're going to investigate them here.
Assuming we have oodles of data for x(t), what can we do with it? We can look at its average properties. What does the word "average" mean? It can mean many things. It could be that you do an experiment every day for a year, and you're averaging over different days. But that may not be the most efficient way to gather information. Most of the time there's no difference between an experiment done on a Tuesday and one done on a Wednesday, or even between what's happening at 1PM and 1:05PM. So instead of doing things on different days, we can take a continuous stream of data and perform averages on that. For example, we can compute its mean,
⟨x⟩. This means summing up all the data and dividing by the number of points. Although this is easy to do, it's often not terribly interesting, and in what follows we can take the mean to be 0.
So what else can you do with the data? You can get its histogram, i.e. how often x lies between 0.1 and 0.11, etc. More crudely, you can just look at the variance of this distribution, which is ⟨x²⟩. Often the data forms a Gaussian distribution in any case, so the full distribution you'll see is a "bell shaped" curve with the computed variance.
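As a concrete sketch (using NumPy, with a seed, sample size, and standard deviation I've chosen purely for illustration), here is how you might compute the mean, variance, and histogram of a simulated noise stream:

```python
import numpy as np

# Fake noise stream: 100,000 Gaussian samples with mean 0 and standard deviation 2
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=100_000)

mean = x.mean()                           # should come out close to 0
var = x.var()                             # should come out close to 2**2 = 4
counts, edges = np.histogram(x, bins=50)  # how often x lies in each small bin
```

Plotting `counts` against the bin centers would show the "bell shaped" curve described above.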
But there are lots of different noisy signals that have the same variance yet look radically different. That's because, even though the noise traces look random, statistically speaking not all random functions look the same. You can have data that darts quickly between positive and negative values, or data which is much more highly correlated and slowly meanders from positive to negative values.
How can you distinguish between data with different time characteristics? The most important way to do so is the "two point autocorrelation function" (or correlation function for short). Again taking the mean to be zero, we define this as

C(t) = ⟨x(0)x(t)⟩

where here I'm being inefficient and averaging over different data sets (or days, as in the above discussion). What we're saying is that we'll ask how the value of x at some time, arbitrarily set to 0, correlates with its value at time t. This means that if x(t) darts rapidly between positive and negative values, C(t) will quickly go to zero as t increases. On the other hand, if the noise x(t) is more sluggish, it'll stay correlated with its value at t = 0 for longer, and the correlation function will remain non-zero.
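In practice you estimate C(t) from a single long stream by averaging over time origins rather than over separate data sets. A minimal sketch (the function name and parameters here are my own, not from the assignment code):

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Estimate C(t) = <x(0) x(t)> by averaging products over all time origins."""
    x = x - x.mean()  # enforce zero mean, as assumed in the definition
    n = len(x)
    return np.array([np.mean(x[:n - lag] * x[lag:]) for lag in range(max_lag)])

# For white noise, C(0) is the variance and C(t) drops immediately to ~0
rng = np.random.default_rng(1)
white = rng.normal(size=50_000)
C = autocorrelation(white, 5)
```

For the sluggish, correlated kind of noise discussed above, C would instead decay gradually with the lag.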
Correlation functions tell us a huge amount about x(t). In fact, if the histogram for x is Gaussian, the correlation function tells you everything about its statistics! So this is a very important thing to understand. You'll see in the next problem how it tells us a lot about the dynamics of a particle in an optical trap, but there are a huge number of applications of correlations aside from this!
Another way of looking at all of this is in the frequency domain. This is often done in experiments, because electronics have traditionally been more suited to this mode of investigation. By the frequency domain, I mean the usual Fourier transform. So what does noise look like in frequency space? Actually, you are looking at it right now: incoherent light, which is 99% of the light around you. When you sum a bunch of waves with different frequencies together, but with random phases, you get noise! So conversely, noise looks like the sum of a bunch of sine waves with random amplitudes and phases.
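You can see this concretely with a short sketch (the number of waves, the frequencies, and the seed are arbitrary choices of mine) that builds a noisy-looking trace by summing sine waves with random amplitudes and phases:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 2000, endpoint=False)   # one second, 2000 samples

n_waves = 500
freqs = np.arange(1, n_waves + 1)             # integer harmonics
amps = rng.normal(size=n_waves)               # random amplitudes
phases = rng.uniform(0, 2 * np.pi, n_waves)   # random phases

# Summing many sine waves with random phases produces a noise trace
x = sum(a * np.sin(2 * np.pi * f * t + p)
        for a, f, p in zip(amps, freqs, phases))
```

By the central limit theorem, the histogram of this sum of many independent waves is itself approximately Gaussian, connecting back to the bell-shaped histograms above.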
This means that the Fourier transform of noise looks like noise, but not quite the same. If you Fourier transform "white noise", a kind of noise that darts infinitely quickly back and forth between negative and positive values, you find that the amplitudes of all the frequency components have the same statistics. There is no increase or decrease in amplitude as the frequency of the wave varies. It's called "white" because that's approximately what you have with white light (very approximately).
But no noise is truly white. Instead, the amplitudes of the noise will vary with frequency. So let's call the Fourier transform of x(t), x̂_ω. What characterizes different noise is ⟨|x̂_ω|²⟩, that is, the variance of the amplitudes of the waves as a function of frequency. This is often referred to as the "power spectrum".
This would also seem like an important way of characterizing noise, but it appears to be different from the correlation function. Or is it? We now come to what is known as the Wiener-Khinchin theorem: the Fourier transform of the power spectrum is the autocorrelation function of the position, sometimes just called the correlation function. In symbols, call x̂_ω = F(x(t)). Then

⟨|x̂_ω|²⟩ = T_meas F(C(t))

where C(t) is explained above and T_meas is the total time over which x(t) is measured. Don't worry too much about that prefactor; depending on the definition of the Fourier transform, you'll see a slightly different result. The important point is that the Fourier transform of the correlation function is the power spectrum (and pretty much vice-versa).
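You can check the theorem numerically with a discrete Fourier transform, where the number of samples N plays the role of T_meas (a sketch under NumPy's FFT conventions; other conventions move the prefactor around):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4096
x = rng.normal(size=N)

# Power spectrum: squared magnitudes of the Fourier amplitudes
power = np.abs(np.fft.fft(x)) ** 2

# Circular correlation function, averaged over time origins
C = np.array([np.mean(x * np.roll(x, -m)) for m in range(N)])

# Wiener-Khinchin: the Fourier transform of C(t) reproduces the power
# spectrum, up to the prefactor N (the analogue of T_meas here)
power_from_C = (N * np.fft.fft(C)).real
```

With these conventions the agreement is exact up to floating-point error, since the circular correlation and the squared FFT magnitudes are algebraically equivalent.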
So let's see how we can test this all out by doing the following:
- 1. Look at the program hw5/noise.py. It's got comments that should be helpful. Run the script; out should pop a bunch of graphs. Look at figures 3 and 4 first, which show graphs of some white noise and "filtered" noise. White noise was first generated, then Fourier transformed. Then the high frequency components were attenuated so that the noise has less high frequency content. This is done by the sqrt_lorenzian function. Now that we have our new, less frenetic noise, we inverse Fourier transform it to see the noise in the time domain. If you look at Figs. 3 and 4, this difference should be apparent.
- 2. Next we make a histogram (figure 0) of the values of the noise, as mentioned above. We see it fits well to a Gaussian. The white noise was definitely generated to give a Gaussian histogram, but how about the filtered noise? Does this make sense?
- 3. Now we look at the correlation function C(t) (Fig. 1). That's been fit to an exponential. Why should an exponential be a good fit?
- 4. Finally, Fig. 2 gives the power spectrum, which has been fit to a Lorentzian. It makes sense that it should fit well to one, because that's how it was constructed in the first place.
- 5. Now change the code:
fa = fw * sqrt_lorenzian(N,omega_0)
to instead use the line:
fa = fw * sharp_cutoff(N,omega_0)
(It's commented out in the code, so you don't need to type it in again. Just uncomment it and comment out the previous line.)
This puts a very sharp cutoff on the amplitudes, instead of the smooth Lorentzian used before. Can you explain the correlation function that you see?
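As a hint for what to expect, here is a toy stand-in for sharp_cutoff (a brick-wall filter at a cutoff frequency of my choosing, not the actual code from hw5/noise.py). The correlation function of sharply band-limited noise rings like a sinc function instead of decaying smoothly:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 8192
fw = np.fft.fft(rng.normal(size=N))   # Fourier transform of white noise

# Brick-wall filter: keep only frequencies below the cutoff, zero the rest
freqs = np.fft.fftfreq(N)             # frequencies in cycles per sample
cutoff = 0.05
fa = fw * (np.abs(freqs) < cutoff)
filtered = np.fft.ifft(fa).real       # band-limited noise in the time domain

# Correlation function of the filtered noise for the first 200 lags
C = np.array([np.mean(filtered * np.roll(filtered, -m)) for m in range(200)])
# C(t) oscillates like sinc(2*cutoff*t): it crosses zero near lag 10
# (= 1/(2*cutoff)) and goes negative before coming back up
```

This is the same reasoning as the Wiener-Khinchin theorem: a flat band in frequency transforms to a sinc in time.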
Part II: Noise in biological systems
Many of the most important parts of biological machinery involve systems that are so small that thermal noise is important. Here you will be asked to research sources of noise in different contexts.
- 1. Describe the kind of noise that takes place at the level of an individual neuron with regard to the behavior of the transmembrane voltage. There are many different kinds of neurons, and these are intricately connected together. However, there are general questions that you can still consider.
- Describe qualitatively the characteristics of the transmembrane voltage as a function of time. Do these characteristics vary between different cell types?
- Is the noise in the transmembrane voltage seen at high temporal resolution, i.e. less than 1 msec, an example of Gaussian noise?
- There are frequently many different synaptic inputs to a neuron. If the system is noisy, how does this influence the ability of it to do computation?
- 2. Gene expression in a single cell uses a small number of components that play crucial roles. Each component is itself microscopic and susceptible to noise. These components include DNA and regulatory molecules. The binding of regulatory molecules is governed by physical processes that include random fluctuations. Describe experimental evidence for intrinsic fluctuations in the expression level of genes. For example, there has been good evidence reported for this behavior in E. coli.