Self-organised modular neural networks for encoding data
Posted on: Tuesday 1/1, 2008; 10:26 PM
These calculations are taken from the original Mathematica notebook used to calculate these results, which was started when I was working at the Isaac Newton Institute for Mathematical Sciences programme on Neural Networks in 1997. There are a few changes because of Mathematica version dependencies, but otherwise I have changed nothing. I calculate the and parts separately because that is the way I did the calculations originally.
This is the largest algebraic calculation that I have ever done. The results are sufficiently simple (and involve a great deal of algebraic cancellation) that one suspects that there might be a much cleverer way of deriving them. Anyone?
Currently, this reworking of the paper uses Mathematica version 5.
Preliminary Definitions
Switch off annoying warning messages.
Define a function for computing a Euclidean distance.
Define a function for constructing a vector on the unit circle centred at the origin.
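The notebook cells themselves are not reproduced here. As an illustration only, these two helpers might be sketched in Python as follows (vectors as plain tuples; this is a sketch, not the notebook's Mathematica code):

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length vectors (tuples)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def unit_circle_vector(theta):
    """A point on the unit circle centred at the origin, at angle theta."""
    return (math.cos(theta), math.sin(theta))
```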
Circular Manifold (2 overlapping posterior probabilities)
Functional Form of the Posterior Probability
Define p(θ)=f(θ) in the interval , where f(θ) is given by
Solve for the coefficients a, b and c by imposing continuity and normalisation.
Define a function to implement this solution.
Display f(θ) with the correct a, b and c inserted.
Stationary Posterior Probability
Without loss of generality, set y=0.
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Insert the known form of the coefficients.
Solve for r.
Objective Function
Evaluate .
Evaluate .
Define a function to implement .
Stationary Reference Vector Length
Compute the quantity that has to be zero when the reference vector length is stationary.
Manipulate this into a simpler form.
Define a function to implement this.
Define a function to find a root of this.
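The notebook presumably uses Mathematica's FindRoot for this step. A minimal stand-alone analogue (bisection, assuming the root is bracketed by a sign change) is:

```python
def find_root(f, lo, hi, tol=1e-12):
    """Bisection: find x in [lo, hi] with f(x) = 0,
    assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    if flo == 0.0:
        return lo
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if fmid == 0.0 or hi - lo < tol:
            return mid
        if (flo < 0) != (fmid < 0):
            hi = mid          # root lies in the lower half
        else:
            lo, flo = mid, fmid  # root lies in the upper half
    return 0.5 * (lo + hi)
```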
Stationary
Substitute in the known stationary parameter values to obtain the stationary .
Replace the term from the stationarity condition on the reference vector length.
Define a function to implement this.
Define a function to minimise this w.r.t. s.
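Again as a sketch only: the one-dimensional minimisation over s could be done with a golden-section search (assuming the objective is unimodal on the bracket; the notebook's actual minimiser is not shown):

```python
import math

def minimise(f, lo, hi, tol=1e-10):
    """Golden-section search for a minimum of a unimodal f on [lo, hi].
    Returns the approximate minimising argument."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c               # shrink towards the left interior point
            c = b - invphi * (b - a)
        else:
            a, c = c, d               # shrink towards the right interior point
            d = a + invphi * (b - a)
    return 0.5 * (a + b)
```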
Toroidal Manifold (2 overlapping posterior probabilities)
Stationary Posterior Probability
Without loss of generality, set y=0.
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Define a function for extracting the Fourier coefficients of an expression w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1). This function assumes that these are the only Fourier components present.
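For readers without Mathematica, the same extraction can be done numerically by projecting onto each basis function over one period (function and parameter names here are mine, not the notebook's):

```python
import math

def fourier_coefficients(f, n_samples=4096):
    """Numerically extract the coefficients of f(theta) with respect to
    (cos 2t, sin 2t, cos t, sin t, 1), assuming no other components are present."""
    h = 2.0 * math.pi / n_samples
    thetas = [i * h for i in range(n_samples)]
    samples = [f(t) for t in thetas]
    def proj(g):
        # (1/pi) * integral of f(t) g(t) over one period, by a uniform Riemann sum
        # (spectrally accurate for trigonometric polynomials)
        return sum(s * g(t) for s, t in zip(samples, thetas)) * h / math.pi
    constant = sum(samples) * h / (2.0 * math.pi)
    return (proj(lambda t: math.cos(2.0 * t)),
            proj(lambda t: math.sin(2.0 * t)),
            proj(math.cos),
            proj(math.sin),
            constant)
```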
Extract the Fourier coefficients w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1).
Solve for r by setting the coefficient of cos 2θ to zero.
Verify that the solution makes all of the Fourier coefficients zero.
Objective Function
Evaluate .
Evaluate .
Define a function to implement .
Stationary Reference Vector Length
Compute the quantity that has to be zero when the reference vector length is stationary.
Manipulate this into a simpler form.
Define a function to implement this.
Define a function to find a root of this.
Stationary
Substitute in the known stationary parameter values to obtain the stationary .
Replace the term from the stationarity condition on the reference vector length.
Define a function to implement this.
Define a function to minimise this w.r.t. s.
Circular Manifold (3 overlapping posterior probabilities)
Functional Form of the Posterior Probability
Define p(θ) for each of the intervals (this is ), (this is ), and (this is ).
State the continuity constraints.
State the norms, and expand them in terms of sin θ and cos θ.
Extract the oscillating parts of the norms.
Extract the constant parts of the norms.
Solve the continuity and normalisation constraints for the parameters {a2,a3,b1,b3,c1,c2,c3}, leaving a1 and b2 to be determined.
Define a function to implement this transformation.
Stationary Posterior Probability
Without loss of generality, set y=0.
Region 1:
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Extract the Fourier coefficients w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1).
Region 2:
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Extract the Fourier coefficients w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1).
Region 3:
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Extract the Fourier coefficients w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1).
Solve the Stationarity Conditions:
The coefficient of cos(2θ) in region 2 gives b2.
Substitute back into the coefficient of cos θ in region 1.
Substitute back into the coefficient of sin θ in region 3.
Solve for a1.
Solve for r.
Substitute the solution back to determine the coefficients in the expression for the posterior probability.
Define a function to implement this transformation.
Display f1(θ) with the correct a1, b1 and c1 inserted.
Display f2(θ) with the correct a2, b2 and c2 inserted.
Display f3(θ) with the correct a3, b3 and c3 inserted.
Objective Function
Evaluate .
Evaluate .
Define a function to implement .
Stationary Reference Vector Length
Compute the quantity that has to be zero when the reference vector length is stationary.
Define a function to implement this.
Define a function to find a root of this.
Stationary
Substitute in the known stationary parameter values to obtain the stationary .
Extract the main factor Last[d12circle3minimum] from , TrigFactor the sine terms, and write as c2s.
Extract the main factor from the quantity that has to be zero when the reference vector length is stationary, and write as c2s.
Solve for c2s in this stationarity condition.
Substitute back into the main factor in .
Include the missing factors that were dropped earlier.
Define a function to implement this.
Define a function to minimise this w.r.t. s.
Define a function which combines the 2 and 3 overlapping posterior probability cases.
Toroidal Manifold (3 overlapping posterior probabilities)
Stationary Posterior Probability
Without loss of generality, set y=0.
Region 1:
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Extract the Fourier coefficients w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1).
Region 2:
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Extract the Fourier coefficients w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1).
Region 3:
Compute the posterior average over reference vectors.
Compute the quantity that has to be zero when the posterior probability is stationary.
Extract the Fourier coefficients w.r.t. (cos 2θ,sin 2θ,cos θ,sin θ,1).
Solve the Stationarity Conditions
The coefficient of the cos(2θ) term in region 2 gives b2.
Substitute back into the coefficient of cos θ in region 1.
Substitute back into the coefficient of sin θ in region 3.
Solve for a1.
Solve for r.
Substitute the solution back to determine the coefficients in the expression for the posterior probability.
Define a function to implement this transformation.
Display f1(θ) with the correct a1, b1 and c1 inserted.
Display f2(θ) with the correct a2, b2 and c2 inserted.
Display f3(θ) with the correct a3, b3 and c3 inserted.
The functional form of the toroidal posterior probability is exactly the same as in the circular case, except that .
Objective Function
Evaluate .
Evaluate .
Define a function to implement .
Stationary Reference Vector Length
Compute the quantity that has to be zero when the reference vector length is stationary.
Define a function to implement this.
Define a function to find a root of this.
Stationary
Substitute in the known stationary parameter values to obtain the stationary .
Extract the main factor Last[d12torus3factorialminimum] from , TrigFactor the sine term, and write as cs1cs2s.
Extract the main factor from the quantity that has to be zero when the reference vector length is stationary, and write as cs1cs2s.
Solve for cs1cs2s in this stationarity condition.
Substitute back into the main factor in .
Include the missing factors that were dropped earlier.
Define a function to implement this.
Define a function to minimise this w.r.t. s.
Define a function which combines the 2 and 3 overlapping posterior probability cases.
Collect Results Together
These results are as computed originally using Mathematica version 3. They are algebraically equivalent to whatever result is obtained using the latest version of Mathematica.
Miscellaneous
In the torus routine, the number of neurons M is automatically divided by 2.
2 Overlapping Posterior Probabilities
3 Overlapping Posterior Probabilities
The torus results below assume that the total number of neurons is M, and this is automatically divided by 2 inside the routines. These results are redundant because the functional form of the toroidal posterior probability is exactly the same as in the circular case, except that .
2/3 Overlapping Posterior Probabilities
Asymptotic Properties
Circular Manifold
Validity Limit
Derive the limiting form (as n→∞ and M→∞) of the boundary between the regions where 2 or 3 posterior probabilities overlap.
M→∞ (2 overlapping posterior probabilities)
Do an asymptotic expansion of s in the quantity that must be zero when s is stationary.
Solve for the coefficients that satisfy stationarity.
Substitute back into and asymptotically expand to lowest contributing order.
Verify this result graphically. The plots lie on top of each other.
n→∞ (3 overlapping posterior probabilities)
Do an expansion of s about the point (which is the limiting value as n→∞). Expand s as , where δs is presumed to be O(1).
This stationarity condition is thus satisfied to provided that the first and third terms (which are both O(1)) cancel out.
Substitute the solution for δs back into .
Extract the leading asymptotic behaviour.
Verify this result graphically. The plots lie on top of each other.
Toroidal Manifold (joint encoding case)
Validity Limit
Derive the limiting form (as n→∞ and M→∞) of the boundary between the regions where 2 or 3 posterior probabilities overlap.
Toroidal Manifold (factorial encoding case)
Validity Limit
Derive the limiting form (as n→∞ and M→∞) of the boundary between the regions where 2 or 3 posterior probabilities overlap.
M→∞ (2 overlapping posterior probabilities)
Do an asymptotic expansion of s in the quantity that must be zero when s is stationary.
Solve for the coefficients that satisfy stationarity.
Substitute back into and asymptotically expand to lowest contributing order.
Verify this result graphically. The plots lie on top of each other.
n→∞ (3 overlapping posterior probabilities)
Do an expansion of s about the point (which is the limiting value as n→∞). Adjust the expansion so that it is of the form , where δs is presumed to be O(1).
This stationarity condition is thus satisfied to provided that the first and third terms (which are both O(1)) cancel out.
Substitute the solution for δs back into .
Manipulate the large factor separately, then recombine the factors, then extract the leading asymptotic behaviour.
Verify this result graphically. The plots lie on top of each other.
Joint/Factorial Encoding Stability Limit
The n→∞ behaviour of in the joint encoding case is , and in the factorial encoding case is .
Solve for the value of M that makes these the same.
Assorted Results
Value of n on Validity Boundary
For what value of n does in the circular case with 2 overlapping posterior probabilities?
Check this solves the case of 3 overlapping posterior probabilities.
Verify that the distortion functions join up correctly.
For what value of n does in the toroidal case (factorial encoding) with 2 overlapping posterior probabilities?
Check this solves the case of 3 overlapping posterior probabilities.
Verify that the distortion functions join up correctly.
Verify that Solutions Join Together
The difference between the 2 and 3 overlapping posterior probabilities at the boundary is a multiple of a quantity that is zero.
Ditto in the toroidal case when .
The 2 overlapping posterior probabilities circle and torus solutions are closely related.
Plots
The functional forms were computed using Mathematica version 3, and are algebraically equivalent to whatever the current version of Mathematica gives.
Receptive Field Plots (Circular Manifold)
Define a function which plots the receptive fields.
Plot an example of the receptive fields and their norm for encoding a circle in the case of 2 overlapping posterior probabilities.
Plot an example of the receptive fields and their norm for encoding a circle in the case of 3 overlapping posterior probabilities.
Stability Contour Plots (Toroidal Manifold)
Define an area of the (n,M) plane to investigate.
Plot the asymptotic value of M.
Plot the boundary between the 2 and 3 overlapping posterior probability regions for joint encoding.
Plot the boundary between the 2 and 3 overlapping posterior probability regions for factorial encoding.
Derive some interpolating functions to speed up the computation of .
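The tabulate-then-interpolate trick referred to here can be sketched as follows (a linear interpolant on a uniform grid; Mathematica's Interpolation is higher-order by default, so this is only an analogue):

```python
def make_interpolator(f, lo, hi, n=256):
    """Tabulate f on a uniform grid of n intervals over [lo, hi] and
    return a cheap linear interpolant, to avoid re-evaluating a slow f."""
    step = (hi - lo) / n
    table = [f(lo + i * step) for i in range(n + 1)]
    def interp(x):
        u = (x - lo) / step
        i = min(max(int(u), 0), n - 1)  # clamp to a valid interval
        frac = u - i
        return table[i] * (1.0 - frac) + table[i + 1] * frac
    return interp
```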
Plot the boundary between the stable regions for joint and factorial encoding.
Combine the plots.
Do a comparative plot of for the joint and factorial encoding cases.
Manually Simplified Expressions
These expressions are obtained by starting with the results produced by Mathematica 3 (or whatever algebraically equivalent result is obtained using the current version of Mathematica) and interactively simplifying them.
scircle2[M,n,s]
scircle3[M,n,s]
Do the following replacements to get all of the trigonometric s-dependence in terms of and .
Manually transform.
Manually transform some more.
storus2factorial[M,n,s]
storus3factorial[M,n,s]
Transform in terms of , and .
Manually transform into a nice form.
Manually transform some more.
dcircle2[M,n,s]
dcircle3[M,n,s]
Use that .
dtorus2factorial[M,n,s]
dtorus3factorial[M,n,s]
Use that .
Some Useful Diagrams
Target Manifolds
Define 1D image manifolds for 1 and 2 (superimposing) targets.
Define 2D image manifold for 1 target.
Plot 3 pixel values of a 1D 1-target image manifold.
Plot 3 pixel values of a 1D 2-target image manifold.
Plot 3 pixel values of a 2D 1-target image manifold.
2 Overlapping Posterior Probabilities
3 Overlapping Posterior Probabilities
Torus Pictures
Pinch some torus drawing code from Graphics`Shapes`. There must be a better way of doing this, but I need to "get at" the torus drawing code in order to draw things on the surface of the torus.
Add some extra routines for drawing things on the surface of a torus.
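Routines like these build on the standard torus parametrisation (conventions vary; the radii R and r below are my placeholders, not values from the notebook):

```python
import math

def torus_point(u, v, R=3.0, r=1.0):
    """Point on a torus with central radius R and tube radius r:
    u runs around the tube, v runs around the central axis."""
    return ((R + r * math.cos(u)) * math.cos(v),
            (R + r * math.cos(u)) * math.sin(v),
            r * math.sin(u))
```

Anything to be drawn on the surface of the torus can then be expressed as a curve in (u, v) and mapped through this function.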
Draw some toruses.
Gather together some of the graphics.
Valid Regions
Joint Posterior Probabilities (linear case)
Here is a function for plotting the boundaries between the piecewise linear regions of a 2D posterior probability.
Plot the case .
Define the posterior probability centred on the origin.
Evaluate the integral of the posterior probability over all values of its input vector.
Plot the posterior probability.
Plot the marginalised posterior probability. This demonstrates that you obtain the 1D posterior probability when you marginalise the 2D case.
A self-organising approach to multiple classifier fusion
Posted on: Tuesday 1/1, 2008; 3:01 PM
These results are generated using the same Mathematica code that was used for the original paper. A few changes had to be made to make the code compatible with the latest version of Mathematica, and to make it more consistent and easy to read.
I have used the previously trained network(s) to generate the figures. I should supply code to generate new instances of the figures.
Currently, this reworking of the paper uses Mathematica version 5.
Functions
Definitions of functions for computing SVQ posterior probabilities.
Figure 1
Define parameters of an SVQ network trained on data from a circular manifold.
This is figure 1.
Figures 2 and 3
Target output.
Strong supervision case. Define parameters of an SVQ network trained on data from a toroidal manifold with strong supervision from a target as given above.
Weak supervision case. Define parameters of an SVQ network trained on data from a toroidal manifold with weak supervision from a target as given above.
Unsupervised joint encoder case. Define parameters of an SVQ network trained in the joint encoding regime on data from a toroidal manifold.
Unsupervised factorial encoder case. Define parameters of an SVQ network trained in the factorial encoding regime on data from a toroidal manifold.
This is figure 2.
This is figure 3.
A discrete firing event analysis of the adaptive cluster expansion network
Posted on: Tuesday 1/1, 2008; 2:46 PM
This paper contains only diagrams. I have written some Mathematica code to generate a similar set of pictures. The code is not optimised to make full use of reusable components.
Currently, this reworking of the paper uses Mathematica version 5.
Initialisation
Turn off annoying error messages.
Functions
Figure 1
Figure 2
Figure 3
Self-organisation of multiple winner-take-all neural networks
Posted on: Tuesday 1/1, 2008; 2:14 PM
These results are created from the original Mathematica notebook that I used to create the paper. Some minor modifications had to be made to ensure compatibility with the latest version of Mathematica.
Currently, this reworking of the paper is being converted so that it uses Mathematica version 6.
Initialisation
Turn off annoying warning messages.
Define some fonts for use in the graphics.
Load packages.
Functions
Borrow some Mathematica code for building a torus from the Graphics`Shapes` package.
Define various bands encircling a torus.
Define a torus plus various bands around it.
Define some functions to plot various graphs.
Figure 1
Assorted useful torus graphics.
This is figure 1.
Figure 2
This is figure 2.
Figure 3
Create the various pieces of graphics needed to construct figure 3.
This is figure 3.
Figure 4
This is figure 4.
Figure 5
This is figure 5.
Figure 6
Create the various pieces of graphics needed to construct figure 6.
This is figure 6.
Figure 7
This is figure 7.
Figure 8
This is figure 8.
Figure 9
This is figure 9.
Figure 10
Create the various pieces of graphics needed to construct figure 10.
This is figure 10.
Prior knowledge and object reconstruction using the best linear estimate technique
Posted on: Monday 31/12, 2007; 6:44 PM
Reproduction of the computational results in "Prior knowledge and object reconstruction using the best linear estimate technique", Luttrell S P, Optica Acta, 1985, vol. 32, no. 6, pp. 703-716.
I can reproduce most of the results in the paper. However, the paper has gaps in it where things were not defined explicitly. I presume this happened because bits of the paper got deleted at a last-minute editorial stage, which left some dangling bonds in the paper. This leaves undefined the precise meaning of "object basis function", which could be either (defined) or (undefined).
Currently, this reworking of the paper uses Mathematica version 5.
Initialisation
Turn off the warning messages that occur because of the infinite integration range. This problem could be avoided by doing the W=constant part of the integral analytically.
Turn off warning messages for similar symbol names.
Load the package for multiple plots.
Load the package for displaying plots together.
Functions
Define the sinc function.
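For example, with the convention sinc(x) = sin(x)/x (the paper's exact normalisation convention is not shown in this caption, so this is an assumption):

```python
import math

def sinc(x):
    """sinc(x) = sin(x)/x, with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x
```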
Define a single matrix element of the kernel in equation 24. I could evaluate the contribution from the W=constant part of the integral analytically.
Define the full matrix kernel assuming samples spaced at Δ times the Nyquist length, centred on the origin. The matrix is symmetric, but I don't bother using this fact when computing it.
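Exploiting the symmetry would halve the work; a sketch (the `element` callable is a stand-in for the equation-24 matrix element, which is not reproduced here):

```python
def symmetric_kernel_matrix(element, n):
    """Build an n-by-n symmetric matrix from element(i, j),
    computing only the upper triangle and mirroring it."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            a[i][j] = a[j][i] = element(i, j)
    return a
```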
Define the object basis function as in equation 25.
Table 1
Compute the eigenvalues of the matrix kernel for 25 Nyquist samples with and c=π, and a unit half-width top-hat weight W. The multiplying the eigenvalues is the sampling density factor.
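The notebook computes the full spectrum with Mathematica's Eigenvalues. As a minimal numerical illustration only, the dominant eigenvalue of a symmetric matrix can be found by power iteration:

```python
import math

def dominant_eigenvalue(a, iters=500):
    """Power iteration: dominant eigenvalue of a symmetric matrix a
    (given as a list of row lists)."""
    n = len(a)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        # Rayleigh quotient of the current unit vector
        lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam
```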
Save the results because they took a while to compute.
List the first 7 eigenvalues for and c=π. This is table 1.

               c=π
0.783368       0.981041
0.196959       0.736059
0.0113728      0.24359
0.000190051    0.0219907
2.15127×10^-6  0.00106412
1.08678×10^-8  0.0000220308
5.81144×10^-11 4.72969×10^-7
Figure 1
Compute the eigenvalues of the matrix kernel for 25 Nyquist samples (times various sampling interval spacing parameters Δ={1,0.5,0.25,0.1}) with c=π, and a unit half-width top-hat weight W sitting on a pedestal ε.
Save the results because they took a while to compute.
Plot the eigenvalues. This is figure 1.
Figure 2
Compute the eigenvalues of the matrix kernel for 25 Nyquist samples with c=π and , and a unit half-width top-hat weight W sitting on various pedestals ε={0.1,0.01,0.0001}.
Save the results because they took a while to compute.
Plot the eigenvalues. This is figure 2.
Figure 3
Compute the eigenvalues of the matrix kernel for 25 Nyquist samples with c=π and , and a Gaussian weight sitting on various pedestals ε={0.1,0.01,0.0001}.
Save the results because they took a while to compute.
Plot the eigenvalues. This is figure 3.
These results differ from figure 3 for the low-order eigenvalues. It seems that I used the wrong normalisation (or something similar) for the Gaussian in the paper. An easy way to see that the paper is wrong is that the sum of the eigenvalues is not large enough: there is too much depression of the low-order eigenvalues, which is not sufficiently compensated by the increase in the high-order eigenvalues.
Figure 4
Compute the eigenvectors of the matrix kernel for 25 Nyquist samples with c=π, and a unit half-width top-hat weight W sitting on a pedestal ε=0.1.
Save the results because they took a while to compute.
Plot the normalised object basis functions. This is figure 4.
The function (rather than ) was originally plotted, but I didn't define it in the paper; it was probably removed during a last-minute editorial change. I have found it defined in an early draft (July 1983) of the paper.
Figure 5
Compute the eigenvectors of the matrix kernel for 25 Nyquist samples with c=π, and a Gaussian weight sitting on a pedestal ε=0.1.
Save the results because they took a while to compute.
Plot the normalised object basis functions. This is figure 5.
The normalisations of these results are different from figure 5.
Figure 6
This makes use of the results already computed for figure 5.
Compute the corresponding results for ε=0.01.
Save the results because they took a while to compute.
Plot the normalised object basis functions for ε=0.1 and ε=0.01. This is figure 6.
The normalisation(s) differ from figure 6 but otherwise this plot seems to be correct. I have labelled the function plotted as (rather than as it was labelled in the paper).
Figure 7
Compute the resolution enhancement factor E as the ratio of the half-width at half-height of the (continuous) sinc image to that of its reconstruction from 25 Nyquist samples (spaced for c=π), using a unit half-width weight sitting on a pedestal and various values of c. Note that the sample spacing stays fixed throughout at the Nyquist spacing for c=π; only the bandwidth of the imaging function is varied.
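The half-width-at-half-height comparison can be sketched numerically as follows (the `image` and `reconstruction` callables are hypothetical stand-ins for the paper's curves, both assumed to peak at the origin and fall monotonically through half-height on the search interval):

```python
def halfwidth_at_halfheight(f, peak=0.0, upper=10.0, tol=1e-12):
    """Distance from the peak to where f falls to half its peak value,
    found by bisection on [peak, upper]."""
    half = 0.5 * f(peak)
    lo, hi = peak, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > half:
            lo = mid   # still above half-height: move right
        else:
            hi = mid   # below half-height: move left
    return 0.5 * (lo + hi) - peak

def enhancement_factor(image, reconstruction, upper=10.0):
    """Resolution enhancement E: half-width of the image divided by
    the half-width of its reconstruction."""
    return (halfwidth_at_halfheight(image, 0.0, upper)
            / halfwidth_at_halfheight(reconstruction, 0.0, upper))
```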
Save the results because they took a while to compute.
Plot the resolution enhancement factor. This is figure 7.
This accurately matches figure 7, apart from the case which I assume is due to rounding errors. There is one other problem point for c=1.5. The behaviour at small c looks as if it is due to the half-width at half-height of the reconstruction falling on the fixed-location discontinuity of the top-hat W.
Figure 8
The same as figure 7, except using a Gaussian weight sitting on a pedestal.
Save the results because they took a while to compute.
Plot the resolution enhancement factor. This is figure 8.
This accurately matches figure 8 apart from the case which I assume is due to rounding errors.
Try to find out how I got wrong results in the Gaussian case. I have a vague memory that I discretised object space, which would give inaccurate results.
Tidy up the errors in the resolution enhancement plots in figures 7 and 8.