© University of Kent - Contact | Feedback | Legal
The University of Kent, Canterbury, Kent, CT2 7NZ, T +44 (0)1227 764000
This project will look at possibilities to design, model and (mathematically) analyse distributed memory systems based on realistic biomolecular interactions. The questions to be addressed during this research are (i) how can data be efficiently stored in biomolecular systems (ii) how can stored data be retrieved and processed (iii) what are the fundamental properties of bio-molecular systems, in particular, what are the theoretical limitations on storing data in biological systems.
This project will require an interdisciplinary outlook, willingness to learn some biochemistry, and some mathematical inclination.
Humans are very good at prioritising competing processing demands. In particular, perception of a salient environmental event can interrupt ongoing processing, causing attention, and accompanying processing resources, to be redirected to the new event. A classic example of this is the well-known Cocktail Party Effect. Not only are we easily able to follow just one conversation when several people are speaking, but the occurrence of a salient phrase in a peripheral conversation stream, such as somebody mentioning our name, causes auditory attention to be redirected. It is also clear that emotions, motivation and physiological state in general, play a key role in such prioritisation In contrast, artificial systems do less well. For example, they are often bad at adjusting their processing to salient events, especially when assessing salience is context dependent. Thus, they may fail to respond appropriately to a salient event or at the other extreme, they may interrupt processing unnecessarily frequently in response to contextually low salience events.
The proposed PhD will investigate the construction of computational models of these cognitive phenomena, with particular emphasis on neural level modelling. The modelling work will be guided by (and will also guide) the converging evidence now being made available by behavioural studies and brain mapping (both fMRI and EEG). The models that we will construct will both elucidate the human system and inform the construction of artificial systems by using the suite of computational models as general-purpose specifications of such systems.
How consciousness emerges from the physical matter of the brain remains one of the greatest mysteries of science. However, as a result of modern neuroscience and brain imaging techniques, theories of the neural mechanisms underlying conscious experience are starting to be proposed. For example, there are theories concerning synchronous firing of neurons and consciousness and there are explanations that focus on brain regions, e.g. the what and where pathways from visual cortex. In addition, neural network modelling is playing an important role in this debate. For example, explanations focussing on synchronous neural spiking have been investigated using neural network simulations.
There is considerable room for PhD level research on using neural networks to simulate theories of consciousness. One direction would be to develop models of how masking works in psychological studies of perception. It is well known that following a stimulus by a mask prevents conscious experience of the stimulus (in fact, related effects arise if the mask is presented at the same time as the stimulus). However, even though we have no awareness of the stimulus, our motor system still responds to it. Notice that such masking underlies the subliminal presentation of frames during films.
Despite the fact that such masking has empirically been investigated very extensively and indeed many theories of its functioning exist, there is currently no comprehensive computational model of the phenomenon. Thus, a possible avenue for a PhD in this area would be to construct neural network models of the competing theories of masking in order to verify their validity.
Genetic algorithms are a technique for finding good solutions to optimization and other problems which work by simulating the evolutionary process found in natural biological systems. One problem with these techniques is that the population can rapidly converge to a weak solution, or at the opposite end of the spectrum can fail to converge to a solution at all. This depends largely on the choice of parameters such as rates of mutation, population size, recombination rates, et cetera.
Work has been carried out at finding a priori optimal parameter settings for simple problems, however these results do not extend easily to real problems. One promising approach would be to look at on-the-fly parameter adjustment. As the algorithm runs it would accumulate statistical information about current performance, and use this information to do parameter-estimation for models of algorithm behaviour. This can then be used to adjust the parameters of the algorithm itself to control e.g. convergence rates. Another approach would be to use ideas from control theory.
Similar techniques could be applied to other meta-heuristics for optimization and related problems.
Several methods for the automated layout of regular transport schematics have been developed. These concentrate on ensuring lines are at 45 degrees. However, many metro maps may have a more effective layout with smooth curves rather than jagged lines. Some hand drawn examples can be seen here: go to page 6, London by Max Roberts and Madrid, also by Max Roberts.
Clearly, there needs to be a radical rethink of the asethetics of layout. So what features make for a good curvy map? sharpness of bends, and the number of turning points may well be important. Novel automatic layout techniques need to be developed that attempt to improve the layout, perhaps inspired by curve representations using splines or parametric equations which could then be integrated with curve fitting techniques to attempt to draw diagrams with the desired aesthetics.
Currently, graph drawing techniques that are capable of drawing arbitary input graphs are based on either force models or search based optimization. Both of these have major problems: the first has only a limited capability to capture aesethics; the second is very slow to calcluate layout. We propose a new method, pattern based layout, that has the potential to overcome both of these issues.
Pattern based layout would rely on identifying patterns in the diagram that can be matched to templates that have a known drawing. In terms of tasks, the researcher would need to identify patterns and develop layouts for them. An efficient approach to finding appropriate subgraphs in the target graph would then be required. This would have to deal with the issues of overlapping patterns.
This work is in collaboration with Dr. Kai Xu of Middlesex University.
Many data sets are visualized effectively with area-proportional Venn and Euler diagrams, where the area of the regions formed by intersecting curves is in proportion to a defined specification. Transforming one area proportional diagram into another, either with the same area specification or a target area specification, could allow the improvement of the visual properties of the diagram, or allow the generation of particular diagrams from a library of known layouts.
The challenges with this project lie with defining transformations that are area preserving. Previous Euler diagram transformation research has only concentrated on the non area-proportional case, where the diagram is first converted into a graph, and then, following graph transformations is converted back into the diagram. This project would either have to discover ways of ensuring the graph transformations are area preserving, or define new transformation methods that were applied on structures that are closer to a concrete representation of the diagram.
Computational paint (also known as amorphous computers) is a massively parallel system of locally interacting small processors. Each of the individual processors is very limited in its computational capabilities and can be thought of as an automaton with a few states only. The individual units are distributed in space and can communicate (send and received message) with other units that are in close proximity to them.
It has been demonstrated that such collections of weak computing units collectively can perform interesting computational tasks in a massively parallel way. As a concept amorphous computing is only interesting if the individual units can be produced cheaply. Unfortunately, this is currently not the case, at least if we think of these units as standard electronic components.
This project will explore the use of bacteria instead of standard electronic devices. Bacteria can be engineerined to have all the required properties and are extremely cheap to produce (in fact they divide by themselves). The main aim of this project will be to develop algorithms for amorphous computers based on credible models of bacterial interactions.
Data mining consists of concepts and methods (derived from machine learning and statistics) for discovering important hidden relationships in the data. A very important application of data mining is bioinformatics, where the goal is to discover interesting knowledge from molecular biology data sets.
This project will consist of developing a data mining method for analysing data about the molecular biology of ageing. More precisely, the project will focus on classification (supervised machine learning) methods for predicting functions or related aspects of proteins or genes associated with ageing. There is a clear and urgent need to better understand the biology of ageing (since ageing is a major cause of so many diseases), and the availability of large amounts of ageing-related data on the web gives data miners a great opportunity to try to contribute to this nderstanding, discovering new patterns in such data. The specific type of classification method to be developed and the specific type of ageing-related protein/gene whose functions will be predicted are flexible, and these will depend on the background of the PhD student.
This is a very interdisciplinary project, and PhD candidates are expected to have some previous background in at least one of the two major areas of the project, namely data mining or molecular biology (as well as a genuine interest in interdisciplinary research).
The geometric framework (GF) is a formal theory of search algorithms that links solution representations, search operators, search spaces and fitness landscapes. It defines search operators using geometric shapes such as circles, segments and convex polygons specified as functions of the underlying distance of the search space. The search operators are then defined using these geometric shapes to relate the location and probability of the offspring with respect to the location of their parents. For example, uniform geometric crossover is a geometric search operator that, given two parents corresponding to the end-points of a segment, returns uniformly points in the segment as offspring. This framework reveals an important duality: the same search operator can be defined both on the representation as a procedure that manipulates parent configurations to produce offspring configurations as well as in a geometric declarative form as spatial relationship by specifying the locations and probability of the offspring with respect to the locations of their parents. For example, the uniform crossover for binary strings, which is defined on the representation, dually, it can be defined as uniform geometric crossover on the Hamming space. This duality has surprising consequences: (i) it allows us to use geometry on the search space as a formal language for a general theory of representations that encompasses the most-frequently used mutation and crossover operators across representations; (ii) it allows us to associate complex search operators with classic fitness landscapes; and (iii) it allows us to do principled design of search operators for new representations and problems that embed problem-specific knowledge in the search.
The aim of this project is to explore whether the GF can give insights to make discoveries in real genetics. This line of research is highly speculative. However, this might be possible because homologous biological recombination that aligns DNA strands on contents before genetic material exchange was shown to be a geometric crossover. Using the GF, this may allow us to determine the biological fitness landscape exactly and not to use this concept only heuristically or metaphorically as it is currently done in biology. Studying the characteristic of smoothness of this landscape might then explain, or contribute to an explanation on, why the rate of the biological adaptation due to Darwinian evolution is surprisingly high (which is an important open question). A second line of enquiry building upon the GF that can give interesting results in real genetics is about genes discovery (which is an important open problem). This is because the notion of gene can be naturally defined in geometric terms in a representation-independent way and can be formally specialized to any solution representation, including the biological one based on DNA strands. When specialized to binary stings, geometric genes correspond to schemata. It would be then interesting to see whether the genes predicted by the GF for DNA strands correspond to known real genes and if so whether they can predict unknown real genes.
Genetic Programming (GP) is a method of problem solving that takes its inspiration from biological evolution. Specifically, GP is concerned with evolving a population of programs (or other executable structures) to solve a specific problem. GP has been successful in a number of areas, e.g. data mining, bioinformatics, game AI, circuit design, symbolic regression, and many others.
In GP the semantics of programs are important, i.e. what programs actually do, as opposed to the syntax of programs, e.g. their text. A lot of work in GP has focused on these syntactic aspects of GP and neglected the semantic aspects. In the last few years a decent amount of work has been carried out on semantic aspects of GP, both at Kent and elsewhere. For example, the semantic effects of evolutionary operators can be studied (e.g., does the exchange of text between two programs result in a program that has aspects of the behaviour of the two parent programs?), studying the semantic diversity of the population (e.g., is the initial population semantically diverse? How does this semantic diversity change over an evolutionary run?), aspects concerned with fitness (e.g., can we define a fitness function in terms of the desired behaviour of programs and measure the distance of a program from that ideal semantics?).
There is lots of scope for exploring different aspects of this. A good starting point would be to look at the papers that I have written with my research student Lawrence Beadle (e.g. on semantic initialisation, crossover and mutation), and the work by groups in Dublin and Poznan.