School of Computing

Suggested PhD Projects

On-the-fly Parameter Adjustment in Genetic Algorithms

Contact: Colin Johnson

Genetic algorithms are a technique for finding good solutions to optimization and other problems which work by simulating the evolutionary process found in natural biological systems. One problem with these techniques is that the population can rapidly converge to a weak solution, or at the opposite end of the spectrum can fail to converge to a solution at all. This depends largely on the choice of parameters such as rates of mutation, population size, recombination rates, et cetera.

Work has been carried out at finding a priori optimal parameter settings for simple problems, however these results do not extend easily to real problems. One promising approach would be to look at on-the-fly parameter adjustment. As the algorithm runs it would accumulate statistical information about current performance, and use this information to do parameter-estimation for models of algorithm behaviour. This can then be used to adjust the parameters of the algorithm itself to control e.g. convergence rates. Another approach would be to use ideas from control theory.

Similar techniques could be applied to other meta-heuristics for optimization and related problems.

Semantics in Genetic Programming

Contact: Colin Johnson

Genetic Programming (GP) is a method of problem solving that takes its inspiration from biological evolution. Specifically, GP is concerned with evolving a population of programs (or other executable structures) to solve a specific problem. GP has been successful in a number of areas, e.g. data mining, bioinformatics, game AI, circuit design, symbolic regression, and many others.

In GP the semantics of programs are important, i.e. what programs actually do, as opposed to the syntax of programs, e.g. their text. A lot of work in GP has focused on these syntactic aspects of GP and neglected the semantic aspects. In the last few years a decent amount of work has been carried out on semantic aspects of GP, both at Kent and elsewhere. For example, the semantic effects of evolutionary operators can be studied (e.g., does the exchange of text between two programs result in a program that has aspects of the behaviour of the two parent programs?), studying the semantic diversity of the population (e.g., is the initial population semantically diverse? How does this semantic diversity change over an evolutionary run?), aspects concerned with fitness (e.g., can we define a fitness function in terms of the desired behaviour of programs and measure the distance of a program from that ideal semantics?).

There is lots of scope for exploring different aspects of this. A good starting point would be to look at the papers that I have written with my research student Lawrence Beadle (e.g. on semantic initialisation, crossover and mutation, and the work by groups in Dublin and Poznan.

Monte-Carlo Tree Search and Genetic Algorithms

Contact: Colin Johnson

Monte Carlo Tree Search is a method for searching game-trees in computer game playing, which has currently the state-of-the-art method for evaluating move choices in programs that play games such as Go. The idea is that rather than playing forward a small number of games intelligently, the program plays a large number of games randomly to completion, and then takes an average of the performance of each set of random games that start with a particular move as a measure of the quality of that move.

A not entirely unrelated notion is that of fitness inheritance and other fitness approximation techniques in genetic algorithms. This is the idea that rather than always allocating fitness to individuals by calculating the fitness directly, we assign a fitness based on the relationship of the individual to others whose fitness has already been calculated. For example, we might use the mean fitness of the two parents in crossover as an approximation to the fitness of the child. Recent surveys survey paper by Lin and Shi and Rasheed give a general introductions to this area, and a nice paper by Barbour, Corne and McCall shows how these ideas can be applied.

One interesting idea would be to bring these together in a particular way, that is to use a form of Monte Carlo tree search as a means of doing fitness approximation. That is, we take the current population, and run a number of runs to convergence by taking random crossovers and mutations (this is computationally cheap, as in most individuals it is fitness evaluation that is most costly). The fitness of the individuals is then the fitness of the end-point of these random runs. The question that we would like to answer is whether the fitness assigned at the end of this process is a more valuable guide for selection than selection based on direct calculation of the fitness of the individuals.

Similar ideas could be applied to other methods, such as other applications of genetic algorithms, or swarm intelligence methods.

Exploring Real Genetics using the Geometric Framework

Contact: Colin Johnson

The geometric framework (GF) is a formal theory of search algorithms that links solution representations, search operators, search spaces and fitness landscapes. It defines search operators using geometric shapes such as circles, segments and convex polygons specified as functions of the underlying distance of the search space. The search operators are then defined using these geometric shapes to relate the location and probability of the offspring with respect to the location of their parents. For example, uniform geometric crossover is a geometric search operator that, given two parents corresponding to the end-points of a segment, returns uniformly points in the segment as offspring. This framework reveals an important duality: the same search operator can be defined both on the representation as a procedure that manipulates parent configurations to produce offspring configurations as well as in a geometric declarative form as spatial relationship by specifying the locations and probability of the offspring with respect to the locations of their parents. For example, the uniform crossover for binary strings, which is defined on the representation, dually, it can be defined as uniform geometric crossover on the Hamming space. This duality has surprising consequences: (i) it allows us to use geometry on the search space as a formal language for a general theory of representations that encompasses the most-frequently used mutation and crossover operators across representations; (ii) it allows us to associate complex search operators with classic fitness landscapes; and (iii) it allows us to do principled design of search operators for new representations and problems that embed problem-specific knowledge in the search.

The aim of this project is to explore whether the GF can give insights to make discoveries in real genetics. This line of research is highly speculative. However, this might be possible because homologous biological recombination that aligns DNA strands on contents before genetic material exchange was shown to be a geometric crossover. Using the GF, this may allow us to determine the biological fitness landscape exactly and not to use this concept only heuristically or metaphorically as it is currently done in biology. Studying the characteristic of smoothness of this landscape might then explain, or contribute to an explanation on, why the rate of the biological adaptation due to Darwinian evolution is surprisingly high (which is an important open question). A second line of enquiry building upon the GF that can give interesting results in real genetics is about genes discovery (which is an important open problem). This is because the notion of gene can be naturally defined in geometric terms in a representation-independent way and can be formally specialized to any solution representation, including the biological one based on DNA strands. When specialized to binary stings, geometric genes correspond to schemata. It would be then interesting to see whether the genes predicted by the GF for DNA strands correspond to known real genes and if so whether they can predict unknown real genes.

Genetic programming for temperature weather derivatives

Contact: Michael Kampouridis

Weather derivatives are financial instruments used as part of a risk management strategy to reduce risk associated with adverse or unexpected weather conditions. Just as traditional contingent insurance claims, whose payoffs depend upon the price of some fundamental, a weather derivative has an underlying measure, such as rainfall, temperature, humidity, or snowfall. However, in the majority of the weather derivatives, the underlying asset is a temperature index. Hence, the proposed work will be focusing on temperature weather derivatives.
The problem of temperature weather derivatives can be divided into two main parts: (i) temperature prediction, and (ii) pricing of weather derivative contracts. This project will use an evolutionary approach, called Genetic Programming (GP) to predict future temperature and derive pricing equations. GP is a nature-inspired algorithm, which uses the principle of natural evolution to find computer programs that perform well in a given task. One of the main advantages of GP is its ability to perform well in high-dimensional combinatorial problems, such as the one of weather derivatives pricing.


Financial forecasting with directional changes

Contact: Michael Kampouridis

In the aftermath of a global financial crisis, it is more important than ever to have a better understanding of the markets and be able to forecast their movement. Directional changes (DC) is a new concept that is based on the idea that an event-based system can capture significant points in price movements that the traditional physical time methods cannot. For instance, if one was using daily closing prices, s/he would never notice the Dow Jones Industrial Average flash crash on the 6th of May 2010, where an almost 1000 point loss (about 9%) took place, only to recover most of those loses within minutes. Hence, instead of looking at the market from an interval-based perspective, it is proposed to record the key events in the market (e.g., changes in the stock price by a pre-specified percentage), and summarise the data based on these events.
This project will use Genetic Programming (GP) methods to create trading strategies. GP is an evolutionary technique inspired by natural evolution, where computer programs act as the individuals of a population. GP has been extensively used in the past for financial forecasting, and has shown it is able to identify patterns in financial data.

Training algorithms for spiking neural networks

Contact: Dominique Chu

Spiking neural networks encode information through the temporal order of the signals. They are more realistic models of the brain than standard artificial neural networks and they are also more efficient in encoding information. Spiking neural networks are therefore very popular in brain simulations. A disadvantage of spiking neural networks is that there are
not many efficient training algorithms available.

This project will be about finding novel training algorithms for spiking neural networks and to compare the trained networks with standard artificial neural networks on a number of benchmark AI tasks. An important part of this project will be not only to evaluate how well these spiking neural networks perform in relation to standard networks, but also to
understand whether or not they are, as is often claimed, more efficient in the sense that they need smaller networks or fewer computing resources.

The main approach of the model will be to gain inspiration from existing theories about how the how the human brain develops and learns. These existing theories will then be adapted so as to develop efficient training algorithms. This project will be primarily within AI, but it will also provide the opportunity to learn and apply techniques and ideas from computational neuroscience.

Computational models of brain-like networks

Contact: Dominique Chu

The human brain consists of billions of individual neurons whose firing patterns collectively create the computation that allows us to walk, be conscious and perform tasks. We are still far away from understanding in detail how the brain works. However, we are now able to address questions about how brain-like systems can perform particular tasks. For example,
how are we able to keep an internal map of our location in space? How are we able to hear a tune and then immediately reproduce it? Or how can we learn to perform mental arithmetic?

This project will construct minimal neural networks models that can perform such tasks. The main purpose of the project is not primarily to understand how the brain is actually solving these tasks, but to understand what the minimal networks are that can perform particular tasks.

Computational Modelling of Attention

Contact: Howard Bowman

Humans are very good at prioritising competing processing demands. In particular, perception of a salient environmental event can interrupt ongoing processing, causing attention, and accompanying processing resources, to be redirected to the new event. A classic example of this is the well-known Cocktail Party Effect. Not only are we easily able to follow just one conversation when several people are speaking, but the occurrence of a salient phrase in a peripheral conversation stream, such as somebody mentioning our name, causes auditory attention to be redirected. It is also clear that emotions, motivation and physiological state in general, play a key role in such prioritisation.

The proposed PhD will investigate the construction of computational models of these cognitive phenomena, with particular emphasis on neural level modeling. The modeling work will be guided by (and will also guide) the converging evidence now being made available by behavioural studies and brain mapping (both fMRI and EEG), some of which is being collected in Bowman’s research groups at Kent and Birmingham. A particular focus will be on extending Bowman & Wyble’s Simultaneous Type/ Serial Token model. One possible line would be adding spatial attentional mechanisms to the model and making the model a more complete theory of conscious perception.

Experimental Studies of Human Attention Using Behavioural and EEG Methods

Contact: Howard Bowman

Humans are very good at prioritising competing processing demands. In particular, perception of a salient environmental event can interrupt ongoing processing, causing attention, and accompanying processing resources, to be redirected to the new event. A classic example of this is the well-known Cocktail Party Effect. Not only are we easily able to follow just one conversation when several people are speaking, but the occurrence of a salient phrase in a peripheral conversation stream, such as somebody mentioning our name, causes auditory attention to be redirected. It is also clear that emotions, motivation and physiological state in general, play a key role in such prioritisation.

Through the combination of behavioural experimentation and the recent application of brain imaging, modern cognitive neuroscience is starting to clarify the mechanisms that underlie human deployment of attention. In particular, a number of experimental paradigms have started to reveal how timing constraints and sensitivity to salient events are reconciled in humans. Two experimental paradigms that are currently being explored in the attention research group at Kent are the attentional blink and the Emotional Stroop Effect. In particular, we have developed detailed theoretical accounts of both these phenomena.

The proposed PhD will undertake experimental studies targeted at evaluating a number of key predictions arising from our theoretical accounts of these phenomena. Many of these predictions focus on the nature of human conscious perception and the constraints that govern whether a stimulus is, or is not, perceived. In this respect, we are particularly evaluating the relationship between encoding into working memory and conscious perception.

The proposed experimental methods are traditional behavioural experimentation and electrophysiological work (i.e. EEG and ERP recording). The latter of these will be performed using the School of Computing at Kent’s BioSemi EEG system. There is also the possibility to run functional magnetic resonance imaging studies through Bowman’s part-time Professorship in the School of Psychology at the University of Birmingham.

Lie Detection and Brain-Computer Interaction on the Fringe of Awareness

Contact: Howard Bowman

We have developed methods to detect with EEG when a subject perceives a salient stimulus amongst a list of rapidly presented stimuli (10 per second). The majority of stimuli presented at this rate are not consciously reportable. However, the brain is selectively processing stimuli at such speeds; for example, it can detect the presence and identify of stimuli that fit a task template (e.g. the image containing an animal) or are intrinsically (and personally) salient (e.g. the word “spider” for a spider phobic).

Such modes of presentation have been extensively studied from a theoretical perspective, thereby, clarifying perceptual and attentional processing in humans. While continuing to investigate such fringe awareness theoretically, we are also exploring applications of such techniques in brain computer interaction. The space of potential applications of such methods is broad, including, lie detecting, interacting with vegetative and coma patients, control of computing and mobility devices, brain-salience directed search and retrieval, and adaptive computer interfaces. In this context, we have developed a brain-computer interface called the P3-Rapid method and a lie detector called the fringe-P3 method. All these areas along with theoretical investigations of perception and attention are potential subjects for PhD research.

Connectionism and Consciousness

Contact: Howard Bowman

How consciousness emerges from the physical matter of the brain remains one of the greatest mysteries of science. However, as a result of modern neuroscience and brain imaging techniques, theories of the neural mechanisms underlying conscious experience are starting to be proposed. For example, there are theories concerning synchronous firing of neurons and consciousness and there are explanations that focus on brain regions, e.g. the what and where pathways from visual cortex. In addition, neural network modelling is playing an important role in this debate. For example, explanations focussing on synchronous neural spiking have been investigated using neural network simulations.

There is considerable room for PhD level research on using neural networks to simulate theories of consciousness. One direction would be to develop models of how masking works in psychological studies of perception. It is well known that following a stimulus by a mask prevents conscious experience of the stimulus (in fact, related effects arise if the mask is presented at the same time as the stimulus). However, even though we have no awareness of the stimulus, our motor system still responds to it. Notice that such masking underlies the subliminal presentation of frames during films.

Despite the fact that such masking has empirically been investigated very extensively and indeed many theories of its functioning exist, there is currently no comprehensive computational model of the phenomenon. Thus, a possible avenue for a PhD in this area would be to construct neural network models of the competing theories of masking in order to verify their validity.

Computational creativity and automated evaluation

Contact: Anna Jordanous

In exploring how computers can perform creative tasks, computational creativity research has produced many systems that can generate creative products or creative activity. Evaluation, a critical part of the creative process, has not been employed to such a great extent within creative systems. Recent work has concentrated on evaluating the creativity of such computational systems, but there are two issues. Firstly, recent work in evaluation of computational creativity has consisted of the system(s) being evaluated by external evaluators, rather than by the creative system evaluating itself, or evaluation by other creative software agents that may interact with that system. Incorporation of self-evaluation into computational creativity systems *as part of guiding the creative process* is also under explored. (Anna currently has one PhD student at Kent and one external PhD student exploring different approaches to this latter issue, but there are many other possible avenues for investigation.)

In this project the candidate will experiment with incorporating evaluation methods into a creative system and analyse the results to explore how computational creativity systems can incorporate self-evaluation. The creative systems studied could be in the area of musical or linguistic creativity, or in a creative area of the student's choosing. It is up to the student to decide whether to focus on evaluation methods for evaluating the quality of output from a creative system or the creativity of the system itself (or both). The PhD candidate would be required to propose how they would will explore the above scenarios, for a more specific project. Anna is happy to guide students in this and help them develop their research proposal.

Expressive musical performance software

Contact: Anna Jordanous

Traditionally, when computational software performs music the performances can be criticised for being too unnatural, lacking interpretation and, in short, being too mechanical. However much progress has been made within the field of expressive musical performance and musical interpretation expression. Alongside these advances have been interesting findings in musical expectation (i.e. what people expect to hear when listening to a piece of music), as well as work on emotions that are present within music and on how information and meaning are conveyed in music. Each of these advances raises questions of how the relevant aspects could be interpreted by a musical performer.

Potential application areas for computer systems that can perform music in an appropriately expressive manner include, for example, improving playback in music notation editors (like Sibelius), or the automated performance of music generated on-the-fly for 'hold' music (played when waiting on hold during phone calls). Practical work exploring this could involve writing software that performs existing pieces, or could be to write software that can improvise, interpreting incoming sound/music and generating an appropriate sonic/musical response to it in real time.

Digital preservation of the information within musical/sonic material

Contact: Anna Jordanous

Digital preservation of audio material raises many interesting questions to be investigated, including how to archive a sound, what metadata to keep, and future-proofing. Of particular interest is how to explore issues of retention of musical/sonic information from relevant digital audio material, for later access and analysis. Sound and music are typically very open to interpretation, with much information being conveyed through musical/sonic material.

Music Information Retrieval (MIR) allows us to see what information is communicated by musical material, using techniques from Computing and Music. Typically MIR is applied to digital rather than physical materials and comes in a variety of forms that could be explored, such as using digital tools or computational analysis for informing and enhancing musicological analysis or musical interpretation. In this PhD project, the PhD candidate will carry out such explorations, towards the development of an archive or a methodology for existing archives to access and retrieve musical information from archive music-based data.

Music on the Semantic Web

Contact: Anna Jordanous

The Semantic Web is a vision of the Web where items on the web are data, which get linked together if they are data referring to similar things. In the Semantic Web, "a computer program can learn enough about what the data means to process it." (Tim Berners-Lee, Weaving the Web, 2000) There are some data and ontologies (computational models of knowledge) published on the Semantic Web about music, for example the Music Ontology (musicontology.com).

Research is starting to emerge on using information retrieval in conjunction with data on the Semantic Web; this project proposes that the PhD candidate explores how Music Information Retrieval (MIR) can be enhanced using Semantic Web data and tools. During this PhD project, the candidate would look at a particular question in music information retrieval, such as how to use MIR to perform computational musicological analysis or how to identify music that is intended to express similar meanings or emotions. (Alternatively the candidate may wish to address a different music information retrieval problem, in an area of specific interest to them; this is welcome.) The PhD candidate would explore how this MIR question can be addressed by using music-specific Semantic Web data/models/technologies to enhance the process of identifying relevant information.

It is expected that the PhD candidate will produce computational tools or software that engages directly with the Semantic Web in order to perform the musical information retrieval task. The performance of Semantic-Web enhanced solutions should be compared to traditional MIR solutions for that task, if any exist, and evaluated as to the accuracy and comprehensiveness with which the tools or software carry out the task.

Chatbots for Language Tutoring

Contact: Marek Grzes

The old proverb says: "The more languages you speak, the more human you are". More pragmatically, knowing a foreign language opens doors to many opportunities. These benefits motivate people worldwide to spend billions of dollars every year on foreign language learning. It is clear that learners who engage in meaningful conversations (not necessarily oral) make faster progress. Unfortunately, communications with native speakers are rare or even impossible when the learner is in her country of origin. Chatbots provide an easy access to lifelike conversations to millions of people worldwide, regardless of the target language or location of the learner. Imagine that an accurate and entertaining chatbot for teaching English is available on the web or as a contact person on Skype, Kik Messenger, Gadu-Gadu, or on any other platform of this kind. The benefits for students as well as business opportunities for industry are evident.

This research topic could be engaging for both academically oriented PhD students and those who are interested in a commercial deployment of their PhD work. It would require very good programming skills as well as interest in machine learning, statistical natural language processing, and information retrieval.

The student would be encouraged to take a highly personal interpretation of the problem.

Reinforcement Learning for Intelligent Tutoring Systems

Contact: Marek Grzes

TReinforcement learning is a fascinating research area because it allows intelligent behaviours to be learned either from data or from simulation. In order to make a cutting-edge research contribution to reinforcement learning, in most cases, one has either to come up with a solid theory (i.e. algorithms with provable performance guarantees) or to design reinforcement learning solutions for large, real-life problems. There is a unique opportunity to do the latter in the School of Computing at the University of Kent because our School has a solid platform for teaching programming languages which is used by thousands of users world-wide. The goal of this project would be to explore the potential of reinforcement learning to improve the learning experience of students who practice their programming skills using BlueJ or Greenfoot. The educational aspect of the project would require collaboration with the Computing Education research group at Kent; the student would benefit from having a second supervisor from that group. The machine learning part of this project would benefit from close collaboration with Prof. Pascal Poupart from the University of Waterloo in Canada.

Combining artificial intelligence and a learning tutor for programming languages opens up many avenues for cutting-edge research, yet it gives a student a lot of flexibility.

Probabilistic Planning with Constraints

Contact: Marek Grzes

Many applications of probabilistic planning and artificial intelligence involve various types of constraints that should be satisfied (with a high probability at least). For example, in intelligent tutoring systems, it may be desirable to minimise the number of turns while ensuring that the probability of the student completing the task is maximised. Partially observable Markov decision processes (POMDPs) provide a natural framework to design applications that continuously make decisions based on noisy sensor measurements. POMDPs represent a factual mathematical model which makes them useful both for developing new algorithms and for creating specialisations for particular applications, such as intelligent tutoring systems. Particular focus of this project could be on extending the recent work of Grzes & Poupart to situations that involve constraints.

This project is very technical, and it would be suitable for an individual who wishes to work on the core algorithms for artificial intelligence planning. Ability to write software (in Matlab at least) would be essential together with a good mathematical background (especially knowledge of linear algebra).

This research challenge can be addressed in various ways, and the methods investigated within this project would depend on preferences of a particular student, his interests, skills, and his long term objectives.

Automatic Layout of Concentric Metro Maps

Contact: Peter Rodgers

Work in automatic layout of metro maps has been progressing over the last few years. However, one type of design has not yet received any attention: the concentric metro map. This type of map has lines that are drawn either as circle segments or as lines that radiate from the centre of the map. Some examples:
Max Roberts' concentric London map
Moscow published map

This type of map will be difficult to draw automatically, but previous layout methods may provide a clue as to what mechanisms might be appropriate. In particular, hierarchical layout methods have similarities as they can be used to draw a DAG in a radial manner, with the top of the hierarchy placed in the centre of the graph, and the layers drawn using circles at increasing distances from the centre, see http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3868.

Tasks in this research project include:

  1. Developing a decision process for identifying metro maps that may be successfully drawn in a radial manner.
  2. Identifying radial and circular segments of metro map lines.
  3. Implementing an algorithm to place the stations in suitable locations.
  4. Conflict resolution to avoid crossing lines and reposition stations that are given duplicate locations.

Visualizing Complex Data

Contact: Peter Rodgers

There is a lot of data that contains both associations (usually visualized as a graph) and set membership (visualized with various techniques, such as Euler diagrams, bubble sets or linear diagrams). Visualizing both of these types of relationship is a challenging task, however it is very topical, and significant research efforts are going into this work at present. This project would look at developing techniques for data of limited size, then examine the scalability of the system by developing a separate overview visualization, which summarizes the data. The user would then be able to examine the overview for interesting data, filter the data, and then examine interesting areas in detail.

The project would involve identifying an application area (for example, social network data or gene data), develop the two visualization systems and the interaction between them. The result would then be tested for success by examining subject specialists as they use the software.

Motif Finding in Set Based Data

Contact: Peter Rodgers

Seeking overrepresented subgraphs, or "Motifs" in graphs is widespread. For example, they are used to analyse gene data [DD07], social networks [WG06] and criminal patterns [DM15]. The process works by sampling a large number of subgraphs of similar size, placing equal subgraphs (isomorphic subgraphs) into buckets and counting the size of the buckets. Data analysts then examine subgraphs that occur more than would be expected by chance.

This project would apply the same general process to set based data to see if insights can be derived from detecting motifs. Such data example social network data, where people have might have shared interests and are therefore in the same set. Tags on data (such as twitter hashtags) can also be used classify items into sets.

The tasks in this project would be:

  • to derive suitable real world data
  • to work out randomization strategies that produce random data sets for
    comparison
  • design methods to sample small sections (set systems) of this data
  • develop fast set system isomorphism algorithms (equivalent to thehypergraph isomorphism problem)
  • implement statistical analysis methods to indicate over represented smallset systems
  • analyse these to see if useful information can be derived from theoverrepresented cases

[DD07] Das, M. K., & Dai, H. K. (2007). A survey of DNA motif finding algorithms. BMC bioinformatics, 8(7), 1.
[DM15] Davies T, Marchione E (2015) Event Networks and the Identification of Crime Pattern Motifs. PLoS ONE 10(11): e0143638. doi:10.1371/journal.pone.0143638
[WG06] Wei, X., & Gang, Z. (2006). U.S. Patent Application No. 11/603,284.

.

School of Computing, University of Kent, Canterbury, Kent, CT2 7NF

Enquiries: +44 (0)1227 824180 or contact us.

Last Updated: 21/12/2016