School of Computing

Publications by Dr Colin Johnson

Also view these in the Kent Academic Repository
Articles

    Otero, F.E.B. and Freitas, A.A. and Johnson, C.G. (2010) A hierarchical multi-label classification ant colony algorithm for protein function prediction. Memetic Computing, 2 (3). pp. 182-196. ISSN 1865-9284.

    Otero, F.E.B. and Freitas, A.A. and Johnson, C.G. (2010) A hierarchical multi-label classification ant colony algorithm for protein function prediction. Memetic Computing, 2 (3). pp. 182-196.

    McIntyre, Emmet and Blackburn, Edith and Brown, Philip J. et al. (2010) The complete family of epidermal growth factor receptors and their ligands are co-ordinately expressed in breast cancer. Breast Cancer Research and Treatment, 122 (1). pp. 105-110. ISSN 0167-6806.

    Abstract

    The levels of expression of the four receptors and eleven ligands composing the Epidermal Growth Factor family were measured using immunohistochemical staining in one hundred cases of breast cancer. All of the family were expressed to some degree in some cases, however individual cases showed a very wide range of expression of the family from essentially none to all the factors at high levels. The highest aggregate level of expression of a receptor was HER2 followed by HER1, then HER3, then HER4. The ligands (including two splice variants of the NRG1 and NRG2 genes) broadly fell into three groups, those with the highest aggregate expression were Epigen, Epiregulin, Neuregulin 1?, Neuregulin 2?, Neuregulin 2?, Neuregulin 4 and TGF?, moderate expression was seen with EGF, Neuregulin 1? and Neuregulin 3, and relatively low levels of expression were seen of HB-EGF, Betacellulin and Amphiregulin. Statistical analysis using Spearman’s Rank Correlation showed a positive correlation of expression between each of the factors. Analysing the data using the Cox Proportional Hazards model showed that, in this data set, the most powerful predictors of relapse free interval and overall survival were the combined measurement of only Epigen and Neuregulin 4.

    Beadle, Lawrence and Johnson, Colin G. (2009) Semantic analysis of program initialisation in genetic programming. Genetic Programming and Evolvable Machines, 10 (3). pp. 307-337. ISSN 1389-2576.

    Abstract

     Abstract Population initialisation in genetic programming is both easy, because random combinations of syntax can be generated straightforwardly, and hard, because these random combinations of syntax do not always produce random and diverse program behaviours. In this paper we perform analyses of behavioural diversity, the size and shape of starting populations, the effects of purely semantic program initialisation and the importance of tree shape in the context of program initialisation. To achieve this, we create four different algorithms, in addition to using the traditional ramped half and half technique, applied to seven genetic programming problems. We present results to show that varying the choice and design of program initialisation can dramatically influence the performance of genetic programming. In particular, program behaviour and evolvable tree shape can have dramatic effects on the performance of genetic programming. The four algorithms we present have different rates of success on different problems.

    Correa, E.S. and Freitas, A.A. and Johnson, Colin G. (2008) Particle swarm for attribute selection in Bayesian classification: an application to protein function prediction. Journal of Artificial Evolution and Applications, 2008. pp. 12 pages. ISSN ISSN:1687-6229.

    Iqbal, M. and Freitas, A.A. and Johnson, Colin G. et al. (2008) Message-passing algorithms for the prediction of protein domain interactions from protein?protein interaction data. Bioinformatics, 24 (18). pp. 2064-2070. ISSN 1460-2059.

    Abstract

    Motivation: Cellular processes often hinge upon specific interactions among proteins, and knowledge of these processes at a system level constitutes a major goal of proteomics. In particular, a greater understanding of protein-protein interactions can be gained via a more detailed investigation of the protein domain interactions that mediate the interactions of proteins. Existing high-throughput experimental techniques assay protein-protein interactions, yet they do not provide any direct information on the interactions among domains. Inferences concerning the latter can be made by analysis of the domain composition of a set of proteins and their interaction map. This inference problem is non-trivial, however, due to the high level of noise generally present in experimental data concerning protein-protein interactions. This noise leads to contradictions, i.e. the impossibility of having a pattern of domain interactions compatible with the protein-protein interaction map. Results: We formulate the problem of prediction of protein domain interactions in a form that lends itself to the application of belief propagation, a powerful algorithm for such inference problems, which is based on message passing. The input to our algorithm is an interaction map among a set of proteins, and a set of domain assignments to the relevant proteins. The output is a list of probabilities of interaction between each pair of domains. Our method is able to effectively cope with errors in the protein-protein interaction dataset and systematically resolve contradictions. We applied the method to a dataset concerning the budding yeast Saccharomyces cerevisiae and tested the quality of our predictions by cross-validation on this dataset, by comparison with existing computational predictions, and finally with experimentally available domain interactions. Results compare favourably to those by existing algorithms. Availability: A C language implementation of the algorithm is available upon request.

    Johnson, Colin G. (2008) A Design Framework for Metaheuristics. Artificial Intelliigence Review, 29 (2). pp. 163-178. ISSN 1573-7462.

    Abstract

    This paper is concerned with taking an engineering approach towards the application of metaheuristic problem solving methods, i.e. heuristics that aim to solve a wide variety of problems. How can a practitioner solve a problem using metaheuristic methods? What choices do they have, and how are these choices influenced by the problem at hand? Are there sensible universal choices which can be made, or are these choices always problem-dependent? The aim of this paper is to address questions such as these in the context of a (soft) engineering design framework for the application of metaheuristics. The aim of this framework is to make explicit the choices which a practitioner needs to make in applying these techniques, and to give some guidelines for how metaheuristics might be tuned to problems by considering different problem- and solution-types.

    Fuller, Ursula and Johnson, Colin G. and Ahoniemi, T. et al. (2007) Developing a Computer Science-specific Learning Taxonomy. ACM SIGCSE Bulletin, 39 (4). pp. 152-170. ISSN 0097-8418.

    Abstract

    Bloom's taxonomy of the cognitive domain and the SOLO taxonomy are being increasingly widely used in the design and assessment of courses, but there are some drawbacks to their use in computer science. This paper reviews the literature on educational taxonomies and their use in computer science education, identifies some of the problems that arise, proposes a new taxonomy and discusses how this can be used in application-oriented courses such as programming.

    Stepney, S. and Braunstein, S.L. and Clark, J.A. et al. (2006) Journeys in Non-Classical Computation II: Initial Journeys and Waypoints. International Journal of Parallel, Emergent and Distributed Systems, 21 (2). pp. 97-125. ISSN 1744-5779.

    Stepney, S. and Braunstein, S.L. and Clark, J.A. et al. (2005) Journeys in Non-Classical Computation I: A Grand Challenge for computing research. International Journal of Parallel, Emergent and Distributed Systems, 20 (1). pp. 5-19. ISSN 1744-5760.

    Abstract

    A gateway event is a change to a system that leads to the possibility of huge increases in kinds and levels of complexity. It opens up a whole new kind of phase space to the systemÕs dynamics. Gateway events during evolution of life on earth include the appearance of eukaryotes (organisms with a cell nucleus), an oxygen atmosphere, multi-cellular organisms and grass. Gateway events during the development of mathematics include each invention of a new class of numbers (negative, irrational, imaginary, ...), and dropping Euclid's parallel postulate. A gateway event produces a profound and fundamental change to the system: Once through the gateway, life is never the same again. We are currently poised on the threshold of a significant gateway event in computation: That of breaking free from many of our current Òclassical computationalÓ assumptions. Our Grand Challenge for computer science is to journey through the gateway event obtained by breaking our current classical computational assumptions, and thereby develop a mature science of Non-Classical Computation

    Johnson, Colin G. and Goldman, J.P. and Gullick, W.J. (2004) Simulating complex intracellular processes using object-oriented computational modelling. Progress in Biophysics and Molecular Biology, 86 (3). pp. 379-406. ISSN 0079-6107.

    Abstract

    The aim of this paper is to give an overview of computer modelling and simulation in cellular biology, in particular as applied to complex biochemical processes within the cell. This is illustrated by the use of the techniques of object-oriented modelling, where the computer is used to construct abstractions of objects in the domain being modelled, and these objects then interact within the computer to simulate the system and allow emergent properties to be observed. The paper also discusses the role of computer simulation in understanding complexity in biological systems, and the kinds of information which can be obtained about biology via simulation.

    Goldman, J.P. and Gullick, W.J. and Johnson, Colin G. (2004) Individual-based simulation of the clustering behaviour of epidermal growth factor receptors. Scientific Programming, 12 (1). pp. 25-43. ISSN 1058-9244.

    Abstract

    The paper describes ongoing work on a project to simulate the behavior of epidermal growth factor receptors. These are structures which can be found on the surface of cells in the body, which receive and process chemical signals concerned with cell growth. The implementation of a program which simulates the stimulation and clustering behavior of these structures is described, then the paper discusses how the simualtion can be scaled up so that a whole cell can be simulated on a tractable timescale. Finaly some early results are given which show the effect of changing parameters in the system, and discuss ongoning work on calibrating the simulation against results from experiments.

    Johnson, Colin G. (2003) Exploring sound-space with interactive genetic algorithms. Leonardo, 36 (1). pp. 51-54. ISSN 0024-094X.

    Abstract

    This article describes a system which uses evolutionary computation to provide an interface to a complex sound synthesis algorithm. The article then considers a number of general issues which need to be considered when evolutionary computation is applied in artistic domains, and the differences between interactive and non-interactive genetic algorithms.

    Johnson, Colin G. and Romero Cardalda, Juan Jesus (2002) Evolutionary Computing in Visual Art and Music. Leonardo, 35 (2). pp. 175-184. ISSN 0024-094X.

    Abstract

    This paper is an introduction to the special section of Leonardo on Genetic Algorithms in Visual Art and Music, which arose from a workshop at the 2000 Genetic and Evolutionary Computing Conference. This introduction gives a background review of the area, takes a look at some open questions provoked by the workshop, and summarizes the papers in the section.

    Johnson, Colin G. (2001) Understanding complex systems through examples: A framework for qualitative example finding. Systems Research and Information Systems, 10 (3-4). pp. 239-267.

    Abstract

    Many complex information systems in science, business and design have the characteristic that we can classify objects in the system in some way, but that these classifications are distributed through a parameter space in some complex fashion. In order for a human to get an understanding of the system, we would like to present this user with one example of an object for each class. Examples of such problems can be found in information retrieval, bioinformatics, computational geometry, computer-aided design, software testing and cellular automata. In this paper we will show how problems in all these areas can be put into a general framework of finding qualitative examples, and argue that general heuristic approaches to this type of problem are an important and neglected area of machine learning. We contrast this with some other well-studied problems, showing how this problem is distinct and investigating what we can learn from these problems. We then discuss some of the requirements for a heuristic to solve these problems, and mention some recent work on this using genetic algorithms.

    Johnson, Colin G. and Marsh, Duncan (1999) Modelling robot manipulators with multivariate B-splines. Robotica, 17 (3). pp. 239-247. ISSN 0263-5747.

    Abstract

    Modelling robot manipulators using multivariate B-splines Colin G. Johnson and Duncan Marsh In programming robot manipulators to carry out a wide variety of tasks it would be desirable to create a cad system in which these tasks can be programmed at the task level, leaving the fine-grained detail of path planning and collision detection to the system. This paper describes the theoretical background to such a system, by providing a model in which robot motions are represented using multivariate B-splines, a standard representation for free-form shapes in the cad environment. The paper also describes algorithms which take this representation and apply it to collision detection and path-planning.

Book Sections

    Otero, F. and Segond, M. and Freitas, A.A. et al. (2009) An empirical evaluation of the effectiveness of different types of predictor attributes in protein function prediction. In: Abraham, A. and Hassanien, A.-E. and Snael, V. Studies in Computational Intelligence. Studies in Computational Intelligence 205, Foundations of Computational Intelligence Vol. 5. Springer, Berlin, pp. 339-357. ISBN 3642015352.

    Iqbal, M. and Freitas, A.A. and Johnson, Colin G. (2009) A hybrid rule-induction/likelihood-ratio based approach for predicting protein-protein interactions. In: Mumford, Christine L. and Jain, Lakhmi C. Computational Intelligence: Collaboration, Fusion and Emergence. Intelligent Systems Reference Library. Springer, pp. 623-637. ISBN 978-3-642-01798-8.

    Johnson, Colin G. and McIntyre, E. and Gullick, W.J. (2008) Computational and Mathematical Modelling of the EGF Receptor System. In: Hayley, J.D. and Gullick, W.J. EGFR Signaling Networks in Cancer Therapy. Cancer Drug Discovery & Development. Humana Press, USA, pp. 209-220. ISBN 9781588299482.

    Abstract

    This chapter gives an overview of computational and mathematical modelling of the EGF receptor system. It begins with a survey of motivations for producing such models, then describes the main approaches that are taken to carrying out such modelling, viz. differential equations and individual-based modelling. Finally, a number of projects that applying modelling and simulation techniques to various aspects of the EGF receptor system are described.

    Gounaropoulos, Alex and Johnson, Colin G. (2006) Synthesising Timbres and Timbre-Changes from Adjectives/Adverbs. In: Rothlauf, F. and Branke, J. and Cagnoni, S. et al. Applications of Evolutionary Computing. Lecture Notes in Computer Science, 3907. Springer-Berlin/Heidelberg, pp. 664-675. ISBN 978-3-540-33237-4.

    Abstract

    Synthesising timbres and changes to timbres from natural language descriptions is an interesting challenge for computer music. This paper describes the current state of an ongoing project which takes a machine learning approach to this problem. We discuss the challenges that are presented by this, discuss various strategies for tackling this problem, and explain some experimental work. In particular our approach is focused on the creation of a system that uses an analysis-synthesis cycle to learn and then produce such timbre changes.

    Johnson, Colin G. (2005) Does a functioning mind need a functioning body? In: Davis, Darryl N. Visions of Mind. Idea Group Publishing, pp. 307-321. ISBN 1-59140-483-5.

    Abstract

    In recent years, the idea that somatic processes are intimately involved in actions traditionally considered to be purely mental has come to the fore. In particular, these arguments have revolved around the concept of somatic markers, i.e., bodily states that are generated by mind and then reperceived and acted upon. This chapter considers the somatic marker hypothesis and related ideas from the point of view of postclassical computation, i.e., the view that computing can be seen as a property of things-in-the-world rather than of an abstract class of mathematical machines. From this perspective, a number of ideas are discussed: the idea of somatic markers extending into the environment, an analogy with hardware interlocks in complex computer driven systems, and connections with the idea of just-do-it computation.

    Johnson, Colin G. (2004) What kinds of natural processes can be regarded as computations? In: Paton, R. Computation in Cells and Tissues: Perspectives and Tools of Thought. Springer. ISBN 978-3540003588.

    Abstract

    This chapter is concerned with how computational ideas can be used as the basis for understanding biological systems, not by simulating such systems, but by taking a computational stance towards the way such systems work. A number of issues are addressed. Firstly the question of what kinds of computer science are needed to help understand computational processes which happen outside of conventional computing machines. The second issue addressed places computational constraints on how the world can act into Dennett's framework of grades of possibility. The final main section considers the issue of changes in the world, and when it is meaningful to regard such changes as carrying out computations.

Conference Items

    Otero, Fernando E. B. and Castle, Tom and Johnson, Colin G. (2012) EpochX: Genetic Programming in Java with Statistics and Event Monitoring. In: Proceedings of the 2012 Genetic and Evolutionary Conference Companion (GECCO 2012).

    Abstract

    EpochX is a Genetic Programming (GP) framework written in Java. It allows the creation of tree-based and grammar-based GP systems. It has been created to reflect typical ways in which Java programmers work, for example, borrowing from the Java event model and taking inspiration from the Java collections framework. This paper presents EpochX in general, and gives particular attention to the event model and the statistics analysis framework.

    Castle, Tom and Johnson, Colin G. (2012) Evolving Program Trees with Limited Scope Variable Declarations. In: Proceedings of the 2012 IEEE Congress on Evolutionary Computation.

    Abstract

    Variables are a fundamental component of computer programs. However, rarely has the construction of new variables been left to the evolutionary process of a tree-based Genetic Programming system. We present a series of modifications to an existing GP approach to allow the evolution of high-level imperative programs with limited scope variables. We make use of several new program constructs made possible by the modifications and experimentally compare their use. Our results suggest the impact of variable declarations is problem dependent, but can potentially improve performance. It is proposed that the use of variable declarations can reduce the degree of insight required into potential solutions.

    Moraglio, Alberto and Otero, Fernando E.B. and Johnson, Colin G. et al. (2012) Evolving Recursive Programs using Non-recursive Scaffolding. In: Proceedings of the 2012 IEEE World Congress on Computational Intelligence.

    Abstract

    Genetic programming has proven capable of evolving solutions to a wide variety of problems. However, the successes have largely been with programs without iteration or recursion; evolving recursive programs has turned out to be particularly challenging. The main obstacle to evolving recursive programs seems to be that they are particularly fragile to the application of search operators: a small change in a correct recursive program generally produces a completely wrong program. In this paper, we present a simple and general method that allows us to pass back and forth from a recursive program to an associated non-recursive program. Finding a recursive program can be reduced to evolving non-recursive programs followed by converting the optimum non-recursive program found to the associated optimum recursive program. This avoids the fragility problem above, as evolution does not search the space of recursive programs. We present promising experimental results on a test-bed of recursive problems.

    Johnson, Colin G. (2012) Search-based Evolutionary Operators for Extensionally-defined Search Spaces: Applications to Image Search. In: Proceedings of the 2012 IEEE World Congress on Computational Intelligence.

    Abstract

    This paper explores the idea of applying evolutionary algorithms to those search spaces that are defined extensionally, i.e. by listing every item in the space. When these spaces are with a function that returns similar elements given a key element, analogies of mutation and crossover can be defined. This idea is discussed in general, and specific examples are given where the search is for images, in particular where image search is carried out using an interactive genetic algorithm.

    Castle, Tom and Johnson, Colin G. (2012) Evolving High-Level Imperative Program Trees with Strongly Formed Genetic Programming. In: Proceedings of the 15th European Conference on Genetic Programming, EuroGP 2012.

    Abstract

    We present a set of extensions to Montana's popular Strongly Typed Genetic Programming system that introduce constraints on the structure of program trees. It is demonstrated that these constraints can be used to evolve programs with a naturally imperative structure, using common high-level imperative language constructs such as loops. A set of three problems including factorial and the general even-n-parity problem are used to test the system. Experimental results are presented which show success rates and required computational effort that compare favourably against other systems on these problems, while providing support for this imperative structure.

    Howard, Sam and Silla Jr., Carlos N. and Johnson, Colin G. (2011) Automatic Lyrics-based Music Genre Classification in a Multilingual Setting. In: Thirteenth Brazilian Symposium on Computer Music, 31st August--3rd September 2011, Vitória, Brasil.

    Abstract

    A large amount of research has been undertaken with regard to the classification of lyrics into genres, but most of this work has featured solely English lyrics. This study investigates the implications of classifying a multilingual database and the effectiveness of a number of techniques and algorithms for doing so. Part of this involves the creation of a high-quality dataset for use in this research. This paper finds that there are significant challenges in preprocessing multilingual text, and that traditional techniques like stemming and stop words may actually do more harm than good in such circumstances. It also finds that classes with strong language bias may be more likely to perform better than those with multiple languages.

    Moraglio, A. and Otero, F.E.B. and Johnson, C.G. (2010) The ACO Encoding. In: Swarm Intelligence - 7th International Conference (ANTS 2010).

    Okasha, Ahmed and Johnson, Colin G. (2010) The Effect of Level of Rationality on Macro-Activities of the Lucas-Island Model. In: 2010 IEEE Congress on Evolutionary Computation.

    Abstract

    AbstractThis paper investigates the effect of different levels of rationality on the Lucas-Islands model of economic behaviour. In particular, this is studied through the use of Agentbased Computational Economics, where individual economic agents are represented by separate computational entities in an interacting computer simulation. Three different economic models are studied: one where workers are assigned randomly to firms, the second where there is loyalty to firms from workers, and the third where workers have a broader set of criteria on which to make a job choice. Simulations show that there are positive relationships between level of rationality and several factors in the model, i.e. wage, vacancy rate and production, whilst unemployment level is negatively correlated with level of rationality.

    Cattani, Phil T. and Johnson, Colin G. (2010) ME-CGP: Multi Expression Cartesian Genetic Programming. In: Proceedings of the 2010 IEEE World Congress on Computational Intelligence.

    Castle, Tom and Johnson, C. G. (2010) Positional Effect of Crossover and Mutation in Grammatical Evolution. In: Proceedings of the 13th European Conference on Genetic Programming, EuroGP 2010.

    Abstract

    An often-mentioned issue with Grammatical Evolution is that a small change in the genotype, through mutation or crossover, may completely change the meaning of all of the following genes. This paper analyses the crossover and mutation operations in GE, in particular examining the constructive or destructive nature of these operations when occurring at points throughout a genotype. The results we present show some strong support for the idea that events occurring at the first positions of a genotype are indeed more destructive, but also indicate that they may be the most constructive crossover and mutation points too. We also demonstrate the sensitivity of this work to the precise definition of what is constructive/destructive.

    Otero, F.E.B. and Freitas, A.A. and Johnson, Colin G. (2009) Handling continuous attributes in ant colony classification algorithms. In: Proc. of the 2009 IEEE Symposium on Computational Intelligence in Data Mining (CIDM 2009), Mar 30-Apr 02, 2009, Nashville, TN,.

    Abstract

    Most real-world classification problems involve continuous (real-valued) attributes, as well as, nominal (discrete) attributes. The majority of Ant Colony Optimisation (ACO) classification algorithms have the limitation of only being able to cope with nominal attributes directly. Extending the approach for coping with continuous attributes presented by cAnt-Miner (Ant-Miner coping with continuous attributes), in this paper we propose two new methods for handling continuous attributes in ACO classification algorithms. The first method allows a more flexible representation of continuous attributes' intervals. The second method explores the problem of attribute interaction, which originates from the way that continuous attributes are handled in cAnt-Miner, in order to implement an improved pheromone updating method. Empirical evaluation on eight publicly available data sets shows that the proposed methods facilitate the discovery of more accurate classification models.

    Abstract

    This paper presents an agent-based computational economics model (ACE) to study demand-pull and cost-push inflation. Moreover, it studies the effect of different levels of rationality on the equilibrium price and unemployment rate. The model examines three different economies. In the first economy workers choose firms randomly, in the second economy there is loyalty between workers and firms. In the last scenario workers are persistence to find jobs. Simulations show that there is a positive relationship between equilibrium price and level of rationality while there is a negative relationship with unemployment rate. Moreover, the model is able to reproduce the behaviour of demand-pull inflation and cost-push inflation without homogeneous and perfectly rational agents assumptions.

    Beadle, Lawrence and Johnson, Colin G. (2009) Semantically Driven Mutation in Genetic Programming. In: Proceedings of the 2009 IEEE Congress on Evolutionary Computation.

    Abstract

    Using semantic analysis, we present a technique known as semantically driven mutation which can explicitly detect and apply behavioural changes caused by the syntactic changes in programs that result from the mutation operation. Using semantically driven mutation, we demonstrate increased performance in genetic programming on seven benchmark genetic programming problems over two different domains.

    Otero, F.E.B. and Freitas, A.A. and Johnson, Colin G. (2009) A hierarchical classification ant colony algorithm for predicting gene ontology terms. In: Proc. 7th European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBio-2009), APR 15-17, 2009, Tubingen, GERMANY.

    Abstract

    This paper proposes a novel Ant Colony Optimisation algorithm for the hierarchical problem of predicting protein functions using the Gene Ontology (GO). The GO structure represents a challenging case of hierarchical classification, since its terms are organised in a direct acyclic graph fashion where a term can have more than one parent in contrast to only one parent in tree structures. The proposed method discovers an ordered list of classification rules which is able to predict all GO terms independently of their level. We have compared the proposed method against a baseline method, which consists of training classifiers for each GO terms individually, in five different ion-channel data sets and the results obtained are promising.

    Cattani, Phil T. and Johnson, Colin G. (2009) Typed Cartesian Genetic Programming for Image Classification. In: Proceedings of the 2009 UK Workshop on Computational Intelligence.

    Abstract

    This paper introduces an extension to Cartesian Genetic Programming (CGP), aimed at image classification problems. Individuals in the population consist of two layers of functions: image processing functions, and traditional mathematical functions. Information can be passed between these layers, and the final result can either be an image or a numerical value. This has been applied to image classification, by using CGP to evolve image processing algorithms for feature extraction. This paper presents results which show that these automatically extracted features can substantially increase classification accuracy on a medical problem concerned with the analysis of potentially cancerous cells.

    Iqbal, M. and Freitas, A.A. and Johnson, Colin G. (2008) Protein interaction inference using particle swarm optimization algorithm. In: 6th European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics, March 26th - 28th 2008, Naples, Italy.

    Abstract

    Many processes in the cell involve interaction among the proteins and determination of the networks of such interactions is of immense importance towards the complete understanding of cellular functions. As the experimental techniques for this purpose are expensive and potentially erroneous, there are many computational methods being put forward for prediction of protein-protein interactions. These methods use different genomic features for indirect inference of protein- protein interactions. As the interaction among two proteins is facilitated by domains, there are many methods being put forward for inference of such interactions using the specificity of assignment of domains to proteins. We present here an heuristic optimization method, particle swarm optimization, which predicts protein-protein interaction by using the domain assignments information. Results are compared with another known method which predicts domain interactions by casting the problem of interactions inference as a maximum satisfiability (MAX-SAT) problem.

    Beadle, Lawrence and Johnson, Colin G. (2008) Semantically Driven Crossover in Genetic Programming. In: IEEE World Congress on Computational Intelligence, JUN 01-06, 2008, Hong Kong, China.

    Abstract

    Crossover forms one of the core operations in genetic programming and has been the subject of many different investigations. We present a novel technique, based on semantic analysis of programs, which forces each crossover to make candidate programs take a new step in the behavioural search space. We demonstrate how this technique results in better performance and smaller solutions in two separate genetic programming experiments.

    Johnson, Colin G. (2008) Multi-Level Neutrality in Optimization. In: Proceedings of the 2008 IEEE World Congress on Computational Intelligence, Jun 01-06, 2008, Hong Kong, Peoples R China.

    Abstract

    This paper explores the idea of neutrality in heuristic optimization algorithms. In particular, the effect of having multiple levels of neutrality in representations is explored. Two experiments using a fitness-adaptive walk algorithm are carried out: the first is concerned with function optimization with Random Boolean Networks, the second with a tunable neutral mapping applied to the hierarchical if-and-only-if function. In both of these cases it is shown that a two-level neutral mapping can be found that performs better than both nonneutral mappings and mappings with a single level of neutrality.

    Otero, F.E.B. and Freitas, A.A. and Johnson, Colin G. (2008) cAnt-Miner: an ant colony classification algorithm to cope with continuous attributes. In: Ant Colony Optimization and Swarm Intelligence (Proc. ANTS 2008), LNCS 5217, SEP 22-24, 2008, Brussels, BELGIUM.

    Abstract

    This paper presents an extension to Ant-Miner, named cAnt-Miner (Ant-Miner coping with continuous attributes), which incorporates an entropy-based discretization method in order to cope with continuous attributes during the rule construction process. By having the ability to create discrete intervals for continuous attributes "on-the-fly", cAnt-Miner does not requires a discretization method in a preprocessing step, as Ant-Miner requires. cAnt-Miner has been compared against Ant-Miner in eight public domain datasets with respect to predictive accuracy and simplicity of the discovered rules. Empirical results show that creating discrete intervals during the rule construction process facilitates the discovery of more accurate and significantly simpler classification rules

    Correa, E.S. and Freitas, A.A. and Johnson, Colin G. (2007) Particle swarm and bayesian networks applied to attribute selection for protein functional classification. In: Yu, T. ACM pp. 2651-2658. ISBN 978-1-59593-698-1.

    Abstract

    The Discrete Particle Swarm (DPSO) algorithm is an optimizationmethod that belongs to the fertile paradigm of Swarm Intelligence. The DPSO was designed for the task of attribute selection and it deals with discrete variables in a straightforward manner. This work extends the DPSO algorithm in two ways. First, we enable the DPSO to select attributes for a Bayesian network algorithm, which is a much more sophisticated algorithm than the Naive Bayes classifier previously used by this algorithm. Second, we apply the DPSO to a challenging protein functional classification data set, involving a large number of classes to be predicted. The performance of the DPSO is compared to the performance of a Binary PSO on the task of selecting attributes in this challenging data set. The criteria used for comparison are: (1) maximizing predictive accuracy; and (2) finding the smallest subset of attributes.

    Johnson, Colin G. (2007) Genetic Programming with Fitness based on Model Checking. In: Genetic Programming, April 11-13, 2007, Valencia, Spain.

    Abstract

    Model checking is a way of analysing programs and program-like structures to decide whether they satisfy a list of temporal logic statements describing desired behaviour. In this paper we apply this to the fitness checking stage in an evolution strategy for learning finite state machines. We give experimental results consisting of learning the control program for a vending machine.

    Johnson, Colin G. and Fuller, Ursula (2007) Is Bloom's Taxonomy Appropriate for Computer Science? In: Proceedings of the Sixth Baltic Sea Conference on Computing Education Research.

    Abstract

    Bloom's taxonomy attempts to provide a set of levels of cognitive engagement with material being learned. It is usually presented as a generic framework. In this paper we outline some studies which examine whether the taxonomy is appropriate for computing, and how its application in computing might differ from its application elsewhere. We place this in the context of ongoing debates concerning graduateness and attempts to benchmark the content of a computing degree.

    Johnson, Colin G. (2007) A Genetic Algorithm for Coverage Problems. In: Proceedings of the 2007 Genetic and Evolutionary Computation Conference.

    Abstract

    This paper describes a genetic algorithm approach to coverage problems, that is, problems where the aim is to discover an example for each class in a given classifcation scheme.

    Johnson, Colin G. (2006) A Design Framework for Metaheuristics: Problem Types and Avoiding Bottlenecking. In: The 6th International Conference on Recent Advances in Soft Computing, 10-12 July 2006, Canterbury, United Kingdom.

    Abstract

    This paper is concerned with an aspect of the design of metaheuristic algorithms, such as evolutionary algorithms, tabu search and ant colony optimization. The topic that is considered is how problems can be represented when they are given to a metaheuristic algorithm. A particular difficulty is presented, viz. the ''bottleneck'', where the problem is artificially converted into a new representation in order to fit the standard input to the metaheuristic. Such bottlenecks cause problems in interpreting or trusting the solution given by the metaheuristic. In order to alleviate this problem, we suggest ways in which three types of problem (data-driven, specification-driven and interactive) can be presented to metaheuristics in a bottleneck-free way, and how problems which use multiple solution-types can be tackled.

    Johnson, Colin G. and Gounaropoulos, Alex (2006) Timbre Interfaces using Adjectives and Adverbs. In: Schnell, Norbert IRCAM, Paris, France pp. 101-102. ISBN 2-84426-314-3.

    Abstract

    How can we provide interfaces to synthesis algorithms that will allow us to manipulate timbre directly, using the same timbre-words that are used by human musicians to communicate about timbre? This paper describes ongoing work that uses machine learning methods (principally genetic algorithms and neural networks) to learn (1) to recognise timbral characteristics of sound and (2) to adjust timbral characteristics of existing synthesized sounds.

    Johnson, Colin G. (2006) Abstract Interpretation of Student Programs as a Strategy for Courseware Development. In: Methods, Materials and Tools for Programming Education, May 2006, 14-20, Tampere, Finland.

    Abstract

    What kinds of feedback can we give to students as part of a computer-based system for supporting programming? One kind of feedback is whether the idioms that students use in their programs are typical of experienced programmers. In this paper we talk about how such idioms/patterns/roles/clich303251s are used in a tacit fashion by experienced programmers, and how natural language tags for such idioms (e.g. roles of variables) can be used to articulate this knowledge. Based on these ideas we suggest a strategy for giving feedback to students in courseware: students annotate their program using an annotation language that captures these idioms; then abstract interpretation of programs is used by the courseware to check these annotations and provide focused feedback. As a case study, we show how roles of variables can be annotated and those annotations automatically checked in the BlueJ programming environment.

    Bishop, Craig and Johnson, Colin G. (2005) Roles of Variables and Program Analysis. In: Proceedings of the 5th Finnish/Baltic Conference on Computer Science Education.

    Abstract

    The idea of roles of variables is to provide a vocabulary for describing the way in which variables are used by experienced programmers. This paper presents work on a system that is designed to automatically check students' role assignments in simple procedural programming. This is achieved by applying program analysis techniques, in particular program slicing and data flow analysis, to programs that students have written and annotated with role assignments.

    Iqbal, M. and Freitas, A.A. and Johnson, Colin G. (2005) Varying the Topology and Probability of Re-Initialization in Particle Swarm Optimization. In: Evolution Artificielle 2005.

    Abstract

    This paper introduces two new versions of dissipative particle swarm optimization. Both of these use a new time-dependent strategy for randomly re-initializing the positions of the particles. In addition, one variation also uses a novel dynamic neighbourhood topology based on small world networks. We present results from applying these algorithms to two well-known function optimization problems. Both algorithms perform considerably better than both standard PSO and the original dissipative PSO algorithms. In particular one version performs significantly better on high-dimensional problems that are inaccessible to traditional methods.

    Johnson, Colin G. (2005) Search and Notions of Creativity. In: Nineteenth International Joint Conference on Artificial Intelligence, 30 July - 5 August 2005, Edinburgh, Scotland.

    Abstract

    In this paper we consider a number of issues concerned with how creativity can exist within a paradigm of computational search. We look both at creativity within the search process and the capacity of the search space to facilitate creativity. Within the latter we consider notions of compressibility and openness in search spaces; the influence of modularity and substructures on creativity; and the distinction between items in a search space denoting an external reference versus connoting links within and beyond the search space. We go on to discuss whether there really is a true distinction between transformational and exploratory notions of creativity within the context of search. Finally we offer some open questions in this area.

    Johnson, Colin G. (2005) What can Post-Classical Computation do for Post-Cognitivist Psychology? In: Post-Cognitivist Psychology Conference 2005, 4-6 July 2005, Glasgow, Scotland.

    Abstract

    This brief note is the abstract for a talk given at an interdisciplinary conference which brought together a number of disciplines to tackle questions in psychology and cognitive science. It describes some of the implications of non-classical computation for cognitive science.

    Johnson, Colin G. (2005) Non-classical computation the computationalist stance towards the natural and cognitive sciences. In: The Grand Challenge in Non-Classical Computation, 18-19 April 2005, York, United Kingdom.

    Abstract

    This brief note, prepared for a discussion meeting on non-classical computation, takes a look at the implications of non-classical computing for attempts to study natural and cognitive sciences using the idea of computation.

    Johnson, Colin G. (2005) Varieties of Openness in Evolutionary Computation. In: Proceedings of the Workshop on Creative Evolutionary Computation, February 2005., Goldsmiths College,.

    Abstract

    This paper explores ideas of what it means for search spaces for genetic algorithms and similar methods to be ''open'', and considers the implications of such openness for computational creativity.

    Johnson, Colin G. (2004) Using tabu search and genetic algorithms in mathematics research. In: Proceedings of the Fifth International Conference on Recent Advances in Soft Computing, December 2004, Nottingham Trent University.

    Abstract

    This paper discusses an ongoing project which uses computational heuristic search techniques such as tabu search and genetic algorithms as a tool for mathematics research. We discuss three ways in which such search techniques can be useful for mathematicians: in nding counterexamples to conjectures, in enumerating examples, and in nding sequences of transformations between two objects which are conjectured to be related. These problem-types are discussed using examples from topology.

    Green, Jennifer and Whalley, Jacqueline.L. and Johnson, Colin G. (2004) Automatic Programming with Ant Colony Optimization. In: Proceedings of the 2004 UK Workshop on Computational Intelligence, September 2004., Loughborough University,.

    Abstract

    Automatic programming is the use of search techniques to find programs that solve a problem. The most commonly explored automatic programming technique is genetic programming, which uses genetic algorithms to carry out the search. In this paper we introduce a new technique called Ant Colony Programming (ACP) which uses an ant colony based search in place of genetic algorithms. This algorithm is described and compared with other approaches in the literature.

    Johnson, Colin G. (2004) Post-industrial-revolution HCI. In: Coping with Complexity: Sharing New Approaches for the Design of Human-Computer Systems in Complex Settings, September 2004, University of Bath.

    Abstract

    This paper argues that computing in its present state is akin to the state of manufacturing prior to the industrial revolution. It is suggested that eventually an industrial revolution will occur in programming through the use of automated program generation tools, which will allow the rapid creation of programs on-the-fly from what-needs-doing descriptions rather than the how-todo- it descriptions of traditional programming. What would interfaces to computers look like in this context, and how would they aid users in coping with complexity?

    Johnson, Colin G. (2004) Do somatic markers need to be somatic? Analogies from evolution and from hardware interlocks. In: Proceedings of the AISB 2004 Convention.

    Abstract

    This paper considers Damasio's concept of the somatic marker from two new perspectives. The first of these considers them from the point of view of Dawkins's concept of the extended phenotype. This is used to develop the idea of the extended somatic marker, viz. a marker which uses some non-somatic feature of the external world in a similar fashion to the somatic marker. Secondly an analogy is developed with the concept of hardware interlocks in safety-critical systems. This is used to suggest why it is important that somatic markers are bodily states and not just mental markers.

    Stepney, S. and Clark, J.A. and Johnson, Colin G. et al. (2003) Artificial Immune Systems and the Grand Challenge for Non-Classical Computation. In: 2nd International Conference on Artificial Immune Sysyems, SEP 01-03, 2003, Edinburgh Scotland.

    Abstract

    The UK Grand Challenges for Computing Research is an initiative to map out certain key areas that could be used to help drive research over the next 10 15 years. One of the identified Grand Challenges is Non-Classical Computation, which examines many of the fundamental assumptions of Computer Science, and asks what would result if they were systematically broken. In this discussion paper, we explain how the sub-discipline of Artificial Immune Systems sits squarely in the province of this particular Grand Challenge, and we identify certain key questions.

    Alston, Mark and Robinson, Gary and Johnson, Colin G. (2003) Colour merging for the visualization of biomolecular sequence data. In: 7th International Conference on Information Visualization (IV 2003), JUL 16-18, 2003, LONDON, ENGLAND.

    Abstract

    This paper introduces a novel technique for the visualization of data at various levels of detail. This is based on a colour-based representation of the data, where ``high level'' views of the data are obtained by merging colours together to obtain a summary-colour which represents a number of data-points. This is applied to the problem of visualizing biomolecular sequence data and picking out features in such data at various scales.

    Johnson, Colin G. (2003) Artificial Immune Systems Programming for Symbolic Regression. In: Ryan, C. and Soule, T. and Keijzer, M. et al. LNCS 2610, 2610. Springer pp. 345-353. ISBN 3-540-00971-X.

    Abstract

    Artificial Immune Systems are computational algorithms which take their inspiration from the way in which natural immune systems learn to respond to attacks on an organism. This paper discusses how such a system can be used as an alternative to genetic algorithms as a way of exploring program-space in a system similar to genetic programming. Some experimental results are given for a symbolic regression problem. The paper ends with a discussion of future directions for the use of artificial immune systems in program induction.

    Moss, J.D. and Johnson, Colin G. (2003) An ant colony algorithm for multiple sequence alignment in bioinformatics. In: Pearson, D.W. and Steele, N.C. and Albrecht, R.F. Springer pp. 182-186. ISBN 3-211-00743-1.

    Abstract

    This paper describes a the application of ant colony optimization algorithms, which draw inspiration from the way ants organize themselves in searching for food, to the well-known bioinformatics problem of aligning several protein sequences.

    Johnson, Colin G. (2003) Towards a prehistory of evolutionary and adaptive computation in music. In: Raidl, G. and Corne, D. and Marchiori, E. et al. LNCS 2611, 2610. Springer pp. 502-509. ISBN 3-540-00976-0.

    Abstract

    A number of systems have been created which apply genetic algorithms, cellular automata, artificial life, agents, and other evolutionary and adaptive computation ideas in the creation of music. The aim of this paper is to examine the context in which such systems arose by looking for features of experimental music which prefigure the ideas used in such systems. A number of ideas are explored: the use of randomness in music, the view of compositions as parameterized systems, the idea of emergent structure in music and the idea of musicians performing the role of reactive agents.

    Johnson, Colin G. (2002) Genetic Programming with Guaranteed Constraints. In: Recent Advances in Soft Computing, December 12-13, 2002, Nottingham Trent University, England.

    Abstract

    Genetic programming is a powerful technique for automatically generating program code from a description of the desired functionality. However it is frequently distrusted by users because the programs are generated with reference to a training set, and there is no formal guarantee that the generated programs will operate as intended outside of this training set. This paper describes a way of including constraints into the fitness function of a genetic programming system, so that the evolution is guided towards a solution which satisfies those constraints and so that a check can be made when a solution satisfies those constraints. This is applied to a problem in mobile robotics.

    Johnson, Colin G. (2002) What can automatic programming learn from theoretical computer science? In: Proceedings of the 2002 UK Workshop on Computational Intelligence, 2-4 September 2002, Birmingham, U.K.

    Abstract

    This paper considers two (seemingly) radically different perspectives on the construction of software. On one hand, search-based heuristics such as genetic programming. On the other hand, the theories of programming which underpin mathematical program analysis and formal methods. The main part of the paper surveys possible links between these perspectives. In particular the contrast between inductive and deductive approaches to software construction are studied, and various suggestions are made as to how randomized search heuristics can be combined with formal approaches to software construction without compromising the rigorous provability of the results. The aim of the ideas proposed is to improve the efficiency, effectiveness and safety of search-based automatic programming.

    Whalley, Jacqueline L. and Tuite, M.F. and Johnson, Colin G. (2002) A virtual lab for exploring the [PSI]+ yeast prion. In: Valafar, Faramarz CSERA Press, USA pp. 583-589. ISBN 1-892512-32-7.

    Abstract

    This paper discusses on going work on simulating the propogation within the cell of a prion protein in yeast. The biological background to the project is outlined, and a number of questions about this system are posed. The paper then discusses how computer simulation is being used to provide a ''virtual laboratory'' for this system, which will be employed to support and understand real experiments. Futhermore, the development of the application is detailed with emphasis on the challenges encountered to date.

    Johnson, Colin G. (2002) Deriving genetic programming fitness properties by static analysis. In: Lutton, Evelyne and Foster, James A. and Miller, Julian et al. Lecture Notes in Computer Science, 2278. Springer-Verlag, Berlin pp. 299-308. ISBN 978-3540433781.

    Abstract

    Deriving Genetic Programming Fitness Properties by Static Analysis Colin G. Johnson The aim of this paper is to introduce the idea of using static analysis of computer programs as a way of measuring fitness in genetic programming. Such techniques extract information about the programs without explicitly running them, and in particular they infer properties which hold across the whole of the input space of a program. This can be applied to measure fitness, and has a number of advantages over measuring fitness by running members of the population on test cases. The most important advantage is that if a solution is found then it is possible to formally trust that solution to be correct across all inputs. This paper introduces these ideas, discusses various ways in which they could be applied, discusses the type of problems for which they are appropriate, and ends by giving a simple test example and some questions for future research.

    Goldman, J.P. and Gullick, W.J. and Bray, D. et al. (2002) Individual-based simulation of the clustering behaviour of epidermal growth factor receptors. In: Proceedings of the 2002 ACM Symposium on Applied Computing, March 11 - 14, 2002, Madrid, Spain.

    Abstract

    This paper describes ongoing work on a project to simulate the behaviour of epidermal growth factor receptors. These are structures which can be found on the surface of cells in the body, which receive and process chemical signals concerned with cell growth. We describe the implementation of a program which simulates the stimulation and clustering behaviour of these structures, discuss how we scale up this simulation so that we can simulate a whole cell on a tractable timescale, and discuss ongoing work in which we are calibrating our simulation against results from experiments.

    Johnson, Colin G. (2001) Finding Diverse Examples with Genetic Algorithms. In: Developments in Soft Computing, 2000, Leicester, England.

    Abstract

    A number of real-world problems can be seen as instances of the general problem of emphfinding qualitative examples. In such a problem we know how to classify objects from a set into a large number of classes, and we would like to find one specific example for each class. In this paper we outline a number of problems which fit into this category, and analyse some of the requirements for a general heuristic for this category of problem. We then develop a heuristic for this type of problem, based on genetic algorithms, and investigate the application of this heuristic to some test problems.

    Johnson, Colin G. and Brodie, Steven J. (2001) Phase transitions in multi-robot interactions. In: Towards Intelligent Mobile Robots - Proceedings of the 3rd British Conference on Autonomous Mobile Robotics and Autonomous Systems, 5 April 2001, Manchester.

    Abstract

    Phase transitions are where a small change in a parameter causes a qualitative change in the behaviour or state of a system. We can apply this concept to determining the connectivity of a population of mobile robots. Consider a population of mobile robots which can pass messages to other nearby robots. When the number of robots is small, or their working environment is large, robots will only be able to pass a message to a small number of others. However there is a point where if we increase the population by a small number, increase the communication range slightly, or reduce the working area by a small amount, the connectivity of the population increases suddenly to a point where almost all robots are communicating with each other. The consequences of this for multi-robot learning are discussed.

    Johnson, Colin G. and Jones, Gareth J.F. (1999) Effective affective communication in virtual environments. In: Proceedings of the Second Workshop on Intelligent Virtual Agents, September 1999, University of Salford.

    Abstract

    Studies of communication between entities in virtual environments have tended to focus on the relevant technical issues and its social impact impact. An important component of human communication is the conveying of affective information via voice, facial expression and gestures and other body language. Virtual environments may be populated by representations of human or virtual agent participants. Communications may be between person-person, agent-agent or person-agent. This paper explores the possible use of the affective communication in virtual environments. The desirability of affective communication is examined and some research ideas for developing affective communication in virtual environments are proposed.

    Johnson, Colin G. (1999) Exploring the sound-space of synthesis algorithms using interactive genetic algorithms. In: Proceedings of the AISB'99 Symposium on Musical Creativity.

    Abstract

    Exploring the sounds available from a synthesis algorithm is a complicated process, requiring the user either to spend much time gaining heuristic experience with the algorithm or requiring them to have a deep knowledge of the underlying synthesis algorithms. In this paper we describe a computer system which facilitates a more exploratory approach to sound design, allowing the user to work at the level of the sounds themselves. In this system the synthesis parameters are managed by a genetic algorithm, which is directed by the users' judgements about the sounds in the system. We describe the development of a prototype version of this system, concentrating on an interface to the FOF granular synthesis algorithm from Csound.

    Johnson, Colin G. and Marsh, Duncan (1996) Modelling Robot Manipulators in a CAD Environment Using B-Splines. In: Proceedings of the 1996 IEEE Joint Symposia on Intelligence and Systems.

    Abstract

    A major aim of robotics research is the provision of systems which simplify the programming of robots, enabling experienced designers and engineers to implement robotic devices as part of a larger systems without the need to become expert programmers. Also in the quest for a flexible industrial production system is it desirable to be able to reprogram robots offline, so that they can be doing one task whilst being prepared for another. This paper describes the mathematical and computational background to a system which enables the development and testing of robot programs in a CAD environment in a way that is simple to use and compatible with existing methods used in CAD systems.

Theses
Monographs

    Johnson, Colin G. and Whalley, Jacqueline (2002) Detecting collisions in sets of moving particles: a survey and some experiments. technical_report. University of Kent

    Abstract

    Detecting and responding to collisions between particles is an important requirement for building simulations in computational science. Due to the large number of potential collisions it is impractical to check all possibilities, so the development of algorithms which narrow down the number of possible searches to a small number is important. In this paper we review various algorithms for this task, and give results from a number of experiments which demonstrate the relative efficiency of these algorithms on a fundamental problem of detecting collisions between particles undergoing Brownian motion. The general slant of the paper is towards the development of algorithms for simulating microbiological systems.

    Carter, Janet and Tardivel, Jill and Fincher, Sally et al. (2001) Portrait of 2000/01 Part I Assessments, Part 1: Statistical Analysis. technical_report. UKC, University of Kent at Canterbury

    Abstract

    Every year the CompEd research group has undertaken a ''summer reading'' project. This year we have undertaken a whole group project which utilises the skills and research techniques relevant to CSEd research. This statistical analysis is the first in a series of technical reports to disseminate our findings. Its main use is to present a numerical picture of the relationship between assessments, examinations, modules, and overall performance that highlights factors requiring a more in-depth, qualitative approach. Part 2 of the report will build upon this information and will also involve an analysis of the different approaches to assessment setting between the CS Part I modules and the desired learning outcomes. Part 3 of the report will draw together the evidence from parts 1 and 2 and will also involve a closer examination of individual students performances tracing their progress both between modules and across different types of assessment. This will be situated within the context of the findings from parts 1 and 2. The overall aim is to create a taxonomy of assessment.

Edited Books
Total publications in KAR: 80 [See all in KAR]

School of Computing, University of Kent, Canterbury, Kent, CT2 7NF

Enquiries: +44 (0)1227 824180 or contact us.

Last Updated: 12/11/2012 12:16