Publications by Prof Richard Jones

Articles

Ugawa, T., Ritson, C. G. and Jones, R. E. (2018) 'Transactional Sapphire: Lessons in High Performance, On-the-fly Garbage Collection', ACM Transactions on Programming Languages and Systems. ACM. doi: 10.1145/3226225.

Constructing a high-performance garbage collector is hard. Constructing a fully concurrent 'on-the-fly', compacting collector is much more so. We describe our experience of implementing the Sapphire algorithm as the first on-the-fly, parallel, replication copying, garbage collector for the Jikes RVM Java virtual machine. In part, we explain our innovations such as copying with hardware and software transactions, on-the-fly management of Java's reference types and simple, yet correct, lock-free management of volatile fields in a replicating collector. We fully evaluate, for the first time, and using realistic benchmarks, Sapphire's performance and suitability as a low latency collector. An important contribution of this work is a detailed description of our experience of building an on-the-fly copying collector for a complete JVM with some assurance that it is correct. A key aspect of this is model checking of critical components of this complicated and highly concurrent system.

Blackburn, S. M., Diwan, A., Hauswirth, M., Sweeney, P. F., Amaral, J. N., Brecht, T., Bulej, L., Click, C., Eeckhout, L., Fischmeister, S. et al. (2016) 'The Truth, the Whole Truth, and Nothing but the Truth: A Pragmatic Guide to Assessing Empirical Evaluations', ACM Transactions on Programming Languages and Systems. ACM. doi: 10.1145/2983574.

An unsound claim can misdirect a field, encouraging the pursuit of unworthy ideas and the abandonment of promising ideas. An inadequate description of a claim can make it difficult to reason about the claim, for example to determine whether the claim is sound. Many practitioners will acknowledge the threat of unsound claims or inadequate descriptions of claims to their field. We believe that this situation is exacerbated and even encouraged by the lack of a systematic approach to exploring, exposing, and addressing the source of unsound claims and poor exposition. This paper proposes a framework that identifies three sins of reasoning that lead to unsound claims and two sins of exposition that lead to poorly described claims. Sins of exposition obfuscate the objective of determining whether or not a claim is sound, while sins of reasoning lead directly to unsound claims. Our framework provides practitioners with a principled way of critiquing the integrity of their own work and the work of others. We hope that this will help individuals conduct better science and encourage a cultural shift in our research community to identify and promulgate sound claims.

Jones, R. E. and Blackburn, S. (2008) 'International Symposium on Memory Management (ISMM 2008) summary', ACM SIGPLAN Notices. ACM, pp. 12-14. doi: 10.1145/1416216.1416220.

The ACM SIGPLAN International Symposium on Memory Management (ISMM) is a premier forum for research in the management of dynamically allocated memory. We interpret this remit widely: areas of interest include but are not limited to explicit storage allocation and deallocation; garbage collection algorithms and implementations; compiler analyses to aid memory management; interactions with languages, operating systems and hardware (especially the memory system); memory management related tools; and empirical studies of allocation and referencing behavior in programs that make significant use of dynamic memory.

Jones, R. E. (2006) 'Editorial: Five Perspectives on Modern Memory Management: Systems, Hardware and Theory', Science of Computer Programming. Elsevier, pp. 95-97.

Moreau, L., Dickman, P. and Jones, R. E. (2005) 'Birrell's Distributed Reference Listing Revisited', ACM Transactions on Programming Languages and Systems (TOPLAS). ACM, pp. 1-52.

The Java RMI collector is arguably the most widely used distributed garbage collector. Its distributed reference listing algorithm was introduced by Birrell in the context of Network Objects, where the description was informal and heavily biased toward implementation. In this paper, we formalise this algorithm in an implementation-independent manner, which allows us to clarify weaknesses of the initial presentation. In particular, we discover cases critical to the correctness of the algorithm that are not accounted for by Birrell. We use our formalisation to derive an invariant-based proof of correctness of the algorithm that avoids notoriously difficult temporal reasoning. Furthermore, we offer a novel graphical representation of the state transition diagram, which we use to provide intuitive explanations of the algorithm and to investigate its tolerance to faults in a systematic manner. Finally, we examine how the algorithm may be optimised, either by placing constraints on message channels or by tightening the coupling between application program and distributed garbage collector.

Chilimbi, T., Jones, R. E. and Zorn, B. (2001) 'Designing a Trace Format for Heap Allocation Events', ACM SIGPLAN Notices. Edited by T. Hosking. ISMM2000 International Symposium on Memory Management, Minneapolis, MN: ACM, pp. 35-49. doi: 10.1145/362426.362435.

Dynamic storage allocation continues to play an important role in the performance and correctness of systems ranging from user productivity software to high-performance servers. While algorithms for dynamic storage allocation have been studied for decades, much of the literature is based on measuring the performance of benchmark programs unrepresentative of many important allocation-intensive workloads. Furthermore, to date no standard has emerged or been proposed for publishing and exchanging representative allocation workloads. In this paper, we describe a preliminary design of a trace format for such workloads and investigate its effectiveness at representing large allocation traces. Our proposal allows for a flexible encoding of information in the trace to achieve greater compression. We evaluate our preliminary design in two dimensions. First, we measure how effective these encodings are at reducing trace size. Second we consider how a meta-level specification language could be used to describe such formats and to generate trace readers and writers automatically.

Hartel, P. H., Feeley, M., Jones, R. E. and Thomas, S. P. (1996) 'Benchmarking Implementations of Functional Languages with 'Pseudoknot', a Float-Intensive Benchmark', Journal of Functional Programming. Cambridge University Press, pp. 621-655.

Over 25 implementations of different functional languages are benchmarked using the same program, a floating-point intensive application taken from molecular biology. The principal aspects studied are compile time and execution time for the various implementations that were benchmarked. An important consideration is how the program can be modified and tuned to obtain maximal performance on each language implementation. With few exceptions, the compilers take a significant amount of time to compile this program, though most compilers were faster than the then current GNU C compiler (GCC version 2.5.8). Compilers that generate C or Lisp are often slower than those that generate native code directly: the cost of compiling the intermediate form is normally a large fraction of the total compilation time. There is no clear distinction between the runtime performance of eager and lazy implementations when appropriate annotations are used: lazy implementations have clearly come of age when it comes to implementing largely strict applications, such as the Pseudoknot program. The speed of C can be approached by some implementations, but to achieve this performance, special measures such as strictness annotations are required by non-strict implementations. The benchmark results have to be interpreted with care. Firstly, a benchmark based on a single program cannot cover a wide spectrum of 'typical' applications. Secondly, the compilers vary in the kind and level of optimisations offered, so the effort required to obtain an optimal version of the program is similarly varied.

Brown, P. J. and Jones, R. E. (1992) 'Marking EP coursework using electronic communication', Electronic Publishing: Origination, Dissemination and Design.

Jones, R. E. (1992) 'Tail recursion without Space Leaks', Journal of Functional Programming, pp. 73-79. doi: 10.1017/s0956796800000277.

Book sections

Mercier, D., Chawdhary, A. and Jones, R. (2017) 'dynStruct: An automatic reverse engineering tool for structure recovery and memory use analysis', in 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), IEEE, pp. 497-501. doi: 10.1109/SANER.2017.7884661.

dynStruct is an open source structure recovery tool for x86 binaries. It uses dynamic binary instrumentation to record information about memory accesses, which is then processed off-line to recover structures created and used by the binary. It provides a powerful web interface which not only displays the raw data and the recovered structures but also allows this information to be explored and manually edited. dynStruct is an effective tool for analyzing programs as complex as emacs.

Ugawa, T., Jones, R. E. and Ritson, C. G. (2014) 'Reference Object Processing in On-The-Fly Garbage Collection', in Proceedings of the 2014 international symposium on Memory management. ACM/SIGPLAN International Symposium on Memory Management (ISMM14), New York, USA: ACM, pp. 59-69. doi: 10.1145/2602988.2602991.

Most proposals for on-the-fly garbage collection ignore the question of Java's weak and other reference types. However, we show that reference types are heavily used in DaCapo benchmarks. Of the few collectors that do address this issue, most block mutators, either globally or individually, while processing reference types. We introduce a new framework for processing reference types on-the-fly in Jikes RVM. Our framework supports both insertion and deletion write barriers. We have model checked our algorithm and incorporated it in our new implementation of the Sapphire on-the-fly collector. Using a deletion barrier, we process references while mutators are running in less than three times the time that previous approaches take while mutators are halted; our overall execution times are no worse, and often better.

Ritson, C. G., Ugawa, T. and Jones, R. E. (2014) 'Exploring Garbage Collection with Haswell Hardware Transactional Memory', in ISMM '14: Proceedings of the 2014 international symposium on Memory management. ACM/SIGPLAN International Symposium on Memory Management (ISMM14), New York, USA: ACM, pp. 105-115. doi: 10.1145/2602988.2602992.

Intel's latest processor microarchitecture, Haswell, adds support for a restricted form of transactional memory to the x86 programming model. We explore how this can be applied to three garbage collection scenarios in Jikes RVM: parallel copying, concurrent copying and bitmap marking. We demonstrate gains in concurrent copying speed over traditional synchronisation mechanisms of 48–101%. We also show how similar but portable performance gains can be achieved through software transactional memory techniques. We identify the architectural overhead of capturing sufficient work for transactional execution as a major stumbling block to the effective use of transactions in the other scenarios.

White, D. R., Singer, J., Aitken, J. M. and Jones, R. E. (2013) 'Control theory for principled heap sizing', in Proceedings of the 2013 international symposium on memory management. International Symposium on Memory Management, New York, USA: ACM, pp. 27-38. doi: 10.1145/2555670.2466481.

We propose a new, principled approach to adaptive heap sizing based on control theory. We review current state-of-the-art heap sizing mechanisms, as deployed in Jikes RVM and HotSpot. We then formulate heap sizing as a control problem, apply and tune a standard controller algorithm, and evaluate its performance on a set of well-known benchmarks. We find our controller adapts the heap size more responsively than existing mechanisms. This responsiveness allows tighter virtual machine memory footprints while preserving target application throughput, which is ideal for both embedded and utility computing domains. In short, we argue that formal, systematic approaches to memory management should be replacing ad-hoc heuristics as the discipline matures.

Kalibera, T. and Jones, R. (2011) 'Handles revisited: optimising performance and memory costs in a real-time collector', in ISMM '10 Proceedings of the 2010 international symposium on Memory management. Proceedings of the 10th International Symposium on Memory Management (ISMM), New York, USA: ACM, pp. 182-196. doi: 10.1145/1993478.1993492.

Compacting garbage collectors must update all references to objects they move. Updating is a lengthy operation but the updates must be transparent to the mutator. The consequence is that no space can be reclaimed until all references have been updated which, in a real-time collector, must be done incrementally. One solution is to replace direct references to objects with handles. Handles offer several advantages to a real-time collector. They eliminate the updating problem. They allow immediate reuse of the space used by evacuated objects. They incur no copy reserve overhead. However, the execution time overhead of handles has led to them being abandoned by most modern systems. We re-examine this decision in the context of real-time garbage collection, for which several systems with handles have appeared recently. We provide the first thorough study of the overheads of handles, based on an optimised implementation of different handle designs within Ovm's Minuteman real-time collector. We find that with a good set of optimisations handles are not very expensive. We obtained zero overhead over the widely used Brooks-style compacting collector (1.6% and 3.1% on two other platforms) and a 9% increase in memory usage. Our optimisations are particularly applicable to mark-compact collectors, but may also be useful to other collectors.

Hellyer, L., Jones, R. E. and Hosking, A. L. (2010) 'The Locality of Concurrent Write Barriers', in Vitek, J. and Lea, D. (eds). Proceedings of the 2010 International Symposium on Memory Management, Toronto, Canada: ACM, pp. 182-196. doi: 10.1145/1806651.1806666.

Concurrent and incremental collectors require barriers to ensure correct synchronisation between mutator and collector. The overheads imposed by particular barriers on particular systems have been widely studied. Somewhat fewer studies have also compared barriers in terms of their termination properties or the volume of floating garbage they generate. Until now, the consequences for locality of different barrier choices have not been studied, although locality will be of increasing importance for emerging architectures. This paper provides a study of the locality of concurrent write barriers, independent of the processor architecture, virtual machine, compiler or garbage collection algorithm.

Singer, J., Jones, R. E., Brown, G. and Luján, M. (2010) 'The Economics of Garbage Collection', in Vitek, J. and Lea, D. (eds) ISMM '10 Proceedings of the 2010 international symposium on Memory management. Proceedings of the 2010 International Symposium on Memory Management, New York, USA: ACM, pp. 182-196. doi: 10.1145/1806651.1806669.

This paper argues that economic theory can improve our understanding of memory management. We introduce the allocation curve, as an analogue of the demand curve from microeconomics. An allocation curve for a program characterises how the amount of garbage collection activity required during its execution varies in relation to the heap size associated with that program. The standard treatment of microeconomic demand curves (shifts and elasticity) can be applied directly and intuitively to our new allocation curves. As an application of this new theory, we show how allocation elasticity can be used to control the heap growth rate for variable sized heaps in Jikes RVM.

Jones, R. E. and Ryder, C. (2008) 'A Study of Java Demographics', in Blackburn, S. (ed.) ISMM '08 Proceedings of the 7th international symposium on Memory management. Proceedings of the 2008 International Symposium on Memory Management (ISMM'08), New York, USA: ACM, pp. 121-130. doi: 10.1145/1375634.1375652.

Researchers have long strived to exploit program behaviour in order to improve garbage collection efficiency. For example, by using a simple heuristic, generational GC manages short-lived objects well, although longer-lived objects will still be promoted to an older generation and may be processed repeatedly thereafter. In this paper, we provide a detailed study of Java object lifetimes which reveals a richer landscape than the generational view offers. Allocation site has been claimed to be a good predictor for object lifetime, but we show that object lifetime can be categorised more precisely than 'short-lived/long-lived/immortal'. We show that (i) sites allocate objects with lifetimes in only a small number of narrow ranges, and (ii) sites cluster strongly with respect to the lifetime distributions of the objects they allocate. Furthermore, (iii) these clusterings are robust against the size of the input given to the program and (iv) are likely to allocate objects that are live only in particular phases of the program's execution. Finally, we show that, in contrast to previous studies, (v) allocation site alone is not always sufficient as a predictor of object lifetime distribution but one further level of stack context suffices.

Marion, S., Jones, R. and Ryder, C. (2007) 'Decrypting The Java Gene Pool: Predicting Objects' Lifetimes with Micro-patterns', in ISMM '07 Proceedings of the 6th international symposium on Memory management. International Symposium on Memory Management (ISMM07), New York, USA: ACM, pp. 67-78. doi: 10.1145/1296907.1296918.

Pretenuring long-lived and immortal objects into infrequently or never collected regions reduces garbage collection costs significantly. However, extant approaches either require computationally expensive, application-specific, off-line profiling, or consider only allocation sites common to all programs, i.e. invoked by the virtual machine rather than application programs. In contrast, we show how a simple program analysis, combined with an object lifetime knowledge bank, can be exploited to match both runtime system and application program structure with object lifetimes. The complexity of the analysis is linear in the size of the program, so need not be run ahead of time. We obtain performance gains of 6-77% in GC time against a generational copying collector for several SPEC jvm98 programs.

Jones, R. E. (2007) 'Dynamic Memory Management: Challenges for Today and Tomorrow', in International LISP Conference 2007: (ILC 07). International Lisp Conference, Association of LISP Users, pp. 115-124.

Garbage collection (GC) is a key component of almost all modern programming languages. The advent of conventional object-oriented languages supported by managed run-times (e.g. Java, C# and even Managed C++) has brought GC into the mainstream and, as memory manager performance is critical for many large applications, brought GC to the attention of programmers outside its traditional functional programming language community. In this talk, I shall start by reviewing how GC got to where it is today, why it is desirable, what performance you might reasonably expect and I shall outline the directions in which GC research is moving. In particular, I'll look at some of the challenges facing modern GC, in contexts ranging from GC for high-performance, multiprocessor systems to GC for real-time systems and limited devices, from better integrating with its operating environment to supporting specific applications. I shall speculate wildly on future directions for research.

Jones, R. E. and King, A. (2005) 'A fast analysis for thread-local garbage collection with dynamic class loading', in Fifth IEEE International Workshop on Source Code Analysis and Manipulation (SCAM'05). Source Code Analysis and Manipulation, 2005, IEEE, pp. 129-138. doi: 10.1109/SCAM.2005.1.

Long-running, heavily multi-threaded, Java server applications make stringent demands of garbage collector (GC) performance. Synchronisation of all application threads before garbage collection is a significant bottleneck for JVMs that use native threads. We present a new static analysis and a novel GC framework designed to address this issue by allowing independent collection of thread-local heaps. In contrast to previous work, our solution safely classifies objects even in the presence of dynamic class loading, requires neither write-barriers that may do unbounded work, nor synchronisation, nor locks during thread-local collections; our analysis is sufficiently fast to permit its integration into a high-performance, production-quality virtual machine.

Printezis, T. and Jones, R. E. (2002) 'GCspy: An Adaptable Heap Visualisation Framework', in Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications. Proceedings of OOPSLA'02 ACM Conference on Object-Oriented Systems, Languages and Applications, New York, USA: ACM Press, pp. 343-358. doi: 10.1145/582419.582451.

GCspy is an architectural framework for the collection, transmission, storage and replay of memory management behaviour. It makes new contributions to the understanding of the dynamic memory behaviour of programming languages (and especially object-oriented languages that make heavy demands on the performance of memory managers). GCspy's architecture allows easy incorporation into any memory management system: it is not limited to garbage-collected languages. It requires only small changes to the system in which it is incorporated but provides a simple to use yet powerful data-gathering API. GCspy scales to allow very large heaps to be visualised effectively and efficiently. It allows already-running, local or remote systems to be visualised and those systems to run at full speed outside the points at which data is gathered. GCspy's visualisation tool presents this information in a number of novel ways. Deep understanding of program behaviour is essential to the design of the next generation of garbage collectors and explicit allocators. Until now, no satisfactory tools have been available to assist the implementer in gaining an understanding of heap behaviour. GCspy has been demonstrated to be a practical solution to this dilemma. It has been used to analyse production Java virtual machines running applications of realistic sizes. Its use has revealed important insights into the interaction between application program and JVM and has led to the development of better garbage collectors.

Blackburn, S. M., Jones, R., McKinley, K. S. and Moss, J. E. B. (2002) 'Beltway: Getting Around Garbage Collection Gridlock', in Hendren, L. J. (ed.) Proceedings of the ACM SIGPLAN 2002 conference on Programming language design and implementation. Proceedings of PLDI'02 Programming Language Design and Implementation, New York, USA: ACM, pp. 153-164. doi: 10.1145/512529.512548.

We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts independently, and collects increments on a belt in first-in-first-out order. We show that Beltway configurations, selected by command line options, act and perform the same as semi-space, generational, and older-first collectors, and encompass all previous copying collectors of which we are aware. The increasing reliance on garbage collected languages such as Java requires that the collector perform well. We show that the generality of Beltway enables us to design and implement new collectors that are robust to variations in heap size and improve total execution time over the best generational copying collectors of which we are aware by up to 40%, and on average by 5 to 10%, for small to moderate heap sizes. New garbage collection algorithms are rare, and yet we define not just one, but a new family of collectors that subsumes previous work. This generality enables us to explore a larger design space and build better collectors.

Jones, R. E. (2000) 'Session 3: Overview', in Kirby, G., Dearle, A., and Sjoberg, D. (eds) Persistent Object Systems: Design, Implementation, and Use 9th International Workshop. Persistent Object Systems: Design, Implementation, and Use, Berlin, Germany: Springer, pp. 84-86. doi: 10.1007/3-540-45498-5_8.

This session contained four papers, three of which focussed on the implementation of persistent object stores and one, the first of the session, on caching in distributed, transactional, persistent systems.

Jones, R. E., Beckett, D. J. and Fincher, S. (2000) 'Meeting Diverse User Needs: Implementation of a Departmental Information Strategy', in Franklin, S. D. and Strenski, E. (eds) Building University Electronic Educational Environments. IFIP TC3 WG3.2/3.6 International Working Conference on Building University Electronic Educational Environments, Kluwer Academic, pp. 125-139. doi: 10.1007/978-0-387-35502-3_10.

Any higher education department must provide a variety of information to several disparate audiences, recognising their diverse needs. Student expectations of learning support structures, institutional needs for quality assurance and enhancement and the demands of accountability from funding agencies continue to grow. The widespread use of IT, both at home and in schools, has led students to expect a rich networked computing environment; the widened pattern of access to higher education has simultaneously increased the level of support needed and limited access to the campus for many of them. Managing these challenges and changes demands flexibility and auditability: we must be able both to respond to new demands and opportunities and also to account for our actions. At the same time, government-imposed efficiency gains [sic] reduce the resources available to implement change. Technology can provide the mechanism to meet these information needs, educational challenges and to manage changing social and political demands. In this paper we offer pointers abstracted and generalised from our experience for organisational change in similar environments. We explain the development of a large web-based information system (CSWeb) at the Department of Computer Science at the University of Kent at Canterbury designed to meet these challenges. Our goal has been not only to improve our site but also to embed it at the heart of the department's culture. We describe the technical innovations required to manage this large-scale site and that permit the flexibility needed to respond to change. Finally we suggest measures by which the success of enterprises of this kind may be measured. We analyse quantitative data that not only includes page hit counts but also reports users' patterns of activity through the site. Staff acceptance of the system is also measured by the degree to which they have provided further content to the site. Evidence of changes in student and staff expectations, behaviour and study patterns is presented.

Rodrigues, H. C. and Jones, R. E. (1998) 'Cyclic Distributed Garbage Collection with Group Merger', in Jul, E. (ed.) ECOOP'98 — Object-Oriented Programming 12th European Conference. Proceedings of 12th European Conference on Object-Oriented Programming, ECOOP98, Berlin, Germany: Springer, pp. 249-273. doi: 10.1007/BFb0054095.

This paper presents a new algorithm for distributed garbage collection and outlines its implementation within the Network Objects system. It supersedes the earlier version, /pubs/1997/548. The algorithm is based on a reference listing scheme augmented by partial tracing in order to collect distributed garbage cycles. Our collector is designed to be flexible thereby allowing efficiency, expediency and fault-tolerance to be traded against completeness. Processes may be dynamically organised into groups, according to appropriate heuristics, in order to reclaim distributed garbage cycles. Unlike previous group-based algorithms, multiple concurrent distributed garbage collections that span groups are supported: when two collections meet they may either merge, overlap or retreat. The algorithm places no overhead on local collectors and suspends local mutators only briefly. Partial tracing of the distributed graph involves only objects thought to be part of a garbage cycle: no collaboration with other processes is required.

Rodrigues, H. C. and Jones, R. E. (1996) 'A Cyclic Distributed Garbage Collector for Network Objects', in Babaoglu, Ö. and Marzullo, K. (eds) Distributed Algorithms 10th International Workshop. Tenth International Workshop on Distributed Algorithms WDAG'96, Berlin, Germany: Springer, pp. 123-140. doi: 10.1007/3-540-61769-8_9.

This paper presents an algorithm for distributed garbage collection and outlines its implementation within the Network Objects system. The algorithm is based on a reference listing scheme, which is augmented by partial tracing in order to collect distributed garbage cycles. Processes may be dynamically organised into groups, according to appropriate heuristics, to reclaim distributed garbage cycles. The algorithm places no overhead on local collectors and suspends local mutators only briefly. Partial tracing of the distributed graph involves only objects thought to be part of a garbage cycle: no collaboration with other processes is required. The algorithm offers considerable flexibility, allowing expediency and fault-tolerance to be traded against completeness.

Jones, R. E. (1995) 'Introducing More Systematic Quality Assurance Arrangements for Course Approval and Review', in Middlehurst, R. (ed.) Managing for Quality: Stories and Strategies. Higher Education Quality Council, pp. 15-17.

Jones, R. E. and Lins, R. D. (1993) 'Cyclic Weighted Reference Counting without Delay', in PARLE '93 Parallel Architectures and Languages Europe 5th International PARLE Conference. Proceedings of PARLE'93, Berlin, Germany: Springer, pp. 712-715. doi: 10.1007/3-540-56891-3_66.

Monographs

Ugawa, T. and Jones, R. (2018) Model Checking Transactional Sapphire. School of Computing, University of Kent. Available at: https://github.com/rejones/sapphire/blob/master/models/report.pdf.

This report describes how we verified major aspects of the Transactional Sapphire garbage collector for Java.

Kalibera, T. and Jones, R. E. (2012) Quantifying Performance Changes with Effect Size Confidence Intervals. University of Kent. doi: 4--12.

Abstract | View in KAR

Measuring performance and quantifying a performance change are core evaluation techniques in programming language and systems research. Out of 122 recent scientific papers published at PLDI, ASPLOS, ISMM, TOPLAS, and TACO, as many as 65 included experimental evaluation that quantified a performance change using a ratio of execution times. Unfortunately, few of these papers evaluated their results with the level of rigour that has come to be expected in other experimental sciences. The uncertainty of measured results was largely ignored. Scarcely any of the papers mentioned uncertainty in the ratio of the mean execution times, and most did not even mention uncertainty in the two means themselves. Furthermore, most of the papers failed to address the non-deterministic execution of computer programs (caused by factors such as memory placement, for example), and none addressed non-deterministic compilation (when a compiler creates different binaries from the same sources, which differ in performance, for example again because of impact on memory placement). It turns out that the statistical methods presented in the computer systems performance evaluation literature for the design and summary of experiments do not readily allow this either. This poses a hazard to the repeatability, reproducibility and even validity of quantitative results. Inspired by statistical methods used in other fields of science, and building on results in statistics that did not make it to introductory textbooks, we present a statistical model that allows us both to quantify uncertainty in the ratio of (execution time) means and to design experiments with a rigorous treatment of those multiple sources of non-determinism that might impact measured performance. Better still, under our framework summaries can be as simple as "system A is faster than system B by 5.5% ± 2.5%, with 95% confidence", a more natural statement than those derived from typical current practice, which are often misinterpreted.

Hellyer, L., Jones, R. E. and Hosking, A. L. (2010) The Locality of Concurrent Write Barriers (extended version). doi: 3-10.

Abstract | View in KAR

Concurrent and incremental collectors require barriers to ensure correct synchronisation between mutator and collector. The overheads imposed by particular barriers on particular systems have been widely studied. Somewhat fewer studies have also compared barriers in terms of their termination properties or the volume of floating garbage they generate. Until now, the consequences for locality of different barrier choices have not been studied, although locality will be of increasing importance for emerging architectures. This report expands upon our ISMM 2010 paper and provides a study of the locality of concurrent write barriers, independent of the processor architecture, virtual machine, compiler or garbage collection algorithm.

Jones, R. E. and King, A. C. (2004) Collecting the garbage without blocking the traffic. University of Kent. Available at: http://www.cs.kent.ac.uk/pubs/2004/1970/.

Abstract | View in KAR | View Full Text

Java is an increasingly common platform for server-side applications. Such applications are usually long-running, heavily multi-threaded, require very large heaps, are executed on multiprocessors, load classes dynamically and make stringent demands of garbage collector performance. Synchronisation of all application threads in order to perform a collection is shown to be a significant bottleneck but current methods fail to solve this issue. We show how a combination of a new static analysis and novel garbage collector framework can address this issue by allowing independent collection of thread-local heaps. In contrast to previous work, our analysis can classify objects even in the presence of incomplete knowledge; our system is safe in the presence of dynamic class loading; it requires neither synchronisation for nor locks during thread-local collections; and it does not use a write-barrier that may do unbounded work. Finally, our analysis is sufficiently fast to permit its integration into a high-performance, production-quality virtual machine.

Jones, R. E. (2003) DO garbage collection. Kent University. doi: 2-03.

View in KAR

Printezis, T. and Jones, R. E. (2002) GCspy: An Adaptable Heap Visualisation Framework. Computing Laboratory, UKC. doi: 1426.

Abstract | View in KAR | View Full Text

GCspy is an architectural framework for the collection, transmission, storage and replay of memory management behaviour. It makes new contributions to the understanding of the dynamic memory behaviour of programming languages (and especially object-oriented languages that make heavy demands on the performance of memory managers). GCspy's architecture allows easy incorporation into any memory management system: it is not limited to garbage-collected languages. It requires only small changes to the system in which it is incorporated but provides a simple to use yet powerful data-gathering API. GCspy scales to allow very large heaps to be visualised effectively and efficiently. It allows already-running, local or remote, systems to be visualised and those systems to run at full speed outside the points at which data is gathered. GCspy's visualisation tool presents this information in a number of novel ways. Deep understanding of program behaviour is essential to the design of the next generation of garbage collectors and explicit allocators. Until now no satisfactory tools have been available to assist the implementer in gaining an understanding of heap behaviour. GCspy has been demonstrated to be a practical solution to this dilemma. It has been used to analyse production Java virtual machines running applications of realistic size. Its use has revealed important insights into the interaction between application program and JVM and has led to the development of better garbage collectors.

Rodrigues, H. C. and Jones, R. E. (1997) Cyclic Distributed Garbage Collection with Group Merger. University of Kent at Canterbury. doi: 17-97.

Abstract | View in KAR | View Full Text

This paper presents a new algorithm for distributed garbage collection and outlines its implementation within the Network Objects system. The algorithm is based on a reference listing scheme, which is augmented by partial tracing in order to collect distributed garbage cycles. Our collector is designed to be flexible, allowing efficiency, expediency and fault-tolerance to be traded against completeness. Processes may be dynamically organised into groups, according to appropriate heuristics, in order to reclaim distributed garbage cycles. Unlike previous group-based algorithms, multiple concurrent distributed garbage collections that span groups are supported: when two collections meet they may either merge, overlap or retreat. The algorithm places no overhead on local collectors and suspends local mutators only briefly. Partial tracing of the distributed graph involves only objects thought to be part of a garbage cycle: no collaboration with other processes is required.

Thomas, S. P. and Jones, R. E. (1994) Garbage Collecting Shared Environment Closure Reducers. Computing Laboratory, University of Kent at Canterbury. doi: 10.1016/0020-0190(95)00131-U.

Abstract | View in KAR

Shared environment closure reducers such as Fairbairn and Wray's TIM incur a comparatively low cost when creating a suspension, and so provide an elegant method for implementing lazy functional evaluation. However, comparatively little attention has been given to the problems involved in identifying which portions of a shared environment are needed (and ignoring those which are not) during a garbage collection. Proper consideration of this issue has subtle consequences when implementing a storage manager in a TIM-like system. We describe the problem and illustrate the negative consequences of ignoring it. We go on to describe a solution in which the compiler determines statically which portions of that code's environment are required for each piece of code it generates, and emits information to assist the run-time storage manager to scavenge environments selectively. We also describe a technique for expressing this information directly as executable code, and demonstrate that a garbage collector implemented in this way can perform significantly better than an equivalent, table-driven interpretive collector.

Jones, R. E. and Utting, I. (1993) Teaching Electronic Publishing: Learning Software Engineering. University of Kent, Canterbury, UK. doi: 9-93*.

Abstract | View in KAR | View Full Text

This paper describes an experience of teaching electronic publishing to computer scientists. It takes the novel position of using electronic publishing concepts to support computer science education, rather than the more usual reverse.

Jones, R. E. and Lins, R. D. (1992) Cyclic Weighted Reference Counting without Delay. UKC. doi: 28-92*.

Abstract | View in KAR

Lins, R. D. and Jones, R. E. (1991) Cyclic Weighted Reference Counting. UKC.

View in KAR

Jones, R. E. (1990) Tail Recursion Without Space Leaks. UKC. doi: 72*.

View in KAR

Conference or workshop items

Kaleba, S., Larose, O., Jones, R. and Marr, S. (2022) 'Who You Gonna Call? Analyzing the Run-time Call-site Behaviour of Ruby Applications', in 18th ACM SIGPLAN International Symposium on Dynamic Languages, ACM. doi: 10.1145/3563834.3567538.

Abstract | View in KAR | View Full Text

Applications written in dynamic languages are becoming larger and larger and companies increasingly use multi-million line codebases in production. At the same time, dynamic languages rely heavily on dynamic optimizations, particularly those that reduce the overhead of method calls. In this work, we study the call-site behavior of Ruby benchmarks that are being used to guide the development of upcoming Ruby implementations such as TruffleRuby and YJIT. We study the interaction of call-site lookup caches, method splitting, and elimination of duplicate call-targets. We find that these optimizations are indeed highly effective on both smaller and large benchmarks, methods and closures alike, and help to open up opportunities for further optimizations such as inlining. However, we show that TruffleRuby's splitting may be applied too aggressively on already-monomorphic call-sites, coming at a run-time cost. We also find three distinct patterns in the evolution of call-site behavior over time, which may help to guide novel optimizations. We believe that our results may support language implementers in optimizing runtime systems for large codebases built in dynamic languages.

Ugawa, T., Marr, S. and Jones, R. (2022) 'Profile Guided Offline Optimization of Hidden Class Graphs for JavaScript VMs in Embedded Systems', in VMIL '22: Virtual Machines and Language Implementations, ACM, pp. 25-35. doi: 10.1145/3563838.3567678.

Abstract | View in KAR | View Full Text

JavaScript is increasingly used for the Internet of Things (IoT) on embedded systems. However, JavaScript's memory footprint is a challenge, because normal JavaScript virtual machines (VMs) do not fit into the small memory of IoT devices. In part this is because a significant amount of memory is used by hidden classes, which are used to represent JavaScript's dynamic objects efficiently. In this research, we optimize the hidden class graph to minimize its memory use. Our solution collects the hidden class graph and related information for an application in a profiling run, and optimizes the graph offline. We reduce the number of hidden classes by avoiding introducing intermediate ones, for instance when properties are added one after another. Our optimizations allow the VM to assign the most likely final hidden class to an object at its creation. They also minimize re-allocation of storage for property values, and reduce the polymorphism of inline caches. We implemented these optimizations in a JavaScript VM, eJSVM, and found that offline optimization can eliminate 61.9% of the hidden classes on average. It also improves execution speed by minimizing the number of hidden class transitions for an object and reducing inline cache misses.

Kaleba, S., Larose, O., Marr, S. and Jones, R. (2022) 'Who You Gonna Call? A Case Study about the Call-Site Behaviour in Ruby-on-Rails Applications', in MoreVMs'22: Workshop on Modern Language Runtimes, Ecosystems, and VMs. Available at: https://2022.programming-conference.org/details/MoreVMs-2022-papers/4/Who-You-Gonna-Call-A-Case-Study-about-the-Call-Site-Behaviour-in-Ruby-on-Rails-Appli.

Abstract | View in KAR | View Full Text

Web-applications are ubiquitous, from simple personal blogs to e-commerce platforms with millions of sales. Ruby-on-Rails is a popular framework implemented in Ruby that provides tools to build such web-applications. Performance is often critical in the context of large-scale web-applications; especially in dynamic languages such as Ruby that feature reflection and the use of many small methods. Such languages therefore benefit from run-time optimisations, notably through the combined use of lookup caches, splitting and inlining. To limit their overhead, such optimisations generally rely on assumptions that do not necessarily match with the actual run-time behaviour. With Phase-based splitting, we showed that splitting can benefit from using homogeneous patterns of behaviour, called "phases", to reach better performance. In an effort to identify such phases in real-world web-applications, we thoroughly analyse the run-time call-site behaviour of Ruby programs and Ruby-on-Rails applications, running on top of TruffleRuby. This talk describes our findings and aims at guiding future research on call-site optimisation.

Kaleba, S., Marr, S. and Jones, R. E. (2021) 'Avoiding Monomorphisation Bottlenecks with Phase-based Splitting', in 16th ACM International Workshop on Implementation, Compilation, Optimization of OO Languages, Programs and Systems.

View in KAR | View Full Text

Ugawa, T., Jones, R. E. and Ritson, C. G. (2014) 'An On-The-Fly Copying Garbage Collection Framework for Jikes RVM', in 12th Asian Symposium on Programming Languages and Systems.

View in KAR | View Full Text

Kalibera, T. and Jones, R. E. (2013) 'Rigorous Benchmarking in Reasonable Time', in. ACM SIGPLAN International Symposium on Memory Management (ISMM 2013), New York: ACM, pp. 63-74. doi: 10.1145/2464157.2464160.

Abstract | View in KAR | View Full Text

Experimental evaluation is key to systems research. Because modern systems are complex and non-deterministic, good experimental methodology demands that researchers account for uncertainty. To obtain valid results, they are expected to run many iterations of benchmarks, invoke virtual machines (VMs) several times, or even rebuild VM or benchmark binaries more than once. All this repetition costs time to complete experiments. Currently, many evaluations give up on sufficient repetition or rigorous statistical methods, or even run benchmarks only in training sizes. The results reported often lack proper variation estimates and, when a small difference between two systems is reported, some are simply unreliable. In contrast, we provide a statistically rigorous methodology for repetition and summarising results that makes efficient use of experimentation time. Time efficiency comes from two key observations. First, a given benchmark on a given platform is typically prone to much less non-determinism than the common worst-case of published corner-case studies. Second, repetition is most needed where most uncertainty arises (whether between builds, between executions or between iterations). We capture experimentation cost with a novel mathematical model, which we use to identify the number of repetitions at each level of an experiment necessary and sufficient to obtain a given level of precision. We present our methodology as a cookbook that guides researchers on the number of repetitions they should run to obtain reliable results. We also show how to present results with an effect size confidence interval. As an example, we show how to use our methodology to conduct throughput experiments with the DaCapo and SPEC CPU benchmarks on three recent platforms. NOTE: this version corrects the ISMM 2013 version

Kalibera, T., Mole, M., Jones, R. E. and Vitek, J. (2012) 'A Black-box Approach to Understanding Concurrency in DaCapo', in Dwyer, M. and Leavens, G. T. (eds). Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), Tucson, AZ, USA: ACM. doi: 10.1145/2384616.2384641.

Abstract | View in KAR

Increasing levels of hardware parallelism are one of the main challenges for programmers and implementers of managed runtimes. Any concurrency or scalability improvements must be evaluated experimentally. However, application benchmarks available today may not reflect the highly concurrent applications we anticipate in the future. They may also behave in ways that VM developers do not expect. We provide a set of platform independent concurrency-related metrics and an in-depth observational study of current state of the art benchmarks, discovering how concurrent they really are, how they scale the work and how they synchronise and communicate via shared memory.

Singer, J. and Jones, R. E. (2011) 'Economic Utility Theory for Memory Management Optimization', in Rogers, I. (ed.). Proceedings of the workshop on Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems, ACM, pp. 182-196. Available at: http://www.cs.kent.ac.uk/pubs/2011/3156.

Abstract | View in KAR

In this position paper, we examine how economic theory can be applied to memory management. We observe the correspondence between the economic notion of a consumer and an instance of a virtual machine running a single program in an isolated heap. Economic resource consumption corresponds to the virtual machine requesting and receiving increased amounts of heap memory from the underlying operating system. As more memory is allocated to a virtual machine's heap, there is additional benefit (cf. economic utility) from the extra resource. We also discuss production and cost functions, which might assist in efficient memory allocation between multiple virtual machines that are competing for a fixed amount of shared system memory.

Singer, J., Marion, S., Brown, G. D., Jones, R. E., Lujan, M., Ryder, C. and Watson, I. (2008) 'An Information Theoretic Evaluation of Software Metrics for Object Lifetime Prediction', in 2nd Workshop on Statistical and Machine learning approaches to ARchitectures and compilaTion (SMART'08), Goteborg, Sweden.

Abstract | View in KAR | View Full Text

Accurate object lifetime prediction can be exploited by allocators to improve the performance of generational garbage collection by placing immortal or long-lived objects directly into immortal or old generations. Object-oriented software metrics are emerging as viable indicators for object lifetime prediction. This paper studies the correlation of various metrics with object lifetimes. However, to date most studies have been empirical and have not provided any information theoretic underpinning. We use the information theoretic calculation of normalized mutual information to measure correlation. We assess which metrics are most useful for prediction and construct some simple yet accurate object lifetime predictors.

Jones, R. E. and Ryder, C. (2006) 'Garbage Collection Should Be Lifetime Aware', in Zendra, O. (ed.). International Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems (ICOOOLPS'2006).

Abstract | View in KAR | View Full Text

We argue that garbage collection should be more closely tied to object demographics. We show that this behaviour is sufficiently distinctive to make exploitation feasible and describe a novel GC framework that exploits object lifetime analysis yet tolerates imprecision. We argue for future collectors based on combinations of approximate analyses and dynamic sampling.

Jones, R. E. and Utting, I. (1994) 'Teaching Electronic Publishing: Learning Software Engineering', in Teaching Electronic Publishing, pp. 71-83.

View in KAR

Bowman, H., Derrick, J. and Jones, R. E. (1994) 'Modelling Garbage Collection Algorithms --- Extended abstract', in Proceedings of Principles of Distributed Computing'94.

Abstract | View in KAR | View Full Text

We show how abstract requirements of garbage collection can be captured using temporal logic. The temporal logic specification can then be used as a basis for process algebra specifications which can involve varying amounts of parallelism. We present two simple CCS specifications as an example, followed by a more complex specification of the cyclic reference counting algorithm. The verification of such algorithms is then briefly discussed.

Lins, R. D. and Jones, R. E. (1993) 'Cyclic Weighted Reference Counting', in WP & FP'93 Workshop on Parallel and Distributed Processing, Sofia, Bulgaria, pp. 369-382.

Abstract | View in KAR

Weighted Reference Counting is a low communication distributed storage reclamation scheme for loosely-coupled multiprocessors. The algorithm we present herein extends weighted reference counting to allow the collection of cyclic data structures. To do so, the algorithm identifies candidate objects that may be part of cycles and performs a tricolour mark-scan on their subgraph in a lazy manner to discover whether the subgraph is still in use. The algorithm is concurrent in the sense that multiple useful computation processes and garbage collection processes can be performed simultaneously. Keywords: Memory management, Distributed memory, Reference counting, Garbage collection. Introduction: Computation on distributed systems involving several processors is already a reality. In a distributed multiprocessor system each processor is responsible for allocating and reclaiming structures residing in its local memory; interprocessor communication is far less efficient than local memory access...

Books

Jones, R. E., Hosking, A. L. and Moss, E. (2012) The Garbage Collection Handbook: The Art of Automatic Memory Management. Chapman and Hall, pp. 182-196. Available at: http://www.cs.kent.ac.uk/pubs/2012/3154.

View in KAR

Jones, R. E. (1996) Garbage Collection: Algorithms for Automatic Dynamic Memory Management. John Wiley and Sons.

View in KAR

Theses

Jones, R. E. (1989) On the Implementation of Functional Programming Languages.

View in KAR

Others

Jones, R. E. and Jul, E. (2000) 'Introduction to Memory Management and Garbage Collection'. ACM SIGPLAN.

View in KAR

Jones, R. E. and Jul, E. (1999) 'Garbage Collection', ACM SIGPLAN. OOPSLA'99 Conference on Object-Oriented Programming, Systems, Languages and Applications.

View in KAR

Jones, R. E. (1995) 'Zed --- a Z specification language font for PostScript Type 1 and TrueType'.

View in KAR

Jones, R. E. (1993) 'Zed --- a Z specification language Type 1 font'.

View in KAR

Edited books

Peyton Jones, S. and Jones, R. E. (eds) (1998) Proceedings of the 1st international symposium on Memory management. New York, USA: ACM.

View in KAR

Total publications in KAR: 61

Information obtained from the Kent Academic Repository, the University of Kent's official repository of academic activity.
