School of Computing

Aspects of CXXR internals

Andrew R. Runnalls

Computational Statistics, pages 182-196, September 2010. Based on a paper delivered at the Directions in Statistical Computing conference (DSC2009), Copenhagen, 2009. doi:10.1007/s00180-010-0218-0.

Abstract

CXXR is a project to refactor (reengineer) the interpreter of the R language, currently written for the most part in C, into C++. It is hoped that by reorganising the code along object-oriented lines, by deploying the tighter code encapsulation that is possible in C++, and by improving the internal documentation, the project will make it easier for researchers to develop experimental versions of the R interpreter.

The design of CXXR endeavours to reconcile three objectives:

(a) Above all, to be functionally consistent with standard R, both at the R language level and at the C/Fortran package interface level.

(b) For the core of the interpreter to be written in idiomatic, standards-conforming C++, making best use of the C++ standard library, and providing a well-documented C++ API on which C++ package writers can build.

(c) To provide a reasonably simple mechanism for CXXR to be upgraded to parallel the continuing evolution of standard R.

Development of CXXR started in May 2007, then shadowing R 2.5.1; at the time of this abstract it reflects the functionality of R 2.8.1. An offshoot project is underway to introduce provenance-tracking facilities into CXXR, so that for any R data object it will be possible to determine exactly which original data files it was derived from, and exactly which sequence of operations was used to produce it: in other words, an enhanced version of the old S AUDIT facility.
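
The provenance idea can be pictured with a small C++ sketch. It is an illustration only, not the CXXR design: the `sketch` namespace, the `Provenance` struct and the `fromFile` and `derive` helpers are all hypothetical names introduced here. Each data object is assumed to carry a record of the files it ultimately came from and the operations applied to reach it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of CXXR

// Hypothetical provenance record attached to one data object.
struct Provenance {
    std::set<std::string> sourceFiles;    // original data files
    std::vector<std::string> operations;  // operations applied, oldest first
};

// Record for an object read directly from a data file.
inline Provenance fromFile(const std::string& path) {
    return Provenance{{path}, {"read(" + path + ")"}};
}

// Record for an object computed from `inputs` by `operation`.
// Simplification: histories are concatenated, so an operation shared by
// two inputs would be listed twice.
inline Provenance derive(const std::string& operation,
                         const std::vector<Provenance>& inputs) {
    Provenance result;
    for (const Provenance& in : inputs) {
        result.sourceFiles.insert(in.sourceFiles.begin(), in.sourceFiles.end());
        result.operations.insert(result.operations.end(),
                                 in.operations.begin(), in.operations.end());
    }
    result.operations.push_back(operation);
    return result;
}

// Usage: merging two objects read from different files.
inline void provenanceExample() {
    Provenance merged = derive("merge", {fromFile("a.csv"), fromFile("b.csv")});
    assert(merged.sourceFiles.size() == 2);
    assert(merged.operations.back() == "merge");
}

}  // namespace sketch
```

A fuller design would probably store the history as a graph rather than a flat list, so that shared ancestry is recorded once.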

The primary purpose of the proposed paper is to articulate the design philosophy underlying CXXR, and to illustrate it by describing ways in which the internal engineering of CXXR differs substantially from that of standard R, in particular in the following aspects:

1. The mechanisms for memory allocation and garbage collection (currently undergoing a second round of major refactoring).

2. The CXXR::RObject C++ class hierarchy, which replaces the SEXPREC union. The paper will illustrate the benefits that this offers to package writers, by enabling them to extend this hierarchy with additional C++ classes, and consider to what extent this circumvents the drawbacks that Bates (2001) identified in combining C++ with standard R.

3. The implementation of environments, based around an abstract C++ class CXXR::Frame, and its relation to the RObjectTables package (Temple Lang, 2001).
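
Items 2 and 3 can be illustrated with a minimal C++ sketch. It assumes nothing about the real CXXR API: the `sketch` namespace, `RealVector`, `TimeSeries` and `MapFrame` are hypothetical names, and `std::shared_ptr` stands in for CXXR's own garbage collector purely to keep the example short. It shows a class hierarchy in place of a tagged union, a package-style subclass extending it, and an abstract frame interface with one concrete implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical namespace, not the CXXR API

// Root of the hierarchy: each kind of R value is a class of its own,
// rather than one arm of a C union selected by a type tag.
class RObject {
public:
    virtual ~RObject() = default;
    virtual std::string typeName() const = 0;
};

class RealVector : public RObject {
public:
    explicit RealVector(std::vector<double> values) : m_values(std::move(values)) {}
    std::string typeName() const override { return "double"; }
    const std::vector<double>& values() const { return m_values; }
private:
    std::vector<double> m_values;
};

// A class that a package writer might add by extending the hierarchy.
class TimeSeries : public RealVector {
public:
    TimeSeries(std::vector<double> values, double frequency)
        : RealVector(std::move(values)), m_frequency(frequency) {}
    std::string typeName() const override { return "timeseries"; }
    double frequency() const { return m_frequency; }
private:
    double m_frequency;
};

// Abstract frame: maps symbols to objects without fixing the storage used.
class Frame {
public:
    virtual ~Frame() = default;
    virtual void bind(const std::string& symbol, std::shared_ptr<RObject> value) = 0;
    virtual std::shared_ptr<RObject> lookup(const std::string& symbol) const = 0;
};

// One possible concrete frame, backed by std::map.
class MapFrame : public Frame {
public:
    void bind(const std::string& symbol, std::shared_ptr<RObject> value) override {
        m_bindings[symbol] = std::move(value);
    }
    std::shared_ptr<RObject> lookup(const std::string& symbol) const override {
        auto it = m_bindings.find(symbol);
        if (it == m_bindings.end()) return nullptr;  // unbound symbol
        return it->second;
    }
private:
    std::map<std::string, std::shared_ptr<RObject>> m_bindings;
};

// Usage: bind a package-defined object and look it up through the interface.
inline void frameExample() {
    MapFrame frame;
    frame.bind("x", std::make_shared<TimeSeries>(std::vector<double>{1.0, 2.0}, 12.0));
    assert(frame.lookup("x")->typeName() == "timeseries");
    assert(frame.lookup("y") == nullptr);
}

}  // namespace sketch
```

Because callers see only the `Frame` interface, a different backing store (for example, one that fetches bindings from a database) could be substituted without changing the code that performs lookups.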

The paper will assume some familiarity with the R language, and with either C++ or Java: aspects of C++ that differ substantially from Java will be explained as required.


Bibtex Record

@article{3091,
    author = {Andrew R. Runnalls},
    title = {Aspects of {CXXR} Internals},
    month = {September},
    year = {2010},
    pages = {182--196},
    note = {Based on a paper delivered at the Directions in Statistical Computing conference (DSC2009), Copenhagen, 2009.},
    doi = {10.1007/s00180-010-0218-0},
    url = {http://www.cs.kent.ac.uk/pubs/2010/3091},
    publication_type = {article},
    submission_id = {26404_1300706221},
    ISSN = {0943-4062},
    journal = {Computational Statistics},
    publisher = {Springer Verlag},
}

School of Computing, University of Kent, Canterbury, Kent, CT2 7NF


Last Updated: 21/03/2014