The aim of the CXXR project is gradually to refactor (reengineer) the interpreter of the R language, currently written for the most part in C, into C++, whilst as far as possible retaining full functionality. CXXR is being carried out independently of the main R development and maintenance effort.
Note: the CXXR documentation often uses the acronym CR to refer to the standard R interpreter, in contradistinction to CXXR.
It is hoped that by reorganising the code along object-oriented lines, by deploying the tighter code encapsulation that is possible in C++, and by improving the internal documentation, the project will make it easier for researchers to develop experimental versions of the R interpreter. An important subsidiary objective is to create a variant of R with built-in facilities for provenance tracking, so that for any R data object it will be possible to determine exactly which original data files it was derived from, and exactly which sequence of operations was used to produce it: if you remember the old S AUDIT facility, you will probably know how useful this can be.
C++, though perhaps somewhat unfashionable, is a strongly typed language with a powerful range of facilities for object-oriented programming. In its design, constant attention has been paid to providing a smooth conversion pathway from C. Compilers, including free compilers, are readily available, and the language is well standardised. The current standard is ISO/IEC 14882:2003, but the objective in CXXR is to require only that the compiler be able to cope with code conforming to the earlier standard, ISO/IEC 14882:1998. And last but not least, it is a language that I have had years of experience with (though always learning more!).
Maybe you're right: if you have the time and the expertise, go right ahead!
Allocator look after memory allocation;
WeakRef look after garbage collection. (All CXXR classes are within the namespace CXXR.) Garbage collection is now based primarily on reference counting, with a (non-generational) mark-sweep algorithm as a backstop.
SEXPREC union of CR has been converted into an extensible hierarchy of classes rooted at a class RObject (which inherits from GCNode). The functionality of duplicate1() (in CR's file duplicate.c) has been reimplemented using class copy constructors and a virtual function RObject::clone(). Code associated with a particular R data type is progressively being shifted into the relevant class, and C++'s public/protected/private access controls used to defend class invariants.
RObject hierarchy can apply its own checks on how attributes are set, and override the default way in which attribute values are stored internally.
Frame as the fundamental building block. Facilities such as those provided by the package RObjectTables can now be implemented more simply by inheriting from Frame. Hooks have been provided for monitoring the reading or writing of symbol bindings within environments.
RCNTXT) have been separated and refactored using a variety of mechanisms. In particular, indirect flows of control are now much more in line with C++ idioms, notably in relying on object destructors to restore necessary state as the stack is unwound.
$(R_HOME)/include/CXXRAPI. For example, R's subscripting operations (subsetting and subassignment) are now carried out by algorithms implemented as C++ templates, so that they are applicable to generalised vectors of arbitrary element types, not just the R built-in vector types.
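To give a flavour of what template-based subsetting means, here is a minimal sketch. It is not CXXR's actual code: the function name `subset` is hypothetical, `std::vector` stands in for CXXR's own vector classes, and only positive 1-based indices are handled (an out-of-range index throws, whereas R itself would yield NA).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical illustration, not part of the CXXR API.
// Selects elements of a vector using 1-based indices, in the manner of
// R's x[i] for positive integer subscripts.  Because the element type T
// is a template parameter, the same algorithm serves any element type.
template <typename T>
std::vector<T> subset(const std::vector<T>& source,
                      const std::vector<std::size_t>& indices)
{
    std::vector<T> result;
    result.reserve(indices.size());
    for (std::size_t k = 0; k < indices.size(); ++k) {
        const std::size_t i = indices[k];
        // Simplification: R would return NA here rather than fail.
        if (i == 0 || i > source.size())
            throw std::out_of_range("subscript out of bounds");
        result.push_back(source[i - 1]);  // convert to 0-based indexing
    }
    // Postcondition: exactly one output element per index supplied.
    assert(result.size() == indices.size());
    return result;
}
```

The same function template can be instantiated for `std::vector<double>`, `std::vector<std::string>`, or a vector of any user-defined element type, without any per-type code.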
See the refactoring history for more information.
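The idea of relying on object destructors to restore state as the stack is unwound can be sketched as follows. This is an illustration of the general C++ idiom only, not code from CXXR: the variable `evaluation_depth`, the class `DepthGuard` and the function `evaluate` are all hypothetical names invented for the example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a piece of interpreter state that must be
// restored however an evaluation ends.
int evaluation_depth = 0;

// Hypothetical guard class: saves the state on construction and
// restores it in the destructor.
class DepthGuard {
public:
    DepthGuard() : m_saved(evaluation_depth) { ++evaluation_depth; }
    ~DepthGuard() { evaluation_depth = m_saved; }
private:
    int m_saved;
    // Declared but not defined, so that guards cannot be copied
    // (the C++98 way of expressing this).
    DepthGuard(const DepthGuard&);
    DepthGuard& operator=(const DepthGuard&);
};

// Hypothetical evaluation step that may fail part-way through.
void evaluate(bool fail)
{
    DepthGuard guard;              // state saved and depth incremented
    assert(evaluation_depth > 0);
    if (fail)
        throw std::runtime_error("evaluation error");
    // On either exit path (normal return or exception), the destructor
    // of 'guard' runs and puts evaluation_depth back as it was.
}
```

Whether `evaluate` returns normally or throws, `evaluation_depth` is back at its previous value once the call has finished (provided the exception is caught somewhere), with no explicit clean-up code at the point of failure.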
Certainly, most readily by trying out CXXR and reporting any bugs you find. Beware however that if you come across program faults, CXXR is likely to abort gracelessly without saving your work! (Control-C will also abort the interpreter at present.) Testing in a non-English locale would be particularly welcome.
If you want to contribute to coding, experience specifically of C++ would be a definite advantage: unfortunately, good C programmers tend to make bad C++ programmers (and vice versa); Java likewise. I would welcome help in porting CXXR to platforms other than Linux, particularly Microsoft Windows (using MinGW etc.).
My contact email is at the foot of this page.
CXXR would obviously not have been feasible without the work of the R core team in developing and maintaining R itself. The overwhelming majority of the code in CXXR is lifted directly from R (under the terms of the GNU General Public Licence). But equally important is the excellent test suite that the R team has developed, and to which I hope CXXR will in due course be able to contribute.
Particular thanks are owed to the following (in alphabetical order):