While this code is made publicly available, all work that uses this code, extends this code, or makes a comparison to this code is required to cite my corresponding papers that describe my work on decision tree induction using evolutionary algorithms.

INFO

The package contains software for building decision trees using evolutionary search. The input format is exactly the same as in C4.5. It can learn several types of decision trees. Standard C4.5-like top-down learning is implemented as well, but pruning is missing. For evaluation, a testing set can be provided, or cross-validation can be requested, in which case the program will split the data, perform cross-validation, and return the result. See the HOW TO RUN section below for details.

The code is written in portable C++ and compiles at least on Linux and Windows. Below is a description of how to compile and run it on Linux. The package also contains a project file for Windows Visual Studio.

HOW TO COMPILE

Last checked on Ubuntu 12.04 using g++ 4.6.3.

1) mlpcore has to be compiled first:

   cd mlpdtapp/mlpcore
   ./configure
   make

2) mlpdtapp is the main application and can be compiled as follows:

   cd mlpdtapp
   ./configure
   make
   mlpdt/MLPAppMain/mlpdt   # this is the final executable file

HOW TO RUN

The program takes several command-line options. Use -h to see them:

   mlpdt -h

There is some example data in the folder data that can be used to run the algorithm. The algorithm takes the task description from an XML file. By default it looks for the file MLPExperiment.xml in the current directory; a custom configuration file can be set using the option -f. The option -g is very useful because it prints an example configuration file that shows the different types of algorithms and how to set up an experiment. For example, from the main project directory you can run:

   mlpdtapp/mlpdt/MLPAppMain/mlpdt -f examples/example_iris.xml

As indicated in that XML file, results are logged to /tmp, so in this case the results will appear in the /tmp/__MLPDT_* files. There you will recognise the familiar trees that C4.5 prints to standard output. Also check the *.dat and *.plt files: these are gnuplot files that contain graphs for all experiments specified in your XML file. Run gnuplot on them to see the comparisons. The final result is in MLPResults.xml.

The engine that runs the experiments is quite clever. If you start a very long experiment and stop it halfway, you can run the program again: it will check the log files to determine how far it got and resume from the correct point. Hence, if you execute the same experiment again after it completes, it will print that it skips all requested algorithm-data combinations. In order to run it again, you need to delete or move the log files of the previous run (the MLPResults.xml file has to be removed as well).

I did not test it very extensively on the latest versions of gcc, and the last time I was working on this code was in 2006.

DISCLAIMER

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

----------

I hope that you will find it useful. Have fun! If you have questions or comments, I'd love to hear from you.

Marek Grzes
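
APPENDIX: EXAMPLE SESSION (SKETCH)

The commands below recap the steps from HOW TO RUN as one possible session. They are a sketch rather than part of the original package documentation: the exact names of the /tmp/__MLPDT_* output files depend on your XML configuration, and gnuplot is assumed to be installed.

   cd <main project directory>                        # wherever you unpacked the package
   mlpdtapp/mlpdt/MLPAppMain/mlpdt -h                 # list the available command-line options
   mlpdtapp/mlpdt/MLPAppMain/mlpdt -g                 # print an example configuration file to study
   mlpdtapp/mlpdt/MLPAppMain/mlpdt -f examples/example_iris.xml
   ls /tmp/__MLPDT_*                                  # trees, *.dat and *.plt files from the run
   gnuplot -persist /tmp/__MLPDT_*.plt                # draw the comparison graphs (adjust the path if your configuration logs elsewhere)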
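
To repeat a completed experiment from scratch instead of resuming it, remove the artefacts of the previous run first. The paths below assume the /tmp logging used in the iris example; check your own XML file for the actual locations:

   rm /tmp/__MLPDT_*      # log files of the previous run
   rm MLPResults.xml      # the final results file has to be removed as well; adjust the path to wherever your configuration writes it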