While this code is made publicly available, all work that uses this code, extends this code, or makes a comparison to this code is required to cite my corresponding papers that describe my work on decision tree induction using evolutionary algorithms.

INFO

The package contains software for building decision trees using evolutionary search. The input format is exactly the same as in C4.5. It can learn several types of decision trees. Standard C4.5-like top-down learning is implemented as well, but pruning is missing. For evaluation, a testing set can be provided, or cross-validation can be requested, in which case the program will split the data, perform cross-validation, and return the result. See the HOW TO RUN section below for details.

The code is written in portable C++ and compiles at least on Linux and Windows. Below is a description of how to compile and run it on Linux. The package also contains a project file for Windows Visual Studio.

HOW TO COMPILE

Last checked on Ubuntu 12.04 using g++ 4.6.3.

1) mlpcore has to be compiled first:

   cd mlpdtapp/mlpcore
   ./configure
   make

2) mlpdtapp is the main application and can be compiled as follows:

   cd mlpdtapp
   ./configure
   make
   mlpdt/MLPAppMain/mlpdt   # this is the final executable file

HOW TO RUN

The program takes several command-line options. Use -h to see them:

   mlpdt -h

There is some example data in the folder data that can be used to run the algorithm. The algorithm takes the task description from an XML file. By default it looks for the file MLPExperiment.xml in the current directory; a custom configuration file can be set using the option -f. The option -g is very useful because it prints an example configuration file that shows the different types of algorithms and how to set up an experiment. For example, from the main project directory you can run:

   mlpdtapp/mlpdt/MLPAppMain/mlpdt -f examples/example_iris.xml

As indicated in that XML file, results are logged to /tmp, so in this case the results will appear in the /tmp/__MLPDT_* files. There you will recognise the familiar trees that C4.5 prints to standard output. Also check the *.dat and *.plt files: these are gnuplot files that contain graphs for all experiments specified in your XML file. Run gnuplot on them to see the comparisons. The final result is in MLPResults.xml.

The engine that runs the experiments is quite clever. If you start a very long experiment and stop it halfway, you can run the program again: it will check the log files to determine how far it got and resume from the correct point. Hence, if you execute the same experiment again after it completes, it will print that it skips all requested algorithm-data combinations. In order to run it again, you need to delete or move the log files of the previous run (the MLPResults.xml file has to be removed as well).

I did not test it very extensively on the latest versions of gcc, and the last time I was working on this code was in 2006.

DISCLAIMER

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

----------

I hope that you will find it useful. Have fun! If you have questions or comments, I'd love to hear from you.

Marek Grzes
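
APPENDIX: EXAMPLE SESSION (SKETCH)

The commands below recap the steps from HOW TO RUN as one possible session. They are a sketch rather than part of the original package documentation: the exact names of the /tmp/__MLPDT_* output files depend on your XML configuration, and gnuplot is assumed to be installed.

   cd <main project directory>                        # wherever you unpacked the package
   mlpdtapp/mlpdt/MLPAppMain/mlpdt -h                 # list the available command-line options
   mlpdtapp/mlpdt/MLPAppMain/mlpdt -g                 # print an example configuration file to study
   mlpdtapp/mlpdt/MLPAppMain/mlpdt -f examples/example_iris.xml
   ls /tmp/__MLPDT_*                                  # trees, *.dat and *.plt files from the run
   gnuplot -persist /tmp/__MLPDT_*.plt                # draw the comparison graphs (adjust the path if your configuration logs elsewhere)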
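
To repeat a completed experiment from scratch instead of resuming it, remove the artefacts of the previous run first. The paths below assume the /tmp logging used in the iris example; check your own XML file for the actual locations:

   rm /tmp/__MLPDT_*      # log files of the previous run
   rm MLPResults.xml      # the final results file has to be removed as well; adjust the path to wherever your configuration writes it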