Last modified 17th April 2005
This document is maintained by Fred Barnes and Peter Welch
Department of Computer Science, University of Kent.

Getting Started with KRoC

This document is based on the "doc/essentially-kroc.txt" file from the distribution, primarily written by Peter Welch.

The Kent Retargetable occam-pi Compiler (KRoC) provides a compiler, run-time system and library support for the occam multiprocessing language, including the occam-pi extensions. This version implements the full occam2.1 language as defined by Inmos, various extensions such as INITIAL-ising declarations and user-defined operators, and the occam-pi extensions (mobile data, channels and processes; extended rendezvous; dynamic process creation; array constructors; and a variety of other features). Certain features have been taken from the occam3 draft specification. Documentation on user-defined operators is included in the distribution as "doc/udo.ps".

KRoC setup procedure
The structure of a top-level KRoC process
Compiling a top-level KRoC process
Compiling a top-level KRoC process that #USEs libraries
Separate compilation

An example of separate compilation

Libraries

Building a byte-code library
Building a native-code library
An example of a user-built library

Post-mortem debugging

KRoC setup procedure

Before any of the KRoC utilities can be used, one of the following commands (depending on which shell you are using) must be run:


    $ source kroc/bin/setup.sh       # for sh or bash shells
    % source kroc/bin/setup.csh      # for csh or tcsh shells

It may be useful to include this command in your shell start-up script.

All the above does is extend your command search path to include the "kroc/bin" directory (to give direct access to `kroc', `ilibr' and `kmakef') and define the "OCSEARCH" environment variable used by those utilities. It also extends the existing "LD_LIBRARY_PATH" and "MANPATH" environment variables to include the run-time library directory ("kroc/lib") and the docs directory ("kroc/doc").
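
As a quick sanity check after sourcing the setup script, something like the following could be used. It is only a sketch: the function name "check_kroc_env" is made up for this example and is not part of the KRoC distribution. It relies only on what is described above, namely that `kroc', `ilibr' and `kmakef' should be on the search path and that "OCSEARCH" should be defined.

```shell
# Hypothetical helper (not part of KRoC): report whether the environment
# described above appears to be set up in the current shell.
check_kroc_env () {
    status=0
    # The three utilities that the setup script puts on the search path.
    for tool in kroc ilibr kmakef; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not on PATH: $tool"
            status=1
        fi
    done
    # OCSEARCH must be defined for the utilities to find libraries.
    if [ -z "${OCSEARCH:-}" ]; then
        echo "OCSEARCH is not set"
        status=1
    fi
    return "$status"
}
```

Running "check_kroc_env" prints one line per problem found and returns a non-zero status if anything is missing, so it can also be used in a shell start-up script.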

If you are using a Pentium or better CPU, cpu-timer support will probably have been enabled. To make use of this, copy the generated "kroc_clock" file to "/etc/kroc_clock" as root. If this is not possible, a copy saved as "~/.kroc_clock", or as "./.kroc_clock" in the current working directory, will do the job. The latest versions of KRoC will also read the CPU speed out of "/proc/cpuinfo" if the clock file cannot be found. The occam timer period is 1 microsecond, i.e. 1,000,000 ticks/second.

The KRoC top-level process

KRoC provides a simple interface to the UNIX "stdin", "stdout" and "stderr" file descriptors. Typically, the top-level occam process has the following form:


    PROC foo (CHAN BYTE stdin?, stdout!, stderr!)
      ...  body of foo
    :

Inside the body of foo, inputs from "stdin?" take single characters from the UNIX standard input. Outputs to "stdout!" or "stderr!" send single characters to the UNIX standard output or standard error respectively.

KRoC runs stdin in "raw" mode without echoing. This means that individual keystrokes are supplied to the occam process immediately -- not buffered up until a carriage-return is typed. If echoing of keyboard input is needed, the occam process will have to be programmed to do that (e.g. by outputting the keyboard character on stdout and flushing -- see below).

KRoC runs stdout in its normal "line-buffered" mode. This means that stdout characters are not normally delivered until a carriage-return is output (or the stdout buffer becomes full). However, this buffer can be flushed early by sending the BYTE value 255 to stdout -- for example, see:

    kroc/course/examples/echoing.occ

KRoC runs stderr in its normal "immediate" mode -- i.e. each character is flushed to the UNIX device as soon as the occam program outputs there. (This is a feature of stderr, not KRoC.)

By default, KRoC will attempt to figure out the channels used in the top-level process and tune the run-time environment to match. The supported interfaces are zero-to-three of the three standard channels (stdin, stdout and stderr) in that order, or two dummy SP channels with an optional memory parameter. For example, the following are all legal top-level PROC interfaces:


    PROC thing (CHAN BYTE keyboard?, screen!)

    PROC bar (CHAN BYTE error!)

    PROC foo (CHAN SP fs?, ts!, []INT free.mem)

This last example, with the SP channels, is intended only for compatibility with older occam programs which used the Inmos toolset. The "fs" and "ts" channels, along with the "free.mem", are all invalid. They may be passed around as parameters for compatibility, but any attempt to actually use them will result in a run-time error.

To tell which standard channel a parameter refers to (for example, to distinguish a single "stdout" channel from a single "stderr" channel), the parameter names are important. For the standard-input channel, anything starting "kyb", "key" or containing "in" is valid. For the standard-output channel, anything starting "scr" or containing "out" is valid. For the standard-error channel, anything containing "err" is valid.
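
The naming rules can be restated as a small shell function, which may help when choosing parameter names. This is an illustration only and not the compiler's own code: "classify_channel" is a made-up name, and the order in which the rules are tried (stderr, then stdin, then stdout) is an assumption made here so that every name gets a single answer.

```shell
# Illustration only: restates the channel naming rules as a shell function.
# The rule order (stderr, stdin, stdout) is an assumption of this sketch.
classify_channel () {
    case "$1" in
        *err*)            echo stderr ;;   # anything containing "err"
        kyb*|key*|*in*)   echo stdin ;;    # starts "kyb"/"key" or contains "in"
        scr*|*out*)       echo stdout ;;   # starts "scr" or contains "out"
        *)                echo unknown ;;  # matches none of the rules
    esac
}
```

For instance, "classify_channel keyboard" prints "stdin", "classify_channel screen" prints "stdout" and "classify_channel error" prints "stderr".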

It is recommended that the file containing the top-level occam process should be named after that process -- i.e. the files for the above three examples would be, respectively, "thing.occ", "bar.occ" and "foo.occ". Traditionally, occam source files have the ".occ" suffix.

[Exception: occam source files that are only going to be #INCLUDEd by other source files (and not separately compiled themselves) should be given the ".inc" suffix. See the sections on Separate Compilation and Libraries for more information and examples on #INCLUDE.]

Note: in the top-level file, there may be any number of occam declarations (e.g. DATA TYPEs, CHAN TYPEs, PROC TYPEs, PROTOCOLs, VALs, PROCs, FUNCTIONs and/or user-defined operators) before the top-level occam PROC conforming to the above signature. However, declarations of variables, CHANnels, TIMERs and PORTs are not allowed global to this top-level PROC.

Compiling a top-level KRoC process

Suppose that the file "thing.occ" contains only the declaration of "PROC thing" (with the above parameter list) and its code body. Suppose also that it does not #USE any separately compiled processes or libraries. Then, it may be compiled with the command:


    bash$ kroc thing.occ

Assuming a successful compilation, this also links in the occam run-time system to produce an executable file called "thing". To run this file, just issue the command:


    bash$ ./thing

The body of "PROC thing" will start executing. Examples of such files can be found in:


    kroc/course/examples/hello_raw_world.occ
    kroc/course/examples/hello_seq_world.occ
    kroc/course/examples/echoing.occ

Note: the compiler issues warnings about any declared items (such as variables and parameters) that are not subsequently used. If the top-level process (or any other process) does not use one or more of its parameters, such warnings will be issued. Warnings can be suppressed by passing the "-w" flag to the kroc command, but this is not recommended.

[A summary of the flags accepted by the "kroc" command is returned by invoking it with no arguments.]

Note: compiler error messages are written to the UNIX standard error. If a lot of error messages are being generated, it may help to pipe the error output through the UNIX "less" utility. Depending on which shell is being used, this can be accomplished using one of the following commands:


    $ kroc thing.occ 2>&1 | less     # for bash or sh
    % kroc thing.occ |& less         # for csh or tcsh

Of course, the error messages could also be redirected to a file for later examination:


    $ kroc thing.occ 2> error.txt      # for bash or sh
    % kroc thing.occ >& error.txt      # for csh or tcsh

Compiling a top-level KRoC process that #USEs libraries

If a program #USEs separately compiled processes or libraries, these must be findable by the compiler (to get information about parameter signatures, workspace needs and channel usage) and by the linker (to build the complete executable).

A KRoC occam library exists in two forms -- an extended transputer byte code (used by the compiler) and native object code (used by the linker).

For example, suppose the file "thing.occ" contains sources that use one or more of the processes or functions provided by the standard "course" library -- i.e. the file has the outline:


    #USE "course.lib"

    PROC thing (CHAN BYTE kyb?, scr!, err!)
      ...  body of thing
    :

The file "course.lib" holds the byte code for this library. The compiler will look for it in the current directory and, failing that, in the sequence of directories defined by the "OCSEARCH" environment variable.

When "source" was run on "kroc/bin/setup.sh" (or "kroc/bin/setup.csh" -- see the previous section), "OCSEARCH" was initialised to include the directories containing all the standard libraries provided by this KRoC release. Since "course.lib" is one of those standard libraries, the compiler will be successful in its search.

The linker has to find the native code version of the library. Native code libraries are just UNIX shared object files. It is traditional for such files to have the suffix ".so" and the prefix "lib" - hence, the name of the needed file, in this case, is "libcourse.so".

The names of any needed native code libraries must be supplied to the linker. For our example, this can either be done explicitly:


    bash$ kroc thing.occ path-to-kroc-install/kroc/lib/libcourse.so

or, and much more pleasantly, by using the "-lname" flag:


    bash$ kroc thing.occ -lcourse

Assuming a successful compilation, this instructs the linker to search through the current and "OCSEARCH" directories for the file "libcourse.so".

[This naming rule follows `gcc' conventions: "-lname" generates a search for the file "libname.so".]

In our example, the search for "libcourse.so" will be successful and all needed references will be found. Hence, the final executable "thing" will be built.

Examples of such files can be found in:


    kroc/course/examples/hello_world.occ
    kroc/course/examples/test_utils.occ
    kroc/course/examples/demo.occ
    kroc/course/examples/sort_pump.occ
    kroc/course/examples/sort_inside.occ
    kroc/course/examples/cast.occ

Any number of libraries (or separately compiled ".o" files) can be linked in by a single "kroc" command. Details of how to build, use and link user-defined libraries are given in the following sections.
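
The "-lname" search rule described in this section can be sketched as a shell function. This is not how "kroc" is implemented; "find_native_lib" is a hypothetical helper, and the directories to search are passed as arguments because the exact format of the "OCSEARCH" variable is not covered here.

```shell
# Hypothetical helper: mimic the "-lname" search described in the text.
# Usage: find_native_lib name dir...
find_native_lib () {
    file="lib$1.so"      # "-lname" means the file "libname.so"
    shift
    # The current directory is tried first, then each given directory in turn.
    for dir in . "$@"; do
        if [ -f "$dir/$file" ]; then
            echo "$dir/$file"
            return 0
        fi
    done
    echo "$file: not found" >&2
    return 1
}
```

For example, "find_native_lib course kroc/lib" prints the path of "libcourse.so" if it exists in the current directory or in "kroc/lib", and returns a non-zero status otherwise.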

Separate compilation

A KRoC occam system can be built from any number of separately compiled units. Optionally, these units may be combined into occam "libraries" -- see later.

The general rule for top-level kroc compilation is that the file may contain any sequence of occam declarations (apart from global data variables, CHANnels, TIMERs or PORTs), but that the final item must be a PROC that has the required top-level parameter signature. However, normal engineering requirements mean that larger projects cannot be constructed in one file and must be composed from smaller compilation units.

Passing the "-c" flag to kroc suppresses the final link stage. Suppose "finkle.occ" contains a legal sequence of occam declarations -- none of which need be a (kroc) top-level process. Then, the command:


    bash$ kroc -c finkle.occ

will produce, assuming successful compilation, the following two files:


    finkle.tce          # extended transputer byte-code
    finkle.o            # linkable native object code

The first file contains, in addition to byte code, information defining the outermost level PROC, FUNCTION and user-defined operator signatures (from "finkle.occ") plus their workspace/vectorspace requirements and, in the case of PROCs, channel usage. That information is needed when compiling anything that #USEs this code.

Without building any libraries, this code may be accessed in an occam source by saying:


    #USE "finkle"

When compiling such a source, the compiler will look for the file "finkle.tce" in the current directory and, failing that, in the sequence of directories defined by the "OCSEARCH" environment variable. "OCSEARCH" is initialised by the "kroc/bin/setup.sh" (or "kroc/bin/setup.csh") command but can be extended at any time -- for instance, in the shell start-up script just after the "source" of that "setup" command.

If the source code that "#USE"s "finkle" is not top-level (i.e. it is compiled with the "-c" flag), that is the end of the story. If it is top-level source code, then an executable has to be built -- which means that the native code "finkle.o" file must be found. As with "gcc", this can be done simply by passing it to the "kroc" command. So, if "thing.occ" contains the code:


    #USE "course.lib"
    #USE "finkle"

    PROC thing (CHAN BYTE kyb?, scr!)
      ...  body of thing
    :

it can be compiled and linked (producing the executable "thing") by the command:


    bash$ kroc thing.occ finkle.o -lcourse

Note: a separate compilation unit (such as "finkle.occ") may only contain "templates" at its outermost level -- i.e. DATA TYPEs, CHAN TYPEs, PROC TYPEs, PROTOCOLs, VALs, PROCs, FUNCTIONs and user-defined operators. It may not contain global data declarations -- i.e. variables, CHANnels, TIMERs or PORTs.

Note: only information about PROCs, FUNCTIONs and user-defined operators is stored in the ".tce" file. This means that other forms of template (e.g. DATA TYPEs, CHAN TYPEs, PROC TYPEs, PROTOCOLs, VALs and INLINE PROCs/FUNCTIONs) are allowed but not exported to #USErs. Those other forms of template can only be shared by making them available in source form -- the sharers importing them through the #INCLUDE "file-name" mechanism. Such files only have to be found by the compiler, which searches through the current directory and, failing that, the directories listed by "OCSEARCH". Traditionally, occam #INCLUDE source file names have the suffix ".inc".

An example of separate compilation

The file:


    kroc/syncs/examples/crew_test.occ

contains a system demonstrating the CREW synchronisation primitive. This file does not use any separate compilation ... but two DATA TYPEs, one PROTOCOL and four PROC declarations precede the final top-level process declaration (that conforms to the required parameter signature). This system, which uses the standard "course.lib" (for screen control/output and random number generation) may be compiled as before:


    bash$ kroc crew_test.occ -lcourse

However, it is ripe for breaking into separately compiled units. The directory:


    kroc/doc/sc/

contains the source files:


    kroc/doc/sc/blackboard.inc       (DATA TYPE)    (#INCLUDE file)
    kroc/doc/sc/control_info.inc     (DATA TYPE)    (#INCLUDE file)
    kroc/doc/sc/display.inc          (PROTOCOL)     (#INCLUDE file)

    kroc/doc/sc/philosopher.occ      (PROC)         (#USE file)
    kroc/doc/sc/controller.occ       (PROC)         (#USE file)
    kroc/doc/sc/timekeeper.occ       (PROC)         (#USE file)
    kroc/doc/sc/display.occ          (PROC)         (#USE file)

    kroc/doc/sc/crew_test.occ        (PROC)         (top-level file)

These files are just the individual components extracted from the sequential listing (in "kroc/syncs/examples/crew_test.occ").

The ".occ" files #INCLUDE only the special synchronisation primitives being demonstrated ("semaphore.inc" and "crew.inc", which are in the standard KRoC library directory already on the "OCSEARCH" path) and the ".inc" files in their own directory. No special action, therefore, is needed to enable the kroc compiler to find these include-files.

Both "display.occ" and "philosopher.occ" #USE the course library. The needed file ("course.lib") is in the KRoC library directory on the "OCSEARCH" path. The four non-top-level PROCs #USE no other files and so, assuming we are in the "kroc/doc/sc/" directory, may be compiled by one command:


    bash$ kroc -c philosopher.occ controller.occ timekeeper.occ display.occ

which produces the corresponding ".tce" and ".o" files.

The top-level file, "crew_test.occ", #USEs "philosopher", "controller", "timekeeper" and "display". Since their ".tce" files are in the current directory, the compiler will find them. The linker, however, has to be given the matching ".o" files -- as well as the "libcourse.so" library that two of them use. This can be done with the command:


    bash$ kroc crew_test.occ *.o -lcourse

Libraries

Separately compiled compilation units may be bound into a single library. As before, two versions of this library must be built -- one to hold the byte code and the other for native code.

Suppose we have three non-top-level files "a.occ", "b.occ" and "c.occ" from which, by invoking "kroc -c", we have compiled the byte code files "a.tce", "b.tce" and "c.tce" and the UNIX linkables "a.o", "b.o" and "c.o".

Building a byte-code library

The three ".tce" files may be combined into a single byte code library, "abc.lib", by the command:


    bash$ ilibr a.tce b.tce c.tce -o abc.lib

Alternatively, if the list of files to be combined is long, their names may be listed in a single text file -- one filename per line and no spaces. Traditionally, the name for this file is the same as the target ".lib", but with the suffix ".lbb" instead. For the above example, therefore, we need the file "abc.lbb" to hold the three ".tce" file names. Then, the command:


    bash$ ilibr -f abc.lbb -o abc.lib

does the same job as the previous "ilibr" command.

[Note: this alternative approach is preferred, since it makes the automatic generation of makefiles easier -- see the section on Automatic Makefile Generation.]
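
A list file of this kind can be written by hand or generated. The following is a minimal sketch of the latter; "make_lbb" is a hypothetical helper, not a KRoC utility, and it simply writes each name it is given on a line of its own.

```shell
# Hypothetical helper: write one ".tce" file name per line into a list file.
# Usage: make_lbb list-file name.tce...
make_lbb () {
    lbb="$1"
    shift
    # printf repeats the format for every argument, giving one name per line.
    printf '%s\n' "$@" > "$lbb"
}
```

For the example above, "make_lbb abc.lbb a.tce b.tce c.tce" produces an "abc.lbb" holding the three file names, ready to be passed to "ilibr -f".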

Future occam sources that need any of the routines from the "abc.lib" need only say:


    #USE "abc.lib"

When compiling such a source, the "abc.lib" file must be either in the current directory or in one of the "OCSEARCH" directories. Your "OCSEARCH" environment variable is freely extendable.

Building a native-code library

Native code libraries are just UNIX shared object library files. It is traditional for such files to have the suffix ".so" and the prefix "lib". Hence, if "abc.lib" is the name of the byte-code library, the name of the corresponding native-code library should be "libabc.so".

For the above example, the three ".o" files may be combined into a single UNIX shared object by using the UNIX "ld" command:


    bash$ ld -r -o libabc.so a.o b.o c.o

The native libraries are only needed for building the final executable. Suppose "main.occ" is a top-level occam source file containing:


    #USE "abc.lib"

and no other #USEs. Suppose also that none of the components of "abc.lib" (i.e. the files "a.occ", "b.occ" and "c.occ") have dependencies (i.e. #USEs or #INCLUDEs). Then, "main.occ" can be compiled and linked into the "main" executable by:


    bash$ kroc main.occ -labc

The "-lname" argument generates a search for the file "libname.so" in the current directory and, failing that, in the "OCSEARCH" directories. So, the above command -- assuming a successful compilation -- will search for the file "libabc.so".

[Note that this "-labc" argument is only used for the linker. As mentioned above, the compiler will find the file "abc.lib" by itself.]

Alternatively, a path name direct to the native library would suffice:


    bash$ kroc main.occ wherever-was-filed/libabc.so

Of course, if "libabc.so" were in the current directory, this reduces to:


    bash$ kroc main.occ libabc.so

Warning: if several libraries need to be linked in and those libraries #USE each other, they must be supplied to the "kroc" command in dependency order (with the most dependent first). This is because "kroc" uses "gcc" (which uses "ld") to perform the final link stage. Standard "ld" rules, therefore, apply. Because occam does not support mutual recursion, there can be no mutual dependencies between libraries and a linear dependency ordering can always be found.

For example, suppose "main.occ" and/or one (or more) of the files making up "abc.lib" contains:


    #USE "def.lib"

Suppose also that "libdef.so" is the native code library corresponding to "def.lib". Then, the correct ordering for the link arguments is:


    bash$ kroc main.occ -labc -ldef

If the last two arguments were switched around, the linking would fail with unresolved references for the names in "libdef.so".
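
When many libraries are involved, a suitable ordering can be worked out mechanically. The sketch below uses the standard UNIX "tsort" utility for this; "link_order" is a hypothetical helper and is not part of KRoC. It assumes the dependencies are written down by hand as pairs of library names, one pair per line, where the first library #USEs the second.

```shell
# Hypothetical helper: read "user used" library-name pairs on standard input
# (the first library #USEs the second) and print the matching "-lname" flags
# on one line, with the most dependent library first.
link_order () {
    # tsort lists each name before the names it points to; sed turns each
    # name into a "-l" flag; paste joins the lines with single spaces.
    tsort | sed 's/^/-l/' | paste -s -d ' ' -
}
```

For the example above, the command "echo abc def | link_order" prints "-labc -ldef", which matches the argument order given to "kroc".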

Another implication from using "ld" for linking is that there must be no PROC or FUNCTION name clashes between any of the components linked into the final executable -- "ld" does not know how to resolve them. This may be a problem when linking in large and independently written libraries. C and C++ share this problem.

[Note: occam semantics resolve such clashes using its normal block structuring rules for namespaces. This problem arises because we are using a standard UNIX linker that is not occam-aware.]

A better solution may lie in a formal LIBRARY mechanism for occam. A proposal for one exists in the draft occam3 specification. We look forward to a speedy resolution from the open source community ;-) ...

An example of a user-built library

This example is based upon the earlier demonstration of separate compilation. The directory:


    kroc/doc/lib/

contains the source files:


    kroc/doc/lib/blackboard.inc      (DATA TYPE)    (#INCLUDE file)
    kroc/doc/lib/control_info.inc    (DATA TYPE)    (#INCLUDE file)
    kroc/doc/lib/display.inc         (PROTOCOL)     (#INCLUDE file)

    kroc/doc/lib/philosopher.occ     (PROC)         (library component)
    kroc/doc/lib/controller.occ      (PROC)         (library component)
    kroc/doc/lib/timekeeper.occ      (PROC)         (library component)
    kroc/doc/lib/display.occ         (PROC)         (library component)

    kroc/doc/lib/crew_test.occ       (PROC)         (top-level file)

The only file that is different from its namesake in "kroc/doc/sc/" is the top-level file "crew_test.occ". Instead of:


    #USE "philosopher"
    #USE "controller"
    #USE "timekeeper"
    #USE "display"

it contains just the one:


    #USE "college.lib"

For this to work, the four library components must be combined into the library "college.lib". First, they must be compiled as before:


    bash$ kroc -c philosopher.occ controller.occ timekeeper.occ display.occ

which produces the corresponding ".tce" and ".o" files. Then, the byte-code library must be formed:


    bash$ ilibr -f college.lbb -o college.lib

where "college.lbb" is a file containing just the names of the files to be combined ("philosopher.tce", "controller.tce", "timekeeper.tce" and "display.tce").

Finally, the native code library must be built:


    bash$ ld -r -o libcollege.so *.o

The final system may be compiled and linked with the command:


    bash$ kroc crew_test.occ libcollege.so -lcourse

where the final argument is needed since two of the "college.lib" components #USE "course.lib".

Of course, if this college library were reusable for more than one system, its files "college.lib" and "libcollege.so" should be put in a publicly readable directory and that directory added to the "OCSEARCH" path. In which case, the above command should be replaced with:


    bash$ kroc crew_test.occ -lcollege -lcourse

Note that, in this case, the ordering of the last two arguments is significant.

Post-mortem debugging

Occasionally, we make mistakes in our occam coding or system design that lead to run-time errors. Examples include numeric overflow, array-bounds violation, IFs without a TRUE condition, deliberate STOPs and deadlock.

The default action upon these errors simply terminates the run with a terse message that gives no clue as to which part of our code was responsible.

Passing the "-d" flag to the kroc command causes a very small amount of extra code to be generated that saves a very small amount of extra information at run-time.

[These overheads are very small -- e.g. around 4 nanoseconds on top of the 80 nanosecond context switch - so there is an argument that this should be made the default action for KRoC.]

If an error occurs from a region of code compiled with this "-d" flag, the line number, PROC name and file name of the offending statement will be reported. The line number is only approximate (but it's usually right!).

If deadlock occurs, the occam workspace is searched for blocked processes. All processes blocked on a communication from a region of code compiled with the "-d" flag will be found and the line number, PROC name and file name where each of them is stuck will be reported. It is possible that a false hit may be reported, but this will be extremely rare and we have never seen it in practice.

At present, only processes blocked on a channel communication (including an ALT) will be reported after a deadlock. Future releases of KRoC may also hunt down processes blocked on the additional synchronisation primitives provided (SEMAPHOREs, BARRIERs, BUCKETs, CREWs, ...).