Arena Language Manual

(C) 2006, Pascal Schmidt <arena-language@ewetel.net>

Contents

1 Introduction
    1.1 What's Arena?
    1.2 Why another scripting language
    1.3 Target audience
    1.4 Versioning
    1.5 Structure of this manual
    1.6 License

2 Language
    2.1 Basic tokens
        2.1.1 Comments
        2.1.2 Keywords
        2.1.3 Operators
        2.1.4 Identifiers
        2.1.5 Integer literals
        2.1.6 Float literals
        2.1.7 String literals
        2.1.8 Grouping symbols
    2.2 Runtime type system
        2.2.1 void
        2.2.2 bool
        2.2.3 int
        2.2.4 float
        2.2.5 string
        2.2.6 array
        2.2.7 struct
        2.2.8 fn
        2.2.9 resource
    2.3 Scopes and namespaces
        2.3.1 Top-level vs. function-level scope
        2.3.2 Global vs. local namespace
    2.4 Statements
        2.4.1 Basic rules for statements
        2.4.2 Include statement
        2.4.3 Control flow statements
            2.4.3.1 if statement
            2.4.3.2 while loop statement
            2.4.3.3 do loop statement
            2.4.3.4 for loop statement
            2.4.3.5 continue statement
            2.4.3.6 break statement
            2.4.3.7 switch statement
            2.4.3.8 try statement
            2.4.3.9 throw statement
        2.4.4 User-defined functions
            2.4.4.1 Function definition
            2.4.4.2 return statement
        2.4.5 Structure templates
            2.4.5.1 Defining structure fields
            2.4.5.2 Defining structure methods
            2.4.5.3 Constructor method
    2.5 Expressions
        2.5.1 Basic rules for expression nesting
        2.5.2 Constant expressions
        2.5.3 Reference expressions
            2.5.3.1 Static reference expressions
            2.5.3.2 Indexing of elements
        2.5.4 Cast expressions
            2.5.4.1 Conversion to void
            2.5.4.2 Conversion to bool
            2.5.4.3 Conversion to int
            2.5.4.4 Conversion to float
            2.5.4.5 Conversion to string
            2.5.4.6 Conversion to array
            2.5.4.7 Conversion to struct
            2.5.4.8 Conversion to fn
            2.5.4.9 Conversion to resource
        2.5.5 Assignment expressions
            2.5.5.1 Indexing in assignments
            2.5.5.2 Combining assignments and operators
        2.5.6 Function calls
            2.5.6.1 Passing arguments "by reference"
        2.5.7 Basic rules for structure templates
        2.5.8 Constructor calls
        2.5.9 Method calls
            2.5.9.1 Static method calls
            2.5.9.2 Dynamic method calls
        2.5.10 Operators
            2.5.10.1 Math operators
            2.5.10.2 Boolean operators
            2.5.10.3 Equality operators
            2.5.10.4 Order operators
            2.5.10.5 Bitwise operators
            2.5.10.6 Operator precedence
        2.5.11 Conditional expression
        2.5.12 Source file and line expressions
        2.5.13 Anonymous functions

3 Library
    3.1 Runtime system
        3.1.1 FLT_RADIX
        3.1.2 FLT_DIG
        3.1.3 FLT_MANT_DIG
        3.1.4 FLT_MAX_EXP
        3.1.5 FLT_MIN_EXP
        3.1.6 FLT_EPSILON
        3.1.7 FLT_MAX
        3.1.8 FLT_MIN
        3.1.9 INT_MAX
        3.1.10 INT_MIN
        3.1.11 type_of
        3.1.12 tmpl_of
        3.1.13 is_void
        3.1.14 is_bool
        3.1.15 is_int
        3.1.16 is_float
        3.1.17 is_string
        3.1.18 is_array
        3.1.19 is_struct
        3.1.20 is_fn
        3.1.21 is_resource
        3.1.22 is_a
        3.1.23 is_function
        3.1.24 is_var
        3.1.25 is_tmpl
        3.1.26 is_local
        3.1.27 is_global
        3.1.28 cast_to
        3.1.29 set
        3.1.30 get
        3.1.31 get_static
        3.1.32 unset
        3.1.33 global
        3.1.34 assert
        3.1.35 versions
    3.2 Math functions
        3.2.1 exp
        3.2.2 log
        3.2.3 log10
        3.2.4 sqrt
        3.2.5 ceil
        3.2.6 floor
        3.2.7 fabs
        3.2.8 sin
        3.2.9 cos
        3.2.10 tan
        3.2.11 asin
        3.2.12 acos
        3.2.13 atan
        3.2.14 sinh
        3.2.15 cosh
        3.2.16 tanh
        3.2.17 abs
    3.3 Printing functions
        3.3.1 print
        3.3.2 dump
        3.3.3 sprintf
        3.3.4 printf
    3.4 String functions
        3.4.1 strlen
        3.4.2 strcat
        3.4.3 strchr
        3.4.4 strrchr
        3.4.5 strstr
        3.4.6 strspn
        3.4.7 strcspn
        3.4.8 strpbrk
        3.4.9 strcoll
        3.4.10 tolower
        3.4.11 toupper
        3.4.12 isalnum
        3.4.13 isalpha
        3.4.14 iscntrl
        3.4.15 isdigit
        3.4.16 isgraph
        3.4.17 islower
        3.4.18 isprint
        3.4.19 ispunct
        3.4.20 isspace
        3.4.21 isupper
        3.4.22 isxdigit
        3.4.23 substr
        3.4.24 left
        3.4.25 right
        3.4.26 ord
        3.4.27 chr
        3.4.28 explode
        3.4.29 implode
        3.4.30 ltrim
        3.4.31 rtrim
        3.4.32 trim
    3.5 Array functions
        3.5.1 mkarray
        3.5.2 qsort
        3.5.3 is_sorted
        3.5.4 array_unset
        3.5.5 array_compact
        3.5.6 array_search
        3.5.7 array_merge
        3.5.8 array_reverse
    3.6 List functions
        3.6.1 nil
        3.6.2 cons
        3.6.3 length
        3.6.4 null
        3.6.5 elem
        3.6.6 head
        3.6.7 tail
        3.6.8 last
        3.6.9 init
        3.6.10 take
        3.6.11 drop
        3.6.12 intersperse
        3.6.13 replicate
    3.7 Structure functions
        3.7.1 mkstruct
        3.7.2 struct_get
        3.7.3 struct_set
        3.7.4 struct_unset
        3.7.5 struct_fields
        3.7.6 struct_methods
        3.7.7 is_field
        3.7.8 is_method
        3.7.9 struct_merge
    3.8 Functions on functions
        3.8.1 is_builtin
        3.8.2 is_userdef
        3.8.3 function_name
        3.8.4 call
        3.8.5 call_array
        3.8.6 call_method
        3.8.7 call_method_array
        3.8.8 prototype
        3.8.9 map
        3.8.10 filter
        3.8.11 foldl
        3.8.12 foldr
        3.8.13 take_while
        3.8.14 drop_while
    3.9 Random number functions
        3.9.1 RAND_MAX
        3.9.2 rand
        3.9.3 srand
    3.10 Environment functions
        3.10.1 argc
        3.10.2 argv
        3.10.3 exit
        3.10.4 getenv
        3.10.5 system
    3.11 File I/O functions
        3.11.1 stdin
        3.11.2 stdout
        3.11.3 stderr
        3.11.4 is_file_resource
        3.11.5 fopen
        3.11.6 fseek
        3.11.7 ftell
        3.11.8 fread
        3.11.9 fgetc
        3.11.10 fgets
        3.11.11 fwrite
        3.11.12 setbuf
        3.11.13 fflush
        3.11.14 feof
        3.11.15 ferror
        3.11.16 clearerr
        3.11.17 fclose
        3.11.18 remove
        3.11.19 rename
        3.11.20 errno
        3.11.21 strerror
    3.12 Date and time functions
        3.12.1 Date and time structure
        3.12.2 time
        3.12.3 gmtime
        3.12.4 localtime
        3.12.5 mktime
        3.12.6 asctime
        3.12.7 ctime
        3.12.8 strftime
    3.13 Locale functions
        3.13.1 getlocale
        3.13.2 setlocale
        3.13.3 localeconv
    3.14 Dictionary functions
        3.14.1 is_dict_resource
        3.14.2 dopen
        3.14.3 dread
        3.14.4 dwrite
        3.14.5 dremove
        3.14.6 dexists
        3.14.7 dclose
    3.15 Memory management functions
        3.15.1 is_mem_resource
        3.15.2 malloc
        3.15.3 calloc
        3.15.4 realloc
        3.15.5 free
        3.15.6 cnull
        3.15.7 is_null
        3.15.8 cstring
        3.15.9 mputchar
        3.15.10 mputshort
        3.15.11 mputint
        3.15.12 mputfloat
        3.15.13 mputdouble
        3.15.14 mputstring
        3.15.15 mputptr
        3.15.16 mgetchar
        3.15.17 mgetshort
        3.15.18 mgetint
        3.15.19 mgetfloat
        3.15.20 mgetdouble
        3.15.21 mgetstring
        3.15.22 mgetptr
        3.15.23 mstring
        3.15.24 is_rw
        3.15.25 msize
        3.15.26 memcpy
        3.15.27 memmove
        3.15.28 memcmp
        3.15.29 memchr
        3.15.30 memset
    3.16 Foreign function calls
        3.16.1 dyn_supported
        3.16.2 is_dyn_resource
        3.16.3 dyn_open
        3.16.4 dyn_close
        3.16.5 dyn_fn_pointer
        3.16.6 cfloat
        3.16.7 dyn_call_void
        3.16.8 dyn_call_int
        3.16.9 dyn_call_float
        3.16.10 dyn_call_ptr
    3.17 PCRE functions
        3.17.1 pcre_supported
        3.17.2 PCRE_ANCHORED
        3.17.3 PCRE_CASELESS
        3.17.4 PCRE_DOLLAR_ENDONLY
        3.17.5 PCRE_DOTALL
        3.17.6 PCRE_EXTENDED
        3.17.7 PCRE_MULTILINE
        3.17.8 PCRE_UNGREEDY
        3.17.9 PCRE_NOTBOL
        3.17.10 PCRE_NOTEOL
        3.17.11 PCRE_NOTEMPTY
        3.17.12 is_pcre_resource
        3.17.13 pcre_compile
        3.17.14 pcre_match
        3.17.15 pcre_exec
        3.17.16 pcre_free

4 Changes
    4.1 Language changes
        4.1.1 Version 1.0 to 2.0
        4.1.2 Version 2.0 to 2.1
        4.1.3 Version 2.1 to 2.2
    4.2 Library changes
        4.2.1 Version 1.0 to 1.1
        4.2.2 Version 1.1 to 2.0
        4.2.3 Version 2.0 to 2.1
        4.2.4 Version 2.1 to 2.2
        4.2.5 Version 2.2 to 2.3
        4.2.6 Version 2.3 to 2.4
        4.2.7 Version 2.4 to 2.5
        4.2.8 Version 2.5 to 2.6
        4.2.9 Version 2.6 to 2.7
        4.2.10 Version 2.7 to 3.0

1 Introduction

This manual describes the Arena scripting language. It is meant to give a complete overview of the language. This includes syntax, semantics, and standard library functions provided by the language runtime environment.

1.1 What's Arena?

Arena is a scripting language. It is closely modelled on the C programming language, but with some features removed and others added to create a language more suitable for ad-hoc scripting. The following is a description of the main differences between Arena and C.

Arena does automatic memory management. This means the programmer does not have to reserve memory for strings and arrays. Additionally, variables do not have to be declared before they are used.

Arena uses dynamic typing. This means variables can be used to store arbitrary values. A variable that holds an integer at the beginning of a script may well be used to hold a string at the end of the same script. The concept extends to arrays -- arrays can have elements of different types.

Arena has anonymous functions. Sometimes you may want to pass a function into another function (functions can accept other functions as their arguments), and anonymous functions provide a way of doing so without having to invent a function name. This is especially useful if you need a particular function just once and just for passing into another function.

Arena provides exception support. Exceptions can be used for handling error situations in a script. They provide out-of-band error signalling and handling.

Arena does not allow user-defined datatypes. This is a restriction common to many scripting languages. It does, however, have structure templates, which work a lot like classes in object-oriented programming languages.

Arena does not provide a way to define constants -- that is, values set by the programmer that cannot change during the execution of a script. The rationale is that it is not strictly necessary to have constants provided by the language. One can simply use a global variable and write to it only once at the beginning of a script.

Apart from the differences listed above, Arena tries to emulate C as much as possible. The semantics of language constructs are supposed to match C, and the standard library of functions uses the same names as the C standard library where both provide the same functionality.

1.2 Why another scripting language

There is no shortage of existing scripting languages, so why design and write another one? Two reasons, mainly.

The first reason is that many people, especially in the Unix community, know how to program in C, but having to do your own memory management all the time is a pain for small or quick projects. Arena provides a way to write "almost C" code without having to think about memory management. Dynamic typing was added because it is very convenient to have once you have already abandoned the need to declare variables before use (which you have to do in C so that the compiler can set aside memory for variables).

The second reason for writing another language is that most scripting languages of today are not really lightweight anymore. Extensive function libraries often mean that a scripting language interpreter is several megabytes in size. For fans of more minimalist approaches, several megabytes ain't it. Arena's standard library of functions is based on that of ISO C for the very reason that it is very compact and does not provide bells and whistles.

1.3 Target audience

This manual tries to describe the syntax and semantics of the Arena language, but it does not go into every detail and certainly is no guide on how to solve real problems using Arena.

It is assumed the reader already knows how to program. Most of the language constructs of Arena appear in other languages, as well, so already knowing a different programming language helps. Since Arena is modelled on C, knowing C helps a lot. For structure templates, which are not taken from C, knowledge of object-oriented programming languages such as C++ or Java should help, since structure templates are basically a low-level version of classes.

1.4 Versioning

Both the language and standard library are versioned. This manual describes version 2.2 of the language and version 3.0 of the standard library.

Incompatible changes to the language or library result in a change of the major version number. An implementation of the new version is not guaranteed to run all scripts written for a previous version of the language. Thus, an implementation of version 2.0 of the language will not run all possible version 1.0 scripts.

Compatible changes to the language or library result in a change of the minor version number. An implementation of such a new version must still be able to run all scripts written for a version of the language with the same major version number and a smaller minor version number. Thus, an implementation for version 1.3 of the language will still run all version 1.0, 1.1, and 1.2 scripts.

Minor version number changes for the language are only possible if some new syntax is introduced, in such a way that the new syntax would have been a syntax error in the previous version. Changes to existing syntax require a new major version.

Minor version number changes for the library are possible as long as only new library functions are introduced by the new version. Old scripts that already use the same function names for user-defined functions will still work as the user-defined functions will overwrite the library functions.
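
As a rough sketch, the guarantee for compatible versions could be written as the following Arena function. The function name and its parameters are made up for this illustration, and the definition syntax is assumed to be the C-like one covered in the section on user-defined functions.

	// true if an implementation of version impl_major.impl_minor is
	// guaranteed to run all scripts written for version major.minor
	bool runs_all_scripts(int impl_major, int impl_minor, int major, int minor)
	{
	  return (impl_major == major) && (impl_minor >= minor);
	}


With this sketch, runs_all_scripts(1, 3, 1, 0) is true, while runs_all_scripts(2, 0, 1, 0) is false.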

1.5 Structure of this manual

The rest of the text is divided into two main chapters. The first describes the syntax and intended semantics of the language. The second describes the standard library of functions that come with the language.

If some aspect of the behaviour of the language or library is said to be "implementation-defined", this means an implementation of the language can freely choose how to behave for the described situation. However, the choice must be consistent -- under the same circumstances, the same behaviour must result.

If some aspect of the behaviour of the language or library is said to be "undefined", this means an implementation of the language can do anything for the described situation, no matter how inconsistent. An implementation may even crash if an undefined situation arises during the execution of a script; or, as has been observed about C, an implementation may make demons fly out of your nose if you invoke undefined behaviour.

1.6 License

You are free to copy, distribute, display, make derivative works of, and/or make commercial use of this manual, provided you follow these conditions:

You must keep any copyright notices and license terms intact. You are free to add your own copyright notices to parts of a derivative work that you wrote yourself.

If you make changes to the semantics of existing parts of the text, those parts must carry prominent notice that you changed them. This condition is made because this manual describes the behaviour of a programming language, and changes to the text can easily change the described behaviour. This could lead to the changed text describing another, slightly incompatible language.

2 Language

This section of the manual describes the syntax and semantics of the Arena scripting language.

This version of the language manual describes version 2.2 of the language.

2.1 Basic tokens

When a script is parsed by the Arena interpreter, it is first split up into tokens. These tokens are then combined to form statements and expressions. Since it is important to know what kind of tokens (for example, variable and function names) are accepted by the language, the different token types are described next.

2.1.1 Comments

Comments can be part of a script. They are ignored by the interpreter and can be used to annotate the script for human readers. There are two forms of comments: one-line comments and multi-line comments.

One-line comments start with the character "#" (hash) or the characters "//" (double forward slash). They can be placed anywhere on an input line and cause the rest of the line to be treated as a comment. The following are examples of one-line comments:

	# this line is ignored
	a = 5; // everything back here is ignored


Multi-line comments start with the characters "/*" (forward slash followed by asterisk) and end with the characters "*/" (asterisk followed by forward slash). Everything between those two markings is ignored. Multi-line comments can be nested -- you need a matching number of "/*" and "*/" sequences to really end a comment. The following is an example of a multi-line comment:

	/* this is a lengthy explanation of what is happening,
	   but you can probably figure that out yourself */
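

Nesting could look like the following sketch, in which the outer comment only ends at the second "*/" sequence:

	/* outer comment starts here
	   /* inner comment */
	   this line is still part of the outer comment
	*/
	a = 5; // first code after the comment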


2.1.2 Keywords

Keywords are words reserved by the language. They are used to make up statements and expressions. They cannot be used as names for variables, functions, or templates. Keywords are case-sensitive: "do" is a language keyword, "Do" or "DO" are not.

The following is a list of all Arena keywords:

	array   break    bool     case    catch   continue
	default do       else     extends false   float
	fn      for      forced   if      include int
	mixed   new      resource return  string  struct
	switch  template throw    true    try     void
	while


2.1.3 Operators

Operators are special symbols reserved by the language. They are used to combine expressions and generally represent operations performed on pieces of data. For example, the + operator denotes mathematical addition.

The following is a list of all Arena operator symbols:

	::      ==      !=      <=      >=      <
	>       ++      --      &&      ||      **
	+       -       *       /       %       &
	|       ^       <<      >>      !       ~
	=       +=      -=      *=      /=      &=
	|=      ^=      <<=     >>=


2.1.4 Identifiers

An identifier is a name used for a variable, a function, or a structure template. It is used in a script to refer to entities of the language by name. Identifiers are chosen by the programmer. The language actually puts some identifiers in place before a script starts (those for the standard library of functions), but those are not reserved in the same way that keywords are -- you can reuse them for your own variables, functions, or structure templates if you wish.

An identifier starts with an underscore character or an upper-case or lower-case letter. A letter is one of the 26 characters in the range A-Z (no umlauts or accented letters allowed). For the rest of an identifier, the same characters are allowed, with the addition of decimal digits. Decimal digits are characters in the range 0-9.

Keywords cannot be used as identifiers.

The following is a list of example identifiers:

	foo
	x2
	my_funny_name
	__something
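

By the same rules, the following are not valid identifiers (the names are chosen for this illustration only):

	2x              starts with a decimal digit
	my-funny-name   contains "-", which is an operator symbol
	while           is a keyword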


2.1.5 Integer literals

An integer literal is used to represent an integer number in a script. An integer literal is made up of an optional prefix and one or more digits. An integer literal with no prefix is treated as a decimal number. Decimal digits are characters in the range 0-9. An integer literal with the prefix "0" (zero) is treated as an octal number. Octal digits are characters in the range 0-7. An integer literal with the prefix "0x" (zero x) is treated as a hexadecimal number. Hexadecimal digits are characters in the ranges 0-9, a-f, and A-F.

The following are examples of integer literals:

	0
	123
	0755
	0xFF
	0xbeef
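

To illustrate how the prefixes change the value of a literal, the following sketch assigns some of the literals above to variables (the variable names are arbitrary) and notes the decimal value of each:

	a = 123;     // decimal: 123
	b = 0755;    // octal: same value as decimal 493
	c = 0xFF;    // hexadecimal: same value as decimal 255
	d = 0xbeef;  // hexadecimal: same value as decimal 48879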


2.1.6 Float literals

A float literal is used to represent a floating point number in a script. A float literal is made up of zero or more decimal digits, followed by a period, followed by one or more decimal digits. A decimal digit is a character in the range 0-9. Optionally, an exponent can be added to the end of the literal. This is composed of the letter "e" or "E", followed by either "+" or "-", followed by one or more decimal digits. If present, the exponent is interpreted as a power of 10 by which the rest of the number is multiplied. As an example, "1E-2" is the same as 1 * 10^(-2), which is 0.01.

The following are examples of float literals:

	1.0
	.25
	0.376568E-10
	1E+30
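

As a worked example of the exponent rule, each of the following literals denotes the same number, 0.01:

	0.01
	1.0E-2
	.1E-1
	100.0E-4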


2.1.7 String literals

A string literal is used to represent a string inside a script. A string literal is made up of a single or double quote character, followed by an arbitrary number of characters, followed by a matching single or double quote. If the string literal is enclosed in single quotes, it cannot contain a single quote. The same applies to string literals in double quotes; they cannot contain double quotes.

To allow the representation of characters that cannot directly appear inside a script or string, some escape sequences are permitted. An escape sequence begins with the character "\" (backslash). The following escape sequences are defined:

	\\      a literal backslash
	\b      backspace character
	\e      escape character
	\f      form feed character
	\n      newline character
	\r      carriage return character
	\t      tab character
	\ccc    character with octal character code ccc
	\occc   character with octal character code ccc
	\dccc   character with decimal character code ccc
	\xcc    character with hexadecimal char code cc


For character code escapes, fewer digits than given above can be used if the character code needed is small enough. Note that if any character not listed above follows the backslash, the escape sequence results in that character. For example, the escape sequence "\q" results in the character "q".

The following are examples of string literals:

	"Hello"
	'Greetings to you!\n'
	"All your base are belong to us"
	'Embedded \0 zero \0 characters'
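

To illustrate the character code escapes, each of the following string literals contains just the single character "A". This assumes an ASCII-compatible character set, in which "A" has the decimal character code 65:

	"A"
	"\d65"
	"\x41"
	"\101"
	"\o101"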


2.1.8 Grouping symbols

Grouping symbols are used to make up larger entities from statements and expressions or to change the order in which script code is executed. The following is a list of the grouping symbols used by the Arena language:

	(       )       {      }        [       ]
	.       ;       ,


2.2 Runtime type system

Types are used to provide categories for different kinds of values that a script deals with. Arena provides nine datatypes for use by the programmer. No user-defined types are possible, but a script can use structure templates to provide a sort of sub-typing for the struct datatype.

Values of some types can be converted into values of other types by use of a cast expression. More on that later in the chapter about expressions.

2.2.1 void

The void type is used in places where no meaningful value can be returned. The void type has only one value, which is written "()" (two parentheses immediately following each other, pronounced "void" or "unit"). All Arena functions must return a value. If a function does not have a meaningful result (for example, a function that outputs a message to the user), it can return a void value instead of having to invent something else.

2.2.2 bool

The bool type is used to represent truth values. It has two values called "false" and "true". It is normally used to hold the results of boolean computations or for representing simple on-off switches.

2.2.3 int

The int type is used to hold signed integer values. The precision is at least 32 bits. This means an int can generally hold integer values between -2^31 and 2^31 - 1.

Arena does not provide unsigned integers. The rationale for this is that the additional bit of precision that an unsigned type provides for large positive integer values is not enough of a benefit to warrant extra complexity for an implementation.

2.2.4 float

The float type is used to represent signed floating point numbers. The precision of a float is at least that of an IEEE double precision floating point number.

Arena does not provide multiple floating point types with different precisions, like C does. Like the omission of an unsigned integer type, this was decided to keep implementation complexity down to a minimum.

2.2.5 string

The string type is used to represent an arbitrary sequence of bytes or characters. It is normally used to represent text. Note that unlike strings (character pointers, really) in C, an Arena string can contain bytes with the value 0 (zero). In C such a byte would be considered the end of the string.

2.2.6 array

The array type is used to represent a numbered collection of values. The types of the values stored in an array, called the elements of the array, are not constrained. This means each element can have a different type from the other elements. An array can have other arrays as elements.

Arrays are indexed using integers, starting at 0. This means the first element of an array has index 0, the second has index 1, the third has index 2, and so on.
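
A minimal sketch of these rules follows. It assumes that the library function mkarray (listed in the library chapter) builds an array from its arguments and that elements are accessed with square brackets, as covered in the section on indexing of elements. The variable names are arbitrary.

	a = mkarray(10, "twenty", 3.0);  // elements of different types
	first = a[0];                    // 10, the first element
	third = a[2];                    // 3.0, the third element
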

2.2.7 struct

The struct (short for structure) type is used to represent a collection of values. Unlike an array, in which the elements are referenced by integer indices, the elements of a struct have names. The order of elements in a struct is not significant, which is another important difference from the array type. Elements in a structure are called "fields" or sometimes "methods" (if they are of type fn, see below).

The names of structure elements are identifier tokens, but there are also library functions that use normal string values as structure element names. In general, you can think of a struct as being indexed by string values.

2.2.8 fn

The fn type is used to represent functions. This type allows an Arena script to use functions like any other value. For example, functions can be used as arguments to other functions or can be returned as results from other functions. It is also possible to create so-called anonymous functions on the fly, by use of a special expression that results in an fn value.

2.2.9 resource

The resource type is used to represent operating system resources in use by a script. Examples are file handles or manually allocated memory. The resource type has automatic management that ensures that operating system resources are freed when a resource value is no longer accessible by a running script.

The contents of a resource value are opaque from the viewpoint of a running Arena script.

2.3 Scopes and namespaces

A scope is defined as the area where a given portion of source code appears in a script. A namespace defines a limited area of visibility for variables, functions, and structure templates. Both concepts are related and determine what parts of a script can access other parts of the same script.

2.3.1 Top-level vs. function-level scope

The scope of a piece of code is determined wholly by its position in the source code. The scope of a given piece of code cannot and does not change at runtime.

The scope active at the beginning of a script is the top-level scope. At this scope, arbitrary statements can be used, including function and structure template definitions.

When a function definition begins, the source code scope changes to function-level scope. At this scope, all statements except other function definitions and structure template definitions are allowed. This means function definitions cannot be nested.

When a function definition ends, the statements that appeared in the function-level scope become the function's body. The function body is what gets executed when a function later is called from other code. After leaving a function definition, the top-level scope is active again.

When a structure template definition begins, the scope remains top-level scope, but the following definitions up to the end of the structure template definition are considered to be part of it. Structure template definitions cannot be nested.

2.3.2 Global vs. local namespace

Namespaces are areas where variables, functions, and structure templates are stored. All the named entities of the language that are used in a script are part of a namespace. A namespace associates identifiers with the entities they name. Note that there are no separate namespaces for variables, functions, and structure templates. A given identifier can only be used for one kind of entity at a time.

Namespaces can be visible or invisible to the currently executing code. Code can only see variables, functions, and structure templates stored in a visible namespace. Entities stored in an invisible namespace remain unchanged until the namespace becomes visible again.

There is one special namespace called the global namespace. This namespace is always visible. Variables and functions provided by the Arena standard library are stored in the global namespace. Code running at top-level scope has access to only one namespace, the global namespace.

In addition to the global namespace, there are local namespaces. A local namespace is created whenever a function is called. The code inside the function runs within a local namespace of its own. To this code, both the global namespace and the local namespace of the function are visible. The local namespace starts out empty.

The visibility rules inside a local namespace are as follows: the local namespace has priority. The global namespace is consulted only if an identifier is not found in the local namespace. When the namespace is written to, the write always affects only the local namespace. If a function attempts to change a variable it has obtained from the global namespace, a copy of the variable is created in the local namespace.

When a function calls another function, another local namespace is created. The previous local namespace is invisible to the code inside the called function. That namespace becomes visible again only when the called function exits.

When a function exits, its local namespace is destroyed. Everything that was stored in the local namespace is no longer accessible. You can assume memory that was used by the local namespace is freed at this point.

What the above boils down to is that functions have their own set of local variables and can manipulate them without affecting variables outside of the function itself.
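
The following sketch illustrates these rules. It assumes the C-like function definition and call syntax covered in the sections on user-defined functions and function calls; the names counter, bump, and result are made up for this example.

	counter = 1;              // stored in the global namespace

	int bump()
	{
	  counter = counter + 1;  // reads the global value, writes a local copy
	  return counter;         // returns the local copy, which is 2
	}

	result = bump();          // result is 2
	// counter in the global namespace is still 1 at this point
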

As a side note, the struct type works just like a namespace of its own.

2.4 Statements

Statements provide a way to sequence and structure code. In other words, statements determine what gets executed and under which conditions.

The following sections include code examples that make use of expressions, which have not been described up to now. Expressions will be explained in the next chapter.

2.4.1 Basic rules for statements

Statements are executed in the order in which they appear in the top-level scope. Individual statements end with a ";" (semicolon) character. Expressions can be used as statements by simply adding a semicolon at the end of the expression. For example, if "expr" is a valid expression, then the following is a valid statement:

	expr;


Using an expression as a statement evaluates the expression. Evaluation of an expression results in a value of one of the types provided by the language. When an expression is used as a statement, that value is discarded.

Statements can be grouped together into one statement by using curly braces. The whole block of statements counts as one statement. When the block is executed, the statements inside it are executed in the order they are listed. For example:

	{
	  stmt1;
	  stmt2;
	  stmt3;
	}


The above is a block consisting of three statements. Note that there is no semicolon at the end of the block itself. Blocks can be nested arbitrarily deep. Blocks are normally used when you want to supply a list of statements to execute in a place where only one statement is allowed by the language.

A semicolon all by itself also constitutes a valid statement that does nothing when executed. Blocks are allowed to be empty. An empty block does nothing when executed.

2.4.2 Include statement

The include statement is made up of the keyword "include" followed by a string in double quotes, followed by a semicolon as usual for ending a statement. The string is used as a filename. The contents of the file are parsed as source code as if it were present after the line with the include statement on it.

Note that the included code will be parsed at the current scope. If the current scope is inside a function, the included code cannot define functions or structure templates. Normally include statements are only used at global scope, for including files that contain libraries of functions or structure template definitions.

An implementation of Arena may search for the named include file in implementation-defined locations on the system running the script. However, it is only guaranteed that the current working directory will be searched.

Include files can be nested arbitrarily deep. It is the responsibility of the programmer to prevent loops.

The following is an example of an include statement used to read in a file called "library.inc":

	include "library.inc";


2.4.3 Control flow statements

Control flow statements influence the order in which statements are executed, or whether they are executed at all.

2.4.3.1 if statement
The if statement is used to execute code based on a condition. It consists of the keyword "if", followed by an expression in parentheses, followed by a statement or block. The expression is called a guard expression.

When the if statement is executed, the guard expression is evaluated. If the resulting value is not of type bool, it is converted to bool (using the same rules as for cast expressions, see below). If the result is the bool value "true", the statement part of the if statement is executed. If the result of the guard expression is "false", the statement part is not executed.

The following is an example of an if statement:

	if (x % 2 == 0)
	  print("x is even!");


If you need to execute multiple statements, use a block statement.

You can also give a statement to be executed when the guard expression evaluates to "false". This is done by following the first statement with the keyword "else" and another statement. An example:

	if (x % 2 == 0)
	  print("x is even!");
	else
	  print("Sorry, x is odd!");
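

The conversion of the guard expression to bool can be sketched as follows. The variable "name" is made up for this illustration. Since an empty string converts to "false" (see the section on conversion to bool), the else part should be executed here:

	name = "";
	if (name) {
	  print("hello, ", name, "\n");
	} else {
	  print("no name given\n");
	}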


2.4.3.2 while loop statement
The while loop statement can be used to execute another statement or block multiple times. It consists of the keyword "while", followed by a guard expression in parentheses, followed by a statement known as the loop body.

When a while loop is executed, the guard expression is evaluated, following the same rules as given for the guard expression of an if statement. If the result is "true", the loop body is executed. Execution of the while loop then restarts at the beginning. If the guard expression evaluates to "false", the loop body is not executed and the while loop is not restarted at the beginning.

These rules mean that a while loop only executes as long as the guard expression evaluates to "true". If the guard expression evaluates to "false" the first time it is considered, the loop body is never executed.

The code inside the while loop normally has side effects that eventually change the result of the guard expression to "false".

The following is an example of a while loop with a block statement as its loop body:

	while (x % 2 == 0) {
	  print("x was even");
	  x = rand(0,999);
	}
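

As a further sketch, the following loop counts down from 3 to 1. The variable name "n" is chosen for this illustration only. If "n" were set to 0 before the loop instead, the guard expression would be "false" the first time it is considered and nothing would be printed:

	n = 3;
	while (n > 0) {
	  print(n, "\n");
	  n = n - 1;
	}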


2.4.3.3 do loop statement
The do loop statement is a close cousin of the while loop statement; only the positions of the guard expression and loop body are exchanged. A do loop consists of the keyword "do", followed by a statement as the loop body, followed by the keyword "while" and a guard expression in parentheses.

When a do loop is executed, the loop body gets executed first. Then the guard expression is evaluated using the same rules as given for the guard expression of an if statement. If the result is "true", the do loop is executed again. If the result is "false", execution continues after the loop.

The above rules mean that the body of a do loop is always executed at least once. It is then executed again as long as the guard expression evaluates to "true".

The following is an example of a do loop:

	do {
	  now = time();
	} while (now - saved < 10);
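

The difference from the while loop can be sketched with a guard expression that is "false" from the start. The variable name "i" is illustrative. The body below should run exactly once, printing the value 10, because the guard expression is only evaluated after the body has been executed:

	i = 10;
	do {
	  print("i is ", i, "\n");
	  i++;
	} while (i < 5);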


2.4.3.4 for loop statement
The for loop statement offers a more versatile form of looping compared to the while and do loops detailed in the previous two sections. A for loop consists of the keyword "for", followed by three semicolon-separated expressions in parentheses, followed by a statement that serves as the loop body. The first expression is called an initialiser expression, the second a guard expression, and the third a loop expression.

When a for loop executes, the initialiser expression is evaluated. This happens only once, and the result of the evaluation is discarded. Then the guard expression is evaluated using the same rules as given for the guard expression of an if statement. If the result is "true", the loop body is executed. Following the loop body, the loop expression is evaluated and its result discarded. Execution of the for loop then restarts, omitting the initialiser expression. If the guard expression evaluates to "false", the loop body and loop expression are not executed and execution resumes after the for loop.

The above rules mean that a for loop executes as long as its guard expression evaluates to "true". If it does not evaluate to "true" on the first execution of a for loop, the loop body is never executed.

Each of the three expressions in a for loop statement can be left empty. In that case the (empty) expression is replaced with the literal constant "true". This means a for loop with all three expressions left off produces an infinite loop.

For loops are often used to execute a piece of code a given number of times. For example, the following loop prints the word "hello" ten times in a row:

	for (i = 0; i < 10; i++) {
	  print("hello");
	}
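

Going by the evaluation rules given above, the same loop could be sketched as a while loop, with the initialiser expression moved in front of the loop and the loop expression moved to the end of the loop body:

	i = 0;
	while (i < 10) {
	  print("hello");
	  i++;
	}


The for loop form keeps all three parts of the loop control in one place, which is usually easier to read.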


2.4.3.5 continue statement
The continue statement can be used inside of do, while, and for loops. It consists of the keyword "continue".

When a continue statement is executed inside of a loop body, the statements following the continue statement in the loop body are skipped. Processing continues as normal for the loop statement in question. Normally this means the loop's guard expression will be evaluated again.

When a continue statement is executed outside of a loop body, it has the same effect as an empty statement.

The following is a (silly) example of counting the number of odd integers between 0 and 99. A for loop is used and the increment of a counter variable is skipped by use of a continue statement if the number in question is even.

	odd = 0;
	for (i = 0; i < 100; i++) {
	  if (i % 2 == 0) continue;
	  ++odd;
	}
	print(odd, " odd numbers found");


2.4.3.6 break statement
The break statement can be used inside of do, while, and for loops (for the use of break in a switch statement, see the next section). It consists of the keyword "break".

When a break statement is executed inside of a loop body, the execution of the rest of the loop body is skipped. Execution then resumes with the next statement following the loop statement that contains the break statement. In effect, execution of that loop is terminated by the break statement.

When a break statement is executed outside of a loop body (or switch statement, see below), it has the same effect as an empty statement.

The following is an example use of break which exits from an infinite for loop as soon as a random number between 0 and 99 equals zero.

	for (;;) {
	  number = rand(0, 99);
	  print("my number: ", number, "\n");
	  if (number == 0) break;
	}
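

When loops are nested, a reasonable reading of the rule above is that break terminates only the loop whose body directly contains it. Under that assumption, the following sketch (with the illustrative variable names "i" and "j") should print one line for each value of "i", since the inner loop is left as soon as "j" reaches 1 while the outer loop keeps running:

	for (i = 0; i < 3; i++) {
	  for (j = 0; j < 3; j++) {
	    if (j == 1) break;
	    print(i, " ", j, "\n");
	  }
	}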


2.4.3.7 switch statement
The switch statement is used to execute one or more of a number of statement groups depending on the value of a guard expression. It consists of the keyword "switch", followed by a guard expression in parentheses, followed by statement groups enclosed in curly braces.

Two different kinds of statement groups are possible. There can be an arbitrary number of case groups and one default group. A case group starts with the keyword "case" followed by an expression, followed by a colon, followed by an arbitrary number of statements. If the last statement in the group is a break statement, this has a special meaning described below. The default group consists of the keyword "default" followed by a colon, followed by an arbitrary number of statements. A break statement at the end of a default group has no special meaning relevant to the switch statement, but it still has its normal effect on an enclosing loop statement.

When a switch statement is executed, its guard expression is evaluated. The resulting value is then used to decide which case group to execute. Case groups are considered in the order that they appear in the switch statement. When a case group is considered, its expression is evaluated. If the resulting value is equal (in type and value) to the value of the guard expression, the statements inside the case group are executed. If the last statement of the group is a break statement, execution of the switch ends and the next statement executed is the one following the switch statement. If there is no break at the end of the case group, the statements of the next group are also executed, without evaluating the expression of that group. This is called "fall through". This behaviour continues until either a break statement at the end of a case group is encountered, a default statement group is executed, or the switch statement ends.

If a case group is considered and its value does not match the value of the switch's guard expression, the statements in the case group are not executed. The next case group is considered instead and its expression will be evaluated and checked. A default group, if present, is not included in the case statements to consider for execution.

When all case statements have been considered and no match was found, the behaviour of the switch statement depends on the presence of a default group. If it is present, the statements associated with it are executed. If it is not present, the switch simply executes nothing. Note that there is no fall through out of a default group; execution of a switch always ends once the last statement of the default group has been executed.

The following example counts how many numbers between 0 and 99 are divisible by 3 or 6. It uses a switch that evaluates the remainder of a division by 6. It employs fall through since anything divisible by 6 is also divisible by 3. It uses a default group to count how many numbers were not divisible by 3 or 6.

	three = six = none = 0;
	for (i = 0; i < 100; i++) {
	  switch (i % 6) {
	    case 0:
	      ++six;
	    case 3:
	      ++three;
	      break;
	    default:
	      ++none;
	  }
	}
	print(three, " numbers were divisible by 3\n");
	print(six, " numbers were divisible by 6\n");
	print(none, " numbers were not divisible by either\n");
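

The requirement that values match in type as well as in value can be sketched as follows. The variable "x" is illustrative. Because "x" holds the int value 1, the case group with the float expression 1.0 should not match, and the case group with the int expression 1 should be executed instead:

	x = 1;
	switch (x) {
	  case 1.0:
	    print("float one\n");
	    break;
	  case 1:
	    print("int one\n");
	    break;
	  default:
	    print("something else\n");
	}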


2.4.3.8 try statement
The try statement is used to handle exceptions. It consists of the keyword "try", followed by a statement, followed by the keyword "catch", followed by an identifier in parentheses, followed by another statement.

When a try statement is executed, the statement following the keyword "try" is executed. What gets executed next depends on whether this statement causes an exception (by use of a throw statement, see below). If the enclosed statement does not cause an exception, the next statement executed is the statement directly following the try statement; the statement in the catch part of the try statement is not executed.

If the enclosed statement does throw an exception, the value thrown is assigned to a variable with the identifier given in the catch part of the try statement. The statement given in the catch part is then executed. Execution then continues after the try statement. The variable with the exception value remains visible to the code following the try statement. Executing the catch part of a try statement is often called "handling" the exception.

It is possible for try statements to be nested arbitrarily deep. An exception is always handled by the innermost try statement that encloses the code that caused the exception.

It is common for both statements in a try statement to actually be block statements.

The following is an example of a try statement used to encapsulate two function calls which may cause exceptions. If an exception occurs, its value is printed.

	try {
	  a = somefunc();
	  b = someotherfunc();
	} catch (e) {
	  print("exception ", e, " occurred\n");
	}
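

Nesting can be sketched as follows, using the throw statement described in the next section. The exception is thrown inside the inner try statement, so the inner catch part should handle it. The outer block then continues normally and the outer catch part is not executed:

	try {
	  try {
	    throw "inner";
	  } catch (e) {
	    print("inner handler got ", e, "\n");
	  }
	  print("outer block continues\n");
	} catch (e) {
	  print("outer handler got ", e, "\n");
	}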


2.4.3.9 throw statement
The throw statement is used to cause an exception. It consists of the keyword "throw" followed by an expression.

When a throw statement is executed inside of a try statement (either directly or because it occurs inside a function called from within a try statement), the throw expression is evaluated and the resulting value becomes the exception value. Execution then continues with the catch part of the innermost enclosing try statement.

Note that the above means a throw statement executed inside a loop body breaks out of the loop if the handling try statement is outside of the loop.

When a throw statement is executed outside of a try statement, this is considered a fatal error and execution of the whole Arena script is terminated at the point where the exception was thrown.

The following is an example of the use of a throw statement to throw an exception with the string value "oops" as the exception value:

	throw "oops";
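

A slightly larger sketch combines the try and throw statements. The function name "check" is made up for this illustration. The exception is thrown inside a function that is called from a loop inside a try statement, so the loop should be abandoned once "i" reaches 3, and the catch part should then print the exception value:

	void check(int n)
	{
	  if (n > 2) throw "too large";
	}
	try {
	  for (i = 0; i < 10; i++) {
	    check(i);
	    print("checked ", i, "\n");
	  }
	} catch (e) {
	  print("caught: ", e, "\n");
	}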


2.4.4 User-defined functions

User-defined functions provide a way to structure code into separate, named entities. Each function accepts input values, called function arguments, and computes a value called the return value of the function when called.

2.4.4.1 Function definition
A function definition declares a user-defined function to the script interpreter. It consists of the function return type, followed by an identifier naming the function, followed by a list of argument types and names in parentheses, followed by a statement to be used as the function body. The individual argument types and names are separated by commas. The list of arguments can be left empty.

The return type can be given by using one of the keywords "void", "bool", "int", "float", "string", "array", "struct", "resource", or "fn". The intent is to specify that the function returns a value of the given type when it is called. It is a fatal error if the code of the function body does not return a value that has the return type. The special keyword "forced" can be prefixed to the return type. If it is, it is not an error if the function attempts to return a value not having the return type -- instead, the language automatically casts (see cast expressions, below) the return value to the appropriate type. The special keyword "mixed" can also be used in place of a real type to indicate that the return value of the function does not always have one and the same type.

Function arguments are specified by using the optional keyword "forced", followed by a type name (same as the return type detailed above), followed by an identifier. The identifier is used to name the argument. When a function is called, the function's arguments are available to the function body as variables with names as given in the function definition. The argument type of an argument is checked when a function is called. If the "forced" keyword was used, the argument value is automatically cast to the given type. If not, it is a fatal error to call the function with an argument value not matching the given argument type.

The type of a function argument can be left out, in which case the language behaves as if the type "mixed" had been specified.

The function body can be any statement. Most functions contain more than one statement, thus most function bodies will be block statements.

The return type, name, and argument types of a function are called the prototype of the function.

When a function definition is executed, the new function's existence is recorded in the current namespace. Since function definitions can only occur at top-level scope, this will always be the global namespace. It is not an error to define a function with the same name as an existing variable, function, or structure template. The new function definition will override any previous meaning of the same name.

The result value, or return value, of a function is determined by using a return statement, described below. A function body that does not use a return statement will automatically be made to return a void value by the language runtime system.

The following is an example of a function definition for a function named "sum" that returns an int value and accepts two int arguments named "x" and "y", respectively. The example function body returns the sum of both int values.

	int sum(int x, int y)
	{
	  return x + y;
	}


The function definition above will result in a fatal error if passed float arguments, for example. To cause the language to automatically convert both arguments to int when the function is called, the definition would have to be changed to:

	int sum(forced int x, forced int y)
	{
	  return x + y;
	}
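

The keyword "forced" on a return type and the omission of an argument type can be sketched in the same way. The function names "whole_part" and "identity" are made up for this illustration. Going by the rules above, "whole_part" should return the int value 2 when called with the float value 2.75, since the returned float value is cast to int, while "identity" accepts and returns a value of any type:

	forced int whole_part(float x)
	{
	  return x;
	}
	mixed identity(v)
	{
	  return v;
	}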


2.4.4.2 return statement
The return statement is used to set the return value of a function and terminate the execution of a function body. It consists of the keyword "return" followed by an optional expression.

When a return statement is executed inside a function body, the return expression is evaluated and used as the return value of the function. If no return expression is present, a void value is substituted instead. Statements following the return statement in the function body are not executed. The effect of the return statement is to always end the execution of a function body.

The return value is passed back to the caller of the function.

When a return statement is executed outside of a function body, it behaves like an empty statement and the return expression is not evaluated.

The following is an example of a return statement used to return the bool value "true":

	return true;
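

Because a return statement ends the execution of the function body, it can be used to leave a function early. In the following sketch, the function name "sign" is made up for this illustration. Exactly one of the three return statements is executed per call:

	int sign(int x)
	{
	  if (x < 0) return -1;
	  if (x > 0) return 1;
	  return 0;
	}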


2.4.5 Structure templates

A structure template is a blueprint for constructing values of type struct. Structure templates support inheritance, meaning one structure template can build upon another structure template defined earlier. Structure templates can define fields and methods that are to be created when a struct value is constructed from the template.

A structure template consists of the keyword "template", followed by an identifier to name the template, followed by field and method definitions enclosed in curly braces. Optionally, the name of the template can be followed by the keyword "extends" and an identifier naming another structure template that this template builds upon.

When a structure template is executed, the new template is stored in the current namespace and is available to code following the structure template. Since structure templates can only occur at top-level scope, they are always stored in the global namespace. It is not an error if the template name is already used by an existing variable, function, or other template. The new structure template overrides any previous definition of the same name.

See the following sections for examples of structure templates. See the section "Constructor calls" in the chapter on expressions for information on how to create struct values from structure templates.

2.4.5.1 Defining structure fields
Structure fields in structure templates are used to define data fields that will appear in struct values created from the template. The definition of a structure field gives the identifier of the field. A value for the field can also be given, but this is optional.

A structure field definition without value consists of an identifier followed by a semicolon. When a struct value is constructed from the template, the resulting value will have an element named by the identifier that contains a void value.

A structure field definition with value consists of an identifier, followed by the assignment operator ("="), followed by an expression, followed by a semicolon. When a struct value is constructed from the template, the resulting value will have an element named by the identifier that contains the result of evaluating the expression.

The following is an example of a structure template that defines two structure fields. The first field is named "i" and not given a value, the second is called "foo" and given the constant int expression 42 as a value.

	template example
	{
	  i;
	  foo = 42;
	}


When a template extends another template, both may contain fields of the same name. The values given by the extending template have precedence. In the following example, a struct value constructed from template "bar" will contain a field called "i" with the int value 2.

	template foo
	{
	  i = 1;
	}
	template bar extends foo
	{
	  i = 2;
	}


2.4.5.2 Defining structure methods
A method is a function stored within a structure. This is basically the same as a struct field with type fn. The name "method" was chosen because that is how object-oriented languages name a similar construct.

A structure method definition inside a structure template is written exactly like a function definition (see above). The only difference is that the function definition occurs within the curly braces enclosing the structure template's definition.

When a struct value is constructed from the structure template, it will contain an element with the function name from the function definition. The element will contain a value of type fn that corresponds to the given function prototype and body.

The following is an example of a structure template that defines a method called "double", which is given as a function that will double its int argument.

	template foo
	{
	  int double(int x)
	  {
	    return 2 * x;
	  }
	}


For structure templates extending other structure templates, the same rules as for structure fields apply: when both templates define a method of the same name, the definition in the extending template takes precedence. In the following example, struct values constructed from the "bar" template will contain a method called "fiddle" that quadruples its argument, whereas struct values constructed from the "foo" template will contain a method called "fiddle" that triples its argument.

	template foo
	{
	  int fiddle(int x)
	  {
	    return 3 * x;
	  }
	}
	template bar extends foo
	{
	  int fiddle(int x)
	  {
	    return 4 * x;
	  }
	}


Note that field and method definitions in a structure template can be intermixed in any order.

2.4.5.3 Constructor method
A constructor method is a structure method definition with a special name. A method is called the constructor method if its identifier is the same as the identifier of the structure definition it is part of.

Constructor methods play a special role when a struct value is constructed from a template, as described in the section "Constructor calls" in the chapter on expressions. Apart from that, a constructor method behaves identically to other methods defined by a structure template.

The following is an example of a structure template "foo" that contains a constructor method that will print out a message whenever it is called.

	template foo
	{
	  void foo()
	  {
	    print("constructor method foo called!\n");
	  }
	}


2.5 Expressions

Expressions are basically descriptions of how to compute a value. Determining the value of an expression is called evaluating the expression. The result of evaluating an expression, called its value, is a value from one of the nine built-in types of the Arena scripting language.

2.5.1 Basic rules for expression nesting

Expressions can be made up of other expressions by use of several operators which are detailed in the sections below. The exact meaning of compound expressions such as "2 + 3 * 5" is determined by precedence and associativity. For example, in the expression "2 + 3 * 5", the multiplication is performed before the addition. To override the order in which parts of an expression are evaluated, it is possible to put parts of an expression in parentheses. The sub-expression thus formed must be a valid expression in itself and its value will be evaluated before the rest of the original expression. For example, to compute the addition before the multiplication in the aforementioned example, the expression would have to be changed to "(2 + 3) * 5".

The following sections list all possible types of expressions supported by the Arena scripting language. Precedence and associativity of all language operators are given near the end of the chapter.

2.5.2 Constant expressions

A constant expression consists of a literal token. There are literal tokens for the types void, bool, int, float, and string.

When a literal token expression is evaluated, the result is a value of the appropriate type. For example, the literal expression "12" evaluates to the int value 12.

The following are examples of constant expressions:

	true
	12.0
	"I'm a string"
	()
	42


2.5.3 Reference expressions

A reference expression is used to refer to a variable or function. It consists of an identifier.

When a reference expression is evaluated, the result is the value of the named variable in the current namespace. If the identifier refers to a function, the result is a value of type fn. If the identifier is unknown or names a structure template, the result is a void value.

The following are examples of reference expressions:

	a
	foo
	some_long_identifier


2.5.3.1 Static reference expressions
A static reference expression is used to refer to elements of a structure template. It consists of an identifier, followed by the operator symbol "::" (double colon), followed by another identifier.

The first identifier is a template name that is looked for in the current namespace. If it does not denote an existing structure template, a fatal error is generated. Otherwise, a separate namespace is created. The field and method definitions of the structure template are then executed inside the new namespace. The second identifier is then used like a normal reference expression inside the new namespace. The new namespace is destroyed after obtaining the value of the static reference, which is the value of the whole static reference expression.

The following are examples of static references:

	foo::bar
	some_template::some_field
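

As a sketch, assume a structure template named "limits" with a field named "upper" (both names are made up for this illustration). Going by the rules above, the static reference below should evaluate to the int value 100 without any struct value being constructed:

	template limits
	{
	  upper = 100;
	}
	print(limits::upper, "\n");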


2.5.3.2 Indexing of elements
Indexing is used to refer to elements of array and struct values. Indices can be placed directly after reference expressions, static reference expressions, and all kinds of function and method calls.

An array index consists of one or more expressions, each enclosed in square brackets. When an array index is evaluated, the indexed expression and the expression(s) used as the index are evaluated. If the result value of the indexed expression is not an array, a void value is returned. Otherwise, the result of the index expression is cast to an integer (see below for type casting rules) and used as an index into the array. If the resulting integer index is valid for the array in question, the element stored at that index is the result of the indexing expression. Otherwise, the result is a void value.

The following is an example expression that assumes "a" is the name of an array variable and references the third element of the array:

	a[2]


As a special case, negative indexing is allowed. A negative index is taken to be an offset from the end of the array. This way, the index -1 accesses the last element of an array. -2 accesses the element immediately preceding the last element, and so on. If a negative index reaches beyond the beginning of an array, the result is a void value.
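
The following sketch assumes that "a" is the name of an array variable holding exactly three elements. Going by the rules above, the first expression refers to the last element, the second to the first element, and the third reaches beyond the beginning of the array and thus results in a void value:

	a[-1]
	a[-3]
	a[-4]
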

Struct values contain values indexed by identifiers. A reference to a struct field consists of the operator symbol "." (period) followed by an identifier.

When a struct index is evaluated, the preceding expression is evaluated. If the result is not a struct value, the result is a void value. Otherwise, the index identifier is used as an element name for the struct value. If the struct has an element of that name, the value stored under that name is the result of the indexing expression. If the struct value does not have an element with the given name, a void value is used as the result.

The following is an example of an expression that uses "a" as the name of a struct variable and indexes a field "name" off the variable's value:

	a.name


Array and struct indices can be freely mixed. Multiple array and struct indices can follow each other. Evaluation proceeds from left to right. The following are examples of expressions with multiple indices:

	a[2].foo[3][7].value
	str.data[100]
	a[0][1][2]
	foo.bar.foobar
	a[1].bar.foo[2]


The last example above would be evaluated as follows: first the variable reference "a" would be evaluated. If the resulting value is an array, the second element of the array is accessed. If the result is a struct, the field named "bar" is accessed. If the result is again a struct, the field named "foo" is accessed. If this results in an array value, the third element of that array is accessed and used as the value for the whole expression. If any value produced along the way does not have the expected type (array or struct, depending on the kind of indexing used), the result of the whole expression is a void value.

2.5.4 Cast expressions

Cast expressions are used to convert values from one type to another. A cast expression consists of an opening parenthesis, followed by a type name, followed by a closing parenthesis, followed by an expression. No whitespace is allowed between the parentheses and the type name.

The result of a cast expression is obtained by first computing the value of the inner expression and then converting it to the type named in the cast expression. If the value produced by the inner expression already has the right type, it is directly used as the result of the cast expression. Otherwise, the type conversion rules given in the following sections are applied.

This is an example of a cast expression casting the integer constant "1" to float:

	(float) 1


2.5.4.1 Conversion to void
Since the void type has only one value, all values of all other types are converted to that one value.

2.5.4.2 Conversion to bool
Converting a void value to bool results in the bool value "false".

Converting an int value to bool results in the bool value "false" if the int value is 0 (zero). Otherwise, the result is the bool value "true".

Converting a float value to bool results in the bool value "false" if the float value is 0.0 (zero). Otherwise, the result is the bool value "true".

Converting a string value to bool results in the bool value "false" if the string is empty (that is, contains no characters). Otherwise, the result is the bool value "true".

Converting an array value to bool results in the bool value "false" if the array is empty (that is, contains no elements). Otherwise, the result is the bool value "true".

Converting a struct value to bool results in the bool value "false" if the struct is empty (that is, contains no fields or methods). Otherwise, the result is the bool value "true".

Converting an fn value to bool results in the bool value "true".

Converting a resource value to bool results in the bool value "true".
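
The rules above can be illustrated with a few casts. This is only a sketch; the variable names are arbitrary and chosen for illustration:

	a = (bool) 0;      // false: the int value is zero
	b = (bool) 42;     // true: any non-zero int value
	c = (bool) 0.0;    // false: the float value is zero
	d = (bool) "";     // false: the string is empty
	e = (bool) "0";    // true: the string contains one character
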

2.5.4.3 Conversion to int
Converting a void value to int results in the int value 0 (zero).

Converting a bool value to int results in the int value 0 (zero) if the bool value is "false". If the bool value is "true", the resulting int value is 1 (one).

Converting a float value to int results in an int value that corresponds to the integral part of the float value. If the integral part of the float value cannot be represented as an int, the resulting value is undefined.

Converting a string value to int attempts to interpret the string as an integer literal. Only an initial part of the string consisting solely of digits is considered for conversion.

Converting an array value to int results in an int value that gives the number of elements in the array.

Converting a struct value to int results in an int value that gives the number of elements in the struct.

Converting an fn value to int results in the int value 1 (one).

Converting a resource value to int results in the int value 1 (one).
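
As a sketch of the rules above, the following casts produce int values (variable names are arbitrary; the last line relies on the conversion to array described later in this chapter):

	a = (int) true;           // 1
	b = (int) 3.7;            // 3: only the integral part is kept
	c = (int) "42abc";        // 42: only the leading digits are considered
	d = (int) (array) "x";    // 1: the inner cast yields a one-element array
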

2.5.4.4 Conversion to float
Converting a void value to float results in the float value 0.0 (zero).

Converting a bool value to float results in the float value 0.0 (zero) if the bool value is "false". If the bool value is "true", the resulting float value is 1.0 (one).

Converting an int value to float results in a float value with the same integral value as the original int value and no fractional part.

Converting a string value to float attempts to interpret the string as a float literal. Only an initial part of the string consisting solely of characters that can occur in a float literal is considered for conversion.

Converting an array value to float results in a float value that gives the number of elements in the array.

Converting a struct value to float results in a float value that gives the number of elements in the struct.

Converting an fn value to float results in the float value 1.0 (one).

Converting a resource value to float results in the float value 1.0 (one).

2.5.4.5 Conversion to string
Converting a void value to string results in an empty string value.

Converting a bool value to string results in an empty string value if the bool value is "false" or in a string value containing the single character "1" (digit one) if the bool value is "true".

Converting an int value to string results in a string value containing the integer literal for the original int value.

Converting a float value to string results in a string value containing the float literal for the original float value.

Converting an array value to string results in a string value containing the word "Array".

Converting a struct value to string results in a string value containing the word "Struct".

Converting an fn value to string results in a string value containing the word "Function".

Converting a resource value to string results in a string value containing the word "Resource".
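
A few casts to string, given as a sketch of the rules above (variable names are arbitrary; the last line relies on the conversion to array described in the next section):

	a = (string) 42;           // "42"
	b = (string) true;         // "1"
	c = (string) false;        // "" (the empty string)
	d = (string) (array) 1;    // "Array"
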

2.5.4.6 Conversion to array
Converting a non-array value to an array results in a one-element array that contains the original value at index 0 (zero).

2.5.4.7 Conversion to struct
Converting a non-struct value to a struct results in a struct with a single field named "value" that contains the original value.
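
The two wrapping conversions described above might be used as follows; this is a sketch with arbitrary variable names:

	list = (array) "hello";    // one-element array, list[0] is "hello"
	box = (struct) 42;         // struct with a single field, box.value is 42
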

2.5.4.8 Conversion to fn
Attempting to convert a non-fn value to fn is a fatal error.

2.5.4.9 Conversion to resource
Attempting to convert a non-resource value to resource is a fatal error.

2.5.5 Assignment expressions

An assignment expression is used to assign a value to a variable. It consists of an identifier, followed by the assignment operator "=" (equals sign), followed by an expression.

Evaluation of an assignment expression evaluates the inner expression and stores the result in the current namespace, in the form of a variable with the name given by the identifier in the assignment expression. Any previous meaning of the same identifier is lost. The assignment expression itself has the same result value as the inner expression.

The following is an example expression that assigns the float value "12.5" to a variable named "val":

	val = 12.5


Note that if an exception is thrown while evaluating the right side of an assignment, the assignment does not take place and the variable retains its previous value.

Since an assignment expression has the assigned value as its own value, and assignment associates to the right, it is possible to assign a value to multiple variables with an expression like this:

	a = b = 0


2.5.5.1 Indexing in assignments
Array and struct indices can be used in an assignment expression just like they can be used in combination with reference expressions. For example, the following expression will assign the bool value "true" to the fifth element of an array stored in the variable "map":

	map[4] = true


There is a difference to using indices in references, though. The above example forces "map" to be a variable of type array. If it is not an array before the assignment, an empty array will be created on the fly, the fifth element will be set to "true", and the resulting array will be assigned to the variable "map". In the same way, when struct indexing is used on something that is not a struct, an empty struct value will be created on the fly and substituted for the original non-struct value.
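
The following sketch shows both kinds of on-the-fly creation. The variable names are arbitrary and only serve as illustration:

	counts = "not an array";    // counts holds a string value
	counts[2] = 7;              // counts is replaced by a new array, index 2 holds 7
	point = 0;                  // point holds an int value
	point.x = 1.5;              // point is replaced by a new struct with the field "x"
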

If a negative array index that does not fall within the bounds of the array is used in an assignment, the effect is to assign to the first element of the array.

Consider the following example:

	a.foo.data[3] = 12


No matter what the value of the variable "a" is before the assignment, the following will be true after the assignment expression was evaluated: "a" will be a struct with at least the field "foo". The field "foo" will itself be a struct with at least the field "data". The field "data" will itself contain an array with at least four elements, the one at index 3 containing the int value 12. Values that already had the correct type for the assignment are not disturbed: for example, if the "data" field above already existed as an array of ten elements, it would still be an array of ten elements after the assignment; just the element at index 3 would have been overwritten with an int value of 12.

If both an index and the outer assignment have side effects on the same structure or array, the side effects of the index expression are discarded after the value of the index has been computed. In the following example, the value of "s.sp" is not changed after evaluation of the whole assignment expression:

	s.stack[s.sp++] = 42


2.5.5.2 Combining assignments and operators
Instead of the plain assignment operator, the following operators can also be used:

	+=      -=      *=       /=     &=      |=
	^=      <<=     >>=


These are all composed of a normal operator symbol of the Arena language and the assignment operator symbol. The meaning of a special assignment is best explained by an example. Consider this expression using a special assignment operator:

	a += 2


This expression behaves exactly the same as another, longer expression:

	a = a + 2


In effect, using a special assignment operator is exactly the same as first referencing the target of the assignment, combining the result with the operator and inner expression given, and assigning the result to the target of the assignment.
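
A short sketch of several special assignment operators in sequence, with the equivalent plain assignment given for each line (variable names are arbitrary):

	total = 10;
	total += 5;     // same as total = total + 5, total is now 15
	total *= 2;     // same as total = total * 2, total is now 30
	flags = 12;
	flags &= 10;    // same as flags = flags & 10, flags is now 8
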

2.5.6 Function calls

Function calls are used to call library functions or user-defined functions. A function call consists of an identifier naming the function, followed by a comma-separated list of expressions (the function call arguments) in parentheses. The argument list is allowed to be empty.

When a function call expression is evaluated, the existence of the function is checked. If the identifier name is not found in the current namespace or does not refer to a function or fn variable, a fatal error is generated. If the function is found, the number of argument expressions is checked against the number of arguments given in the function's prototype. It is a fatal error to pass fewer arguments than are present in the prototype. It is allowed to pass more arguments; extra arguments will be made available to the function's body as described below.

When it has been determined that a function call is valid as described above, the argument expressions are evaluated. Argument expressions are evaluated from left to right. The types of the resulting values are checked against the function's prototype, as described in the section about function definition statements (above, in the chapter about statements).

If the argument type check succeeds, a new local namespace is created. The values of the function's arguments are then added to the new namespace as if they were local variables assigned inside the function's body. For example, consider a function with the following prototype:

	int mult(int x, int y)


When this function is called with the arguments 42 and 12, the local namespace of the function will contain an int variable named "x" with initial value 42 and another int variable named "y" with initial value 12.

In addition to the named arguments, the local int variable "argc" is defined and is assigned the number of arguments actually passed to the function. The variable "argv" is also defined and contains an array filled with copies of all function arguments. The function's body can use these two variables to gain access to extra parameters given in a call of the function, beyond those named in the function's prototype.

When these preparations are complete, the function's body is executed inside its own local namespace. If the function body executes a return statement, the value used in the statement becomes the result of the function call expression. If the function does not explicitly return a value, a void value is automatically generated. The local namespace of the function is then destroyed, which frees all local variables, including the values of the function arguments.

The following are examples of function call expressions:

	printf("Hello World!\n")
	array_merge(a, b, c)
	versions()
	my_func(12, "foo", 42)
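

The special variables "argc" and "argv" described above make it possible to write functions that accept extra arguments. The following is a sketch only; the function "sum" is hypothetical and not part of any library:

	int sum(int first)
	{
	  total = 0;
	  // argv holds copies of all arguments, including the named one
	  for (i = 0; i < argc; i++)
	  {
	    total += argv[i];
	  }
	  return total;
	}

	result = sum(1, 2, 3);    // argc is 3 inside the call, result is 6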


The above rules mean that function arguments are passed to the function as copies. For example, consider the function call:

	foo(a, b)


When this function call is evaluated, the variables "a" and "b" are referenced and copies of their current values are passed to the body of the function "foo". No matter what the function does with its argument values, the values of the variables "a" and "b" as stored in the namespace outside of the function's body are not changed.

2.5.6.1 Passing arguments "by reference"
As detailed in the last section, function arguments are normally passed into function bodies as copies. Even if the argument expressions are variable references, a function body cannot manipulate the variables themselves.

However, there is a special syntax for passing argument expressions to a function that makes it possible for the function's body to influence the value of variables that are used as arguments. It consists of placing an ampersand before variable reference expressions or indexed variable reference expressions that are used as function arguments. This is called passing "by reference", though Arena does not exactly use references for this construct (the method that Arena uses is called "copy-restore" or "copy-in copy-out").

When a function call using this syntax is evaluated, the normal function call semantics as described in the last section are in effect. However, when the function's body finishes executing, the language tries to update the values of all arguments that were passed "by reference". This is best explained by an example. Consider the following function definition:

	void swap(mixed a, mixed b)
	{
	  c = a;
	  a = b;
	  b = c;
	}


For example, this function might be called like this:

	swap(&x, &y);


During the function call, the values of the variables "x" and "y" are available inside the function's body as local variables "a" and "b" (copy-in). When the function's code has been executed, the language checks whether the local variable "a" is still defined. If yes, its value is copied into the variable "x" outside the function. The same happens for local variable "b" and "y" outside the function (copy-out).

The order given here is for explanatory purposes only. The language takes care that the copy-out actions happen atomically with regard to each other -- from the script's point of view, all copy-out actions look as if they happen at exactly the same time. For example, the swap function above might also be called like this:

	swap(&i, &a[i])


In this case, the array index used for the update of the second variable will always be the same one that was used for the actual argument value passed into the function, even if the function changes its first argument.

If the same variable is passed "by reference" more than once in a single function call, the value of the variable after the function call is implementation-defined.

Note that passing "by reference" only works for arguments named in the called function's prototype. It does not work for arguments accessed via the special "argv" array.
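
The difference between the two ways of passing an argument can be seen in the following sketch. The function "reset" is hypothetical and only serves as illustration:

	void reset(int n)
	{
	  n = 0;
	}

	count = 5;
	reset(count);     // copy-in only: count is still 5
	reset(&count);    // copy-in and copy-out: count is now 0
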

2.5.7 Basic rules for structure templates

Structure templates are used to construct values of the struct datatype. This process is called creating an instance of the template. Another use of a template is to use a static reference, which means accessing something inside the structure template without actually creating an instance.

In both cases, the language needs to create concrete versions of the abstract definitions given in the template. This happens as follows: a new local namespace is created. Inside this namespace, the definitions given in the template are executed. Field definitions with values are executed like assignment expressions. Field definitions without values are executed like assignment expressions that assign a void value. Method definitions are executed as normal. The result is a local namespace that contains all fields and methods from the template with their default values.

If a template extends another template, the process above is used recursively, depth-first. This means the chain of templates extending each other is searched until a template that does not extend another is found. The definitions from that template are evaluated first, followed by those in the template that extends the first one, and so on until the definitions from the template that started the process are evaluated. This means definitions in a template can override all fields and methods from another template that it extends.

If the process was used to create a struct value, the completed local namespace is then used to populate the new struct value. If the process was used for evaluating a static reference, the referenced member is copied and the namespace discarded.
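
As a sketch of the override rule, consider the following two templates. The names "shape" and "triangle" are hypothetical; the last statement uses a constructor call as described in the next section:

	template shape
	{
	  name = "shape";
	  sides = 0;
	}
	template triangle extends shape
	{
	  sides = 3;
	}

	t = new triangle();
	// t.name is "shape" (taken from the extended template)
	// t.sides is 3 (the definition in "triangle" overrides the one in "shape")
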

2.5.8 Constructor calls

Constructor call expressions are used to create struct values from structure templates. A constructor call consists of the keyword "new", followed by an identifier naming a template, followed by a comma-separated list of argument expressions enclosed in parentheses. The argument list is allowed to be empty.

When a constructor call expression is evaluated, the identifier is used to look for a structure template definition in the local and global namespace. It is a fatal error if none is found. If the template is found, the initial values of a new struct value are computed as described in the section "Basic rules for structure templates" above.

If a constructor method is defined in the template, it is called using the argument expressions given as arguments in the constructor call expression. If the template itself does not define a constructor method but a template it extends does, the constructor of the parent template is called instead. Consider this example:

	template foo
	{
	  void foo()
	  {
	    print("this is foo\n");
	  }
	}
	template bar extends foo
	{
	  i = 12;
	}


When a constructor call is evaluated for template "bar", the constructor method defined in the "foo" template will be called. Note that it is legal for there to be no constructor method to call at all.

Normal argument type checks take place for constructor methods. Using an incorrect number of arguments or arguments of unsuitable types results in a fatal error. Values returned from a constructor (by use of a return statement) are discarded.

During execution of the constructor method, a special local variable "this" is defined. It contains a copy of the struct value that is being constructed. It behaves like a function argument passed "by reference", meaning the constructor method's body can use it to access and change elements in the struct value that is the result of the whole constructor call expression.

Note that the argument expressions given in the constructor call expression are only evaluated when a constructor method is actually called. If no constructor method is defined, the argument expressions are not evaluated.

At the end of the evaluation of a constructor call expression, an additional element called "__template" is added to the new struct value. It contains a string value with the name of the template used to create the struct value.

As an example, the following structure template contains a constructor method that sets a field called "i" to the value of the first argument used in the constructor call expression:

	template foo
	{
	  void foo(int x)
	  {
	    this.i = x;
	  }
	}


The above example can be used in a constructor call expression like this:

	new foo(12)


The result is a value of type struct. This value will have three elements: a field called "i" with the int value 12, a method called "foo", and a field called "__template" that contains the string value "foo".

2.5.9 Method calls

A method call works like a normal function call, but refers to a function defined by a structure template or contained in a struct value.

The conventions for argument evaluation, type checks and namespaces are the same as for function calls, described above.

2.5.9.1 Static method calls
A static method call is used to call a function defined in a structure template. It consists of an identifier naming a template, followed by the characters "::" (double colon), followed by another identifier naming the method, followed by an argument list of expressions in parentheses.

It is a fatal error if the template named by the first identifier is not defined in the current namespace. It is also a fatal error if the named template does not contain, either directly or via inheritance from an extended template, a method with the name given by the second identifier.

The following are examples of static method calls:

	foo::bar(1, 2, 3)
	input::check("foo", false)
	login::logout()


2.5.9.2 Dynamic method calls
A dynamic method call is used to call a method contained in a struct value. It consists of appending a single period, followed by an identifier and an argument list of expressions in parentheses, to some other expression that results in a struct value.

If a method call is appended to a non-struct value or the named method does not exist in the struct value, a fatal error is generated.

If the method exists and the arguments are compatible with its prototype, the method's body is called as described for normal functions. A special local variable called "this" is also defined and contains a copy of the struct that contains the called method. This variable can be used to access fields and methods stored in the same struct value. Changes to the variable "this" will be copied into the real struct variable (if any) when the method body is finished executing.

The following are examples of dynamic method calls (the last is a method call applied to the result of a previous constructor call):

	foo.bar()
	registry[512].files.destroy(2)
	new foo().something("foo", 42)
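

The way changes to "this" are copied back into the struct variable can be sketched as follows. The template "counter" and its method "increment" are hypothetical names used for illustration:

	template counter
	{
	  count = 0;
	  void increment()
	  {
	    // "this" is a copy of the struct; it is copied back when the method ends
	    this.count = this.count + 1;
	  }
	}

	c = new counter();
	c.increment();
	c.increment();
	// c.count is now 2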


2.5.10 Operators

Operators work a lot like functions, but instead of names and argument lists they consist of an operator symbol applied to one or more other expressions. Which other expressions are combined by the operator depends on the kind of operator, as described next.

A prefix operator expression affects a single inner expression and consists of the operator symbol prefixed to another expression.

An infix operator expression affects two inner expressions and consists of the operator symbol written between the two other expressions.

A postfix operator expression affects a single inner expression and consists of the operator symbol suffixed to another expression.

Operators work on different types of expressions. All operators automatically cast the values of their argument expressions to a type appropriate to the operator, as described below for different kinds of operators.

Not all operators evaluate all of their argument expressions. The rules for evaluation are also described below.

2.5.10.1 Math operators
Math operators are used to represent arithmetic operations. They work with values of types int and float.

A math operator always evaluates all its argument expressions. If at least one of the argument expressions results in a float value, both values are cast to float before use. Otherwise both values are cast to int.

There is only a single math prefix operator. It uses the operator symbol "-" (minus sign) and denotes negation of the value of the argument expression.

The following table lists the infix math operators and their respective meanings.

	+     addition
	-     subtraction
	*     multiplication
	/     division
	%     remainder
	**    exponentiation


If the result of a math operator expression falls outside of the domain of the type of its arguments (after casting), the result is an undefined value of the same type as the argument values.

The following are examples of math operator expressions:

	-12
	1 + 2
	1.2 * 5
	2 ** 10
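

The casting rule for math operators has some consequences that are easy to overlook. The following sketch applies the rules exactly as stated above, together with the rules for conversion to int (variable names are arbitrary):

	a = 7 % 3;        // int 1
	b = 1 + 2.5;      // float 3.5: one value is a float, so both are cast to float
	c = "3" + 4;      // int 7: neither value is a float, so both are cast to int
	d = "1.5" + 1;    // int 2: the string is cast to int, which only reads the leading "1"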


2.5.10.2 Boolean operators
Boolean operators are used to represent logic computations on truth values. When a boolean operator computes the value of one of its argument expressions, the result is always cast to bool.

The prefix operator "!" (exclamation mark) denotes logical negation. It always computes the value of its argument expression.

The infix operator "||" (double vertical bar) denotes logical disjunction ("or"). It always evaluates its first, left argument expression. If the result is the value "true", the result of the whole expression is also "true" and the second argument expression is not evaluated. Otherwise, the second argument expression is evaluated and its bool value is the result of the whole expression.

The infix operator "&&" (double ampersand) denotes logical conjunction ("and"). It always evaluates its first, left argument expression. If the result is the value "false", the result of the whole expression is also "false" and the second argument expression is not evaluated. Otherwise, the second argument expression is evaluated and its bool value is the result of the whole expression.

The following are examples of boolean operator expressions:

	!failed
	x && y
	(x || y) && !z
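

Because the second argument of "&&" and "||" is only evaluated when needed, the first argument can be used to guard the second. A sketch, with arbitrary variable names:

	divisor = 0;
	total = 50;
	safe = divisor != 0 && total / divisor > 10;
	// safe is false; the division is never evaluated because the left side is "false"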


2.5.10.3 Equality operators
Equality operators are used to compare values for equality. The two equality operators always evaluate both their argument values. No casting of the resulting values takes place.

If both arguments to an equality operator are of type array, struct, or resource, the result of the equality operator expression is implementation-defined.

The operator "==" (double equals sign) denotes an equality test. The value of the whole expression is "true" if both argument values are of the same type and represent the same value of that type. Otherwise the value of the whole expression is "false".

For values of type fn, two values are considered equal if and only if they refer to the same function body.

The operator "!=" (exclamation mark followed by equals sign) denotes an inequality test. The value of the whole expression is "true" if the argument values are of different types or do not represent the same value if they are of the same type. Otherwise the value of the whole expression is "false".

The following are examples of equality operator expressions:

	1 != 2
	x == "foo"
	divisor != 0.0
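

Since no casting takes place, values of different types never compare as equal. The following sketch shows this, along with an explicit cast that makes a comparison succeed (variable names are arbitrary):

	a = 1 == 1.0;          // false: int and float are different types
	b = 1 == (int) 1.0;    // true: both sides are int values after the cast
	c = "1" != 1;          // true: string and int are different types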


2.5.10.4 Order operators
Order operators are used to compare the ordering of two values with respect to each other. An order operator always evaluates both of its argument expressions. If only one of the values is a literal constant, the other value is cast to the same type. Otherwise, the second value is cast to the type of the first value (the first value is the one produced by the argument expression on the left of the operator symbol).

Possible result values of an order operator expression are "true" and "false", depending on whether the ordering the expression checks for is present for the argument values.

Ordering of void values is always "false" by convention since there is only one value in the datatype.

Ordering of bool values is such that the value "false" is smaller than "true", but not equal.

Ordering of int values is the same as for whole numbers in mathematics.

Ordering of float values is the same as for rational numbers in mathematics.

Ordering of string values is such that the bytes forming the string are compared from left to right, interpreting them as numbers in the range 0-255. The comparison stops as soon as one of the bytes is smaller or larger than the other one. The string with the larger byte is considered to be larger than the other. If both bytes are the same, the comparison moves on to the next byte in both strings. If this process reaches the end of exactly one of the strings, that string is considered to be the smaller of the two. If the process reaches the end of both strings at the same time, the strings are considered equal.

Ordering of array, struct, fn, and resource values is implementation-defined.

The following table lists all order operators and the condition that they check for.

	<    left value smaller than right value
	>    left value larger than right value
	<=   left value smaller than or equal to right value
	>=   left value larger than or equal to right value


The following are examples of order operator expressions:

	a < b
	x >= 10
	epsilon < 0.01
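

The casting rules for order operators decide whether a comparison is numeric or byte-wise. The following sketch applies the rules as stated above (variable names are arbitrary):

	x = "10";
	a = x < 9;             // false: 9 is a literal, so x is cast to int and 10 < 9 fails
	b = "10" < "9";        // true: byte-wise comparison, the byte "1" is smaller than "9"
	c = "abc" < "abcd";    // true: the end of the left string is reached first
	d = false < true;      // true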


2.5.10.5 Bitwise operators
Bitwise operators are used to manipulate bits in int values. A bitwise operator always evaluates all of its argument expressions and casts their values to int.

The prefix operator "~" (tilde) denotes bitwise negation of its argument value.

The prefix operator "++" (double plus sign) returns the value of its argument expression increased by one. If the argument is a reference expression or indexed reference expression, the increased value is also stored in the namespace in the same place that the original value was obtained from.

The prefix operator "--" (double minus sign) returns the value of its argument expression decreased by one. If the argument is a reference expression or indexed reference expression, the decreased value is also stored in the namespace in the same place that the original value was obtained from.

The infix operator "|" (vertical bar) computes the bitwise "or" of its argument values. This means bits set in either of the argument values will be set in the result value.

The infix operator "&" (ampersand) computes the bitwise "and" of its argument values. This means only bits set in both the argument values will be set in the result value.

The infix operator "^" (caret) computes the bitwise "exclusive or" of its argument values. This means only bits set in exactly one of the argument values will be set in the result value.

The postfix operator "++" (double plus sign) returns the value of its argument expression. In addition, if the argument expression is a reference or indexed reference expression, the value stored in the namespace is increased by one. The previous value is returned as result of the whole expression.

The postfix operator "--" (double minus sign) returns the value of its argument expression. In addition, if the argument expression is a reference or indexed reference expression, the value stored in the namespace is decreased by one. The previous value is returned as result of the whole expression.

The following are examples of bitwise operator expressions:

	i++
	flags & 0x40
	x ^ y
	--refcount
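

The difference between the prefix and postfix forms of "++", and the effect of the infix bitwise operators, can be sketched as follows (variable names are arbitrary):

	i = 5;
	a = i++;        // a is 5, i is now 6
	b = ++i;        // i is now 7, b is 7
	c = 12 & 10;    // 8: only the bit set in both values remains
	d = 12 | 3;     // 15: all bits set in either value
	e = 12 ^ 10;    // 6: bits set in exactly one of the values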


2.5.10.6 Operator precedence
If multiple operators occur in one expression, the order in which they are evaluated depends on the relative precedence of the operators. Operators with higher precedence are evaluated first.

If the same operator occurs multiple times in an expression, the order of evaluation depends on the associativity of the operator. If the operator is left-associative, it is evaluated so that applications proceed from left to right. For a right-associative operator, applications proceed from right to left.

To change the order of evaluation or to use more than one instance of a non-associative operator in a single expression, the programmer can enclose subexpressions in parentheses. Expressions inside parentheses are evaluated first, independent of any operators outside the parentheses.

The following table lists all operator symbols. Operators listed at the top have lower precedence than those listed below them. Operators listed on the same line have the same precedence. Associativity is given on the same line as the operator symbols it applies to.

	Associativity   Operators
	right           = += -= *= /= |= &= ^= <<= >>=
	none            ?
	right           ||
	right           &&
	right           !
	none            == != < <= > >=
	left            & | ^
	left            + - (infix)
	left            * / %
	right           **
	left            << >>
	left            ~ - (prefix)
	left            ++ --


Casts have higher precedence than any operator and associate to the right.
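
The following sketch shows how some expressions are grouped according to the table and rules above (variable names are arbitrary):

	a = 1 + 2 * 3;        // grouped as 1 + (2 * 3), so a is 7
	b = 10 - 4 - 3;       // left-associative: (10 - 4) - 3, so b is 3
	c = 2 ** 3 ** 2;      // right-associative: 2 ** (3 ** 2), so c is 512
	d = (1 + 2) * 3;      // parentheses are evaluated first, so d is 9
	e = (float) 1 / 2;    // the cast binds first: ((float) 1) / 2, a float division
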

2.5.11 Conditional expression

A conditional expression is the expression equivalent to an if-else statement. It consists of an expression, followed by a "?" (question mark) character, followed by another expression, followed by a ":" (colon) character, followed by a third expression.

When a conditional expression is evaluated, the value of the first argument expression is evaluated and its result value is cast to bool. If the result is "true", the value of the second expression is evaluated and used as the value of the whole expression. The third expression is not evaluated. If the value of the first expression is "false", the third expression is evaluated and its value used as the value of the whole expression. The second expression is not evaluated in that case.

The following are examples of conditional expressions:

	x % 2 == 0 ? "even" : "odd"
	x ? false : true
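

Since only one of the two branches is evaluated, a conditional expression can protect an operation that must not be carried out in some cases. A sketch, with arbitrary variable names:

	count = 0;
	total = 0;
	average = count == 0 ? 0.0 : (float) total / count;
	// average is 0.0; the division is not evaluated because count is 0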


2.5.12 Source file and line expressions

Source file and line expressions are used to refer to the script they appear in. They are mostly useful for printing error messages annotated with script source code locations.

The expression "__FILE__" is evaluated to a string value that contains the name of the script file that the expression appears in.

The expression "__LINE__" is evaluated to an int value that gives the line number that the expression appears on, relative to the script file that it appears in.
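
A sketch of how these two expressions might be used to print a source location. It assumes the library function "print" that also appears in other examples of this manual, and casts the line number to string explicitly:

	print("error in ");
	print(__FILE__);
	print(", line ");
	print((string) __LINE__);
	print("\n");
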

2.5.13 Anonymous functions

An anonymous function is a function that does not have a name. Such a function cannot be defined by use of a function definition statement since that mandates an identifier to be used as the function's name. Instead, an anonymo