Galax Version 1.0 Documentation
Part I
XQuery 1.0 is a query language for XML, defined by the World-Wide Web Consortium (W3C) under the XML activity. XQuery is a powerful language, which supports XPath 2.0 as a subset and includes expressions to construct new XML documents, SQL-like expressions to perform selection, joins, and sorting over collections of XML values, operations on namespaces, and expressions over XML Schema types. XQuery is a functional language, which comes with an extensive library of built-in functions and allows users to define their own functions. More information about XQuery can be found on the XML Query Working Group Web page.
Galax is an implementation of XQuery 1.0 designed with the following goals in mind: completeness, conformance, performance, and extensibility. Galax is open-source, and has been used on a large variety of real-life XML applications. Galax relies on a formally specified and open architecture which is particularly well suited for users interested in teaching XQuery, or in experimenting with extensions of the language or optimizations.
Here is a list of the main Galax features.
Feature | Pass | Fail | Total | Percent |
Minimal Conformance | 14553 | 71 | 14637 | 99.4% |
Optional Features | | | | |
Static Typing Feature | 46 | 0 | 46 | 100% |
Full Axis Feature | 130 | 0 | 130 | 100% |
Module Feature | 32 | 0 | 32 | 100% |
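The Percent column in the table above is the pass count divided by the total. A minimal OCaml sketch of that arithmetic follows; the function and variable names are illustrative and not part of Galax. Note that for Minimal Conformance the Pass and Fail counts add up to 14624 rather than 14637; the table does not say how the remaining tests are classified.

```ocaml
(* Pass rate as a percentage of the total number of tests. *)
let percent pass total = 100.0 *. float_of_int pass /. float_of_int total

(* (row label, pass count, total count), copied from the table above. *)
let rows =
  [ ("Minimal Conformance", 14553, 14637);
    ("Static Typing Feature", 46, 46);
    ("Full Axis Feature", 130, 130);
    ("Module Feature", 32, 32) ]

let () =
  List.iter
    (fun (name, pass, total) ->
      Printf.printf "%-24s %.1f%%\n" name (percent pass total))
    rows
```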
The following features are experimental.
See Chapter 4 for details on Galax’s alignment with the XQuery and XPath working drafts.
The following lists the main changes included with this version (1.0).
Changes from older versions can be found in Chapter 8.
The official distribution can be downloaded from the main Galax Web site. Detailed installation instructions are provided in Chapter 2.
The Galax processor offers the following user interfaces:
A number of stand-alone command-line tools are provided with the Galax distribution. Assuming the Galax distribution is installed in $GALAXHOME, and that Galax executables are reachable from your $PATH environment variable, the following examples show how to use the main command-line tools.
% echo "<two>{ 1+1 }</two>" > test.xq
% galax-run test.xq
<two>2</two>
This evaluates the query <two>{ 1+1 }</two> and prints the XML result <two>2</two>.
% galax-parse -validate -xmlschema $GALAXHOME/examples/docs/hispo.xsd \
      $GALAXHOME/examples/docs/hispo.xml
For instance, this command will print out the XQuery type representation of the schema in hispo.xsd:
% galax-mapschema $GALAXHOME/examples/docs/hispo.xsd
Chapter 5 describes the command-line tools in detail.
The Web interface is a simple and convenient way to get acquainted with Galax. It allows users to submit a query, and view the result of compilation and execution for that query.
An on-line version is available at: http://www.galaxquery.org/demo/galax_demo.html
You can also re-compile the demo from the Galax source and install it on your own system. You will need an HTTP server (Apache is recommended); then follow the compilation instructions in Section 2.4.
Galax supports APIs for OCaml, C, and Java. See Chapter 6 for how to use the APIs.
If you have installed the binary distribution of Galax, all three APIs are available.
If you have installed the source distribution of Galax, you will need to select the language(s) for which you need API support at configuration time. See Chapter 2 for details on compiling Galax from source.
Examples of how to use Galax’s APIs can be found in the following
directories:
$GALAXHOME/examples/caml_api/
$GALAXHOME/examples/c_api/
$GALAXHOME/examples/java_api/
This chapter contains the following sections:
There are two methods for installing Galax from source:
This method may be preferable if you are an OCaml user, already have an installation of OCaml and its libraries, and do not want to install a new version of OCaml. Even if you are already an OCaml user, the first method is always recommended.
Warning: Installing packages using GODI is an ON-LINE process, so you should preferably be connected to the Net by a fast, hard-wired connection.
Follow the GODI installation instructions through the "bootstrap_stage2" command, adding the <godi-prefix-path>/bin and <godi-prefix-path>/sbin directories to your PATH as instructed.
Note: For OCaml users, after the GODI bootstrapping phase, the OCaml executables are located in <godi-prefix-path>/bin and the OCaml libraries are located in <godi-prefix-path>/lib/ocaml.
Select [b]uild & install, then e[x]it.
GODI will give you the PCRE library configure options:
[ 1] GODI_BASEPKG_PCRE = no
[ 2] GODI_PCRE_INCDIR =
[ 3] GODI_PCRE_LIBDIR =
Select 2 to set the path to the directory containing pcre.h (version 5.0 or higher).
Select 3 to set the path to the directory containing libpcre.a (version 5.0 or higher).
Select e[x]it twice to return to the main menu.
Select [b]uild & install, then e[x]it.
GODI will give you the BDB library configure options:
[ 1] GODI_BDB_INCDIR =
[ 2] GODI_BDB_LIBDIR =
Select 1 to set the path to the directory containing db.h (version 4.4 or higher).
Select 2 to set the path to the directory containing libdb.a (version 4.4 or higher).
Select e[x]it twice to return to the main menu.
Select [b]uild & install, then e[x]it.
This will add all of Galax’s dependencies to your GODI configuration. GODI will give you Galax’s configure options:
[ 1] GODI_GALAX_INCLUDE_JUNGLE = no
[ 2] GODI_GALAX_INCLUDE_C_API = no
[ 3] GODI_GALAX_INCLUDE_JAVA_$ = no
[ 4] GODI_GALAX_INCLUDE_UTF8 = yes
[ 5] GODI_GALAX_INCLUDE_ISO88$ = yes
[ 6] GODI_GALAX_JAVA_HOME = unset
[ 7] GODI_GALAX_JAVA_BIN = unset
[ 8] GODI_GALAX_JAVA_INCLUDE = unset
[ 9] GODI_GALAX_REGRESSION_SU$ = unset
You can change the default Galax configuration as follows:
Select 1 to include Jungle, Galax’s secondary storage manager.
Select 2 to include Galax’s C API.
Select 3 to include Galax’s Java API (requires the C API).
Select 4 (or 5) to exclude the UTF8 (or ISO88) character set, which reduces the size of the executables.
Select 6 (or 7, 8) to set the Java home (or bin, include) directory.
Select 9 to set the directory where the XQuery 1.0 test suite is installed.
Select e[x]it.
Select [s]tart/continue, which will report all dependencies.
Then select [s]tart/continue, which will report actual dependencies based on your current GODI installation.
Then select [o]k.
Add <godi-prefix-path>/bin to your PATH.
When installation has completed, Galax will be installed in <godi-prefix-path>/ with the following subdirectories:
Directory | Sub-directory | Content |
bin/ | | Galax’s command-line executables |
lib/ | | |
| pervasive.xq | Signatures of XQuery 1.0 Function and Operators |
| c/ | C libraries and API header files |
| java/ | Java libraries and API interfaces |
| ocaml/pkg-lib/galax/ | OCaml libraries and interfaces |
share/galax/ | | |
| examples/ | Examples of using C, Java, and OCaml APIs |
| regress/ | Test harness and configuration for the W3C XQuery Test Suite |
| usecases/ | XQuery 1.0 usecases |
If you already have an OCaml installation, you can install the Galax source by hand. To do so, you need the following versions of the OCaml compiler and libraries, and C libraries:
Library | Version | GODI-Package |
OCaml compiler | 3.10 | godi-ocaml |
OCaml Libraries | | |
findlib | 1.1.2pl1 | godi-findlib |
Pcre-ocaml | 5.12.2 | godi-pcre |
Ocamlnet | 2.2.8.1 | godi-ocamlnet |
Caml IDL | 1.05 | godi-camlidl |
PXP | 1.2.0test1 | godi-pxp |
Camomile | 0.7.1 | godi-camomile |
C Libraries | | |
Berkeley DB (libdb) | 4.3 | Only for Jungle |
PCRE | > 4.5 | |
tar xvf galax-1.0.tar
From the galax/ directory you just created, run:
./configure -galax-home $GALAXHOME
where $GALAXHOME is the target installation directory.
If you want the C API, add -with-c.
If you want the Java API, add -with-java -java-home <Top-level Java directory>.
The configure script will try to find your OCaml installation and other necessary libraries by itself. Review the default configuration before proceeding. If you need to change the default configuration, run ./configure -help for all options.
In galax/ run:
make world
Go get more coffee...
In galax/ run:
make install
Add $GALAXHOME/bin to your PATH environment variable.
When installation has completed, Galax will be installed in $GALAXHOME with these subdirectories:
Directory | Content |
bin/ | Command-line executables |
examples/ | Examples of using C, Java, and OCaml APIs |
regress/ | Test harness and configuration for the W3C XQuery Test Suite |
usecases/ | XQuery 1.0 usecases |
The Galax distribution includes a test harness for the W3C XQuery test suite. It can be run by following the steps listed below.
http://www.w3.org/XML/Query/test-suite/
The Test Suite comes as a zip archive, which can be unzipped in a directory of your choice (we use $XQTS in the following instructions).
-regression $XQTS
make
from the $GALAXHOME/regress directory. This will produce a file called:
testresults-W3C.xml
which contains the result of running the test suite (following the format required for test results by the XQTS).
$XQTS/ReportingResults
The stylesheet can be run by:
$XQTS/ReportingResults/Results.xml
as follows:
<results>
  <result>$GALAXHOME/regress/testresults-W3C.xml</result>
</results>
$XQTS/ReportingResults/Results.xml
and running one of the two following commands (you will need a working installation of ant, and of an XSLT processor):
ant -f Build.xml create
ant -f Build.xml createsimple
which should respectively produce the following HTML files:
XQTSReportSimple.html
XQTSReport.html
The Galax Web site and on-line demo are bundled with the source distribution (only).
The Galax Web site has only been tested with Apache Web servers. We recommend you use Apache as some of the CGI scripts might be sensitive to the server you are using. Apache is commonly installed with most Linux distributions, or can be downloaded from: http://www.apache.org/.
<Directory "/var/www/html/galax">
  Options All
  AllowOverride None
  AddHandler cgi-script .cgi
  Order allow,deny
  Allow from all
</Directory>
This permits scripts with suffix .cgi in /var/www/html/galax to be executed. The Galax demo is available at http://localhost/galax.
Your sysadmin may already have set up an Apache server for general use that allows CGI programs to be run by any user. You can verify this by finding directives similar to the following in httpd.conf (wherever it might be located on your system):
AddHandler cgi-script .cgi
<DirectoryMatch "/galax/cgi-bin">
  AllowOverride AuthConfig
  Options ExecCGI
  SetHandler cgi-script
</DirectoryMatch>
In that case, following the comments in website/Makefile.config to choose installation destinations for your CGI programs and HTML documents should suffice. The URL for accessing the installed site will depend on how your webserver is set up. Consult your sysadmin or webadmin for further help.
The simplest way to use Galax is by calling the galax-run interpreter from the command line. This chapter describes the most frequently used command-line options. Chapter 5 enumerates all the command-line options.
Before you begin, follow the instructions in Section 2.2 and run the following query to make sure your environment is set up correctly:
% echo '<two>{ 1+1 }</two>' > test.xq
% galax-run test.xq
<two>2</two>
Galax evaluates the expression <two>{ 1+1 }</two> in file test.xq and prints the result <two>2</two>.
By default, Galax parses and evaluates an XQuery main module, which contains both a prolog and an expression. Sometimes it is useful to separate the prolog from an expression, for example, if the same prolog is used by multiple expressions. The -context option specifies a file that contains a query prolog.
All of the XQuery use cases in $GALAXHOME/usecases are implemented by separating the query prolog from the query expressions. Here is how to execute the Parts usecase:
% cd $GALAXHOME/usecases
% galax-run -context parts_context.xq parts_usecase.xq
The other use cases are executed similarly, for example:
% galax-run -context rel_context.xq rel_usecase.xq
You can access an input document by calling the fn:doc() function and passing the file name as an argument:
% cd $GALAXHOME/usecases
% echo 'fn:doc("docs/books.xml")' > doc.xq
% galax-run doc.xq
You can access an input document by referring to the context item (the “.” dot variable), whose value is the document’s content:
% echo '.' > dot.xq
% galax-run -context-item docs/books.xml dot.xq
You can also access an input document by using the -doc argument, which binds an external variable to the content of the given document file:
% echo 'declare variable $x external; $x' > var.xq
% galax-run -doc x=docs/books.xml var.xq
By default, Galax serializes the result of a query in a format that reflects the precise data model instance. For example, the result of this query is serialized as document { 2 }, not as the literal 2:
% echo "document { 1+1 }" > docnode.xq
% galax-run docnode.xq
document { 2 }
If you want the output of your query to be as the standard prescribes, then use the -serialize standard option:
% galax-run docnode.xq -serialize standard
2
By default, Galax serializes the result value to standard output. Use the -output-xml option to serialize the result value to an output file.
% galax-run docnode.xq -serialize standard -output-xml output.xml
% cat output.xml
2
By default, Galax compiles the given query and returns the corresponding result. The following options can be set to print the query as it progresses through the compilation pipeline.
-print-expr [on/off]             Print input expression
-print-normalized-expr [on/off]  Print expression after normalization
-print-rewritten-expr [on/off]   Print expression after rewriting
-print-logical-plan [on/off]     Print logical plan
-print-optimized-plan [on/off]   Print logical plan after optimization
-print-physical-plan [on/off]    Print physical plan
As the output for the compiled query can be quite large, it is often convenient to set the output to verbose using -verbose on, which prints headers for each phase. For instance, the following command prints the original query, and the optimized logical plan for the query.
% galax-run docnode.xq -verbose on -print-expr on -print-optimized-plan on
Galax supports several extensions to XQuery 1.0, notably XML updates and procedural extensions. To enable one of those extensions, you must use the corresponding language level option on the command line:
galax-run -language ultf        (: W3C Update Facility :)
galax-run -language xquerybang  (: XQuery! Language :)
galax-run -language xqueryp     (: XQueryP Language :)
Some examples of each of the three languages are provided in the $GALAXHOME/examples/extensions directory.
Part II
This chapter documents the relationship of Galax to the target W3C working drafts. Galax 1.0 is a prototype implementation, and therefore it is not (yet) completely aligned with the relevant W3C working drafts (WDs). This chapter also documents the non-standard features in Galax 1.0 and the known bugs and limitations.
Galax 1.0 implements the January, 2007 XQuery 1.0 and XPath 2.0 Candidate Recommendations, the XML 1.0 Recommendation, the Namespaces in XML Recommendation, and XML Schema Recommendation (Parts 1 and 2).
Galax 1.0 implements the XQuery 1.0 Recommendations:
Galax 1.0 fully supports the XQuery 1.0 and XPath 2.0 Data Model.
Galax 1.0 implements an xsd:float value as an xsd:double value.
The alignment issues in this section follow the outline of the "Expressions" section in http://www.w3c.org/TR/xquery. If a subsection is not listed here, it means that Galax 1.0 implements the semantics described in that section.
Galax 1.0 does not support:
The implicit timezone is set to the local timezone.
Galax’s processing model is similar to XQuery’s abstract processing model. See Section 7 for more information on Galax’s internal processing model.
Galax 1.0 does not check that a numeric value is equal to NaN when computing an effective boolean value.
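For context, the XQuery 1.0 rule for a single numeric operand is that its effective boolean value is false when the value is zero or NaN, and true otherwise. The OCaml sketch below contrasts that rule with a hypothetical evaluator that omits the NaN check. Both function names are illustrative, and the second function is only an assumption about what skipping the check would look like; it is not taken from the Galax source.

```ocaml
(* Standard rule: a numeric value is false when it is zero or NaN. *)
let ebv_standard (x : float) : bool =
  not (x = 0.0 || classify_float x = FP_nan)

(* Hypothetical model of an evaluator that skips the NaN check:
   NaN is not equal to zero, so it would be treated as true. *)
let ebv_without_nan_check (x : float) : bool = x <> 0.0

let () =
  List.iter
    (fun x ->
      Printf.printf "%F: standard=%b, without NaN check=%b\n"
        x (ebv_standard x) (ebv_without_nan_check x))
    [ 0.0; 1.5; nan ]
```

Under this model the two functions disagree only on NaN, so queries that never produce NaN in a boolean context would be unaffected.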
Galax 1.0 does not support the fn:collection() function.
The context item and values for external variables can be specified on the command line or in the API. See Sections 5.1.1 and 6.2.
Galax requires that all actual types, that is, those types that annotate input documents be in the in-scope schema definitions. Galax will raise a dynamic error if it encounters a type in a document that is not imported into the query by an import schema prolog statement.
Galax supports the Schema Import, Static Typing, and Full Axis features.
Galax 1.0 supports the Module feature.
Galax 1.0 does not support the Pragmas feature.
Galax 1.0 does not support must-understand extensions.
Galax 1.0 does not support static typing extensions.
Galax 1.0 implements an xsd:float value as an xsd:double value.
Galax 1.0 supports all axes with the exception of the preceding and following axes.
When constructing a new element, Galax 1.0 always erases type annotations on copied elements.
When constructing a new element, Galax 1.0 requires that the new element’s attributes precede its other content.
Namespace declarations in input and output documents and in input queries are not handled consistently. We are working on this.
Galax 1.0 will accept queries that contain the ordered or unordered expressions, but they have no effect on query evaluation (i.e., they are no-ops).
Galax 1.0 supports the Module feature.
Galax 1.0 does not support collations.
Galax 1.0 does not support the construction declaration.
Galax 1.0 does not support the default ordering declaration.
Schema components in an imported schema are mapped into XQuery types according to the mapping rules specified in the XQuery 1.0 Formal Semantics (see below).
The XQuery 1.0 formal semantics defines the mapping of every XQuery expression into an expression in the XQuery core, and it defines the static and dynamic semantics of each core expression. The formal semantics also defines how imported schemas are mapped into internal XQuery types.
Galax 1.0 implements the static and dynamic semantics of core expressions defined in the XQuery 1.0 Formal Semantics.
Galax 1.0 supports most of the functions in the XQuery 1.0 and XPath 2.0 Functions and Operators document. The signatures of supported functions are listed in $GALAXLIB/pervasive.xq.
Galax 1.0 does not support the following functions:
fn:id fn:idref fn:collection
The fn:trace function emits its input sequence and message to standard output.
Galax 1.0 does not support constructor functions for user-defined types.
Galax 1.0 does not support any functions on binary data.
See $GALAXHOME/usecases/STATUS
Type values are available in a query by either importing a predefined XML schema using the import schema declaration in the query prolog or by defining XQuery types explicitly in the query prolog.
Galax 1.0 supports the definition of XQuery types in the query prolog using the internal type syntax defined in the XQuery 1.0 Formal Semantics. The grammar is provided here for reference:
TypeDeclaration   ::= ("define" "element" QName "{" TypeDefn? "}")
                    | ("define" "attribute" QName "{" TypeDefn? "}")
                    | ("define" "type" QName "{" TypeDefn? "}")
TypeDefn          ::= TypeUnion | TypeBoth | TypeSequence | TypeSimpleType
                    | TypeAttributeRef | TypeElementRef | TypeTypeRef
                    | TypeParenthesized | TypeNone
TypeUnion         ::= TypeDefn "|" TypeDefn
TypeBoth          ::= TypeDefn "&" TypeDefn
TypeSequence      ::= TypeDefn "," TypeDefn
TypeSimpleType    ::= QName OccurrenceIndicator
TypeAttributeRef  ::= "attribute" NameTest ("{" TypeDefn? "}")? OccurrenceIndicator
TypeElementRef    ::= "element" NameTest ("{" TypeDefn? "}")? OccurrenceIndicator
TypeTypeRef       ::= "type" NameTest OccurrenceIndicator
TypeParenthesized ::= "(" TypeDefn? ")" OccurrenceIndicator
TypeNone          ::= "none"
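One way to read the grammar is as an abstract syntax tree. The OCaml sketch below models the productions as variant types and prints a declaration back in the concrete syntax. All type and constructor names are hypothetical and are not Galax's internal representation. Two simplifications are assumed: OccurrenceIndicator is taken to be empty, "?", "*" or "+", and an optional "{ ... }" body is collapsed into a single option.

```ocaml
(* Assumed occurrence indicators: none, "?", "*", "+". *)
type occurrence = One | Optional | Star | Plus

type type_defn =
  | Union of type_defn * type_defn
  | Both of type_defn * type_defn
  | Sequence of type_defn * type_defn
  | Simple of string * occurrence
  | AttributeRef of string * type_defn option * occurrence
  | ElementRef of string * type_defn option * occurrence
  | TypeRef of string * occurrence
  | Parenthesized of type_defn option * occurrence
  | NoneType

type type_declaration =
  | DefineElement of string * type_defn option
  | DefineAttribute of string * type_defn option
  | DefineType of string * type_defn option

let occ = function One -> "" | Optional -> "?" | Star -> "*" | Plus -> "+"

let rec defn_to_string = function
  | Union (a, b) -> defn_to_string a ^ " | " ^ defn_to_string b
  | Both (a, b) -> defn_to_string a ^ " & " ^ defn_to_string b
  | Sequence (a, b) -> defn_to_string a ^ ", " ^ defn_to_string b
  | Simple (q, o) -> q ^ occ o
  | AttributeRef (n, body, o) -> "attribute " ^ n ^ body_to_string body ^ occ o
  | ElementRef (n, body, o) -> "element " ^ n ^ body_to_string body ^ occ o
  | TypeRef (n, o) -> "type " ^ n ^ occ o
  | Parenthesized (None, o) -> "()" ^ occ o
  | Parenthesized (Some d, o) -> "(" ^ defn_to_string d ^ ")" ^ occ o
  | NoneType -> "none"

(* An absent body prints nothing; a present body prints " { ... }". *)
and body_to_string = function
  | None -> ""
  | Some d -> " { " ^ defn_to_string d ^ " }"

let decl_to_string decl =
  let render kind name body =
    let inner = match body with None -> " " | Some d -> " " ^ defn_to_string d ^ " " in
    "define " ^ kind ^ " " ^ name ^ " {" ^ inner ^ "}"
  in
  match decl with
  | DefineElement (n, b) -> render "element" n b
  | DefineAttribute (n, b) -> render "attribute" n b
  | DefineType (n, b) -> render "type" n b

(* Illustrative declaration: a book with one title and one or more authors. *)
let book =
  DefineElement
    ( "book",
      Some
        (Sequence
           ( ElementRef ("title", Some (Simple ("xs:string", One)), One),
             ElementRef ("author", Some (Simple ("xs:string", One)), Plus) )) )

let () = print_endline (decl_to_string book)
```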
Galax-only functions are put in the Galax namespace (http://www.galaxquery.org), which is bound by default to the glx: prefix.
See $GALAXHOME/lib/pervasive.xq for a complete list of functions in the Galax namespace.
Galax supports the following stand-alone command-line tools:
The simplest way to use Galax is by calling the ’galax-run’ interpreter from the command line. The interpreter takes an XQuery input file, evaluates it, and yields the result on standard output.
Usage: galax-run options query.xq
For instance, the following commands from the Galax home directory:
% echo "<two>{ 1+1 }</two>" > test.xq
% $GALAXHOME/bin/galax-run test.xq
<two>2</two>
evaluate the simple query <two>{ 1+1 }</two> and print the XML value <two>2</two>.
The query interpreter has eight processing stages: parsing an XQuery expression; normalizing an XQuery expression into an XQuery core expression; static typing of a core expression; rewriting a core expression; factorizing a core expression; compiling a core expression into a logical plan; selecting a physical plan from a logical plan; and evaluating a physical plan.
Parsing, factorization, and compilation are always enabled. By default, the other phases are:
-normalize on
-static off
-rewriting on
-optimization on
-dynamic on
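The default phase settings above can be pictured as a record of switches. The OCaml sketch below is only a model for building a galax-run option string; the record type and helper names are hypothetical and do not correspond to Galax's Processing_context module.

```ocaml
(* One switch per optional phase; parsing, factorization and
   compilation are always enabled and are therefore not modelled. *)
type phases = {
  normalize : bool;
  static : bool;
  rewriting : bool;
  optimization : bool;
  dynamic : bool;
}

(* Defaults as listed in the text above. *)
let default_phases =
  { normalize = true; static = false; rewriting = true;
    optimization = true; dynamic = true }

let on_off b = if b then "on" else "off"

(* Render the switches as command-line options. *)
let to_flags p =
  String.concat " "
    [ "-normalize " ^ on_off p.normalize;
      "-static " ^ on_off p.static;
      "-rewriting " ^ on_off p.rewriting;
      "-optimization " ^ on_off p.optimization;
      "-dynamic " ^ on_off p.dynamic ]

let () =
  print_endline (to_flags default_phases);
  (* Same settings, but with static typing switched on. *)
  print_endline (to_flags { default_phases with static = true })
```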
By default, all result values (XML result, inferred type, etc.) are written to standard output.
The command line options permit various combinations of phases, printing intermediate expressions, and redirecting output to multiple output files. Here are the available options. Default values are in code font.
galax-parse parses an XML document and optionally validates the document against an XML Schema.
Usage: galax-parse options document.xml
galax-mapschema maps XML schemas in xmlschema(s) into XQuery type expressions.
Usage: galax-mapschema options schema.xsd
galaxd is a server that allows Galax to be invoked over the network.
Usage: galaxd [options] [query.xq]
The query file query.xq should define a function named local:main(). An XQuery program can get the result of local:main() on host by calling doc("dxq://host/"). If the server is using a non-default port port, then use doc("dxq://host:port/").
It is sometimes useful to simulate a network of Galax servers on a single host. The -s option makes this possible. The way to set this up is to create a directory with a query file for each simulated host. For example, create a directory example with query files a.xq, b.xq, and c.xq. Each .xq file should define a local:main() function. Also, these files can refer to each other’s local:main() functions using doc("dxq://a/"), doc("dxq://b/"), and doc("dxq://c/"). Then start up the three servers as follows:
galaxd -s example -port 3324
galaxd -s example -port 3325
galaxd -s example -port 3326
galaxd uses the -s option to find out what the virtual network will look like: it will have hosts a, b, and c, operating on non-virtual ports 3324, 3325, and 3326 on localhost. The first invocation of galaxd above uses port 3324, so it uses a.xq to define its local:main() function. Similarly, the second and third invocations use b.xq and c.xq, respectively.
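The addressing scheme described above can be sketched in OCaml. The helper names are hypothetical. The host-to-port rule used here (query files sorted by name, assigned consecutive ports from a base port) is an assumption inferred from the a/b/c example, not a documented galaxd algorithm.

```ocaml
(* Build a dxq:// URL for a host, with an optional non-default port. *)
let dxq_url ?port host =
  match port with
  | None -> Printf.sprintf "dxq://%s/" host
  | Some p -> Printf.sprintf "dxq://%s:%d/" host p

(* Assumed rule: each .xq file names one virtual host; hosts are sorted
   by name and paired with consecutive ports starting at base_port. *)
let virtual_network ~base_port files =
  files
  |> List.filter (fun f -> Filename.check_suffix f ".xq")
  |> List.sort compare
  |> List.mapi (fun i f -> (Filename.chop_suffix f ".xq", base_port + i))

let () =
  virtual_network ~base_port:3324 [ "c.xq"; "a.xq"; "b.xq" ]
  |> List.iter (fun (host, port) ->
         Printf.printf "%s -> localhost:%d, reachable as %s\n"
           host port (dxq_url host))
```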
The quickest way to learn how to use the APIs is as follows:
Every Galax API has functions for:
This chapter describes how to use each kind of function.
Galax currently supports application-program interfaces for the O’Caml, C, and Java programming languages.
All APIs support the same set of functions; only their names differ in each language API. This file describes the API functions. The interfaces for each language are defined in:
If you use the C API, see Section 6.5.1 “Memory Management in C API”.
Example programs that use these APIs are in:
To try out the API programs, edit examples/Makefile.config to set up your environment, then execute: cd $GALAXHOME/examples; make all.
This will compile and run the examples. Each directory contains a "test" program that exercises every function in the API and an "example" program that illustrates some simple uses of the API.
The Galax query engine is implemented in O’Caml. This means that values in the native language (C or Java) are converted into values in the XQuery data model (which are represented by O’Caml objects) before sending them to the Galax engine. The APIs provide functions for converting between native-language values and XQuery data-model values.
There are two kinds of Galax libraries: byte code and native code. The C and Java libraries require native code libraries, and Java requires dynamically linked libraries. Here are the libraries:
O’Caml libraries in $GALAXHOME/lib/caml:
C libraries in $GALAXHOME/lib/c:
Java libraries in $GALAXHOME/lib/java:
Note that Java applications MUST link with a dynamically linked library and that C applications MAY link with a dynamically linked library.
For Linux users, set LD_LIBRARY_PATH to $GALAXHOME/lib/c:$GALAXHOME/lib/java.
The Makefiles in examples/c_api and examples/java_api show how to compile, link, and run applications that use the C and Java APIs.
The simplest API functions allow you to evaluate an XQuery statement in a string. If the statement is an update, these functions return the empty list; otherwise, if the statement is an XQuery expression, they return a list of XML values.
The example programs in $(GALAXHOME)/examples/caml_api/example.ml, $(GALAXHOME)/examples/c_api/example.c, $(GALAXHOME)/examples/java_api/Example.java illustrate how to use these query evaluation functions.
Galax accepts input (documents and queries) from files, string buffers, channels and HTTP, and emits output (XML values) in files, string buffers, channels, and formatters. See $(GALAXHOME)/lib/caml/galax_io.mli.
All the evaluation functions require a processing context. The default processing context is constructed by calling the function Processing_context.default_processing_context():
val default_processing_context : unit -> processing_context
There are three ways to evaluate an XQuery statement:
val eval_statement_with_context_item : Processing_context.processing_context -> Galax_io.input_spec -> Galax_io.input_spec -> item list
Bind the context item (the XPath "." expression) to the XML document in the resource named by the second argument, and evaluate the XQuery statement in the third argument.
val eval_statement_with_context_item_as_xml : Processing_context.processing_context -> item -> Galax_io.input_spec -> item list
Bind the context item (the XPath "." expression) to the XML value in the second argument and evaluate the XQuery statement in the third argument.
val eval_statement_with_variables_as_xml : Processing_context.processing_context -> (string * item list) list -> Galax_io.input_spec -> item list
The second argument is a list of variable name and XML value pairs. Bind each variable to the corresponding XML value and evaluate the XQuery statement in the third argument.
Sometimes you need more control over query evaluation, because, for example, you want to load XQuery libraries and/or main modules and evaluate statements incrementally. The following two sections describe the API functions that provide finer-grained control.
In the XQuery data model, a value is a sequence (or list) of items. An item is either a node or an atomic value. A node is an element, attribute, text, comment, or processing-instruction node. An atomic value is an instance of one of the nineteen XML Schema data types or of the XQuery type xs:untypedAtomic.
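The shape of the data model described above can be sketched with OCaml variant types. This is a simplified, hypothetical model for illustration only: the type and constructor names are not those of the Galax API, only a handful of atomic types are included, and names are plain strings rather than QNames.

```ocaml
(* A few representative atomic types; the real model has many more. *)
type atomic =
  | XsString of string
  | XsInteger of int
  | XsBoolean of bool
  | XsUntypedAtomic of string

type node =
  | Element of string * node list * node list  (* name, attributes, children *)
  | Attribute of string * string               (* name, value *)
  | Text of string
  | Comment of string
  | ProcessingInstruction of string * string   (* target, content *)

(* An item is a node or an atomic value; a value is a sequence of items. *)
type item = Node of node | Atomic of atomic
type value = item list

(* String value of a node: an element concatenates the string values of
   its element and text children; other nodes return their own content. *)
let rec string_value = function
  | Element (_, _, children) ->
      children
      |> List.filter (function Element _ | Text _ -> true | _ -> false)
      |> List.map string_value
      |> String.concat ""
  | Attribute (_, v) | Text v | Comment v -> v
  | ProcessingInstruction (_, content) -> content

let string_of_item = function
  | Node n -> string_value n
  | Atomic (XsString s) | Atomic (XsUntypedAtomic s) -> s
  | Atomic (XsInteger i) -> string_of_int i
  | Atomic (XsBoolean b) -> string_of_bool b

(* A two-item sequence: the element <two>2</two> and the integer 2. *)
let result : value =
  [ Node (Element ("two", [], [ Text "2" ])); Atomic (XsInteger 2) ]

let () = List.iter (fun i -> print_endline (string_of_item i)) result
```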
The Galax APIs provide constructors for the following data model values:
The constructor functions for atomic values take values in the native language and return atomic values in the XQuery data model. For example, the O’Caml constructor:
val atomicFloat : float -> atomicFloat
takes an O’Caml float value (as defined in the Datatypes module) and returns a float in the XQuery data model. Similarly, the C constructor:
extern galax_err galax_atomicDecimal(int i, atomicDecimal *decimal);
takes a C integer value and returns a decimal in the XQuery data model.
The constructor functions for nodes typically take other data model values as arguments. For example, the O’Caml constructor for elements:
val elementNode : atomicQName * attribute list * node list * atomicQName -> element
takes a QName value, a list of attribute nodes, a list of children nodes, and the QName of the element’s type. Similarly, the C constructor for text nodes takes an XQuery string value:
extern galax_err galax_textNode(atomicString str, text *);
The constructor functions for sequences are language specific. In O’Caml, the sequence constructor is simply the O’Caml list constructor. In C, the sequence constructor is defined in galapi/itemlist.h as:
extern itemlist itemlist_cons(item i, itemlist cdr);
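The two sequence constructors build the same shape of value: an item followed by the rest of the sequence. The OCaml sketch below models the C-style cons cell as a variant type and converts it to an ordinary OCaml list. The item and itemlist types here are hypothetical stand-ins; the actual C representation in galapi/itemlist.h is not shown in this document.

```ocaml
(* Hypothetical stand-in for data-model items. *)
type item = Int of int | Str of string

(* Cons-cell view, mirroring the shape of itemlist_cons(item, itemlist). *)
type itemlist = Nil | Cons of item * itemlist

(* Convert the cons-cell view into a plain OCaml list. *)
let rec to_list = function
  | Nil -> []
  | Cons (i, rest) -> i :: to_list rest

(* The same two-item sequence written both ways. *)
let c_style = Cons (Int 1, Cons (Str "two", Nil))
let ocaml_style = Int 1 :: Str "two" :: []

let () =
  Printf.printf "same sequence: %b\n" (to_list c_style = ocaml_style)
```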
The APIs are written in an "object-oriented" style, meaning that any use of a type in a function signature denotes any value of that type or a value derived from that type. For example, the function Dm_functions.string_of_atomicValue takes any atomic value (i.e., xs_string, xs_boolean, xs_int, xs_float, etc.) and returns an O’Caml string value:
val string_of_atomicValue : atomicValue -> string
Similarly, the function galax_parent in the C API takes any node value (i.e., an element, attribute, text, comment, or processing instruction node) and returns a list of nodes:
extern galax_err galax_parent(node n, node_list *);
The accessor functions take XQuery values and return constituent parts of the value. For example, the children accessor takes an element node and returns the sequence of children nodes contained in that element:
val children : node -> node list (* O’Caml *)
extern galax_err galax_children(node n, node_list *); /* C */
The XQuery data model accessors are described in detail in http://www.w3c.org/TR/query-datamodel.
Galax provides the load_document function for loading documents.
The load_document function takes the name of an XML file in the local file system and returns a sequence of nodes that are the top-level nodes in the document (this may include zero or more comments and processing instructions and zero or one element node).
val load_document : Processing_context.processing_context -> Galax_io.input_spec -> node list (* O’Caml *)
extern galax_err galax_load_document(char* filename, node_list *);
extern galax_err galax_load_document_from_string(char* string, node_list *);
The general model for evaluating an XQuery expression or statement proceeds as follows (each function is described in detail below):
let proc_ctxt = default_processing_context() in
let mod_ctxt = load_standard_library(proc_ctxt) in
let library_input = File_Input "some-xquery-library.xq" in
let mod_ctxt = import_library_module proc_ctxt mod_ctxt library_input in
let (mod_ctxt, stmts) = import_main_module proc_ctxt mod_ctxt (File_Input "some-main-module.xq") in
let ext_ctxt = build_external_context proc_ctxt opt_context_item opt_timezone var_value_list in
let mod_ctxt = add_external_context mod_ctxt ext_ctxt in
let mod_ctxt = eval_global_variables proc_ctxt mod_ctxt in
** NB: This step is necessary if the module contains *any* global variables, whether defined in the XQuery module or defined externally by the application. **
let result = eval_statement proc_ctxt mod_ctxt stmt in
let result = eval_statement_from_io proc_ctxt mod_ctxt (Buffer_Input some-XQuery-statement) in
let result = eval_query_function proc_ctxt mod_ctxt "some-function" argument-values in
Every query is evaluated in a module context, which includes:
The functions for creating a module context include:
val default_processing_context : unit -> processing_context
The default processing context, which just contains flags for controlling debugging, printing, and the processing phases. You can change the default processing context yourself if you want to print out debugging info.
val load_standard_library : processing_context -> module_context
Load the standard Galax library, which contains the built-in types, namespaces, and functions.
val import_library_module : processing_context -> module_context -> input_spec -> module_context
If you need to import other library modules, this function returns the module_context argument extended with the module in the third argument.
val import_main_module : processing_context -> module_context -> input_spec -> module_context * (Xquery_ast.cstatement list)
If you want to import a main module defined in a file, this function returns the module_context argument extended with the main module in the third argument and a list of statements to evaluate.
The functions for creating an external context (context item and global variable values):
val build_external_context : processing_context -> (item option) -> (atomicDayTimeDuration option) -> (string * item list) list -> external_context
The external context includes an optional value for the context item (known as "."), the (optional) local timezone, and a list of (variable name, item-list value) pairs.
val add_external_context : module_context -> external_context -> module_context
This function extends the given module context with the external context.
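The shape of the arguments to build_external_context can be sketched as follows. This is only an illustration: the item type below is a stand-in for Galax's real item type, and lookup is a hypothetical helper that is not part of the Galax API.

```ocaml
(* Stand-in for Galax's item type; the real type is richer. *)
type item =
  | Int of int
  | Str of string

(* Variable bindings in the shape (string * item list) list:
   each variable name is paired with a sequence of items. *)
let var_value_list : (string * item list) list =
  [ ("x", [ Int 1 ]);                 (* singleton sequence *)
    ("names", [ Str "a"; Str "b" ]);  (* two-item sequence *)
    ("empty", []) ]                   (* empty sequence *)

(* Optional context item: Some item when "." is bound, None otherwise. *)
let opt_context_item : item option = Some (Str "doc")

(* Hypothetical helper: find the sequence bound to a variable name. *)
let lookup name = List.assoc_opt name var_value_list
```

Each variable is bound to a whole item list, so a single value is passed as a one-element list and the empty sequence as an empty list.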
val eval_global_variables : processing_context -> xquery_module -> xquery_module
This function evaluates the expressions for all (possibly mutually dependent) global variables. It must be called before calling the eval_* functions; otherwise you will get an "Undefined variable" error at evaluation time.
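The ordering requirement can be illustrated with a toy model. The types and functions below are assumptions made for this sketch; they do not reflect how Galax represents a module context, and the model ignores dependencies between global variables.

```ocaml
(* Toy model: a context holds global declarations and the values
   computed for them so far. NOT Galax's real representation. *)
type ctxt = {
  decls : (string * (unit -> int)) list;  (* name and initializer *)
  values : (string * int) list;           (* globals evaluated so far *)
}

exception Undefined_variable of string

(* Analogue of eval_global_variables: run every initializer once. *)
let eval_globals c =
  { c with values = List.map (fun (n, init) -> (n, init ())) c.decls }

(* Analogue of an eval_* call that reads a global variable. *)
let read_global c name =
  match List.assoc_opt name c.values with
  | Some v -> v
  | None -> raise (Undefined_variable name)

(* A context with one declared, but not yet evaluated, global. *)
let fresh = { decls = [ ("x", fun () -> 42) ]; values = [] }
```

In this model, read_global fresh "x" raises Undefined_variable, while read_global (eval_globals fresh) "x" succeeds, which mirrors the rule stated above.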
Analogous functions are defined in the C and Java APIs.
The APIs support three functions for evaluating a query: eval_statement_from_io, eval_statement, and eval_query_function.
Note: If the module context contains (possibly mutually dependent) global variables, the function eval_global_variables must be called before calling the eval_* functions; otherwise you will get an "Undefined variable" error at evaluation time.
val eval_statement_from_io : processing_context -> xquery_module -> Galax_io.input_spec -> item list
Given the module context, evaluates the XQuery statement in the third argument. If the statement is an XQuery expression, returns Some (item list); otherwise if the statement is an XQuery update, returns None (because update statements have side effects on the data model store, but do not return values).
val eval_statement : processing_context -> xquery_module -> xquery_statement -> item list
Given the module context, evaluates the given XQuery statement.
val eval_query_function : processing_context -> xquery_module -> string -> item list list -> item list
Given the module context, evaluates the function named by the string argument, applied to the list of item-list arguments. Note: Each actual function argument is bound to one item list.
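The "item list list" argument can be pictured as one inner list per function parameter. The sketch below uses a stand-in item type and a hypothetical three-parameter function; neither comes from the Galax distribution.

```ocaml
(* Stand-in for Galax's item type. *)
type item =
  | Int of int
  | Str of string

(* Arguments for a hypothetical XQuery function local:f($a, $b, $c):
   one item list per parameter, in declaration order. *)
let argument_values : item list list =
  [ [ Int 1 ];             (* $a: a single item *)
    [ Str "x"; Str "y" ];  (* $b: a two-item sequence *)
    [] ]                   (* $c: the empty sequence *)

(* The outer list length is the number of arguments passed. *)
let arity = List.length argument_values

(* The inner list lengths are the sizes of each argument sequence. *)
let sequence_sizes = List.map List.length argument_values
```

Passing a two-item sequence as one argument therefore means one inner list with two items, not two inner lists.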
Analogous functions are defined in the C and Java APIs.
Once an application program has a handle on the result of evaluating a query, it can either use the accessor functions in the API or it can serialize the result value into an XML document. There are three serialization functions: serialize_to_string, serialize_to_output_channel and serialize_to_file.
val serialize : processing_context -> Galax_io.output_spec -> item list -> unit
Serializes an XML value to the given Galax output.
val serialize_to_string : processing_context -> item list -> string
Serializes an XML value to a string.
Analogous functions are defined in the C and Java APIs.
The Galax query engine is implemented in O’Caml. This means that values in the native language (C or Java) are converted into values in the XQuery data model (which are represented by O’Caml objects) before being sent to the Galax engine. Similarly, the values returned from the Galax engine are also O’Caml values; the native language values are "opaque handles" to the O’Caml values.
All O’Caml values live in the O’Caml memory heap and are therefore managed by the O’Caml garbage collector. The C API guarantees that any items returned from Galax to a C application will not be de-allocated by the O’Caml garbage collector, unless the C application explicitly frees those items, indicating that they are no longer accessible in the C application. The C API provides two functions in galapi/itemlist.h for freeing XQuery item values:
extern void item_free(item i);
Frees one XQuery item value.
extern void itemlist_free(itemlist il);
Frees every XQuery item value in the given item list.
The Galax query engine may raise an exception in O’Caml, which must be conveyed to the C application. Every function in the C API returns an integer error value.
The global variable galax_error_string contains the string value of the exception raised in Galax. In future APIs, we will provide a better mapping between error codes and Galax exceptions.
The Galax query engine is implemented in O’Caml. This means that values in the native language (C or Java) are converted into values in the XQuery data model (which are represented by O’Caml objects) before being sent to the Galax engine.
The Java API uses JNI to call the C API, which in turn calls the O’Caml API (it’s not as horrible as it sounds).
There is one class for each of the built-in XML Schema types supported by Galax and one class for each kind of node:
Atomic | Node | Item |
xsAnyURI | Attribute | |
xsBoolean | Comment | |
xsDecimal | Element | |
xsDouble | ProcessingInstruction | |
xsFloat | Text | |
xsInt | ||
xsInteger | ||
xsQName | ||
xsString | ||
xsUntyped |
There is one class for each kind of sequence:
There is one class for each kind of context used by Galax:
Finally, the procedures for loading documents, constructing new contexts and running queries are in the Galax class.
All Galax Java API functions can raise the exception class GalapiException, which must be handled by the Java application.
All Java-C-O’Caml memory management is handled automatically in the Java API.
Currently, Galax is not re-entrant, which means multi-threaded applications cannot create multiple, independent instances of the Galax query engine to evaluate queries.
The C API library libgalaxopt.{a,so} does not link properly under MinGW. A user reported that, with the source distribution, linking directly against the object files in galapi/c_api/*.o and adding the library -lasmrun on the command line works.
The Galax source-code directories roughly correspond to each phase of the query processor. (Put link to Jerome’s tutorial presentation here)
The processing phases are:
Document Parsing => [Schema Normalization (below) =>] Validation => Loading => Evaluation (below)
Schema Parsing => Schema Normalization => Validation (above) Static Typing (below)
Query Parsing => Normalization => [Schema Normalization (above) =>] Static Typing (optional phase) => Rewriting => Compilation => [Loading (above) =>] Evaluation => Serialization
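The query pipeline above, with its optional static typing phase, can be sketched as an ordered list of phases. The type and function names are invented for this illustration and are not taken from the Galax sources; the bracketed cross-references to the schema and document pipelines are left out.

```ocaml
(* Phases of the query pipeline, named after the list above. *)
type phase =
  | Query_parsing
  | Normalization
  | Static_typing
  | Rewriting
  | Compilation
  | Evaluation
  | Serialization

let phase_name = function
  | Query_parsing -> "Query Parsing"
  | Normalization -> "Normalization"
  | Static_typing -> "Static Typing"
  | Rewriting -> "Rewriting"
  | Compilation -> "Compilation"
  | Evaluation -> "Evaluation"
  | Serialization -> "Serialization"

(* Phases in order; static typing is included only when enabled. *)
let query_pipeline ~static_typing =
  [ Query_parsing; Normalization ]
  @ (if static_typing then [ Static_typing ] else [])
  @ [ Rewriting; Compilation; Evaluation; Serialization ]

(* Render the pipeline in the "A => B" notation used above. *)
let describe ~static_typing =
  String.concat " => " (List.map phase_name (query_pipeline ~static_typing))
```

Calling describe ~static_typing:false yields the pipeline without the optional phase; with ~static_typing:true the Static Typing phase appears between Normalization and Rewriting.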
Makefile
base/
ast/
config/
monitor/
toplevel/
website/
datatypes/ (*** Doug)
We are going to extend this module to include lexers for: xsd:date, xsd:time, xsd:dateTime, xs:yearMonthDuration, xs:dayTimeDuration (skipping the Gregorian types for now: xsd:gDay, xsd:gMonth, etc.)
namespace/
dm/ (*** Doug)
datamodel/
jungledm/
physicaldm/
streaming/
procctxt/
procmod/
lexing/
parsing/
normalization/
fsa/
typing/
schema/
cleaning/
rewriting/
compile/
algebra/
evaluation/
stdlib/
serialization/
usecases/
examples/
regress/
galapi/
tools/
Required tools:
Optional supported tools:
Optional unsupported tools:
extensions/
projection/
wsdl/
wsdl_usecases/
Auxiliary research tools:
Galax version 1.0 implements the XQuery 1.0 Recommendation from January, 2007 (http://www.w3c.org/TR/xquery) and the XQuery Update Facility 1.0 from August, 2007 (http://www.w3.org/TR/xquery-update-10/). It also implements XQueryP, an imperative scripting language that extends XQuery with updates with mutable variables, while loops, and sequential expressions (http://www.ximep-2006.org/papers/Paper-Chamberlin-Carey.pdf).
Feature | Pass | Fail | Total | Percent |
Minimal Conformance | 14555 | 69 | 14637 | (99.4%) |
Optional Features | ||||
Static Typing Feature | 46 | 0 | 46 | |
Full Axis Feature | 130 | 0 | 130 | |
Module Feature | 32 | 0 | 32 |
Galax version 0.7.2 is a minor release, and should be considered as a beta release. Galax 0.7.2 implements the XQuery 1.0 Recommendation from January, 2007.
This is a development release. Notably static typing and some of the new compiler optimizations are not fully tested.
Feature | Galax Pass | Fail | Total | Percent |
Minimal Conformance | 14514 | 110 | 14637 | (99.1%) |
Optional Features | ||||
Schema Import Feature | 0 | 0 | 174 | |
Schema Validation Feature | 0 | 0 | 25 | |
Static Typing Feature | 46 | 0 | 46 | |
Full Axis Feature | 130 | 0 | 130 | |
Module Feature | 0 | 0 | 32 | |
Trivial XML Embedding Feature | 0 | 0 | 4 |
Galax version 0.6.8 is a minor release, and should be considered as a beta release. Galax 0.6.8 implements the XQuery 1.0 candidate recommendation working drafts from January, 2007.
This is a development source-only release. Notably static typing and some of the new compiler optimizations are not fully tested.
Galax version 0.6.5 is a major release, and should be considered as an alpha release. Galax 0.6.5 implements the XQuery 1.0 candidate recommendation working drafts from January, 2007.
This is a development source-only release. Notably static typing and some of the new compiler optimizations are not fully tested.
Galax version 0.5 is a major release, and should be considered as an alpha release. Galax 0.5.0 implements the XQuery 1.0 working draft published in October 2004.
Among the most noticeable changes:
Contributors to this release: Mary Fernández, Nicola Onose, Philippe Michiels, Christopher Ré, Jérôme Siméon, Michael Stark.
Galax version 0.4 is a major release, and should be considered as an alpha release. Galax 0.4.0 implements the latest XQuery 1.0 working draft published in July 2004. It contains many improvements from the previous version, as well as new features.
Among the most noticeable improvements and new features: Galax now comes bundled with Jungle, a simple native XML store. It now supports XML Schema and "named typing". Finally, it contains some prototype support for Web services.
Contributors to this release: Mary Fernández, Vladimir Gapeyev, Nicola Onose, Philippe Michiels, Doug Petkanics, Christopher Ré, Jérôme Siméon, Avinash Vyas.
Main changes over the previous version are listed below.
Language changes:
Environment changes:
Architectural changes:
New features:
Portability:
This is a bug-fix release:
Language extensions:
Command line:
API:
Parsing:
Bugs
Language: Numerous changes to align with Nov. 2002 and upcoming Feb 2003 WDs.
Galax features:
Data model:
Function library:
Parsing
Data model:
Language:
Namespaces:
XML Schema:
Type system:
Function library:
Optimizer:
Compilation:
Tests:
Tools and interfaces:
Documentation:
Contributors:
Alumni:
Feedback and bug reports can be sent by mail to: galax-users@research.att.com
You can subscribe to the Galax mailing list at: galax-users-request@research.att.com
You can report bugs at http://bugzilla.galaxquery.net/.
Galax version 1.0 is distributed under the terms of the LUCENT PUBLIC LICENSE VERSION 1.0 - see the LICENSE file for details.
Lucent Public License Version 1.0 THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS PUBLIC LICENSE ("AGREEMENT"). ANY USE, REPRODUCTION OR DISTRIBUTION OF THE PROGRAM CONSTITUTES RECIPIENT’S ACCEPTANCE OF THIS AGREEMENT. 1. DEFINITIONS "Contribution" means: a.in the case of Lucent Technologies Inc. ("LUCENT"), the Original Program, and b.in the case of each Contributor, i.changes to the Program, and ii.additions to the Program; where such changes and/or additions to the Program originate from and are "Contributed" by that particular Contributor. A Contribution is "Contributed" by a Contributor only (i) if it was added to the Program by such Contributor itself or anyone acting on such Contributor’s behalf, and (ii) the Contributor explicitly consents, in accordance with Section 3C, to characterization of the changes and/or additions as Contributions. "Contributor" means LUCENT and any other entity that has Contributed a Contribution to the Program. "Distributor" means a Recipient that distributes the Program, modifications to the Program, or any part thereof. "Licensed Patents" mean patent claims licensable by a Contributor which are necessarily infringed by the use or sale of its Contribution alone or when combined with the Program. "Original Program" means the original version of the software accompanying this Agreement as released by LUCENT, including source code, object code and documentation, if any. "Program" means the Original Program and Contributions or any part thereof "Recipient" means anyone who receives the Program under this Agreement, including all Contributors. 2. GRANT OF RIGHTS a.Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, distribute and sublicense the Contribution of such Contributor, if any, and such derivative works, in source code and object code form. 
b.Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free patent license under Licensed Patents to make, use, sell, offer to sell, import and otherwise transfer the Contribution of such Contributor, if any, in source code and object code form. The patent license granted by a Contributor shall also apply to the combination of the Contribution of that Contributor and the Program if, at the time the Contribution is added by the Contributor, such addition of the Contribution causes such combination to be covered by the Licensed Patents. The patent license granted by a Contributor shall not apply to (i) any other combinations which include the Contribution, nor to (ii) Contributions of other Contributors. No hardware per se is licensed hereunder. c.Recipient understands that although each Contributor grants the licenses to its Contributions set forth herein, no assurances are provided by any Contributor that the Program does not infringe the patent or other intellectual property rights of any other entity. Each Contributor disclaims any liability to Recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise. As a condition to exercising the rights and licenses granted hereunder, each Recipient hereby assumes sole responsibility to secure any other intellectual property rights needed, if any. For example, if a third party patent license is required to allow Recipient to distribute the Program, it is Recipient’s responsibility to acquire that license before distributing the Program. d.Each Contributor represents that to its knowledge it has sufficient copyright rights in its Contribution, if any, to grant the copyright license set forth in this Agreement. 3. REQUIREMENTS A. 
Distributor may choose to distribute the Program in any form under this Agreement or under its own license agreement, provided that: a.it complies with the terms and conditions of this Agreement; b.if the Program is distributed in source code or other tangible form, a copy of this Agreement or Distributor’s own license agreement is included with each copy of the Program; and c.if distributed under Distributor’s own license agreement, such license agreement: i.effectively disclaims on behalf of all Contributors all warranties and conditions, express and implied, including warranties or conditions of title and non-infringement, and implied warranties or conditions of merchantability and fitness for a particular purpose; ii.effectively excludes on behalf of all Contributors all liability for damages, including direct, indirect, special, incidental and consequential damages, such as lost profits; and iii.states that any provisions which differ from this Agreement are offered by that Contributor alone and not by any other party. B. Each Distributor must include the following in a conspicuous location in the Program: Copyright (C) 2003, Lucent Technologies Inc. and others. All Rights Reserved. C. In addition, each Contributor must identify itself as the originator of its Contribution, if any, and manifest its intent that the additions and/or changes be a Contribution, in a manner that reasonably allows subsequent Recipients to identify the originator of the Contribution. Once consent is granted, it may not thereafter be revoked. 4. COMMERCIAL DISTRIBUTION Commercial distributors of software may accept certain responsibilities with respect to end users, business partners and the like. While this license is intended to facilitate the commercial use of the Program, the Distributor who includes the Program in a commercial product offering should do so in a manner which does not create potential liability for Contributors. 
Therefore, if a Distributor includes the Program in a commercial product offering, such Distributor ("Commercial Distributor") hereby agrees to defend and indemnify every Contributor ("Indemnified Contributor") against any losses, damages and costs (collectively "Losses") arising from claims, lawsuits and other legal actions brought by a third party against the Indemnified Contributor to the extent caused by the acts or omissions of such Commercial Distributor in connection with its distribution of the Program in a commercial product offering. The obligations in this section do not apply to any claims or Losses relating to any actual or alleged intellectual property infringement. In order to qualify, an Indemnified Contributor must: a) promptly notify the Commercial Distributor in writing of such claim, and b) allow the Commercial Distributor to control, and cooperate with the Commercial Distributor in, the defense and any related settlement negotiations. The Indemnified Contributor may participate in any such claim at its own expense. For example, a Distributor might include the Program in a commercial product offering, Product X. That Distributor is then a Commercial Distributor. If that Commercial Distributor then makes performance claims, or offers warranties related to Product X, those performance claims and warranties are such Commercial Distributor’s responsibility alone. Under this section, the Commercial Distributor would have to defend claims against the Contributors related to those performance claims and warranties, and if a court requires any Contributor to pay any damages as a result, the Commercial Distributor must pay those damages. 5. NO WARRANTY EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
Each Recipient is solely responsible for determining the appropriateness of using and distributing the Program and assumes all risks associated with its exercise of rights under this Agreement, including but not limited to the risks and costs of program errors, compliance with applicable laws, damage to or loss of data, programs or equipment, and unavailability or interruption of operations. 6. DISCLAIMER OF LIABILITY EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 7. EXPORT CONTROL The Recipient acknowledges that the Program is "publicly available" as the term is defined under the United States export administration regulations and is not subject to export control under such laws and regulations. However, if the Recipient modifies the Program to change (or otherwise affect) such publicly available status, the Recipient agrees that Recipient alone is responsible for compliance with the United States export administration regulations (or the export control laws and regulation of any other countries) and hereby indemnifies the Contributors for any liability incurred as a result of the Recipients actions which result in any violation of any such laws and regulations. 8. 
GENERAL If any provision of this Agreement is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this Agreement, and without further action by the parties hereto, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable. If Recipient institutes patent litigation against a Contributor with respect to a patent applicable to software (including a cross-claim or counterclaim in a lawsuit), then any patent licenses granted by that Contributor to such Recipient under this Agreement shall terminate as of the date such litigation is filed. In addition, if Recipient institutes patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Program itself (excluding combinations of the Program with other software or hardware) infringes such Recipient’s patent(s), then such Recipient’s rights granted under Section 2(b) shall terminate as of the date such litigation is filed. All Recipient’s rights under this Agreement shall terminate if it fails to comply with any of the material terms or conditions of this Agreement and does not cure such failure in a reasonable period of time after becoming aware of such noncompliance. If all Recipient’s rights under this Agreement terminate, Recipient agrees to cease use and distribution of the Program as soon as reasonably practicable. However, Recipient’s obligations under this Agreement and any licenses granted by Recipient relating to the Program shall continue and survive. Lucent Technologies Inc. may publish new versions (including revisions) of this Agreement from time to time. Each new version of the Agreement will be given a distinguishing version number. The Program (including Contributions) may always be distributed subject to the version of the Agreement under which it was received. 
In addition, after a new version of the Agreement is published, Contributor may elect to distribute the Program (including its Contributions) under the new version. No one other than Lucent has the right to modify this Agreement. Except as expressly stated in Sections 2(a) and 2(b) above, Recipient receives no rights or licenses to the intellectual property of any Contributor under this Agreement, whether expressly, by implication, estoppel or otherwise. All rights in the Program not expressly granted under this Agreement are reserved. This Agreement is governed by the laws of the State of New York and the intellectual property laws of the United States of America. No party to this Agreement will bring a legal action under this Agreement more than one year after the cause of action arose. Each party waives its rights to a jury trial in any resulting litigation.
Galax’s error messages are often uninformative. We are working on this.
Namespace declarations in input and output documents and in input queries are not handled consistently. We are working on this.
Although module declarations and module import statements are supported, they are not well tested.
This document was translated from LATEX by HEVEA.