Abstract
GNU APL provides a number of communication mechanisms that allow it to control other programs or to be controlled by other programs. With these communication mechanisms one can construct larger systems in which GNU APL is only one component. This document gives an overview of the communication mechanisms, so that the reader can choose the most appropriate mechanisms for a given problem.
Terminology
A server is a program or process that is being controlled by one or more other programs or processes. A client is a program or process that controls another program or process. A peer is a program or process that communicates with GNU APL, but is not a client or a server with respect to that communication. In a client/server communication it is the client that establishes the communication link between the two partners.
In general the communication between a client and a server is somewhat simpler than the communication between two peers. The role of one communication partner dictates the role of the other:
GNU APL | Partner |
---|---|
Client |
Server |
Server |
Client |
Peer |
Peer |
If GNU APL is the client or a peer in a communication then the communication partner usually needs to exist before GNU APL can connect to the partner. Some partners are being started beforehand by the operating system (Web servers, SQL servers), some are started automatically by GNU APL (*APserver( for shared variables) and some may need to be started by other means before a GNU APL program can use them.
GNU APL Communication Mechanisms Overview
The following table gives an overview of the communication mechanisms that are implemented in GNU APL. Most communication mechanisms are accessed from an APL program via system variables. Some system variables need certain libraries and if these libraries are not present at the time when GNU APL was ./configure’d, then these system variables will not exist (raising a SYNTAX ERROR) or raise a DOMAIN ERROR. The fact that GNU APL is in one role, for example as server, does not preclude it from being a client in a different communication.
Mechanism | GNU APL Role |
System Variable(s) | TTFB | Notes |
---|---|---|---|---|
libapl |
Server |
- |
low |
- |
Shared Variables |
Peer |
⎕SVC ⎕SVO ⎕SVQ |
medium |
for IBM APL2 compatibility only, |
Shared Variables |
Client |
⎕SVC ⎕SVO ⎕SVQ |
medium |
for IBM APL2 compatibility only, |
SQL Interface |
Client |
⎕SQL |
low |
Server is PostgreSQL or SQLite |
Server forked by GNU APL |
Client |
⎕FIO[57] |
medium |
bidirectional pipe |
popen() |
Client |
⎕FIO[24] |
low |
unidirectional pipe |
Sockets |
Any |
⎕FIO[32 … 47] |
medium |
TCP, UDP, or U*ix Domain Sockets |
Native functions |
Any |
dyadic ⎕FX |
high |
- |
Note: TTFB is the "Time To First Byte" of a communication and shall indicate how long it takes to get a communication mechanism up and running (APL code plus possibly code needed for the communication partner).
GNU APL Communication Mechanisms Details
In the following, the different communication mechanisms are described in some more detail.
libapl
Normally GNU APL initializes itself end then enters a REPL loop until the )OFF command is given (interactive mode) or the input stream ends (script mode). REPL stands for *R*ead-*E*valuate-*P*rint-*L*oop and that describes the top-=level operation of an APL interpreter in interactive mode.
If GNU APL is ./configure’d with option --with-libapl, then GNU APL does not build a normal executable that can be started from a script or from a shell, but it builds a library libapl. That library can then be linked with another program that either implements its own REPL loop (thus providing some interactive mode, but with a different user interface), or only call GNU APL facilities from time to time.
The following is an example C program that demonstrates the use of libapl:
#include "apl/libapl.h"
int
main(int argc, char * argv[])
{
init_libapl(argv[0], 0);
return apl_exec("⎕←2 3⍴⍳6");
}
If the code above is stored in, say, file test_libapl.c, and if libapl was properly installed (and assuming the default installation prefix /usr/local of GNU APL), the following command should create an executable file test_libapl:
gcc test_libapl.c -L /usr/local/lib/apl -lapl -o test_libapl
And running the newly created test_libapl gives what could be expected:
$ ./test_libapl
1 2 3
4 5 6
Shared Variables
Share variables is a communication mechanism that reaches back to the early days of APL. In essence, a Share variable is an APL variable that represents a communication channel to a peer. If one side assigns a value to a shared variable, then the other side gets that value when referencing the variable (and vice versa). The communication channel is established, maintained, and closed by a set of system variables named ⎕SVC (shared variable create) ⎕SVO (shared variable offer) ⎕SVQ (shared variable query), ⎕SVR (shared variable retract), and ⎕SVS (shared variable status). The details of how these system variables are used will not be explained here, because shared variables exist in GNU APL only for compatibility with IBM APL2. For new APL programs, the socket functions in ⎕FIO are much easier to use than shared variables and have less restrictions.
Shared variables come in 2 varieties:
-
workspace-to-workspace variables which create a communication link between two running APL workspaces (in different processes on the same machine. This is a peer-to-peer communication.
-
Auxiliary Processors (APs), which are processes started locally by the APL interpreter. In this case the APs are servers and APL is the client. GNU APL only implements 2 APs (AP100 and AP210) and only as examples for how to write other ones. In the past, APs were used to make specific operating services available to APL, but in GNU APL the same can be achieved with ⎕FIO[57] for services defined by the user and/or ⎕FIO[24] for services already available on the operating system.
Those interested in the style of shared variables programming may have a look at testcases AP100.tc for AP100, AP210.tc for AP210, and APnnn.tc for workspace-to-workspace shared variables in the directory src/testcases/ of GNU APL.
The GNU APL implementation of shared variables is a separate process running the program APserver (which comes with GNU APL). APserver currently implements only AP100 and AP210, but the source code of APserver is open, so that a user can add more APs. Is seems, however, that most of the APs that were not implemented have been superseded by more modern interfaces (most of which are sub-functions of ⎕FIO) that are easier to use that shared variables.
For workspace-to-workspace shared variables, the APserver stores the shared variables themselves, so that no data is lost if one of the connected workspaces exits or ⎕SVRs a shared variable. APserver also maps the peer-to-peer connection between two workspaces onto two client/server connections from each workspace (as client) to APserver (as server). APserver is started automatically when an APL workspace uses shared variables for the first time.
Since shared variables are considered obsolete, the effort put into their implementing them was kept at a minimum. As a consequence, the implementation is rather crude and not optimized for performance.
SQL Interface
The System function ⎕SQL provides a direct interface to a SQL database. Currently PostgreSQL and SQLite databases are supported. ⎕SQL is a container for a number of sub-functions that can be displayed with:
⎕SQL""
Available function numbers:
type ⎕SQL[1] file - open a database file, return reference ID for it
⎕SQL[2] ref - close database
query ⎕SQL[3,db] params - send SQL query
query ⎕SQL[4,db] params - send SQL update
⎕SQL[5] ref - begin a transaction
⎕SQL[6] ref - commit current transaction
⎕SQL[7] ref - rollback current transaction
⎕SQL[8] ref - list tables
ref ⎕SQL[9] table - list columns for table
More details on ⎕SQL can be found in info apl (around Section 3.24).
Server forked by GNU APL
Handle←⎕FIO[57] Bs spawns a process running the program Bs. Bs is the absolute (!) path of an executable program or script, which is started and connected to GNU APL over a bidirectional pipe.
The return value Handle of ⎕FIO[57] Bs is a file handle that can be used on the APL side of the communication to send data to resp. receive data from the spawned program using, for example, ⎕FIO[6] Handle (aka fread) resp. ⎕FIO[8] Handle (aka. fwrite). The handle is normally closed with ⎕FIO[4] Handle (aka. fclose).
At the spawned program’s end of the communication, the communication with GNU APL uses file descriptor 3 which is normally only used to fdopen() a FILE *. An fclose from the APL side causes an EOF condition (fread() returns 0) in the spawned program. Under normal circumstances, the communication should be closed from the client (= APL) side.
The following is a simple example for a server written in "C", and GNU APL code that forks the server and then sends a TLV (Tag-Length-Value) buffer to the server. The server sends the TLV buffers back, but with the tag negated and the value bytes mirrored.
Client side (APL):
Path ← '/usr/lib/apl/TLV_server' ⍝ wherever TLV_server was installed
Handle ← ⎕FIO[57] Path ⍝ start & connect TLV_server
⍝ typically in a loop,,,
TLV ← 33 ⎕CR 42,'Forty-Two' ⍝ encode a TLV buffer
⊣TLV ⎕FIO[43] Handle ⍝ send TLV buffer to TLV_server
TL ← 8 ⎕FIO[6] Handle ⍝ read tag/length from TLV_server
Value ← (256⊥4↓TL) ⎕FIO[6] Handle ⍝ read value from TLV_server
34 ⎕CR TL,Value ⍝ display response tag and value
⍝ cleanup (one time)
⊣(⎕FIO[4] Handle) ⍝ close connection (stops
TLV_server)
Server side (C, this example can also be found in file tools/TLV_server.c):
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
int
main(int argc, char * argv[])
{
enum { TLV_socket = 3 };
char TLV[1000]; // the entire TLV buffer
char * V = TLV + 8; // the value part of the TLV buffer
ssize_t rx_len, tx_len;
int32_t TLV_tag, TLV_len, j;
FILE * f = fdopen(TLV_socket, "r");
if (f == 0)
{
fprintf(stderr,
"fdopen() failed: %s\n"
"That typically happens if this program is started directly,\n"
"more precisely: without opening file descriptor 3 first. The anticipated\n"
"usage is to open this program from GNU APL using ⎕FIO[57] and then to send\n"
"TLV buffers encoded with 33 ⎕CR and send to this program with ⎕FIO[43]\n"
, strerror(errno));
return 1;
}
for (;;)
{
// read the fixed size TL
//
rx_len = fread(TLV, 1, 8, f);
if (rx_len != 8)
{
fprintf(stderr, "TLV socked closed (1): %s\n",
strerror(errno));
fclose(f);
return 0;
}
TLV_tag = TLV[0] << 24 | TLV[1] << 16 | TLV[2] << 8 | TLV[3];
TLV_len = TLV[4] << 24 | TLV[5] << 16 | TLV[6] << 8 | TLV[7];
// read the variable-sized V
//
if (TLV_len)
{
rx_len = fread(V, 1, TLV_len, f);
if (rx_len != TLV_len)
{
fprintf(stderr, "TLV socked closed (2): %s\n",
strerror(errno));
fclose(f);
return 0;
}
V[TLV_len] = 0;
}
// negate the tag
//
TLV_tag = -TLV_tag;
TLV[0] = TLV_tag >> 24;
TLV[1] = TLV_tag >> 16;
TLV[2] = TLV_tag >> 8;
TLV[3] = TLV_tag;
// mirror the value
//
for (j = 0; j < TLV_len/2; ++j)
{
const char tmp = V[j];
V[j] = V[TLV_len - j - 1];
V[TLV_len - j - 1] = tmp;
}
// send the response
//
tx_len = write(TLV_socket, TLV, 8 + TLV_len);
if (tx_len != (8 + TLV_len))
{
fprintf(stderr, "TLV socked closed (3): %s\n", strerror(errno));
fclose(f);
return 0;
}
}
}
popen()
Handle←⎕FIO[24] Bs (aka. popen) performs a function similar to ⎕FIO[57] above, but with the following differences:
-
the pipe is unidirectional
-
either to stdin (file descriptor 0) of the spawned process,
-
or to stdout (file descriptor 1) of the spawned process
-
-
the pipe is closed with ⎕FIO[24] Handle (aka. pclose) instead of pclose.
The following APL code executes the operating system command ls and prints its output:
Command←"ls"
Handle←⎕FIO[24] Command ⍝ PIPE = popen("ls")
Data←⎕FIO[6] Handle ⍝ fread(PIPE)
⎕UCS Data ⍝ display fread() output
⎕FIO[25] Handle ⍝ pclose(PIPE)
Sockets
The ⎕FIO functions 32…47, i.e. ⎕FIO[32], ⎕FIO[33], … ⎕FIO[47] give APL access to the most common socket functions in libstdc. APL can use these function to set up sockets in a way that specific low-level properties can be set via socket options, transport protocol options, and the like. The use of these functions requires some experience with socket programming.
info apl (around section 3.20) provides some more details about ⎕FIO. One can do ⎕FIO "" to obtain a list of ⎕FIO sub-function, in particular the names of the corresponding functions in the libstdc library.
That function name can then used with with man command which explains the expected function arguments. By and large ⎕FIO takes the same arguments as the corresponding libc functions, but some of the function arguments (in particular those taking pointer arguments) were mapped to APL byte arrays.
The testcase file src/testcases/Quad_FIO.tc contains APL code for a short client/server communication between two TCP sockets on Port 22222 (around line 170 of the testcase file).
Native functions
If the capabilities provided by sockets are still not enough, then one can get complete control by means of GNU APL native function. A native function is a function that can be called from APL (and returns results back to APL, but has an implementation in a different language (typically C or C++).
Writing native functions is a little more complicated than using communication facilities already present in APL like the mechanisms explained so far. The design and use of a native functions consists of the following steps:
First of all, preparation of a dynamic library that contains implementations for those function signatures that the native function shall support. For example, suppose a function FOO shall support the 3 signatures
-
dyadic function call, i.e. Z←A FOO B,
-
monadic function call, i.e. Z←FOO B, and
-
monadic function with axis call, i.e. Z←FOO[X] B.
The native function library then needs to implement 3 functions and make them accessible via dlsym().
After the dynamic library, say FOO.so, has been created, GNU APL can open the dynamic library, with:
'FOO.so' ⎕FX 'FOO'
That makes the 3 signatures implemented FOO.so available in APL (for the same signatures).
Note: Native functions were not primarily intended for communication purposes. But since they import the full power of C/C++ into GNU APL they are a last resort if all methods above can not be used.
The directory src/native contains C++ code templates for different APL function signatures. You can use these templates as a starting point for your own native function(s). If you modify a template in-place (i.e. you do NOT copy and rename it) then the GNU APL build system will compile your changes and create .so files in directory src/native/.libs that can be ⎕FXed later on.
Communication over streams
The majority of the communication mechanisms above use - explicitly or implicitly - some sort of data framing between the communication partners. This data framing divides the entire communication between the partners into several transactions, where one transaction is, depending on the machanism:
-
one libapl function call, or
-
one shared variable assignment, or
-
one write() to a ⎕FIO[57] pipe, or
-
one SQL query, or
-
one datagram sent over a socket of type SOCK_DGRAM, or
-
one native function call.
The remaining mechanisms:
-
popen(),
-
sockets of type SOCK_STREAM
send their data in a byte-by-byte fashion, which may cause a problem for the receiver because the receiver cannot reliably detect the boundaries between different transactions. In these cases the division of the byte stream(s) between the communication partners into transactions has to be performed by the APL application, and GNU APL provides a few system functions that can be used to frame (read: encode) an item at the sender side into a sequence of bytes, and to de-frame (read: decode) the received byte stream back into the item that was sent at the receiver side. The frame and de-frame functions cooperate in such a way that;
item ≡ decode encode item
decode ((encode A),(encode B)) ≡ (decode encode A), (decode encode B)
The frame and de-frame differ by the kind of item that is to be sent over a byte stream.
Tag/Length/Value aka. TLV
TLV is a frequantly used format for sending byte vectors over a stream. The byte vector is prepended by a tag field and a length field indicating the length of the byte vector. The tag and length fields have a fixed size (typically 1, 2, or 4 bytes) while the length of the byte vector can vary. Decoding of the TLV is very simple and efficient. The de-framing as such onlt requires the length field, but the tag is often required to distinguish different data types sent over the same stream.
System function 33 ⎕CR converts TAG,byte-vector to a TLV (sender side) while its inverse function ¯33 ⎕CR converts a TLV back to TAG,byte-vector. Example:
5 ⎕CR 33 ⎕CR 42,"Hello"
0000002A0000000548656C6C6F
│ │ │ │ │ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │ └─┴─┴─┴─┴───── "Hello:
│ │ │ │ └─┴─┴─┴─────────────── 5 (lenght)
└─┴─┴─┴─────────────────────── 42 (tag)
¯33 ⎕CR 33 ⎕CR 42,"Hello"
42 Hello
Lines terminated by line-feeds
Many programs, in particular operating system commands, write their output line-by-line to their stdout. If these programs are started from APL with Handle←24 ⎕FIO[24] Bs (aka. popen(Bs, "r") then the handle returned to APL is a stream of lines that are termnated by line-feeds (ASCII 10 or \n). The first step in the processing of such a streams in APL is usually to split the stream into a (nested) vector of lines. This can be easily and efficiently done with 35 ⎕CR. The terminating line-feeds are removed in this conversion.
Example:
9 ⎕CR 35 ⎕CR "Hello\nWorld\n"
╔═══════════════╗
║┏→━━━━┓ ┏→━━━━┓║
║┃Hello┃ ┃World┃║
║┗━━━━━┛ ┗━━━━━┛║
╚═══════════════╝