amend some small typos in techdoc

diff --git a/techdoc/protocol.pod b/techdoc/protocol.pod

index 151144b61130e84e27cedf4c5bb53f5794d17b8f..3c3ccdcc4af5bb5539d17b4600ecd5851a4380ef 100644
--- a/techdoc/protocol.pod
+++ b/techdoc/protocol.pod
@@ -1,6 +1,8 @@
  =head1 NAME
  
-DXSpiderWeb Orthogonal Communications Protocol $Revision$
+Aranea Orthogonal Communications Protocol 
+
+$Revision$
  
  =head1 SYNOPSIS
  
@@ -9,9 +11,18 @@ DXSpiderWeb Orthogonal Communications Protocol $Revision$
  =head1 ABSTRACT
  
  For many years DX Clusters have used a protocol which was designed 
-for a non-looped tree of nodes. This has probably never, reliably, 
-been achieved in practice; certainly not recently. This document 
-describes a complete replacement for that protocol. It allows a
+for a non-looped tree of nodes. This environment has probably never, reliably, 
+been achieved in practice; certainly not recently.
+
+There have always been loops, sometimes bringing the network to its 
+knees. In modern usage, both in order to get some resilience and also
+to expedite information flow, we use internet based, deliberately
+looped networks with filtering. Whilst this works, after a fashion, there 
+are all sorts of problems that the current PC protocol can never
+address.
+
+This document 
+describes a complete replacement for the PC protocol. It allows a
  fully looped network, is inherently extensible and should be simple
  to implement (especially in perl).
  
@@ -20,25 +31,77 @@ for inter-node communications.
  
  =head1 DESCRIPTION
  
-This protocol is encoded in UTF8 with HTTP style escaping. It is
-designed to be an extensible basis for any type of one to many
+This protocol is
+designed to be an extensible basis for any type of one to many
  "instant" line-based communications tasks.
  
  This protocol is designed to be flood routed in a meshed network in
-as efficient a manner as possible.
+as efficient a manner as possible. The reason we have chosen this
+mechanism is that most L</Messages> need to be broadcast to all nodes.
+
+Experience has shown that nodes will appear and (more infrequently) 
+disappear without much (or any) notice. 
+Therefore, the constantly changing and uncoordinated
+nature of the network doesn't lend itself to fixed routing policies.
+
+Having said that: directed routing is available where routes have
+been learned through past traffic.
+Those L</Messages> that could be routed (mainly single line one to 
+one "talk" L</Messages>) 
+happen sufficiently infrequently that, should they need to be flood routed
+(because no route has been learned yet) it is a small cost overall.
+
+=head1 Messages
+
+A message is a single line of UTF8 encoded and HTTP escaped text 
+terminated in the standard internet manner with a <CR><LF>. 
  
  Each message consists of a L</Routing Section> and a L</Command Section>. 
-The two sections are separated with the '|' character and the whole
-message is terminated in the standard RFC/Internet manner with the
-ascii <carraige return><linefeed> characters. It follows that these
-characters (as well as a small number of other reserved characters)
+The two sections are separated with the '|' character. 
+It follows that this
+character (as well as non-printable characters, <CR>, <LF> and
+a small number of other reserved characters)
  can only be sent escaped. This is described further in the 
-L</Command Section>.
+L</Command Section> and L</Fields>.
  
  Most of this document is concerned with the L</Routing Section>, however
  some L</Standard Commands> which all implementations should issue and
  must accept are described.
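
As a rough illustration of the message layout described above, the following Python sketch splits one received line into its Routing Section fields, its Tag and its data Fields. The name `split_message` is hypothetical and not part of the protocol. The sketch assumes that, because reserved characters are only ever sent escaped, the first '|' on a line is the section separator. The returned fields are still in their escaped form.

```python
def split_message(line: str):
    """Split one protocol line into (routing_fields, tag, command_fields).

    Hypothetical helper: the protocol text only fixes the '|' separator
    and the comma-separated fields, not this function or its return shape.
    """
    # Accept <CR><LF> as well as a bare <LF> terminator.
    line = line.rstrip("\r\n")

    routing, separator, command = line.partition("|")
    if not separator:
        raise ValueError("message has no '|' between routing and command sections")

    routing_fields = routing.split(",")
    # The first comma-separated item of the command section is the Tag;
    # anything after it is the (still escaped) data Fields.
    tag, *command_fields = command.split(",")
    return routing_fields, tag, command_fields


if __name__ == "__main__":
    print(split_message("GB7TLH,3D042506F2,0,G1TLH|HELLO,PClient,1.3\r\n"))
```

Keeping the routing fields as a plain list avoids guessing at the meaning of any slot beyond what the field descriptions in the Routing Section define.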
  
+=head1 Applications
+
+In the past messaging applications such as DX Cluster software have maintained
+a fairly strict division between "nodes" and "users". This protocol attempts
+to get away from that distinction by allowing any entity to connect to any 
+other. 
+
+Applications that use this protocol are essentially all peers and therefore
+nodes. The only real difference between a "node" and a "user" (using this 
+protocol) is that a "node" has one or more listeners running that will,
+potentially, allow incoming connections. A "user" simply becomes an end
+point that never uses the L</FrmUser> or L</ToUser> slots in the 
+L</Routing Section>.
+
+The reason for this is that modern clients are more intelligent than simple
+character based connections such as telnet or ax25. They wish to be able to
+distinguish between the various classes of message, such as: DX spots, 
+announces, talk, logging info etc. It is a pain to have to do it, as now,
+by trying to make sense of the (slightly different for each piece of node 
+software) human readable "user" version of the output. Far better to pass on
+regular, specified, easily computer decodable versions of the message,
+i.e. in this protocol, and leave
+the human presentation to the client software.
+
+Having said that, the protocol allows for traditional, character based,
+connections, as in the past. But it is up to applications
+to service and control that type of connection and to provide human readable
+"user" output. 
+
+One of the legacy, character based connections that will probably have to be
+serviced is that of existing PC protocol based nodes. They should be treated
+as local clients, B<not> as peers in this protocol. It is likely that, in order
+to do this, some extra L</Tag>s will need to be defined at application level. 
+
  =head1 Routing Section
  
  The application that implements this protocol is essentially a line
@@ -47,7 +110,7 @@ effectively a datagram.
  
  It is assumed that nodes are connected to
  each other using a "reliable" streaming protocol such as TCP/IP or
-AX25. Having said that: in context, messages in this protocol could be 
+AX25. Having said that: in context, L</Messages> in this protocol could be 
  multi/broadcast, either "as is" or wrapped in some other framing
  protocol. 
  
@@ -56,10 +119,10 @@ through your node" protocol, there is no guarantee that a message
  will get to the other side of a mesh of nodes. There may be a
  discontinuity either caused by outage or deliberate filtering. 
  
-However, as it is envisaged that most messages will be flood routed or,
-in the case of directed messages (those that have L</To> and/or
+However, as it is envisaged that most L</Messages> will be flood routed or,
+in the case of directed L</Messages> (those that have L</To> and/or
  L</ToUser> fields) down some/most/all interfaces showing a route for that
-direction, it is unlikely that messages will be lost in practice.
+direction, it is unlikely that L</Messages> will be lost in practice.
  
  =head2 Field Description
  
@@ -126,7 +189,7 @@ neighbouring nodes must increment this field before passing
  it on to higher layers for onward processing.
  
  Implementations may have an upper limit to this field and may
-silently drop incoming messages with a L</Hop> count greater than the
+silently drop incoming L</Messages> with a L</Hop> count greater than the
  limit.
  
  =item B<FrmUser>
@@ -212,7 +275,7 @@ tuple. The basic system will learn which interfaces can see what nodes
  by looking at the tuple and merging that with the L</Hop> count. 
  Each interface remembers the latest L</TimeSeq> with the lowest L</Hop>
  for each L</Origin> that arrives on that interface. It also remembers
-the number of messages for that L</Origin> that has been received on
+the number of L</Messages> for that L</Origin> that have been received on
  that interface.
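
One possible reading of the per-interface bookkeeping described above is sketched below in Python. The class and attribute names are hypothetical. The sketch assumes that TimeSeq values can be compared as plain hexadecimal integers (any wrap-around handling is ignored), that a newer TimeSeq replaces the stored entry, and that an equal TimeSeq only lowers the stored Hop. These are assumptions made for illustration, not rules stated by the protocol.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    timeseq: int  # latest TimeSeq seen from this Origin (hex parsed: an assumption)
    hop: int      # lowest Hop seen for that TimeSeq
    count: int    # number of messages received from this Origin


class Interface:
    """Hypothetical per-interface table of what has been heard from each Origin."""

    def __init__(self):
        self.routes = {}

    def learn(self, origin: str, timeseq_hex: str, hop: int) -> None:
        timeseq = int(timeseq_hex, 16)
        entry = self.routes.get(origin)
        if entry is None:
            self.routes[origin] = RouteEntry(timeseq, hop, 1)
            return

        # Every arrival counts, whether or not it improves the route.
        entry.count += 1
        if timeseq > entry.timeseq:
            # Newer message: remember it together with its hop count.
            entry.timeseq, entry.hop = timeseq, hop
        elif timeseq == entry.timeseq and hop < entry.hop:
            # Same message seen again over a shorter path.
            entry.hop = hop


if __name__ == "__main__":
    iface = Interface()
    iface.learn("GB7TLH", "3D02350001", 2)
    iface.learn("GB7TLH", "3D02350001", 1)
    print(iface.routes["GB7TLH"])
```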
  
  Any message for onward broadcast is duplicated and sent out on all
@@ -244,8 +307,8 @@ duplicated!
  =head2 Examples
  
   # on link startup from GB7BAA (both sides hello)
- GB7TLH,3D02350001,0,GB7BAA|HELLO,Aranea,1.2,24.123
- GB7BAA,3D02355421,1,GB7TLH|HELLO,Aranea,1.1,23.245
+ GB7TLH,3D02350001,0|HELLO,Aranea,1.2,24.123
+ GB7BAA,3D02355421,1|HELLO,Aranea,1.1,23.245
  
   # on user startup to GB7TLH
   GB7TLH,3D042506F2,0,G1TLH|HELLO,PClient,1.3
@@ -279,10 +342,32 @@ duplicated!
  
  The L</Command Section> of the message contains the actual data being
  passed. It is called the Command Section because all commands
-are identified with a L</Tag> which is implemented by 
-the software using this protocol.
+are identified with a L</Tag> each of which is implemented by 
+the software using this protocol. Each L</Tag> (usually) is followed by one
+or more L</Fields>. 
+
+=head2 Tag
+
+The L</Tag> consists of a string of uppercase letters and digits, starting
+with a leading, uppercase, letter. Tags should be as short as is meaningful.
+
+Valid tags would be:
+
+ DX
+ PC23
+ ANN
+
+Invalid tags include:
+
+ 1AAA
+ dx
+ Ann
+
+The L</Tag> is separated from its data L</Fields> by a comma ','. 
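
The Tag rule above (uppercase letters and digits, starting with an uppercase letter) can be checked with a short regular expression. The Python sketch below is illustrative only; the name `is_valid_tag` is hypothetical, and it assumes "uppercase letters" means ASCII A to Z.

```python
import re

# One uppercase ASCII letter, then any number of uppercase letters or digits.
TAG_PATTERN = re.compile(r"[A-Z][A-Z0-9]*")


def is_valid_tag(tag: str) -> bool:
    """Return True if tag matches the Tag rule (hypothetical helper)."""
    # fullmatch ensures the whole string matches, not just a prefix.
    return TAG_PATTERN.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ("DX", "PC23", "ANN", "1AAA", "dx", "Ann"):
        print(candidate, is_valid_tag(candidate))
```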
+
+=head2 Fields
  
-The L</Tag> is separated from its data by a comma ','. All fields
+All fields
  in any subsequent data shall be separated by a comma ','.
  All fields shall
  be HTTP encoded such that reserved characters (comma ',', 
@@ -290,7 +375,7 @@ vertical bar '|',
  percent '%', 
  equals '=' 
  and non printable characters less than 127 (or %7F in hex)
-[including newline and carraige return] are tranlated to
+[including newline and carriage return] are translated to
  their two hex digit equivalent preceded by the percent '%' character.
  
  For example:
@@ -308,7 +393,7 @@ are written according to this specification must say:
   use utf8;
  
  A message (or line) is terminated with <carriage return><linefeed>
-0x0d 0x0a. Incoming messages must be accepted even when terminated
+0x0d 0x0a. Incoming L</Messages> must be accepted even when terminated
  with just <linefeed>.
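
A receiver that honours this termination rule might frame incoming data roughly as in the Python sketch below. The function name `extract_lines` is hypothetical. It assumes data arrives as bytes from a stream and that any incomplete trailing line is kept for the next read.

```python
def extract_lines(buffer: bytes):
    """Split a receive buffer into complete messages plus leftover bytes.

    Hypothetical helper: accepts both <CR><LF> and a bare <LF> as the
    terminator, and returns (list_of_decoded_lines, remainder).
    """
    # Everything after the last <LF> is an incomplete line.
    *complete, remainder = buffer.split(b"\n")

    lines = []
    for raw in complete:
        # Drop the <CR> of a <CR><LF> pair if it is present.
        if raw.endswith(b"\r"):
            raw = raw[:-1]
        lines.append(raw.decode("utf-8"))
    return lines, remainder


if __name__ == "__main__":
    print(extract_lines(b"A,1,0|T,one\r\nB,2,0|T,two\nC,3,0|T,thr"))
```

Outgoing messages would still be terminated with the full 0x0d 0x0a pair.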
  
  Care must be taken to make sure that fields have any reserved characters
@@ -325,23 +410,6 @@ specified above and can otherwise contain any character.
  There is no maximum size specified for a message. It is up to each
  implementation to enforce one (if only for their own protection).
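
The field escaping rule in this section can be sketched in Python as follows. The function names are hypothetical. The sketch assumes that "non printable characters" means code points below 0x20 plus 0x7F, that the two hex digits are written in uppercase, and that characters above 127 are passed through unchanged as UTF8 text. None of these three points is stated explicitly above.

```python
import re

# Reserved characters named in the text: comma, vertical bar, percent, equals.
RESERVED = set(",|%=")


def escape_field(text: str) -> str:
    """Escape one field for transmission (hypothetical helper)."""
    out = []
    for ch in text:
        code = ord(ch)
        # Assumption: control characters and DEL count as non printable.
        if ch in RESERVED or code < 0x20 or code == 0x7F:
            out.append(f"%{code:02X}")
        else:
            out.append(ch)
    return "".join(out)


def unescape_field(text: str) -> str:
    """Reverse escape_field: turn each %XX back into its character."""
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


if __name__ == "__main__":
    escaped = escape_field("5=9, 100% | done\r\n")
    print(escaped)
    print(repr(unescape_field(escaped)))
```

Because '%' is itself escaped on the way out, a single pass over the received text is enough to decode a field.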
  
-=head2 Tag
-
-The L</Tag> consists of string of uppercase letters and digits, starting
-with a leading, uppercase, letter. Tags should be as short as is meaningful.
-
-Valid tags would be:
-
- DX
- PC23
- ANN
-
-Invalid tags include:
-
- 1AAA
- dx
- Ann
-
  =head2 Standard Commands
  
  There are a number of L</Standard Commands> which must be accepted by 
@@ -349,21 +417,27 @@ all implementations.
  
  =over
  
-=item B<HELLO>,<software name>,<version>,<build>,<comments>
+=item B<HELLO>
+
+ HELLO,<software name>,<version>,<build>,<comments>
  
  Command sent on connection to another node. Both sides send their information
  to the other. All the possible arguments are optional, although some of the
  arguments should be sent in order to help diagnose problems. This command is
  broadcast.
  
-=item B<BYE>,<comments>
+=item B<BYE> 
+
+ BYE,<comments>
  
  Command sent to all connections when the software is shutting down. This is sent
  by the node just before shutdown occurs. This is really only used to help the
  network prune its routing tables. It isn't a requirement. The <comments> field
  is optional. 
  
-=item B<DISC>,<node name>,<comments>
+=item B<DISC>
+
+ DISC,<node name>,<comments>
  
  Command sent when a node has disconnected from this node. This message is sent when
  an interface shuts down. It need not be sent if a L</BYE> from an interface for
@@ -372,15 +446,32 @@ that node has just been received. This command should be broadcast.
  The <node name> is mandatory and is the name of the interface that has just 
  disconnected.
  
-=item B<PING>,<ping id>
+=item B<PING>
+
+ PING,<ping id>
  
  Command to send a ping to a node or user. This command is used both by the software
  and users to determine a) whether a node or user exists and b) how good the path is
-between them.
+between them. 
+
+The <ping id> is a unique string which is usually the hexadecimal equivalent of an 
+integer that is incremented every time it is used. But it can be anything that
+will identify this ping using the tuple (L</Origin>,<ping id>) as unique.
+
+=item B<PONG>
+
+ PONG,<ping id>,<no of hops on ping>
+
+Command to reply to a ping. This is sent as a reply to an incoming ping command.
+The <ping id> is the one supplied and the <no of hops on ping> is the number of
+hops it took for the ping to arrive.
+
+=item B<T>
  
-=item B<PONG>,<ping id>,<no of hops on ping>
+ T,<text>
  
-Command to reply to a successful ping
+All implementations must be able to send "text" (encoded as specified in 
+L</Fields>). There would be little point in doing all this otherwise!
  
  =back
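
To illustrate the PING and PONG commands listed above, the Python sketch below builds only their Command Sections; the Routing Section is left out. All names are hypothetical. It follows the "usual" scheme mentioned for <ping id>, an incrementing integer written in hexadecimal, and assumes uppercase hex digits.

```python
import itertools


class PingIds:
    """Hypothetical generator of ping ids: an incrementing integer in hex."""

    def __init__(self, start: int = 1):
        self._counter = itertools.count(start)

    def next_id(self) -> str:
        # Assumption: uppercase hex digits, no prefix.
        return format(next(self._counter), "X")


def make_ping(ping_id: str) -> str:
    """Command Section for a ping: PING,<ping id>."""
    return f"PING,{ping_id}"


def make_pong(ping_id: str, hops_on_ping: int) -> str:
    """Command Section for the reply: PONG,<ping id>,<no of hops on ping>."""
    return f"PONG,{ping_id},{hops_on_ping}"


if __name__ == "__main__":
    ids = PingIds()
    ping_id = ids.next_id()
    print(make_ping(ping_id))
    # The receiver echoes the id and reports the Hop count the ping arrived with.
    print(make_pong(ping_id, 3))
```

A sender that wants to match replies to requests could keep its outstanding pings keyed by the (Origin, ping id) tuple described under PING.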