SDD Software Manual
Version 2.2.0 February 24, 1999
The backbone software library can be used to:
- Use existing SDD streams with an SDD Receiver
- Organize a distributed SDD stream with the Redistributor component
- Create new SDD streams with the SddTransmitter base class
- Create new SDD streams with the SddTranslator program
1. Use existing SDD streams with an SDD Receiver
Envisioning a new application for an existing SDD stream
The ITS Backbone at UW makes data streams from multiple sources
available in a common format called Self-Describing Data, or SDD.
The following data streams are currently available:
| SDD stream | ITS Backbone host and port |
| Seattle traffic flow | sdd.its.washington.edu 8411 |
| King County Metro bus locations | sdd.its.washington.edu 8412 |
| Seattle Center parking garage occupancy | sdd.its.washington.edu 8413 |
These data streams have provided the raw material for a variety of
applications and experiments, including a map showing highway congestion,
a historical database of traffic measurements, and a kiosk at bus stops that
shows bus arrival information. Many applications that analyze or visualize
these data streams remain to be written. There are
several ways to implement a new application for an SDD stream.
Terminology
In order to make use of these data streams, a program must have knowledge of
the SDD protocol. A client of an SDD data stream is called an SDD Receiver, and
the provider of the stream is called an SDD Transmitter. This is because, after
a connection is initiated by a Receiver, all communication is one-way from the
Transmitter to the Receiver.
SddFlash
Much like the “ftp” client included in many operating systems, the
SddFlash program offers a command-line interface to inspect any
SDD stream. Its output can be directed into a file or into any
other program for filtering or analysis. Files containing output
from SddFlash and named with the suffix “.csv” can be opened directly
by Microsoft Excel and many other programs.
Usage: java its.app.SddFlash [-stream] hostname port [table]
This application prints a table from an SDD stream to standard output
in a tabular format (comma separated, with column headers on the first line).
The default behavior is to print the rows of data available from a single
SDD frame, and then terminate. This non-streaming behavior is useful
for checking the status of a stream from the command line, or in other
on-demand situations such as CGI scripts.
The table parameter is optional, and if omitted, a list of the tables
available in the SDD stream will be printed. When the table parameter
is specified, it can name a table whose data originates in either the
Contents or Data frame of the SDD stream. More than one table can be
specified, in which case the data for each will be printed, separated
by a blank line.
If the -stream parameter is specified, the program does not terminate,
but instead keeps listening for incoming data frames and printing the
new rows of data it finds in each one. In this mode, the table or
tables specified must originate in the SDD Data frame. If only a single
table is specified, the column headers will only be printed once.
SddFilter
SddFilter is a program designed to examine the output of SddFlash and
retain only those lines that match user-specified criteria. Other
filter utilities such as grep (on Unix) and findstr (on Windows NT)
can often be used just as effectively, but SddFilter allows the individual
columns of data in each row to be examined and used in an arbitrary expression.
An interpreter for the Perl language, version 5.0 or greater, is required to
run SddFilter.
Usage: perl SddFilter.pl criterion ...
Each criterion is a Perl expression involving the column names in
the input SDD stream, which is read from standard input. SddFilter expects
input similar to that produced by SddFlash (comma-separated, with
column names on the first line).
For example, to view the highway sensors currently reporting a
speed greater than 80 miles per hour:
java its.app.SddFlash -stream sdd.its.washington.edu 8411 speed_trap_data | perl SddFilter.pl "speed > 80"
To view the progress of all buses currently serving Metro route 43:
java its.app.SddFlash -stream sdd.its.washington.edu 8412 avl_data | perl SddFilter.pl "svc_route == 43"
To receive notification when a particular highway sensor reports speeds greater than 80 mph:
java its.app.SddFlash -stream sdd.its.washington.edu 8411 speed_trap_data | perl SddFilter.pl "sensor_id eq 'ES-081D:_MSH_T5' and speed > 80"
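Where a Perl interpreter is not available, the simplest kind of criterion (one column equal to one value) can be approximated in a few lines of Java. The sketch below is not part of the SDD library; the class name CsvColumnFilter is hypothetical, and it assumes that fields never contain quoted or escaped commas. It does not evaluate arbitrary expressions the way SddFilter does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Sketch only: passes through rows of SddFlash-style output whose named column equals a value. */
final class CsvColumnFilter {

    /**
     * Reads comma-separated text whose first line holds the column names.
     * Emits the header line, then each row in which the given column equals the wanted value.
     * Assumption: no quoted or escaped commas inside fields.
     */
    static void filter(BufferedReader in, String column, String wanted, Consumer<String> out) {
        try {
            String header = in.readLine();
            if (header == null) {
                return; // empty input, nothing to filter
            }
            String[] names = header.split(",", -1);
            int index = -1;
            for (int i = 0; i < names.length; i++) {
                if (names[i].trim().equals(column)) {
                    index = i;
                    break;
                }
            }
            if (index < 0) {
                throw new IllegalArgumentException("no such column: " + column);
            }
            out.accept(header);
            String line;
            while ((line = in.readLine()) != null) {
                String[] fields = line.split(",", -1);
                if (index < fields.length && fields[index].trim().equals(wanted)) {
                    out.accept(line); // emitted immediately, so -stream input works too
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // args[0] is the column name, args[1] the value to match
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        filter(stdin, args[0], args[1], System.out::println);
    }
}
```

It would be used in place of the Perl script at the end of the pipe, for example:
java its.app.SddFlash -stream sdd.its.washington.edu 8412 avl_data | java CsvColumnFilter svc_route 43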
SddAutoExtractReceiver
This demonstration SDD Receiver program has the following two features:
- Displays a graphical input form, which avoids the need to specify command-line parameters
- Places each type of incoming SDD data into a file
Usage: java its.app.SddAutoExtractReceiver
Writing a custom SDD Receiver in Java
It’s easy to build support for receiving an SDD data stream directly
into a Java program. This approach offers the greatest amount of
flexibility and performance in building a new application.
- Write a Java class that extends its.backbone.sdd.SddReceiver.
- In the constructor of the new class, call the superclass constructor
with the hostname and port of the desired SDD stream.
- The new class can override any of the following methods of its superclass:
schemaReceived, contentsReceived, dataReceived, extractorReceived, or
extractedDataReceived.
- The ordering of calls to these methods is
determined by the rules of the SDD protocol: the schema frame is
always first to arrive, followed by the contents frame, an optional
extractor frame, and then a continuous stream of data frames. The
SDD Receiver base class generates calls to both the dataReceived and
extractedDataReceived methods each time a data frame arrives.
- Using the ContentsData objects passed to the
extractedDataReceived method, the program can examine any of the tables in the
incoming data stream.
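The ordering rule in the list above can be pictured as a small state machine. The following sketch is purely illustrative and does not use the backbone library: the class and enum names are hypothetical, and it assumes that a fresh Contents frame may also arrive later in the stream, as the transmitter section of this manual suggests.

```java
/** Sketch only: checks that frames arrive in the order the manual describes. */
final class FrameOrderChecker {

    enum Frame { SCHEMA, CONTENTS, EXTRACTOR, DATA }

    // Most recent frame accepted; null until the schema frame has arrived.
    private Frame last = null;

    /** Returns true and records the frame if it is allowed at this point in the stream. */
    boolean accept(Frame next) {
        boolean ok;
        if (next == Frame.SCHEMA) {
            ok = (last == null);                 // the schema frame is always first
        } else if (next == Frame.CONTENTS) {
            ok = (last != null);                 // assumption: contents may be re-sent mid-stream
        } else if (next == Frame.EXTRACTOR) {
            ok = (last == Frame.CONTENTS);       // the optional extractor follows the contents frame
        } else {
            ok = (last == Frame.CONTENTS || last == Frame.EXTRACTOR || last == Frame.DATA);
        }
        if (ok) {
            last = next;
        }
        return ok;
    }

    public static void main(String[] args) {
        FrameOrderChecker checker = new FrameOrderChecker();
        Frame[] stream = { Frame.SCHEMA, Frame.CONTENTS, Frame.DATA, Frame.DATA };
        for (Frame f : stream) {
            System.out.println(f + " accepted: " + checker.accept(f));
        }
    }
}
```

A receiver subclass can rely on this ordering: by the time dataReceived or extractedDataReceived is called, the schema and contents callbacks have already run.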
2. Organize a distributed SDD stream with the Redistributor component
About the Redistributor
The backbone software library contains an implementation of the Redistributor
component described in the ITS Component Architecture paper. It functions as a
client of any SDD stream, and as a server of that same stream to multiple connected
clients. The server implementation uses lightweight threads inside a single Java
process to efficiently handle clients with different capabilities.
Given a reasonably powerful server computer on which to run it, the Redistributor
is capable of serving hundreds of simultaneously connected clients.
Running the Redistributor
Usage: java its.app.Redistributor requestPort serverName serverPort
The requestPort parameter specifies the port on which the Redistributor
will accept connections from clients.
The serverName and serverPort parameters specify the server of the SDD
stream which the Redistributor will join. A connection will be made to
the specified server and port, and the data received will be passed
along to any clients that have connected.
To satisfy the rules of the SDD protocol, the Redistributor maintains a
copy of the most up-to-date Schema, Contents, and Extractor frames it
has received. It sends these frames to any client that connects, before
starting to send the data frames as they arrive.
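The caching behavior described above can be sketched as follows. This is an illustration of the idea rather than the Redistributor's actual code: the class name FrameReplayCache is hypothetical, and frames are modelled as plain strings for simplicity.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Sketch only: remembers the newest Schema, Contents, and Extractor frames for late joiners. */
final class FrameReplayCache {

    enum Kind { SCHEMA, CONTENTS, EXTRACTOR, DATA }

    // Newest frame seen for each cached kind; absent until one arrives.
    private final Map<Kind, String> latest = new EnumMap<>(Kind.class);

    /** Records an incoming frame. Data frames are passed along live and are not cached here. */
    synchronized void onFrame(Kind kind, String frame) {
        if (kind != Kind.DATA) {
            latest.put(kind, frame);
        }
    }

    /** Frames a newly connected client should receive before any live Data frames. */
    synchronized List<String> replayForNewClient() {
        List<String> out = new ArrayList<>();
        for (Kind kind : new Kind[] { Kind.SCHEMA, Kind.CONTENTS, Kind.EXTRACTOR }) {
            String frame = latest.get(kind);
            if (frame != null) {
                out.add(frame); // the extractor is optional, so it may be missing
            }
        }
        return out;
    }
}
```

Because only the newest frame of each kind is kept, a client that connects after a Contents update receives the updated Contents frame and never the stale one.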
3. Create new SDD streams with the SddTransmitter base class
Create the schema
Design the schema for the application. This process is similar
to creating a database intended to store the changing state of
a system over time. Classify the volatility of the tables in the schema.
Some of the tables may be designed to contain data that never change or
that change infrequently. The data for these tables should be
transmitted in an SDD Contents frame. Presumably the schema has one
or more tables intended to describe the real-time state of some part
of the system, and these are the tables that should be populated by
the SDD Data frames.
Choose a representation for the data frame
The format for the Schema frame must follow the definition of
the SDD Schema language, and the format for the Contents frame
must follow the definition of the SDD Contents language.
However, the format of the Data frame is flexible.
For most applications, the best choice will be to use the
SDD Contents language to describe the Data frame.
This frees the author of the transmitter from creating an
Extractor object, since the SDD Receiver already knows how
to interpret Data frames written in the Contents language.
Other applications may choose to define their own data format, for reasons such as:
- encapsulating a legacy data structure within the SDD protocol
- sending data unsupported by the SDD Contents language, such as images
- performance or bandwidth considerations
To enable these scenarios, the SDD protocol specifies a way to
describe the application’s data format in the Extractor frame.
Write the Extractor (optional)
The Extractor is a Java object with a well-known interface.
A transmitter that defines its own data format must also create
an Extractor object that understands that format.
The compiled Extractor object is packaged in a zip file and
sent with the SDD stream in the Extractor frame, and SDD
receivers know how to instantiate the object and ask it
to interpret the data frame. A drawback of this technique
is that it relies on Java’s “Class Loader” mechanism and
therefore restricts the choice of implementation language
for SDD receivers.
To create an Extractor, write a Java class that extends
its.backbone.domain.DataFactory. The name of the class must end with the text
“DataFactory” in order for the SDD receiver to recognize it. The class must reside
in the default Java package; that is, it must not contain a “package” statement.
See the DataFactory documentation for more details on implementing its three methods
and packaging the resulting class.
Extend the SddTransmitter base class
Write a Java class that extends its.backbone.sdd.SddTransmitter. The superclass
provides methods that format and synchronize the redistribution of the four
types of SDD frames. The application-specific subclass is responsible for
starting the program (by providing a “main” method) and for creating the
necessary schema, contents, data, and extractor information as described above.
This may involve a significant amount of processing to translate the system’s
state into Contents and Data frames that describe it. The typical SDD
transmitter specifies the Schema, Contents, and Extractor frames to its
superclass when it first begins running, then settles into a continuous process
of examining the state of the system and passing new Data frames to its
superclass at a fixed interval. Occasionally the transmitter may notice a change
in state that causes it to create a new Contents frame for its superclass to
distribute. The superclass implements the rules of the SDD distribution protocol:
it listens for connections and sends connected listeners the frames it
currently knows about, in the correct order.
Assessing data transmission frequency
Each data stream has its own characteristic interval between data
frames. For sensor-based systems, an SDD stream usually
transmits the real-time state of the system at a fixed interval of less than a
minute. It is also possible to build an SDD stream that transmits
information asynchronously, for example only when a change occurs or
only when an operator enters new information. The setTimeout
and startPulse methods of the SddTransmitter base class should be
used to ensure correct behavior of the data stream for any type of update
frequency.
Example transmitter: a system for monitoring parking garages
An effort is underway to distribute the real-time status of
several Seattle-area parking garages via SDD.
Potential clients for this information would be variable-message
signs on the surrounding streets and web sites for events taking place nearby.
The first step in the process was to define the schema.
Two tables were deemed necessary: one with a row to describe each garage,
and one with a row for each status report from a garage.
CREATE TABLE PARKING_GARAGE (
    GARAGE_ID SMALLINT NOT NULL PRIMARY KEY,
    DESCRIPTION CHAR(50) NOT NULL,
    CAPACITY SMALLINT NOT NULL
)
CREATE TABLE PARKING_GARAGE_DATA (
    GARAGE_ID SMALLINT NOT NULL,
    DATA_TIME CHAR(18) NOT NULL,
    STATUS CHAR(15) NOT NULL,
    STATE CHAR(15) NOT NULL,
    AVAILABLE_SPACES SMALLINT NOT NULL,
    ADVISORY_TEXT CHAR(255),
    PRIMARY KEY (GARAGE_ID, DATA_TIME),
    FOREIGN KEY (GARAGE_ID) REFERENCES PARKING_GARAGE
)
Next, a sample Contents frame was envisioned:
TABLE PARKING_GARAGE
COLUMN (GARAGE_ID, DESCRIPTION, CAPACITY)
1, 'First Avenue Garage', 656;
2, 'Mercer Garage', 1433;
3, 'Fifth Avenue Parking', 1000;
And for that Contents frame, a corresponding data frame would look like:
TABLE PARKING_GARAGE_DATA
COLUMN (GARAGE_ID, DATA_TIME, STATUS, STATE, AVAILABLE_SPACES, ADVISORY_TEXT)
1, '19981023121416918', 'ok', 'full', 1, NULL;
2, '19981023121416918', 'ok', 'full', 2, NULL;
3, '19981023121416918', 'ok', 'full', 1, NULL;
Since the data frame is written in the SDD Contents language,
it was not necessary to create an Extractor. Given this
design, and an interface to the existing measurement system,
it was a quick process to write the transmitter in Java.
See the source code for its.app.DemoTransmitter, which
creates an SDD stream that contains simulated parking-lot
occupancy data.
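For readers who want to see how text in the style of the sample frames above might be generated, here is a minimal sketch. It is based only on those samples and not on the SDD Contents language definition: the class name ContentsFrameWriter is hypothetical, numbers are written bare, other values are wrapped in single quotes, and null becomes NULL. Because the samples do not show how a quote inside a string is escaped, the sketch rejects such values instead of guessing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch only: formats rows in the style of the sample Contents and Data frames. */
final class ContentsFrameWriter {

    /** Builds a TABLE line, a COLUMN line, and one "value, value, ...;" line per row. */
    static String format(String table, List<String> columns, List<List<Object>> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("TABLE ").append(table).append('\n');
        sb.append("COLUMN (").append(String.join(", ", columns)).append(")\n");
        for (List<Object> row : rows) {
            if (row.size() != columns.size()) {
                throw new IllegalArgumentException("row has " + row.size() + " values, expected " + columns.size());
            }
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(literal(row.get(i)));
            }
            sb.append(";\n");
        }
        return sb.toString();
    }

    /** Renders one value: NULL, a bare number, or a single-quoted string. */
    static String literal(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Number) {
            return value.toString();
        }
        String text = value.toString();
        if (text.indexOf('\'') >= 0) {
            // Escaping rules are not shown in the samples, so this sketch does not guess them.
            throw new IllegalArgumentException("quotes inside strings are not handled by this sketch");
        }
        return "'" + text + "'";
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.<Object>asList(1, "First Avenue Garage", 656);
        List<List<Object>> rows = Collections.singletonList(row);
        System.out.print(format("PARKING_GARAGE", Arrays.asList("GARAGE_ID", "DESCRIPTION", "CAPACITY"), rows));
    }
}
```

The real transmitter should be checked against the SDD Contents language definition before relying on output produced this way.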
4. Create new SDD streams with the SddTranslator program
In most cases, the main difficulty in creating a new SDD stream is designing the
interface to the existing system that produces the real-time data. Typically these
systems are written in C, and a network interface has to be implemented to communicate
the data updates to the Java program that contains the SddTransmitter. This scenario
has been encountered often enough that a standard implementation technique has emerged.
Reliable networking code in itsframe.c
The objective is to provide a socket interface to your C program, which will be
used to provide real-time data updates. This will not be an SDD stream, but will
serve as the data source for an SDD transmitter, which can be more conveniently
implemented in Java with the SDD library.
- The SDD software library contains a source file in the ItsFrame
directory, named itsframe.c, which can be compiled into your C program.
- itsframe.c is written for Windows NT, using NT-specific
routines for multithreaded network programming. Please contact
sdd@its.washington.edu if you would like it ported to another
platform.
- Although any type of data can be sent through this interface,
in most cases tabular data should be sent as an
array of simple structs, where a single struct in the array
represents one row of data in a table. Each struct field should
be either a primitive data type, a fixed-length character array, or a
timestamp as returned by the system call time().
- The public interface to this code is simple and well-documented.
First call itsuwInitServer to start listening on a socket.
Next, call itsuwSendFrame at any time to provide updates for any
of your data types. The actual sending of the data is done on
a separate thread, so your program won't have to wait. When you
want to end your program, call itsuwTerminateServer.
- Aside from handling the socket for you, itsframe.c also
provides important connection maintenance behavior. If
the client drops the connection and then reconnects (an event
which is likely to happen), copies of all the most recent
data you have provided with itsuwSendFrame will be re-sent.
See comments in itsframe.c for more information.
- Your program controls the rate at which new data is
sent, by the frequency with which it calls itsuwSendFrame. It is also your
choice whether to include only data that has changed, or to send a complete
description of system state at a fixed interval.
SddTranslator
SddTranslator is a program within the SDD library which automates
the production of a new SDD stream, to the greatest extent possible.
It will convert the data structures provided by your system into an
SDD stream, or multiple SDD streams. The input data structures and
the output SDD stream are both described in a configuration file, and
the program takes care of everything in between. In addition to the
conversion of your data to SDD's textual representation, the data is
verified against the schema to ensure the integrity of the data stream. Following is a
description of each of the sections in the configuration file; see
also the sample configuration files included with SddTranslator to
learn the syntax.
- [InputSourceHost] - this is the name of the computer where
your legacy system is running and sharing its data as specified
above.
- [InputSourcePort] - this is the port number the legacy
system is listening on (also specified when calling itsuwInitServer).
- [InputByteOrder] - can be either intel or network. If intel is specified,
SddTranslator will perform byte-swapping on all integer values it reads
as part of an InputFrame. If this property is not present in the config
file, the default is to assume network byte order, meaning that no byte swapping
takes place.
- [SerialNumberFile] - the name of a file that will be created by
SddTranslator to store internal state between sessions.
- [LogFile] - the name of a file that will be created by SddTranslator
to log operating information, warnings, and errors.
- [InputFrames] - for each of the frame types received from the
legacy system, a description of the fields in each record. The
contents of each frame must be, as described above, an array
of structs with fixed-length fields.
  - The first line of an InputFrames description looks like:
      frame <id> <tablename> [skip_header=<length>]
    where <id> is the id assigned to this data type when it was passed to
    itsuwSendFrame, and <tablename> is the name of the table that will be
    created in an SDD stream. The optional byte_order
    specification instructs SddTranslator to perform byte-swapping on all
    integer values if the input frame is in intel byte order. The optional
    skip_header specification instructs SddTranslator to skip the given number
    of bytes at the beginning of this input data type.
  - Subsequent indented lines describe fields in this record, and have the format:
      <fieldname> <type> <length>
    where <fieldname> is the name of the field, corresponding to a
    column name in the associated table, <type> is one of the datatypes
    string, int, uint, float, double, skip, enum, timestamp, time_t,
    and <length> is the length of the field in bytes.
    - int and uint correspond to signed and unsigned integer types
      of length 1, 2, or 4.
    - string corresponds to a fixed-length character array. The string
      may be null-terminated, in which case the remaining bytes in the
      array (up to the given length) will be ignored.
    - float and double are IEEE floating-point numbers of length
      4 and 8, respectively.
    - time_t is a 4-byte representation of a timestamp as specified
      by the C standard library functions. It will be converted
      into the standard SDD timestamp: a 17-character string of the
      format "YYYYMMDDhhmmssnnn".
    - timestamp signals that the SDD-format timestamp should be
      generated for each record. The given length must be 0,
      since no input is consumed to produce this field.
    - enum is the same as int, in that an integer value of length 1,
      2, or 4 will be read from the input, but the integer will be
      transformed into a string in the output by a given mapping.
      The mapping is on subsequent indented lines, and each line has
      the format:
        name = value
- [Transmitters] - entries in this section describe the data streams that
will be served by this instance of the SddTranslator program.
  - The first line of a transmitter description looks like:
      transmitter <port> <name> [timeout=<tSec>] [pulse=<pSec>] [allow=<hostList>]
    where <port> is the port on which this data stream will be
    available to clients, and <name> is the name of the data stream.
    The optional timeout and pulse specifications allow the data stream timeout
    and pulse values to be set. For more information, see class library
    documentation for SddTransmitter.setTimeout(int) and
    SddTransmitter.startPulse(int). The optional allow parameter can reference
    another section in the configuration file that contains IP addresses of hosts
    that are allowed to connect to this transmitter. If not present, any host
    is allowed to connect.
  - Subsequent indented lines describe the different parts of the
    outgoing SDD stream. In general, they have the format:
      <schema|contents|data> <static|dynamic> <name>
    - A static schema is required for each SDD transmitter. The
      given <name> field must reference a section in the configuration
      file that contains an SDD schema. An SDD schema contains one or
      more CREATE TABLE statements as specified in the SQL data definition
      language.
    - An SDD transmitter may have one or more static contents tables.
      If so, the given <name> field must reference a section in the
      configuration file that contains the static contents data,
      in the format specified by the SDD Contents language.
    - An SDD transmitter may have any number of dynamic
      contents or data tables. In this case, the given <name> field must
      match a <tablename> field from the [InputFrames] section, thereby
      linking dynamic incoming data to an outgoing table in the SDD stream.
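The time_t conversion described in the [InputFrames] section can be illustrated with a short sketch. This is not SddTranslator's code: the class name SddTimestamp is hypothetical, and because this manual does not say which time zone the translator applies, the zone is passed in explicitly. A time_t value carries whole seconds only, so the millisecond digits (nnn) come out as 000.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Sketch only: formats a C time_t value as a "YYYYMMDDhhmmssnnn" string. */
final class SddTimestamp {

    // Year, month, day, hour, minute, second, then three digits of milliseconds.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");

    /** Converts seconds since 1970-01-01T00:00:00Z to the timestamp text, in the given zone. */
    static String fromTimeT(long secondsSinceEpoch, ZoneId zone) {
        return FORMAT.format(Instant.ofEpochSecond(secondsSinceEpoch).atZone(zone));
    }

    public static void main(String[] args) {
        // The zone is an assumption; UTC is used here only to keep the result predictable.
        System.out.println(fromTimeT(0L, ZoneOffset.UTC)); // prints 19700101000000000
    }
}
```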
When the configuration file is complete, it's time to fire up your new SDD stream.
Usage: java its.app.trans.SddTranslator <MyStreamConfig.txt>
It's likely when the translator is run for the first time against data produced
by your system that you will see some errors from the SDD parser. It checks
each row of data in the stream to make sure it matches the proper table definition.
Go back to your configuration file and make sure that each data type defined
in [InputFrames] correctly specifies the data written out by your system, and that
each field maps directly to a column with the same name in the schema.
If you don't see any error messages, then SddTranslator has successfully established
a connection to its data source, and is translating the input frames to an SDD stream
as specified in the configuration file. You can test the new data stream by
connecting to it with SddFlash, the standard SDD client program.
Example SddTranslator configuration file
Let's return to the example of the SDD stream for Seattle parking garage information.
The actual system that was put in place involved several C++ programs acting together
on a single Windows NT computer, monitoring interfaces to the sensor hardware and
providing an operator interface to change the variable message signs outside the
garages. In order to provide the system state as an SDD stream, another C++ program
was written that provided two types of data structures over a socket: in frame type
1, the number of currently available spaces in each garage, and in frame type 2, an
enumeration of the parking garages. It updates frame type 1 every 20 seconds, and
updates frame type 2 only once.
SddTranslator was then used to translate these input frames into an SDD stream with
a schema equivalent to the example previously shown, apart from the table and column
names. Here is the config file for the parking data stream.
[InputSourceHost]
sdd.its.washington.edu
[InputSourcePort]
8499
[SerialNumberFile]
c:/temp/ParkingTransmitterSN.txt
[LogFile]
c:/temp/ParkingTransmitter.log
[InputFrames]
frame 2 parking_lot
    id uint 2
    description string 50
    capacity uint 2
frame 1 parking_lot_data
    id uint 2
    data_time string 18
    status string 15
    state string 15
    available_spaces uint 2
    advisory_text string 255
[Transmitters]
transmitter 8413 Seattle Center Parking Lots
    schema static ParkingSchema
    contents dynamic parking_lot
    data dynamic parking_lot_data
[ParkingSchema]
CREATE SCHEMA
CREATE TABLE PARKING_LOT (
    ID SMALLINT NOT NULL PRIMARY KEY,
    DESCRIPTION CHAR(50) NOT NULL,
    CAPACITY SMALLINT NOT NULL
)
CREATE TABLE PARKING_LOT_DATA (
    ID SMALLINT NOT NULL,
    DATA_TIME CHAR(18) NOT NULL,
    STATUS CHAR(15) NOT NULL,
    STATE CHAR(15) NOT NULL,
    AVAILABLE_SPACES SMALLINT NOT NULL,
    ADVISORY_TEXT CHAR(255),
    PRIMARY KEY (ID, DATA_TIME),
    FOREIGN KEY (ID) REFERENCES PARKING_LOT
)
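To make the [InputFrames] field descriptions more concrete, the sketch below decodes one record of the parking_lot frame from the configuration above (id uint 2, description string 50, capacity uint 2, for 54 bytes in total). It only illustrates the kind of work SddTranslator does and is not its actual code. The class name ParkingLotRecord is hypothetical, and the ASCII character encoding is an assumption. In Java terms, intel byte order corresponds to ByteOrder.LITTLE_ENDIAN and network byte order to ByteOrder.BIG_ENDIAN.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Sketch only: one decoded row of the parking_lot input frame. */
final class ParkingLotRecord {

    static final int LENGTH = 2 + 50 + 2; // bytes per record, from the field lengths in the config

    final int id;
    final String description;
    final int capacity;

    ParkingLotRecord(int id, String description, int capacity) {
        this.id = id;
        this.description = description;
        this.capacity = capacity;
    }

    /** Decodes a single fixed-length record using the given byte order. */
    static ParkingLotRecord decode(byte[] record, ByteOrder order) {
        if (record.length < LENGTH) {
            throw new IllegalArgumentException("record too short: " + record.length + " bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(record).order(order);
        int id = buf.getShort() & 0xFFFF;        // uint 2: mask to undo sign extension
        byte[] text = new byte[50];              // string 50: fixed-length character array
        buf.get(text);
        int capacity = buf.getShort() & 0xFFFF;  // uint 2
        return new ParkingLotRecord(id, fixedString(text), capacity);
    }

    /** Reads up to the first null byte; any bytes after the terminator are ignored. */
    static String fixedString(byte[] raw) {
        int end = 0;
        while (end < raw.length && raw[end] != 0) {
            end++;
        }
        return new String(raw, 0, end, StandardCharsets.US_ASCII); // encoding is an assumption
    }
}
```

Decoding the same bytes with the wrong byte order yields plausible-looking but incorrect integers (an id of 2 reads as 512), which is one reason to check the first rows of a new stream with SddFlash.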