Harlequin RIP SDK
RIP Farm: API and Library to split jobs across multiple RIPs

The Harlequin RIP Farm API and libraries provide support for running a farm of multiple RIPs, on multiple host machines, and managing input to and output from the farm.

Modules

 RIP Farm Raster backend to Raster Manager API
 RIP Farm raster backend to Raster Manager API.
 
 Interface for sending messages to the RIP farm
 
 Interface for RIP farm control
 Interface for starting and stopping the DFE Interface reactor.
 
 Interface for receiving messages from the RIP farm
 
 RIP Farm Test Tool
 The RIP Farm Test Tool is provided as a source code example demonstrating how to use the RIP Farm API.
 

Files

file  rf_error.h
 Error structures and names for the Scalable RIP.
 
file  rf_library.h
 The RIP farm DFE Interface.
 
file  rf_location.h
 Functions to assist with RIP farm location IDs.
 
file  rf_log_const.h
 Constants for RIP Farm diagnostic logging control.
 
file  rf_mem.h
 Farm process memory and utility functions.
 
file  rf_reply_msgs.h
 The RIP farm DFE Interface.
 
file  rf_types.h
 RIP Farm API type definitions.
 
file  rf_version.h
 The RIP Farm API version.
 

Detailed Description

The Harlequin RIP Farm API and libraries provide support for running a farm of multiple RIPs, on multiple host machines, and managing input to and output from the farm.

When interfacing to the RIP Farm, a DFE (Digital Front End) or DBE (Digital Back End) will link to the RIP Farm dynamic library (libripfarm) and the required internal messaging libraries (ZeroMQ, CZMQ, and Jansson, all provided as dynamic libraries by Global Graphics). The RIP Farm API is self-contained: your DFE or DBE applications can use the RIP Farm API without including any of the Harlequin RIP or SDK include files or libraries. OEMs using a single RIP can ignore all of the RIP Farm API headers and libraries.

The RIP Farm API is based around asynchronous message passing. DFEs and DBEs will send requests to the RIP Farm, and then generally receive replies at a later time. The RIP Farm library includes thread and message loop abstractions and callbacks to make these interactions simpler. Multiple processes can connect to the RIP Farm simultaneously to submit and manage jobs, consume rasters, and/or monitor status of the RIP Farm.

For high-performance RIP Farm integrations, you will still need to build a raster output handler (see Outputting rasters) for the RIPs used in the RIP Farm. The RIP Farm API includes functions for the raster output handler to communicate with the RIP Farm, to inform it of the location, type, and naming of raster objects. You are responsible for the storage and transport of rasters and all persistent data in the RIP Farm.

A test tool is provided as source code to illustrate interactions with the RIP Farm using the RIP Farm API.

The Harlequin Scalable RIP is implemented using the RIP Farm components.

Configuring the RIP Farm

The RIP Farm has various configuration options, defined using JSON, that can affect its behavior. The JSON configuration options are passed to the RIP Farm during the rf_iface_start() startup call. Several different methods are available to pass the JSON configuration options to the RIP Farm:

The default configuration file for the Scalable RIP contains a number of useful comments about RIP Farm JSON configuration parameters, and should be used as a reference for the configuration structure and types. The RIP Farm configuration parameters that are probably of most interest are examined in the next few sections.

Network options

The RIP Farm uses ZeroMQ to pass messages over TCP/IP sockets. These sockets are configured using a range of ports, starting from a base port number (9100 by default). If the ports used by the RIP Farm conflict with other software on your server, the base port can be modified by changing the BasePort option.

RIP spawn options

When using the RIP Farm API, the Scalable RIP sub-processes may either be explicitly started by your application, or automatically spawned by the rf_send_farm_start() call. Global Graphics recommends automatically spawning the processes using the API. To automatically spawn the Scalable RIP process, you need to change the SpawnBeforeFarmStart configuration object. This object needs to contain an Executable key with a string value defining the file path of the executable, a WorkingDirectory key with a string value defining the working directory name, and an Arguments key with an array value defining command-line arguments to pass to the executable.
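As a rough illustration of the object described above, the following C sketch assembles a JSON fragment containing a SpawnBeforeFarmStart object. The key names come from the text. The helper name build_spawn_config(), the single-argument Arguments array, and the placement of the object at the top level of the configuration are assumptions made for this example; check the default Scalable RIP configuration file for the actual structure. The sketch also assumes the strings contain no characters that need JSON escaping (a Windows path with backslashes would need them doubled).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the RIP Farm API): writes a JSON
 * configuration fragment into buf.
 * Returns 0 on success, -1 if the output was truncated or formatting failed.
 * Assumes the strings need no JSON escaping (no quotes or backslashes). */
int build_spawn_config(char *buf, size_t size,
                       const char *executable,
                       const char *working_dir,
                       const char *argument)
{
  int n;

  assert(buf != NULL && size > 0);
  n = snprintf(buf, size,
               "{\n"
               "  \"SpawnBeforeFarmStart\": {\n"
               "    \"Executable\": \"%s\",\n"
               "    \"WorkingDirectory\": \"%s\",\n"
               "    \"Arguments\": [\"%s\"]\n"
               "  }\n"
               "}\n",
               executable, working_dir, argument);

  /* snprintf reports the length it wanted to write, so n >= size
   * means the buffer was too small. */
  return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For anything beyond a fixed fragment like this, building the configuration with a JSON library is likely to be less error-prone than string formatting.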

Enabling the Raster output functions

The RIP Farm API's raster output functionality must be explicitly enabled if required, by setting the UseRasterManager parameter to true. If this functionality is enabled, a client must request and handle the raster information provided.

RIP Farm tiling options

The ScheduleIndividualTiles option affects how tiled jobs are managed by the Scalable RIP. If it is true, the tiles of a tiled page can be scheduled on any available Farm RIP; if false, all of the tiles are scheduled on the same Farm RIP.

RIP Farm progress options

The frequency at which the Scalable RIP reports progress can be configured by changing the value of the ProgressReportMillisec option.

RIP Farm default chunk sizes

The Scalable RIP splits PDF processing into "chunks", ranges of contiguous pages that it can send to each Farm RIP. The default size of each chunk can be configured by changing the value of the DefaultPageChunkSize option. The default chunk size can be overridden in the configuration submitted with each job, and there is also an override for automatically detected variable-data HVD jobs, because these jobs usually need each Farm RIP to process several pages at once to detect the common content between pages. Global Graphics has found that the best performance for non-HVD PDF files is typically achieved using a chunk size of 1.
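To make the idea of a chunk concrete, the following C sketch computes the contiguous page ranges that result from a given page count and chunk size. It is only the arithmetic implied by the description above, not the Scalable RIP's scheduling code; the names page_chunk, chunk_count() and chunk_at() and the use of 1-based page numbers are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

/* A contiguous, inclusive range of 1-based page numbers. */
typedef struct {
  size_t first_page;
  size_t last_page;
} page_chunk;

/* Number of chunks needed to cover page_count pages. */
size_t chunk_count(size_t page_count, size_t chunk_size)
{
  assert(chunk_size > 0);
  return (page_count + chunk_size - 1) / chunk_size;
}

/* Page range covered by the chunk with the given 0-based index.
 * The final chunk may be shorter than chunk_size. */
page_chunk chunk_at(size_t page_count, size_t chunk_size, size_t index)
{
  page_chunk c;

  assert(chunk_size > 0);
  assert(index < chunk_count(page_count, chunk_size));

  c.first_page = index * chunk_size + 1;
  c.last_page = c.first_page + chunk_size - 1;
  if (c.last_page > page_count)
    c.last_page = page_count;
  return c;
}
```

With a chunk size of 1, every page becomes its own chunk, so a 10-page job yields 10 independently schedulable units; with a chunk size of 4 it yields pages 1-4, 5-8, and 9-10.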

Debugging options

The DebugParams sub-object in the RIP Farm configuration contains a number of parameters that are useful when debugging problems with a RIP Farm integration, especially when you want to step through your own raster backend code. Setting the FarmRIPDebugWait option to true and using a single Farm RIP will cause the Farm RIP to wait after being spawned until a debugger is connected. The FarmRIPOpenConsoleWindow option is useful for making the Farm RIP output visible. The ModuleStoppedWaitMillisec and ReplyWaitMillisec options may be useful to make sure the controller does not time out waiting for a reply to a message while you step through code.

Starting and stopping the interface reactors

Every process that is going to interact with the RIP Farm API needs to start an interface context before calling the interface functions. All of the RIP Farm API functions take an RF_IFACE_CTXT reference parameter. The interface context is started and the RF_IFACE_CTXT reference created by calling rf_iface_start(). The interface context reference remains valid until rf_iface_stop() is called to destroy it.

The RF_IFACE_PARAMS interface parameters structure passed to rf_iface_start() allows you to configure the ports that the interface reactor will interact with, which are correlated with groups of functions in the interface; the method by which the JSON configuration will be loaded; and a message callback function for receiving asynchronous replies and notifications. You must supply a message callback function to receive replies from the RIP Farm.

Due to limitations in the libraries used, the rf_zsys_shutdown() function should be called after stopping the interface reactor to fully shut down the RIP Farm ZeroMQ global context. This requirement may be removed in a future release.

Asynchronous messaging and replies

The RIP Farm API is based around asynchronous message passing. Most of the interface functions follow the naming pattern rf_send_request_name(). These calls will return almost immediately, but a success indication just means that the request message has been queued, not that it has been processed by the RIP Farm or succeeded. The function documentation describes the replies and notifications that you should expect to receive in response to the call.

Replies and notifications are processed using the message_callback_fn function that was passed to rf_iface_start() in the RF_IFACE_PARAMS structure when starting the interface reactor. The message callback function is called with a reply structure that contains the message type, and an opaque payload pointer referencing data specific to that message type. The payload is usually extracted into a message-specific data structure using an extraction function for the message type, following the naming pattern rf_extract_message_type(). The message-specific data structures contain the information you need to process the asynchronous message.

The message callback function is responsible for destroying the payload when it has processed the message, so it should ultimately call rf_free_payload() on the payload for all messages it receives. The message callback is also responsible for destroying any message-specific data structures it extracted. Most of these data structures have a type-specific destruction function following the naming pattern rf_free_message_type().
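The ownership rules for the callback can be sketched with stand-in types. Everything named demo_ or DEMO_ below is hypothetical and exists only so the example compiles on its own; the real reply structure, extraction functions, and callback signature are defined in the RIP Farm API headers and may differ. The point of the sketch is the order of operations: extract, process, free the extracted structure, then free the payload for every message, whether or not it was handled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the library's payload, reply and extracted
 * message types. */
typedef struct { int job_id; } demo_payload;
typedef struct { int type; demo_payload *payload; } demo_reply;
typedef struct { int job_id; } demo_job_msg;

enum { DEMO_JOB_DONE = 1, DEMO_OTHER = 2 };

int live_allocations = 0;  /* objects allocated and not yet freed */
int last_job_done = -1;    /* result of "processing" a message */

demo_payload *demo_new_payload(int job_id)
{
  demo_payload *p = malloc(sizeof *p);
  if (p != NULL) { p->job_id = job_id; live_allocations++; }
  return p;
}

/* Stand-in for an extraction function. */
demo_job_msg *demo_extract_job(const demo_payload *p)
{
  demo_job_msg *m = malloc(sizeof *m);
  if (m != NULL) { m->job_id = p->job_id; live_allocations++; }
  return m;
}

/* Stand-ins for the type-specific and payload destruction functions. */
void demo_free_job(demo_job_msg *m) { if (m != NULL) { free(m); live_allocations--; } }
void demo_free_payload(demo_payload *p) { if (p != NULL) { free(p); live_allocations--; } }

/* Sketch of a message callback. */
void demo_message_callback(demo_reply *reply)
{
  assert(reply != NULL);

  if (reply->type == DEMO_JOB_DONE) {
    demo_job_msg *msg = demo_extract_job(reply->payload);
    if (msg != NULL) {
      last_job_done = msg->job_id;  /* process the message */
      demo_free_job(msg);           /* free the extracted structure */
    }
  }

  /* The payload is freed for every message, including ones that were
   * not recognized or not extracted. */
  demo_free_payload(reply->payload);
  reply->payload = NULL;
}
```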

The message callback function runs directly on the RIP Farm API's interface reactor thread. This thread is responsible for receiving and responding to messages from the RIP Farm. If processing of a message takes any significant amount of time, blocking this thread can result in queue backups and dropped messages. For this reason, it is good practice for clients of the RIP Farm API to forward messages received from the RIP Farm message callback to a worker thread. This is quite easy to do using the ZeroMQ reactor interfaces. An example of how to do this can be found in the RIP Farm Test Tool source code, provided with the RIP Farm API distribution. The testtool_reactor() function sets up a reactor loop, with a worker thread running the farm_reply_handler() function to receive and process messages forwarded from the message callback function.
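The Test Tool forwards messages using ZeroMQ reactor interfaces. As an alternative illustration of the same hand-off idea, the sketch below uses a small mutex-protected queue built on POSIX threads instead. It is not the Test Tool's method, and the names msg_queue, msg_queue_init(), msg_queue_push() and msg_queue_pop() are invented for this example. The producer side never waits, so the reactor thread is not held up when the worker falls behind.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 64

/* Fixed-size FIFO of opaque message pointers shared by two threads. */
typedef struct {
  void *items[QUEUE_CAPACITY];
  size_t head;   /* index of the oldest item */
  size_t count;  /* number of items currently queued */
  pthread_mutex_t lock;
  pthread_cond_t not_empty;
} msg_queue;

void msg_queue_init(msg_queue *q)
{
  assert(q != NULL);
  q->head = 0;
  q->count = 0;
  pthread_mutex_init(&q->lock, NULL);
  pthread_cond_init(&q->not_empty, NULL);
}

/* Intended for the message callback. Never waits for space: returns 1 if
 * the message was queued, 0 if the queue is full, leaving the caller to
 * decide what to do with the message. */
int msg_queue_push(msg_queue *q, void *msg)
{
  int queued = 0;

  pthread_mutex_lock(&q->lock);
  if (q->count < QUEUE_CAPACITY) {
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = msg;
    q->count++;
    queued = 1;
    pthread_cond_signal(&q->not_empty);
  }
  pthread_mutex_unlock(&q->lock);
  return queued;
}

/* Intended for the worker thread. Blocks until a message is available. */
void *msg_queue_pop(msg_queue *q)
{
  void *msg;

  pthread_mutex_lock(&q->lock);
  while (q->count == 0)
    pthread_cond_wait(&q->not_empty, &q->lock);
  msg = q->items[q->head];
  q->head = (q->head + 1) % QUEUE_CAPACITY;
  q->count--;
  pthread_mutex_unlock(&q->lock);
  return msg;
}
```

In this arrangement the message callback would push the received message and return immediately, while a worker thread loops on msg_queue_pop(). Responsibility for destroying the payload moves to the worker along with the message.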

Many of the interface functions are documented as receiving an immediate reply. Immediate in this context means a reply is sent as soon as the RIP Farm has received and processed the message. This may be on the order of milliseconds, or even longer: the RIP Farm may be interacting with other processes, and it may be on a different machine and so suffer network round-trip delays. Immediate also does not mean this will be the next message received by the message callback function: the RIP Farm itself or other processes interacting with it may cause it to generate notification messages that are broadcast to all processes interacting with the RIP Farm; or an eventual response to a previous request by your process may be delivered before the expected immediate reply. Your message callback function should be written with this in mind: do not assume that the next message callback after sending a request will be the reply to that request.
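One way to avoid assuming that the next message is the expected reply is to keep a small table of outstanding requests and match each incoming message against it. The sketch below does this by message type. The type codes and the names pending_replies, pending_add() and pending_resolve() are hypothetical, and matching on type alone is an assumption of this example; a real client may need to match on job identifiers or other fields carried in the reply.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical message type codes, for illustration only. */
enum { MSG_JOB_STARTED = 1, MSG_JOB_STATUS = 2, MSG_NOTIFY_PROGRESS = 100 };

typedef struct {
  int expected[MAX_PENDING];  /* reply types still awaited, oldest first */
  size_t count;
} pending_replies;

/* Record that a request was queued and which reply type it should produce.
 * Returns 0 if the table is full. */
int pending_add(pending_replies *p, int reply_type)
{
  assert(p != NULL);
  if (p->count == MAX_PENDING)
    return 0;
  p->expected[p->count++] = reply_type;
  return 1;
}

/* Returns 1 if the message answers a pending request, removing the oldest
 * matching entry. Returns 0 if the message is unsolicited, for example a
 * broadcast notification. */
int pending_resolve(pending_replies *p, int msg_type)
{
  size_t i, j;

  assert(p != NULL);
  for (i = 0; i < p->count; i++) {
    if (p->expected[i] == msg_type) {
      for (j = i + 1; j < p->count; j++)
        p->expected[j - 1] = p->expected[j];
      p->count--;
      return 1;
    }
  }
  return 0;
}
```

Because lookup is by content rather than arrival order, a notification or a late reply to an earlier request arriving first does not disturb the bookkeeping.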

Many of the reply messages received by the message callback function will be sent only to the originator of a request message. However, messages documented as notifications are broadcast to all RIP Farm API clients that are configured to use the notification port. Notification messages may be received at any time, and contain information about interesting state changes in the RIP Farm. RIP Farm API clients can use notification messages to monitor progress in the RIP Farm, including triggering status queries for more information about the state changes observed.

Starting and stopping the RIP Farm

The RIP Farm may be started by a DFE by calling rf_send_farm_start(). If the RIP Farm API has been configured to spawn the Scalable RIP executable, then this will launch the Scalable RIP executable and connect to it. This call will return almost immediately, and a success indication just means that the start message has been queued. The RIP Farm is not fully started until the RF_FARM_STARTED message has been received.

The RIP Farm can be stopped by a DFE by calling rf_send_farm_stop(). The RIP Farm may take some time to shut down, so while there will ultimately be an RF_FARM_STOPPED reply, there may also be several RF_FARM_STOP_PROGRESS notifications before the farm stopped reply is received.
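Because a successful send only means the request was queued, a client may find it useful to track the farm's state explicitly. The following C sketch is a minimal client-side state machine, assuming the client feeds it one event per successful send call and one per received message. The farm_state and farm_event names are local to this example; only the RF_FARM_STARTED, RF_FARM_STOP_PROGRESS and RF_FARM_STOPPED message names in the comments come from the text.

```c
#include <assert.h>

typedef enum {
  FARM_STOPPED,
  FARM_START_QUEUED,  /* start request queued, RF_FARM_STARTED not yet seen */
  FARM_RUNNING,
  FARM_STOPPING       /* stop request queued, RF_FARM_STOPPED not yet seen */
} farm_state;

typedef enum {
  EV_START_QUEUED,   /* the start call reported success */
  EV_FARM_STARTED,   /* RF_FARM_STARTED received */
  EV_STOP_QUEUED,    /* the stop call reported success */
  EV_STOP_PROGRESS,  /* RF_FARM_STOP_PROGRESS received */
  EV_FARM_STOPPED    /* RF_FARM_STOPPED received */
} farm_event;

/* Returns the state that follows s when event e is observed. */
farm_state farm_next_state(farm_state s, farm_event e)
{
  assert(s <= FARM_STOPPING);

  switch (e) {
  case EV_START_QUEUED:
    return s == FARM_STOPPED ? FARM_START_QUEUED : s;
  case EV_FARM_STARTED:
    return FARM_RUNNING;   /* the message is authoritative */
  case EV_STOP_QUEUED:
    return s == FARM_RUNNING ? FARM_STOPPING : s;
  case EV_STOP_PROGRESS:
    return s;              /* still shutting down, nothing changes */
  case EV_FARM_STOPPED:
    return FARM_STOPPED;   /* the message is authoritative */
  }
  return s;
}
```

A client using this sketch would treat the farm as usable only in FARM_RUNNING, not in FARM_START_QUEUED.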

Individual machines participating in a RIP Farm may be requested to stop by calling rf_send_blade_stop() to a satellite blade controller. Stopping a blade aborts any activity scheduled on the blade, and prevents the RIP Farm from scheduling any more page ranges to it unless it reconnects. This can be used to disconnect a machine for scheduled maintenance, or to disconnect a machine while reconfiguring the blade. It is possible to dynamically change the number and configuration of RIPs available in a RIP Farm by using satellite blade controllers.

These functions for starting and stopping the RIP Farm are only available if the interface was started with RF_DFE_PORTS::startPort enabled.

Submitting and managing jobs in the RIP Farm

Clients can submit jobs to the RIP Farm using rf_send_job_start(), pause and resume processing of one or more jobs using rf_send_job_pause() and rf_send_job_resume(), and cancel processing of one or more jobs using rf_send_job_cancel(). These functions for managing jobs are only available if the interface was started with RF_DFE_PORTS::jobPort enabled.

Clients may request the status of one or more jobs by calling rf_send_job_status_request(). The job status request function is only available if the interface was started with RF_DFE_PORTS::statusPort enabled.

Managing output from the RIP Farm

Raster information supplied by raster backends to the RIP Farm can be managed by Digital Back End clients (DBEs). The raster management API in the RIP Farm API serializes raster data provided by the Farm RIPs, allowing the client to treat the raster output as a queue, regardless of the order in which Farm RIPs actually output rasters.

Clients that consume rasters will create a raster connection by calling rf_send_raster_connect(). Having connected, a client will receive RF_RASTERS_AVAILABLE messages when rasters are available for processing. The client can request details of one or more (or even all) rasters by calling rf_send_raster_request(). Once the client has processed the rasters, it will call rf_send_rasters_handled() to indicate that the RIP Farm need no longer hold onto information about the raster. The raster connection exists until disconnected by the client by calling rf_send_raster_disconnect(). You may connect more than one client to consume rasters produced by the RIP Farm at the same time. If no client is connected, the RIP Farm will save raster information produced until a client connects, so clients may disconnect and reconnect while the RIP Farm is running.

These functions for managing output are only available if the interface was started with RF_DFE_PORTS::rasterPort enabled.

The number and state of output rasters for one or more jobs may be requested by calling rf_send_raster_status_request(). There may be a large number of rasters in different states in a RIP Farm, so this request allows filtering of the rasters returned by job identifier and by raster state, to reduce the set to just those the client is interested in. The raster status request function is only available if the interface was started with RF_DFE_PORTS::statusPort enabled.
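The effect of filtering by job identifier and raster state can be pictured with a small local model. The sketch below is not the request or reply format of the RIP Farm API: the demo_raster record, the three state names, and the convention that a negative job identifier matches any job are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical raster states, for illustration only. */
typedef enum { RASTER_PENDING, RASTER_AVAILABLE, RASTER_HANDLED } demo_raster_state;

/* Hypothetical per-raster record. */
typedef struct {
  int job_id;
  demo_raster_state state;
} demo_raster;

/* Counts the rasters in the array that belong to job_id and are in the
 * given state. A negative job_id matches every job. */
size_t count_rasters(const demo_raster *rasters, size_t n,
                     int job_id, demo_raster_state state)
{
  size_t i, matches = 0;

  assert(rasters != NULL || n == 0);
  for (i = 0; i < n; i++) {
    if ((job_id < 0 || rasters[i].job_id == job_id) &&
        rasters[i].state == state)
      matches++;
  }
  return matches;
}
```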

No raster information will be provided to this interface unless the functionality is enabled in the RIP Farm configuration.

Monitoring the RIP Farm

The RIP Farm API provides status queries to check the status of RIPs in the farm (using rf_send_rip_status_request()) and the blades in the farm (using rf_send_blade_status_request()). The status request functions are only available if the interface was started with RF_DFE_PORTS::statusPort enabled.

There are some notification messages that will be sent to all clients of the RIP Farm. These messages contain information about interesting state changes in the RIP Farm, that all clients may use to initiate monitoring, consumption, or shutdown behaviour of their own. The notification messages are provided to the message callback function. They are:

The notifications are only received if the interface was started with RF_DFE_PORTS::notificationPort enabled.

Errors in the RIP Farm interface

The payload structures extracted by the rf_extract_message_type() functions represent errors using a pointer to an RF_ERROR structure. If this pointer is NULL, then the message is reporting a successful operation. If this pointer is non-NULL, the structure contains:

RIP Farm raster backends

To get the best performance for a RIP Farm, you should implement your own raster backend for the Scalable RIP. The raster backends supplied with the "clrip" implementation of the Scalable RIP write their output to files. Writing a large amount of data to the file system and then reading it back for output and processing will limit performance significantly. The RIP Farm implementor is responsible for storing and transporting raster data between RIP Farm components. It is likely that using shared memory or a high-bandwidth network transport will give better performance than writing and reading files.

The Harlequin RIP SDK includes an interface for RIP raster backends to communicate with the RIP Farm library. The interface functions provided by the Harlequin RIP allow the raster backend to create a raster object representing a raster, report to the RIP Farm controller that the raster is ready for processing, and configure callbacks to destroy local references to the raster data when the RIP Farm controller reports the raster has been handled.

The raster backend API interface only sends raster metadata to the RIP Farm controller, not the raster data itself. The RIP Farm raster output functions similarly only provide raster metadata to clients of the RIP Farm API. The metadata passed between these components for each raster includes a raster name, location, data type, and user-provided metadata. The raster name, location, and data type are not semantically interpreted by the RIP Farm API, except that equality in all three fields implies the same raster. It is your responsibility to arrange transport of raster data between components: the name, location, and user metadata fields provided in the raster objects sent to the RIP Farm can be used to help components identify and access the raster data.
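The identity rule above (two rasters are the same if name, location, and data type all match) can be expressed as a simple comparison. In this sketch the raster_key structure and raster_key_equal() are invented names, and representing all three fields as C strings is an assumption; the actual field types are defined by the RIP Farm API headers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical identity key built from the three fields that the text
 * says determine raster identity. */
typedef struct {
  const char *name;
  const char *location;
  const char *data_type;
} raster_key;

/* Returns 1 if both keys refer to the same raster, 0 otherwise.
 * The fields are compared byte for byte, with no interpretation. */
int raster_key_equal(const raster_key *a, const raster_key *b)
{
  assert(a != NULL && b != NULL);
  return strcmp(a->name, b->name) == 0 &&
         strcmp(a->location, b->location) == 0 &&
         strcmp(a->data_type, b->data_type) == 0;
}
```

User-provided metadata is deliberately left out of the comparison, since the text lists only the other three fields as determining identity.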

The RIP Farm library provides utility functions to help construct location metadata to pass to the RIP Farm raster backend API functions. The rf_location_id() and rf_vlocation_id() functions take a format string similar to the printf() family of functions, and construct a location string. These utility functions provide special format codes to insert identifiers for RIP Farm blades, processes, and other entities. The location parameter can be used by clients of the raster delivery API to determine whether rasters are co-located, or where they are located if the raster backend and client use a common location representation. Location strings created using these functions should be destroyed using rf_mem_free() after passing to the RIP Farm raster backend's raster_create() function. The get_location() function in the Harlequin RIP SDK RIP Farm interface is implemented using these utility functions.

This interface must be enabled in the RIP Farm configuration if required.

Utility functions

As an alternative to extracting the payload of asynchronous messages from the RIP Farm to C structures using the rf_extract_message_type() functions, you may also extract the payload to a JSON string using rf_extract_json_string(). When you have finished using this string, it should be destroyed by calling rf_free_json_string().

The message payload references passed to the RIP Farm API's asynchronous message callback function must ultimately be destroyed by calling the rf_free_payload() function.

The RIP Farm library provides a pluggable interface to memory allocation. The rf_set_alloc_funcs() function may be used before any other call to the RIP Farm API to set allocation and free functions that will be used by the library. You must not call this function after any other RIP Farm API call has been made. If no allocator functions are set, the C library malloc() and free() functions will be used. The interface functions rf_mem_alloc() and rf_mem_free() are used in the library implementation to allocate memory using the functions set by rf_set_alloc_funcs(). They may also be used by clients of the RIP Farm library.
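The general shape of a pluggable allocator can be sketched as a pair of function pointers that default to malloc() and free(). This is an illustration of the pattern, not the library's implementation: all demo_ names and the counting allocator are hypothetical, and the real signature of rf_set_alloc_funcs() should be taken from rf_mem.h. The sketch refuses to change allocators once an allocation has been made, which is one way to enforce the ordering rule described above; the text does not say whether the library itself does this.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *(*demo_alloc_fn)(size_t size);
typedef void (*demo_free_fn)(void *ptr);

/* Defaults mirror the documented fallback to the C library. */
static demo_alloc_fn current_alloc = malloc;
static demo_free_fn current_free = free;
static int allocator_in_use = 0;

/* Installs custom functions. Returns 1 on success, 0 if refused because
 * memory has already been allocated or an argument is NULL. */
int demo_set_alloc_funcs(demo_alloc_fn alloc_fn, demo_free_fn free_fn)
{
  if (allocator_in_use || alloc_fn == NULL || free_fn == NULL)
    return 0;
  current_alloc = alloc_fn;
  current_free = free_fn;
  return 1;
}

void *demo_mem_alloc(size_t size)
{
  assert(size > 0);
  allocator_in_use = 1;  /* from now on the allocator may not change */
  return current_alloc(size);
}

void demo_mem_free(void *ptr)
{
  current_free(ptr);
}

/* Example custom allocator that counts outstanding blocks. */
static long outstanding = 0;

void *counting_alloc(size_t size)
{
  void *p = malloc(size);
  if (p != NULL)
    outstanding++;
  return p;
}

void counting_free(void *ptr)
{
  if (ptr != NULL) {
    outstanding--;
    free(ptr);
  }
}

long outstanding_blocks(void) { return outstanding; }
```

Mixing allocators is the hazard this ordering rule guards against: a block obtained from one allocator must be released by the matching free function.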

The RIP Farm library provides the utility function rf_parse_address_and_port() to extract IP address and port information from a string. This is useful when parsing command-line arguments or address:port pairs from other sources for configuring the RIP Farm network options. Parsed addresses must be destroyed when no longer needed by calling rf_mem_free().
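For readers who want to see what parsing an address:port pair involves, here is a minimal standalone sketch. It is not rf_parse_address_and_port() and does not reproduce its signature or memory ownership; split_address_port() is an invented name. The sketch splits at the last colon, accepts only decimal ports in the range 1 to 65535, and does not handle bracketed IPv6 literals.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser. Splits "host:port" at the last colon, copies the
 * host part into host (a buffer of host_size bytes) and stores the port
 * in *port. Returns 0 on success, -1 on malformed input. */
int split_address_port(const char *text, char *host, size_t host_size,
                       int *port)
{
  const char *colon;
  char *end;
  long value;
  size_t host_len;

  assert(text != NULL && host != NULL && port != NULL);

  colon = strrchr(text, ':');
  if (colon == NULL || colon == text)
    return -1;  /* no colon, or empty host */
  if (!isdigit((unsigned char)colon[1]))
    return -1;  /* empty port, sign, or leading space */

  host_len = (size_t)(colon - text);
  if (host_len >= host_size)
    return -1;  /* host would not fit with its terminator */

  errno = 0;
  value = strtol(colon + 1, &end, 10);
  if (errno != 0 || *end != '\0' || value < 1 || value > 65535)
    return -1;  /* trailing junk or out of range */

  memcpy(host, text, host_len);
  host[host_len] = '\0';
  *port = (int)value;
  return 0;
}
```

Unlike the library function, this sketch writes into a caller-supplied buffer, so nothing needs to be released with rf_mem_free() afterwards.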

The RIP Farm library provides utility functions for allocating and duplicating zero-terminated strings (using rf_strdup()) and strings with a specified length (using rf_strndup()) into memory allocated by the RIP Farm library functions. These are commonly needed functions when working with ZeroMQ messages. Strings duplicated using these functions should be destroyed when no longer needed by calling rf_mem_free().

The RIP Farm API aliases the macro rf_snprintf() to whatever version of the C library snprintf() function is available on the platform. You may assume this supports POSIX semantics, but check the compiler or operating system documentation for details of any variance from that specification.
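The POSIX behavior that matters most in practice is the return value: snprintf() returns the length the full string would have had, so a result greater than or equal to the buffer size signals truncation. The sketch below shows the check using the standard snprintf() directly, since rf_snprintf() is described as an alias for it. The function name format_raster_name() and the naming scheme it produces are invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Formats "job-<id>-page-<n>" into buf.
 * Returns 1 if the whole string fitted, 0 if it was truncated or
 * formatting failed. With size > 0 the buffer is always terminated. */
int format_raster_name(char *buf, size_t size, int job_id, int page)
{
  int n;

  assert(buf != NULL && size > 0);
  n = snprintf(buf, size, "job-%d-page-%d", job_id, page);
  return n >= 0 && (size_t)n < size;
}
```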