Pawn:Pawn Publishing
This is for PAWN v.5 and v.6
Pushing data out of PAWN and into more permanent storage is done by the receiving server using archival resources. The push is a third-party transfer triggered by a client, and it requires two parts. First, the client initiates the transfer and specifies where the data is to be pushed and what data is to be pushed (e.g., which directory, FTP server, or collection, and from which package). Second, the receiving server must have a driver for the archive resource that performs the physical transfer. In addition to moving the data, the driver provides a handle back to PAWN containing the information necessary to access the data. This handle is logged with the package.
Archive resources and their configuration will vary from domain to domain. In addition to the domain-level configuration, parameters specific to the package being pushed may also need to be supplied by the client. Both the domain-level configuration and the client-specific parameters must be supplied to the receiving server driver to complete the transfer.
Components in a resource
- Receiving server driver
- Receives a complete set of parameters and the list of package objects that are to be transferred. The parameters should contain all information required to connect to the destination resource and the location in that resource to which data is pushed.
- Client service
- Chooser GUI used to select where in the backend resource to store the data. The client also supplies parameters that may be specific to the package being ingested.
- Driver configuration
- Configuration GUI to specify the driver class, any global parameters required by the receiving server driver, and any parameters needed by the client GUI.
Configuration Design
The configuration is generated in two parts. The first part is a set of general parameters used to configure the receiving server driver and the client GUI. The parameters supplied to each may be the same or separate (driver configuration from above). The second part is a set of parameters supplied from the client to the receiving server that are specific to a given transfer (client service from above).
For example, a resource may supply a read-only account to allow clients to select a destination, while a read-write account is supplied to the receiving server to perform the physical move.
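The split between the two parameter sets can be sketched as follows. This is only an illustration: `ResourceConfigSketch`, its method names, and the `transfer.` key prefix are hypothetical and are not part of PAWN. The prefix is an assumption added here so that client-supplied values cannot overwrite the server-side account.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for one configured resource instance; not a PAWN class. */
class ResourceConfigSketch {
    // Domain-level parameters given only to the receiving server driver.
    private final Map<String, String> moverParams = new HashMap<>();
    // Domain-level parameters handed to the client GUI for browsing.
    private final Map<String, String> clientParams = new HashMap<>();

    ResourceConfigSketch(Map<String, String> mover, Map<String, String> client) {
        moverParams.putAll(mover);
        clientParams.putAll(client);
    }

    /** What a client would download before showing its chooser. */
    Map<String, String> forClient() {
        return new HashMap<>(clientParams);
    }

    /** What the driver would see for one transfer: global settings plus per-transfer values. */
    Map<String, String> forTransfer(Map<String, String> transferParams) {
        Map<String, String> merged = new HashMap<>(moverParams);
        // Assumption: per-transfer keys are prefixed so they never replace global keys.
        transferParams.forEach((key, value) -> merged.put("transfer." + key, value));
        return merged;
    }
}
```

With this layout the read-only account lives only in the client map and the read-write account only in the mover map, so neither side ever sees the other's credentials.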
Workflow for configuring a resource
- 1. Global driver and client configuration
- The scheduler GUI is used to configure global properties for an instance of a resource, and to configure properties that the client may need to know when using the resource GUI.
- 2. Global driver configuration
- The receiving server gets a copy of global parameters for each configuration instance of a resource.
- 3. Client browse parameters
- Prior to displaying the resource client GUI, the client will download any necessary configuration options from the scheduler.
- 4. Destination parameters and package selection
- After the client resource GUI runs, parameters are taken from the GUI along with the selected part of a package, and both are sent to the receiving server.
- 5. Push to archive
- The receiving server connects to the resource and pushes data using the resource global parameters along with the client supplied package and parameters.
- 6. Handle to archived data
- The resource supplies a handle to the ingested files. PAWN then logs the user who triggered the transfer together with the handle to the archived data, and attaches this record to the package.
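Steps 5 and 6 of the workflow above can be sketched roughly as follows. The `Mover` interface and `AuditEntry` class are hypothetical stand-ins invented for this illustration. They are not the real edu.umiacs.pawn.resource.DataMover API, whose methods are not described on this page.

```java
import java.util.List;
import java.util.Map;

class PushWorkflowSketch {
    /** Hypothetical stand-in for a receiving server driver. */
    interface Mover {
        /** Pushes the items and returns a handle for locating them later. */
        String push(Map<String, String> globalParams,
                    Map<String, String> clientParams,
                    List<String> items);
    }

    /** Hypothetical audit record attached to a package. */
    static final class AuditEntry {
        final String user;
        final String handle;

        AuditEntry(String user, String handle) {
            this.user = user;
            this.handle = handle;
        }
    }

    /** Pushes the items, then logs who triggered the push and the returned handle. */
    static AuditEntry pushAndLog(Mover mover, String user,
                                 Map<String, String> globalParams,
                                 Map<String, String> clientParams,
                                 List<String> items,
                                 List<AuditEntry> packageLog) {
        String handle = mover.push(globalParams, clientParams, items);
        AuditEntry entry = new AuditEntry(user, handle);
        packageLog.add(entry);
        return entry;
    }
}
```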
Loading new resources
There is another problem with resources: how does the required code propagate to the three components that use it? The solution is to register, or load, the resource's drivers on the scheduler prior to performing any configuration or ingestion action. After registration, all necessary jar files are downloaded by the GUIs or the receiving server as needed.
The workflow for registering and distributing packages follows.
- 1. Load resource jar files
- Any necessary jar files are loaded to the receiving server.
- 2. Scheduler gui
- User chooses to create a configuration for a driver in a particular domain. If the GUI does not have a local copy of the driver in its ResourceManager, a copy of the jar files is retrieved from the scheduler and loaded into the local classloader.
- 3. Client gui
- Client chooses a particular resource configuration that will be used to load data. If the driver for the resource isn't loaded, it is retrieved and added to the local classloader.
- 4. Receiving server
- Bulk loads all available drivers and gets configuration updates.
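The "retrieve only if not already loaded" behaviour in steps 2 and 3 could look something like the sketch below. `DriverCacheSketch` and its `jarLocator` callback are hypothetical; PAWN's own ResourceManager may work quite differently. Only `java.net.URLClassLoader` is a real JDK class here.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical local driver cache; not PAWN's ResourceManager. */
class DriverCacheSketch {
    private final Map<String, ClassLoader> loaders = new HashMap<>();
    // Assumption: resolves a driver name to the jar URLs served by the scheduler.
    private final Function<String, URL[]> jarLocator;
    private int fetchCount = 0;

    DriverCacheSketch(Function<String, URL[]> jarLocator) {
        this.jarLocator = jarLocator;
    }

    /** Returns the classloader for a driver, locating its jars only on first use. */
    synchronized ClassLoader loaderFor(String driverName) {
        ClassLoader loader = loaders.get(driverName);
        if (loader == null) {
            fetchCount++;
            URL[] jars = jarLocator.apply(driverName);
            loader = new URLClassLoader(jars, DriverCacheSketch.class.getClassLoader());
            loaders.put(driverName, loader);
        }
        return loader;
    }

    /** Number of times jars had to be located, useful for checking the cache works. */
    int fetchCount() {
        return fetchCount;
    }
}
```

A receiving server that bulk loads drivers could simply call `loaderFor` once for every registered driver name at startup.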
Component Details
Receiving Server Driver
This is a resource's implementation of edu.umiacs.pawn.resource.DataMover.
The receiving server driver gets a copy of the global parameters along with a set of client-specified parameters that are specific to a given package transfer. These parameters and a transfer context are passed into the DataMover driver. In addition, PAWN assembles a set of manifests and items that are to be ingested into the resource.
Depending on what needs to be ingested, one of two processes is followed.
List of manifests | Item only ingestion |
When a list of manifests is to be processed, all children under each manifest are also added to the list.
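One way to picture this expansion is sketched below. The `Node` type is a hypothetical stand-in for a manifest and its children; PAWN's real manifest classes are not shown on this page, and the depth-first ordering is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ManifestExpansionSketch {
    /** Hypothetical manifest node; not a PAWN class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Returns the selected manifests plus every descendant, depth first, without duplicates. */
    static List<Node> expand(List<Node> selected) {
        Set<Node> seen = new LinkedHashSet<>();
        for (Node node : selected) {
            collect(node, seen);
        }
        return new ArrayList<>(seen);
    }

    private static void collect(Node node, Set<Node> seen) {
        if (!seen.add(node)) {
            return; // already listed, e.g. a child that was also selected directly
        }
        for (Node child : node.children) {
            collect(child, seen);
        }
    }
}
```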
During transfer, the transfer context that was passed in when the parameters were set may be queried for information about the current item, metadata, or manifest. This should be used to discover the environment that an item exists in. Please refer to the TransferContext documentation for information on what is available.
From PAWN's perspective, the mover does not do any work until its processXX methods are called. At that time, the mover is expected to supply PAWN with information about where the item was placed in the resource. This information is added to the package's audit log.
Driver Configuration
This is the resource's implementation of edu.umiacs.pawn.resource.ConfigurationGuiPanel.
This component is initialized with the client-level and mover-level parameters (these will be empty if this is a new resource). It returns separate client and mover parameters.
Client Service
This is the resource's implementation of edu.umiacs.pawn.resource.ClientGuiPanel.
This component is initialized with the client parameters supplied from the driver configuration. It returns a set of parameters that is passed to the mover as the client portion of the configuration.
Quick guide to creating a resource
- 1. Create classes that implement the three driver components
- ClientGuiPanel.java, ConfigurationGuiPanel.java, DataMover.java
- 2. Create your factory using SimpleResourceFactory
    package edu.umiacs.pawn.resource.srb;

    import edu.umiacs.pawn.resource.SimpleResourceFactory;
    import edu.umiacs.pawn.resource.srb.gui.SRBClientChooser;
    import edu.umiacs.pawn.resource.srb.gui.SRBResourceConfiguration;

    public class SRBFactory
            extends SimpleResourceFactory<SRBClientChooser, SRBResourceConfiguration, Transport> {

        /** Creates a new instance of SRBFactory */
        public SRBFactory() {
            super(SRBClientChooser.class, SRBResourceConfiguration.class, Transport.class);
        }
    }