New Data Formatting User's Manual

Fip – New Data Formats User Notes

This version: 004 22may12

Overview

The new version of the data formatting module extends and replaces the existing version by adding :

  • Separate test and production environments, allowing significant changes to be made without interfering with production
  • Upgrade and rollback of jobs in production
  • Concept of Stages
    • Multiple stages rather than the old xchg->format->xchg model
    • Stages can be of any accepted program
    • Each stage can be tested separately
  • Multiple Publications working (semi) independently on the same (pair of) servers
  • Much better security

If you are an existing user and want to know the significant changes, skip to the ‘Differences’ section at the end of this document.

This guide aims to explain and detail the steps that are needed.

Contents

%TOC%

Links to other Data Format documentation

How does it work and Terminology …

  • A ‘process’ file contains all the ‘jobs’ which need to be done for your publication
  • Each job expects some sort of data in, a program to format the data, and the newly created data to be output. So you have to be able to define :
    • What metadata or data to check in order to run the job
    • What formatting – which you define as a series of stages where a program or script is run at each stage
    • Where to send the file afterwards

How do we go about it ….

Logon

  • Your logon defines …
    • Whether you can access data formats
    • and, in a multi-publication house, which publication you can access

After logging on, click on the “Data Formatting” link; this will take you to the Status Screen

Status screen

  • List of all the jobs, showing the time and date when each was last changed :
    • if in production
    • or if still testing

df_status.gif

  • By clicking on the admin link, you can administer the job – see the ‘Administration Screen’ section below
  • By clicking on the test link, you can test the job – see the ‘Test Screen’ section below
  • Click on view for a ‘look but don’t touch’ view of the stages for this job
  • By clicking on the Create New Job link, you can do just that!

    Administration Screen

    • You got here by clicking on the ‘admin’ link in the Status Screen
    • This shows the history of the job – when it was last updated
      df_admin.gif

    • You can :
      • Upgrade the current test environment to production at this point
      • Rollback to a previous version if there were any problems
      • Disable a job temporarily (and enable it again later) if there is a temporary glitch
      • Modify the selection criteria

    Test Screen

    • This shows a list of stages/programs with their parameter files
    • You may test the whole job or just a stage
      df_test.gif

    • By clicking on the ‘Create Stage’ link you can do just that !
    • Data –
      • By clicking on ‘Add data’ you can search the data area for any data which has been copied over for your publication from the wires etc. (This data is stored in /fip/data/form/name_of_pub)
      • Select a data file to use to test with.
      • Running and testing :
      • Click on “Run Job” link at top to run the data through all stages. This will open a new window showing the output
      • Click on the links under the “Input” or “Output” columns to see the data at that particular stage in the job.
      • Click on the “BrkOut” link to see an ipformat debug view of the data
      • To modify either an xchg parameter file or the data format, click on the links under the “Parameter file” column. This will open the file in a browser text area for editing. Be sure to save the file after modifying. NB: this will change only a TEST version of the parameter file. The changes will not go into effect until the status is upgraded from Test to Production in the Admin screen.
      • To disable a stage without removing it from the job, click on the “enabled” link under the Status column.
      • Click on the number under the “No” column (first column) to change any information for that stage.

    Testing in more detail – How to create a new job

    1. What do you need?

    • Raw data
      • use the FIP browser to locate the raw data from the wire. Your logon should allow you to copy to the data formatting test area.
    • Parameter files
      • data format – you will probably want to copy from an existing data format or template
      • xchg? – will you need any cleanup before running the data format?

    2. Create and set up a new job

    The following example will be for creating a job for formatting an AP agate file for an editorial system.
    Typically you need an xchg first and then one data format.

    • Click on the “Create New Job” link on the top of the Status window
    • Name the job – this will bring you into a new Test window
    • Click on “Add more data files” in the “Test Data” section at the bottom
    • This will bring you into a window listing available test data. Choose the raw file for this job.
    • Click on the “Return to Testing…” link to get back to the Test window. You should now have a data file selected in the “Test Data” section.
      df_rawdata.gif

    • To add the xchg first, click on ‘Add a new Stage’
    • Give it a number which is the sequence in which to run this stage (number 1!)
    • select the ipxchg program from the drop-down list
      df_newstage.gif

    • Click “Please click here to choose” to get a list of current xchgs
    • Click on the xchg you need
    • Leave the input and output names as the default. Input should be “original” and output should be “normal.”
    • Now click on the “Add” link to add the new stage
      df_stage1.gif

    • For adding stage 2 (also the last in this job), just as with stage one:
      • click on “Add a new stage”
      • give it a sequence number of 2
      • select the ipformat program from the drop-down list
      • click “Please click here to choose” to get a list of current data formats
      • find the default or template you want to copy from and copy it to a new name.
      • click on the data format you need
    • Now for “input” choose “output1” from the drop-down and for “output” choose “form2fip” from the drop-down
      df_stage2.gif

    • Now click on the “Add” link to add the new stage
    • Your job should now have 2 stages and be ready for modifying and testing.
      df_ready.gif

    What is all this about input and output files ?

    Input files can be :

    • original – the original input file
    • outputX – the output file from stage X where X is any previous stage
    • custom – use the Other/Switches to specify eg -i^i

    Output files can be :

    • none – there are no outputs from this program or we cannot trap them
    • normal – the output file will be the usual one specified for this program
    • form2fip – (for ipformat ONLY) the output file will be sent to spool/2go to be processed by other Fip programs in the normal fashion
    • custom – use the Other/Switches to specify eg -o^o

    How do I run a test and see the results ?

    • just click on RUN JOB at the top
    • OR click on any file on any stage to see the state at that stage
    • When viewing data, the options are :
      • ReRun the job
      • Show unprintables
      • Show/Don't show the FipHdr
      • Dump the data in decimal, hexadecimal or octal
      • Show the data in NITF/HTML format
      • Show the data in FipSetter format

    Normally only the 1st 200 lines are shown, so click on the ‘All Data’ link to show all

    What can I put in Switches

    A series of variables is allowed in the Switches box :

     ^i - input file name
     ^o - output file name
     ^p - parameter file name
     ^c - copy of the output file (where possible)
     ^s - custom switches (optional)
     ^n - stage number
     ^j - job name
     ^u - publication name
     ^x - a single ^ (also ^ followed by any other chr is a single ^)
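    As a rough illustration only (this is not the real Fip implementation, and all file names below are made up), the substitution of these ^ variables into a Switches string can be sketched in shell :

```shell
#!/bin/sh
# Sketch of how the ^ variables in a Switches box might be expanded.
# The real expansion happens inside the Data Formats module; the
# file names below are hypothetical.
input_file="formsave/STATS.orig"
output_file="formsave/STATS.out"
param_file="myformat.fmt"

switches='-i^i -o^o -p^p'

# expand ^i, ^o and ^p, and turn ^x into a single ^
expanded=$(printf '%s' "$switches" | sed \
    -e "s|\^i|$input_file|g" \
    -e "s|\^o|$output_file|g" \
    -e "s|\^p|$param_file|g" \
    -e "s|\^x|^|g")

echo "$expanded"
# -> -iformsave/STATS.orig -oformsave/STATS.out -pmyformat.fmt
```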
    

    More complex formatting using sorts and non-standard programs

    Sorting …

  • Sorting stages can be added just like any others – or as non-standard ones if you are copying an existing workflow and do not want to change anything.
  • You can use ‘sffsort’ where a file has a FipHdr. In this case the FipHdr is saved, the data is sorted using ‘sort’ and then the FipHdr reapplied at the top of the new data
    • Normally, add the stage to your job :
      • program sort or sffsort
      • parameter file (leave blank)
      • input file original (or whatever the input to that stage is)
      • output normal
      • other/switches (specify the sort switches here)
    • Or if you want to use non-standard jobs
      • program nonstd
      • parameter file (leave blank)
      • input file none
      • output none
      • other/switches (specify the whole string/path/switches you need)
      • eg /bin/sort -t+ +2n -3 -o formsave/STATS.srt formsave/STATS.orig
        In this case the next stage should have the input of ‘custom’ and type in ‘formsave/STATS.srt’
    • Remember to replace any carets with ^x (and use the usual double backslash for ‘\’)
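    Purely for illustration, the sffsort behaviour described above – save the FipHdr, sort the data with ‘sort’, then reapply the header on top – can be mimicked in plain shell. This sketch assumes the FipHdr is just the first line of the file (the real header layout is defined by Fip) and uses throwaway file names :

```shell
#!/bin/sh
# Mimic of the sffsort behaviour described above: keep the FipHdr,
# sort only the data beneath it. For this sketch we pretend the
# FipHdr is the first line; the file names are throwaway examples.
in=/tmp/stats.demo
out=/tmp/stats.srt

# build a small demo file: one header line plus unsorted data
printf 'SN:demo-fiphdr\nzebra\napple\nmango\n' > "$in"

head -n 1 "$in" > "$out"            # save the FipHdr
tail -n +2 "$in" | sort >> "$out"   # sort the data and append it

cat "$out"
# -> SN:demo-fiphdr
#    apple
#    mango
#    zebra
```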

    Why can't I add my rinky-dink perl script on the fly ?

    • Ever heard of security gulag where hackers are incarcerated ?
    • Each program you want to use has to be in the setup file.
    • This allows the sys admin types to make sure no one is abusing the system
    • However, some sites DO allow these – please check with your Systems Admin (or Add a New Stage and see if there is a ‘nonstd’ option under ‘program’)
    • If you are allowed, add the stage to your job :
      • program nonstd
      • parameter file (leave blank)
      • input file original (or whatever the input to that stage is)
      • output normal
      • other/switches (specify the whole string/path/switches you need, with ^i for the input file and ^o for the
        output)

    df_nonstd.jpg
    Note that the output file from the script (out=^o) is picked up automatically by the next stage – ipformat – using an input of ‘output1’

    • There is a second method where you need to specify the input and output files. In this case use :
      • program nonstd
      • parameter file (leave blank)
      • input file none
      • output none
      • other/switches (specify the whole string/path/switches as a standalone program)
      • eg /fip/local/sports_stats.pl $i > /fip/spool/testarea/SSTATS
      • The next stage can then pick up the output file by specifying the input file as ‘custom’ and ‘/fip/spool/testarea/SSTATS’
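    As a hedged sketch of this second method (the transformation and the paths here are invented – a real nonstd stage would run whatever your own script does) :

```shell
#!/bin/sh
# Hypothetical standalone 'nonstd' stage, modelled on the example
# above: it reads a fixed input file and writes a fixed output file,
# which the next stage then picks up with an input of 'custom'.
# Both paths and the transformation are illustrative only.
in=/tmp/SSTATS.raw
out=/tmp/SSTATS

# stand-in for the raw sports data delivered to the stage
printf 'home 3\naway 1\n' > "$in"

# the "script" itself: here it just uppercases the team names
tr 'a-z' 'A-Z' < "$in" > "$out"

cat "$out"
# -> HOME 3
#    AWAY 1
```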

    Note you must MANUALLY copy a new or changed ‘nonstd’ script to all the other Fip systems as there is nothing inside DataFormats that will do that automatically (ie in this case, scp /fip/local/sports_stats.pl to the companion systems)

    Differences – old version and new

    • Parameter files
      • Old – one set of parameter files meant testing on the live (and only) version
      • New – 2 main sets of parameter files – PRODUCTION and TEST
        • When running tests, parameter files are a copy-and-branch of the Production set
        • So the Test environment can be completely different from Prod as it is completely independent
        • When ready for production, go into ADMIN and UPGRADE the Test versions to live Production
    • The PROCESS file is now generated automatically
      • It is recreated every time there is an Upgrade or a change to the !SortKey or a Selection Criteria
      • It also means the Selection Criteria are sorted (alphabetically ascending) on the name of the Job
      • To resort (such as having the default for a wire service as the last one) change the !SortKey
      • (This catches most people out, so take care when naming – and use the !SortKey Test screen to check that the right Job is being run against the data file.)
    • Rollback – if you made a mistake in the current version of a parameter file, you can roll back to the last production version
    • See the results – input and output files – of each stage
      • In the old version, only the original, raw data file and the final output are visible
    • Compare changes between the Prod and Test versions of each Parameter file