The Guide to the FIP Data Formatting Module
Relevant program documentation :
* form
* ipformd
* ipformat
* ipformbl
* ipformch
* ipfsel
* ipfprep
* ipfchk
* ipsetter
HOW TO FORMAT !
Time for Some Golden Rules:
- 1: Study the INPUT file in detail.
Try to get several input files, as you need to know EXACTLY what variations there can be in the input.
Try to get a specification for the input if it exists, as that can save hours of guessing.
- 2: Study the OUTPUT file in detail.
Again a spec is useful.
- 3: Talk to people about why they did it that way or why they want it that way.
If you are trying to copy an existing format, check with the Users what they REALLY need. Often things were done in a strange way because of system shortcomings or users’ personalities. Get a different person in charge and they may have different emphases. You may find that a subsequent system, written on the system you are delivering to, is making a good many changes that you could include in your file processing. Similarly you may find that a lot of manual work is done on the file which you could circumvent.
- 4: Remember to eat Lunch.
No system is worth missing food and drink.
Now the Steps :
1. Sort out Copy Flow for both testing and live running.
- Do you need an ‘xchg’ before formatting ?
It is always a good idea to rationalise the input file as far as possible – running a preliminary xchg may help.
- Do you need an ‘xchg’ after formatting ?
Again there may be some tidying up needed once the file is processed that is better suited to ipxchg than to ipformat.
- What about special processing like needing to ‘sort’ output ?
Do you need to sort the copy inbound ? Do you need to re-sort it outbound ? Do you need to select specific records ?
- Does it need multiple dataformats ?
Occasionally you get a file that can be sorted one way, formatted (say, to pick up items between consecutive records), then sorted another way and reformatted for output.
2. Map out what you want to format.
- It is usually easier to make notes BEFORE starting to test.
- A few notes are all that is required.
3. Setup and start testing the main Format.
- It is often easier to copy an existing Text file from form/text, use it as a template and rework it to meet the new requirements.
- There are two ways of testing copy ‘offline’. At the Unix command line you can use the ‘form’ program to set yourself up a testing environment.
- Alternatively you can use the form module in W4.
4. Work out the other parameters in the Copy Flow.
- See the relevant section in this guide for an example of Copy Flow.
5. Release
- Dual running over a number of days is always advised.
6. Check
It is always worth going back to the users and seeing how things can be improved for them. There may be things you have missed, or just things that they now realise you can do, that they have never thought to ask for before.
Pointers
- If you need to sort the output file, use the ‘job’ parameter in the form/PROCESS file (see sections Copy Flow and [[Ipformd][ipformd]] in the Program Section).
- If you need to select only certain records from the input file, matched against a standing list of keys, use [[Ipfsel][ipfsel]] (see description in the Program Section).
- If you need to track whether several input files have arrived over a period of time, flag them if they are missing and process them if they are there (for instance, a number of postscript files for an OPI), use [[Ipformch][ipformch]] (see description in the Program Section).
- If the input file(s) are very dirty and inconsistent, clean them up beforehand with [[Ipfprep][ipfprep]] (see description in the Program Section).
USING ‘FORM’ TO TEST AND TUNE
There are two interfaces to test & tune your dataformats. ‘form’ is a fast, basic, command-line-driven interface that runs under Unix. There is also a web-based interface available through W4.
FORM, the user interface for the Data Formatting module, can be used to test
and tune a process ‘internally’ – without actually sending the output file anywhere nor deleting the input.
It can also be used to run an XCHG before and/or after the format.
Several tests can be run and their settings are held in a series of Settings files in tables/form/test.
Note that at the moment ‘form’ will NOT scan the PROCESS file and run any [[Ipformd][ipformd jobs]] (qv), as it is used purely for getting the parameter file in form/text 100% correct.
Firstly you choose an existing Settings file – if there is one.
Then you need to get the input file into the queue in spool you are using for testing.
If it is the first time you will need to ‘mkdir’ your test area. Generally one has been setup called ‘spool/test’.
If the input file is on diskette or on another system, just copy it over. If the file came from a wire service or dialup and is in an FIP Archive log, you can add a destination in the sys/USERS file and resend to it. Generally, if FIippies have been onsite, there is a default one already setup called ‘test’ which goes through ‘ipedsys’ to cleanup the filename BUT does not go through any ‘xchg’ :
sys/USERS : test= DP:com2 DQ:2edsys EQ:test SC:no DC:no DF:testing
This uses edsys/TESTING to place the file in spool/test.
To get into the test bit of ‘form’, type ‘x’ at the form prompt. This will reveal firstly which Settings files are available which you can choose from.
The current settings are displayed before the list of options available.
All options are a single character (case-insensitive) followed by ‘enter’.
When changing settings, a ‘*’ can be used to list all the files in the relevant directory. Case is only sensitive for filenames.
    root @ com_server2/ > form
    form-com_server2:x
    ** Test FORM - Looking for Settings files --
    List of Files in the Queue : /fip/tables/form/test
    total 2
    -rw-rw-rw- 1 root 93 Mar 17 10:53 RESRAC
    -rw-rw-rw- 1 root 94 Mar 17 19:52 RESTEST
    ** Hit Return to continue ..
    ** Test FORM - Choose a Settings files (or return to ignore) :restest
    ** Test FORM - Existing settings are :
    Working Queue : /fip/spool/test
    Input file : rem737
    TEXT Param file : RESTEST
    XCHG before : PA2FORM
    XCHG after : PA2ATEX
    DU for Live Tests : sunres
    ** Options are :
    Change Settings : C
    List the Working Queue : L
    Look at the Input File : I
    .. .. the Output File : O
    .. .. the BreakOut : T
    Edit TEXT file : E
    .. Xchg before file : B
    .. Xchg after file : A
    .. PROCESS for jobs : P
    Help : H
    Run a Test : R
    Send Live Test : S
    Quit : Q
    :
In (slightly) more detail, the options are :
Option | Description |
---|---|
Change Settings | Change any of the 4 inputs |
List the Working Queue | ‘ls -l’ of the working queue |
Look at the Input file | More, Dump, Edit, Tail the input which MUST BE in the working queue |
Look at the Output file | More, Dump, Edit, Tail the output which is hidden in the temp queue. |
Look at the BreakOut file | More, Dump, Edit, Tail the breakout file, which is the one created by IPFORMAT showing how the input is split into fields and records |
Edit the TEXT parameter file | IPVI the parameter file in form/text |
Edit the Xchg before file | IPVI the (optional) xchg file to be used BEFORE the format. |
Edit the Xchg after file | IPVI the (optional) xchg file to be used AFTER the format. |
Help | More this file ! |
Run a Test | Run IPFORMAT and then look at the output file. |
Send Live Test | Copy the test file and send it to the DU (destination) specified. |
Quit | No documentation available on this |
In more more detail …..
    -- Change Settings :
    ** Change Settings - Existing settings are :
    Working Queue : /fip/spool/test
    Input file : rem727
    TEXT Param file : RESTEST
    XCHG before : PA2FORM
    XCHG after : PA2ATEX
    DU for Live Tests : sunres
    (At each prompt, Use '*' for list of files)
    Change Working Queue : W
    Change Input file : I
    Change TEXT Param file : T
    Change XCHG before : B
    Change XCHG after : A
    Change DU for Live Tests: D
    List the Working Queue : L
    Quit : Q or enter
    :
Note that a ‘*’ ‘enter’ at any of the Change file lines will list the relevant queue before reprompting.
Type ‘-none-‘ (or in fact just ‘-‘) to set an optional field to ‘-none-‘.
Note that any reference to ‘jobs’ should be ignored in this version – I do.
Looking at any of the files :
    ** Look at file : rem727 : Options are :
    More : M or enter - This does a 'cat -v' before to show ControlChr
    Dump : D - Essentially an 'od -ab'
    Edit : E - Using 'vi'
    Tail : T - Last 20 lines of a 'cat -v'
    Quit : Q
    :
Running a Test :
– A simple format with no xchgs –
    ** Running please wait ...
    /fip/bin/ipformat -i /fip/spool/hoswrk/rem727 -p ATEXHORSES -o /fip/xFORM.XX.DEFAULT.XX -s -D -l -xo
    ** Ok Done
    Hit Return to continue ..
This will now go directly into the Looking at Output file menu.
– A more complicated run with an xchg before and after –
    ** Running please wait ...
    Saving the input in /fip/x/FORM_rem727
    ipxchg -1 /fip/x/FORM_rem727 -D PA2FORM -o /fip/x -F
    ** Ok : Xchg Before finished
    /fip/bin/ipformat -i /fip/x/FORM_rem727 -p RESTEST -o /fip/x/FORM.XX.DEFAULT.XX -s -D -l -xo
    ** Ok : Format finished
    ipxchg -1 /fip/x/FORM.XX.DEFAULT.XX -D PA2ATEX -o /fip/x -F
    ** Ok : Xchg After finished
    Hit Return to continue ..
Again this will now go directly into the Looking at Output file menu.
Note that ‘form’ allows you to save the setting you have chosen in a Settings file, so that the next time you go into form it should display the settings from the last time.
Sending a Live Test :
This will copy the input test file and send it to the Destination (DU) called the ‘Live DU’ above.
It then tails the Item Log of that system under the assumption that ‘ipformat’ will give a message when the file is through. Press Ctrl-C to stop and return to the main Form Test prompt.
*Prerequisites for sending a Live test (if that is possible) :*
In form, you need to specify :
- an input file
- a working queue
- a destination (DU) which MUST be in the USERS file.
- Also remember to check your SC/DCs for your xchg’s.
- Also remember to add the selection lines in tables/form/PROCESS.
—-
USING THE WEB BASED VERSION OF ‘FORM’ TO TEST AND TUNE
The web interface also allows you to run offline tests of your dataformat.
It is similar to form, but gives you slightly nicer views of the file (for instance hidden characters, high characters and line endings are all highlighted in red).
i) *You will need to have the form option enabled in your W4 logon*
This is normally done by adding the line
;;;;; Data Formatting
options:Data Formatting:/fip-pages/form/dftest.html:_blank
to either your W4 logon, or, more usually, to a template called by your W4
logon.
ii) This should be a fairly intuitive interface (feedback welcomed though)
You first select or create a test ….. a test is a definition of how you are
planning to test a file, so it will define the input file, any xchgs you plan
to run against it, and the format you plan to run.
The top section of the left hand pane will describe the parameters you choose, and the links in the bottom section can be used to select a sequence of xchgs, sorts and formats. Having selected the parameter files to run
COPY FLOW
How do you route copy into, and out of, the data formatting module ?
– Using the normal FIP routings !
A simple example is the Horse Racing Cards, which arrive via dialup modem from the course administrators :
Stage | Program | Input Queue | Parameter tables and Remarks |
---|---|---|---|
Input | VWIRE | – | wire/RM : where RM is the name of the incoming feed |
Routing | IPROUTE | 2brouted | route/RM : the actual routing line is : 1 z="*HORSE RACING*" +horses ie make a 2nd copy of the file and send it to destination ‘horses’ |
Dist’n | IPWHEEL | 2go | sys/USERS : horses= DP:localhost DQ:form SC:NO DC:NO ie process on whichever system it arrived on, send the file directly to spool/form with NO chr translation (ipxchg). |
Format Selection | IPFORMD | form | form/PROCESS : selection of the actual Format parameter file. Selection criteria for Horses : if the DU field is exactly equal to ‘horses’, do the ‘atexhorses’ job. |
Actual Processing | IPFORMAT | – | form/text/ATEXHORSES : the original file is deleted at the end while the new, formatted file is sent back to ipwheel with a new destination (DU) of ‘atexhorses’. |
Dist’n of output | IPWHEEL | 2go | sys/USERS : atexhorses= DP:atex1 DQ:junk-wir SC:HORSES DC:ATEX DF:DATAFORM ie send the file to the 2atex queue via ‘xchg’. |
Chr Xchg | IPXCHG | xchg | xchg/HORSES2ATEX : clean up the data |
Send to Atex | IPGTWY | 2atex | gateway/DATAFORM : send it to junk-wir |
EXAMPLE 2 : STOCKS : More complicated examples are for the various Stocks.
These large Hong Kong tables follow the same path as ‘horses’ except that the names of the parameter files are obviously different.
Format text file | form/text/HKSTOCKS | |
output xchg | xchg/HKSTOCKS |
For the Regional Stocks, there are two different input formats (plus Manila, which is different but simpler) which need to be used to create almost the same output.
The Input variations are :
RIC | DISPLAY NAME | LAST | TODAY’S HIGH | TODAY’S LOW | HISTORIC CLOSE |
---|---|---|---|---|---|
AAH.AS | ABN-AMRO HLDGS | 59.9 | 60 | 59.3 | 59.5 |
ACHN.AS | ACF HOLDING | 35.7 | 35.7 | 35.5 | 35.5 |
AEGN.AS | AEGON NV | 112.4 | 113 | 12.2 | 112.8 |

*FASTCLOSE | RIC | 950301 | 1900 | KLS | |
---|---|---|---|---|---|
SECURITY | DATE | HIGH | LOW | LAST TRADE | PREVIOUS CLOSE |
AYER HITAM TIN STK | 950301 | 4.100 | 3.960 | 4.100 | 3.960 |
AYER HITAM PLANT STK | 950301 | 14.300 | 14.300 | 14.300 | 13.500 |
ACIDCHEM STK | 950301 | 6.350 | 6.100 | 6.350 | 6.000 |

NAME | CLOSE | HIGH | LOW | PRECLOSE |
---|---|---|---|---|
A Soriano | 3.2 | 3.2 | 3.15 | 3.2 |
A Soriano B | 3.2 | 3.2 | 3.2 | 3.15 |
There are two outputs which are identical EXCEPT London and New York prices are quoted in fractions NOT decimals and so they require a different Character Xchg.
The processing is more complicated than for ‘atexhorses’ as some of the names of the Stocks are changed AND the output is sorted alphabetically. For example ‘INTL BUS MACHINE’ needs to be ‘IBM’, but as ‘IBM’ it will start the ‘I’s and NOT appear after ‘Inland Steel’ as in the input feed.
So we use the ‘job:’ keyword in the PROCESS file to get IPFORMD to run a series of jobs rather than just start IPFORMAT as in the example above.
Look at the processing for London and New York :
While the selection remains similar to ‘atexhorses’ :
SN=LON.TXT >fracstocks ; London
we select on the filename this time.
But at the top of the PROCESS file there is a series of parameters for the job ‘fracstocks’ :
    ;
    ; Job sequence for London and New York - Fractions
    job:fracstocks /bin/rm -f formsave/FRACSTOCKS*
    job:fracstocks /fip/bin/ipformat -p fracstocks -i $i -D -S FRACSTOCKS
    job:fracstocks /fip/bin/ipxchg -1 formsave/FRACSTOCKS -D fracstocks -F -o formsave
    job:fracstocks /bin/sort +0 -3 -o formsave/FRACSTOCKS.s formsave/FRACSTOCKS
    job:fracstocks /bin/mv formsave/FRACSTOCKS.s 2go/#SN:\SN#DU:atexstocks
So the copy flow for London Stocks is that the FORMAT stage is replaced by ALL these job lines in sequential order :
- Remove any files starting FRACSTOCKS in formsave
- Run IPFORMAT with the input file and parameter file FRACSTOCKS leaving the output in formsave/FRACSTOCKS
- Run IPXCHG once on formsave/FRACSTOCKS overwriting the input with the xchged file. This is to get the correct StockNames eg:
x/Hong Kong Telecm/HK Telecom
- Sort formsave/FRACSTOCKS on the first three words, creating an output file called formsave/FRACSTOCKS.s
- Move formsave/FRACSTOCKS.s to 2go with destination (DU) ‘atexstocks’ and preserving the original filename (SN)
Please refer to the documentation on IPFORMD in the programs section for more information on jobs.
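To make the rename-then-sort idea concrete, here is a minimal Python sketch. It is NOT how IPXCHG or /bin/sort are implemented: the `RENAMES` table, the sample feed lines and the case folding in the sort key are all assumptions made purely for illustration.

```python
from __future__ import annotations

# Hypothetical stand-in for the rename xchg: search string -> replacement.
RENAMES = {"INTL BUS MACHINE": "IBM"}


def rename(line: str) -> str:
    """Replace any known long stock name with its short form."""
    for old, new in RENAMES.items():
        line = line.replace(old, new)
    return line


def sort_key(line: str) -> list[str]:
    """First three whitespace-separated words, roughly what 'sort +0 -3' keys on.

    Folding case is an assumption of this sketch, not a documented behaviour.
    """
    return [word.casefold() for word in line.split()[:3]]


def rename_then_sort(lines: list[str]) -> list[str]:
    """Rename first, then sort, so 'IBM' lands at the start of the 'I's."""
    return sorted((rename(line) for line in lines), key=sort_key)


# Made-up sample lines in feed order.
feed = ["Imperial Oil 5", "Inland Steel 10", "INTL BUS MACHINE 20"]
print(rename_then_sort(feed))
```

The point is only the ordering of the steps: if the sort ran before the rename, ‘IBM’ would stay wherever ‘INTL BUS MACHINE’ had been sorted.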
One further note on the Stocks is that all the output files pass through the STOCKS2ATEX xchg.
This is used to add column headers before certain Stock names. eg :
x/Northern Elec/\n{M1Stock\rClose\rHigh\rLow\rPrev\r\n{M0Northern Elec
4. PARAMETER FILE REFERENCE GUIDE
This is the reference and hints section describing the main Parameter file used for processing.
Overview
- Record Processing Lines
- Reserved Names
- Structure of the Parameter File
- Comments and the binary version of the Parameter File
- Processing Loop
Part 1. Define what the input file looks like plus a general section covering fixed information.
- define the type of file.
- define the record and field (if any) separators or sizes.
- define the record and field keytypes (if any).
- define all ‘sets’ – short forms.
- define all partial field structures – ‘partial’.
- define all localised searches – ‘match’.
- define strings placed before and/or after the data.
- define alternate names and headers.
Part 2. Output Section
- Record Processing Lines
- Spaces, End of Lines, Double Quotes
- Builtins
- Tests
- Using Flags
- Save Areas
- Using Counters
- Using Calculations
—-
Overview
Record Processing Lines
This is flagged as beginning with the ‘output:’ keyword. It describes what processing should be done for each input record.
    output:
    r=X
    lines to describe the output.
Record lines can have system variables, input fields, tests, builtin formatting etc.
Each one of the keywords is described below.
Each line is a self-contained item ending with a NewLine (Unix) or CR LF (PC) or CR (Mac). The text parameter file can be edited by any word-processor on any (normal) platform AS LONG AS the end result is a pure, raw ascii file with no Presentation or fancy graphics embedded.
Comments are the usual semi-colon in front.
; comment
Reserved Names
The list of keywords is the list provided above plus a series of tests and builtins described below.
Note that it is possible – but not advised – to override some of these keywords. So these names should be considered reserved. In addition a few other names have been reserved for future use :
- blksep:
- blkkey:
- blksiz:
The Structure of the Parameter File
The text file is split into 2 main parts as described above. The OUTPUT section must be the second and is marked by the keyword ‘output:’ on a line on its own.
By common consent, the first part starts with the definitions of the file, records and fields but this is NOT strictly necessary. The advice is – do whatever is easiest for you.
Comments and the binary version of the Parameter file
Comments are lines STARTING with a semi-colon. You can have millions of comment lines and, except for the first run, they will have no effect on run-time speeds.
This is because ‘ipformat’ uses a compact, binary version of the Text Parameter file which is built automatically by ‘ipformbl’ every time you modify the Text version.
The only time you need touch the binary versions (in tables/form/bin) is that they should be deleted during every software upgrade of the DF module.
Comments are encouraged
The Processing Loop is…
The actual processing cycle is :
For each input file hitting queue spool/form :
- ‘ipformd’ will select the correct Parameter file using form/PROCESS
- Normally this will mean starting ‘ipformat’ with the chosen file
Once started by ‘ipformd’, ‘ipformat’ will go through the following steps :
- Preprocessing
  - Create a Fip style header unless not required with ‘nohdr’. This can be the standard one or can have extra fields added using ‘hdr’.
  - Add Data at the beginning of the output file if the ‘before’ keyword has been specified.
- Processing input file
  - Split the input file into records and for each record :
    - Start at the ‘output’ section of the Parameter file
    - Check each record specification line. If it is for that record type, process it
    - Loop around for the next record
- Postprocessing
  - Add Data at the end of the output file after the processed data if the ‘after’ keyword has been specified.
  - Create a Fip filename. This can be the standard one or can be replaced if the ‘name’ keyword is specified.
  - Send the file to spool/2go for ‘ipwheel’ to distribute, usually via ‘ipxchg’.
That’s it !
—-
Syntax of Each Keyword
filtyp: (type).
Where type is
text | – Ordinary text file with each record having a defined separator. |
fixed | – fixed record sizes |
variable | – variable record sizes |
If filtyp is fixed or variable, you will need to specify the size (or, for variable, the maximum size) of each record.
For most applications, filtyp:text means you also have to define the record separator, ‘recsep’.
Syntax for ‘recsep’ and ‘fldsep’
Syntax:
    recsep: (FipSeq string)
    fldsep: (FipSeq string)
    eg: recsep:\036
Normally the separator will be \n or \r\n for NewLine or Carriage Return NewLine.
Note that if you just put ‘\n’, ‘ipformat’ will automatically take any combination of CR and NL.
Syntax for ‘reckey’ and ‘fldkey’ – Define the record or field key
This defines the type and size of either the record or the field key.
Normally keys are positioned at the beginning of the record/field but optionally these can be at the end or at an offset from the beginning or end.
Syntax:
    reckey: (length) : (type) : (posn) : (EndChr) : (delY/N)
    fldkey: (length) : (type) : (posn) : (EndChr) : (delY/N)
where
- length or size of key – This can be 0 for any length
- type (optional) can be
- a-alphabetic
- u-uppercase
- l-lowercase
- n-number
- p-printable
- s-space(or tab, CR, NL or FF)
- b-binary (ie anything)
- x-alphanumeric
- c-control (ie < 040 or >= 0177)
- z-anpa hdr field (ie alnum plus non-quad/format punct.)
- t-punctuation
- posn (optional) is the offset of the key from the start of the zone. If negative, count from the end of the zone
- endchr (optional) is a single chr which terminates the key
The separator can be any punctuation chr as long as the same chr is used for each field. eg the following two are equal :
    fldkey 3,n,,|
    fldkey 3|n||\174
Ie the end Chr is a pipe but as that is used as a separator, use the octal value
What is a key ? The key is used really as a TYPE of PROCESSING flag for the output section.
It can be a unique record key – such as a stock code – but if you have several thousand, it is going to be unwieldy specifying all of them.
So generally we are trying to classify records into general types. For example, take a text file containing school results like :
    School Pinky High
    Head James Pinky
    Pupil Ramsay Macdonald 31.6
    Pupil Thatcher Margret 77.3
    Pupil U-Dones Helen 99.3
We can use the first field, which is alpha and variable length followed by a space.
reckey:0:a
This can be signalled in the output section as :
    r="school" (do the school bit)
    r="head" (do the head bit)
    r="pupil" (do the pupil bit)
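As an illustration only, the idea of "key as a type-of-processing flag" can be sketched in Python. The `record_key` and `HANDLERS` names are hypothetical and the handler bodies are placeholders; the sketch assumes a key made of the leading alphabetic characters (as in reckey:0:a) and case-insensitive matching.

```python
import string


def record_key(record: str) -> str:
    """Return the leading run of alphabetic characters, lower-cased.

    Lower-casing mirrors the default case-insensitive key matching.
    """
    key = []
    for ch in record:
        if ch not in string.ascii_letters:
            break
        key.append(ch)
    return "".join(key).lower()


# One placeholder handler per record type, like the r="..." lines.
HANDLERS = {
    "school": lambda rest: f"SCHOOL: {rest}",
    "head": lambda rest: f"HEAD: {rest}",
    "pupil": lambda rest: f"PUPIL: {rest}",
}


def process(record: str) -> str:
    """Classify the record by its key and hand the rest to the matching handler."""
    key = record_key(record)
    rest = record[len(key):].strip()
    return HANDLERS[key](rest)


data = "School Pinky High\nHead James Pinky\nPupil Ramsay Macdonald 31.6"
for rec in data.splitlines():
    print(process(rec))
```

Three handlers cover any number of pupils, which is the reason for classifying by record type rather than by a unique key.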
Syntax for ‘recsiz’, ‘fldsiz’ – Define the length of fixed size records & fields
Syntax :
    recsiz: (length)
    fldsiz: (length)
where length is the size of the record or field
Syntax for ‘keycasesens’ – Field and Record keys can be upper AND lower
Normally all keys – record or field – are considered to be case insensitive. So a key r = ‘aaa’ will pick up both AAA and aaa.
Use this command to make key matching case-sensitive, so that AAA and aaa are treated as different keys.
Syntax :
keycasesens:yes
Syntax for ‘number’ – Change the number system for specifying non-printable chrs
When a non-printable chr is specified in the form ‘\012’ the ‘number’ system can be changed to decimal or hex.
The default number system is octal.
Syntax:
number:dec or number:hex or number:oct
The change takes effect for all lines in the parameter file lower down until changed again by another ‘number’ keyword (Why you would specify different number systems for different parts, I have no idea).
So a New Line chr will be
    \012 octal
    \010 decimal
    \0a hex
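The three spellings above all name the same character. A small Python check of the arithmetic (the `escape_to_char` helper is hypothetical, not part of FIP):

```python
# Map the 'number' keyword values to their numeric base.
BASES = {"oct": 8, "dec": 10, "hex": 16}


def escape_to_char(digits: str, number: str = "oct") -> str:
    """Turn the digits after the backslash into a character.

    'number' plays the role of the number: keyword; octal is the default.
    """
    return chr(int(digits, BASES[number]))


# All three are the New Line character (value 10).
print(repr(escape_to_char("012")))
print(repr(escape_to_char("010", "dec")))
print(repr(escape_to_char("0a", "hex")))
```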
Syntax for ‘wild’ – Allow wild card strings using a particular chr
Syntax :
    wild: (Chr to use to signify a wild string)
    wild:*
This allows a wild string to be used when specifying record keys.
Note there is NO automatic wild string chr – you always have to specify it.
For example:
wild:$
Allows us to specify in the output section :
r=$ "This is done for all records"
Syntax for ‘wchr’ – Allow a wild card character using a particular chr
Syntax :
    wchr: (Chr to use to signify a wild character)
    wchr:?
This allows a wild character to be used when specifying record keys.
Note there is NO automatic wild chr – you always have to specify it.
For example :
wchr:?
Allows us to specify in the output section :
r="abc?e" "This is done for all records abc(something)e"
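One way to picture ‘wild’, ‘wchr’ and ‘keycasesens’ working together is to translate a key specification into a regular expression. This is a sketch under assumptions: the function names are invented, and whether a wild string may match zero characters in ipformat is not stated here (the sketch allows it).

```python
import re


def key_pattern(spec: str, wild: str = "", wchr: str = "", casesens: bool = False):
    """Build a regex from a key spec.

    wild     - chr standing for any string (empty = not defined)
    wchr     - chr standing for any single character (empty = not defined)
    casesens - True behaves like keycasesens:yes
    """
    parts = []
    for ch in spec:
        if wild and ch == wild:
            parts.append(".*")
        elif wchr and ch == wchr:
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), 0 if casesens else re.IGNORECASE)


def key_matches(spec: str, key: str, **opts) -> bool:
    """True if the whole key matches the spec."""
    return key_pattern(spec, **opts).fullmatch(key) is not None


print(key_matches("abc?e", "ABCde", wchr="?"))   # wild character, any case
print(key_matches("$", "anything", wild="$"))    # wild string
print(key_matches("$", "anything"))              # no wild chr defined: literal '$'
```

Because there is no automatic wild chr, an undefined ‘$’ or ‘?’ is treated as an ordinary character in this sketch.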
Syntax for ‘stripeol’ – Do NOT strip multiple blank records
Syntax :
stripeol:no
Where the ‘recsep’ is some combination of CR and NL, normally blank lines or multiple occurrences of CR and NL are stripped.
This command is used to turn that option OFF and to treat all lines as valid records, even ones with no data.
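The difference between the default behaviour and ‘stripeol:no’ can be sketched in Python. This is an approximation for illustration: `split_records` is a hypothetical helper and the exact rules ipformat applies to mixed CR/NL runs are assumed, not documented here.

```python
from __future__ import annotations

import re


def split_records(data: str, stripeol: bool = True) -> list[str]:
    """Split text into records on CR/NL separators."""
    if stripeol:
        # Default: any run of CR and NL counts as ONE separator,
        # so blank records disappear.
        return [rec for rec in re.split(r"[\r\n]+", data) if rec]
    # stripeol:no - every CR LF, CR or NL ends a record,
    # so records with no data are kept.
    records = re.split(r"\r\n|\r|\n", data)
    if records and records[-1] == "":
        records.pop()  # nothing follows the final separator
    return records


sample = "a\r\n\r\nb\n"
print(split_records(sample))
print(split_records(sample, stripeol=False))
```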
Syntax for ‘startkey’ – Force the record Key or type of the BEFORE-FIRST record
Syntax :
startkey:yes
This forces the key BEFORE the first record to be ‘x1594’ which can be used in the ‘ifprv’ test – if previous key.
Syntax for ‘set’ – Short forms or format names
Syntax :
    set (name) (any fixed text)
    set pagehdr Stock<t>Close<tr>High<tr>Low<tr>Prev<qr>#\n
Set lets us specify easy-to-remember names to reference strings
Sets can NOT be split over several lines.
All leading and trailing spaces are stripped. So use either double quotes to embed them or the ‘\s’ escape string. To specify a double quote, use the octal string.
FipSeq strings are usable, but note that ‘set’s are parsed when the parameter file is changed/created, so any Variable data will be of that time and any FIP hdr data meaningless.
eg
    set timedate \$d-\$m-\$y \$h:\$n
will produce the date and time of when the parameter file was last changed.
To get run time date/time, specify the same string in the output section record line NOT the ‘set’
The name of the set – timedate in our example – is case-INsensitive. So it may be called as TIMEDATE or TimeDate or any other variation known to man.
Syntax for ‘include’ – Include another file of instructions
Syntax :
    include (filename)
    include quark.tags
This will include another text file in tables/form/text with more Data Formatting commands.
The filename is forced to UPPERCASE as normal, so in the above case the file will be :
/fip/tables/form/text/QUARK.TAGS
Generally include files will have commands such as ‘set’ and ‘calc’ which are common to a series of Data Formats – like Quark styles for example.
Note that if you update the include file, ipformat will NOT rebuild its binary. Normally, of course, ipformat realises a change has been made to the main text file and rebuilds its binary automatically, but a change to an include file goes unnoticed, so you need to ‘touch’ all the other text files in form/text to get the new version :
ipt form/text touch *
Syntax for ‘calc’ – Define a calculation
Syntax :
    calc:(name) (The calculation)
    calc:percent 100*(c1/c2)
Define a calculation where cX are used as variables. The ‘name’ is that used in the ‘output’ section.
Calculations can NOT be split over several lines.
Variables are loaded at run-time using savnum, savbyte, savint, savswint, savlong or savswlong.
The default precision for a calculation is 2 decimal places. This can be overridden using the syntax :
calc:percent:0 100*(c1/c2)
where the ‘0’ after the name is the number of decimal places in the range 0-6.
Calc can also be used to change the output format of a number by specifying the precision. If the raw data is in 4 dec places and you only want 2:
calc:dec2:2 c22
and in the output section, load c22 from data – in this case field 3 :
savnum=22 f3
Operators can be :
* + plus
* - minus
* * multiply
* / divide
Take care when dividing by zero !
Use round brackets to denote how the calculation should be worked out, ie the deepest level is calculated first, and then the calculation is worked from left to right.
For example :
    (c1/c2)*(c3/(c5+c4))/100
will run through the following order :
    Step 1 (c5+c4)
    Step 2 (c1/c2)
    Step 3 (c3/result of Step 1)
    Step 4 (result of Step 2 * result of Step 3)
    Step 5 (result of Step 4 / 100)
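The five steps can be traced by hand in Python. This is a sketch only: the variable values are made up, and `calc_example` simply mirrors the step order described above and the "number of decimal places" idea, not ipformat's actual arithmetic.

```python
from __future__ import annotations


def calc_example(c: dict[int, float], places: int = 2) -> str:
    """Evaluate (c1/c2)*(c3/(c5+c4))/100 one step at a time.

    'places' stands for the precision after the calc name (default 2).
    """
    step1 = c[5] + c[4]    # deepest brackets first
    step2 = c[1] / c[2]
    step3 = c[3] / step1
    step4 = step2 * step3  # then left to right
    step5 = step4 / 100
    return f"{step5:.{places}f}"


# Made-up values for c1..c5.
values = {1: 50, 2: 2, 3: 80, 4: 3, 5: 5}
print(calc_example(values))            # default precision of 2 places
print(calc_example(values, places=4))  # like calc:name:4
```

Note that a zero in c2, or c4 and c5 summing to zero, raises ZeroDivisionError in this sketch, which is the Python way of saying "take care when dividing by zero".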
Syntax for ‘fraction’ – Define a Fraction
Syntax :
fraction:(name):(precision) (Output style WITH & WITHOUT fraction)
where
* name is the function to be used in the output
* precision is the smallest denominator – 2, 4, 8, 16, 32, 64, 128 or 256 – default is eighths
* the output style holds two parse strings bracketed by a separator (‘+’ in the first example below), which can be any punctuation :
- the first style is the parse string used if THERE IS A FRACTION and an integer
- the second style is the parse string used if THERE IS ONLY AN INTEGER and NO FRACTION
    fraction:star:16 +\ZI \ZD/\ZN+\ZI.0+
    fraction:stox:32 |\ZI (\ZD-\ZN)|\ZI|
Fraction takes a field, partial field, saved field or calculation and splits it into portions which can then be used as normal FipSeq.
- ZS is the sign + or –
- ZI is the integer
- ZD is the fraction amount
- ZN is the fraction denominator
Specify two styles – the first for if there is a non-zero fraction amount and the second if there is none.
Syntax for ‘base’ – Define a Base number
Syntax :
base:(name):(precision) (Output style WITH & WITHOUT fraction)
Base is exactly the same as fraction except base will NOT attempt to ‘reduce’ the fraction.
- ie 2/4 is left as ZD=2, ZN=4 while fraction will force ZD=1 ZN=2
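The difference between ‘fraction’ and ‘base’ comes down to whether the fraction is reduced. A Python sketch of that idea follows; `split_number` and `star` are hypothetical names, and the rounding to the nearest 1/precision is an assumption, not documented behaviour.

```python
from math import gcd


def split_number(value: float, precision: int = 8, reduce: bool = True):
    """Split a decimal into (ZS, ZI, ZD, ZN).

    precision - smallest denominator (default eighths)
    reduce    - True behaves like 'fraction', False like 'base'
    """
    sign = "-" if value < 0 else "+"
    integer = int(abs(value))
    # Assumption: round the decimal part to the nearest 1/precision.
    numer = round((abs(value) - integer) * precision)
    denom = precision
    if numer == denom:  # rounding carried over into the integer
        integer, numer = integer + 1, 0
    if reduce and numer:
        common = gcd(numer, denom)
        numer, denom = numer // common, denom // common
    return sign, integer, numer, denom


def star(value: float) -> str:
    """Mimic the two styles of: fraction:star:16 +\\ZI \\ZD/\\ZN+\\ZI.0+"""
    _, zi, zd, zn = split_number(value, 16)
    return f"{zi} {zd}/{zn}" if zd else f"{zi}.0"


print(split_number(3.5, 4))                # fraction: 2/4 reduced to 1/2
print(split_number(3.5, 4, reduce=False))  # base: left as 2/4
print(star(12.5), "|", star(7.0))
```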
Syntax for ‘date’ – Generate a date and/or time from a number
Syntax :
    date:(name):(data order) (Output style)
    date:mono:dm +Last \ZW was \ZD-\ZN-\ZZ+
where mono is the name which is used in the output section
The Order of the raw data is important. So we divide it up into a series of 0 or more 2-digit numbers (or space digit) whose order is specified using :
- d – day — 1-31
- m – month — 1-12
- y – year — (last 2 digits only, must be after 1970)
- c – century — 19 or 20 only
- h – hour — 0-23
- t – minute — 0-59
- x – padding — ignore 1 or 2 numbers at this position. Only 1 padding is allowed.
So for the 21st March 2010, the incoming data must be :
if it is : 21032010 | data order is : dmcy |
if it is : 210310 | data order is : dmy |
if it is : 032110 | data order is : mdy |
Obviously only one format of data can be handled by a single ‘date’ but you can have as many ‘date’s as needed.
Note that spaces, letters and punctuation are stripped and/or used as field delimiters.
So the following are all equivalent :
* 200896
* 20th 8-August, 96
* This day 20th August (the 8th month) in the Year 96
The Output format is defined in normal FipSeq between two delimiters (‘+’ in our example above, but can be any punctuation except semicolon ‘;’).
The new data is added to a series of extra FipHdr fields :
- ZD – 2 digit day of month
- ZM – 2 digit month
- ZY – 2 digit year e.g. 92
- ZZ – 4 digit year e.g. 1992
- ZW – Day of week e.g. Monday, Tuesday etc
- ZS – 3 character day of week e.g. Mon, Tue etc
- ZN – Full name of month e.g. January, February
- ZL – 3 character month e.g. Jan, Feb, Mar etc
- ZJ – Julian day of year
- ZH – Hour 00-23
- ZI – Hour 00-12
- ZT – Minute 00-59
- ZP – am/pm
- ZA – Week Number 00-53
- ZB – Week Number 01-53
- ZC – Day of the week 0-6
(see also manual page for ‘strftime’ for slightly more information)
Note that actual Day and Month names depend on the LOCALE of your shell/computer.
The Default output format, if none is specified, is :
+\ZW, \ZD \ZN \ZZ+
Which for (order dmcy), (data) 16111997 English LOCALE gives
Sunday, 16 November 1997
Note that if any information is NOT supplied, the run time date/time is used.
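A rough Python sketch of the "data order" idea: pull out the 1- or 2-digit numbers, pair them with the order letters, and fall back to the run-time date for anything missing. The `parse_date` helper is hypothetical, and the rule used for a year without a century (70 or above means 19xx) is an assumption based on the "must be after 1970" note.

```python
import re
from datetime import datetime
from typing import Optional


def parse_date(raw: str, order: str, now: Optional[datetime] = None) -> datetime:
    """Build a date/time from raw data and a data order such as 'dmcy'."""
    now = now or datetime.now()
    # Letters, spaces and punctuation only act as delimiters.
    numbers = [int(n) for n in re.findall(r"\d{1,2}", raw)]
    parts = dict(zip(order, numbers))  # 'x' padding is stored but never used
    if "y" in parts:
        # Assumption: without an explicit century, 70-99 is 19xx, else 20xx.
        century = parts.get("c", 19 if parts["y"] >= 70 else 20)
        year = century * 100 + parts["y"]
    else:
        year = now.year
    return datetime(
        year,
        parts.get("m", now.month),
        parts.get("d", now.day),
        parts.get("h", now.hour),
        parts.get("t", now.minute),
    )


# Roughly the default output format +\ZW, \ZD \ZN \ZZ+ (names depend on LOCALE).
print(parse_date("16111997", "dmcy").strftime("%A, %d %B %Y"))
print(parse_date("20th 8-August, 96", "dmy").date())
```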
Syntax for ‘partial’ – Further subdivide a record or field
Syntax :
partial:(name) (type):(length):(startchr):(endchr)
where
- type can be
- a : alphabetic
- u : upper case alpha
- l : lower case alpha
- n : numeric
- t : punctuation
- p : printable (generally Spc to ~)
- s : space ( tab, space, ff, cr, nl, lf)
- z : anpa type string (alphanumeric plus hyphen)
- b : binary
- length can be zero for unlimited.
- startchr and endchr are optional.
There can be up to 100 partial fields as required.
The contents of a Partial field are accessed by specifying pX where X is the sequential no from the start of the partial – ie first is p1, then p2 etc
Syntax to partial a field in the output section record line is
(partial name) (field name)
and then you can use any partial fields. For example:
* To pull apart a date in the form : 26th March 1997
    ; comment 26 th March 1997
    partial:pdate s:0 n:2 a:0 s:0 a:0 s:0 n:4
In the output section, assuming field 3 of record type 99 contains the date, we can output just the year by :
r=99 pdate f3 p7
will produce “1997”
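To see why the year ends up in p7, the split can be imitated in Python. This is a rough sketch under stated assumptions, not ipformat's own parser: `CLASSES` and `split_partial` are hypothetical names, only the types a, n and s are covered, and a non-zero length is treated as a maximum.

```python
import re

# Assumed character classes for a few partial types (illustrative only).
CLASSES = {"a": "[A-Za-z]", "n": "[0-9]", "s": r"\s"}

def split_partial(spec, text):
    """Split text into p1, p2 ... following a 'type:length' spec; 0 means unlimited."""
    pattern = ""
    for item in spec.split():
        kind, length = item.split(":")
        count = "*" if length == "0" else "{0,%s}" % length
        pattern += "(" + CLASSES[kind] + count + ")"
    m = re.match(pattern, text)
    return list(m.groups()) if m else []

parts = split_partial("s:0 n:2 a:0 s:0 a:0 s:0 n:4", "26th March 1997")
p7 = parts[6]   # partial fields are numbered from 1, Python lists from 0
print(parts)
print(p7)
```

The leading `s:0` matches nothing here, so p1 is empty, p2 is the day number and p7 is the year.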
Syntax for ‘match’ – search and replace on a single field
Syntax :
match:(matchname) /(search string)/(replacement)
match:(matchname):c /(search string)/(replacement)
where
- matchname is a unique alphabetic name
- c (optional) is for case-SENSITIVE searches
- the delimiters – ‘/’ in the above example – can be any punctuation as long as they are the same for that match line
This is a localized search and replace or search and zap function which is applied ONLY to a record, field, partial field or save area.
Normally the search is case-INSENSITIVE but it can be made case-SENSITIVE with the ‘:c’ after the matchname.
No wild chr or wild strings are permissible at present.
To specify in the output section :
(matchname) (zone)
where
- matchname is the SAME as specified in the ‘match’
- zone is the record, field, partial or save area.
Up to thirty matches can be specified for a single zone.
For example :
match:jan /1/January/
match:feb /2/February/
match:mar /3/March/
etc .. etc
match:dec /12/December/
In the output section – let’s pretend field 4 of record type 23 contains a number we want to translate to a month :
r=23 dec nov oct sep aug jul jun may apr mar feb jan f4
Note that dec, nov and oct are done first. Otherwise, if ‘f4’ was “11”, the jan match would replace each “1” and turn “11” into “JanuaryJanuary” !
‘match’ complements the character xchg in IPXCHG. However they are slightly different in that ‘match’ is localised in that you apply it ONLY to a single field whereas IPXCHG works on a whole file (or using flags, to a selected paragraph, line or section of the file).
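The ordering problem is easy to reproduce outside ipformat. The Python sketch below simply applies one literal, case-insensitive replace per month number in a given order; `apply_matches` is a hypothetical helper and is not part of the FIP package.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def apply_matches(value, numbers):
    """Apply one search-and-replace per month number, in the order given."""
    for n in numbers:
        name = MONTHS[n - 1]
        # Literal, case-insensitive replace of the number by the month name.
        value = re.sub(re.escape(str(n)), lambda m: name, value, flags=re.IGNORECASE)
    return value

wrong = apply_matches("11", range(1, 13))       # jan first
right = apply_matches("11", range(12, 0, -1))   # dec, nov, oct ... jan
print(wrong)
print(right)
```

Running the two-digit months first means “11” is consumed whole before the single-digit matches get a chance to see it.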
Syntax for ‘style’ – Reformat a data zone
Syntax :
style: (name) (a single printf conversion syntax)
eg :
style:twonum %.02d
- Uses printf which is nasty but a standard (of sorts). Do a man printf for fuller information if you need.
- Always starts ‘%’
- If the expression does not end with an ‘s’ (‘d’ for integer for example), then the string in the header field is first converted to that type.
- Specify One and ONLY one expression (can not have %s%d%f) – as it takes the first only
- do NOT use for fixed data, ONLY the conversion string.
- Types are :
- string : s
- char : c
- long : d,i,o,p,u,x,X
- float : f,e,E,g,G
- % : print a % !
- type n is ignored ??
Examples
- To trim a string, use a dot : %.5s
- To pad a string with spaces : %5s
- To pad a string with spaces (left justified) : %-5s
- To pad a number with leading zeros : %.06d
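Python's `%` operator accepts the same printf-style conversions, so the four examples can be tried out directly. This is a sketch only: `apply_style` is a hypothetical helper that mimics the convert-then-format step described above, and ipformat itself may differ in detail.

```python
def apply_style(conversion, value):
    """Format a header-field string with a single printf-style conversion."""
    kind = conversion[-1]
    if kind in "diouxX":
        value = int(value)      # the string is converted before formatting
    elif kind in "feEgG":
        value = float(value)
    return conversion % value

trimmed = apply_style("%.5s", "abcdefgh")
padded = apply_style("%5s", "ab")
left = apply_style("%-5s", "ab")
zeros = apply_style("%.06d", "42")
print([trimmed, padded, left, zeros])
```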
Syntax for ‘name’ – Overwrite the default name of the output file
Syntax :
name: (Fip Hdr strings)
Remember that the ‘name’ is a list of Fip Hdr fields which will probably include the SN field which is the original name.
The default name on output is :
#SN:(original filename)#DU:(name of the paramfile)s#SC:FORM#DF:FORM
Where DU is forced lowercase.
The ‘hdr’ and ‘name’ keywords have the same syntax and roughly the same use, so further information is under ‘hdr’.
Remember ‘name’ is done AFTER all processing of the data – it is the last thing done before the file is sent on for further distribution. So information gleaned from the input, perhaps left in Save Areas, can be used in the name. ‘hdr’ however is done FIRST, BEFORE any data is touched.
Syntax for ‘hdr’ – Add extra fields to the FIP header of the output file
Syntax :
hdr: (Fip Hdr strings)
Remember that although the specification MUST be kept on a single line, HASH may be used as a delimiter for Header fields.
Note also that as the output file is a completely NEW file and has no physical connection to the input file, NO Fip Hdr fields are transferred from input to output UNLESS specified in the ‘hdr’ or ‘name’.
Generally use ‘hdr’ to preserve the fields you need, such as the Source Header, because ‘name’ is limited in both the number and the type of characters it can hold (no meta characters, slashes ‘/’ etc). For example :
hdr:#SH:\SH#SN:\SN#DF:roger
will transfer the SH and SN fields and force DF to roger.
As ‘hdr’ is processed BEFORE the data, no information generated by IPFORMAT during processing is available. However the ‘name’ keyword is processed AFTER so such data may be added then.
The default Fip Hdr put on an output file has the following fields :
SU:form | ie Source is form
HS:form_0_95-3-9_17:41:40_4_67 | for tracking the file
HT:794792500 | date and time !!
‘hdr’ supplements but does not replace these.
Other useful fields can be :
- DF – output format for ipedsys, ipgtwy, ipout, ipprint etc. This must be in the ‘name’ parameter as by default it contains ‘DF:form’ eg :
DF:albert
will pickup tables/print/ALBERT if ipprint is the program sending to the final destination.
- CX – force the xchg name to be used to the contents of this field. eg
CX:STOCKS
will override the SC2DC fields normally used for ‘ipxchg’.
Syntax for ‘nohdr’ – do NOT add a Fip header to the output file
Syntax :
nohdr:
This is used when the output file is to be used immediately by a Unix program which does not understand the Fip Hdr – sort for example.
Syntax for ‘chrset’ – force the SC or Source Character set
Syntax :
chrset: (name)
This fills in the SC: Fip Hdr field. The default is FORM.
Syntax for ‘before’ – Insert data BEFORE any output from the input file.
Syntax :
before: (Fip Seq strings and Record processing commands)
Syntax for ‘after’ – Insert data AFTER ALL output.
Syntax :
after: (Fip Seq strings and Record processing commands)
Both ‘before’ and ‘after’ can have tests, builtins, contents of save areas etc, although obviously for ‘before’ most of these will still be empty.
In the Output section – Record processing lines
The actual data is processed and output using the Record processing lines.
The Syntax of each line is that the first bit specifies which record type or key the rest of the line applies to :
r=(key) (output)
For example :
r=abc "Fried fish starts " s1 spc f4 " and the rest ..."
There can be multiple lines for the same record type. The following two lines will give the same result as the one above :
r=abc "Fried fish starts " s1
r=abc spc f4 " and the rest ..."
For lines where you want to process for all EXCEPT a particular record type/key, use the syntax ‘r#’ :
r#35 "Nobody wants record 35s !"
How to specify you want to use, format and/or output zones ?
records | r3 | or r="abc" if not numeric
fields | f99 | or f=Z if not numeric
partial fields | p22 | always numeric
save zones | s1 | always numeric
flags | x199 | always numeric
calculations | c4 | always numeric
counters | z4 | always numeric
blocks | b77 | always numeric
set name | - | specify name as in the ‘set’
fixed text | - | " some fixed text "
A ‘*’ can be used in certain cases to signify ‘ALL’ zones ie
clrflag=* | clear all flags.
f* | output all fields from this record.
Use double quotes for alphabetic keys and those with embedded spaces.
You should try not to use ‘sets’, ‘partial’s or ‘match’s with names in the form ‘z999’ where z is one of the single letters above and 999 is a number in the range 1-999.
Note that blocks are ‘super records’ but should be ignored for now.
Note that case is IGNORED in keys in the current version.
When ‘ipformat’ finds a name in the record processing line, it does the following sequence :
- check to see if it is an already specified constant ‘set’ name.
- if not, is it a zone – record, field, partial or block (eg p44, f3)
- if not, is it a builtin command (see below)
- if not, is it a save zone, flag or counter (eg s1, x33, c7)
- otherwise it is considered some FipSeq string and saved as such.
Spaces, End-Of-Lines and Double Quotes in the output
One common failing when putting together a new parameter file is to completely forget about spaces (or other separators) and end-of-lines (CR or NL or CR NL or whatever) in the output file.
The point is – you have to specify them as NOTHING is implicit in the output file. There is no hidden magic which suddenly realises that you want an end-of-line when you need it. You have to state where and when you want them.
Generally this will be done by either putting them as constant/’set’s or specifying them in the record processing line. The following are exactly the same :
Either
set spc \s
set ql \n
output:
r="BIG" f5 spc f3 spc f99 spc f5 ql
Or
output:
r="BIG" f5 \s f3 " " f99 \s f5 \n
A space can be specified either as ‘\s’ or inside double quotes. Because double quotes delimit fixed text, a double quote character itself has to be output as a number, eg \042 in octal.
Builtins
There are a number of builtin conversion routines for formatting zones – records, fields, save areas etc.
These are called by placing the name of the conversion BEFORE the name of the zone eg :
zapspcextra p5
which means :
- ‘zap all leading, trailing and multiple spaces from partial field 5’
A single zone can be subject to several builtins :
zappunc zapspc caps f=Z
which means :
* take field “Z” and zap all punctuation and zap all spaces and force uppercase before outputting.
Builtins for case conversion : |
---|---
caps | force zone uppercase
lwrcase | force zone lowercase
idicase | force zone idiot upper and lowercase
upper1 | force first letter of every word uppercase
initial | only display first letter of each word followed by a full stop
Builtins for removing spaces : |
zapspc | remove all spaces from zone
zapspcextra | remove all leading, trailing and multiple spaces from zone
zapspclead | remove all leading spaces from zone
zapspctrail | remove all trailing spaces from zone
Builtins for removing punctuation : |
zappunc | remove all punctuation from zone
zappuncextra | remove all leading, trailing and multiple punctuation from zone
zappunclead | remove all leading punctuation from zone
zappunctrail | remove all trailing punctuation from zone
Builtins for Counters : |
setctr | set a counter
incctr | add one to a counter
decctr | subtract one from a counter
clrctr | clear a counter or set it to zero
Builtins for Calculations : |
savnum | save a printable number in a variable
savbyte | save a single byte in a variable
savint | save a binary integer (2 bytes) in a variable
savswint | save a binary integer (2 bytes swapped) in a variable
savlong | save a binary long (4 bytes) in a variable
savswlong | save a binary long (4 bytes swapped) in a variable
Miscellaneous : |
strlen | returns the length of the string which can be output or saved or tested
zapleadzero | removes leading zeros from zone
zapctl | remove all control characters from zone
incfile | include standing file at this point, eg : r=99 incfile /home/standing/ s4
newfile | finish this file, send it and start another, eg : r=abc newfile. If any more information is specified AFTER the ‘newfile’ on the record processing line, it will be added to the FIP Hdr unless ‘nohdr’ has been specified, eg : r=abd newfile #DF:newform#QQ:\$Z
log | log message in the Item Log
continue | ignore all other tests for this record and continue with the next data record
stop! | stop processing now. If there is an ‘after’ section it is done before the program finishes. (please note the exclamation mark !)
reckey | output the actual record key. This is useful where wild cards are used for all records but you still need to output what the key was.
Tests
There is a further selection of tests which can be made on zones inside the data.
These let you select processing even more finely, depending on the actual data. The rest of the line is processed if and ONLY if the test is true.
Syntax for Tests
(ifxxx) (first string) (second string if required)
where strings can be fields, partials, saves or fixed text
Actual tests can be :
ifprv/ifnprv | test previous record type/key or not
ifeq/ifne | test if 2 zones are equal or not
ifgt/iflt | test if a zone is greater than another or not
ifflag/ifnflag | test if a flag is ON or OFF
ifnul/ifnnul | test if a zone is empty or not
ifspc/ifnspc | test if a zone only contains spaces or not
ifalpha | test if a zone only contains letters a-Z or not
ifnum | test if a zone only contains number/digits 0-9 or not
ifcon/ifncon | test if a string is (not) found within another
ifpunct | test if a zone only contains punctuation or not
Note that the order of the two strings matters when they may be of different lengths: ifeq is true if the whole of the first string matches the start of the second, ie :
1st=AAA 2nd=AAABC will be true
1st=AAABC 2nd=AAA will be false
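In other words ifeq behaves like a "starts with" test rather than a strict equality test. A one-line Python sketch of that rule follows; `ifeq` here is a hypothetical stand-in for the real test, and case handling is deliberately left out.

```python
def ifeq(first, second):
    """True when the whole of 'first' matches the start of 'second'."""
    return second.startswith(first)

print(ifeq("AAA", "AAABC"))
print(ifeq("AAABC", "AAA"))
```

So put the shorter, fixed string first and the zone being tested second.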
Example 1 :
r=24 ifprv r=35 "Last record was type 35 and this is 24"
Only if the previous record type was “35” will the string be output
Example 2 :
r=24 f3 ifnul f3 " _ " x99
For record type 24, output field 3 and if there was nothing in it, output a (spc) (dash) (spc). Flag 99 will also be set if there was nothing there.
Example 3 : When using numeric data, please ensure that all extraneous characters are stripped from the zone before the test. In particular strip commas, plus signs, currency symbols etc.
For example, if field 7 has data like p9300.0007 and save field 9 has 10,000 compare the two by :
match:mnop /p//
match:mnocomma /,//
output:
r=99 ifgt mnop mnocomma f7 mnop mnocomma s9 "Field 7 > Save 9"
Using Flags
Flags are a really useful means for deciding type of processing to do – or NOT to do.
Commands for setting, clearing and testing flags are : | |
---|---|---
To set a flag | x999 | where 999 is the flag number
To clear a flag | clrflag=999 |
To clear all flags | clrflag=* |
To test a flag is ON | ifflag x3 | rest of the commands on the line are done ONLY if the flag is ON
To test a flag is OFF | ifnflag x5 | rest of the commands on the line are done ONLY if the flag is OFF
For example, let’s use flag 3 to test if record type ‘abc’ has Richard, Helen or George in the first field. Print out ‘New name is (name) (newline)’ if it does :
r=abc clrflag=3
r=abc ifeq "Richard" f1 x3
r=abc ifeq "Helen" f1 x3
r=abc ifeq "George" f1 x3
r=abc ifflag x3 "New name is " f1 nl
Save areas
Save areas may be used to store strings – either in their original state or after conversion/formatting by other built-ins. The maximum save number is 299.
Commands for setting, clearing and testing save areas are :
To output a save area | s299 (where 299 is the save number)
To clear a save area | clrsave=299
To clear all saves | clrsave=*
To save data in a save area | save=299 (string)
eg
save=1 f3
save=5 caps f7
save the contents of field 7 in save area 5 AFTER forcing to Uppercase
To append data to a save area | savcat=88 (string)
eg
save=77 "ABC"
save zone 77 holds ABC
savcat=77 "DEF"
save zone 77 now holds ABCDEF
Save areas may be used in the normal ‘if’ tests, eg :
ifeq "AAA" s1 x88
if the contents of save area 1 start with "AAA", set flag 88 ON
Using Counters
Counters are integers (ie proper numbers with no decimals or fractions) in the range -32000 to +32000.
They are signalled by ‘zX’ where X is a number.
They can be used to count the number of occurrences of a record or field or even types of data and act accordingly.
All counters are set to zero when the program starts. You can manipulate them in the Record processing lines by using the builtins :
- incctr
- decctr
- setctr
- clrctr
For example, to add some random markup every 10th line of a record type AB using counter 26 :
r="AB" incctr=26 ifeq 10 z26 clrctr=26 "[pt9][font99]"
ie : For all records type AB, add 1 to counter 26, then test if ctr 26 is equal to 10; if so reset ctr 26 back to zero and output string ‘[pt9][font99]’.
setctr=99 345 – set ctr 99 to a fixed number 345
setctr=297 p3 – set ctr 297 to the contents of partial field 3.
In the second example, if the p3 is NOT a number, ctr 297 is set to zero. Also if p3 is a decimal number like ‘123.456’, only the main number is saved.
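The conversion rule for setctr can be written down as a small Python function. This is a sketch of the behaviour described above and nothing more: `to_counter` is a hypothetical name, and how ipformat treats mixed strings such as ‘12abc’ is not stated here, so the sketch simply takes any leading digits.

```python
import re

def to_counter(text):
    """Return the whole-number part of text, or 0 when text is not a number."""
    m = re.match(r"\s*([+-]?\d+)", text)
    return int(m.group(1)) if m else 0

print(to_counter("345"))
print(to_counter("123.456"))
print(to_counter("not a number"))
```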
Using Calculations
Calculations are defined in the first part of the parameter file and used in the record processing part :
For example :
calc:mktcap c1*c2
output:
r=BC savnum=1 f5 savnum=2 f7 mktcap
In this example we define ‘mktcap’ to be variables 1 and 2 multiplied together. Then in the output section, for record type BC, field 5 is saved in variable 1 and field 7 in variable 2 before we do the calculation and output the result.
A quick word about BINARY numbers.
Normally fields will hold printable data – such as in the example above – and we use the builtin ‘savnum’ to take that number for use in the calculation(s).
However some data is already in a binary form. Use builtins ‘savbyte, savint, savswint, savlong and savswlong’ to load these numbers. Often these will be derived from a partial field using the ‘b’ for binary field type. eg:
partial:bindata b:2 b:4 b:2 b:4
What is a swapped integer or long ? Some computers – like the PDP-11 and most Intel 16+ bit chips – hold the data in reverse byte mode.
– So if the data has been generated on a SPARC, an rs6000 or a Mac, the data is ‘normal’ – use savint or savlong.
– While data from PDP-11s or Intel based PCs could well need to be swapped.
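The difference between normal and swapped values is just byte order, which Python's `struct` module can demonstrate. The sketch reads the same bytes both ways; treating the values as unsigned is an assumption, as the signedness used by savint and savlong is not stated here.

```python
import struct

raw2 = bytes([0x01, 0x02])
normal_int = struct.unpack(">H", raw2)[0]    # high byte first, as for savint
swapped_int = struct.unpack("<H", raw2)[0]   # low byte first, as for savswint

raw4 = bytes([0x00, 0x00, 0x01, 0x02])
normal_long = struct.unpack(">I", raw4)[0]   # as for savlong
swapped_long = struct.unpack("<I", raw4)[0]  # as for savswlong

print(normal_int, swapped_int)
print(normal_long, swapped_long)
```

If a counter that should read a few hundred comes out as a huge number, the wrong byte order is the first thing to check.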
Loading Variables :
* Save a printable zone as a number variable – use ‘savnum’
savnum=5 p4
– save the contents of p4 as a number. So if p4 held the string ‘789’, c5 would be the number ‘789’.
* Save a fixed number in a number variable – use ‘savnum’ again
savnum=7 1234
– loads the number ‘1234’ into c7.
* Save the contents of a single byte – use ‘savbyte’
savbyte=33 p7
Note that the contents of the variables, c1, c2 etc are not amended by the calculation UNLESS you specifically save it, eg :
r=BC savnum=1 f5 savnum=2 f7 savnum=3 mktcap
will load c3 with the result of the ‘mktcap’ calculation.
Examples of Builtins :
STRLEN
; test the field 2 is greater than 44 chrs (ie 44 is less than strlen of f2)
r=HH ifnnul f2 iflt 44 strlen f2 "Big Field 2 here over 44 chrs long" \n
r=KK "Save Field for Name (s55) is " strlen s55 " chrs long"
ZAPLEADZERO
; data - field 99 is 00000330303, field 101 is 00000000.00
r=3 "This outputs 330303=" zapleadzero f99 ", while this is 0.00=" zapleadzero f101
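The ZAPLEADZERO example implies that one zero is kept in front of a decimal point. A Python sketch that reproduces both results is shown below; `zap_lead_zero` is a hypothetical name and the regular expression is an assumption that fits the two documented cases, not the program's actual logic.

```python
import re

def zap_lead_zero(text):
    """Strip leading zeros but keep the last digit before any decimal point."""
    return re.sub(r"^0+(?=\d)", "", text)

print(zap_lead_zero("00000330303"))
print(zap_lead_zero("00000000.00"))
```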
Putting it all together – some examples
EXAMPLE ONE
; file is variable text type
filtyp:t
; each record is separated by CR NL 2 letter type
recsep \r\n
; There are NO fldsep - we will use partials
; There are NO reckey or fldkey - we will test strings for the type of processing
; allow wild cards
wild:*
;
set qc \004\n
set topbit \n{M2Processing Date :
set dash " _ "
;Partial a Class line which contains the Class/Name/Length of race
; eg : Class 2 - ATV Anniversary Hcp. - 1000 M
partial:pclass p:0::\s s:0 n:0 s:0 t:1 p:0::- t:1 p:0
; localised matchs - search and replace
match:mhcp /(Hcp.)//
match:mhcp2 /Hcp.//
; replace M with meters
match:mmeters ?M?meters?
;
;******************** output section ************************
output:
; Start by clearing flags 99 and 1 for each input record...
r=* clrflag=99 clrflag=1
; Now test for ONLY those lines which match our needs...
;| all |if field1 start|partial field1 |if partial |set |set flag 99 on
;| recs |with Class |according using|field 3 is not|flag 1 |too
;| | |pclass |empty |one |
r=* ifeq "Class" f1 pclass f1 ifnnul p3 x1 x99
; Print out only the names of a new race - only process if flag 1 is ON
; Use flag x101 to output [rf3] for the FIRST race only - which is the 1st class
r=* ifflag x1 ifnflag x101 [rf3] x101
; partial f1 again using pclass, if partial field 6 is NOT empty, remove extra
; spaces, Do the two search and Replaces and output followed by a 004 NL
r=* ifflag x1 pclass f1 ifnnul p6 zapspcextra mhcp mhcp2 p6 qc
; remove extra spaces from partial 1 and output it, output partial 2 and 3, then
; if partial field 8 is NOT empty, add (spc) (dash) (spc) etc
r=* ifflag x1 zapspcextra p1 p2 p3 ifnnul p8 dash mmeters zapspc p8
r=* ifflag x1 qc
FipSeq
Many keywords in the DF module can have variables as well as fixed text for parameters.
These are generically called FipSeq strings and can be :
- Normal Ascii printable text :
remember that leading and trailing spaces are always trimmed so use double quotes to embed :
" Some leading spaces and some trailing "
Also in the record specification ALL spaces between fields are stripped; again use double quotes to embed or Unix escape chr \s
- Unix style escape chrs : backslash then lowercase chr :
Carriage return CR : \r
New Line NL : \n
Space SPC : \s
Backspace BS : \b
Tab TAB : \t
Backslash : \\
Form feed or Vertical Tab FF or VT : \f
Wild chr (if specified) : \w
Hexadecimal number : \x99
CR NL : \l
- Octal numbers : backslash and 3 digits zero padded : \001, \377
These can be decimal or hex by using the 'number:' keyword.
- Internal FIP header fields : backslash and 2 uppercase chrs : \SN, \DQ
To extract fields from the Source Header (Fip field SH) use \X? ie \XP for Priority.
- System variables :
\$D : day of month in 99 format
\$M : month in xxx format
\$I : month in 99 format
\$Y : year in 99 format
\$H : hour (99)
\$N : min (99)
\$B : sec (99)
\$J : julian date (3digits, Jan1 is 001)
\$S : 3 digit ascending sequence number
\$Z : 4 digit ascending sequence number
\$A : atex orig field (SOURCE;06/06,14:35)
\$C : number of chrs in file
\$W : number of words in file (IP_WORD_LEN)
\$R : Random letter
\$O : end optional text
\$X : strip trailing spaces of buffer so far
Fip Header fields can be further manipulated using pseudo-fields :
fixed: QZ 1234543
partial:QT ST,3,2,U,<,>
combie:QZ ep|na,(000000)a
option:QT ep,11,7,s
For fixed fields :
fixed: QZ 1234543
ie If QZ is specified, replace with 1234543
Syntax
fixed: [newfield] [tab/space] [fixed text]
For partial fields. An example :
partial:QT ST,3,2,U,<,>
ie If QT, take ST header field posn 3 for 2 chrs, UPPERCASE.
Syntax
partial: [newfield] [tab/space] [existing field] [comma] [startposn] [opt comma length] [opt comma processing] [opt comma start chr] [opt comma end chr]
where :
Start and Length start from 1 not 0.
Length can be zero or not defined for all characters in the field
Processing is U-uppercase, L-lower, N allow only numbers, P-printables
The Start Chr can be used to start the string. If there is also a length then this length is FROM the Start Chr.
The End Chr can be used to end the string when it is of indefinite length.
For combinations :
combie:QZ ep|na,(0000000)a
ie Use EP header field, if not there use NA field, if not use the fixed text '(0000000)a'.
Syntax
combie: [newfield] [tab/space] [existing field1] [|] [existing field2] [opt comma] [opt default fixed text]
For optional fields (used in conjunction with the \$O flag):
option:QT ep,11,7,s
ie If EP header field exists and has a space in the 7th position, send this text else strip text until the \$O flag.
Syntax
option: [newfield] [tab/space] [newfield] [?] [existing field] [comma] [size] [opt comma] [opt posn of test chr] [opt comma] [opt posn to send remainder of fld]
where size is minimum size of field.
The send parameter will send contents of the field from that position onwards. If not present, the field is used ONLY as a test and NOT to send chrs.
Note that both size and test start from 1 not 0.
A single chr can be tested to be non-space as in the example above.
If either the size or the test is FALSE, all text and subsequent data whether fixed or variable (including more Optionals) is ignored until the EndOpt flag is met - '\$O' (see below).
Watch out using FipSeq strings in ‘set’s
Note that ‘set’s are parsed when the parameter file is changed/created so that any Variable data will be of that time and any FIP hdr data meaningless. eg:
set timedate \$d-\$m-\$y \$h:\$n
will produce the date and time of when the parameter file was last changed.
However FipSeq variables specified in record output lines will return run-time data eg :
r=* "And now the date and time : \$d-\$m-\$y \$h:\$n"
will produce date and time when that record was processed.
Importance of your LOCALE
Unix allows you to play around with character sets – called Locales – and this can have repercussions for the data formatting module.
These are defined as part of the ENVIRONMENT.
- look at the man pages for ‘setlocale’
- check your own settings with ‘env | more’
for LANG, LOCPATH etc.
For any non-English environment, it is important to define :
- What is an Alphabetic chr ? – normally a-z, A-Z
- remember all the accented characters
- What is a control character ? – normally octal 0-037
- sometimes these can also be octal 200-237.
- What is punctuation ? – normally “,.!@#$%^&*()-_=+[]{};’:”<>
- if you want to use ‘zappunc’, make sure.
If you get it wrong, you may find that an accented chr you consider to be alphabetic is processed by ‘ipformat’ as a binary chr. So take care.
Current Version
Current version limits are
flags - 300 allowed in the range 1-300
counters - 300 allowed in the range 1-300
saves - 300 allowed in the range 1-300
calculations - 300 allowed in the range 1-300
partials - 100 allowed in the range 1-100
matches - 30 allowed for any one field
record length - 64k maximum
fields in a record - max 100
there can be up to 1000 'set's and other constants
There is also an internal buffer size for the size of the binary of the parameter file which is 16k - however most binaries are under 1k and the biggest seen so far is about 5k.
keysize must be less than 20 chrs
ipformat misses 1st record if the initial sep is missing
split keys are not allowed
all keys are case INSENSITIVE
Save areas must be less than 16K each. In versions from 040, the program should handle many changes, additions etc. However if you do use buffers which are TOO big, an error message to that effect is logged in the Item Log and data MAY be ignored.
Until modified, note that a clrsave=* will reset everything.
POSTSCRIPT DRIVER - ipsetter
Please see [[Ipsetter]]
------------------------------------------------------------------------------
PROGRAM DETAILS
This section covers the following programs :
- form
- ipformd
- ipformat
- ipformbl
- ipformch
- ipfsel
- ipfprep
- ipfchk
form
Manual interface to the data formatting package
Allowable commands are :
x go into test mode
l look at log
m look at log - for 'form' items
c check crontab - for items about to go
- c all for ALL of root's crontab
t look at the individual Parameter file in tables/text/text
and show the contents
p look at main form files in tables/form - PROCESS, SETTER, SETPAGE etc
g go auto
h help
v version
q quit
-----------------------------------
ipformd
Please see ipformd
This is the daemon for data formats.
It uses a parameter file to route and process incoming files. This parameter file defaults to tables/form/PROCESS.
It first uses a selection table to decide what the job really is. As the list is top down, only the first valid selection is processed.
The 'jobname' found is usually the name of a parameter file in tables/form/text.
IPFORMD will automatically start IPFORMAT with parameters of the input file and the jobname/parameter file.
However, optionally IPFORMD can be used to run a sequence of 'jobs' specified for a particular jobname.
The syntax for the PROCESS file is :
; comment
; the following is a selection line
(hdr field) = string [opt (tab) & (hdr) = string ...] (tab) >jobname (nl)
job: (jobname) (program to run)
trace: (jobname)
test: (jobname)
To describe the Selection syntax in detail :
(hdr field) = string [opt (tab) & (hdr) = string ...] (tab) >jobname (nl)
Each selection is on a single line. If necessary, multiple conditions
can be specified with the '&' to 'and' them.
The operation equal, '=', can also be NOT equal '!='.
Source Header fields (in SH) are preceded by X, ie XC for category.
A '*' is used as a wild card string; a '?' is a single wild card chr. To
search for a string/chr embedded somewhere in a field, use a '*'
before and after.
If embedded spaces are needed in the string-to-be-searched, use a '*'.
Note that the search string is case-insensitive.
Both the selection file and the main file are scanned completely, so
that one file may be sent to none, one or several destinations
according to the same or different criteria.
For the 'job' parameter :
- '$i' refers to the input file name (Note \$i is still the FIP
System Variable 'month')
- all queues and files are assumed to be under /fip/spool
- Never assume however that the path environment has been setup, so we advise
you specify full pathnames for the programs.
- all 'job' lines MUST precede the selection - ie be above.
- FIP System variables and Header fields can be accessed.
- there can be one or many or very many job lines.
- any program can be run
- if a script/program returns an error, it is logged in the Item log and
further processing stops.
If a 'job' exists for a jobname, ipformd will NOT run ipformat but will run what is specified - which may be ipformat of course.
The 'trace' parameter is used for setup, tuning and testing a new job. All it does is tell IPFORMD to log each line in the Item log. EG:
trace:shares
Trace MUST always be specified in the PROCESS file BEFORE the jobs (ie on a line nearer the top of the file) and all jobs must be before the selection lines.
The 'test' parameter does the same as 'trace' BUT NONE of the programs are actually run. This allows you just to check syntax etc. EG:
test:racecards
Test MUST always be specified in the PROCESS file BEFORE the jobs (ie on a line nearer the top of the file) and all jobs must be before the selection lines.
First examples - simple jobs :
; Sports copy to Tranmere Rovers FC.
XC=S* & XK=*Football* >tranmere
EP=TAR. >tarmac
RD=*Broken_Hill* >bhp
where TRANMERE, TARMAC and BHP are all various parameter files in tables/form/text
Second example of a sequence of jobs. Here, if the file starts 'borsen', IPFORMAT is started twice (sequentially, serially - ie NOT at the same time) using two parameter files in tables/form/text : WKENDSHARES and DAILYSHARES .
; shares : for both weekday and weekend
job:shares /fip/bin/ipformat -p dailyshares -i $i
job:shares /fip/bin/ipformat -p wkendshares -i $i
; Selection for job called 'shares'
SN=borsen* > shares
Third example shows how to sort, xchg and basically really screw around.
; Reformat, sort and generally destroy the horses ...
job:geges /bin/rm -f formsave/NAGS*
job:geges /fip/bin/ipformat -p geges -i $i -D -S NAGS
job:geges /fip/bin/ipxchg -1 formsave/NAGS -D geges -F -o formsave
job:geges /bin/sort +0 -3 -o formsave/NAGS.done formsave/NAGS
job:geges /bin/mv formsave/NAGS.done 2go/#SN:\SN#DU:nagsdone
; Selection for job called 'geges'
SU=RACEWIRE & SN=horse* > geges
Fourth Example is where the 'test' parameter is added to the 'geges' job in the Third example (Remember the 'test' must be specified on a line at the top of the PROCESS file BEFORE any 'jobs' for the same jobname) :
test:geges
Going into the MUI, ip, and doing a 'l' to list the log (or 'm' to more) gives:
Sat Mar 4 11:52:30 ipformd !i : Incoming File : geges : : geges
Sat Mar 4 11:52:30 ipformd !f : Test/NotRun : /fip/bin/ipformat -p geges -i /fip/spool/form/geges -D -S NAGS
Sat Mar 4 11:52:30 ipformd !f : Test/NotRun : /fip/bin/ipxchg -1 formsave/NAGS -D geges -F -o formsave
Sat Mar 4 11:52:30 ipformd !f : Test/NotRun : /bin/sort +0 -3 -o formsave/NAGS.done formsave/NAGS
Sat Mar 4 11:52:30 ipformd !f : Test/NotRun : /bin/mv formsave/NAGS.done 2go/#SN:geges#DU:nagsdone
Other Points worth noting (ish) ..
Break out - If either the input parameter -x or a header field FZBO is
present, the input file is 'broken apart' into blocks, records and fields. The
resultant file is called (dest)_(SN) in spool/formtest where dest is as above
and SN is the filename.
If an incoming file matches none of the tests, it is deleted and an error logged.
In the selection file, remember to specify long names first. In the following example, job 'sunrac2' never gets processed as all files will be jobbed as 'sunrac'
XK=RAC* >sunrac
XK=RACING* >sunrac2
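The effect of the ordering can be imitated with Python's `fnmatch` wildcards, which use '*' and '?' in the same way. This is an illustrative sketch only: `select` is a hypothetical function, the rule tuples are an invented representation of the selection lines, and it assumes the first matching line decides the job. Note that `fnmatch` also gives '[' and ']' a special meaning, which the selection syntax described here does not mention.

```python
from fnmatch import fnmatchcase

def select(header, rules):
    """Return the jobname of the first rule whose pattern matches, else None."""
    for field, pattern, jobname in rules:
        value = header.get(field, "")
        # The search is case-insensitive, so compare lowercased copies.
        if fnmatchcase(value.lower(), pattern.lower()):
            return jobname
    return None

header = {"XK": "RACING RESULTS"}
short_first = [("XK", "RAC*", "sunrac"), ("XK", "RACING*", "sunrac2")]
long_first = [("XK", "RACING*", "sunrac2"), ("XK", "RAC*", "sunrac")]

print(select(header, short_first))
print(select(header, long_first))
```

With the short pattern first, 'sunrac2' can never be chosen; swapping the two lines fixes it.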
Input parameters are (all optional) :
-c : name of a queue into which copies of all incoming files
are made. default: no copies
-f : file creep time default: 0
-i : queue to scan default: spool/format
-l : do NOT log every incoming file/destination default: log
-n : run the program at reduced priority default: nice 5
-p : processing file to use default: tables/form/PROCESS
-s : run files serially (ie one after the other) default: parallel
-t : scan time of directory default: 3 secs
-T : Always trace jobs. This is the same as the 'trace' parameter
used for setup, tuning and testing a new job. All it does
is tell IPFORMD to log each line in the Item log. default: no
-x : debugging ON - ALL incoming files will be 'broken out' in formtest.
The parameter is 'o'ctal, 'd'ecimal or 'h'ex. default: off
-z : calm down time default: 5 secs
This is an attempt to let ipformat finish one job before the next starts.
-v : display version number and exit.
ipformat
Ipformat is the key formatting program. It can read and split text, CSV and
XML files into records and fields and reassemble them using conditions and
calculations.
Please see [[Ipformat]]
ipformbl
Please see ipformbl
IPFORMBL takes the text parameter file and builds a binary so that IPFORMAT can run faster.
ipformch
This is the checker for data formats. It is usually started by crontab.
Please see Ipformch
ipfsel
This program is used to select and sort lines from incoming data files using
a selection file.
A parameter file is used to determine where the files should be sent and other
parameters. This file is in tables/form/select and defaults to SELECTION. This
may be overridden by the contents of the FIP header field 'DF'.
Please see ipfsel
ipfprep
Please see ipfprep for up-to-date information
This program prepares (tweaks) incoming data before IPFORMAT is run against it.
ipfchk
Please see ipfchk for up-to-date information
This program scans a directory and checks the incoming files for CRC errors.
If any are found, the offending lines are stuffed into an error file, which is
sent to an errdst, while the resultant file is flagged as ERROR in the HE field.
------------------------------------------------------------------------------
-->
© FingerPost Ltd. 1996 and beyond