ipxrtf

Takes RTF files and strips all markup except styles.

It can be run as a spooler against a queue, in which case the input files are
deleted after use or moved to a done queue. Alternatively, it can be run once
against a single file, which is NOT deleted or moved; use this mode when
ipxrtf is called from a script.

A parameter file is used to determine where the files should be sent and other
parameters. This file is in tables/setup and defaults to XRTF.

Keywords for the XRTF parameter file are:
    ; comment line
    dest:   One or more FIP destinations as per the USERS file
            The default is 'xrtf'
    script: Script to run on the new file before sending to the output queue
        The first parameter will be the output file containing the correct data
        using a TMP filename. The second is the name requested.
                                default: none
    chrset: Name of the Source Character Set.       default: ascii
    source: Name of the Source of this data         default: XRTF
    noarchive: Do NOT archive the data in IPWHEEL       default: do
    filename: FipSeq for a replacement filename for the output file
                                default: as input
    nohdr:  Flag for no Fip Header on the output file.  default: allow hdr
    extra:  Extra Fip Header information to be added    default: none
    preserve-fiphdr: Keep any FipHdr from the raw file, if there is one.
                                default: strip it
    nodata: Flag to ask for NO output file at all       default: allow data
        If a script is flagged, a file is created for the script
        and is then zapped.
    before: String of FipSeq to stuff at start of the file. default: none
    after:  String of FipSeq to stuff at end of the file.   default: none
    noblanklines:   Strip all extra blank lines.        default: leave in
    noleadspaces:   Strip all leading spaces on lines.  default: leave in
    strip-top-spaces: Strip all white space (spaces, tabs, newlines etc)
        from the top of the data                default: leave
    fontbefore: String Before a font change         default: fonts ignored
    fontafter:  String After  a font change         default: fonts ignored
    par:    String to replace the RTF \par          default: new line
    eol:    CR or NL for carriage return or New Line    default: new line
    hardreturn: yes/no Should a \(eol) be output as a space or an end of line? default: space
    supercede: For single files, supersede any existing output file. default: preserve
    multiple-file-marker: or
    interfile: Separator to use between files       default: \n(next file)\n
        This string is used to split multiple {\rtf1 ..} {\rtf1 ..} files.
    alldata: Normally ONLY data within the {\rtf1 .. } is included.  default: no
        Use this flag to state that any preceding or trailing data is to be kept.
    escape-hex: y/n Escape any RTF hex chr (\'xx notation) with a '\'   default: no
        CHANGE - pre ver 05k this was default: yes.
    chr:(chr in FipSeq):(FipSeq string)
        Replace this character with the string
        This can be a printable chr or an escaped number. The number is
        octal/dec/hex depending on the preceding 'number' keyword (if any).
        eg  chr:\313:&pound;    chr:<:&lt;
            Note that the ';' is part of the string and NOT a comment as it
            does NOT start the line.
    style: All extra text before and/or after a new RTF styletyp.
        There are no defaults at all for styletyp.
        Syntax  style:(StyleNumber) (Text to output)
        Sub keywords are
            before
            after
            para    replacement for a \par inside this style

        Eg: style:10040 before:[HH]\n   after:ZZ
            style:10039 before:[TT]\n   zap:
            style:10099 before:"abc def ghi"
    or, using the same syntax, you can use the names of styles - note the
    double quotes as we are embedding spaces.
            stylename:"Heading 1" before:<hl4>  after:</hl4>
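
        As a sketch only, the lines below show how style entries might be
        combined to wrap body text in HTML-like markup. The style number, the
        style name and the output strings are hypothetical, and 'para:' is
        assumed to take a value in the same way as 'before:' and 'after:'.
            ; hypothetical style number and name - check your own stylesheet
            style:10041 before:<p>  after:</p>  para:</p><p>
            stylename:"Body Text" before:<p>    after:</p>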

    tag:(name)  add data before or strip any data associated with this tag
            tag:fonttbl zap:
            tag:colormap    zap:
            tag:stylesheet  zap:
            tag:rdquote before:'    zap:
            tag:endnote before:<EndNote>
            Sub keywords are
                zap:        strip any data and do the same for following tags on this level
                zap-this:   strip any data for this tag ONLY,
                        revert to previous setting for following tags
                allow:      pass data and do so for any following tags on this level
                allow-this: pass this data but
                        revert to previous setting for following tags
                before:(FipSeq) add this data where the tag was
                after:(FipSeq)  add this data at the end of the level of this tag
                toggle:     see below

            The default is 'allow' for all tags not otherwise specified

            Certain tags (or in RTF parlance, control words) are 'toggles' and
            can have an 'after:' keyword too. They are listed in the RTF spec on
            the Microsoft web site (try msdn.microsoft.com and search for RTF).

            ; bold on= \b, off= \b0
            tag:b   before:<b>    after:</b>    toggle:
            tag:i   before:<i>    after:</i>    toggle:
            These are mainly font toggles and are automatically reset on a \plain tag
            unless the 'no-toggle-reset-on-plain:' parameter is present
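
            As a further sketch, the same pattern could be applied to underline
            and to tags that should be dropped. \ul (off= \ul0), \info and
            \footnote are standard RTF control words, but the output strings
            are hypothetical, and it is an assumption that \ul is handled as a
            toggle in the same way as \b and \i.
            ; underline on= \ul, off= \ul0
            tag:ul  before:<u>    after:</u>    toggle:
            ; drop the document information group
            tag:info    zap:
            ; drop footnote text only, then revert to the previous setting
            tag:footnote    zap-this: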

    no-toggle-reset-on-plain:
        DO NOT reset all the toggles on a \plain tag

    bbc-header-file: file name of a parameter file describing a
        BBC-M Companion file. This file has the syntax of
            ; comment
            fiphdr:A3   size:55 type:a,n,b
        The fields are consecutive, so any gaps (or unwanted fields) should be
        flagged as a FipHdr field of '$$'
            fiphdr:$$   size:3
        Fields which repeat are labelled :
            fiphdr:TT   size:3  repeat:
        The filename and other NUL terminated variable
            length fields are labelled:
            fiphdr:FF   size:0  variable:
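
        Put together, a companion header description might look like the
        sketch below. The layout is hypothetical (a 10 byte reference, 3
        unwanted bytes, a repeating 3 byte field and a NUL terminated
        filename); the FipHdr codes are those used in the lines above, the
        sizes are made up, and 'type:' is left out as its values are not
        described here.
            ; hypothetical BBC-M companion layout
            fiphdr:A3   size:10
            fiphdr:$$   size:3
            fiphdr:TT   size:3  repeat:
            fiphdr:FF   size:0  variable: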

        Standing files :
            For fiphdrs, syntax is 1st is the lookup, 2nd is new
            country-fiphdr:JC   LC
            service-fiphdr:JZ   LZ
            language-fiphdr:JL  LL
            country-codes:/fip/tables/setup/country.txt
            service-codes:/fip/tables/setup/services.txt
            language-codes:/fip/tables/setup/language.txt
        companion-file-queue: queue name for the companion file
            default is the same as the input file.

    bbc-kill-file: File whose contents (in FipSeq) will replace
            the data part of the incoming file  default: none
    bbc-kill-style: Kill style number.          default: 25
        All text in styles LESS than this style is stripped from an XKILLed
        file. Once triggered, all subsequent text is passed.

    error-dest: Destination to send files in error to.
        default is 'xrtferror'
    allow-text-files:
        Normally only RTF files are processed and any non-RTF files have their
        data stripped, so zero length data files are created.
        This allows the data from non-RTF files to be processed too,
        BUT only if there are NO "{rtf" start tags.
    balance-store: Name of the balance group for sending the input file to
        remote systems. If the BBCM flag is on, the companion file is also
        copied. This uses the done queue, so files are balanced ONLY if a done
        queue has been specified.
    log-line: Line of FipSeq to replace normal logging
            default is log-line:\SN \DU
    primary-host:   Primary host for this service
        The default host is the one we are running on. Use this to force
        redun_balance files to be output on a different one.
        (NOT WINNT)
    timing-stats: yes/no
        Turn on timing statistics       default: no
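
A minimal parameter file, as a sketch only, might combine several of the
keywords above. The destination name and the replacement strings are
hypothetical; the keywords themselves are those described in this section.
    ; hypothetical tables/setup/XRTF
    dest:myweb
    source:XRTF
    noblanklines:
    par:<p>\n
    tag:fonttbl     zap:
    tag:stylesheet  zap:
    tag:b   before:<b>    after:</b>    toggle:
    tag:i   before:<i>    after:</i>    toggle:
    chr:<:&lt;
    error-dest:xrtferror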

Where sections of FipHdr fields are required, or changes to the output style,
use keywords : fixed, partial, combie, optional, repeat and/or style (see the
SysAdmin manual for more information).

    They are normally specified :
        fixed:QZ    1234543
        partial:QT  ST,3,2,U,<,>
        combie:QY   ep|na,(0000000)a
        option:QE   ep,11,7,s
        repeat:QK   XK,-,3
    or  repeat:QP   PK,,4,#X
        fipseq-style:QS XN,%.03d
        repeat:QL   XL  "99"="abc" "101"="def"

Note the fipseq style has been renamed to 'fipseq-style' for this program only.

Please never allow any of your scripts to loop, as ipxrtf waits for completion
before continuing!

Input Parameters are (all optional) :
Either  -1 : name of a single file      default: spooler
Or  -i : input queue            default: spool/xrtf
        If this does NOT start with a '/', it is assumed under spool.

    -2 : Done queue for a second copy of the input file  default: file is deleted
        If this does NOT start with a '/', it is assumed under spool.
    -d : done queue for input file      default: file is deleted
        If this does NOT start with a '/', it is assumed under spool.
    -D : do not delete the data or the BBC companion file if there is one.
    -h : extra FipHdr information to add    default: none
    -k : Filename mask. Send only files containing this string.
                    default: send all files
    -K : Filename unmask. Do NOT Send files containing this string.
                    default: send all files
    -l : do NOT log files in        default: log
    -o : output queue           default: spool/2go
        If this does NOT start with a '/', it is assumed under spool.
    -S : Run through the input file and display all tags, levels and data
        as it is being processed.               default: don't
        Often there is an oddity in the data, and you just need to set that
        tag to 'zap:'
        Use this (with the -1 switch) to debug troublesome files.
    -u : owner if not that of the logon default: logon at start
    -v : print version number and exit
    -w : file wait for files arriving across a network. default: 0 secs
    -X : do NOT keep the incoming DU; use the "dest" parameter instead.
                    default: incoming DU
    -z : parameter file in tables/setup default: tables/setup/XRTF

Testing Example
ipxrtf -1 /fip/spool/test/sud3696.rtf -D -S -o xo -z TEST | tee RTF.TEST
ipxrtf -1 /fip/z/data/xrtf.ahref.rtf -z vics.xrtf.tags -o xo -N vicsTest.\\\$u -S | tee /tmp/Xrtf-19-1144

Version Control
;5p7    29jan02 added NON-rtf files for BBCm
    ;a/b 19mar02 cleanups
    ;c 25aug02 tracking levels for font/toggles
    ;d 09jun03 added stylesheet names
    ;e 23apr05 strip leading Spaces/CR/NL from stylenames
    ;f-j 10aug06 added timing-stats and added -h (;h bugette in stylesheet)
    ;k 18nov06 added escape-hex: for \'xx notation RTF hex chrs=CHANGE to DEFAULT
    ;l-m 20nov06 added chr:
    ;n-o 24may07 added -2 for 2nd done queue
    ;p3 13dec07 made dest FipSeq ;2 added preserve-fiphdr ;3 is \(nl) is hardreturn ? ;456 minor cleanups
    ;p7 19jan22 added -N to overwrite output name

(copyright) 2024 and previous years FingerPost Ltd.