sffutf8
Convert/show/split into/out of UTF8 <-> Unicode
e.g.
sffutf8 20ac
gives the result:
2utf8.input.e.dec.8364.hex.20ac.oct.020254 -> output.3.bytes ?.226.e2.342 ?.130.82.202 ?.172.ac.254
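The numbers in that result can be checked independently. The following is a minimal Python sketch, not the sffutf8 implementation: the function name describe_code_point and its output layout are made up for illustration. It encodes one code point given as hex and lists each UTF-8 byte in decimal, hex and octal.

```python
def describe_code_point(hex_text):
    """Encode one code point (given as hex text) to UTF-8 and describe each byte."""
    code_point = int(hex_text, 16)
    encoded = chr(code_point).encode("utf-8")
    # Header: the input number in decimal, hex and octal, then the byte count.
    header = "dec.%d.hex.%x.oct.%o -> %d bytes" % (
        code_point, code_point, code_point, len(encoded))
    # One dec.hex.oct triple per output byte.
    parts = ["%d.%x.%o" % (b, b, b) for b in encoded]
    return header + " " + " ".join(parts)


print(describe_code_point("20ac"))
# dec.8364.hex.20ac.oct.20254 -> 3 bytes 226.e2.342 130.82.202 172.ac.254
```

The three byte triples match the 226.e2.342, 130.82.202 and 172.ac.254 values shown in the sffutf8 result.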
Format a string into/out of utf8
Either -2 : convert the input number TO utf8 (default)
or -X : convert the input string FROM utf8
sffutf8 -X -o \\302\\243
normally you need to double the backslash as it gets parsed twice
or -S : file of UTF8 chrs to split into unicode numbers
or -s : split an existing UTF8 string into unicode numbers (dec, hex or oct), separated by the -e splitter
-e : splitter chr/string in FipSeq (default is '^')
for splitting, output is decimal by default
sffutf8 -s '三井住友FG、中島副社長が社長に昇格 12月1日付' -e \\001 -n
sffutf8 -s 'abcdef' -e '^'
sffutf8 -S /fip/x/TMP.w4.widths -e '^'
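As a rough illustration of what the -X and -s modes describe, here is a Python sketch under stated assumptions. It is not how sffutf8 is written: the function names, the 'base' parameter and the exact output layout are hypothetical. The first function turns backslash-octal escapes into bytes and decodes them as UTF-8; the second splits a string into code point numbers joined by a splitter.

```python
import re


def decode_octal_escapes(text):
    """Replace backslash-octal escapes (000 to 377) by bytes, then decode as UTF-8."""
    raw = bytearray()
    pos = 0
    for match in re.finditer(r"\\([0-3]?[0-7]{1,2})", text):
        # Copy any literal text before the escape, then the escaped byte itself.
        raw += text[pos:match.start()].encode("utf-8")
        raw.append(int(match.group(1), 8))
        pos = match.end()
    raw += text[pos:].encode("utf-8")
    return raw.decode("utf-8")


def split_code_points(text, separator="^", base="dec"):
    """Join the code point of each character, in the chosen base, with separator."""
    formats = {"dec": "%d", "hex": "%x", "oct": "%o"}
    return separator.join(formats[base] % ord(ch) for ch in text)


pound = decode_octal_escapes(r"\302\243")    # octal 302 243 is UTF-8 for U+00A3
print(split_code_points(pound, base="hex"))  # a3
print(split_code_points("abcdef"))           # 97^98^99^100^101^102
```

The raw string in the sketch holds single backslashes; on the command line the doubling is needed only because the argument is parsed twice.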
Optional
-d : input is decimal if just a number
-h : input is hex if just a number (default)
-o : input is octal if just a number
-p : input is printable if just a number
-D : output in decimal
-H : output in hex
-O : output in octal
-P : output as printable
-A : output ALL varieties (default)
-n : do NOT add a NewLine at the end
-v : version and exit
Normally we add a NewLine after the string; use the '-n' input switch to strip it.
Note that if you use 'echo', please remember NOT to add a NL at the end.
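The reason the trailing newline matters can be seen in a small Python sketch (an illustration only; the helper name utf8_byte_values is made up): a newline left on the end of the input is one more byte in the data.

```python
def utf8_byte_values(text):
    """Return the UTF-8 bytes of text as a list of decimal values."""
    return list(text.encode("utf-8"))


without_newline = utf8_byte_values("\u00a3")
with_newline = utf8_byte_values("\u00a3\n")
print(without_newline)  # [194, 163]
print(with_newline)     # [194, 163, 10]
```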
To list the options, type just 'sffutf8' on its own
Version Control
;0i-m 39nov23 ;i added -i -s -S ;j(bad) ;k -X working again ;m -s -S working better
;0h 23mar05 chris original version ;f redid -2 ;g minors
(copyright) 2025 and previous years FingerPost Ltd.