SHFtool - example implementation for the S Hexdump Format


So what is it?

SHF is the RFC 4194 S Hexdump Format. We like to think of the S as in "Standard", but the IETF barfed at this and said they don't like self-entitled standards. So it is S as in S-records, or, as Larry Masinter suggested, "Stupid Hexdump Format". SHFtool is a test implementation of a sort of basic hexdump Swiss Army knife, based on the Expat XML parser.

General idea

Input   Output
-=> SHF

Done so far: SHF and S-record input and output. Textual hexdump output. Next: no more formats. If I implement SHF in the BFD library used by GNU binutils and friends, the objcopy program (part of e.g. Cygwin) will do all these things instead. Also, contributing things to GNU has much better karma.
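For the curious, here is a minimal sketch of what producing one S-record involves. It is not SHFtool's actual code (that is built around Expat); the function names srec_checksum and make_s1_record are made up for this illustration. It assumes the usual S1 layout: a byte count, a 16-bit address, the data bytes, and a checksum that is the one's complement of the low byte of the sum of everything after the record type.

```python
def srec_checksum(payload: bytes) -> int:
    """One's complement of the low byte of the sum of count, address and data."""
    return ~sum(payload) & 0xFF


def make_s1_record(address: int, data: bytes) -> str:
    """Build a single S1 data record (hypothetical helper, 16-bit address)."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("S1 records carry a 16-bit address")
    # The count covers the two address bytes, the data and the checksum byte.
    count = 2 + len(data) + 1
    if count > 0xFF:
        raise ValueError("too much data for one record")
    body = bytes([count]) + address.to_bytes(2, "big") + data
    return "S1" + (body + bytes([srec_checksum(body)])).hex().upper()


# Sixteen data bytes at address 0x7AF0: three non-zero bytes, then padding.
record = make_s1_record(0x7AF0, bytes.fromhex("0A0A0D") + bytes(13))
print(record)
```

Reading a record back is the same arithmetic in reverse: recompute the checksum over count, address and data, and compare it with the last byte.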


Some questions have popped up again and again during the development of SHF:

Q: My PDP-8 has 12-bit arithmetic. SHF will only allow word lengths (wl, in bits) where wl MOD 8 == 0, and 12 MOD 8 == 4. Why didn't you make SHF support arbitrary word lengths?

A1: Oh my GOD, you have a PDP-8, sweet! Can I have it?

A2: 12 bit means you probably write numbers in octal, does it not? Are you looking for the RFC for S Octaldump Format or what? OK, we could have done otherwise, but then we would have to change the name of this little format to something unsexy like S Textual N-ary Dump Format, which sounds real academic and boring.

A3: If you're thinking of something like arbitrary bit length then stop thinking of it now: we wouldn't quite know which characters to use to encode high numbers in bizarre architectures using a large prime as their word length, like the 113-bit word length machine, or, as we're talking real weirdo stuff here, the largest thing you can get out of the sieve of Eratosthenes... Even if we use all the Unicode there is (including runic, Ugaritic and hieroglyphs, and how do we avoid the "holes" in the Unicode spec?) we still hit the roof when we get to prime number word lengths larger than 2^32.

A4: OK I know it wouldn't be so hard to add just octal too, but we really want to keep things as simple as possible, not only implementation-wise but also human-readability-wise, i.e. "Hm, it says this is address 7337 but was that hex or octal now, I must check the header again".

A5: SHF was intended to solve practical problems for hexdumps (remember the IETF motto "rough consensus and running code"), especially in embedded systems. It was not intended as a Swiss Army knife for computer archaeology. If you can show me some system in wide use that actually would have good practical use for octal, decimal, or whatever multiple for a word length, we might think again and we will make some RFC, with support for octal and what have you, that obsoletes the current SHF some sunny day. We're not that hard to move.

A6: Esoteric features with no practical use in RFCs are unlikely to be implemented or tested, so the result may become a platonic specification which looks cool but does not resemble reality. (See Plato's allegory of the cave.) We're programmers, not mathematicians.

A7: That's feature-creep! Next thing you want is a kitchen sink...

A8: OK, it might not be so hard to add just 12 bit, because that can still be easily supported by using 3 hex digits per word, and it is still hex. Help us out on SHFv2!
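To make the answers above concrete, here is a small sketch of the arithmetic involved. It is an illustration only, not part of SHFtool or of RFC 4194: check_word_length mirrors the "wl MOD 8 == 0" rule from the question, and words_to_hex is a made-up helper showing how 12-bit words would fit in 3 hex digits each if some SHFv2 were to allow multiples of 4 bits.

```python
def check_word_length(wl_bits: int) -> bool:
    """True if the word length is a whole number of 8-bit bytes."""
    return wl_bits > 0 and wl_bits % 8 == 0


def words_to_hex(words, wl_bits: int) -> str:
    """Render words as fixed-width hex, one hex digit per 4 bits (hypothetical)."""
    if wl_bits <= 0 or wl_bits % 4 != 0:
        raise ValueError("word length must be a positive multiple of 4 bits")
    digits = wl_bits // 4
    for w in words:
        if not 0 <= w < (1 << wl_bits):
            raise ValueError("word does not fit in the word length")
    return " ".join(f"{w:0{digits}X}" for w in words)


print(check_word_length(16))  # a word length SHF accepts
print(check_word_length(12))  # the PDP-8 case, 12 MOD 8 == 4

# Three 12-bit words, written in octal the way a PDP-8 owner would.
print(words_to_hex([0o7337, 0o0001, 0o7777], 12))

# The readability worry from A4: the same digits, two different addresses.
print(int("7337", 16), int("7337", 8))
```

The last line is the whole argument for sticking to one base: "7337" names two different addresses depending on which one the header declared.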

Q: Why can't SHF be used to address individual bits, so I can send it in as a configuration file and have it fiddle some bits here and there in some system core memory?

A1: Are you sure you are not looking for the S Universal Bit- and Bytefield Configuration Script Format known as SUBBCSF?

A2: Yeah, cool feature, the next version of SHF (whenever that arrives) may support that stuff. Would you like to join us in persuading the IETF to start up a working group for that, provided you need it so bad?


Written by Linus Walleij, /<-r4d s0pHtw4r3.