DFORMAT - A Program for Typesetting Data Formats

This is Jon Bentley's dformat program, reconstituted from a PDF version of the original memo describing the program.

Introduction

dformat reads descriptions of data formats and turns them into pic specifications. It's intended for use as yet another troff preprocessor.

It's of interest to me since it's written in awk. It's been around since 1988, but I've only ever seen either PostScript or PDF versions of the memo.

I've often wanted to at least extract the awk program and make it usable, but never got “a round tuit.”

In the fall of 2019 I came across a PDF copy of the memo and saved it so that I could reconstitute the original troff -ms input for it, as well as the awk code. I finally stole some time to do this in June of 2020.

Process

I started by simply copying all the text from the PDF into a text file via copy/paste.

The next step was to put the awk script back together well enough that gawk (my favorite awk interpreter) could actually run it. Once that was done, I used gawk --pretty-print to format the code nicely.

I then started on the memorandum itself, inserting troff -ms requests, formatting the text into lines of reasonable length, and getting the document into shape to match the original as much as possible.

Fortunately, the memo contained the dformat input for all of the figures displayed, so it was simple enough to copy/paste the displayed input into the real input to be processed, and then the result could be compared visually to the original.

Along the way, I had to re-read the original documentation on tbl so that I could format the tables properly. There was additional fun here, as copy/paste of a table dumped the text by columns, not by rows. I ended up saving each column to a small text file and then writing a throw-away awk script to read the files and merge them back into lines.
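A merge of that kind can be sketched roughly as follows. This is not the original throw-away script, which is not part of this repository; the file names col1.txt and col2.txt, their contents, and the merge_columns function are all made up for illustration. The sketch assumes each file holds one table column with one cell per line, and joins cells with tabs, which is tbl's default column separator.

```shell
#!/bin/sh
# Hypothetical sample data: each file holds one table column, one cell per line.
printf 'Field\nopcode\noperand\n' > col1.txt
printf 'Bits\n8\n24\n' > col2.txt

# Merge the column files row by row: line N of every file becomes row N,
# with the cells separated by tabs.
merge_columns() {
    awk '
        FNR == 1 { ncols++ }              # a new input file starts a new column
        {
            cell[FNR, ncols] = $0         # remember the cell by (row, column)
            if (FNR > nrows)
                nrows = FNR
        }
        END {
            for (r = 1; r <= nrows; r++) {
                line = cell[r, 1]
                for (c = 2; c <= ncols; c++)
                    line = line "\t" cell[r, c]
                print line
            }
        }
    ' "$@"
}

merge_columns col1.txt col2.txt > table.txt
cat table.txt
```

For files of equal length, paste col1.txt col2.txt gives the same result; the awk version is shown only because it is the kind of script described above.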

Finally, I hand-edited the awk program to match what was in the original memorandum.

Bugs In the Document

While doing this, I found a few bugs in the document. In two cases, the input for a figure was shown but the figure itself was not. I restored the actual figures.

Another figure needed an additional directive in order to be drawn correctly; this directive was missing in the dformat input shown in the PDF file. This too I corrected.

Finally, the dash alias for dashed did not work. I noted this in a footnote and simply replaced dash with dashed in both the real input and in the sample input shown in the memo.

Other Files Here

The original version of the memorandum was published as Bell Labs Computing Science Technical Report 142. The file 142.ps is this original memorandum.

Jon Bentley was kind enough to send me his original troff and awk source; they are included in the jlb directory.

Jon also pointed me at an article he wrote, published in the AT&T Technical Journal, called “Little Languages for Pictures in awk.” I've included a copy here for convenience in the file little-languages-for-pictures-in-awk.pdf.

Creating the Document

If you have GNU troff installed, with its preprocessors, it should be enough to just type make to create dformat.pdf.

Conclusion

I'm pretty pleased with the result. I'm making everything available on GitHub so that others may also take advantage of this nice little program.

Last Modified

Wed Feb 24 17:53:51 IST 2021

