chris-y / zmakebas

This project forked from os2world/app-emulator-zmakebas

A Spectrum BASIC tokenizer (text file → .TAP/.P) [Public domain]

zmakebas's Introduction

Name

zmakebas - convert text file into Spectrum Basic program

Synopsis

zmakebas [-hlp3rv] [-a startline] [-i incr] [-n speccy_filename] [-o output_file] [-s line] [input_file]

Description

zmakebas converts a Spectrum Basic program written as a text file into an actual speccy Basic file (as a .TAP file, or optionally a raw headerless file). By default, input comes from stdin, and output goes to 'out.tap'.

Using zmakebas rather than (say) writing the Basic in an emulator means you can write using a nicer editor, and can use tools which work on text files, etc. Also, with the '-l' option you can write without line numbers, using labels in their place where necessary.

The program was originally intended to be used simply to make little loader programs, so they wouldn't have to be sourceless binaries. However, I went to a fair amount of effort to make sure it'd work for bigger, more serious programs too, so you can also use it for that kind of thing.

Options

option              comments
-a startline        Make the generated file auto-start from line startline. If '-l' was specified, this can be a label, but don't forget to include the initial '@' to point this out.
-h                  Give help on command line options.
-i incr             In labels mode, set line number increment (default 2).
-l                  Use labels rather than line numbers.
-n speccy_filename  Specify filename to use in .TAP file (up to 10 chars), i.e. the filename the speccy will see. Default is a blank filename (10 spaces).
-o output_file      Send output to output_file rather than the default 'out.tap'. Use '-' as the filename to output on stdout.
-r                  Write a raw headerless Basic file, rather than the default .TAP file.
-s line             In labels mode, set starting line number (default 10).
-p                  Output .p instead (set ZX81 mode).
-3                  Write a +3DOS compatible Basic file, rather than the default .TAP file.
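
For orientation, the '-n' and '-a' options correspond to fields in the 17-byte header that a .TAP file stores ahead of a Basic program. The Python sketch below builds such a header following the commonly documented .TAP layout. It is not taken from the zmakebas source: the function names are made up for illustration, and using 0x8000 for "no auto-start" is an assumption based on the usual convention that values of 32768 and above disable auto-start.

```python
def tap_block(flag, payload):
    """Wrap payload as a .TAP block: 2-byte length, flag, data, XOR checksum."""
    body = bytes([flag]) + payload
    checksum = 0
    for b in body:
        checksum ^= b
    body += bytes([checksum])
    return len(body).to_bytes(2, "little") + body


def program_header(name, prog_len, autostart=None):
    """Header block for a Basic program (hypothetical helper, not zmakebas code)."""
    # Filename is always 10 bytes: truncated or padded with spaces.
    fname = name.encode("ascii")[:10].ljust(10, b" ")
    # Assumption: 0x8000 marks "no auto-start line".
    line = 0x8000 if autostart is None else autostart
    header = (
        bytes([0])                         # type 0 = Basic program
        + fname
        + prog_len.to_bytes(2, "little")   # length of the data block that follows
        + line.to_bytes(2, "little")       # auto-start line
        + prog_len.to_bytes(2, "little")   # program length (no variables saved)
    )
    return tap_block(0x00, header)         # flag 0x00 = header block
```

With an empty name the filename field comes out as 10 spaces, matching the default described for '-n'.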

Input format

The input should be much as you would type into a speccy (a 128, to be precise), with the following exceptions:

Lines starting with '#' are ignored. This allows you to insert comments which are not copied into the output Basic file.

Blank lines are ignored.

Case is ignored in keywords - 'print', 'PRINT', and 'pRiNt' are equivalent.

You can optionally use 'randomise' as an alternative to 'randomize'.

You can get hex numbers by using 'bin' with a C-style hex number, e.g. to get 1234h you'd use bin 0x1234. (It appears in exactly that form in the speccy listing, though, so don't use it if you want to be able to edit the output program on a speccy.)

You can get a pound sign (character 96 on a speccy) by using a backquote ( ` ).

One input line normally equals one line of Basic, but you can use backslash as the last character of a line to continue the statement(s) on the next input line.
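
As a rough illustration of the comment, blank-line and continuation rules above, here is a small Python sketch that turns raw input text into logical Basic lines. It is an assumption-laden approximation rather than zmakebas's actual logic: the function name is invented, continuation lines are joined with nothing inserted between them, and a line ending in an escaped backslash is not treated specially.

```python
def logical_lines(text):
    """Drop '#' comments and blank lines, and join backslash-continued lines."""
    out = []
    pending = ""                      # text carried over from a continued line
    for raw in text.splitlines():
        # Comments and blank lines are skipped (only checked between statements here).
        if not pending and (raw.startswith("#") or not raw.strip()):
            continue
        if raw.endswith("\\"):
            pending += raw[:-1]       # strip the trailing backslash and keep going
            continue
        out.append(pending + raw)
        pending = ""
    if pending:                       # input ended on a continuation line
        out.append(pending)
    return out
```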

User Defined Graphics (UDG)

Rather than literally inserting block graphics characters and UDGs as you would on a speccy, you should use an escape sequence. These begin with a backslash ( \ ), followed by one of:

A UDG's letter, in the range 'a' to 'u', to get that UDG. Both upper and lowercase work. ('t' and 'u' will only have the desired effect if the program is run on a 48k speccy or in 48k mode, though.)

'*' to get the copyright symbol.

A two-character 'drawing' using spaces, dots, apostrophes and/or colons, to get a block graphics character. (For example, you'd get character 135 with '\':', and character 142 with '\:.'.)

'@' to get a literal '@'. (This is needed only if the '-l' option was given, but works whether it was or not.)

Braces around a C-syntax number, to specify a literal eight-bit character code to dump into the Basic output file directly (to use for embedded colour control codes and the like), e.g. '\{42}' for decimal, and '\{0x42}' for hex.

Another backslash, to get a literal backslash, as usual with such things.
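
To make the two-character block graphics 'drawing' more concrete, the Python sketch below maps a drawing to its character code. It assumes the usual Spectrum layout for characters 128 to 143 (bit 0 = top right, bit 1 = top left, bit 2 = bottom right, bit 3 = bottom left) and reads the first character as the left column and the second as the right column. The function name is invented, and this is not how zmakebas itself is necessarily written; it does agree with the two examples given above.

```python
# Each drawing character describes one column: (top cell set, bottom cell set).
COLUMN = {
    " ": (0, 0),   # nothing
    "'": (1, 0),   # top only
    ".": (0, 1),   # bottom only
    ":": (1, 1),   # top and bottom
}


def block_char(drawing):
    """Return the character code for a two-character block graphics drawing."""
    left_top, left_bottom = COLUMN[drawing[0]]
    right_top, right_bottom = COLUMN[drawing[1]]
    return 128 + right_top + 2 * left_top + 4 * right_bottom + 8 * left_bottom
```

For instance, block_char("':") gives 135 and block_char(":.") gives 142.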

Automatic line numbers

If the '-l' option was given, line numbers must be omitted. Instead these are automatically generated in the output, and you can use labels where necessary as substitute line numbers for 'goto' commands etc. A label is defined with the text @label: at the beginning of a line (possibly preceded by whitespace). It can be referred to (before or after) with @label. Any printable ASCII character other than colon and space can be used in a label name. Here's an example of how labels work, showing both the input and (listing of) the output - first, the input:

goto @foo
print "not seen"
@foo: print "hello world"

Now the output:

10 GO TO 14
12 PRINT "not seen"
14 PRINT "hello world"

Note that case is significant for labels; 'foo' and 'FOO' are different.
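
The label handling described above can be pictured as two passes: one to assign line numbers and record where each label is defined, and one to substitute references. The Python sketch below shows that idea only. The function name and regular expressions are assumptions, keywords are left untokenised, and it ignores details such as the '\@' escape.

```python
import re

LABEL_DEF = re.compile(r"\s*@([^\s:]+):\s*")   # '@label:' at the start of a line
LABEL_REF = re.compile(r"@([^\s:]+)")          # '@label' anywhere else


def resolve_labels(lines, start=10, incr=2):
    """Number the lines, then replace each @label with its line number."""
    labels = {}
    numbered = []
    lineno = start
    # Pass 1: assign line numbers and remember where each label is defined.
    for text in lines:
        m = LABEL_DEF.match(text)
        if m:
            labels[m.group(1)] = lineno        # case-sensitive, as noted above
            text = text[m.end():]
        numbered.append((lineno, text))
        lineno += incr
    # Pass 2: substitute references (an unknown label raises KeyError).
    return [
        f"{n} " + LABEL_REF.sub(lambda m: str(labels[m.group(1)]), text)
        for n, text in numbered
    ]
```

Fed the three input lines from the example, this yields lines numbered 10, 12 and 14, with '@foo' replaced by 14.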

You can specify a line number in labels mode. This can be useful when defining subroutines that you want to be at specific locations. You can also write:

9000+2 DATA "Blah"
DATA "Foo"
DATA "Bat"

to tell it to start at 9000 and increment by 2 for subsequent lines.
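
A minimal Python sketch of how such explicit line numbers might feed into automatic numbering follows. It assumes that an explicit number resets the counter, that an optional '+n' suffix changes the increment from that point on, and that a leading number in labels mode is always a line number. None of this is confirmed against the zmakebas source, and the function name is invented.

```python
import re

# Optional leading 'number' or 'number+increment', followed by whitespace.
EXPLICIT = re.compile(r"\s*(\d+)(?:\+(\d+))?\s+")


def number_lines(lines, start=10, incr=2):
    """Return (line_number, text) pairs, honouring explicit 'N' or 'N+incr' prefixes."""
    out = []
    lineno = start
    for text in lines:
        m = EXPLICIT.match(text)
        if m:
            lineno = int(m.group(1))           # jump to the requested line number
            if m.group(2):
                incr = int(m.group(2))         # new increment for following lines
            text = text[m.end():]
        out.append((lineno, text))
        lineno += incr
    return out
```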

Bugs

There's almost no syntax checking. To do this would require a complete parser, which would be overkill I think. What's wrong with 'C Nonsense in BASIC' as a syntax check, anyway? :-)

Excess spaces are removed everywhere other than in strings and rem statements. I think this is generally what you'd want, but it could be seen as a bad thing I s'pose.

Labels are substituted even in string literals. That's arguably a feature not a bug - the problem is, the label name has to be followed by whitespace or a colon or EOL when referenced, which is fine for more normal references but is less than ideal for references in strings.

In the label-using mode, two passes are made over the input, which usually means the input must be from a file. If you like making one-liner Basic programs with 'echo' and the like, I'm afraid you'll have to use line numbers. :-)

The inline floating-point numbers which have to be generated are not always exactly the same as the speccy would generate - but they usually are, and even when they're not the difference is extremely small and due to rounding error on the speccy's part. For example, 0.5 is encoded by the speccy as 7F 7F FF FF FF (exponent -1, mantissa approx. 0.9999999997672) and by zmakebas as 80 00 00 00 00 (exponent 0, mantissa 0.5).
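
The two encodings of 0.5 above can be checked with a short Python sketch of the Spectrum's 5-byte floating-point form: an exponent byte biased by 128, then a 32-bit mantissa in the range 0.5 to 1 whose top bit is replaced by the sign. This is an illustrative sketch, not zmakebas's code. The function names are invented, only the floating-point form is covered (not the separate small-integer form the Spectrum also uses), and exponents outside the representable range are not handled.

```python
import math


def speccy_float(x):
    """Encode x as 5 bytes: biased exponent, then mantissa with sign in the top bit."""
    if x == 0:
        return bytes(5)
    sign = 0x80000000 if x < 0 else 0
    m, e = math.frexp(abs(x))          # abs(x) == m * 2**e with 0.5 <= m < 1
    mant = int(m * 2**32 + 0.5)        # round the mantissa to 32 bits
    if mant == 2**32:                  # rounding carried out of the top bit
        mant >>= 1
        e += 1
    mant = (mant & 0x7FFFFFFF) | sign  # implicit top bit is replaced by the sign
    return bytes([e + 128]) + mant.to_bytes(4, "big")


def speccy_decode(b):
    """Decode the 5-byte floating-point form back to a Python float."""
    e = b[0] - 128
    mant = int.from_bytes(b[1:], "big")
    sign = -1 if mant & 0x80000000 else 1
    mant |= 0x80000000                 # restore the implicit top bit
    return sign * (mant / 2**32) * 2.0**e
```

Here speccy_float(0.5) produces 80 00 00 00 00, and decoding 7F 7F FF FF FF gives a value within about 1.2e-10 of 0.5, in line with the rounding difference described above.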

zmakebas has most of the same (parsing) problems, relative to the original basic editor, that the 128 editor has. Specifically, you can't use variable names which clash with reserved words, so e.g. 'ink ink' doesn't work; and certain tightly-packed constructions you might expect to work, like chr$a, don't (you need a space or bracket after CHR$). These can be more of a problem with zmakebas though, due to the lack of syntax checking.

The way tokenisation is done is sub-optimal, to say the least. If you ran this code on a Z80, even the 128 editor's tokenisation would seem quick in comparison. (Here's a hint of the full horror of it - program lines take exponentially longer to tokenise the longer they are.) However, since I never had a conversion take more than about a second on my old 486 (it took a second for a 10k program), it hardly seems worth the effort of fixing.

zmakebas has no problem with translating BIN numbers of more than 16 bits, unlike the speccy, though numbers with more than 32 significant bits can only be approximated, and on machines where 'unsigned long' is no more than 32 bits they'll be very approximate. :-) (If this sounds confusing, you should note that BIN numbers are translated when entered, and only the 5-byte inline form is dealt with at runtime. This also explains why the speccy tolerates the 'bin 0x...' construction.)

On machines without FP hardware, zmakebas will be rather slow (this is due to the need to generate inline FP numbers).

Since Basic is an acronym, pedants will doubtless insist I should write it as 'BASIC'. But we live in a world with 'laser' etc., and at least I can be bothered to capitalise the thing, right? :-)

Author

Russell Marks.

See ABOUT for more info on forks and versions.

zmakebas's Issues

"+num" or "-num" fails to insert FP when a number comes before the operator

This is a result of the attempt on line 1225 in commit 6e71924 to check whether the character to the left of the number is a letter or digit. The problem is that when the number is recognized as starting at a "-" or "+" operator, the character to the left of that may legitimately be a digit, causing the FP insertion to be bypassed erroneously. The test code is:

4 PRINT "1: ";(2)-1
5 PRINT "5: ";2+3
6 PRINT "-1: ";2-3
7 PRINT "1.3: ";1+.3

This fails on lines 5, 6, and 7 with Nonsense in BASIC errors, but not on line 4 because of the ")" being there rather than a digit. Sometimes, I get no error but an incorrect result. For example, this input with the same lines 4-7 but adding 8 and 9:

4 PRINT "1: ";(2)-1
5 PRINT "5: ";2+3
6 PRINT "-1: ";2-3
7 PRINT "1.3: ";1+.3
8 LET x=5
9 PRINT "4: "; x-1

gave this mostly incorrect output but with no error:

1: 1
5: 8
-1: 6
1.3: 6
4: 4

I imagine it depends on the bytes that happen to be where it is looking for the FP number as to what you get.

I have worked out a fix: a stricter check up front that avoids starting a number on a + or - that follows a digit, and instead passes over the +/- character so that the number starts on the digit. With this fix, the isalnum() check added at line 1225 is no longer needed to avoid adding FP bytes to a '$' or '@' prefixed number for Spectrum Next. Pull request to follow.

CodeQL action failing

I got a notice on my fork that the CodeQL scanning workflow failed, and it says it's due to v1 being deprecated in favor of v2. It says it was triggered by me pushing commit de323cb, but that seems to be your mod, and it looks to have happened when I synced my fork after you merged my pull request. I did look in .github/workflows/codeql-analysis.yml, and I did see references to @v1 for github/codeql-action, so I guess those need changing to @v2.
