The basis for an xtalk engine [I/we] control

Post by tperry2x »

I'm not disagreeing with any of your points above, but you do know the regex version is already capable of this, don't you?
[Attachment: regex-matching.png]
You can already be a bit fuzzy/esoteric in your script too.... just not as much as HyperCard.
[Attachment: a-bit-fuzzy.png]

Post by OpenXTalkPaul »

Do I go with a Tokenizer approach, or do I stick with a Regex approach?
A tokenized version still needs to employ regex to split a line into tokens, right? Then it needs to process each token and resolve the values in order. So in the Hello, world example, after the command 'put' it expects an expression that evaluates to a string. It would first try to find that as 'all text between two quotes' and use that as a string-literal value. If there are no quotes in that spot, it would also need to check whether it is something that evaluates to a string: a property name (expression begins with 'the'), a chunk expression (expression begins with word, char, line, paragraph, etc.), or a container (begins with field/fld, btn/button, etc.). So it needs to check multiple things to see if the expression evaluates to a string.
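The sequence of checks described above could be sketched roughly like this. It is only an illustration: the function name, the category names and the keyword lists are assumptions, not taken from the actual engine.

```javascript
// Hypothetical helper: guess what kind of expression a piece of script text is.
// Keyword lists and category names are illustrative assumptions.
const CHUNK_KEYWORDS = ["word", "char", "character", "line", "item", "paragraph"];
const CONTAINER_KEYWORDS = ["field", "fld", "button", "btn"];

function classifyExpression(text) {
  const trimmed = text.trim();
  // Cheapest check first: a double-quoted string literal.
  if (/^"[^"]*"$/.test(trimmed)) return "string-literal";
  // A plain number.
  if (/^-?\d+(\.\d+)?$/.test(trimmed)) return "number";
  const firstWord = trimmed.split(/\s+/)[0].toLowerCase();
  if (firstWord === "the") return "property";
  if (CHUNK_KEYWORDS.includes(firstWord)) return "chunk";
  if (CONTAINER_KEYWORDS.includes(firstWord)) return "container";
  // Anything else is treated as a variable name in this sketch.
  return "variable";
}
```

Checking for the literal first means the other checks only run when the quick test fails.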

In the case of the missing space between "Hello, world" it simply did not tokenise correctly.
Put (command) should be the first token and 'an expression that evaluates to a string' ("Hello, world") should be the second token.

I don't really think of the tokenised version and the regex version as separate: you are tokenizing using regex, it's just not centralized or nested, and some code, I think, is going to be essentially repeated. For example, you have 'ask' and, separately, 'ask password' or other forms of the ask command. It could be one 'ask' evaluation that uses case/switch to look at the second token for another 'form' ('password') and branches out there, or checks to see if it's an expression that fits the next value it might be looking for (an expression that evaluates to a string).
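As a rough sketch of that single 'ask' evaluation with a switch on the second token: the function name and the shape of the returned object are assumptions for illustration only, and the tokens are assumed to arrive as a plain array of strings.

```javascript
// Hypothetical: tokens is an array such as ["ask", "password", "\"Enter PIN\""].
function parseAsk(tokens) {
  if (tokens[0] !== "ask") throw new Error("not an ask command");
  let form = "plain";
  let rest = tokens.slice(1);
  // Look at the second token to see whether it names another form of 'ask'.
  switch (rest[0]) {
    case "password":
      form = "password";
      rest = rest.slice(1);
      break;
    case "file":
      form = "file";
      rest = rest.slice(1);
      break;
    default:
      // Not a form keyword, so it should be the prompt expression itself.
      break;
  }
  if (rest.length === 0) throw new Error("ask: expected a prompt expression");
  return { command: "ask", form, prompt: rest[0] };
}
```

All forms of the command then share one entry point, so the prompt handling is written once.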

Post by tperry2x »

Yes, the tokenized version is still splitting things up using regex pattern matching.
The thing with using a tokenized version is you kind of have to guess at how the interpreter might be.... well... interpreting what you've just written :D

However, with a regex match used on a portion of a string, (rather than lots of little tokens), it's a lot easier to make this relay to the corresponding function.

The disadvantage of using just regex is that you can't be as 'wooly' with your scripting - like that example of HyperTalk you showed me before.

For example, I can't use:

Code:

put round(char 1 of the time /0.25) into toutput
put toutput
but it DOES understand what I'm trying to do at least, and comes back with:
Error: round: argument must be a number
So, this works:

Code:

put char 1 of the time into tNum
put round(tNum /0.25) into toutput
put toutput
Because it's less 'wooly' - I specified that tNum contains a number by putting a number into it on line 1 first.
Yeah, I know - not strictly necessary you'd think - and that's the drawback of purely using regex.
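One possible way around the intermediate variable is to coerce values to numbers at the point where a numeric function needs them, since xTalk values are text until used otherwise. A minimal sketch, assuming hypothetical helper names (toNumber, xRound) that are not from the actual engine:

```javascript
// Hypothetical coercion helper: convert a value to a number on demand,
// or fail with a message naming the function that needed it.
function toNumber(value, fnName) {
  const text = String(value).trim();
  const num = Number(text);
  if (text === "" || Number.isNaN(num)) {
    throw new Error(`${fnName}: argument must be a number`);
  }
  return num;
}

// round() that accepts numeric text as well as numbers.
// Tie-breaking for .5 values is whatever Math.round does.
function xRound(value) {
  return Math.round(toNumber(value, "round"));
}
```

With this, a one-character chunk such as "1" can be divided and rounded directly, while genuinely non-numeric text still produces the same kind of error message as before.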

Plus the tokenizer has no idea how to process it, because it's getting the equivalent of a 'bag-o'-words' and can't make sense of which order they go in at the moment, so it doesn't know how to match that up to the relevant function. It's then failing with the "t is null" error, because it doesn't know what it should have run in the first place.

That's what I'm currently struggling with: you'd still end up with regex matching being parsed from the tokenizer, so I don't know if it makes much difference? (this is my first go at ever building this kind of thing from the ground up, after all)

Both approaches have their advantages and disadvantages as far as I can see.

Post by OpenXTalkPaul »

I do know the non-centralized tokenizer version works already, for many things. It is really great work, and it is certainly a usably fast interpreter already. But I have had some parsing errors with things like mixing in chunk or container expressions. I'm not sure which version I was trying. I could not set the left of btn X to the left of button Y (it might have been 'the loc' that I was trying). I know it's all work in progress, so I tend to chalk problems up to that.
[Attachment: Screen Shot 2025-04-01 at 2.09.25 PM.png]

Post by OpenXTalkPaul »

and can't make sense which order they go into at the moment,
The 'bag of words', isn't it a list/array? You should be able to iterate over them in order by index number.
So with put "Hello, World" the bagOfTokens[1] element would be 'put' and the second element, bagOfTokens[2], would be 'Hello, World'.

Post by tperry2x »

But this is where something like that Winkler test comes in. HyperCard / HyperTalk was perhaps too forgiving in this regard.
I mean, you can already be pretty random:
[Attachment: pretty-random.png]
Where you really confuse a tokenizer is with something like this:

Code:

put word 3 to 4 of "word of word with word and word"
It splits up all those "word of word with word and word" into separate tokens and gets properly confused, whereas the regex version returns:

Code:

word with
Which is correct.

The tokenizer JUST needs to see the thing between quotes as a single string, but then I'd be using regex matching to find the quotes... so it's kind of parsing each thing twice, as it's effectively looping back over itself, which would make it slower.

But as I say, the suggestion to tokenize everything does make sense, and I'm sure it's the better approach in the long run. This is why I'd really like to get it working properly, rather than using regex pattern matching. (Even if regex seems easier and faster at the moment). This is all new to me, so kind of finding my way as I go here.
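On the worry about parsing each thing twice: one way to avoid a second pass is a single regex with alternation, where the quoted-string alternative is tried first at each position. This is only a sketch under assumptions; the token shapes and the names TOKEN_RE and tokenize are hypothetical.

```javascript
// One alternation, tried left to right at each position:
//   1. a double-quoted string (kept whole)   2. an unsigned number
//   3. a word                                4. any other single symbol
const TOKEN_RE = /"([^"]*)"|(\d+(?:\.\d+)?)|([A-Za-z_][A-Za-z0-9_]*)|(\S)/g;

function tokenize(line) {
  const tokens = [];
  // matchAll walks the line once; each match becomes exactly one token.
  for (const m of line.matchAll(TOKEN_RE)) {
    if (m[1] !== undefined) tokens.push({ type: "string", value: m[1] });
    else if (m[2] !== undefined) tokens.push({ type: "number", value: m[2] });
    else if (m[3] !== undefined) tokens.push({ type: "word", value: m[3] });
    else tokens.push({ type: "symbol", value: m[4] });
  }
  return tokens;
}
```

Because the string alternative consumes everything up to the closing quote, the words inside the literal never become separate tokens, and an input such as put"Hello, world" still splits into a word followed by a string.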

Post by dandandandan »

You don’t tokenize begin quote, each individual token, end quote. The whole quoted string is one token.

Post by OpenXTalkPaul »

tperry2x wrote: Tue Apr 01, 2025 6:31 pm Where you really confuse a tokenizer is with something like this:

Code:

put word 3 to 4 of "word of word with word and word"
It splits up all those "word of word with word and word" into separate tokens and gets properly confused, whereas the regex version returns:

Code:

word with
Which is correct.

The tokenizer JUST needs to see the thing between quotes as a single string, but then I'd be using regex matching to find the quotes... so it's kind of parsing each thing twice, as it's effectively looping back over itself, which would make it slower.

But as I say, the suggestion to tokenize everything does make sense, and I'm sure it's the better approach in the long run. This is why I'd really like to get it working properly, rather than using regex pattern matching. (Even if regex seems easier and faster at the moment). This is all new to me, so kind of finding my way as I go here.
Yeah, what Dan said (and I alluded to earlier): anything between two double-quote marks should be a single token, and the tokenizer should not try to parse anything in between them (at least not until we get to syntax like 'do <script>'). Once a scanner/tokenizer hits a double-quote character it should go into a repeat loop and continue scanning forward, chomping off chars/bytes (bytes because Unicode) and appending them to the current token until it runs into another double-quote mark, which terminates the string literal. I think it should first check whether the expected expression is a quoted string literal, and if it is, chomp off the whole string and move on. This way the scanner/tokenizer would only need to check whether the parameter passed is some type of container/variable/chunk if it's not a string literal. So it would need something like case/switch, and the first case would be a check for a string literal (begins with a double-quote mark).

The tokens of:
put word 3 to 4 of "word of word with word and word"
should be:
[1] put -- xTalk lines usually start with a command/action verb
[2] word -- keyword that begins a chunk expression
[3] 3 -- expression that evaluates to a number
[4] to -- keyword indicating current chunk is a range rather than a single word
[5] 4 -- expression that evaluates to a number (OXT can use negative numbers to scan a string in reverse, from right to left)
[6] of -- keyword; an expression that evaluates to a container follows
[7] word of word with word and word -- the container in this case is a string literal
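The scanning loop described above, plus the chunk lookup for this one line, might look something like the following. It is a sketch only: scanTokens and wordRange are hypothetical names, every non-string token is simply labelled "word", and words are re-joined with single spaces rather than preserving the original spacing.

```javascript
// Hand-written scanner: a double quote switches to "consume until closing quote".
function scanTokens(line) {
  const tokens = [];
  let i = 0;
  while (i < line.length) {
    const ch = line[i];
    if (ch === " " || ch === "\t") { i++; continue; }
    if (ch === '"') {
      // String literal: append characters until the terminating quote.
      let j = i + 1;
      let text = "";
      while (j < line.length && line[j] !== '"') text += line[j++];
      if (j >= line.length) throw new Error("unterminated string literal");
      tokens.push({ type: "string", value: text });
      i = j + 1;
      continue;
    }
    // Bare word or number: consume up to the next space, tab or quote.
    let j = i;
    while (j < line.length && line[j] !== " " && line[j] !== "\t" && line[j] !== '"') j++;
    tokens.push({ type: "word", value: line.slice(i, j) });
    i = j;
  }
  return tokens;
}

// word <first> to <last> of <text>, using 1-based inclusive positions.
function wordRange(text, first, last) {
  const words = text.trim().split(/\s+/);
  return words.slice(first - 1, last).join(" ");
}

const t = scanTokens('put word 3 to 4 of "word of word with word and word"');
const answer = wordRange(t[6].value, Number(t[2].value), Number(t[4].value));
console.log(answer); // prints "word with"
```

The token array has seven entries in the same order as the list above, so the evaluator can pick them out by index.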

Post by OpenXTalkPaul »

If speed is a concern, I think that regardless of the parsing/tokenizing system used there could be optimizations specific to xTalk.
For example I think it's accurate to say that every line in xTalk starts with one of:
1) an action verb (put, set, answer, speak, play, go, etc.)
2) or is part of a control structure (if, then, else, repeat, case/switch, etc.)
3) or is a handler definition/handler termination (on, function, end)

So I was thinking a good strategy might be to check the first word for an action verb from a list started from most commonly used keywords to least commonly used keywords. So 'put' would be at the top of that list. When you know what command a line is executing then you have a sort of formula for what parameters would be required for that command.
For example 'put' would always be followed by a parameter that is 'an expression that evaluates to a string' (even if that string contains numeric characters), followed by an optional 'into' and then 'a container that can store a string' (variable, property, fld, etc.). So the formula there is something like:
<command> <string or container> (optionally) 'into' <container>
'Put' is a little odd because there's the automatic 'default' container of msg box (or to 'std out' / console log in some cases) if no 'into' container is specified.
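The 'put' formula above could be turned into a small parse step along these lines. This is a sketch under assumptions: the tokens are taken to be objects with type and value fields, and parsePut and the "message box" placeholder are names made up for illustration.

```javascript
// Hypothetical: tokens is an array of { type, value } objects from a tokenizer,
// where quoted literals have type "string" and keywords have type "word".
function parsePut(tokens) {
  if (tokens.length === 0 || tokens[0].value.toLowerCase() !== "put") {
    throw new Error("not a put command");
  }
  // Only a bare word 'into' counts; 'into' inside a string literal is ignored.
  const intoIndex = tokens.findIndex(
    (t, i) => i > 0 && t.type === "word" && t.value.toLowerCase() === "into"
  );
  const source = intoIndex === -1 ? tokens.slice(1) : tokens.slice(1, intoIndex);
  const target = intoIndex === -1 ? [] : tokens.slice(intoIndex + 1);
  if (source.length === 0) throw new Error("put: expected an expression");
  if (intoIndex !== -1 && target.length === 0) {
    throw new Error("put: expected a container after 'into'");
  }
  return {
    command: "put",
    source,
    // No 'into' means the default destination, called "message box" here.
    target: target.length > 0 ? target : [{ type: "default", value: "message box" }],
  };
}
```

Each command would get its own small function of this kind, chosen by looking up the first token of the line.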

Dan did mention in that doc we could use his BNFtoLPEG JS tool used for HC sim.
The docs on Dan's site:
https://hypervariety.com/BNFToLPEG/
Source on Github:
https://github.com/hyperhello/BNFToLPEG
Probably a good idea to work with that and improve on it if possible, but it already seems to work very well for HyperTalk.
I really like what Dan has done with the 'SimSCript', which makes it super-easy to expand on the interpreter's vocabulary.

https://www.jaedworks.com/hypercard/scr ... k-bnf.html
https://en.wikipedia.org/wiki/Backus–Naur_form
https://en.wikipedia.org/wiki/Parsing_e ... on_grammar
https://peggyjs.org
https://github.com/peggyjs/peggy
https://github.com/pegjs/pegjs/tree/master
https://coderwall.com/p/316gba/beginnin ... ith-peg-js
https://pest.rs/book/examples/csv.html
https://medium.com/@gvanrossum_83706/ad ... e00fa1092f
https://www.youtube.com/watch?v=XR36rbD6tRM
https://www.inf.puc-rio.br/~roberto/docs/peg.pdf
https://berthub.eu/articles/posts/pract ... g-parsing/
https://stackoverflow.com/questions/334 ... use-peg-js
https://stackoverflow.com/questions/524 ... arser?rq=4

My eyes start to glaze over when I look at those sorts of grammar expression definitions and try to make sense of them, but you can see in the Wikipedia PEG examples that there is sort-of-regex-like pattern matching mixed in there.
Expr ← Sum
Sum ← Product (('+' / '-') Product)*
Product ← Power (('*' / '/') Power)*
Power ← Value ('^' Power)?
Value ← [0-9]+ / '(' Expr ')'
I just noticed the L in BNFtoLPEG stands for Lua, that's interesting.
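For anyone whose eyes glaze over at the grammar notation, it may help to see that each rule in the Wikipedia example above maps onto one small function. A minimal hand-written recursive-descent sketch (not generated by any PEG tool, and with no whitespace handling because the grammar has none):

```javascript
// Evaluates arithmetic following the five PEG rules: Expr, Sum, Product, Power, Value.
function evaluate(input) {
  let pos = 0;
  const peek = () => input[pos];
  function expect(ch) {
    if (input[pos] !== ch) throw new Error(`expected '${ch}' at ${pos}`);
    pos++;
  }
  // Value <- [0-9]+ / '(' Expr ')'
  function value() {
    const m = /^[0-9]+/.exec(input.slice(pos));
    if (m) { pos += m[0].length; return Number(m[0]); }
    expect("(");
    const v = sum(); // Expr <- Sum
    expect(")");
    return v;
  }
  // Power <- Value ('^' Power)?   (right-associative)
  function power() {
    const base = value();
    if (peek() === "^") { pos++; return base ** power(); }
    return base;
  }
  // Product <- Power (('*' / '/') Power)*
  function product() {
    let v = power();
    while (peek() === "*" || peek() === "/") {
      const op = input[pos++];
      const rhs = power();
      v = op === "*" ? v * rhs : v / rhs;
    }
    return v;
  }
  // Sum <- Product (('+' / '-') Product)*
  function sum() {
    let v = product();
    while (peek() === "+" || peek() === "-") {
      const op = input[pos++];
      const rhs = product();
      v = op === "+" ? v + rhs : v - rhs;
    }
    return v;
  }
  const result = sum();
  if (pos !== input.length) throw new Error(`unexpected '${input[pos]}' at ${pos}`);
  return result;
}

console.log(evaluate("1+2*3")); // prints 7
```

The regex-like part is confined to the Value rule; operator precedence comes entirely from which function calls which.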

Oh, and another point I was thinking about with the parser/tokenizer: eventually we'd want it to maintain a record of the scanner/parser 'coordinates' (line/character) so we can find and highlight an offending script line when debugging our scripts.
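If the scanner only keeps a character offset for each token, the line/character pair can be worked out afterwards. A small sketch, with offsetToLineCol being a hypothetical helper name:

```javascript
// Convert a character offset in the script into 1-based line and column numbers.
function offsetToLineCol(source, offset) {
  let line = 1;
  let col = 1;
  for (let i = 0; i < offset && i < source.length; i++) {
    if (source[i] === "\n") {
      line++;
      col = 1;
    } else {
      col++;
    }
  }
  return { line, col };
}

// Example: find where an unterminated quote starts in a two-line script.
const script = 'put "a"\nput "b';
const where = offsetToLineCol(script, script.lastIndexOf('"'));
console.log(where); // prints { line: 2, col: 5 }
```

Storing just the offset on each token keeps the scanner simple, and the conversion only runs when an error actually needs reporting.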

You may have already seen these, but here's some links pertaining to Parsing Expression Grammars:
https://nathanpointer.com/blog/introToPeg
https://itnext.io/create-a-custom-parse ... e697313926
https://www.youtube.com/watch?v=EubNzfhZS_E
https://medium.com/@gvanrossum_83706/bu ... 869b5958fb
Here are some things that may be alternatives:
https://ohmjs.org https://ohmjs.org/pubs/live2016/
and an xTalk-related Node.js thing built on that, which looks similar to what's brewing here:
https://github.com/dkrasner/Simpletalk
https://simpletalk.systems
Antlr4 HyperTalk Grammar:
https://github.com/antlr/grammars-v4/bl ... perTalk.g4
https://news.ycombinator.com/item?id=2331234
Peg for C++ https://berthub.eu/articles/posts/pract ... g-parsing/

Post by dandandandan »

Don’t bother thinking about the parsing speed. If you’re optimizing for performance at all, the parsing is done once and the execution is streamlined in more detailed ways to be faster inside loops.
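The "parse once" idea can be illustrated with a simple cache keyed by the script line. This is a sketch only: parseLine is a stand-in for a real parser, and parseCount exists purely to make the effect visible.

```javascript
const parseCache = new Map();
let parseCount = 0; // only here to make the effect observable in this sketch

// Stand-in for a real parser (hypothetical): just splits on whitespace.
function parseLine(line) {
  parseCount++;
  return line.trim().split(/\s+/);
}

// Return the cached parse result, parsing only the first time a line is seen.
function getParsed(line) {
  let parsed = parseCache.get(line);
  if (parsed === undefined) {
    parsed = parseLine(line);
    parseCache.set(line, parsed);
  }
  return parsed;
}

// A loop body that runs 1000 times is still parsed only once.
for (let i = 0; i < 1000; i++) getParsed("add 1 to x");
console.log(parseCount); // prints 1
```

A real engine would more likely parse whole handlers up front, but the effect on loops is the same.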

Post by OpenXTalkPaul »

I think it should first check that the expected expression is a quoted string literal, and if it is then chomp off the whole string and move on, this way the scanner/tokenizer would need only go into checking to see if the parameter passed is some type of container/variable/chunk only if it's not a string-literal. So it would need something like case/switch, and the first case would be a check for string literal (begins with double-quote mark).
Quoting myself here. This wouldn't help anything anyway, because parsing would still need to do more checks for like a compound expression because there can be concatenation like so:

Code:

put "H" & "ello" & comma && "world!"
With the OXT or LCS interpreter you can also use a comma a bit like & and &&, for example:

Code:

put the short name of button 1, the first item of the backColor of button 1 -- would put something like "My Button Name,255"
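A compound expression like the ones above could be folded left to right once the operands are tokenized. A minimal sketch, assuming tokens alternate operand, operator, operand, and that only string literals and a few named constants appear as operands; evalConcat, resolveOperand and the CONSTANTS table are hypothetical names.

```javascript
// Named constants an operand may refer to (illustrative subset).
const CONSTANTS = new Map([["comma", ","], ["space", " "], ["empty", ""]]);

function resolveOperand(tok) {
  if (tok.type === "string") return tok.value;
  const key = tok.value.toLowerCase();
  if (CONSTANTS.has(key)) return CONSTANTS.get(key);
  throw new Error(`unknown operand: ${tok.value}`);
}

// tokens alternate: operand, operator, operand, operator, operand...
function evalConcat(tokens) {
  let result = resolveOperand(tokens[0]);
  for (let i = 1; i < tokens.length; i += 2) {
    const op = tokens[i].value;
    if (tokens[i + 1] === undefined) throw new Error(`missing operand after ${op}`);
    const right = resolveOperand(tokens[i + 1]);
    if (op === "&") result += right;              // plain concatenation
    else if (op === "&&") result += " " + right;  // concatenation with a space
    else if (op === ",") result += "," + right;   // concatenation with a comma
    else throw new Error(`unknown operator: ${op}`);
  }
  return result;
}

const hello = evalConcat([
  { type: "string", value: "H" }, { type: "op", value: "&" },
  { type: "string", value: "ello" }, { type: "op", value: "&" },
  { type: "word", value: "comma" }, { type: "op", value: "&&" },
  { type: "string", value: "world!" },
]);
console.log(hello); // prints "Hello, world!"
```

So the string-literal check is only the first step; the parser still has to keep reading operators after it.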

Post by tperry2x »

Certainly suffering with information overload now.
Leave it all with me - see you in about a week.

Post by OpenXTalkPaul »

Sorry for that flood of article/links, to some degree I posted them for myself to look at later when I have time.
That one research project is particularly interesting to me:
https://simpletalk.systems
That's a live editable 'stack'; right-click on something on that page and you'll see.
It's definitely an xTalk that does xCard / UI stuff using JS / web tech. It might be worth looking at the source code (although its UI is a little sluggish compared to HC sim, and there's some weirdness in how they've implemented properties):
https://github.com/dkrasner/Simpletalk

Post by tperry2x »

Just a quick progress update: I have implemented the parsing expression grammar (PEG), and I've also (using Dan's BNF as inspiration) created an independent expressionParser. Together with the tokenizer and the PEG, this means you don't end up with a huge monolithic interpreter.js file that solely uses regex (regular expressions). Now each function has its own js file (which is dynamically loaded and unloaded), so there's no need to declare the *.js files in the html file either.
(I'm making this as modular, extendable, simple to modify and, most importantly, easy to diagnose as I can.)
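For readers wondering what load-on-demand can look like, here is a generic sketch of a registry that loads a command's code the first time it is used and can drop it again. It is not the actual implementation: createCommandRegistry and the loader callback are hypothetical, and the loader is kept synchronous for brevity, whereas in a browser it would probably wrap a dynamic import(), which is asynchronous.

```javascript
// loadModule(name) must return the function that implements the command.
function createCommandRegistry(loadModule) {
  const loaded = new Map();
  return {
    // Load the command on first use, then call it with the given arguments.
    run(name, ...args) {
      if (!loaded.has(name)) loaded.set(name, loadModule(name));
      return loaded.get(name)(...args);
    },
    // Forget a command so its code can be garbage collected.
    unload(name) {
      return loaded.delete(name);
    },
    isLoaded(name) {
      return loaded.has(name);
    },
  };
}
```

Because nothing is registered up front, adding a command only means adding the file the loader knows how to find.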

What I'm currently working on is modifying the functions I've already written, as they need some slight tweaking.
There aren't many apparent visual changes from what I shared previously, Paul, but one of them is the addition of a "parserLogic" field on the index page. This will show you in real time what the tokenizer and parser are 'thinking'. I'm going to make this a toggle option (same as the message box) with a keyboard shortcut.

More to follow in the next few days...
[Attachment: v133-screenshot.png]
edit: (meant to add earlier)
This also resizes correctly now for mobile/tablet devices too:
[Attachment: dynamic-resizing-and-chunk.png]

Post by TerryL »

https://pepa.holla.cz/wp-content/upload ... ition1.pdf
Remarkable progress. I've been reading a javascript beginners guide.pdf. It's well written with examples and detailed chapters on arrays and regular expressions. Maybe it will give Tom and Paul some ideas.

Not that you don't already have your hands full, and are juggling life issues too, so forgive me. I checked WebTalk Doc-128 and couldn't find whether you've worked out...
- the target, target() --similar to 'me' but probably hard to translate
- result() --the result synonym
- global/local (script local and within a handler) declarations for container names.
- switch/case/break
- arrays --maybe include split and combine
- repeat while <condition>, repeat until <condition>, next repeat --example: repeat while intersect(fld "A", grc "Ball")
- specialFolderPath("desktop") --desktop, documents, resources
- window.find() //launch browser's find dialog
- window.confirm() //answer dialog with cancel and ok btns
- Can js be coaxed to translate:
answer "Please select a color." with "Cancel" or "Red" or "Green" or "Blue" --oxt: max = 7 buttons
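On the last item: window.confirm() only offers OK and Cancel, so a multi-button answer would probably need a custom dialog, but pulling the prompt and button labels out of the line is straightforward. A sketch, assuming the prompt and every button label are quoted string literals; parseAnswer is a hypothetical name and the 'or' separators are not validated.

```javascript
// Extract the prompt and the button labels from an 'answer' line.
function parseAnswer(line) {
  const m = /^\s*answer\s+"([^"]*)"(?:\s+with\s+(.*))?\s*$/i.exec(line);
  if (!m) throw new Error("answer: expected a quoted prompt");
  const buttons = [];
  if (m[2] !== undefined) {
    // Every quoted string after 'with' is taken as a button label.
    for (const b of m[2].matchAll(/"([^"]*)"/g)) buttons.push(b[1]);
  }
  // No 'with' clause: fall back to a single OK button.
  if (buttons.length === 0) buttons.push("OK");
  return { prompt: m[1], buttons };
}

const parsed = parseAnswer(
  'answer "Please select a color." with "Cancel" or "Red" or "Green" or "Blue" --oxt: max = 7 buttons'
);
console.log(parsed.buttons); // prints [ 'Cancel', 'Red', 'Green', 'Blue' ]
```

The resulting list could then drive whatever dialog the page builds, one button per label.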

Post by Kdjanz »

https://lynxjs.org

This may be a complete red herring, but I ran across this today; it has just been open sourced and moved to GitHub. I'm not sure if it has any relevance to the new direction here, but they talk about building for the web and native Android and iOS as well. It is based on .js of some sort, with a custom JS engine claimed to be lightning fast while still being tiny. What makes all of this somewhat credible to me is that it is being offered by ByteDance, who are famous for TikTok, so they obviously know something about doing things at scale and with serious quality control. Not your usual coders in a basement. I'd be very interested to have the gurus take a look at it and report back (explain it like I'm 5, please) on whether this is useful or going off in another direction. I don't know how this ties into Emscripten etc. and whether this could cut through that knot. But I hope that 15 minutes on their site will give either a quick thumbs down as useless or a big desire to take a closer look.

Hoping this might be the Ferrari engine that slides in under the sleek new body that is taking shape as we watch.

Post by Kdjanz »

https://bare.pears.com

I promise I will quit now! 8-)

But here is another ?engine? that makes relevant sounding noises to this novice:
Actually Run Javascript Everywhere

Bare is a small and modular JavaScript runtime for desktop and mobile. Like Node.js, it provides an asynchronous, event-driven architecture for writing applications in the lingua franca of modern software. Unlike Node.js, it makes embedding and cross-device support core use cases, aiming to run just as well on your phone as on your laptop. The result is a runtime ideal for networked, peer-to-peer applications that can run on a wide selection of hardware.
The modular aspect means that you only add to the tiny core what you actually use - HTML, file system access, or whatever. So only the essentials have to ship underneath our code, not the monster of Node or Electron.
So could someone tell me if this is useful or not?

Post by OpenXTalkPaul »

Kdjanz wrote: Fri Apr 04, 2025 4:46 am The modular aspect means that you only add to the tiny core what you actually use - HTML, file system access, or whatever. So only the essentials have to ship underneath our code, not the monster of Node or Electron.
So could someone tell me if this is useful or not?
Ideally I'd like a wrapper 'app' that uses a webview with whatever web engine the OS comes preinstalled with as I believe most OSes do nowadays. On macOS and recent 'Buntus that would be Webkit. I don't think it would be too difficult to make a shell app that creates a web view and loads the 'ide' / index.html, but Electron includes APIs for doing standalone .app type things, like direct file-system access, shell(), etc.

It could be. I'm certainly interested in alternatives to Electron that use Webkit or Gecko instead of the full-blown and rather large Chromium Embedded Framework.
Perhaps this subject (web app to standalone app) should be in its own topic.

Post by OpenXTalkPaul »

tperry2x wrote: Thu Apr 03, 2025 7:09 am Just a quick progress update: I have implemented the parsing expression grammar (PEG), and I've also (using Dan's BNF as inspiration) created an independent expressionParser. Together with the tokenizer and the PEG, this means you don't end up with a huge monolithic interpreter.js file that solely uses regex (regular expressions). Now each function has its own js file (which is dynamically loaded and unloaded), so there's no need to declare the *.js files in the html file either.
(I'm making this as modular, extendable, simple to modify and, most importantly, easy to diagnose as I can.)
Fantastic! Thanks, looking forward to testing it out.
In the meantime I've been doing some tinkering around with 132

I was wondering why the 'play' command wasn't working in Safari (but worked in Chrome and Firefox). I thought it was the 'user interaction required' problem (not sure if you're familiar with that mostly-Safari issue), but I WAS interacting with the document, so that shouldn't be an issue. It turns out it's just that Safari has no built-in support for playing .ogg files. I copied my boing .wav into the sounds folder and the play command works fine with that. For compressed samples, MP3 or MPEG-4 (.m4a) work fine too.
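One way the play command could cope with this is to ask the browser which format it can play before choosing a file. The sketch below uses the standard HTMLMediaElement.canPlayType() method, which returns "", "maybe" or "probably"; pickPlayableSource and the file paths are hypothetical, and the check is passed in as a function so the logic does not depend on a browser.

```javascript
// candidates: [{ url, mime }] in order of preference.
// canPlay: function(mime) returning "", "maybe" or "probably",
// in the same way HTMLMediaElement.canPlayType does.
function pickPlayableSource(candidates, canPlay) {
  for (const c of candidates) {
    // An empty string means the browser cannot play this type at all.
    if (canPlay(c.mime) !== "") return c.url;
  }
  return null;
}

// In a browser this could be called as:
//   const probe = new Audio();
//   const url = pickPlayableSource(sources, (mime) => probe.canPlayType(mime));
//   if (url) new Audio(url).play();
const sources = [
  { url: "sounds/boing.ogg", mime: "audio/ogg" },
  { url: "sounds/boing.wav", mime: "audio/wav" },
];
```

Shipping each sound in two formats and listing the preferred one first would let the same script work across browsers.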

Eventually it would be good to employ the Web Audio API there. You use it to make audio-processing graphs (a LOT like Apple's CoreAudio), which means we can have audio effects like reverb, pitch-shift sounds, stream audio, have multiple input sources, etc. Of course I'm also going to want playPMD's extended 'playSentence' and MIDI I/O ... eventually :-D :lol:

Post by richmond62 »

So? How on earth can one get anything to run on ALL browsers on ALL operating systems ALL of the time? That might be a very tall order.

Oregano on RISCOS?
https://richmondmathewson.owlstown.net/