We use Kaitai at work, and it seems like quite a clever tool at first glance, but it feels weirder the more I use it.
There's the small stuff: YAML is convenient as a universal language, but it means that declarations are very verbose - you can't use any sort of syntax sugar, you're basically writing the underlying AST that a more convenient language would parse to. It's also still YAML, with all the problems that come from that.
I also can't make head nor tail of the IDE. The big selling point is that it can parse test data you provide automatically as you build your Kaitai structure, but you can't edit the test data at all, and last I checked it didn't provide any sort of completion or hinting as you write your parser. So I might as well edit the structure in my editor of choice and occasionally copy it into the IDE if there's something I need to check, except then I also need to generate an interesting binary test case file and upload it, because, like I said, I can't edit those test cases in the IDE.
But the most important thing is that it literally only does parsing. Kaitai is a poor way to describe a binary API because if you do that, you'll get parsers for free, but you'll have to write all the generating code yourself - at which point, you have to wonder what advantage Kaitai is actually giving you. As I understand it, this isn't some missing feature that's in the works - the Kaitai team do not want to make this a two-way format, that is a complete non-goal.
Which means, as far as I can understand, the main purpose of Kaitai is to parse binary file formats like .PNG or .PDF files without the need for a special parser. It's basically a binary PEG (but, as mentioned before, in YAML).
Yeah, I've used Kaitai to parse binary game files. But ended up dropping it when I wanted to round trip back into binary. Might as well have the decode live next to the encode stuff in the same file at that point.
The nice thing about Kaitai is that it also comes with visualizers. Though the online IDE chokes on repeating frames in large files, the Ruby library can handle it, and it's decent for figuring out what's going wrong.
Finally, I wish Kaitai offered a better dev experience. Its errors are surprisingly barren for a tool that has one job.
I was sort of building a language to do similar tasks as Kaitai before I knew Kaitai was a thing. Once I found out about Kaitai, I sort of lost interest.
https://cloudef.pw/fspec-guide.html
Assuming this is a Show HN thread, cool stuff! Consider using IBM CP437 or Braille instead of periods in the hex editor. I talk about it here on my web page. https://justine.lol/braille/ It's hard to truly understand binary without a binary alphabet.
I am finding the choice of braille characters confusing: I would have assumed that one would use the 8 dots to represent the 8 bits, and that seems to be happening for the very high values--like 0xfX--but then 0x81 should have two dots... what is the logic? (I did verify that Unicode seems to have a "full set", as I could have seen them deciding to only encode ones that were "used" or something, and maybe braille had been designed explicitly to be readable upside down or something and so only used half the options, but I found a chart mapping the characters for all 256.)
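For reference, the plain Unicode ordering does map one dot to one bit: the code point is U+2800 plus a bitmask in which bits 0 through 7 stand for dots 1 through 8 (dots 1, 2, 3, 7 down the left column and 4, 5, 6, 8 down the right). A small sketch of that straight mapping follows. The function names are made up here, and this is not necessarily the ordering the linked page uses.

```python
BRAILLE_BASE = 0x2800  # U+2800 is the blank braille pattern


def byte_to_braille(value: int) -> str:
    """Map a byte to the braille pattern whose raised dots are its set bits."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte")
    return chr(BRAILLE_BASE + value)


def dot_count(ch: str) -> int:
    """Count the raised dots in a braille pattern character."""
    return bin(ord(ch) - BRAILLE_BASE).count("1")


def braille_dump(data: bytes) -> str:
    """Render a byte string as one braille cell per byte."""
    return "".join(byte_to_braille(b) for b in data)


# 0x81 has bits 0 and 7 set, so it maps to dots 1 and 8.
cell = byte_to_braille(0x81)
assert dot_count(cell) == 2
```

Under this ordering 0x81 does come out with exactly two dots, so if a hex view shows something else for that byte, it is presumably using a permuted table rather than the raw code point offset.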
It's not a Show HN, Kaitai is pretty well established and used pretty widely. Lots of people behind it, and it gets posted here periodically. It's also open source, you could open an issue with them. They're pretty receptive to ideas, given my limited experience with them.
Works fine on Firefox on Win10 for me. Might have something to do with Let's Encrypt. I'll have a chance to poke into max length and server logs a week from now. Thanks for letting me know!
We use Kaitai at Libre Space Foundation (https://libre.space) to parse data from satellite transmissions collected by the SatNOGS (https://network.satnogs.org) ground-station network.
Parsing this data allows us to build Grafana powered dashboards (https://dashboard.satnogs.org, click on the list of active satellites).
I recently considered using Kaitai for a project but I am not sure whether it can handle the following:
Certain parts of the binary I need to parse are compressed using some custom algorithm. I do have an implementation for compression and de-compression in C++. Is it possible to provide this to Kaitai? Maybe via some C FFI mechanism. From the looks of it, Python Construct can cover this; however, I'd really like to use Kaitai's IDE and therefore Kaitai directly.
Technically, it is possible, but my approach is generally to use Kaitai only for uncompressed/deobfuscated stuff. So I just do the decompression ahead of time and feed that into Kaitai. That way, I can use WebIDE/etc.
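A rough sketch of that "decompress first, parse second" step, in case it helps. Everything specific here is an assumption: `zlib` stands in for the custom algorithm, and the container layout (an 8-byte plain header followed by a compressed body) is hypothetical.

```python
import zlib

HEADER_SIZE = 8  # hypothetical: size of the uncompressed header


def decompress_custom(blob: bytes) -> bytes:
    # Stand-in only: zlib takes the place of the custom algorithm.
    # In practice this would call into the existing C++ implementation.
    return zlib.decompress(blob)


def flatten(container: bytes) -> bytes:
    """Return the container with its compressed body expanded in place."""
    header, body = container[:HEADER_SIZE], container[HEADER_SIZE:]
    return header + decompress_custom(body)


# Build a sample container: 8 header bytes, then a compressed payload.
sample = b"HDR\x00\x01\x00\x00\x00" + zlib.compress(b"payload bytes" * 4)
flat = flatten(sample)
assert flat[:HEADER_SIZE] == sample[:HEADER_SIZE]
```

The flattened bytes can then be written to a file and loaded as the sample input, so the spec only ever has to describe the uncompressed layout.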
At first I wasn't quite sure what I was looking at -- but wow, this project rocks!
Kaitai is like protobuf, but for binary formats. It's an incredibly clever way to not write n*m-many parsers for each language. You write a spec for your binary format and the Kaitai libraries in each language can then decode (and presumably encode) the payloads. Gif, wav, pdf, zip, fbx, your custom format, whatever.
This web IDE makes it easy to visualize and develop parser specs against actual data.
The concept is great, but there are a handful of limitations that become more apparent with actual use. The main one is that there is no encoding, just decoding. The generated code and the representation of the parsed data also sometimes feel unidiomatic for the language used, particularly in cases where I found I had to add extra levels of indirection to satisfy Kaitai.
Yeah, and there are subtle "bug"-type behaviors that are actually correct but incredibly inconvenient. I've gone back and forth on using it (I'll reach for it if I'm just parsing something mostly trivial but not for much else these days).
Can Kaitai parse a binary file/stream containing heterogeneous binary structures? The documentation doesn't seem that clear on this. Also, the structure definition seems to need compiling. What if the structure changes at runtime? Can it handle that?
Is it possible for an application to have editable structures and have Kaitai parse them without needing to recompile the whole application? (I'm speaking here in a Java context.)