Wonderful Toolchain project update - October 2024

Published on October 17, 2024

It’s been a while. The last year has not been as hectic as the periods covered by the previous few updates, but I have done some things nonetheless.

First of all, I bought a WonderWitch! Despite the cries of my wallet, this will enable me to look deeper into its ecosystem now.

[Image: WonderWitch]

… Oh, right. The toolchain! Right. Let’s talk about it.

Windows support

Visual Studio Code (top left), a terminal window showing the compilation process (bottom left) and the Mednafen emulator showcasing an example program (right, obscured).

Thanks to the bring-up work done by Generic, the Wonderful toolchain is available on Windows as of September 2023, opening up WonderSwan homebrew development to many more people!

Porting WonderWitch applications

libwwcl used to port JBKun’s Starship demo.

Many former WonderSwan hobbyist developers may be familiar with the API functions provided by the WonderWitch’s libww and libwwc. While Wonderful has had a (partial) reimplementation of these libraries for some time, it still relied on FreyaBIOS being present and on calls into it. This means the libraries were only usable in the wwitch target, which builds .fx-format binaries and requires a WonderWitch cartridge to run homebrew created in this manner.

As one way to help resolve this problem, I’ve decided to repackage my work on an open-source reimplementation of FreyaBIOS in the form of a library - libwwcl[1]. This allows using many of the graphics/sound-related functions while still creating a “bare metal” cartridge build.

While on the subject of WonderWitch support, Wonderful’s libww has recently been rewritten to generate its 120+ FreyaBIOS interrupt wrappers from XML definition files. This happened to fix a few bugs along the way. Many missing definitions and functions were also added. However, the reimplementation effort is still not complete - most notably, indirect library and file I/O support remains absent.

Mesen 2

This part doesn’t have anything to do with the toolchain’s development, other than that I got to assist in Mesen 2’s development a little as a kind of consultant[2]. However, it is important news for the toolchain’s users!

Mesen 2 emulator running the Wondercell homebrew game in its WonderSwan core, with the debugger and profiler open.

Last month, an update to the Mesen 2 emulator added a new WonderSwan core. It combines competitive emulation accuracy with one of the more beloved debugger interfaces. In addition, support for importing symbols from ELF files generated by Wonderful has been added to aid in debugging.

You can read more and download builds here.

New wiki

The documentation used to be scattered across many pages and files, which was frustrating for users and myself alike. As such, I set up a wiki to centralize all documentation - it’s available here.

As part of this work, I’ve also published documentation for the versions of binutils-ia16 and gcc-ia16 packaged with the toolchain. As a reminder, low-level WonderSwan hardware documentation is available on the WSdev Wiki instead.

Data-in-SRAM targets

The WonderSwan has two types of RAM available:

  • internal RAM (IRAM), 16 KB (or 64 KB on Color), shared with graphics and audio data, full access speed.
  • cartridge RAM (SRAM), up to 64 KB available without banking, reduced access speed[3].

However, the 16-bit x86 architecture can only address up to 64 KB of memory at a time for static data, as well as up to a separate 64 KB of memory for the stack. The Wonderful toolchain made use of this as follows:

  • on the bare metal (wswan) target, static data and stack were both placed in IRAM. In this model, SRAM requires explicit “far” pointers to access;
  • on the WonderWitch (wwitch) target, its official layout was used, where the stack was placed in IRAM, but static data was placed in SRAM.
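
To make the 64 KB limit concrete: on 16-bit x86, a "near" pointer is only a 16-bit offset into whichever segment is currently selected, while a "far" pointer also carries the segment. The following is a minimal host-side sketch of that address arithmetic in plain C. The name `far_to_linear` is hypothetical and is not the toolchain's `ws_ptr_to_linear`, whose exact signature is not shown here.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the toolchain): converts a
 * segment:offset pair to a linear address on a CPU with a 20-bit
 * address bus. Linear = segment * 16 + offset, wrapped to 20 bits. */
uint32_t far_to_linear(uint16_t segment, uint16_t offset) {
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* With a fixed segment, the 16-bit offset can only reach this many
 * bytes, which is where the 64 KB static data limit comes from. */
uint32_t near_reach_bytes(void) {
    return (uint32_t)UINT16_MAX + 1u;
}
```

Note that several segment:offset pairs can name the same byte, so far pointers generally need to be normalized before they are compared.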

However, observers noted that retail games written in C generally follow the latter convention. While less performant, this makes some sense - on the “mono” WonderSwan in particular, a graphically heavy game might end up using 12-13 KB of its 16 KB of IRAM for graphics data, leaving very little room[4] for the game state. As such, I have added targets which take advantage of this:

  • wswan/small-sram: up to 64 KB of code, static data in SRAM, stack in IRAM;
  • wswan/medium-sram: up to 768 KB of code, static data in SRAM, stack in IRAM.

Note that you still need to enable SRAM in wfconfig.toml - otherwise, the game will end up having an incredible 0 KB (!) of SRAM available.

Static data can still be placed in IRAM by using the __wf_iram modifier (for example, int __wf_iram number_in_iram;), but cannot currently be pre-initialized in any way. This may be improved in the future.
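
Since such variables cannot carry an initializer, one workaround is to assign their starting values at runtime. This is a minimal sketch of that pattern. It assumes `__wf_iram` is provided as a macro by the toolchain's headers, so the fallback definition only exists to let the example build on a host compiler. The variable and function names are hypothetical.

```c
#include <stdint.h>

/* Assumption for this sketch: on the toolchain, __wf_iram comes from its
 * headers as a macro. This empty fallback is for host builds only. */
#ifndef __wf_iram
#define __wf_iram
#endif

/* No initializers here: IRAM-placed data cannot currently be
 * pre-initialized on the SRAM targets. */
uint8_t __wf_iram player_lives;
uint16_t __wf_iram score_table[4];

/* Hypothetical init routine, meant to be called once at startup,
 * before any of the variables above are read. */
void game_state_init(void) {
    player_lives = 3;
    for (int i = 0; i < 4; i++) {
        score_table[i] = (uint16_t)(100 * (i + 1));
    }
}
```

Calling `game_state_init()` at the top of `main()` keeps all starting values in one place, which should make it easy to move them back into initializers if the toolchain gains support for them.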

Library changes

Many minor changes were made to the libraries comprising the WonderSwan target:

  • Added BCD helper functions: wsx_bcd8_to_int, wsx_bcd16_to_int.
  • Added dynamic heap allocation on wswan targets: brk(), sbrk(), malloc(), realloc(), free(), free_sized(), calloc(), strdup().
  • Added helper functions for hardware DMA: ws_dma_set_source, ws_dma_set_length, ws_ptr_to_linear.
  • Added ws_screen_put_tiles_ex(), which allows using a rectangle from a source tilemap.
  • Added ws_system_get_model() and Pocket Challenge V2 keymap defines.
  • Fixed many bugs in EEPROM handling.
  • Fixed memcmp(), memmove(), strcasecmp(), strcat() and strncat() implementations.
  • Fixed support for placing pre-initialized memory in the Color-exclusive 48 KB of IRAM.
  • Improved hardware.h definitions.
  • Optimized memcpy() and strlen().
  • Optimized LZSA, ZX0 and planar unpackers.
  • Wrote tests for much of the above.
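
For readers unfamiliar with BCD: packed binary-coded decimal stores one decimal digit per 4-bit nibble, so 0x59 means fifty-nine. The sketch below shows the general technique in portable C. The function names are hypothetical stand-ins, and the actual `wsx_bcd8_to_int` and `wsx_bcd16_to_int` may differ in signature and in how they treat invalid input.

```c
#include <stdint.h>

/* Hypothetical stand-in for a BCD helper: converts one packed BCD byte
 * (two decimal digits) to its integer value. Nibbles above 9 are not
 * validated in this sketch. */
uint8_t bcd8_to_int(uint8_t bcd) {
    return (uint8_t)((bcd >> 4) * 10 + (bcd & 0x0F));
}

/* Same idea for four decimal digits: the high byte counts hundreds. */
uint16_t bcd16_to_int(uint16_t bcd) {
    return (uint16_t)(bcd8_to_int((uint8_t)(bcd >> 8)) * 100
                      + bcd8_to_int((uint8_t)(bcd & 0xFF)));
}
```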

Tool changes

Many minor changes were also made to tools:

  • Linker: Added support for setting a different final ROM bank than the last one (rom_last_bank).
  • Linker: Fixed --trim.
  • Linker: Fixed .rodata handling.
  • Linker: Fixed clearing sections which only contain relocated data.
  • Linker: Fixed ia16-elf-objdump section list - SHT_NULL sections are now preserved.
  • Linker: Fixed some edge case crashes.
  • bin2c: Minor optimizations.
  • compile_commands.json helper: Rewrote in C for improved performance.
  • wf-process: Added support for .s output.
  • wf-process: Added support for outputting headers for files over 32767 bytes.

That’s it for the WonderSwan changes, but there is one more thing.

BlocksDS

BlocksDS is a fork of the homebrew NDS toolchain, created by AntonioND[5]. While my toolchain project provides packaging and a compiler for it as a courtesy, this is not the remaining project to discuss! Despite me contributing to it occasionally, it’s a separate project from Wonderful and I don’t control it. If you’re curious about its progress, I recommend reading Antonio’s blog post from May, as well as the extensive changelog.

Maybe I’ll end up writing a blog post about my work on this project separately.

Experimental GBA target

There we are.

The WonderSwan is not the only handheld I’ve played with throughout the last fourteen months. Among them is the device[6] which I began my homebrew journey on, as a ten-year-old who had just discovered a copy of the then-early devkitPro updater: the Game Boy Advance.

To be fair, it wasn’t something I planned to do; GValiente of Butano fame directly led to its creation. While there are many alternate GBA toolchain compilations these days (the CMake-based gba-toolchain or meson-gba, for example), none of them provide binaries. Meanwhile, I was already packaging an ARMv4 C compiler as part of my above-mentioned assistance with BlocksDS.

There’s not much that goes into providing a bare-bones GBA toolchain. Any toolchain which can emit a bare-metal ARM binary with a specific memory map and header format is well on its way to becoming one. After providing startup code which initializes the stack pointers and copies data from ROM to RAM, it was pretty much ready for use with Butano. However, I don’t personally like starting work on targets which only do what has already been done. I like to toy with new ideas and see how they play out in practice, and thankfully I managed to implement a few of the things on my wishlist of experiments.
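
As a rough idea of what the "copies data from ROM to RAM" step involves, here is a host-side C sketch. It is not the toolchain's actual startup code, which would normally be written in assembly and use addresses supplied by the linker script. The arrays and the function name are hypothetical stand-ins, and clearing zero-initialized memory is included as a step startup code typically performs rather than something stated above.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for linker-provided regions (assumptions for a host build). */
const uint8_t rom_data_image[4] = { 1, 2, 3, 4 }; /* initial values stored in ROM */
uint8_t ram_data[4];                              /* runtime location of that data */
uint8_t ram_bss[8];                               /* zero-initialized variables */

/* Hypothetical C rendition of the memory setup done before main(). */
void startup_init_memory(void) {
    /* Initialized variables live in RAM, but their starting values
     * must come from the ROM image. */
    memcpy(ram_data, rom_data_image, sizeof(ram_data));
    /* Variables without an initializer are expected to start as zero. */
    memset(ram_bss, 0, sizeof(ram_bss));
}
```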

The first one is the ability to set the default data region to either the 32 KB of IWRAM (on-chip RAM) or the slower 256 KB of EWRAM (external RAM). Contrary to the community standard, Wonderful places data in EWRAM by default[7], but it allows you to change this.

The second and more notable one is IWRAM memory overlays. This feature provides an easy way to define overlapping code overlays in IWRAM - by listing all combinations that the user expects to be available simultaneously, the linker wrapper figures out how to arrange them in memory:

# In the following configuration:
[memory.overlay]
iwram = [
  ["one", "two"],
  ["one", "three"],
  ["four"]
]
# overlays "one" and "two" can co-exist, or "one" and "three", but not "two" and "three".
# alternatively, overlay "four" can be loaded, but not the other ones.

// a function can be assigned to an overlay like this...
__attribute__((noinline, section(".iwram_one"))) void func_a(void) {
    // ...code...
}

// ...and launched like this:
gba_overlay_load_iwram(one);
func_a();
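
The co-existence rule in the configuration comments can be expressed as a small check: two overlays may be resident together only if some listed combination names both of them, and any pair that never appears together is free to share the same IWRAM addresses. The sketch below models just that rule in host-side C. It is not the linker wrapper's actual algorithm, and all type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PER_COMBO 4

/* One entry of the iwram = [...] list: overlays expected to be loaded
 * at the same time. Unused slots are NULL. */
typedef struct {
    const char *names[MAX_PER_COMBO];
} overlay_combo;

/* Returns 1 if some combination lists both overlays, 0 otherwise. */
int overlays_can_coexist(const overlay_combo *combos, size_t count,
                         const char *a, const char *b) {
    for (size_t i = 0; i < count; i++) {
        int has_a = 0, has_b = 0;
        for (size_t j = 0; j < MAX_PER_COMBO && combos[i].names[j]; j++) {
            if (strcmp(combos[i].names[j], a) == 0) has_a = 1;
            if (strcmp(combos[i].names[j], b) == 0) has_b = 1;
        }
        if (has_a && has_b) return 1;
    }
    return 0;
}

/* The three combinations from the configuration above. */
const overlay_combo example_combos[] = {
    { { "one", "two" } },
    { { "one", "three" } },
    { { "four" } },
};
```

With this input, "two" and "three" never share a combination, so a layout may place them at overlapping addresses, while "one" has to stay clear of both.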

There are other features I’d like to explore, such as:

  • compressed IWRAM and VRAM data as part of a multi-boot binary; after decompressing to IWRAM and VRAM, the area can be reused for BSS/noinit sections.
  • FatFs-based built-in filesystem driver with support for modern storage devices.

Unfortunately, I haven’t worked on any GBA-related project in a while. As such, the target has withered in this bare-bones state, without much in the way of libraries other than a hastily packaged port of libtonc. I’m busy with other hobby projects, but maybe one day…

The end

Thank you for reading! While I have seen less adoption of my WonderSwan toolchain than I had hoped for, I still enjoy tinkering with it from time to time. Onwards to another year of hacking!

Footnotes


  1. WW Compatibility Layer.

  2. Both ways - Sour’s questions helped create many new test ROMs and fix mistakes on the WSdev Wiki.

  3. IRAM can be accessed on a 16-bit bus with zero wait states. SRAM, conversely, can only be accessed on an 8-bit bus, and there is one cycle of additional wait state per access (which can be disabled on Color models).

  4. On the other hand, 2 KB was enough for NES developers…

  5. Built on the back of dozens of contributors over the past twenty years, who are listed here.

  6. NO$GBA, actually. If I recall correctly, I only got a DS Lite some time later.

  7. My rationale here is that “hot code” is typically tuned by hand by the programmer, while static data may end up being pulled in from external libraries or be scattered across a codebase being ported.