Introduction
This is a YM player for the Neo Geo Pocket Color, which allows the NGPC to play music created for the YM2149 and AY-3-8910/12 programmable sound generators. These chips were found, for example, in the Atari ST, the ZX Spectrum and the Amstrad CPC.
The package consists of some playback code and a ROM builder tool (nymptool). The purpose of the ROM builder tool is simply to inject YM files into the player binary at the correct location (this operation can be performed by some other tool if that is preferred).
One important restriction is that both the ROM builder tool and the player support only uncompressed YM files (maximum size: 384 kB). You also cannot use songs that rely on any of the extra features of the YM format, such as digi drums.
Most YM files found on the internet are compressed using the LHA compressor. These files must be decompressed before they can be used.
If 7zip is installed, this should simply be a matter of right-clicking the YM files and selecting "Extract here..".
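A quick way to tell whether a given file still needs decompressing is to look at its first few bytes. The sketch below is only an illustration and is not part of nymptool. It assumes that uncompressed YM files start with one of the usual "YM..." signatures and that LHA archives carry a "-lh?-" method ID at offset 2. It also assumes that "384 kB" means 384 * 1024 bytes. The function name classify_ym is hypothetical.

```python
# Assumed signatures of uncompressed YM files (not taken from the player source).
YM_MAGICS = (b"YM2!", b"YM3!", b"YM3b", b"YM4!", b"YM5!", b"YM6!")

# Assumption: the 384 kB limit is interpreted as 384 * 1024 bytes.
MAX_YM_SIZE = 384 * 1024


def classify_ym(data):
    """Return a rough classification of the raw bytes of a .ym file."""
    if data[:4] in YM_MAGICS:
        return "too large" if len(data) > MAX_YM_SIZE else "uncompressed"
    # LHA headers store a method ID such as "-lh5-" at byte offset 2.
    if len(data) >= 7 and data[2:5] == b"-lh" and data[6:7] == b"-":
        return "lha-compressed"
    return "unknown"
```

Reading a file with open(path, "rb").read() and passing the result to classify_ym gives a hint about whether it has to go through 7zip first.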
I'm not aware of any emulator in existence that runs my player correctly, although I haven't tried all of them. It does work fine on a real NGPC, as you'll see if you've got a flashmasta, pocket flash, or Bung cart (or some other means of running ROMs on an NGPC).
Why am?
YM files are register logs for the YM/AY chip, taken at a regular interval (typically 50 times/second).
These registers control, among other things, a number of oscillators, a mixer, and a shift register used for noise generation.
The YM/AY chip has the following features:
- Three channels.
- Each channel can output a square wave with a 50% duty cycle (aka "tone") and/or noise, as determined by the mixer register.
- The tone of each channel is controlled by its own 12-bit period counter, making each channel capable of generating frequencies down to about 30 Hz when the chip is clocked at 2000000 Hz.
- The 5-bit noise period counter is shared between all three channels. The noise waveform is generated by a 17-bit LFSR with two taps.
- There's a shared envelope generator with a 16-bit period counter and sixteen preset waveforms (e.g. triangle and sawtooth).
- Each channel can either be modulated by the EG or use a flat envelope at one of sixteen different levels.
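The numbers in the list above can be checked with a few lines of Python. This is a minimal sketch, not code from the player. It assumes the usual relation f = clock / (16 * period) for the tone channels, which matches the "about 30 Hz" figure. It also assumes that the two LFSR taps sit at bits 0 and 3, as emulators commonly implement it. The function names are made up for this example.

```python
YM_CLOCK_HZ = 2_000_000  # chip clock quoted in the text


def ym_tone_frequency(period, clock_hz=YM_CLOCK_HZ):
    """Tone frequency in Hz for a 12-bit period value."""
    period = max(period & 0xFFF, 1)  # assumption: a period of 0 behaves like 1
    return clock_hz / (16 * period)


def ym_noise_step(state):
    """Advance a 17-bit LFSR by one step (assumed taps: bits 0 and 3)."""
    feedback = (state ^ (state >> 3)) & 1
    return (state >> 1) | (feedback << 16)


def lfsr_cycle_length(seed=1):
    """Count the steps until the LFSR returns to its (non-zero) seed."""
    state = ym_noise_step(seed)
    steps = 1
    while state != seed:
        state = ym_noise_step(state)
        steps += 1
    return steps
```

ym_tone_frequency(4095) gives the lowest tone the chip can produce at this clock, and lfsr_cycle_length() shows how long the noise sequence runs before it repeats under the assumed taps.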
Second cousin
Meanwhile, the NGPC's sound chip has the following features:
- Three tone channels outputting square waves with a 50% duty cycle.
- One noise channel capable of generating white noise at 3 different frequencies. A separate 10-bit counter can also be used for finer control over the noise frequency.
- The noise channel can also be set to output a square wave with a 6.25% duty cycle.
- Each tone channel is controlled by its own 10-bit period counter. With a 3.072 MHz clock, this allows for frequencies down to about 94 Hz.
- There's no hardware EG, so only flat envelopes can be set (at one of sixteen different levels). The NGPC has stereo sound output through its headphone jack, and separate levels can be set for left and right on each channel.
- The noise waveform is generated by a 15-bit LFSR with two taps.
The differences in the two chips' feature sets make it difficult to faithfully emulate the YM/AY by means of mapping register values to corresponding settings for the NGPC's sound chip.
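To make the problem concrete, here is a rough sketch of what such a register mapping would look like for the tone period alone. It is not something the player does. It assumes f = clock / (16 * period) on the YM/AY and f = clock / (32 * period) on the NGPC's sound chip, which agree with the 30 Hz and 94 Hz figures quoted above. The function name is hypothetical.

```python
YM_CLOCK_HZ = 2_000_000     # YM/AY clock quoted in the text
NGPC_PSG_CLOCK_HZ = 3_072_000  # NGPC sound chip clock quoted in the text
NGPC_MAX_PERIOD = 1023      # 10-bit period counter


def ngpc_period_for(ym_period):
    """Map a 12-bit YM tone period to the nearest NGPC period, or None.

    Equating the two frequency formulas gives
    ngpc_period = ym_period * (NGPC clock * 16) / (YM clock * 32),
    which is ym_period * 96 / 125 for the clocks above.
    """
    ngpc_period = round(ym_period * 96 / 125)
    if ngpc_period < 1 or ngpc_period > NGPC_MAX_PERIOD:
        return None  # the NGPC chip cannot reach this frequency
    return ngpc_period
```

Under these assumptions every YM period above 1332 (that is, every tone below roughly 94 Hz) has no NGPC equivalent, and that is before considering the envelope generator and the shared noise period.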
Make your own kind of music
Luckily, the NGPC provides another way of outputting sounds: if the sound chip is switched off, the main CPU can write directly to the NGPC's two 8-bit audio DACs.
Doing this requires us to have some 8-bit unsigned PCM data, generated at as high a sample rate as possible to be able to reproduce a wide spectrum of frequencies.
Let's review some of the other properties of the NGPC:
- Main CPU: TLCS-900/H @ 6.144 MHz. Sound CPU: Z80 @ 3.072 MHz.
- The TLCS900 has access to 16 kB of work RAM connected through a 16-bit data bus, and up to 4 MB of ROM connected through an 8-bit data bus.
- The Z80 can be switched on/off. The only memory it has access to is the top 4 kB of work RAM, plus some MMIO registers. The Z80 doesn't have access to the audio DACs.
- Having both processors accessing the shared work RAM area at the same time will impair the performance of both of them, but more so for the Z80.
- Each pair of cycles on the TLCS900 is known as a state. The fastest instructions have execution times of 2 states, yielding a theoretical maximum of 1.536 MIPS.
- The TLCS900 has four 32-bit general-purpose registers in four banks (i.e. sixteen in total). In addition there are three 32-bit index registers and a stack pointer which are shared across all banks.
- Most instructions can use any 8-bit, 16-bit or 32-bit part of any register, in any bank. However, using registers other than those in the current bank requires extra bytes in the machine code sequence, and incurs an extra state when executed.
- There are four hardware timers which can generate interrupts at frequencies up to 48000 Hz.
- There are four microDMA channels which can be set to transfer a small amount of data whenever a given interrupt occurs (e.g. a timer interrupt).
- There's a hardware watchdog which needs to be cleared periodically, or the system will shut down. This is typically handled in the vertical blanking interrupt service routine.
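The timing figures in this list follow from simple arithmetic, sketched below. The constants come from the list above; the helper names are made up for this example.

```python
CPU_CLOCK_HZ = 6_144_000   # TLCS-900/H clock
CYCLES_PER_STATE = 2       # one state is a pair of clock cycles
FASTEST_INSTRUCTION_STATES = 2

STATES_PER_SECOND = CPU_CLOCK_HZ // CYCLES_PER_STATE


def peak_mips():
    """Theoretical peak, if every instruction took only 2 states."""
    return STATES_PER_SECOND / FASTEST_INSTRUCTION_STATES / 1e6


def states_per_sample(sample_rate_hz):
    """CPU states available between two output samples at a given rate."""
    return STATES_PER_SECOND // sample_rate_hz
```

For example, states_per_sample(48000) shows that driving the DACs at the timers' maximum rate would leave only 64 states per sample, while a lower rate such as 16000 Hz leaves 192.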
..192 CPU states on the wall
Based on this I designed my YM emulation code as follows:
- Only the TLCS900 is used. The Z80 is switched off.
- The emulation code runs entirely in work RAM, as the wider data bus yields better performance.
- Samples are generated at 16000 Hz, giving me 192 states per sample.
- Data is written to a 4 kB circular buffer in work RAM, from which data is transferred to the audio DACs at 16000 Hz by one of the microDMA channels.
- A simple accumulating mixer is used. Each channel is allowed to output at slightly less than 1/3 of the maximum output level, and the sum of the three channels is written to the PCM buffer.
- For the sake of performance and simplicity, the EG is always emulated as having 16 steps even though the YM2149 has 32.
- The code uses all registers in all banks. There's a total of 4 memory accesses in the emulation loop, excluding instruction fetches.
- No explicit filtering is used. The period counters are represented as 32-bit fixed point numbers (16.16), except the noise period which is represented using 16 bits (8.8 fixed point).
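The general idea behind the fixed-point period counters and the accumulating mixer can be sketched in Python. This is not the actual TLCS900 code, only one plausible way to arrange it. It assumes a 2 MHz YM clock and a tone counter that ticks once every 8 chip clocks, so that each output sample advances the counter by a constant 16.16 step. The per-channel ceiling of 84 is an arbitrary value just under 255 / 3. Envelope and noise are left out, and all names are hypothetical.

```python
SAMPLE_RATE_HZ = 16000
YM_CLOCK_HZ = 2_000_000  # assumed YM clock

# Counter ticks per output sample in 16.16 fixed point (15.625 ticks here).
STEP_16_16 = ((YM_CLOCK_HZ // 8) << 16) // SAMPLE_RATE_HZ

CHANNEL_MAX = 84  # slightly less than 1/3 of 255, so three channels never clip


class ToneChannel:
    def __init__(self, period, level):
        self.period = max(period, 1) << 16  # YM period as 16.16
        self.counter = 0
        self.high = False
        self.level = min(level, CHANNEL_MAX)

    def next_sample(self):
        """Advance by one output sample and return this channel's level."""
        self.counter += STEP_16_16
        while self.counter >= self.period:
            self.counter -= self.period
            self.high = not self.high  # square wave toggles once per period
        return self.level if self.high else 0


def render(channels, count):
    """Sum the channels into 8-bit unsigned PCM samples."""
    return bytes(sum(ch.next_sample() for ch in channels) for _ in range(count))
```

Calling render([ToneChannel(284, 84)], 16000) produces one second of a tone close to 440 Hz. In the real player the result would go into the circular buffer that the microDMA channel drains, rather than into a Python bytes object.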
The result turned out quite ok, given the limited performance of the system, although I'm sure there are a lot of tunes that won't play back correctly due to some of the shortcuts taken in the YM emulation.
Some earlier videos: