If you're too lazy to read this article and just want the complete source code, go ahead and get it here.

If you see me writing SPC700 or SPC I'm talking about the sound processor. If you see me writing .SPC or .SPC file I'm talking about the files or the data they contain (you can read more about .SPC files here). If I'm writing SNES I'm talking either about the console as a whole, or about the 5A22 and its memory.

Having obtained a SNES flash cart (or four of them) I wanted to write a small program for the SNES that would upload the contents of .SPC files to the SPC700 so that I could listen to them on the real thing.
I looked around on the internet and found this page on the SNES Programming wiki which appeared to be what I wanted. However, even though the code sort-of-worked in Snes9x, it failed in the more accurate BSNES. More importantly, it failed miserably on an actual SNES.
After many hours of debugging I managed to track down nearly all of the flaws in the original code and fix them to the point where my loader works with every .SPC file I've tested it with. This article is my attempt to document my findings and the reasoning behind my solutions.

One key issue that the original code didn't deal with was the DSP's echo function. When this function is active, the DSP will write echo buffer data to SPC RAM, which obviously can cause problems if it overwrites data that you need during the loading process.

One could set up the ESA and EDL registers in such a way that the echo buffer is contained within an area where it does no harm. But my approach was to turn the echo function off completely (by setting bit 5 of FLG) before sending anything else to the SPC, and to write the real FLG init value as late as possible in the loading process.
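The code below shows how I turn the echo off. For each DSP register write I place two bytes (the register number and the value) at some arbitrary location in SNES RAM, then I transfer those two bytes to the SPC and tell the SPC to write them starting at address $00f2 in its own memory space. On the SPC side $00f2 is the DSP Address register and $00f3 is the DSP Data register, so each pair ends up as a single DSP register write. I first set the echo delay (EDL, register $7d) to zero, then send the bytes $6c, $20 to set bit 5 of FLG.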

; Make sure echo is off
sep	#A_8BIT
lda	#$7d		; Register $7d (EDL)
sta.l	$7f0100
lda	#$00
sta.l	$7f0101
sendMusicBlockM $7f, $0100, $00f2, $0002
sep	#A_8BIT
lda	#$6c		; Register $6c (FLG)
sta.l	$7f0100
lda	#$20		; Bit 5 on (~ECEN)
sta.l	$7f0101
sendMusicBlockM $7f, $0100, $00f2, $0002

Zeroing EDL ensures that once we re-enable the echo function (assuming the song actually uses echo), the DSP starts writing echo data within the first 4 bytes of the echo buffer, which begins at ESA*$100. This is a precaution against the unlikely event that the DSP overwrites the final part of our init routine with echo buffer data.
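To make the reasoning concrete, here is a small Python sketch that computes which part of SPC RAM the echo buffer covers. It assumes the commonly documented DSP behaviour (buffer starts at ESA*$100, spans EDL*$800 bytes, and still uses 4 bytes when EDL is 0). The function names are my own and are not part of the loader.

```python
def echo_buffer_range(esa: int, edl: int) -> tuple[int, int]:
    """Return (start, length) of the DSP echo buffer in SPC RAM.

    Assumption: start = ESA * $100, length = (EDL & $0f) * $800 bytes,
    and an EDL of 0 still occupies 4 bytes.
    """
    start = (esa & 0xFF) << 8
    length = (edl & 0x0F) * 0x800 or 4
    return start, length


def echo_overlaps(esa: int, edl: int, region_start: int, region_len: int) -> bool:
    """True if the echo buffer touches [region_start, region_start + region_len).

    The buffer is assumed to wrap around at the end of the 64 KB address space.
    """
    start, length = echo_buffer_range(esa, edl)
    region = range(region_start, region_start + region_len)
    return any(((start + i) & 0xFFFF) in region for i in range(length))
```

With EDL zeroed, a song whose ESA is $ff only touches $ff00..$ff03, whereas with EDL set to 1 the same song would cover $ff00 onwards and wrap around into low memory.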

The next step is to transfer the rest of the DSP register init values, except for a few registers which are set later: KOF, KON, EDL and FLG. I do this transfer in reverse order, starting with register $7f and going down to $00. The reason for this is that I used to include FLG in this loop, and I had to make sure that ESA was set before FLG (ESA is $6d, FLG is $6c). Since I'm now setting FLG separately I could probably transfer the DSP registers in ascending order instead, but the order has no significant effect on the code complexity, so I've left it as it is.
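As an illustration of that ordering, the following Python sketch builds the list of (register, value) pairs in the order described: $7f down to $00, skipping the four deferred registers. It is a host-side model only, not the loader's 65816 code. The function name and the 128-byte `dsp_regs` input are assumptions made for the example.

```python
KON, KOF, FLG, EDL = 0x4C, 0x5C, 0x6C, 0x7D
DEFERRED = {KON, KOF, FLG, EDL}  # set later in the loading process


def dsp_init_sequence(dsp_regs: bytes) -> list[tuple[int, int]]:
    """Return (register, value) pairs, highest register first.

    dsp_regs is assumed to be the 128-byte DSP register dump of the song.
    """
    if len(dsp_regs) != 128:
        raise ValueError("expected exactly 128 DSP register bytes")
    return [(reg, dsp_regs[reg])
            for reg in range(0x7F, -1, -1)
            if reg not in DEFERRED]
```

Each pair corresponds to one two-byte block written to $00f2, in the same way as the echo-off writes above.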

A ship comes loaded
Another problem with the original code was that it was copying data to the entire SPC RAM range from $0002 to $ffbf, including the memory-mapped I/O registers at $00f0..$00ff. Writing incorrect values to these registers can break the SPC IPL routine, or even (supposedly) halt SPC operation.
My solution was to first copy the following piece of code to SPC RAM, starting at $0002:

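mov   x,#0
mov   y,#$f8            ; Y = low byte of the destination, starting at $00f8
mov   $f1, #0           ; Clear the control register
mov   $f2, #$5c         ; DSP register $5c (KOF)
mov   $f3, #init_kof    ; The song's KOF init value
-:    cmp   y,$f4       ; Wait until port 0 matches Y
      bne   -
      mov   a,$f5       ; First byte arrives on port 1
      mov   $0000+y,a
      mov   $f4,y       ; Echo Y back on port 0 as acknowledgement
      mov   a,$f6       ; Second byte arrives on port 2
      mov   $0001+y,a
      inc   y
      inc   y
      bne   -           ; Next pair, until Y wraps to 0 (end of page)
      inc   $0017       ; Self-modifying: bump the high address byte
      inc   $001e       ; of both store instructions above
      bne   -           ; Next page, until the high byte wraps past $ff
mov   $f2, #$4c         ; DSP register $4c (KON)
mov   $f3, #init_kon    ; The song's KON init value
jmp   init_routine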

This code takes care of receiving the bytes at $00f8..$ffff. Two bytes are transferred per iteration, compared to the one byte per iteration that the IPL routine handles. This saves nearly half a second in transfer time. The routine also initializes the DSP registers KOF and KON, and finally jumps to the init routine which is described in the next section. The init routine takes care of receiving the bytes at $0000..$00f1, as well as restoring the SPC registers.
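The handshake can be modelled on a PC to check that the loop covers exactly $00f8..$ffff. The Python sketch below is a simplified model, not the actual 65816 sender: `transfer_steps` lists what the SNES side would put on ports 0, 1 and 2 for each iteration, and `simulate_receiver` mimics the SPC700 loop above. Both names are made up for this example, and port timing is ignored.

```python
def transfer_steps(ram: bytes):
    """Yield (port0_counter, port1_byte, port2_byte) for each handshake.

    ram is assumed to be the 64 KB SPC RAM image; the counter is the low
    byte of the destination address, which is what the loop keeps in Y.
    """
    for addr in range(0x00F8, 0x10000, 2):
        yield addr & 0xFF, ram[addr], ram[addr + 1]


def simulate_receiver(steps) -> bytes:
    """Model of the SPC700 loop: return the 64 KB image it would build."""
    out = bytearray(0x10000)
    page, y = 0x00, 0xF8
    for counter, first, second in steps:
        assert counter == y              # cmp y,$f4 / bne -
        out[(page << 8) + y] = first     # mov $0000+y,a
        out[(page << 8) + y + 1] = second  # mov $0001+y,a
        y = (y + 2) & 0xFF               # inc y / inc y
        if y == 0:
            page = (page + 1) & 0xFF     # inc $0017 / inc $001e
    return bytes(out)
```

Running the model over a full RAM image gives 32644 handshakes, one per byte pair, and leaves everything below $00f8 untouched.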

A healthy injection
Once all that data has been sent to the SPC RAM we need to construct a small routine that will initialize the SPC registers, the first two bytes of SPC RAM etc., and then jump to the PC init value stored in the .SPC file. The SPC700 code sequence we want to construct looks like this:

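mov   x,#init_sp
mov   sp,x              ; Restore the stack pointer
mov   x,#init_psw
push  x                 ; Keep PSW on the stack so it can be restored last
mov   x,#0              ; X = destination address and handshake counter
-:    cmp   x,$f4       ; Wait until port 0 matches X
      bne   -
      mov   a,$f5       ; The byte arrives on port 1
      mov   $f4,x       ; Echo X back on port 0 as acknowledgement
      mov   (x)+,a      ; Store the byte at address X, then increment X
      cmp   x,#$f2
      bne   -           ; Repeat until $0000..$00f1 have been received
mov   $f2,#$7d          ; DSP register $7d (EDL)
mov   $f3,#init_edl
mov   a,#init_a         ; Restore A, X and Y
mov   x,#init_x
mov   y,#init_y
mov   $f2,#$6c          ; DSP register $6c (FLG)
mov   $f3,#init_flg     ; The real FLG value, written as late as possible
pop   psw               ; Restore PSW
jmp   init_pc           ; Start executing the song's code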

Where init_sp, init_psw, init_edl, init_a, init_x, init_y, init_flg and init_pc are taken from the following offsets in the .SPC file:
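For reference, here is a hedged Python sketch of reading these values on a PC. It assumes the commonly documented .SPC layout: PC, A, X, Y, PSW and SP in the header at $25..$2b, the 64 KB RAM dump at $100, and the 128 DSP registers at $10100. `SpcInit` and `read_init_values` are names invented for this example, and the loader itself may adjust some of these values before use.

```python
import struct
from typing import NamedTuple


class SpcInit(NamedTuple):
    pc: int
    a: int
    x: int
    y: int
    psw: int
    sp: int
    edl: int
    flg: int


def read_init_values(spc: bytes) -> SpcInit:
    """Extract the init values from a .SPC image (assumed standard layout)."""
    if len(spc) < 0x10180:
        raise ValueError("file too short to be a .SPC dump")
    # $25: PC (16-bit little endian), then A, X, Y, PSW, SP as single bytes.
    pc, a, x, y, psw, sp = struct.unpack_from("<H5B", spc, 0x25)
    dsp = spc[0x10100:0x10180]  # 128 DSP registers
    return SpcInit(pc, a, x, y, psw, sp, edl=dsp[0x7D], flg=dsp[0x6C])
```

A typical call would be `read_init_values(open(path, "rb").read())`, where `path` points at a .SPC file.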

The value $0a that I'm writing to $f0 was taken from Anomie's SPC doc. It seems to work, so that's what I'm using. When writing to $f1 I mask out everything except for the timer enable bits, and then I set bit 7 which allows the area at $ffc0..$ffff to be used as normal RAM from hereon.
As listed above, this init routine assembles to 43 bytes, and our next problem is to find a suitable location in SPC RAM where we can inject this routine without destroying any meaningful data. After studying several .SPC files I came to the conclusion that $ff70 was a good base address for the init code. Most of the files I checked contained strings of the same value (FFFFFF.. or 000000..) in that area, and even for those that didn't I haven't seen (or heard) any adverse effects from placing the init code there.
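The kind of inspection described above can be automated. The Python sketch below scans a RAM dump for runs of identical bytes that are long enough to hold a routine of a given size. It is only an aid for studying .SPC files on a PC, not something the loader does, and `find_uniform_runs` is a name made up for this example.

```python
def find_uniform_runs(ram: bytes, min_len: int) -> list[tuple[int, int, int]]:
    """Return (start, length, value) for each run of identical bytes
    that is at least min_len bytes long."""
    runs = []
    start = 0
    for i in range(1, len(ram) + 1):
        # A run ends at the end of the data or where the byte value changes.
        if i == len(ram) or ram[i] != ram[start]:
            if i - start >= min_len:
                runs.append((start, i - start, ram[start]))
            start = i
    return runs
```

Passing the 64 KB RAM dump and the size of the init routine yields candidate locations. A run of identical bytes is only a hint that the area is unused, so the result still needs to be checked by ear on the real hardware.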
The actual procedure goes like this:

The song should now start playing in just a few moments. Happy listening.

Mic, 2010