A few posts ago, we managed to disassemble the firmware from the CSL Dualcom site.
The entire listing is available here as a zip. There is a lot of blank space in the file which needs to be trimmed down, but for reference this file will be left as-is.
I have also put the code on GitHub. It’s not ideal, as you can’t use the web interface to show the code or diffs, but it is a good way of recording history, as mistakes will be made.
The process of turning disassembly into something useful isn’t easy. I find the most useful thing is to find very commonly called subroutines first, and work out what they do. If they aren’t obvious, skip them.
The raw listing doesn’t show us the frequency with which subroutines are called. Python to the rescue again. We trim out the fluff from the file. 0x1000-0x2000 is the string table, which the disassembler doesn’t know about and tries to turn into code. The processor has a mirrored address structure so everything in the range 0x00000. Everything above 0x1FFFF isn’t the code – it’s special function registers and a mirror area.
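The trimming step can be sketched roughly as follows. This is a minimal sketch and not the script actually used: the names `keep_line` and `trim` are hypothetical, it assumes every listing line starts with a hex address followed by a space, and it treats 0x2000 itself as code, which is a guess.

```python
# Ranges taken from the description above; the exact upper bound of the
# string table (0x2000 inclusive or not) is an assumption.
STRING_TABLE = range(0x1000, 0x2000)   # string table, disassembled as junk
CODE_LIMIT = 0x20000                   # above 0x1FFFF: SFRs and mirror area


def keep_line(line):
    """Return True if a listing line looks like real code worth keeping."""
    fields = line.split()
    if not fields:
        return False                   # blank line
    try:
        address = int(fields[0], 16)
    except ValueError:
        return True                    # not an address line (e.g. a note): keep
    if address in STRING_TABLE:
        return False
    return address < CODE_LIMIT


def trim(lines):
    """Drop blank lines, the string table and everything above the code."""
    return [line for line in lines if keep_line(line)]
```

Running `trim` over the lines of the listing and writing the result back out would give a smaller file to work with.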
Now we run the code through a small script:
from collections import Counter
import operator

callAddress = []
with open('/Users/andrew/data/Disassemble1.txt', 'r') as datafile:
    for row in datafile:
        # Rows with CALL in
        if row.find('CALL') > 0:
            values = row.split('CALL')
            # Get value after call, remove unwanted chars, strip
            # ! are for addressing mode, the trailing H isn't wanted
            address = values[1].replace('!', '').strip().rstrip('H')
            callAddress.append(address)

# Builds a dict of frequencies
freqs = Counter(callAddress)
# Sorts the dictionary into a list of tuples, most frequent first
sortedFreqs = sorted(freqs.items(), key=operator.itemgetter(1), reverse=True)
# Whack it out to CSV for copy and paste
for item in sortedFreqs:
    print(item[0] + ',' + str(item[1]))
And we end up with CSV of the frequency of calls:
0E1B2,182
0E541,160
0E1D1,143
0D764,120
0DC44,105
0DED3,82
0DACC,79
0E322,68
0xE1B2 looks like a good place to start.
0e1ac bfcce0 MOVW !0E0CCH,AX
0e1af c2 POP BC
0e1b0 61ec RETB
// Start of sub
0e1b2 4c01 CMP A,#1H
0e1b4 df05 BNZ $0E1BBH
0e1b6 63 MOV A,B
0e1b7 ec01e100 BR !!0E101H
0e1bb 4c02 CMP A,#2H
0e1bd df05 BNZ $0E1C4H
0e1bf 63 MOV A,B
0e1c0 ec47e100 BR !!0E147H
0e1c4 4c03 CMP A,#3H
0e1c6 63 MOV A,B
0e1c7 61f8 SKNZ
0e1c9 ec6ce100 BR !!0E16CH
0e1cd ecdfe000 BR !!0E0DFH
0e1d1 fdc404 CALL !4C4H
0e1d4 0233bd ADDW AX,!0BD33H
0e1d7 2013 SUBW SP,#13H
0e1d9 72 MOV C,A
The first thing to be aware of is that disassembly is not an exact science. Sometimes you will see an address CALLed but you can’t find it in the listing. This probably means that the disassembly is misaligned in that area – look a couple of addresses above and below. This is not the case here.
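One way to spot possible misalignment across the whole listing is to collect every CALL target and check whether it ever appears as an instruction address. The sketch below is only an illustration: `missing_call_targets` is a hypothetical helper, and it assumes the listing format shown above (hex address first, operands written as `!4C4H` or `!!0E1B2H`).

```python
import re

# Matches the operand of a direct CALL, e.g. "CALL !!0E1B2H" or "CALL !4C4H".
CALL_RE = re.compile(r'\bCALL\s+!+([0-9A-Fa-f]+)H')


def missing_call_targets(lines):
    """Return CALL targets that never appear as an instruction address."""
    addresses = set()
    targets = set()
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            addresses.add(int(fields[0], 16))
        except ValueError:
            continue                   # not an instruction line
        match = CALL_RE.search(line)
        if match:
            targets.add(int(match.group(1), 16))
    return sorted(targets - addresses)
```

Any address this returns is a candidate for a misaligned region, or simply a target that was trimmed out of the file, so it still needs checking by eye.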
We can see immediately above 0xE1B2 there is a POP and RETB, the end of a subroutine.
To work out what a sub does, it helps to know what parameters are passed to it and how. If we look through for all the CALLs to 0xE1B2, we get an idea of what is going on:
03d31 530d MOV B,#0DH
03d33 e1 ONEB A
03d34 fcb2e100 CALL !!0E1B2H
B is always set to a value over quite a wide range. It’s probably a number or an ASCII character.
A is set to either 0, 1, 2 or 3. This is likely some kind of option or enumeration.
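Going through every caller by hand gets tedious, so the survey of A and B can be roughly automated. This is a sketch under assumptions: `caller_arguments` is a hypothetical helper, it only looks a few lines back in straight-line order (so it can be fooled by branches), and it assumes A = 0 is set with `CLRB A`, which does not appear in the excerpt above.

```python
import re
from collections import Counter

# Immediate loads into A or B, e.g. "MOV B,#0DH".
MOV_IMM = re.compile(r'\bMOV\s+([AB]),#([0-9A-Fa-f]+)H')


def caller_arguments(lines, target='!!0E1B2H', lookback=4):
    """Tally the A and B values apparently set just before each CALL."""
    a_values, b_values = Counter(), Counter()
    for i, line in enumerate(lines):
        if 'CALL' not in line or target not in line:
            continue
        seen = {}
        # Walk backwards so the setter nearest the CALL wins.
        for prev in reversed(lines[max(0, i - lookback):i]):
            match = MOV_IMM.search(prev)
            if match:
                seen.setdefault(match.group(1), int(match.group(2), 16))
            elif 'ONEB A' in prev:
                seen.setdefault('A', 1)     # ONEB A sets A to 1
            elif 'CLRB A' in prev:
                seen.setdefault('A', 0)     # assumed way of setting A to 0
        if 'A' in seen:
            a_values[seen['A']] += 1
        if 'B' in seen:
            b_values[seen['B']] += 1
    return a_values, b_values
```

The two counters give a quick view of which values turn up in each register, which is enough to tell an enumeration from free-form data.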
Going back to the subroutine, we can see how this could work:
0e1b2 4c01 CMP A,#1H
0e1b4 df05 BNZ $0E1BBH
0e1b6 63 MOV A,B
0e1b7 ec01e100 BR !!0E101H // If A = 1, branch to 0xE101
0e1bb 4c02 CMP A,#2H
0e1bd df05 BNZ $0E1C4H
0e1bf 63 MOV A,B
0e1c0 ec47e100 BR !!0E147H // If A = 2, branch to 0xE147
0e1c4 4c03 CMP A,#3H
0e1c6 63 MOV A,B
0e1c7 61f8 SKNZ
0e1c9 ec6ce100 BR !!0E16CH // If A = 3, branch to 0xE16C
0e1cd ecdfe000 BR !!0E0DFH // If A = 0, branch to 0xE0DF
So we are branching to other addresses based on the parameter in A.
There’s one thing to note about this function: it has no RET instruction of its own. The return has to be dealt with in the code that is branched to.
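The dispatch can be modelled in a few lines of Python to check the reading of the listing. This is only a model of the control flow, with hypothetical names (`HANDLERS`, `dispatch`). Note that the last branch is taken for any value of A other than 1, 2 or 3, not just 0, and that every path copies B into A before branching.

```python
# Branch targets read off the listing above.
HANDLERS = {1: 0xE101, 2: 0xE147, 3: 0xE16C}
DEFAULT_HANDLER = 0xE0DF   # reached when A is not 1, 2 or 3


def dispatch(a, b):
    """Model of the sub at 0xE1B2.

    Returns (address branched to, value left in A). MOV A,B runs on every
    path before the branch, so the handler always receives B's value in A.
    """
    return HANDLERS.get(a, DEFAULT_HANDLER), b
```

For the caller shown earlier (A = 1, B = 0x0D), `dispatch(1, 0x0D)` gives `(0xE101, 0x0D)`.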
Let’s look at 0xE101.
0e101 77 MOV H,A
0e102 8efa MOV A,PSW
0e104 9803 MOV [SP+3H],A
0e106 67 MOV A,H
0e107 717bfa DI
0e10a c3 PUSH BC
0e10b dbb6e0 MOVW BC,!0E0B6H
0e10e 48b8e4 MOV 0E4B8H[BC],A
0e111 a2b6e0 INCW !0E0B6H
0e114 afb6e0 MOVW AX,!0E0B6H
0e117 440a04 CMPW AX,#40AH
0e11a dc04 BC $0E120H
0e11c f6 CLRW AX
0e11d bfb6e0 MOVW !0E0B6H,AX
0e120 8f0401 MOV A,!SSR02L
0e123 31631e BT A.6H,$0E144H
0e126 362201 MOVW HL,#122H
0e129 71a2 SET1 [HL].2H
0e12b 71b2 SET1 [HL].3H
0e12d dbb4e0 MOVW BC,!0E0B4H
0e130 49b8e4 MOV A,0E4B8H[BC]
0e133 9e44 MOV SIO10,A
0e135 a2b4e0 INCW !0E0B4H
0e138 afb4e0 MOVW AX,!0E0B4H
0e13b 440a04 CMPW AX,#40AH
0e13e dc04 BC $0E144H
0e140 f6 CLRW AX
0e141 bfb4e0 MOVW !0E0B4H,AX
0e144 c2 POP BC
0e145 61ec RETB
It’s pretty long and complex. But there is one really key piece of info in there – the special function register SSR02L. Looking to the 78K0R data sheet, this is “Serial status register 02”. It’s pretty likely this function concerns serial. It has a return at the end as well.
If we look at 0xE16C, this has a reference to SSR12L: another serial port.
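Since named special function registers are such a strong hint, it can help to pull them out of a sub automatically. The sketch below is an assumption-laden illustration: `named_operands` is a hypothetical helper, it relies on the disassembler printing SFRs by name (as it does for SSR02L), and the list of names to ignore is just the general registers plus SP and PSW.

```python
import re

# An upper-case name that does not start with a digit, so hex literals such
# as 0E0B6H or 40AH are not picked up.
NAME_RE = re.compile(r'\b[A-Z][A-Z0-9]*\b')
IGNORED = {'A', 'X', 'B', 'C', 'D', 'E', 'H', 'L',
           'AX', 'BC', 'DE', 'HL', 'SP', 'PSW'}


def named_operands(lines):
    """Return named operands (likely SFRs) in order of first appearance."""
    names = []
    for line in lines:
        code = line.split('//')[0]      # drop any trailing note
        fields = code.split(None, 3)    # address, bytes, mnemonic, operands
        if len(fields) < 4:
            continue                    # no operands, e.g. "DI"
        for name in NAME_RE.findall(fields[3]):
            if name not in IGNORED and name not in names:
                names.append(name)
    return names
```

Run over the lines of the sub at 0xE101 this should pick out SSR02L and SIO10, which can then be looked up in the data sheet.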
It’s quite likely that this function concerns either reading or writing to the various serial ports on the board. I’ve not looked at it in enough depth to know exactly what it is doing, so we’ll do the following:
// B has char
// A has 0,1,2,3 - probably different serial ports
// Return is in the branches
:sub_Serial_UnknownA_e1b3
0e1b2 4c01 CMP A,#1H
0e1b4 df05 BNZ $0E1BBH
    0e1b6 63 MOV A,B
    0e1b7 ec01e100 BR !!0E101H // A = 1
0e1bb 4c02 CMP A,#2H
0e1bd df05 BNZ $0E1C4H
    0e1bf 63 MOV A,B
    0e1c0 ec47e100 BR !!0E147H // A = 2
0e1c4 4c03 CMP A,#3H
0e1c6 63 MOV A,B
0e1c7 61f8 SKNZ
    0e1c9 ec6ce100 BR !!0E16CH // A = 3
0e1cd ecdfe000 BR !!0E0DFH // A = 0
What have I done here?
- Called the sub :sub_Serial_UnknownA_e1b3. The : denotes that this is the actual sub. It is something to do with serial – the first unknown sub to do with serial. I have put the address on the end just to keep track of where it is.
- Search and replace on !!0E1B2H with this new name. “sub_Serial_UnknownA_e1b3” now shows instead of the raw address – when I see it called I know it is something to do with serial.
- Put some brief notes above the sub so I know what it is doing.
- Indented the branches so the function is a little clearer.
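The first two steps are mechanical enough to script. The following is a minimal sketch rather than the process actually used: `rename_sub` is a hypothetical helper, and it assumes addresses appear as five lower-case hex digits at the start of a line and as `!!0E1B2H`-style operands in calls and branches.

```python
def rename_sub(lines, address, name):
    """Label the sub at `address` and replace its operand form with `name`."""
    operand = '!!%05XH' % address      # e.g. 0xE1B2 -> '!!0E1B2H'
    prefix = '%05x ' % address         # e.g. 0xE1B2 -> '0e1b2 '
    out = []
    for line in lines:
        if line.startswith(prefix):
            out.append(':' + name)     # ':' marks the sub itself
        out.append(line.replace(operand, name))
    return out
```

For example, `rename_sub(lines, 0xE1B2, 'sub_Serial_UnknownA_e1b3')` inserts the label line above 0e1b2 and rewrites every `CALL !!0E1B2H`. The notes and indentation still need doing by hand.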
I’m now going to do similar for the other high-frequency subs. Again, I am building up a broad picture, not going into extreme depth at this stage.