Reverse engineering a CSL Dualcom GPRS part 14 – interpreting disassembly

A few posts ago, we managed to disassemble the firmware from the CSL Dualcom site.

The entire listing is available here as a zip. There is a lot of blank space in the file which needs to be trimmed down, but for reference this file will be left as-is.

I have also put the code on GitHub. It’s not ideal as you can’t use the web interface to show the code/diffs, but it is a good way of recording history as mistakes will be made.

The process of turning disassembly into something useful isn’t easy. I find the most useful thing is to find very commonly called subroutines first, and work out what they do. If they aren’t obvious, skip them.

The raw listing doesn’t show us the frequency with which subroutines are called. Python to the rescue again. We trim out the fluff from the file. 0x1000-0x2000 is the string table, which the disassembler doesn’t know about and tries to turn into code. The processor has a mirrored address structure so everything in the range 0x00000. Everything above 0x1FFFF isn’t the code – it’s special function registers and a mirror area.

Now we run the code through a small script:
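The script isn’t reproduced here, but a minimal sketch of the idea could look like this. It assumes CALL operands are written in the `!!0E1B2H` form that appears later in this post; the function names, the regular expression and the sample lines are invented for illustration and are not the original script.

```python
import csv
import io
import re
from collections import Counter

# Assumed operand format: CALL !!0E1B2H (absolute address in hex, trailing H).
CALL_RE = re.compile(r"\bCALL\s+!!([0-9A-F]+)H\b", re.IGNORECASE)


def count_calls(lines):
    """Count how many times each absolute address is CALLed."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


def to_csv(counts):
    """Render the counts as CSV text, most frequently called first."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["address", "calls"])
    for address, calls in counts.most_common():
        writer.writerow([f"0x{address:04X}", calls])
    return out.getvalue()


# Invented sample lines, purely to exercise the functions.
sample = [
    "MOV A, #01H",
    "CALL !!0E1B2H",
    "CALL !!0E1B2H",
    "CALL !!0A000H",
]
print(to_csv(count_calls(sample)))
```

On the real listing you would pass the open file object to `count_calls` instead of the sample list.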

And we end up with a CSV of the frequency of calls:

0xE1B2 looks like a good place to start.

The first thing to be aware of is that disassembly is not an exact science. Sometimes you will see an address CALLed but you can’t find it. This probably means that the disassembly is misaligned in that area – look a couple of addresses above and below. This is not the case here.

We can see immediately above 0xE1B2 there is a POP and RETB, the end of a subroutine.

To work out what a sub does, it helps to know what parameters are passed to it and how. If we look through for all the CALLs to 0xE1B2, we get an idea of what is going on:

B is always set to a value over quite a wide range. It’s probably a number or an ASCII character.

A is set to either 0, 1, 2 or 3. This is likely some kind of option or enumeration.

Going back to the subroutine, we can see how this could work:

So we are branching to other addresses based on the parameter in A.

There’s one thing to note about this function. There is no immediate RET instruction there. These have to be dealt with in the code that is branched to.

Let’s look at 0xE101.

It’s pretty long and complex. But there is one really key piece of info in there – the special function register SSR02L. Looking at the 78K0R data sheet, this is “Serial status register 02”. It’s pretty likely this function concerns serial. It has a return at the end as well.

If we look at 0xE16C, this has a reference to SSR12L. Another serial port.

It’s quite likely that this function concerns either reading or writing to the various serial ports on the board. I’ve not looked at it in enough depth to know exactly what it is doing, so we’ll do the following:

What have I done here?

  • Called the sub :sub_Serial_UnknownA_e1b3. The : denotes that this is the actual sub. It is something to do with serial – the first unknown sub to do with serial. I have put the address on the end just to keep track of where it is.
  • Search and replace on !!0E1B2H with this new name. “sub_Serial_UnknownA_e1b3” now shows instead of the raw address – when I see it called I know it is something to do with serial.
  • Put some brief notes above the sub so I know what it is doing.
  • Indented branches so the function is a little clearer.
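The search and replace step is easy to script once there are more than a handful of subs to name. A minimal sketch, assuming references always appear in the `!!0E1B2H` form; `rename_sub` and the two-line listing are hypothetical, made up for illustration.

```python
import re


def rename_sub(listing, address, name):
    """Replace every !!<address>H reference in the listing with a readable name."""
    # Allow any number of leading zeros and either letter case in the operand.
    pattern = re.compile(rf"!!0*{address:X}H\b", re.IGNORECASE)
    # A function replacement avoids backslash handling in the new name.
    return pattern.sub(lambda _match: name, listing)


# Invented two-line listing, purely for illustration.
listing = "CALL !!0E1B2H\nBR !!0E101H\n"
print(rename_sub(listing, 0xE1B2, "sub_Serial_UnknownA_e1b3"))
```

Keeping the address on the end of the name, as above, means the rename can always be traced back to the raw listing.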

I’m now going to do similar for the other high-frequency subs. Again, I am building up a broad picture, not going into extreme depth at this stage.

Reverse engineering a CSL Dualcom GPRS part 13 – checking the SIM card

The ICCID is written on the outside of the Dualcom GPRS, stored in the EEPROM, read in from the GPRS modem, and read in from EEPROM immediately before a long, random looking, string is sent to a remote server. It seems quite important.
ICCID on case

The Dualcom board also frequently checks for received SMS.

It might be worth taking a look at the SIM to see what is on it.

From previous projects, I have an Omnikey Cardman 5321 card reader. This reads both RFID cards and smart cards. We can put the SIM card in a carrier and read it with this device.
SIM from Dualcom

SimSpy II is a free utility which can read most data from SIM cards including ICCID, IMSI, Kc (which can be used to decrypt communications), SMS messages and more.

Unfortunately, nothing too interesting comes up. The card never seems to have stored any SMS. There are no numbers in the phone book. We might end up coming back to this at some point.

SIM data

Reverse engineering a CSL Dualcom GPRS part 12 – board buzz out

We’ve now got the code disassembled. The disassembler has no concept of what is connected to the microcontroller though, so we need to work out which ports/pins/peripherals are used by which parts of the board. What is P11.1? What about P7? These are all I/O, but meaningless without looking at the physical board.

The best way of doing this is using a continuity tester and buzzing the board out. It’s not worth exhaustively mapping out the PCB at this stage – just the interesting bits. There might even be some mistakes.

When doing this, I find two tools are essential:

  • A meter with a quick continuity beep. Some have a lag. I’ve not got time for that. I use my Amprobe AM-140-A for this – it’s very quick, if a bit scratchy sounding.
  • Fine probes. I really like the Pomona 6275 probes – they are very sharp and very small.

Pop one end on the peripheral and just brush the other along the sides of the microcontroller. Not too hard or you risk dragging metal between the pins. It makes it very quick to find where things are going.

Watch out for transistors and resistors in the way though e.g. the inputs from the alarm are likely transistor buffered, and some of the peripherals might have resistors to divide voltage.

IC8

IC8 is the socketed 93C86 EEPROM.

DI -> P111
DO -> P20
CLK -> P142
CS -> P141

P111 means port 11, bit 1.
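Since the labels in these lists are written a few different ways (P111, P14.5, P05/TI05/TO05), a small helper makes the convention explicit. This is a sketch under the assumption that the last digit is always the bit number and that any alternate function names follow a slash; `parse_pin` is a made-up name.

```python
import re


def parse_pin(label):
    """Split a label such as 'P111', 'P14.5' or 'P05/TI05/TO05' into (port, bit)."""
    # Only the first name matters; the rest are alternate pin functions.
    primary = label.split("/")[0].strip()
    # One or two port digits, an optional dot, then a single bit digit.
    match = re.fullmatch(r"P(\d{1,2})\.?(\d)", primary)
    if match is None:
        raise ValueError(f"not a port pin label: {label!r}")
    return int(match.group(1)), int(match.group(2))


for label in ("P111", "P20", "P14.5", "P05/TI05/TO05"):
    print(label, parse_pin(label))
```

A bare port name such as P7 has no bit digit, so it is rejected rather than guessed at.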

IC11

IC11 is the SMT 93C86 EEPROM.

DI, DO, CLK are shared with IC8

DI -> P111
DO -> P20
CLK -> P142
CS -> P145

GPRS Modem

Pin 14 – device control on/off -> P05/TI05/TO05
Pin 21 – GPIO -> P47
Pin 32 – DSR1 -> P26/ANI6
Pin 33 – LED control signal -> SVC LED (not to micro)
Pin 37 – DTR1 -> P04/SCK10/SCL10
Pin 40 – CTS1 -> P21/ANI1
Pin 41 – DTM1 -> P02/SO10/TxD1
Pin 42 – DFM1 -> P03/SI10/RxD1/SDA10

IC02

This is the PSTN modem (Si2401)

Pin 7 – CTS_ -> P10/SCK00
Pin 6 – TXD -> 52 P12/SO00/TxD0
Pin 5 – RXD -> 53 P11/SI00/RxD0

Buttons

Button A -> P22
Button B -> P23

7 Segment

Segments ->  P60/P61/P62/P63/P64/P65/P66/P67

RH common cathode -> P53

LH common cathode -> P52

LEDs

GSM -> P51/INTP2
PSTN -> P50/INTP1

Programming header

1 -> VCC
2 -> VSS
3 -> P40/TOOL0
4 -> P41/TOOL1
5 -> RESET
6 -> FLMD0
7 -> Switched to ground via reset

(this looks like it would work with a standard Renesas debug tool – the MiniCube2).

I’m not bothered about the other parts at the moment. We can come back to them if we need to.

Next step is to identify a few basic functions inside the disassembled code, probably starting with EEPROM reading.

Reverse engineering a CSL Dualcom GPRS part 11 – disassembling firmware

I find reverse engineering is about building up a broad picture instead of working in-depth on any one aspect of the system. Dip into one bit, check what you are seeing is reliable and makes sense, dip into another area to get more detail, repeat.

We’ve done this with the logic trace – seen the long string sent, looked at the EEPROM access, checked what this is in the .prm file, and seen that it is the ICCID. Now I would like to look at the firmware on the device.

CSL Dualcom have v353 hex file available for download on their site. The board I am looking at uses v202. That’s a big difference.
I can see a few paths here:

  1. Get a v202 (or anything nearer to v202 than v353) hex file downloaded. A quick Google doesn’t help here.
  2. Upgrade the board I have to v353. This would require the EEPROM to be updated, possibly other changes. I don’t want to break this board.
  3. Recover the v202 firmware from the board I have. There is a programming header – it could be possible to get the code off. But it may not be possible.
  4. Live with the difference and hope that there is enough consistency between the two to be helpful.

I’m going to run with 4. It’s the lowest effort, and I think it will work. The EEPROM structure between some of the different boards seems identical – this is backed up by there only being a single Windows programming tool for the board, regardless of firmware version. In my experience, smaller embedded system firmware is quite consistent as new functionality is added, even if the toolchain changes (this really doesn’t hold true when you move up to anything bigger running Linux).

What can we do with the firmware? Well, we need to disassemble it. What does that actually mean? It means changing the raw machine code back into human-readable assembly language i.e. F7 becomes CLRW BC (clear register BC). This probably doesn’t meet everyone’s idea of human readable, but it is a lot better than machine code.

Some microcontrollers (like the ATmega series) have easy to understand and even read machine code. 90% of instructions are a fixed length (16bits). The number of instructions is limited and there are only limited addressing modes. With some practice you can make sense of machine code in a text editor.

The 78K0R is not like this.

MOVs

All of the enclosed red cells are MOV instructions. That’s a lot of them. There are 4 of these maps, with a total of 1024 cells. ~950 of them are populated.

There are several addressing modes and the instruction length varies from 8bits to 32bits. This makes the machine code incredibly hard to read.

We need an automated disassembler. Google isn’t much help here – these microcontrollers aren’t as popular as x86, ARM, AVR, or PIC.

There are two toolchains widely available for these processors: Renesas Cubesuite and IAR Embedded Workbench. There is a chance that one of these has either a disassembler or a simulator that allows a hex file to be loaded.

After a lot of messing around, it appears that Renesas Cubesuite can load the hex, disassemble it, and also simulate it.

1. Download Renesas Cubesuite and install it (Windows only)

2. Start Cubesuite.

Renesas Cubesuite

3. Go to Project -> Create new project.

4. Change the microcontroller to the “uPD78F1154_80” (the 80 pin variant)

Microcontroller

6. Once the project has been created, in the “Project Tree” on the right hand side, right click on “Download files” and click “Add”

Download files

7. Find your hex or bin file and load it (hex is preferable as it seems the disassembler takes into account the missing address space).

8. Go to Debug -> Build and Download

9. The simulator starts up and you can see the disassembled code.

Great – now we can get to work trying to see what the processor is doing.

The next post will likely be buzzing the board out to find out which I/O is connected to what so we can make some sense of the code.

Reverse engineering a CSL Dualcom GPRS part 10 – analysing the logic trace 2

Last post, we looked at the comms between the board and the GPRS modem. There was a long, interesting, string sent to a remote server:

When we look out to the rest of the logic trace, we can see that the EEPROM is accessed exactly as this begins:

EEPROM access

From this view, it might look like the EEPROM access is too late for it to be used to generate that long string. However, in microcontroller terms, there is ~0.3ms between the end of the EEPROM access and the start of the first character on serial (I suspect the ‘1’ is a start character). I’ve not checked the crystal speed, but it’s between 2MHz and 20MHz. Even at 2MHz, that’s 600 instructions – plenty of time to act on the EEPROM data.
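The arithmetic behind that estimate, as a quick sketch. It treats one clock cycle as one instruction, which is the simplification the 600 figure implies; real instructions can take more than one cycle, so these are upper bounds. The function name is made up.

```python
def cycles_available(gap_us, clock_hz):
    """Whole clock cycles that fit into a gap given in microseconds."""
    return gap_us * clock_hz // 1_000_000


GAP_US = 300  # the ~0.3ms gap read off the trace

# The two ends of the plausible crystal range mentioned above.
for clock_hz in (2_000_000, 20_000_000):
    cycles = cycles_available(GAP_US, clock_hz)
    print(f"{clock_hz // 1_000_000}MHz: {cycles} cycles")
```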

We now need to zoom in on the EEPROM data and see what is happening:

The trace is maybe a little confusing here. The green bordered binary is the DI line – the microcontroller to the memory. The yellow bordered binary is the DO line – the memory to the microcontroller. The Saleae Logic software has no decoder to deal with Microwire, so we need to use the SPI decoder set with a bit-width to deal with both the receive and transmit sides.

According to the data sheet for the EEPROM, this should be 29 clock cycles. For whatever reason there is an additional clock cycle here though – the very short transition in the middle of the trace. So we set the SPI decoder to 30bits.

The first 3bits of the green bordered binary are 0b110 – a read command. The next 10bits are the address – 0b00 1001 1000 – i.e. 0x098. After this point, we can ignore the green bordered binary.

The last 16bits of the yellow bordered binary are the value from memory – 0x4489.
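The same decoding can be done in a few lines of Python once the two 30bit strings have been copied out of the decoder. This is a sketch based only on the layout described above (3 command bits, 10 address bits on DI, the value in the last 16 bits of DO); `decode_read` is a made-up name and the padding bits in the example are invented.

```python
def decode_read(di_bits, do_bits):
    """Decode one 30bit capture of an EEPROM read into (address, value)."""
    di = di_bits.replace(" ", "")
    do = do_bits.replace(" ", "")
    if len(di) != 30 or len(do) != 30:
        raise ValueError("expected 30 bits on each line")
    if di[:3] != "110":
        raise ValueError("not a read command")
    address = int(di[3:13], 2)  # 10 address bits follow the command
    value = int(do[-16:], 2)    # the word read back is the last 16 bits
    return address, value


# DI: read command + address 0x098, then don't-care padding (invented).
di = "110" + "00 1001 1000" + "0" * 17
# DO: padding (invented) followed by the 16 data bits.
do = "0" * 14 + format(0x4489, "016b")

address, value = decode_read(di, do)
print(hex(address), hex(value))
```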

We should be able to find this in the .prm file. 0x098 is 152. The prm file has a byte per row, and 2 bytes on the first row, so we should need to go to row 304 of the file – and there we go – 89, 44. Perfect.
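The row arithmetic can be sketched like this. It assumes my reading of the layout is right: row 1 holds the first two bytes, every later row holds one byte, and each 16bit word is stored low byte first (which is what seeing 89 then 44 for 0x4489 suggests). The function names are made up.

```python
def row_for_byte(offset):
    """1-indexed row of the .prm file holding the byte at this offset."""
    # Row 1 holds bytes 0 and 1; after that there is one byte per row.
    return 1 if offset < 2 else offset


def rows_for_word(word_address):
    """Rows holding the (low, high) bytes of a 16bit EEPROM word."""
    low_offset = word_address * 2
    return row_for_byte(low_offset), row_for_byte(low_offset + 1)


def split_word(value):
    """Split a 16bit value into (low, high) bytes."""
    return value & 0xFF, value >> 8


print(rows_for_word(0x098))
print([f"{b:02X}" for b in split_word(0x4489)])
```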

If we continue going through the trace, we read the following addresses:

Strangely each of the addresses is read twice. Why? Not sure at this time.

What data do we get back?

This is the ICCID – the unique ID assigned to the SIM card in the GPRS modem.

Could the board be using the ICCID as a key to encrypt the data?

Reverse engineering a CSL Dualcom GPRS part 9 – analysing the logic trace

We’ve captured a trace of:

  • The serial comms to the GPRS modem
  • The serial comms to the PSTN modem
  • The SPI comms to the EEPROM

Now we can take a look at the data in these traces. Let’s start with the communications with the GPRS modem.

The serial comms to the GPRS modem are normal ASCII characters, and it uses the AT command set. The GPRS modem is a Wavecom board – we can download the AT command set documentation.

Stepping through the logic trace and transcribing the commands, we end up with something like this:

TX | RX | Command | Notes
0202 Dualcom GPRS> | | Restart |
AT | AT OK | |
AT&F | AT&F OK | Set to Factory Defined Configuration Profile |
ATE0 | ATE0 OK | Command echo off | DCE does not echo characters during command state and online command state
ATX0 | OK | Call Progress Monitoring Control | Busy detection off
AT&D2 | OK | Circuit 108 (DTR) Response | When in on-line data mode, deassert DTR closes the current connection and switch to on-line command mode.
AT+CMEE=1 | OK | Mobile Equipment Error | Enable +CME ERROR: <err> result code and use numeric <err> values.

The whole CSV file of this trace can be downloaded here.

What can we see going on?

  1. The board is reset and some basic settings are sent (don’t echo commands, use DTR to close connection, turn on numeric errors).
  2. Set up SMS messaging to store messages on the ME (the GPRS modem, as compared to the SIM). There seems to be room for 100 messages in the modem.
  3. Set up three PDP contexts. I think these are essentially GPRS connections. The first two are generic and have no username/password – they might be Vodafone APNs. The third is csldual.com – likely a private APN. An APN is a gateway between a GPRS connection and an IP network.
  4. Set up three Internet accounts. These are credentials used with the PDP contexts. The generic ones have no username or password, but the csldual.com one does – dualcomgprsxx and QO6806xx.
  5. The board periodically checks for network registration and signal strength. The signal strength is shown on the 7-seg display when idling. The GPRS modem is connected to the home network with decent signal strength.
  6. The board then repeatedly scans the first 15 SMS slots for messages. There are no messages, so we get errors back. This is quite interesting – what is it that gets sent to the board as SMS?
  7. The board then tries to connect to a private IP address/port 172.16.6.20:8965 using the csldual.com APN. The first time this is attempted it fails with error code 094, which isn’t listed in the documentation (or on the wider Internet…)
  8. The board then tries to connect to the same IP again. This time it succeeds, and some data is sent back and forth. This is a string of ASCII text which looks, from a human perspective, fairly random.

The data looks as follows (sent on left, received on right):

DC4 HS87
r (immediately after response above)
LjS1WQjg8FHqR1a4P4DVsjO8eUITXY6ifHPlaFhkZ2SJ EE1404,0122,3343,’6’
‘3’ OK

What things are of note in this trace then?

  • The APN and the username and password used are constant across several devices and the Sample.prm I have looked at. It seems curious to require a password but for it not to vary.
  • SMS messages are checked for frequently, suggesting something important is received by SMS.
  • There is no notion of time/counters/nonce in any of the communications.
  • There doesn’t seem to be any key exchange.
  • There doesn’t seem to be any authentication of the APN/server with the GPRS Dualcom board.

This has raised a number of questions:

  • What data is used to authenticate a given APN? If the username and password are constant, is the ICCID and other data used?
  • Can anyone send SMS to the GPRS modem, or is there some form of blocking performed by the network in the other direction?
  • Whilst the notion of time/counters/nonce isn’t essential for strong/good encryption, it does make things easier.
  • A common failing of embedded systems that do use encryption is that they don’t change the key. Encryption with a fixed, known key is not really much better than no encryption.
  • It’s been possible to spoof a cell site for a few years now using Software Defined Radio. If the APN/server can be spoofed, then the signalling might stop working.

I’m not sure what the next step is:

  • Gather more traces and see if any patterns can be spotted in the data going between the board and server.
  • Look at EEPROM accesses whilst the GPRS modem is being used.
  • Disassemble the firmware and see if we can spot anything interesting.

I’ll see what takes my fancy next time I sit down and look at it.

Reverse engineering a CSL Dualcom GPRS part 8 – logic analyser

Last time we powered up a board to see what it did just by observing the normal IO with our eyes.

This time we are going to look at what happens in more detail with this particular board using a logic analyser.

First things first, we’ll take the EEPROM out, pop it into our Bus Pirate EEPROM reader, pass the data through our converter, and then open the resulting .prm file in the CS2364 Windows utility.

This indicates that this board only has GPRS and PSTN paths enabled – no LAN.

Comms paths

There isn’t much else of note.

Running the .prm file through the strings utility we wrote provides very similar output to before – the same IP addresses and possibly the same password.

We now need to work out exactly what we want to connect to the logic analyser. The Dualcom has convenient test points grouped in threes and labelled GSM, PSTN, LAN and 485. It’s highly likely that these are serial connections – GND, TX, RX. A quick check of data sheets and use of the continuity tester confirms this.

Let’s solder some pin headers onto GSM serial, PSTN serial and also the socketed EEPROM. Pin headers make connecting and reconnecting the logic analyser very quick and easy compared to using test hooks.

We already know what is on the EEPROM, but we don’t know when and how the data is accessed – using a logic analyser will allow us to see this. This could be compared to static analysis (reading out the EEPROM entirely) and dynamic analysis (seeing how the EEPROM is accessed).

Often test points on hardware end up full of solder due to the manufacturing process. It’s awkward to remove this solder, so I just tend to tack pin headers on at a slight angle. To hold these, I use White Tack – a bit like Blu-Tack but it holds up at soldering temperatures. Much easier than using helping hands or pliers.

Yep – the joints look dry. Lead-free solder + leaded solder seems to result in joints looking like this.

Once this is done, we connect up the logic analyser – the Saleae Logic. This is a USB logic analyser, and probably my most used reverse engineering tool. It is only 8-channel, but this is frequently more than adequate.

Saleae Logic

The connections end up as follows:

  • 1 (Black) GPRS RX
  • 2 (Brown) GPRS TX
  • 3 (Red) PSTN RX
  • 4 (Orange) PSTN TX
  • 5 (Yellow) DO EEPROM
  • 6 (Green) DI EEPROM
  • 7 (Blue) CLK EEPROM
  • 8 (Purple) CS EEPROM
  • GND (Grey) GND

I don’t have enough channels to monitor CS for the soldered on EEPROM. We’ll have to look at that another day.

Yes – GND is grey and channel 1 is black on the Saleae Logic. This has caught more than a few people out!

After a few trial runs, I find out that I can use the following settings for analysing the data:

  • GPRS RX/TX – 9600baud serial
  • PSTN RX/TX – 2400baud serial
  • EEPROM – SPI, CS active high, 30bits transferred

So now we have a good logic trace. At an overview level, you can see that everything is accessed at one point or another.

Logic trace

If we zoom in we can see EEPROM data transfers (this is a read – 0b110 is the command):

EEPROM

And a GPRS modem response:

GPRS

And the PSTN modem as well:

PSTN

You can download the settings/data for this trace here. This can be opened in the freely downloadable Saleae Logic program.

The next step is to decode some of this data further and see what is going on.

 

Reverse engineering a CSL Dualcom GPRS part 7 – board startup

So far, we’ve had a quick look at the hardware, the HEX file firmware, the utility used to program the NVM, and the contents of the NVM. It’s all building up a picture of what the board does and how it does it.

Next I want to power up one of the boards and look at it in operation – what does it actually do when we power it up?

The board just needs 9-30V applied to power up. The GPRS module needs an antenna – there is a chance it could be harmed without one. I ordered a cheap GPRS antenna with an MMCX connector on it from eBay for under £5.

Here is one of the boards starting up:

The power-up sequence seems to vary from one board to the next, but for this video, it goes:

  • Flashes 88 along with all LEDs (probably a test)
  • Flashes firmware version number (2.02)
  • Flashes grade (G2)
  • Shows “ro” (reset radio module)
  • Shows c1/2/3 (lower case c is radio call to ARC, 1 = dialling, 2=handshake, 3=sending data). The two GPRS status LEDs flash.
  • Shows A (Comms successful)
  • Shows E 21 – error 21 – “PSTN DC line voltage = low or none” – makes sense as I have no phone line connected

Some boards get to c1 and then fail with one of the lower numbered error codes related to the GPRS comms – probably because the SIM has been de-activated.

The next step will be getting the logic analyser onto some of the signals on the board to see exactly what it is doing.

 

 

Reverse engineering a CSL Dualcom GPRS part 6 – interpreting EEPROM

In the last post we read out the contents of an EEPROM for one of the Dualcom GPRS boards. This is in the native Bus Pirate format:

and needs translating into a prm file for the Windows utility to read it.

Python comes to the rescue again:

We now have BP.prm. Let’s try opening that in the Windows utility:
Real EEPROM data

Excellent! It works fine. A very old version of the firmware – 1.25!

Then if we whack this through the Python utility that converts it into strings, we get very similar output to before:

User email enumeration vulnerability on CSL Dualcom’s password recovery site

CSL Dualcom allow users to reset their password on http://passwordrecovery.csldual.com/  (yes, no HTTPS, again).

The password reset functionality allows an attacker to enumerate valid usernames. Genuine usernames have a different response to invalid usernames.

The forgotten username functionality also allows an attacker to check for valid email addresses.

Leaking valid usernames and email addresses like this is an incredibly bad idea. An attacker can send crafted emails directing users to reset their passwords on a server under his control, for example.