Questions for CSL Dualcom

When CSL made their statement last Friday, it was noticeable that they didn’t actually claim that any of my report was false. To me, that implies that the content of the report is true.

CSL should be answering questions right now, but are maintaining silence.

If I were a big customer of CSL, I would be asking:

  1. What encryption methods do your new devices, the Gradeshift and DigiAir, use?
  2. How often are the keys changed on these devices?
  3. If there was a serious security issue requiring the firmware to be updated, who pays for it?
  4. Do these devices have SMS controls? If so, what is the PIN and how do I change it?
  5. Are any of the devices in my estate using the encryption mentioned in the report?

I suspect answers won’t be forthcoming.

 

 

CSL Dualcom CS2300-R signalling unit vulnerabilities

Today, CERT/CC will be disclosing a series of vulnerabilities I have discovered in one particular alarm signalling product made by CSL Dualcom – the CS2300-R. These are:

  • CWE-287: Improper Authentication – CVE-2015-7285
  • CWE-327: Use of a Broken or Risky Cryptographic Algorithm – CVE-2015-7286
  • CWE-255: Credentials Management – CVE-2015-7287
  • CWE-912: Hidden Functionality – CVE-2015-7288

The purpose of this blog post is to act as an intermediate step between the CERT disclosure and my detailed report. This is for people who are interested in some of the detail but don’t want to read a 27-page document.

First, some context.

What are these CSL Dualcom CS2300-R devices? Very simply, they are a small box that sits between an intruder alarm and a monitoring centre, providing a communications link. When an alarm goes off, they send a signal to the monitoring centre for action to be taken. They can send this over a mobile network, normal phone lines, or the Internet.

They protect homes, shops, offices, banks, jewellers, data centres and more. If they don’t work, alarms may not reach the monitoring centre. If their security is poor, thousands of spoofed alarms could be generated. To me, it is clear that the security of these devices must be to a reasonable standard.

I am firmly of the opinion that the security of the CS2300-R devices is very poor. I would not recommend that new CSL Dualcom signalling devices are installed (regardless of model), and I would advise seeking an alternative provider if any were found on a pen-test. This is irrespective of the risk profile of the home or business.

If you do use any Dualcom signalling devices, I would be asking CSL to provide evidence that their newer units are secure. This would be a pen-test carried out by an independent third-party, not a test house or CSL.

What are the issues?

The report is long and has a number of issues that are only peripheral to the real problems.

I will be clear and honest at this point – the devices I have tested are labelled CS2300-R. It is not clear to me or others whether these are the same as CS2300 Gradeshift or any other CS2300 units CSL have sold. It is also not clear which firmware versions are available, or what the differences between them are.

The devices were tested in the first half of 2014.

CSL have not specifically commented on any of the vulnerabilities. On 20 November they finally made a statement to CERT.

Here is a summary of what I think is wrong.

1. The encryption is fundamentally flawed and badly implemented

The encryption cipher used by the CSL devices is one step above the simple Caesar Cipher. The Caesar Cipher is known and used by many children to encrypt messages – each character is shifted up or down by a known amount – the “key”. For example, with the key of “4”, we have:

 Plain text: THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG
        Key: 444 44444 44444 444 444444 4444 444 4444 444
Cipher text: XLI UYMGO FVSAR JSB NYQTIH SZIV XLI PEDC HSK
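
A minimal sketch of that fixed shift in Python 3, for anyone who wants to reproduce the example. The function name `caesar_shift` is illustrative and has nothing to do with CSL's firmware.

```python
def caesar_shift(text, shift):
    """Shift each letter A-Z forward by `shift`, wrapping Z back round to A."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # spaces and anything else pass through untouched
    return "".join(out)

plain = "THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG"
cipher = caesar_shift(plain, 4)
print(cipher)                    # XLI UYMGO FVSAR JSB NYQTIH SZIV XLI PEDC HSK
print(caesar_shift(cipher, -4))  # shifting back by the same key recovers the plain text
```

Anyone who knows or guesses the single number 4 can read every message.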

CSL’s encryption scheme goes one step further, and uses a different shift as you move along the message. It’s a hybrid between a shift cipher and a polyalphabetic substitution cipher.

The mechanism works like this:

 Plain text: THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG 
        Key: 439 83746 97486 128 217218 9217 914 9127 197
Cipher text: XKN YXPGQ KYSET GQF LVTRFL XXFY CII UBBF EXN

The first character is shifted up by 4, the next by 3, then 9, 8, 3 etc. This is simplified, but not by much.
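
The simplified per-position shift can be sketched the same way in Python 3. This is only the toy version from the example above (26-letter alphabet, one key digit per letter), and `running_shift` is an illustrative name.

```python
def running_shift(text, key_digits):
    """Shift the n-th letter by the n-th key digit; spaces do not consume key."""
    digits = [int(d) for d in key_digits if d.isdigit()]
    out, i = [], 0
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + digits[i]) % 26 + ord("A")))
            i += 1
        else:
            out.append(ch)
    return "".join(out)

plain = "THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG"
key = "439 83746 97486 128 217218 9217 914 9127 197"
cipher = running_shift(plain, key)
print(cipher)  # XKN YXPGQ KYSET GQF LVTRFL XXFY CII UBBF EXN
```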

Here is the real algorithm, as a simple Python script:

iccid = "89441000300637117619"
chipNumber = "510021"
status = "15665555555555567"
 
# This is the key stored in flash at x19da
keyString = "0f15241e0919030d2a050e2329132c1014171b2726020c072201212d1a1c120a281f0b1d04250f1816112e2b2006082f41542a49503d"
 
# Change the hex string into a list of integers (bytearray.fromhex works on Python 2.7 and 3)
key = list(bytearray.fromhex(keyString))

# encrypts a string with a startingVariable
def encrypt(stringToEncrypt, startingVariable):
    if startingVariable < 52:
        startingVariable -= 1
    else:
        startingVariable -= 51
 
    encryptedString = ""
 
    for y in stringToEncrypt:
 
        y = ord(y)
 
        # Input character constraint
        if y < 0x41:
            y -= 0x30
        else:
            y -= 0x37
 
        # Add value from key to character
        y += key[startingVariable]
 
        # Output constraints
        if y < 0x25:
            if y < 0x1B:
                y += 0x40
            else:
                y += 0x15
        else:
            y += 0x3C
 
        startingVariable += 1
 
        # Keep startingVariable within bounds - oddly smaller bounds than initial check
        if startingVariable > 47:
            startingVariable = 0
 
        encryptedString += chr(y)
 
    return encryptedString
 
 
def decrypt(stringToDecrypt, startingVariable):
    if startingVariable < 52:
        startingVariable -= 1
    else:
        startingVariable -= 51
 
    decryptedString = ""
 
    for y in stringToDecrypt:
 
        y = ord(y)
 
        if y < 0x61:
            if y < 0x41:
                y -= 0x15
            else:
                y -= 0x40
        else:
            y -= 0x3C
 
        y -= key[startingVariable]
 
        if y < 0x0a:
            y += 0x30
        else:
            y += 0x37
 
        startingVariable += 1
 
        if startingVariable > 47:
            startingVariable = 0
 
        decryptedString += chr(y)
 
    return decryptedString
 
stringStatus = iccid + "A" + chipNumber + status
startingVariable = 52

encryptedString = encrypt(stringStatus, startingVariable)
 
print("   Status: %s" % stringStatus)
print("Encrypted: %s" % encryptedString)

It would be fair to say that this encryption scheme is very similar to a Vigenère cipher, first documented in 1553 and used fairly widely until the early 1900s. However, today, even under perfect use conditions, the Vigenère cipher is considered completely broken. It is only used for teaching cryptanalysis and by children for passing round notes. It is wholly unsuitable to use for electronic communications across an insecure network.

Notice that I said the Vigenère cipher was broken “even under perfect use conditions”. CSL have made some bad choices around the implementation of their algorithm. The cipher has been abused and is no longer in perfect use conditions.

An encryption scheme where the attacker knows the key and the cipher is totally broken – it provides no protection.

And CSL have given away the keys to the kingdom.

The key is the same for every single board. The key cannot be changed. The key is easy to find in the firmware.

An encryption scheme where the attacker knows the key and the cipher is completely and utterly broken.

Beyond that, CSL make a number of elementary mistakes in the protocol design. Even if the key were not fixed and already known, it could easily be recovered by observing a very limited number of sent messages. The report details some of these mistakes. They aren’t subtle mistakes; they are glaring errors and omissions.
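
To illustrate how little an attacker needs, here is a minimal Python 3 sketch of key recovery from a single message. It assumes the attacker knows or can guess that message's plain text (for instance from an ICCID and chip number visible on a board), and it borrows the status string and cipher text from the alteration example later in this post. The helper names are illustrative.

```python
# Cipher text alphabet implied by the "output constraints" in the script above:
# values 1-26 -> 'A'-'Z', 27-36 -> '0'-'9', 37 and up -> 'a' onwards.
def cipher_value(ch):
    c = ord(ch)
    if c >= 0x61:
        return c - 0x3C
    if c >= 0x41:
        return c - 0x40
    return c - 0x15

# Plain text alphabet from the "input constraint": '0'-'9' -> 0-9, 'A'-'Z' -> 10-35.
def plain_value(ch):
    c = ord(ch)
    return c - 0x30 if c < 0x41 else c - 0x37

def recover_key(plaintext, ciphertext):
    """Each table entry is simply cipher value minus plain value."""
    return [cipher_value(c) - plain_value(p) for p, c in zip(plaintext, ciphertext)]

# Assumption: the attacker knows this plain text for the observed message.
known_plain = "89441012345678901237A1234561111111111111117"
observed = "2i7MZCNhHRdkZpYTX2fiLMIaEbo02SKe5L3EbPYWRkn"

recovered = recover_key(known_plain, observed)
print("".join("%02x" % k for k in recovered))
```

The output matches the stored key from its second byte onwards, so one 43-character message gives up 43 of the 48 table entries the script actually uses.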

I cannot stress enough how bad this encryption is. Whoever developed it doesn’t even have basic knowledge of protocol design, never mind secure protocol design. I would expect this level of work from a short piece of A-level IT coursework, not from a security company.

2. Weak protection from substitution

The CS2300-R boards use two pieces of information to identify themselves. One is the 20-digit ICCID – most people would know this as the number on a SIM card. The other is a 6-digit “chip number”.

ICCID on case

Both of these are sent in each message – the same message which we can easily decrypt. This leads to an attacker being able to easily determine the identification information, which could then be used to spoof messages from the device.

Beyond that, installers actually tweet images of the boards with either the ICCID or chip number clearly visible: 1, 2, 3.

This is a very weak form of substitution protection. There are many techniques which can be used to confirm that an embedded device is genuine without using information sent in the open, such as using a message authentication code or digital signature.
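
As a rough sketch of the message authentication code approach, the Python 3 snippet below tags each message with HMAC-SHA256 under a per-device secret. This is not CSL's protocol or a proposal for one: the per-device key, the message counter and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-device secret, provisioned at manufacture and never sent on the wire.
device_key = secrets.token_bytes(16)  # 128 bits

def tag_message(key, counter, payload):
    """Return an HMAC-SHA256 tag covering a message counter and the payload."""
    msg = counter.to_bytes(8, "big") + payload
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify_message(key, counter, payload, tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(tag_message(key, counter, payload), tag)

payload = b"89441012345678901237A123456"  # identifiers may be public; the key is not
tag = tag_message(device_key, 1, payload)

genuine = verify_message(device_key, 1, payload, tag)
altered = verify_message(device_key, 1, payload + b"X", tag)
replayed = verify_message(device_key, 2, payload, tag)
print(genuine, altered, replayed)  # True False False
```

Knowing the ICCID and chip number is then no longer enough to spoof a device, because a valid tag cannot be produced without the secret. Including a counter in the tag is one simple way to reject replayed messages.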

3. Unable to perform firmware updates over-the-air

It is both presumptuous and foolhardy to assume that the device you deploy on day one will be free of bugs and security issues.

This is why computers, phones, routers, thermostats, set-top boxes, and even cars, allow for firmware updates over-the-air. Deployed devices can be updated in the field, after bugs and vulnerabilities have been fixed.

It has been considered – for many years – that over-the-air firmware updates are an absolutely vital part of the security of any connected embedded system.

The CS2300-R boards examined have no capability for firmware update without visiting the board with a laptop and programmer. No installers that I questioned own a programmer. The alarm system needs to be put into engineering mode. The board needs to be removed from the alarm enclosure (and, most of the time it is held in with self-adhesive pads, making this awkward). The plastic cover needs to be removed. The programmer needs to be connected, and then the firmware updated. Imagine doing that for 100 boards, all at different sites.

Saleae Logic

At this point, we need to remember that CSL claim to have over 300,000 deployed boards.

If we imagine that it takes a low estimate of 5 minutes to update each of the 300,000 boards, that is over 1000 man-days of effort to deploy an update. If you use a more realistic time estimate, the amount of effort becomes scary.
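
The arithmetic behind that figure, as a quick Python 3 check. The 8-hour working day is an added assumption for comparison; the board count and the 5 minutes come from the text above.

```python
boards = 300000        # CSL's claimed number of deployed boards
minutes_per_board = 5  # deliberately low estimate, excluding travel

total_hours = boards * minutes_per_board / 60.0
print("Total hours:          %.0f" % total_hours)         # 25000
print("Round-the-clock days: %.0f" % (total_hours / 24))  # 1042
print("8-hour working days:  %.0f" % (total_hours / 8))   # 3125 (assumed working day)
```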

This means that any issues found cannot and will not be fixed on deployed boards.

CSL have confirmed that none of their devices support over-the-air firmware updates.

CSL have given away the keys to the kingdom and cannot change the locks.

A software development life-cycle built around a product that cannot be updated fosters a do-not-care attitude. Why should CSL care about vulnerabilities if they cannot fix them? What would be their strategy if a serious, remotely exploitable vulnerability was found?

4. I do not believe the CS2300-R boards are standards compliant

One of the standards governing alarm signalling equipment is EN50136. This is a series of lengthy documents that are unfortunately not open access.

Only a small part actually discusses encryption and integrity. Even when they are discussed, it is in loose terms.

CSL have stated, in reference to my report:

As with all our products, this product has been certified as compliant to the required European standard EN-50136

Unfortunately, CSL will not clarify which version of the standards the CS2300-R devices have been tested to.

I will quote the relevant section from EN50136-1-5:2008:

To achieve S1, S2, I1, I2 and I3 encryption and/or hashing techniques shall be used.

When symmetric encryption algorithms are used, key length shall be no less than 128 bits. When other algorithms are deployed, they shall provide similar level of cryptographical strength. Any hash functions used shall give a minimum of 256 bits output. Regular automatic key changes shall be used with machine generated randomized keys.

Hash functions and encryption algorithms used shall be publicly available and shall have passed peer review as suitable for this application.

These security measures apply to all data and management functions of the alarm transmission system including remote configuration, software/firmware changes of all alarm transmission equipment.

There are newer versions of the standard, but they are fundamentally the same.

My interpretation of this is as follows:

Cryptography algorithms used should be in the public domain and peer reviewed as suitable for this application.

We are talking DES, AES, RSA, SHA, MD5. Not an algorithm designed in-house by someone with little to no cryptographic knowledge. The cryptography used in the CS2300-R boards examined is unsuitable for use in any system, never mind this specific application.

When the algorithm was sent as a Python script to one cryptographer, they assumed I had sent the wrong file because it was so bad.

The next I sent it to said:

I’ve never thought I’d see the day that someone wrote a variant on XOR encryption that offered less than 8 bits of security, but here we are.

We’re talking nanosecond scale brute force.

Key length should be 128 bits if a symmetric algorithm is used.

A reasonable interpretation of this requirement is that the encryption should provide strength equivalent to AES-128 against brute-force attacks. With current technology, brute-forcing AES-128 would take far longer than the universe has existed. This is strong enough.
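
A back-of-envelope check of that claim in Python 3. The guessing rate of 10^12 keys per second is an arbitrary assumption, and the age of the universe is rounded to 13.8 billion years.

```python
keyspace = 2 ** 128
guesses_per_second = 10 ** 12        # assumed attacker speed, purely illustrative
seconds_per_year = 365.25 * 24 * 3600
age_of_universe_years = 1.38e10      # roughly 13.8 billion years

years_to_exhaust = keyspace / guesses_per_second / seconds_per_year
print("Years to try every key:   %.1e" % years_to_exhaust)
print("Ages of the universe:     %.1e" % (years_to_exhaust / age_of_universe_years))
```

Even at that rate, exhausting the key space takes hundreds of millions of times the age of the universe.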

The CSL algorithm is many orders of magnitude less secure than this. Given the fixed mapping table, a message can be decrypted in the order of nanoseconds.
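
With the table fixed, the only unknown is where in its 48 entries a message starts, so trying every position is trivial. The Python 3 sketch below does that for the cipher text from the alteration example later in this post. It works with the table index directly rather than the script's `startingVariable`, and it assumes the plain text starts with an ICCID beginning "89"; the helper names are illustrative.

```python
# The 48 table entries used by the script above (the first 48 bytes of keyString).
KEY = list(bytearray.fromhex(
    "0f15241e0919030d2a050e2329132c1014171b2726020c072201212d1a1c120a"
    "281f0b1d04250f1816112e2b2006082f"))

def cipher_value(ch):
    """Map a cipher text character back to its numeric value."""
    c = ord(ch)
    if c >= 0x61:
        return c - 0x3C
    if c >= 0x41:
        return c - 0x40
    return c - 0x15

def try_offset(ciphertext, start):
    """Decrypt from table index `start`; None if any character falls outside 0-9/A-Z."""
    out, i = [], start
    for ch in ciphertext:
        v = cipher_value(ch) - KEY[i]
        if not 0 <= v <= 35:
            return None
        out.append(chr(v + 0x30) if v < 10 else chr(v + 0x37))
        i = (i + 1) % 48
    return "".join(out)

observed = "2i7MZCNhHRdkZpYTX2fiLMIaEbo02SKe5L3EbPYWRkn"

candidates = []
for start in range(48):
    plain = try_offset(observed, start)
    # Assumption: a genuine message begins with an ICCID, and ICCIDs begin "89".
    if plain is not None and plain.startswith("89"):
        candidates.append((start, plain))
print(candidates)  # [(1, '89441012345678901237A1234561111111111111117')]
```

Forty-eight trial decryptions is fewer than six bits of work.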

Regular automatic key changes shall be used with machine generated randomised keys

Regular is obviously open to interpretation. There is a balance to be struck here. You need to change keys often enough that they cannot be brute-forced or uncovered. But key exchange is a risky process – keys can be sniffed and exchanges can fail (resulting in loss of communications).

Regardless, the CS2300-R boards have no facility at all for changing the keys. The key is in the firmware and can never change.

All data and management functions including remote configuration should be protected by the encryption detailed.

The CS2300-R boards have a documented SMS remote control system, protected by a 6-digit PIN. This is not symmetric encryption, this is not 128-bit, this is not peer-reviewed.

This is a remote control system protected by a short PIN (and it seems that PIN is often the same – 001984 – and installers don’t have the ability to change it).
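
For scale, here is a quick Python 3 comparison of a 6-digit PIN with the 128-bit figure in the standard. A PIN is not an encryption key, so this is only an indication of the gap, and a PIN that is the same on many devices is worth far less than even this.

```python
import math

pin_space = 10 ** 6                 # every possible six-digit PIN
pin_bits = math.log2(pin_space)     # equivalent key length in bits

print("PIN combinations: %d" % pin_space)
print("Equivalent bits:  %.1f" % pin_bits)                      # about 19.9
print("Smaller than a 128-bit key space by a factor of about 2^%.0f" % (128 - pin_bits))
```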

Data should be protected from accidental or deliberate alteration

There is nothing in the protocol to protect against alteration. There is no MAC, no signature. Nothing. Not even a basic checksum or parity bit.

The message can easily be altered by accident or by malice. Look at this example:

   Status: 89441012345678901237A1234561111111111111117
Encrypted: 2i7MZCNhHRdkZpYTX2fiLMIaEbo02SKe5L3EbPYWRkn
    Shift: 0000000000000000000000000004444444444444440 
  Altered: 2i7MZCNhHRdkZpYTX2fiLMIaEbo46WOi9P7IfT20Von 
Decrypted: 89441012345678901237A1234565555555555555557

By adding four to a character in the encrypted text, we add four to the decrypted character. This means that the message content has been altered significantly, and very easily. The altered part of the message is the alarm status.
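
The alteration above can be reproduced with the Python 3 sketch below. Note that "adding four" happens in the cipher's own alphabet (A-Z, then 0-9, then a onwards), not in ASCII. The helper names are illustrative, and the table index of 1 corresponds to the `startingVariable` of 52 used in the script earlier.

```python
# The 48 table entries used by the script above (the first 48 bytes of keyString).
KEY = list(bytearray.fromhex(
    "0f15241e0919030d2a050e2329132c1014171b2726020c072201212d1a1c120a"
    "281f0b1d04250f1816112e2b2006082f"))

def cipher_value(ch):
    """Cipher text character -> numeric value."""
    c = ord(ch)
    if c >= 0x61:
        return c - 0x3C
    if c >= 0x41:
        return c - 0x40
    return c - 0x15

def cipher_char(v):
    """Numeric value -> cipher text character (the script's output constraints)."""
    if v < 0x1B:
        return chr(v + 0x40)
    if v < 0x25:
        return chr(v + 0x15)
    return chr(v + 0x3C)

def tamper(ciphertext, first, last, delta):
    """Add `delta` to positions first..last without knowing or using the key."""
    return "".join(
        cipher_char(cipher_value(ch) + delta) if first <= i <= last else ch
        for i, ch in enumerate(ciphertext))

def decrypt_from(ciphertext, start):
    """Decrypt starting at table index `start`, as the receiving end would."""
    out, i = [], start
    for ch in ciphertext:
        v = cipher_value(ch) - KEY[i]
        out.append(chr(v + 0x30) if v < 10 else chr(v + 0x37))
        i = (i + 1) % 48
    return "".join(out)

encrypted = "2i7MZCNhHRdkZpYTX2fiLMIaEbo02SKe5L3EbPYWRkn"
altered = tamper(encrypted, 27, 41, 4)   # positions 27-41 carry the alarm status
decrypted = decrypt_from(altered, 1)

print("  Altered: %s" % altered)
print("Decrypted: %s" % decrypted)
```

The `tamper` function never touches the key, which is the point: the receiver has no way to tell the altered message from a genuine one.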

I can see no way that these units are compliant with this standard. CSL, however, say they are certified.

5. I do not believe third-party certification is worthwhile

CSL has obtained third-party certification for CS2300 units. When I first heard this, I was astonished – how is it compliant with the standard?

It’s still not actually clear which units have been tested. The certificate says CS2300, my units say CS2300-R, but CSL say that the units I have looked at have been tested.

After meeting with the test house I have an idea of what happened here.

The test house would not discuss the CS2300 certification specifically, due to client confidentiality. They did discuss the EN50136 standard and testing in general.

Firstly, I was not reassured that the test house has the technical ability to test any cryptographic or electronic security aspects of the standard. Their skills do not lie in this area. At a number of points during the meeting I was surprised about their lack of knowledge around cryptography.

Secondly, there are areas of standards where a manufacturer can self-declare that they are compliant. The test house expected, if another unit was to be tested, the sections on encryption and electronic security would be self-declared by the manufacturer. Note that the test house can still scrutinise evidence around a self-declaration.

Thirdly, there is no way for a third-party to see any detail around the testing without both the manufacturer and test house agreeing to release the data. To everyone else, it’s just a certificate.

From this, I can infer that the CS2300 – and probably other signalling devices, even from other manufacturers – have not actually had the encryption or other electronic security tested by a competent third-party.

I don’t feel that this is made clear enough by either manufacturers or test houses.

6. I do not think the standard is strict enough

I acknowledge that the standard must cover a range of devices, of different costs, protecting different risks, and across the EU. It must be a lot of work drawing up such a standard.

Regardless of this, the section on encryption and substitution protection is so wishy-washy that it would be entirely possible to build a compliant device that had gaping security holes in it.

Encryption, by itself, is not enough to maintain the security of a system. This is widely known in the information security and cryptography world. It’s perfectly possible to chain together sound cryptographic primitives into a useless system. There is nothing in the standard to protect against this.

7. CSL do not have a security culture

There are so many issues with the CS2300-R system it is almost unbelievable.

Other aspects of CSL’s information security are similarly weak: leaking their customer database, no TLS on the first revision of their app, an awful apprenticeship website, no TLS on their own shop, misconfiguration of TLS on their VPN server, letting staff use Hotmail in their network operations centre… it goes on.

(It is worth noting that CSL added TLS to their shop and fixed the VPN server after I blogged about them a few weeks ago – why does it take blog posts before trivially simple issues are fixed?)

CSL do not have a vulnerability disclosure policy. It was clear that CSL did not know how to handle a vulnerability report.

CSL have refused to discuss any detail without a non-disclosure agreement in place.

There is no evidence that CSL’s security has undergone any form of scrutiny. Even a rudimentary half-day assessment would have picked up many of the issues with their website.

There is also a degree of spin in their marketing and sales. A number of installers and ARCs questioned believe that the device forms a VPN to CSL’s server. Some also believe that the device uses AES-256. Indeed, their director of IT, Santosh Chandorkar, claimed to me that the CS2300-R formed a VPN with their servers. There is no evidence in the firmware to support any of these claims, but there is also no way for a normal user to confirm what is and isn’t happening.

At a meeting, Rob Evans implied that it would be my fault should these issues be exploited after I released them. He used the example of someone getting hurt on a premises protected by their devices. It obviously would not be the fault of the company that developed the system.

At one point, when this was raised on a forum, someone claiming to be a family friend of Rob Evans accused me of hacking his DVR and spying on his kid, whilst at the same time attempting to track me down and make threats of violence. The same person has boasted about this on other forums.

I have asked Rob Evans to confirm or deny if he knows this person. As of today, I have had no response.

Another alarm installer going by the handle of Cubit is repeatedly stating that I am attempting to extort money from security manufacturers:

Not when he tries to hold a company to ransom, no!
Remember reading his article about the (claimed) flaws in the <redacted> product?? No, thought not. They paid to keep him quiet.

Oddly, the MD of the same company came along to state that this wasn’t the case.

I think it’s disturbing that, rather than pay attention to potential issues, defenders of CSL act like this.

And there is this gem from Simon Banks, managing director of CSL Dualcom:

IP requires elaborate encryption because it sends data across the open Internet. In my 25 years’ experience I’ve never been aware of a signalling substitution or ‘hack’, and have never seen the need for advanced 128 bit encryption when it comes to traditional security signalling.

No need for 128 bit encryption, Simon. Only the standard.

Conclusion

The seven issues to take away from this are:

  1. CSL have developed incredibly bad encryption, on a par with techniques that were state of the art before computers existed.
  2. CSL have not protected against substitution very well
  3. CSL can’t fix issues when they are found because they can’t update the firmware
  4. There seems to be a big gap between the observed behaviour of the CS2300-R boards and the standards
  5. It’s likely that the test house didn’t actually test the encryption or electronic security
  6. Even if a device adheres to the standard, it could still be full of holes
  7. CSL either lack the skill or drive to develop secure systems, making mistake after mistake

What do I think should happen as a result of this?

  1. All signalling devices should be pen-tested by a competent third-party
  2. A cut-down report should be available to users of the devices, detailing what was tested and the results of the testing
  3. The standards, and the standards testing, need to include pen-testing rather than compliance testing
  4. The physical security market needs to catch up with the last 10 years of information security

Rebuttals

CSL have made some statements about this.

This only impacts a limited number of units

CSL have stated:

Of the product type mentioned in his report there are only around 600 units in the field

What product type mentioned? Units labelled CS2300-R? Speaking to installers, the CS2300-R seems to be incredibly common.

If it is only a subset of units labelled CS2300-R, how does a user work out which ones are impacted?

The other 299,400 devices may not be the same unit, but how do they differ? Has a competent third-party tested the encryption and electronic security?

We have done an internal review

CSL have stated:

Our internal review of the report concluded there is no threat to these systems

Ask yourself this: if someone has deployed a system with this many issues in it, why should you trust their judgement as to the security of the system now? Are they competent to judge? There is no evidence that they are.

They have been third-party tested

CSL have stated, specifically in reference to my report:

As with all our products, this product has been certified as compliant to the required European standard EN-50136

This worries me. This says that the very device I have examined – the one full of security problems – got past EN-50136 testing. If this device can pass, practically anything can pass.

But I am fairly sure that the standards testing essentially allows the manufacturer to complete the exercise on paper alone.

The devices are old

The product tested was a 6 year old GPRS/IP Dualpath signalling
unit.

Firstly, there are at least 600 of these still in service.

Secondly, when the research was carried out, the boards were 4.5 years old.

Thirdly, does that mean that a 6 year old product is obsolete? Does that mean they don’t support it any more?

The threat model isn’t the one we are designed for

This testing was conducted in a lab environment that isn’t
representative of the threat model the product is designed to be implemented in
line with.  The Dualpath signalling unit is designed to be used as part of a
physically secured environment with threat actors that would not be targeting
the device but the assets of the device End User.

This seems to have been a sticking point with some of the more backwards members of the security industry as well.

The reverse engineering work was done in a lab. As with nearly all vulnerability research, there needs to be a large initial investment in time and effort. Once vulnerabilities have been found, they can be exploited outside of the lab environment.

If the threat actors aren’t targeting the device, why bother with dual path?

Again, it doesn’t look like the devices comply with the standards. This is what counts.

They aren’t remotely exploitable

No vulnerabilities were identified that could be exploited remotely via
either the PSTN connectivity or GPRS connection which significantly reduces the
impact of the vulnerabilities identified.

I disagree with this. CSL and a number of their supporters do not seem to want to accept that GPRS data can no longer be classed as secure.

This still leaves the gaping holes on the IP side. When I met CSL at IFSEC 2014, they strongly implied that the number of IP units they sold was negligible. There seem to be more than a few getting installed though.

The price point is too low

The price point for the DualCom unit is £200 / $350.  CSL DualCom also
have devices in their portfolio that are tamper resistant or tamper evident to
enable customers to defend against more advanced or better funded threat
actors.  Customers are then able to spend on defence in line with the value of
their assets.

I’m not sure why the price is relevant. Are CSL saying it’s too cheap to be properly secure?

I can’t find any of these tamper resistant or tamper evident devices for sale – it would be interesting to see what they are.

Very few of the issues raised involve physically tampering with the device. They are generally installed in a protected area.

These aren’t problems, but we are releasing a product that fixes the issues

If customers are concerned about the impact of these vulnerabilities CSL are
releasing a new product in May which addresses all of the areas highlighted.

So on one hand, these vulnerabilities aren’t issues, but they are issues enough that you’ve developed a new product to fix them? Righty ho.

Firmware updates are vulnerable, but not normal communications

CSL products are not remotely patchable as we believe over the air updates
could be susceptible to compromise by the very threat actors we are defending
against.

What?

Just a few paragraphs ago, you said that you are not protecting against the kind of threat actor that can carry out attacks as in the report. But you are protecting against a threat actor that can intercept firmware updates?

Why allow critical settings to be changed over SMS if this is an issue?

What-if rebuttals

These are things that haven’t been directly stated by CSL or others, but I suspect that people will raise them.

These issues are not being exploited

During discussions with CSL, they seemed very focused on what has happened in the past. I had no evidence of attacks being carried out against their system, and neither did they. Therefore, in their eyes, the vulnerabilities were not an issue.

This is an incredibly backwards view of security. The idea of a botnet of DVRs mining cryptocurrency would have seemed ridiculous 5 years ago. The idea of a worm, infecting routers and fixing security problems even more so. The Internet changes. The attackers change. Knowledge changes.

Failing to keep up with these changes has been the downfall of many systems.

But we haven’t detected any issues

This entirely misses the point.

The end result of these vulnerabilities is that it is highly likely that a skilled attacker could spoof another device undetected.

We don’t mind issues being brought to us privately

These issues were brought to CSL’s attention, privately, 17 months ago.

That is ample time to act.

He works for a competitor

Firstly, I don’t. I have spoken to competitors to find out how they work.

Secondly, this would not detract from the glaring holes in the system.

He is blackmailing people in the security industry

I have released vulnerabilities in Visonic and Risco products. Shortly, there will be a vulnerability in the RSI Videofied systems. None of these companies have been asked for payment, and all have been given 45+ days to respond to issues. This is a fair way of disclosing issues.

I do paid work with others in the security industry. Again, at no point has payment been requested to keep issues quiet.

I have never asked CSL for payment. At several points they have asked to work with me, which I have turned down as I don’t think their security problems are going to be resolved given their culture.

The encryption and electronic security are adequate

It’s hard to explain (to someone outside of infosec) just how bad the encryption is. It is orders of magnitude less strong than encryption used by Netscape Navigator in 2001.

The problems found have been widely known for 20+ years, and many are easy to protect against. Importantly, it appears that their competitors – at least WebWayOne and BT Redcare – aren’t making the same mistakes.

The GPRS network is secure

This was true 15 years ago. It is now possible – cheaply and easily – to spoof a cell site and then intercept GPRS communications. You cannot rely on the security of the GPRS network alone.

Further to this, exactly the same protocol is used over the Internet.

But above all, the standards don’t differentiate between GPRS and the Internet – they are both packet switched networks and must be secured similarly.

We take our customers’ security seriously

So does every other company that has been the subject of criticism around security.

I would argue that letting your customer database leak is not taking security seriously.

Reverse engineering a CSL Dualcom GPRS part 16 – SMS remote commands

Sorry for the slow-down in posts – I stored up a load of posts, then posted them too quickly.

Since the last post, I have identified a lot of functionality in the code, including:

  • TX/RX subs for all the UARTs
  • Locations of TX/RX buffers for all UARTs
  • Multiply, divide, modulus and exponent subs
  • Conversion subs (ASCII->hex etc.)
  • EEPROM read/write subs
  • 7 segment display subs
  • Buzzer subs
  • Reading button state
  • Memory copy, search etc.
  • Hardware initialisation
  • Two interesting subs that are called frequently to move working memory onto stack and back again

The board supports several GPRS modems – the Wavecom GR64 on the boards I have, but also Cinterion and Telit. The AT command set between the modems is completely different, so there are often several sections of code for common functionality like “Activate PDP context”. I’ve only looked at the Wavecom parts.

Having all this has given me enough to tie observed behaviour (just from using the Dualcom board, and the logic traces) back to specific areas of the code. Seeing the strings on the UART and searching for the address where that string is stored in flash/EEPROM is very useful.

I’m fairly sure the code is compiled C from IAR Embedded Workbench – the multiply, divide and startup are identical. If I compile code from Renesas Cubesuite, it looks very different.

There is still a lot of code that looks like assembly though – some of the memory operations, calling convention, and string searching look very odd to be compiled C.

One thing that caught my eye during the serial trace was that the board repeatedly checks for SMS messages. This suggests it is waiting for something to arrive, possibly commands.

Digging around a bit, in the manual for the Dualcom GSM (not the GPRS), it mentions SMS remote control – sending commands to the CSL Dualcom:

SMS Remote commands

This is interesting. A 6 digit PIN and some limited commands. It wouldn’t be the first time that PINs are defaulted to a certain value or derived from open information like the ICCID or number. It also wouldn’t be the first time that there are undocumented or hidden commands for a device, or even a backdoor.

We’re going to need to have a dig about in the code to see how these commands are dealt with. A good starting point would be to find where the string AT+CMGR (read text messages) is used, and follow on from there.

AT+CMGR is stored in the flash at 0x1EB6 (0x1000-0x2000 is almost exclusively strings and lookup tables). If we search for this address, we find the following chunk:

// State 0x86
// Request text messages
0c657        afc8f5      MOVW            AX,!0F5C8H // This is used as a time out
0c65a        7c80        XOR             A,#80H
0c65c        440180      CMPW            AX,#8001H
0c65f        dc07        BC              $0C668H
	0c661        d46a        CMP0            0FFE6AH
	0c663        61f8        SKNZ            
	0c665        ee3902      BR              $!0C8A1H
0c668        e1          ONEB            A
0c669        fc64d700    CALL            sub_Serial_ResetBuffers
0c66d        32b61e      MOVW            BC,#1EB6H // AT+CMGR=
0c670        e1          ONEB            A
0c671        fcd1e100    CALL            sub_Serial_WriteString_e1d1
0c675        8f49f7      MOV             A,!0F749H
0c678        72          MOV             C,A
0c679        f3          CLRB            B
0c67a        e1          ONEB            A
0c67b        fc04e200    CALL            sub_Serial_WriteHexAsDec_e204
0c67f        f46b        CLRB            0FFE6BH
0c681        f46a        CLRB            0FFE6AH
0c683        cf41f704    MOV             !0F741H,#4H
0c687        530d        MOV             B,#0DH //CR
0c689        e1          ONEB            A
0c68a        fcb2e100    CALL            sub_Serial_WriteChar_e1b3
0c68e        30a302      MOVW            AX,#2A3H
0c691        bfc8f5      MOVW            !0F5C8H,AX // set timeout to 675
0c694        cf4af704    MOV             !0F74AH,#4H // State 4 next
0c698        ee0602      BR              $!0C8A1H

This is a massive state machine. The “state” is stored in 0xF74A, checked at the beginning of the sub and then a branch performed to the current state. State 0x86 sends “AT+CMGR=” followed by the message index held in 0xF749 (e.g. “AT+CMGR=1”) and a carriage return to UART1. It then sets the next state to 0x4.

Each one of the states sets a counter in 0xFF5C8 which is decremented in the timer interrupt. If the counter runs out, the state machine seems to be reset in most cases.

// State 0x4
// State after requesting text messages
0c69b        afc8f5      MOVW            AX,!0F5C8H // timeout
0c69e        7c80        XOR             A,#80H
0c6a0        440180      CMPW            AX,#8001H
0c6a3        dc42        BC              $0C6E7H
	0c6a5        8f19f7      MOV             A,!0F719H
	0c6a8        4c02        CMP             A,#2H
	0c6aa        61d8        SKNC            
	0c6ac        eef201      BR              $!0C8A1H
	0c6af        32c01e      MOVW            BC,#1EC0H  // "REC "
	0c6b2        e1          ONEB            A
	0c6b3        fcd3de00    CALL            sub_Serial_FindInRX_ded3
	0c6b7        d1          CMP0            A
	0c6b8        dd0d        BZ              $0C6C7H
	0c6ba        30a302      MOVW            AX,#2A3H
	0c6bd        bfc8f5      MOVW            !0F5C8H,AX
	0c6c0        cf4af718    MOV             !0F74AH,#18H // State 0x18 next
	0c6c4        eeda01      BR              $!0C8A1H

	0c6c7        32c61e      MOVW            BC,#1EC6H // "STO "
	0c6ca        e1          ONEB            A
	0c6cb        fcd3de00    CALL            sub_Serial_FindInRX_ded3
	0c6cf        d1          CMP0            A
	0c6d0        dd07        BZ              $0C6D9H
	0c6d2        cf4af717    MOV             !0F74AH,#17H // State 0x17 next
	0c6d6        eec801      BR              $!0C8A1H

	0c6d9        3152410a    BT              0FFE41H.5H,$0C6E7H
	0c6dd        afdef5      MOVW            AX,!0F5DEH
	0c6e0        7c80        XOR             A,#80H
	0c6e2        440180      CMPW            AX,#8001H
	0c6e5        dc0b        BC              $0C6F2H
0c6e7        cf49f70f    MOV             !0F749H,#0FH
0c6eb        cf4af796    MOV             !0F74AH,#96H // State 0x96 next
0c6ef        eeaf01      BR              $!0C8A1H

State 0x4 searches the receive buffer on UART1 for the characters “REC” or “STO”. “REC” is what we would see if there was a text message to read, and if this is found we move to state 0x18. This calls 0xC301 and the flow continues from there. It might be better to describe this as a process rather than show ASM.
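As a rough model of the two states shown so far, here is a small Python sketch. The state numbers and the timeout value of 0x2A3 come from the listings; the class and method names are made up for illustration, the extra checks in state 0x4 are left out, and the real firmware works on fixed buffers rather than Python strings.

```python
TIMEOUT_TICKS = 0x2A3  # 675, the value loaded into the timeout counter


class SmsPoller:
    """Toy model of states 0x86 and 0x4; not the real firmware."""

    def __init__(self):
        self.state = 0x86
        self.timeout = 0
        self.sent = []  # what would be written to UART1

    def tick(self):
        # Stand-in for the timer interrupt decrementing the counter
        if self.timeout > 0:
            self.timeout -= 1

    def step(self, rx_buffer):
        if self.state == 0x86:
            # Request a text message, then wait for the reply
            self.sent.append("AT+CMGR=1\r")
            self.timeout = TIMEOUT_TICKS
            self.state = 0x04
        elif self.state == 0x04:
            if self.timeout == 0:
                self.state = 0x96          # gave up waiting
            elif "REC " in rx_buffer:
                self.timeout = TIMEOUT_TICKS
                self.state = 0x18          # there is a message to read
            elif "STO " in rx_buffer:
                self.state = 0x17
        return self.state
```

Calling step() once moves from 0x86 to 0x4, and a later call with a buffer containing “REC ” moves on to 0x18. If tick() runs the counter down to zero first, the model ends up in 0x96 instead.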

The format of the text message received would be as follows:

+CMGR: “REC UNREAD”,“+447747008670”,“Matt L”,“02/11/19,09:57:28+00”,145,36,0,0,“ +447785016005”,145,8

Test sms

The code, give or take, works as follows.

1. Search RX buffer for string READ – this finds REC UNREAD and REC READ, both found in text messages. If not found, abort.

2. Keep on going until a + is found – the start of the phone number

3. Loop until a non-numeric character is found, storing the number in 0xFE17F. The number is used later to send a text back.

4. Loop until a carriage return is found. This is the end of the text detail and the start of the actual text.

5. Copy the message to address 0xFE20E.

6. Read in a 6-digit PIN from EEPROM at 0x724. I can’t see where this is set in the Windows utility.

7. Check that the 6-digit PIN is at the beginning of the message.

8. Call a function to parse the rest of the message and act on it.
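The steps above can be sketched in Python. This is only my reading of the flow, not a translation of the firmware: the function name, the ASCII quotes and the example PIN are assumptions, and the real code copies into fixed RAM buffers instead of slicing strings.

```python
def parse_sms(rx, pin):
    """Follow steps 1-8 on a +CMGR response.

    Returns (sender_number, rest_of_message) or None if parsing is abandoned.
    """
    pos = rx.find("READ")              # step 1: matches REC READ and REC UNREAD
    if pos < 0:
        return None
    plus = rx.find("+", pos)           # step 2: start of the phone number
    if plus < 0:
        return None
    end = plus + 1
    while end < len(rx) and rx[end].isdigit():
        end += 1                       # step 3: collect the digits
    number = rx[plus + 1:end]
    cr = rx.find("\r", end)            # step 4: end of the header line
    if cr < 0:
        return None
    message = rx[cr + 1:].strip()      # step 5: the text itself
    if not message.startswith(pin):    # steps 6-7: PIN must come first
        return None
    # Step 8 would parse and act on whatever follows the PIN
    return number, message[len(pin):].strip()
```

With a made-up PIN of 123456, a message body of “123456 TEST” from +447747008670 would come back as the number 447747008670 and the command TEST. The same message with any other PIN is dropped.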

This is where it gets mildly interesting – not only are there the documented commands but there are ones that include “CALL” and “4 2 xxxxxxxxxx”….

Reverse engineering a CSL Dualcom GPRS part 15 – interpreting disassembly 2

In addition to finding the most frequently called functions, we should go through the memory map and identify important parts of it.

1154 memory map

One part of this that is very important to how the device operates is the vector table, right at the bottom of the flash.

The vector table contains addresses that are called when certain interrupts are triggered. For these microcontrollers, this is structured like this:
Vector table

So we take a look right at the beginning of the disassembly:

00000        00          NOP             
00001        01          ADDW            AX,AX
00002        82          INC             C
00003        2084        SUBW            SP,#84H
00005        2086        SUBW            SP,#86H
00007        2088        SUBW            SP,#88H
00009        208a        SUBW            SP,#8AH
0000b        208c        SUBW            SP,#8CH
0000d        208e        SUBW            SP,#8EH
0000f        2090        SUBW            SP,#90H
00011        2092        SUBW            SP,#92H
00013        2001        SUBW            SP,#1H
00015        247d23      SUBW            AX,#237DH
00018        292494      MOV             A,9424H[C]
0001b        2096        SUBW            SP,#96H
0001d        203a        SUBW            SP,#3AH
0001f        21          ?               
00020        be20        MOVW            PM0,AX
00022        3c21        SUBC            A,#21H
00024        92          DEC             C
00025        225921      SUBW            AX,!2159H
00028        ba22        MOVW            [DE+22H],AX
0002a        9820        MOV             [SP+20H],A
0002c        00          NOP             
0002d        209a        SUBW            SP,#9AH
0002f        209c        SUBW            SP,#9CH
00031        209e        SUBW            SP,#9EH
00033        20a0        SUBW            SP,#0A0H
00035        20a2        SUBW            SP,#0A2H
00037        20a4        SUBW            SP,#0A4H
00039        20a6        SUBW            SP,#0A6H
0003b        205e        SUBW            SP,#5EH
0003d        23          SUBW            AX,BC
0003e        d7          RET             
0003f        226023      SUBW            AX,!2360H
00042        a820        MOVW            AX,[SP+20H]
00044        aa20        MOVW            AX,[DE+20H]
00046        ac20        MOVW            AX,[HL+20H]
00048        ae20        MOVW            AX,PM0
0004a        b020b2      DEC             !0B220H
0004d        20b4        SUBW            SP,#0B4H
0004f        20b6        SUBW            SP,#0B6H
00051        20b8        SUBW            SP,#0B8H
00053        20ba        SUBW            SP,#0BAH
00055        20ff        SUBW            SP,#0FFH

The disassembler has tried to disassemble when it shouldn’t – a common issue. Though, to be honest, it should know that this area is a vector table.

So if we re-organise the hex file into something a bit more readable, we get this:

0000 -> 0100 * RESET
0004 -> 2084
0006 -> 2086
0008 -> 2088
000A -> 208A
000C -> 208C
000E -> 208E
0010 -> 2090
0012 -> 2092
0014 -> 2401 * INTST3
0016 -> 237D * INTSR3
0018 -> 2429 * INTSRE3
001A -> 2094 
001C -> 2096 
001E -> 213A * INTST0
0020 -> 20BE * INTSR0 
0022 -> 213C * INTSRE0
0024 -> 2292 * INTST1
0026 -> 2159 * INTSR1
0028 -> 22BA * INTSRE1
002A -> 2098
002C -> 2000 * INTTM00
002E -> 209A
0030 -> 209C
0032 -> 209E
0034 -> 20A0
0036 -> 20A2
0038 -> 20A4
003A -> 20A6
003C -> 235E * INTST2
003E -> 22D7 * INTSR2
0040 -> 2360 * INTSRE2
0042 -> 20A8
0044 -> 20AA
0046 -> 20AC
0048 -> 20AE
004A -> 20B0
004C -> 20B2
004E -> 20B4
0050 -> 20B6
0052 -> 20B8
0054 -> 20BA
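Doing this by hand is error-prone, so a few lines of Python can do the re-organising. This is a minimal sketch: the bytes are the first 32 bytes of flash read off the mis-disassembled listing above, and the function name is my own. Each vector is a 16-bit little-endian address.

```python
import struct

# Bytes 0x00-0x1F of flash, read from the opcode column of the listing above
VECTOR_BYTES = bytes.fromhex(
    "0001 8220 8420 8620 8820 8a20 8c20 8e20"
    "9020 9220 0124 7d23 2924 9420 9620 3a21"
)


def decode_vectors(data):
    """Return (table_address, handler_address) pairs from little-endian words."""
    count = len(data) // 2
    words = struct.unpack("<%dH" % count, data[:count * 2])
    return [(index * 2, word) for index, word in enumerate(words)]


for table_address, handler in decode_vectors(VECTOR_BYTES):
    print("%04X -> %04X" % (table_address, handler))
```

Feeding in the whole of 0x00-0x55 would give the complete table in the same format.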

Notice how a lot of the addresses are just incrementing: 20AA, 20AC, 20AE. These point into a massive block of RETI instructions, i.e. the interrupt handler just returns immediately. It is not implemented.

02092        61fc        RETI            
02094        61fc        RETI            
02096        61fc        RETI            
02098        61fc        RETI            
0209a        61fc        RETI            
0209c        61fc        RETI            
0209e        61fc        RETI            
020a0        61fc        RETI            
020a2        61fc        RETI            
020a4        61fc        RETI            
020a6        61fc        RETI            
020a8        61fc        RETI            
020aa        61fc        RETI            
020ac        61fc        RETI            
020ae        61fc        RETI            
020b0        61fc        RETI            
020b2        61fc        RETI     

All of the vectors that are marked with an asterisk and with a name are implemented or used by the board. There are some important handlers here – mainly the serial IO.

Reset jumps to 0x100. I’ll save looking at that for another time – mostly the reset vector will be setting up buffers, memory, pointers, some checks.

You can also see we have groups of interrupt handlers for INTST* (transmit finished), INTSR* (receive finished), INTSRE* (receive error). These are for UARTs 0-3. Their implementation is very similar – let’s look at UART1 which is used for the GPRS modem.

// INTST1
	02292        c1          PUSH            AX
	02293        c3          PUSH            BC
	02294        c7          PUSH            HL
		02295        fbb6e0      MOVW            HL,!0E0B6H
		02298        afb4e0      MOVW            AX,!0E0B4H
		0229b        47          CMPW            AX,HL
		0229c        dd17        BZ              $22B5H
		0229e        dbb4e0      MOVW            BC,!0E0B4H
		022a1        49b8e4      MOV             A,0E4B8H[BC] 	// Get data from E4B8 using offset from E0B4
		022a4        9e44        MOV             SIO10,A 		// Move to serial data TX register
		022a6        a2b4e0      INCW            !0E0B4H	    // Increment the offset
		022a9        afb4e0      MOVW            AX,!0E0B4H
		022ac        440a04      CMPW            AX,#40AH 		// Has the offset reached 1034 (0x40A)? If so reset to 0
		022af        dc04        BC              $22B5H
		022b1        f6          CLRW            AX
		022b2        bfb4e0      MOVW            !0E0B4H,AX
	022b5        c6          POP             HL
	022b6        c2          POP             BC
	022b7        c0          POP             AX
	022b8        61fc        RETI  

Again – I’m not really currently interested in precise detail, just an idea of what is happening. This handler takes a byte from a buffer at 0xE4B8 and writes it into the transmit register. That buffer will appear elsewhere in the code and hint to us when something is being sent out of UART1.
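To make the idea concrete, here is a toy Python model of what this handler seems to be doing: a circular buffer with a read offset that wraps at 0x40A. That 0xE0B6 is the matching write offset is my inference from the comparison at the top of the handler, and the class and method names are invented for the sketch.

```python
BUF_SIZE = 0x40A  # 1034, the wrap point compared against in the handler


class TxRing:
    """Toy model of the UART1 transmit buffer; names are illustrative."""

    def __init__(self):
        self.buf = bytearray(BUF_SIZE)  # stands in for RAM from 0xE4B8
        self.head = 0                   # write offset, like 0xE0B6 (assumed)
        self.tail = 0                   # read offset, like 0xE0B4

    def put(self, byte):
        # What the main code would do to queue a character
        self.buf[self.head] = byte
        self.head = (self.head + 1) % BUF_SIZE

    def on_tx_interrupt(self):
        # What INTST1 appears to do: send the next byte if one is waiting
        if self.tail == self.head:
            return None                 # offsets equal, nothing to send
        byte = self.buf[self.tail]
        self.tail = (self.tail + 1) % BUF_SIZE
        return byte
```

Queuing the three bytes of “AT” plus a carriage return and then firing the interrupt four times gives back 0x41, 0x54, 0x0D and then None. The model does not guard against the writer overtaking the reader, and I have not checked whether the firmware does.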

We can then go through all of the other UART/serial functions and identify potential transmit/receive buffers.

Interestingly, INTST0 and INTST2 are just RETI instructions. Why do these not require a transmit empty interrupt handler? Is it handled in software elsewhere?

The next handler that stands out from the others is INTTM00. This is the timer interrupt for timer 0 which will fire when the timer hits a certain value.

// INTTM00        
	02000        c1          PUSH            AX
	02001        c3          PUSH            BC
	02002        c7          PUSH            HL
	02003        aefc        MOVW            AX,0FFFFCH
	02005        c1          PUSH            AX
	02006        a0b3f6      INC             !0F6B3H
	02009        8fb3f6      MOV             A,!0F6B3H
	0200c        5c03        AND             A,#3H
	0200e        4c03        CMP             A,#3H
	02010        df38        BNZ             $204AH
		02012        a0b4f6      INC             !0F6B4H
		02015        fcfc2801    CALL            !!128FCH
		02019        fc932601    CALL            !!12693H
		0201d        fcf22701    CALL            !!127F2H
		02021        f45c        CLRB            0FFE5CH

		02023        fc132a01    CALL            !!12A13H // 7SEG display
		02027        fcaa3201    CALL            !!132AAH // Buttons

		0202b        8fb4f6      MOV             A,!0F6B4H
		0202e        5c03        AND             A,#3H
		02030        dd08        BZ              $203AH
		02032        91          DEC             A
		02033        dd0b        BZ              $2040H
		02035        91          DEC             A
		02036        dd0e        BZ              $2046H
		02038        ef10        BR              $204AH
		0203a        fccbff00    CALL            !!0FFCBH // Analog
		0203e        ef0a        BR              $204AH
		02040        fcd13101    CALL            !!131D1H
		02044        ef04        BR              $204AH
		02046        fc063301    CALL            !!13306H
	0204a        fc742e01    CALL            !!12E74H
	0204e        fc84ff00    CALL            !!0FF84H
	02052        72          MOV             C,A
	02053        81          INC             A
	02054        dd24        BZ              $207AH
        02056        62          MOV             A,C
        02057        70          MOV             X,A
        02058        f1          CLRB            A
        02059        01          ADDW            AX,AX
        0205a        04b8f5      ADDW            AX,#0F5B8H
        0205d        16          MOVW            HL,AX
        0205e        f6          CLRW            AX
        0205f        b1          DECW            AX
        02060        bb          MOVW            [HL],AX
        02061        62          MOV             A,C
        02062        d1          CMP0            A
        02063        dd11        BZ              $2076H
        02065        2c11        SUB             A,#11H
        02067        dd05        BZ              $206EH
        02069        91          DEC             A
        0206a        dd06        BZ              $2072H
        0206c        ef0c        BR              $207AH
        0206e        e46a        ONEB            0FFE6AH
        02070        ef08        BR              $207AH
        02072        e46b        ONEB            0FFE6BH
        02074        ef04        BR              $207AH
        02076        fcf7fc00    CALL            !!0FCF7H
        0207a        c0          POP             AX
	0207b        befc        MOVW            0FFFFCH,AX
	0207d        c6          POP             HL
	0207e        c2          POP             BC
	0207f        c0          POP             AX

This looks like it is fired periodically. A number of counters are used so that portions of the subroutine are only run now and then.
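A rough Python model of that counter logic is below. The two counters mirror 0xF6B3 and 0xF6B4, and the masks come from the listing; the labels appended to the log are just my names for the calls, and the calls I have not identified are lumped together or left out.

```python
def timer_tick(state, log):
    """Model one INTTM00 interrupt; `state` holds the two 8-bit counters."""
    state["fast"] = (state["fast"] + 1) & 0xFF       # like 0xF6B3
    if (state["fast"] & 3) == 3:                     # one interrupt in four
        state["slow"] = (state["slow"] + 1) & 0xFF   # like 0xF6B4
        log.append("display")                        # 0x12A13
        log.append("buttons")                        # 0x132AA
        phase = state["slow"] & 3
        if phase == 0:
            log.append("analog")                     # 0x0FFCB
        elif phase == 1:
            log.append("sub_131D1")
        elif phase == 2:
            log.append("sub_13306")
        # phase 3 does nothing extra
    log.append("every_tick")                         # 0x12E74, 0x0FF84
```

Over 16 interrupts the display and buttons are serviced 4 times each, and the analog routine only once.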

There are a lot of calls, and if we look at them we can clearly identify their function:

// Suspect from IO this is output to 7 seg
	12a13        d45d        CMP0            0FFE5DH
	12a15        f1          CLRB            A
	12a16        61f8        SKNZ            
	12a18        e1          ONEB            A
	12a19        9d5d        MOV             0FFE5DH,A
	12a1b        d45d        CMP0            0FFE5DH
	12a1d        dd24        BZ              $12A43H
	12a1f        8f46f6      MOV             A,!0F646H
	12a22        d448        CMP0            0FFE48H
	12a24        dd0a        BZ              $12A30H
	12a26        36b4f6      MOVW            HL,#0F6B4H
	12a29        31d50e      BF              [HL].5H,$12A3AH
	12a2c        51ff        MOV             A,#0FFH
	12a2e        ef0a        BR              $12A3AH
	12a30        d446        CMP0            0FFE46H
	12a32        dd06        BZ              $12A3AH
	12a34        36b4f6      MOVW            HL,#0F6B4H
	12a37        31f3f2      BT              [HL].7H,$12A2CH
	12a3a        712305      CLR1            P5.2H // These are the common cathodes
	12a3d        713205      SET1            P5.3H
	12a40        9d06        MOV             P6,A // P6 is the 7SEG
	12a42        d7          RET       

	12a43        8f47f6      MOV             A,!0F647H
	12a46        d449        CMP0            0FFE49H
	12a48        dd0a        BZ              $12A54H
	12a4a        36b4f6      MOVW            HL,#0F6B4H
	12a4d        31d50e      BF              [HL].5H,$12A5EH
	12a50        51ff        MOV             A,#0FFH
	12a52        ef0a        BR              $12A5EH
	12a54        d447        CMP0            0FFE47H
	12a56        dd06        BZ              $12A5EH
	12a58        36b4f6      MOVW            HL,#0F6B4H
	12a5b        31f3f2      BT              [HL].7H,$12A50H
	12a5e        712205      SET1            P5.2H // common cathodes flip
	12a61        713305      CLR1            P5.3H
	12a64        9d06        MOV             P6,A
	12a66        d7          RET   

From the IO, we can see this is likely to be updating the 7 segment LED displays.

The method used – of setting one common cathode, then the segments for that half, then the other common cathode, then the segments for that half – means that this needs to be called relatively frequently otherwise flicker will be detected by the eye.

// Button detection and debounce?
	132aa        31220217    BT              P2.2H,$132C5H // Button A
		132ae        4029e0ff    CMP             !0E029H,#0FFH
		132b2        dd24        BZ              $132D8H
		132b4        a029e0      INC             !0E029H
		132b7        4029e007    CMP             !0E029H,#7H
		132bb        df1b        BNZ             $132D8H
		132bd        cf29e0ff    MOV             !0E029H,#0FFH
		132c1        e445        ONEB            0FFE45H
		132c3        ef13        BR              $132D8H
	132c5        d529e0      CMP0            !0E029H
	132c8        dd0e        BZ              $132D8H
	132ca        b029e0      DEC             !0E029H
	132cd        4029e0f8    CMP             !0E029H,#0F8H
	132d1        df05        BNZ             $132D8H
	132d3        f529e0      CLRB            !0E029H
	132d6        f445        CLRB            0FFE45H

	132d8        31320216    BT              P2.3H,$132F2H // Button B
		132dc        402ae0ff    CMP             !0E02AH,#0FFH
		132e0        dd23        BZ              $13305H
		132e2        a02ae0      INC             !0E02AH
		132e5        402ae007    CMP             !0E02AH,#7H
		132e9        df1a        BNZ             $13305H
		132eb        cf2ae0ff    MOV             !0E02AH,#0FFH
		132ef        e444        ONEB            0FFE44H
		132f1        d7          RET             
	132f2        d52ae0      CMP0            !0E02AH
	132f5        dd0e        BZ              $13305H
	132f7        b02ae0      DEC             !0E02AH
	132fa        402ae0f8    CMP             !0E02AH,#0F8H
	132fe        df05        BNZ             $13305H
	13300        f52ae0      CLRB            !0E02AH
	13303        f444        CLRB            0FFE44H
	13305        d7          RET 

Again, from the IO, we can see that the buttons are being polled. There’s also some counters changing – probably some debounce.
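Here is how I read the counter logic for one button, as a Python sketch. The thresholds 7, 0xFF and 0xF8 are from the listing; treating a low pin as “pressed” and the flag as “button down” are assumptions, as are the names.

```python
class Debouncer:
    """Toy model of the per-button counter (like 0xE029) and flag (like 0xFFE45)."""

    def __init__(self):
        self.count = 0
        self.pressed = False

    def sample(self, pin_high):
        if not pin_high:
            # Pin low: count up, latch the flag after 7 samples in a row
            if self.count != 0xFF:
                self.count += 1
                if self.count == 7:
                    self.count = 0xFF
                    self.pressed = True
        else:
            # Pin high: count back down, clear the flag 7 samples below 0xFF
            if self.count != 0:
                self.count -= 1
                if self.count == 0xF8:
                    self.count = 0
                    self.pressed = False
        return self.pressed
```

Seven low samples in a row set the flag, and it then takes seven high samples to clear it again, so short glitches in either direction are ignored.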

// Analog something or other
	0ffcb        f1          CLRB            A
	0ffcc        71042a      MOV1            CY,0FFE2AH.0H
	0ffcf        7189        MOV1            A.0H,CY
	0ffd1        70          MOV             X,A
	0ffd2        f1          CLRB            A
	0ffd3        710ce3      MOV1            CY,ADIF
	0ffd6        61dc        ROLC            A,1
	0ffd8        6158        AND             A,X
	0ffda        dd23        BZ              $0FFFFH
	0ffdc        710be3      CLR1            ADIF
	0ffdf        4031ff07    CMP             !ADS,#7H
	0ffe3        8d1f        MOV             A,ADCRH
	0ffe5        df0b        BNZ             $0FFF2H
	0ffe7        9f0af6      MOV             !0F60AH,A
	0ffea        717b30      CLR1            ADCS
	0ffed        ce3106      MOV             ADS,#6H
	0fff0        ef09        BR              $0FFFBH
	0fff2        9f0bf6      MOV             !0F60BH,A
	0fff5        717b30      CLR1            ADCS
	0fff8        ce3107      MOV             ADS,#7H
	0fffb        00          NOP             
	0fffc        717a30      SET1            ADCS
	0ffff        d7          RET     

This does something with one of the ADC inputs. I’ve not seen anything of interest that uses analog yet, so I’ll not look into this more currently. It could be the input voltage (the board can alarm on this) or PSTN line voltage.

There aren’t many other clearly identifiable subroutines, but these few give me confidence that this interrupt handler is mostly handling periodic IO.

This program structure of calling time-sensitive IO using a timer interrupt is fairly common in embedded systems. It means that IO is serviced regularly, allowing more time consuming (or non deterministic time) processing to happen outside of the interrupt in the main code. It means there are a lot of buffers and global variables to pass data back and forth that we can look at and play with.

From a security perspective, it can also produce problems. If we can stall something in the timer interrupt – by buffer overflow, bad input or so on – it can be possible to lock up a device. I’d hope that the board used a watchdog timer to recover from this though.

Reverse engineering a CSL Dualcom GPRS part 14 – interpreting disassembly

A few posts ago, we managed to disassemble the firmware from the CSL Dualcom site.

The entire listing is available here as a zip. There is a lot of blank space in the file which needs to be trimmed down, but for reference this file will be left as-is.

I have also put the code on github. It’s not ideal as you can’t use the web interface to show the code/diffs, but it is a good way of recording history as mistakes will be made.

The process of turning disassembly into something useful isn’t easy. I find the most useful approach is to find the very commonly called subroutines first, and work out what they do. If they aren’t obvious, skip them.

The raw listing doesn’t show us the frequency with which subroutines are called. Python, to the rescue again. We trim out the fluff from the file. 0x1000-0x2000 is the string table, which the disassembler doesn’t know about and tries to turn into code. The processor has a mirrored address structure so everything in the range 0x00000. Everything above 0x1FFFF isn’t the code – it’s special function registers and a mirror area.

Now we run the code through a small script:

from collections import Counter

call_addresses = []

with open('/Users/andrew/data/Disassemble1.txt', 'r') as datafile:
    for row in datafile:
        # Rows with CALL in
        if 'CALL' in row:
            # Get value after CALL, strip whitespace and line endings
            # ! are for addressing mode, the trailing H isn't wanted
            address = row.split('CALL')[1].replace('!', '').strip()
            if address.endswith('H'):
                address = address[:-1]
            call_addresses.append(address)

# Builds a dict of frequencies
freqs = Counter(call_addresses)

# Whack it out to CSV for copy and paste, most frequent first
for address, count in freqs.most_common():
    print(address + ',' + str(count))

And we end up with CSV of the frequency of calls:

0E1B2,182
0E541,160
0E1D1,143
0D764,120
0DC44,105
0DED3,82
0DACC,79
0E322,68

0xE1B2 looks like a good place to start.

0e1ac        bfcce0      MOVW            !0E0CCH,AX
0e1af        c2          POP             BC
0e1b0        61ec        RETB   
// Start of sub
0e1b2        4c01        CMP             A,#1H
0e1b4        df05        BNZ             $0E1BBH
0e1b6        63          MOV             A,B
0e1b7        ec01e100    BR              !!0E101H
0e1bb        4c02        CMP             A,#2H
0e1bd        df05        BNZ             $0E1C4H
0e1bf        63          MOV             A,B
0e1c0        ec47e100    BR              !!0E147H
0e1c4        4c03        CMP             A,#3H
0e1c6        63          MOV             A,B
0e1c7        61f8        SKNZ            
0e1c9        ec6ce100    BR              !!0E16CH
0e1cd        ecdfe000    BR              !!0E0DFH
0e1d1        fdc404      CALL            !4C4H
0e1d4        0233bd      ADDW            AX,!0BD33H
0e1d7        2013        SUBW            SP,#13H
0e1d9        72          MOV             C,A

First thing to be aware of is that disassembly is not an exact science. Sometimes you will see an address CALLed but you can’t find it. This probably means that the disassembly is misaligned in that area – look a couple of addresses above and below. This is not the case here.
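One way to spot these misaligned areas is to list every CALL target that never appears as an instruction address. The Python below is a sketch of that check, assuming the three-column layout of the listings in this post (address, opcode bytes, mnemonic and operands); the function name and regular expression are my own.

```python
import re

# address, opcode bytes, mnemonic, then optional operands
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]{5})\s+[0-9a-fA-F]+\s+(\S+)\s*(.*)$")


def find_orphan_calls(lines):
    """Return CALL targets that do not start any instruction in `lines`."""
    addresses = set()
    targets = set()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        addresses.add(int(match.group(1), 16))
        if match.group(2) == "CALL":
            operand = match.group(3).split("//")[0].strip()
            operand = operand.lstrip("!$").rstrip("H")
            try:
                targets.add(int(operand, 16))
            except ValueError:
                pass  # operand is a register or an already-renamed label
    return sorted(targets - addresses)
```

This only makes sense on the whole listing. Anything it reports is either a misaligned region or a target in an area that has been trimmed out.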

We can see immediately above 0xE1B2 there is a POP and RETB, the end of a subroutine.

To work out what a sub does, it helps to know what parameters are passed to it and how. If we look through for all the CALLs to 0xE1B2, we get an idea of what is going on:

03d31        530d        MOV             B,#0DH
03d33        e1          ONEB            A
03d34        fcb2e100    CALL            !!0E1B2H

B is always set to a value over quite a wide range. It’s probably a number or an ASCII character.

A is set to either 0, 1, 2 or 3. This is likely some kind of option or enumeration.

Going back to the subroutine, we can see how this could work:

0e1b2        4c01        CMP             A,#1H
0e1b4        df05        BNZ             $0E1BBH
	0e1b6        63          MOV             A,B		
	0e1b7        ec01e100    BR              !!0E101H	// If A = 1, branch to 0xE101
0e1bb        4c02        CMP             A,#2H
0e1bd        df05        BNZ             $0E1C4H
	0e1bf        63          MOV             A,B
	0e1c0        ec47e100    BR              !!0E147H	// If A = 2, branch to 0xE147
0e1c4        4c03        CMP             A,#3H
0e1c6        63          MOV             A,B
0e1c7        61f8        SKNZ            
	0e1c9        ec6ce100    BR              !!0E16CH 	// If A = 3, branch to 0xE16C
0e1cd        ecdfe000    BR              !!0E0DFH		// If A = 0, branch to 0xE0DF

So we are branching to other addresses based on the parameter in A.

There’s one thing to note about this function. There is no immediate RET instruction there. These have to be dealt with in the code that is branched to.

Let’s look at 0xE101.

0e101        77          MOV             H,A
0e102        8efa        MOV             A,PSW
0e104        9803        MOV             [SP+3H],A
0e106        67          MOV             A,H
0e107        717bfa      DI              
0e10a        c3          PUSH            BC
0e10b        dbb6e0      MOVW            BC,!0E0B6H
0e10e        48b8e4      MOV             0E4B8H[BC],A
0e111        a2b6e0      INCW            !0E0B6H
0e114        afb6e0      MOVW            AX,!0E0B6H
0e117        440a04      CMPW            AX,#40AH
0e11a        dc04        BC              $0E120H
0e11c        f6          CLRW            AX
0e11d        bfb6e0      MOVW            !0E0B6H,AX
0e120        8f0401      MOV             A,!SSR02L
0e123        31631e      BT              A.6H,$0E144H
0e126        362201      MOVW            HL,#122H
0e129        71a2        SET1            [HL].2H
0e12b        71b2        SET1            [HL].3H
0e12d        dbb4e0      MOVW            BC,!0E0B4H
0e130        49b8e4      MOV             A,0E4B8H[BC]
0e133        9e44        MOV             SIO10,A
0e135        a2b4e0      INCW            !0E0B4H
0e138        afb4e0      MOVW            AX,!0E0B4H
0e13b        440a04      CMPW            AX,#40AH
0e13e        dc04        BC              $0E144H
0e140        f6          CLRW            AX
0e141        bfb4e0      MOVW            !0E0B4H,AX
0e144        c2          POP             BC
0e145        61ec        RETB            

It’s pretty long and complex. But there is one really key piece of info in there – the special function register SSR02L. Looking to the 78K0R data sheet, this is “Serial status register 02”. It’s pretty likely this function concerns serial. It has a return at the end as well.

If we look at 0xE16C, this has a reference to SSR12L. Another serial port.

It’s quite likely that this function concerns either reading or writing to the various serial ports on the board. I’ve not looked at it in enough depth to know exactly what it is doing, so we’ll do the following:

// B has char 
// A has 0,1,2,3 - probably different serial ports
// Return is in the branches
:sub_Serial_UnknownA_e1b3           
	0e1b2        4c01        CMP             A,#1H
	0e1b4        df05        BNZ             $0E1BBH
		0e1b6        63          MOV             A,B
		0e1b7        ec01e100    BR              !!0E101H // A = 1
	0e1bb        4c02        CMP             A,#2H
	0e1bd        df05        BNZ             $0E1C4H
		0e1bf        63          MOV             A,B
		0e1c0        ec47e100    BR              !!0E147H // A = 2
	0e1c4        4c03        CMP             A,#3H
	0e1c6        63          MOV             A,B
	0e1c7        61f8        SKNZ            
		0e1c9        ec6ce100    BR              !!0E16CH // A = 3
	0e1cd        ecdfe000    BR              !!0E0DFH // A = 0

What have I done here?

  • Called the sub :sub_Serial_UnknownA_e1b3. The : denotes that this is the actual sub. It is something to do with serial – the first unknown sub to do with serial. I have put the address on the end just to keep track of where it is.
  • Search and replace on !!0E1B2H with this new name. “sub_Serial_UnknownA_e1b3” now shows instead of the raw address – when I see it called I know it is something to do with serial.
  • Put some brief notes above the sub so I know what it is doing.
  • Indented branches so function is a little clearer
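The search and replace step is easy to script once there are more than a handful of names. A minimal sketch in Python, assuming call operands always appear in the !!0XXXXH form seen above; the function name and the dictionary of labels are illustrative.

```python
def rename_calls(listing, labels):
    """Replace raw !!0XXXXH operands with names from `labels` (address -> name)."""
    for address, name in labels.items():
        raw = "!!%05XH" % address   # e.g. 0xE1B2 -> "!!0E1B2H"
        listing = listing.replace(raw, name)
    return listing


labels = {0xE1B2: "sub_Serial_UnknownA_e1b3"}
line = "03d34        fcb2e100    CALL            !!0E1B2H"
print(rename_calls(line, labels))
```

Keeping the names in one dictionary means a sub can be renamed later, once its purpose is clearer, by re-running the script against the raw listing.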

I’m now going to do similar for the other high-frequency subs. Again, I am building up a broad picture, not going into extreme depth at this stage.

Reverse engineering a CSL Dualcom GPRS part 13 – checking the SIM card

The ICCID is written on the outside of the Dualcom GPRS, stored in the EEPROM, read in from the GPRS modem, and read in from EEPROM immediately before a long, random looking, string is sent to a remote server. It seems quite important.
ICCID on case

The Dualcom board also frequently checks for received SMS.

It might be worth taking a look at the SIM to see what is on it.

From previous projects, I have an Omnikey Cardman 5321 card reader. This reads both RFID cards and smart cards. We can put the SIM card in a carrier and read it with this device.
SIM from Dualcom
photo 2

SimSpy II is a free utility which can read most data from SIM cards including ICCID, IMSI, Kc (which can be used to decrypt communications), SMS messages and more.

Unfortunately, nothing too interesting comes up. The card never seems to have stored any SMS. There are no numbers in the phone book. We might end up coming back to this at some point.

SIM data

SIM data

Reverse engineering a CSL Dualcom GPRS part 12 – board buzz out

We’ve now got the code disassembled. The disassembler has no concept of what is connected to the microcontroller though, so we need to work out which ports/pins/peripherals are used by which parts of the board. What is P11.1? What about P7? These are all I/O, but meaningless without looking at the physical board.

The best way of doing this is using a continuity tester and buzzing the board out. It’s not worth exhaustively mapping out the PCB at this stage – just the interesting bits. There might even be some mistakes.

When doing this, I find two tools are essential:

  • A meter with a quick continuity beep. Some have a lag. I’ve not got time for that. I use my Amprobe AM-140-A for this – it’s very quick, if a bit scratchy sounding.
  • Fine probes. I really like the Pomona 6275 probes – they are very sharp and very small.

Pop one end on the peripheral and just brush the other along the sides of the microcontroller. Not too hard or you risk dragging metal between the pins. It makes it very quick to find where things are going.

Watch out for transistors and resistors in the way though, e.g. the inputs from the alarm are likely transistor buffered, and some of the peripherals might have resistors to divide voltage.

IC8

IC8 is the socketed 93C86 EEPROM.

DI -> P111
DO -> P20
CLK -> P142
CS -> P141

P111 means port 11, bit 1.
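
When cross-referencing these notes against the disassembly, it can help to split the names mechanically. This is a small sketch, assuming the last digit is always the bit number (0 to 7) and everything before it is the port number. The function name split_port_name is made up for illustration.

```python
def split_port_name(name: str) -> tuple:
    """Split a pin name such as 'P111' or 'P14.5' into (port, bit)."""
    digits = name.upper().lstrip("P").replace(".", "")
    if len(digits) < 2 or not digits.isdigit():
        raise ValueError(f"not a port/bit name: {name!r}")
    port, bit = int(digits[:-1]), int(digits[-1])
    if bit > 7:
        raise ValueError(f"bit number out of range in {name!r}")
    return port, bit

for pin in ("P111", "P20", "P142", "P141"):
    print(pin, "->", split_port_name(pin))
```

A bare port name such as P7 has no bit number, so the sketch rejects it rather than guessing.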

IC11

IC11 is the SMT 93C86 EEPROM.

DI, DO, CLK are shared with IC8

DI -> P111
DO -> P20
CLK -> P142
CS -> P145

GPRS Modem

Pin 14 – device control on/off -> P05/TI05/TO05
Pin 21 – GPIO -> P47
Pin 32 – DSR1 -> P26/ANI6
Pin 33 – LED control signal -> SVC LED (not to micro)
Pin 37 – DTR1 -> P04/SCK10/SCL10
Pin 40 – CTS1 -> P21/ANI1
Pin 41 – DTM1 -> P02/SO10/TxD1
Pin 42 – DFM1 -> P03/SI10/RxD1/SDA10

IC02

This is the PSTN modem (Si2401)

Pin 7 – CTS_ -> P10/SCK00
Pin 6 – TXD -> 52 P12/SO00/TxD0
Pin 5 – RXD -> 53 P11/SI00/RxD0

Buttons

Button A -> P22
Button B -> P23

7 Segment

Segments ->  P60/P61/P62/P63/P64/P65/P66/P67

RH common cathode -> P53

LH common cathode -> P52

LEDs

GSM -> P51/INTP2
PSTN -> P50/INTP1

Programming header

1 -> VCC
2 -> VSS
3 -> P40/TOOL0
4 -> P41/TOOL1
5 -> RESET
6 -> FLMD0
7 -> Switched to ground via reset

(this looks like it would work with a standard Renesas debug tool – the MiniCube2).

I’m not bothered about the other parts at the moment. We can come back to them if we need to.

Next step is to identify a few basic functions inside the disassembled code, probably starting with EEPROM reading.

Reverse engineering a CSL Dualcom GPRS part 11 – disassembling firmware

I find reverse engineering is about building up a broad picture instead of working in-depth on any one aspect of the system. Dip into one bit, check what you are seeing is reliable and makes sense, dip into another area to get more detail, repeat.

We’ve done this with the logic trace – seen the long string sent, looked at the EEPROM access, checked what this is in the .prm file, and seen that it is the ICCID. Now I would like to look at the firmware on the device.

CSL Dualcom have a v353 hex file available for download on their site. The board I am looking at uses v202. That’s a big difference.
I can see a few paths here:

  1. Get a v202 (or anything nearer to v202 than v353) hex file downloaded. A quick Google doesn’t help here.
  2. Upgrade the board I have to v353. This would require the EEPROM to be updated, possibly other changes. I don’t want to break this board.
  3. Recover the v202 firmware from the board I have. There is a programming header – it could be possible to get the code off. But it may not be possible.
  4. Live with the difference and hope that there is enough consistency between the two to be helpful.

I’m going to run with 4. It’s the lowest effort, and I think it will work. The EEPROM structure between some of the different boards seems identical – this is backed up by there only being a single Windows programming tool for the board, regardless of firmware version. In my experience, smaller embedded system firmware is quite consistent as new functionality is added, even if the toolchain changes (this really doesn’t hold true when you move up to anything bigger running Linux).

What can we do with the firmware? Well, we need to disassemble it. What does that actually mean? It means changing the raw machine code back into human-readable assembly language i.e. F7 becomes CLRW BC (clear register BC). This probably doesn’t meet everyone’s idea of human readable, but it is a lot better than machine code.
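
To make the idea concrete, here is a toy lookup in Python. It only knows a handful of one-byte opcodes, all copied from the listings in this series, and it is nowhere near a real 78K0R disassembler: it ignores operands and multi-byte instructions entirely. The names ONE_BYTE_OPCODES and decode_one_byte are made up for illustration.

```python
# One-byte opcodes only, copied from the listings in this series.
# Anything not in the table is shown as unknown.
ONE_BYTE_OPCODES = {
    0xF7: "CLRW BC",
    0x17: "MOVW AX,HL",
    0x37: "XCHW AX,HL",
    0x63: "MOV A,B",
    0x70: "MOV X,A",
    0x73: "MOV B,A",
    0x80: "INC X",
    0x8B: "MOV A,[HL]",
    0xA7: "INCW HL",
}

def decode_one_byte(data: bytes) -> list:
    """Look up each byte as if it were a complete one-byte instruction."""
    return [ONE_BYTE_OPCODES.get(b, f"?? ({b:02X})") for b in data]

for line in decode_one_byte(bytes.fromhex("F7 17 70 80")):
    print(line)
```

Decoding byte by byte like this goes wrong as soon as an instruction carries operand bytes, because the operands get misread as opcodes. That is the main reason a proper disassembler is needed.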

Some microcontrollers (like the ATmega series) have machine code that is easy to understand and even read. 90% of instructions are a fixed length (16bits). The number of instructions is limited and there are only limited addressing modes. With some practice you can make sense of machine code in a text editor.

The 78K0R is not like this.

MOVs

All of the enclosed red cells are MOV instructions. That’s a lot of them. There are 4 of these maps, with a total of 1024 cells. ~950 of them are populated.

There are several addressing modes and the instruction length varies from 8bits to 32bits. This makes the machine code incredibly hard to read.

We need an automated disassembler. Google isn’t much help here – these microcontrollers aren’t as popular as x86, ARM, AVR, or PIC.

There are two toolchains widely available for these processors: Renesas Cubesuite and IAR Embedded Workbench. There is a chance that one of these has either a disassembler or a simulator that allows a hex file to be loaded.

After a lot of messing around, it appears that Renesas Cubesuite can load the hex, disassemble it, and also simulate it.

1. Download Renesas Cubesuite and install it (Windows only)

2. Start Cubesuite.

Renesas Cubesuite

3. Go to Project -> Create new project.

4. Change the microcontroller to the “uPD78F1154_80” (the 80 pin variant)

Microcontroller

6. Once the project has been created, in the “Project Tree” on the right hand side, right click on “Download files” and click “Add”

Download files

7. Find your hex or bin file and load it (hex is preferable as it seems the disassembler takes into account the missing address space).

8. Go to Debug -> Build and Download

9. The simulator starts up and you can see the disassembled code.

0e7e3        bd22        MOVW            0FFE22H,AX
0e7e5        17          MOVW            AX,HL
0e7e6        70          MOV             X,A
0e7e7        80          INC             X
0e7e8        61f8        SKNZ            
0e7ea        5500        MOV             D,#0H
0e7ec        3149        SHL             A,4H
0e7ee        73          MOV             B,A
0e7ef        fa22        MOVW            HL,0FFE22H
0e7f1        8b          MOV             A,[HL]
0e7f2        fa22        MOVW            HL,0FFE22H
0e7f4        a7          INCW            HL
0e7f5        37          XCHW            AX,HL
0e7f6        bd22        MOVW            0FFE22H,AX
0e7f8        17          MOVW            AX,HL
0e7f9        70          MOV             X,A
0e7fa        80          INC             X
0e7fb        dd07        BZ              $0E804H
0e7fd        618d        XCH             A,D

Great – now we can get to work trying to see what the processor is doing.

The next post will likely be buzzing the board out to find out which I/O is connected to what so we can make some sense of the code.

Reverse engineering a CSL Dualcom GPRS part 10 – analysing the logic trace 2

Last post, we looked at the comms between the board and the GPRS modem. There was a long, interesting string sent to a remote server:

LjS1WQjg8FHqR1a4P4DVsjO8eUITXY6ifHPlaFhkZ2SJ

When we look out to the rest of the logic trace, we can see that the EEPROM is accessed exactly as this begins:

EEPROM access

From this view, it might look like the EEPROM access is too late for it to be used to generate that long string. However, in microcontroller terms, there is ~0.3ms between the end of the EEPROM access and the start of the first character on serial (I suspect the ‘1’ is a start character). I’ve not checked the crystal speed, but it’s between 2MHz and 20MHz. Even at 2MHz, that’s 600 instructions – plenty of time to act on the EEPROM data.
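
The arithmetic behind that estimate, as a quick sketch. It treats one clock cycle as roughly one instruction, which is the assumption behind the 600 figure. Real instructions can take more than one cycle, so the true count would be lower. The function name cycles_available is made up for illustration.

```python
def cycles_available(gap_seconds: float, clock_hz: float) -> int:
    """Clock cycles that fit into a gap of the given length."""
    return round(gap_seconds * clock_hz)

GAP = 0.3e-3  # ~0.3ms between the end of the EEPROM access and the first serial character

for clock_mhz in (2, 20):
    cycles = cycles_available(GAP, clock_mhz * 1e6)
    print(f"{clock_mhz}MHz -> about {cycles} cycles")
```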

We now need to zoom in on the EEPROM data and see what is happening:

Screen Shot 2014-04-02 at 22.04.43

The trace is maybe a little confusing here. The green bordered binary is the DI line – the microcontroller to the memory. The yellow bordered binary is the DO line – the memory to the microcontroller. The Saleae Logic software has no decoder to deal with Microwire, so we need to use the SPI decoder set with a bit-width to deal with both the receive and transmit sides.

According to the data sheet for the EEPROM, this should be 29 clock cycles. For whatever reason there is an additional clock cycle here though – the very short transition in the middle of the trace. So we set the SPI decoder to 30bits.

The first 3bits of the green bordered binary are 0b110 – a read command. The next 10bits are the address – 0b00 1001 1000 – i.e. 0x098. After this point, we can ignore the green bordered binary.

The last 16bits of the yellow bordered binary are the value from memory – 0x4489.

We should be able to find this in the .prm file. 0x098 is 152. The prm file has a byte per row, and 2 bytes on the first row, so we should need to go to row 304 of the file – and there we go – 89, 44. Perfect.
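
The decoding steps above can be sketched in Python. The bit strings below are rebuilt from the values quoted in this post (read command, address 0x098, data 0x4489). The padding bits are placeholders, not copied from the real capture, and decode_read_frame is a made-up name.

```python
def decode_read_frame(di_bits: str, do_bits: str) -> tuple:
    """Decode one 30-bit capture of an EEPROM read into (address, value)."""
    if len(di_bits) != 30 or len(do_bits) != 30:
        raise ValueError("expected 30 bits on each line")
    if di_bits[:3] != "110":
        raise ValueError("not a read command")
    address = int(di_bits[3:13], 2)  # next 10 bits on DI are the word address
    value = int(do_bits[-16:], 2)    # last 16 bits on DO are the data word
    return address, value

# Placeholder padding around the values quoted in the post.
di = "110" + "0010011000" + "0" * 17
do = "0" * 14 + format(0x4489, "016b")

address, value = decode_read_frame(di, do)
prm_row = address * 2  # same arithmetic as the row 304 worked out above
print(hex(address), hex(value), prm_row)
```

The word comes back as 0x4489 while the .prm file lists 89 then 44, so the low byte of each word is stored first.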

If we continue going through the trace, we read the following addresses:

0x098
0x098
0x099
0x099
0x09A
0x09A
0x09B
0x09B
0x09C
0x09C

Strangely each of the addresses is read twice. Why? Not sure at this time.

What data do we get back?

8944 1000 3006 3711 7619

This is the ICCID – the unique ID assigned to the SIM card in the GPRS modem.
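
Putting the five words back together can be sketched like this. Only 0x4489 was read directly off the trace in this post. The other four word values are inferred from the data quoted above, assuming they use the same low-byte-first order. The function name words_to_digits is made up for illustration.

```python
def words_to_digits(words: list) -> str:
    """Join 16-bit words as hex digit pairs, low byte first."""
    return "".join(f"{w & 0xFF:02X}{w >> 8:02X}" for w in words)

# 0x4489 is the word read from address 0x098. The rest are inferred
# from "8944 1000 3006 3711 7619", assuming the same byte order.
words = [0x4489, 0x0010, 0x0630, 0x1137, 0x1976]

iccid = words_to_digits(words)
print(iccid)
```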

Could the board be using the ICCID as a key to encrypt the data?

Reverse engineering a CSL Dualcom GPRS part 9 – analysing the logic trace

We’ve captured a trace of:

  • The serial comms to the GPRS modem
  • The serial comms to the PSTN modem
  • The SPI comms to the EEPROM

Now we can take a look at the data in these traces. Let’s start with the communications with the GPRS modem.

The serial comms to the GPRS modem are normal ASCII characters, and it uses the AT command set. The GPRS modem is a Wavecom board – we can download the AT command set documentation.

Stepping through the logic trace and transcribing the commands, we end up with something like this:

TX | RX | Command | Notes
0202 Dualcom GPRS> | | | Restart
AT | AT OK | |
AT&F | AT&F OK | Set to Factory Defined Configuration Profile |
ATE0 | ATE0 OK | Command echo off | DCE does not echo characters during command state and online command state
ATX0 | OK | Call Progress Monitoring Control | Busy detection off
AT&D2 | OK | Circuit 108 (DTR) Response | When in on-line data mode, deassert DTR closes the current connection and switch to on-line command mode.
AT+CMEE=1 | OK | Mobile Equipment Error | Enable +CME ERROR: <err> result code and use numeric <err> values.

The whole CSV file of this trace can be downloaded here.

What can we see going on?

  1. The board is reset and some basic settings are sent (don’t echo commands, use DTR to close connection, turn on numeric errors).
  2. Set up SMS messaging to store messages on the ME (the GPRS modem, as compared to the SIM). There seems to be room for 100 messages in the modem.
  3. Set up three PDP contexts. I think these are essentially GPRS connections. The first two are generic and have no username/password – they might be Vodafone APNs. The third is csldual.com – likely a private APN. An APN is a gateway between a GPRS connection and an IP network.
  4. Set up three Internet accounts. These are credentials used with the PDP contexts. The generic ones have no username or password, but the csldual.com one does – dualcomgprsxx and QO6806xx.
  5. The board periodically checks for network registration and signal strength. The signal strength is shown on the 7-seg display when idling. The GPRS modem is connected to the home network with decent signal strength.
  6. The board then repeatedly scans the first 15 SMS slots for messages. There are no messages, so we get errors back. This is quite interesting – what is it that gets sent to the board as SMS?
  7. The board then tries to connect to a private IP address/port 172.16.6.20:8965 using the csldual.com APN. The first time this is attempted it fails with error code 094, which isn’t listed in the documentation (or on the wider Internet…)
  8. The board then tries to connect to the same IP again. This time it succeeds, and some data is sent back and forth. This is a string of ASCII text which looks, from a human perspective, fairly random.

The data looks as follows (sent on left, received on right):

DC4 HS87
r (immediately after response above)
LjS1WQjg8FHqR1a4P4DVsjO8eUITXY6ifHPlaFhkZ2SJ EE1404,0122,3343,’6’
‘3’ OK

What things are of note in this trace then?

  • The APN and the username and password used are constant across several devices and the Sample.prm I have looked at. It seems curious to require a password but for it not to vary.
  • SMS messages are checked for frequently, suggesting something important is received by SMS.
  • There is no notion of time/counters/nonce in any of the communications.
  • There doesn’t seem to be any key exchange.
  • There doesn’t seem to be any authentication of the APN/server with the GPRS Dualcom board.

This has raised a number of questions:

  • What data is used to authenticate a given APN? If the username and password are constant, is the ICCID and other data used?
  • Can anyone send SMS to the GPRS modem, or is there some form of blocking performed by the network in the other direction?
  • Whilst the notion of time/counters/nonce isn’t essential for strong/good encryption, it does make things easier.
  • A common failing of embedded systems that do use encryption is that they don’t change the key. Encryption with a fixed, known key is not really much better than no encryption.
  • It’s been possible to spoof a cell site for a few years now using Software Defined Radio. If the APN/server can be spoofed, then the signalling might stop working.
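
To illustrate why the missing counter/nonce and a fixed key matter, here is a toy example in Python. It is not the scheme the Dualcom uses, which is still unknown at this point. The cipher, key and message are all made up purely to show the effect.

```python
import hashlib

def toy_encrypt(key: bytes, message: bytes, nonce: bytes = b"") -> bytes:
    """XOR the message with a SHA-256-derived keystream. Illustration only."""
    stream = b""
    counter = 0
    while len(stream) < len(message):
        block = hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        stream += block
        counter += 1
    return bytes(m ^ s for m, s in zip(message, stream))

key = b"fixed-example-key"  # hypothetical fixed key
alarm = b"ALARM ZONE 1"     # hypothetical message

# Fixed key, no nonce: the same message always encrypts to the same bytes,
# so a captured message can be recognised or replayed later.
assert toy_encrypt(key, alarm) == toy_encrypt(key, alarm)

# With a per-message value mixed in, repeats no longer look the same.
assert toy_encrypt(key, alarm, b"\x00\x01") != toy_encrypt(key, alarm, b"\x00\x02")
print("same key, no nonce: identical output every time")
```

If the traffic between board and server behaves like the first case, gathering more traces should show repeated strings for repeated events.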

I’m not sure what the next step is:

  • Gather more traces and see if any patterns can be spotted in the data going between the board and server.
  • Look at EEPROM accesses whilst the GPRS modem is being used.
  • Disassemble the firmware and see if we can spot anything interesting.

I’ll see what takes my fancy next time I sit down and look at it.