What3Words – The Algorithm

Posted on May 2, 2021 by cybergibbons

What3Words is a widely promoted system that is used for sharing a location using just three words. For example, ///wedge.drill.mess is located in Hyde Park.The globe is divided into 3m squares, each of which has a unique three word address to identify it.

They claim that mistakes – like changing a word into a plural, or adding an extra character – would result in a location so far away that you would detect the mistake. They have themselves stated “that people confuse plurals only about 5% of the time when hearing them read out loud”. This means the number of squares where this could be a problem is crucial. What3Words say the odds are “1 in 2.5 million” – but they seem to be under 1 in 100 in cities in the UK.

This might be OK if you are ordering takeaway, but if the application is directing emergency services to your location, these errors could cause serious problems.

If you just want the outcomes of this work, this shorter blog post might be a better read.

I’m going to describe the algorithm first and then go through the issues.

The Algorithm

What3Words can work entirely offline. A position is converted into words using an algorithm and a fixed word list of 40,000 words. This is described in their patent with enough level of detail that you can understand why it is fundamentally broken.

From a high level, the following steps are required:

A lat/lon is converted into a integer coordinate XYxy.
XYxy is converted into a single integer n representing position.
n is shuffled into m to prevent words in two adjacent squares being the same.
m is converted into i,j,k which index into the word list.
i,j,k are converted into words.

Convert Lat/Lon to XYxy

The first step is to convert the decimal lat/lon into an integer XYxy location. This is just another type of grid reference with nothing complex going on.

The globe is divided up into large cells described by XY.

The Y direction starts at 0 at latitude -90° (the South Pole) and increases to 4,320 at latitude +90° (the North Pole). The X direction start at 0 at longitude -180° and increases to 8,640 at longitude +180°. This gives a total of 37,324,800 cells.

X increases as we move right, Y increases as we move up.

If we look in central London, we can see the scale of these cells. They are 4638m high and around 2885m across in London.

Within each of these large cells are the individual location squares, described by xy. There are always 1546 in the y direction. The number in the x direction varies from 1545 at the equator down to 1 at the poles. This is because of the curve of the earth.

In our central London cell of X=4316 Y=3396, x can range between 0 and 962, and y between 0 and 1546, giving 1,487,252 positions in the cell.

We now have an integer coordinate to represent anywhere on the earth.

Convert XYxy to n

The next step is to convert XYxy into n. n is a single large number that describes each point on the earth.

First, the offset into the large XY cell is found, using a simple formula:

1546 * x + y

This means that we are just counting upwards in columns as we go across. The bottom left corner of a cell is shown below. If you move up in the y direction, add one. If you move across in the x direction, add 1546. For the green square below, x=3 y=3 results in an offset of 4641.

Of course, this offset is not a unique number on the globe. To make it unique, the starting point – q – is defined for each XY cell.

In our central London square of X=4316 Y=3396, q = 733306450. That means the bottom left corner, n = 733306450. We add the offset on as we move through the cell. The n value of the green square is therefore 733311091 (733306450 + 4641).

The overall formula is:

n = q + 1546 * x + y

Where does q come from? Well, it comes from a lookup table.

Why is this done? q is used to bias lower values of n into cities. Lower values of n are important in the next steps.

We now have a large integer n, describing each point on the earth.

Convert n to m

The n value increases entirely predictably as we move through a cell. This could lead to the words for each position only changing by one word as we move about.

To handle this, a shuffling step is added to convert n to m. This step is also used to reduce the range of words used in cities to shorter and less complex words so that the system is easier to use.

The n values are split into bands. The bands go 0 to 2500^3 -1, 2500^3 to 5000^3 -1 all the way up to 37500-40000^3 -1. 16 bands in total. I’m only going to be looking at the first band from now onwards.

The shuffing within each band is carried out using the following formula:

m = StartOfBand + (BigNumber * PositionInBand) mod SizeOfBand

The BigNumber changes for each band. For band 0, from 0 to 2500^3 – 1, we get the following formula:

m = (9401181443 * n) mod 2500^3

The modulus means that the range of m will never be greater than 2500^3. In combination with the next step, this means that only the first 2,500 words in the dictionary are going to be used in band 0 cells.

For our example cell above with an n of 733311091, we have an m of 9813662131.

We now have a shuffled value m, another large integer.

Convert m to ijk

The single integer m now needs to be converted into three indices i,j,k into the 40,000 long word list.

The maths here is not particularly important. What is important is that this step does nothing to mask or shuffle how the value in m has changed.

For incrementing values of m, the pattern can clearly be seen in the output:

872848985 (39, 949, 955)
872848986 (39, 950, 955)
872848987 (39, 951, 955)
872848988 (39, 952, 955)
872848989 (39, 953, 955)
872848990 (39, 954, 955)
872848991 (40, 0, 955)
872848992 (40, 1, 955)
872848993 (40, 2, 955)
872848994 (40, 3, 955)
872848995 (40, 4, 955)

You can observe that below 2500^3, the three indices will always be 2499 or below. This means that the band 0 of n/m below 2500^3 will only ever result in the first 2,500 words out of the 40,000 being used.

2500^3 -5 (2498, 2494, 2499)
2500^3 -4 (2498, 2495, 2499)
2500^3 -3 (2498, 2496, 2499)
2500^3 -2 (2498, 2497, 2499)
2500^3 -1 (2498, 2498, 2499)
2500^3 0  (2499, 0, 2499)
2500^3 +1 (2500, 0, 1)
2500^3 +2 (2500, 0, 2)
2500^3 +3 (2500, 0, 3)
2500^3 +4 (2500, 0, 4)
2500^3 +5 (2500, 0, 5)

ijk (numbers) to Words

We have our i, j, k. These are three indices into a 40,000 long list of words.

If the word list was:

0 most
1 much
2 took
3 give
4 both
5 fact
6 help
7 ever
8 home
9 long

An i, j, k of 8, 4, 2 would give words of home.both.took with this example.

The word list is sorted so that shorter and less complex words are at the beginning, and longer and more complex ones at the end. The algorithm favours the start of the list for highly populous areas, thereby keeping the words short and simple.

For our example cell with an m of 9813662131, the i,j,k are 1940, 910, 2140 resulting in offers.trail.medium, right down in the bottom right of our large cell.

What does this all mean?

The design of the algorithm has resulted in some significant issues.

m periodicity

The conversion from n to m has some interesting characteristics.

m = (9401181443 * n) mod 2500^3

If we graph m for increasing n, you will spot a pattern.

Every 5th sample, the same pattern is seen. They might look identical in this graph, but each repeat is gradually increasing.

We can overlay these sequences from various values of n and see that the difference between the sequences varies. Sometimes they are close, sometimes they are far apart. This graph below shows m for n=0-4, n=133-137 and n=1023-1028.

After a certain number of sequences, the highest peak hits the limit of 2500^3 and has to wrap back down to 0 due to the modulus. You can see this happen in the graph below. The blue trace shows a continuous sample, increasing by a fixed amount each cycle. The orange one shows where it wraps, with a sharp jump downwards. This happens periodically, causing the position of the sequence to shift.

There are certain offsets in n that produce very closely matched sequences. It’s like two sine waves of slightly different frequencies coming in and out of phase to produce beat frequencies.

Just to illustrate this, we can plot values from the bottom corner of our large cell (X=4316 Y=3396, q = 733306450, n = 733306450) with an offset of 14838278 for n. The difference in m is small (1402) and constant in this range – the two lines overlay each other due to the large swings in value.

Small m offset results in two common words

When the offset between two n positions is from the list above, the difference between m for the two becomes constant and small.

There are over 500 of these n offsets resulting in an m offset of less than 2500. They can be generated automatically using a simple script.

The table below shows the n offset of 7,419,139 resulting in a m offset of 701.

You can see how many of the address 1 and address 2 have one or two words in common.

So for every given cell, there are many other cells at fixed offsets where it is far more likely that two of the words are common. We can test his, choosing the ///daring.lion.race that What3Words use as the home address and testing lots of these offsets. You can see one or two words common in all the rows.

Notice how daring.hint is also very common.

For a given square, n, across all of these specific offsets, there appears to be around a 33% chance that two of the words match. I suspect it’s not 100% simply because I haven’t spotted a bigger pattern.

For a given offset, across a range of n, the probability massively varies. Offset 7,419,139 has around a 60% chance of two words matching. Offset 200,659,195 has a 99% chance of matching two words. The impact of this can be seen below – nearly all squares this far apart have two words in common.

Impact of limiting word list in cities

When talking about daring.lion.race, there are 39,999 other choices for the last word. We expect to find about 40,000 cells across the globe with the first two words common. It only becomes a problem when those cells are physically nearby.

If we were using the whole 40,000 word dictionary spread across the globe, then there is a 1 in 40000^2 chance that you pick a cell with the same two first words. That’s 1 in 1,600,000,000 – on average over 1000 cells away

But because we are in band 0 – the cities – and they only use the first 2500 words in the dictionary, we aren’t talking about 40,000 spread across the globe. We are talking about 2,500 words spread across a portion of the globe. We’re down to a 1 in 2500^2 chance – or 1 in 6,250,000 – in a much smaller area. This is just over 4 cells away – less than 12km.

And that is all that we are seeing – on average, we see two words match every 6,250,000 squares. By the time we have moved 500 cells away, we have seen 119 instances where the first two words are daring.lion. Around one in every 4 cells. This is a direct effect of limiting the word list to 2500 words in built-up areas.

So, we’ve shown that the first two words are going to match in nearby cells. But for that remaining different word, it’s unlikely that rush and stick would be confused, so why are we seeing plural pairs like take/takes?

Remember that the words are just the last step. They correspond to an ijk, three indices into the word list. Let’s look at ijk instead of the words. I am filtering the table for n offset 7,419,139 to only show addresses with two words in common, and I am showing the difference in the remaining word.

You might notice a pattern. The difference in the remaing word is always -701. This is because the conversion of m to ijk does little to mask changes in m. As the m offset is always -701, often one of i,j or k will also differ by 701.

Why is a difference of -701 in a word index a problem? Most of the time, this is just going to result in linked words pairs like stick/rush, neck/feels and so on.

But it’s also the difference between takes/take – position 726 and 25 in the wordlist.

Several of the m offsets that appear line up with other plural pairs:

It is pure coincidence that two words, 701 apart in the wordlist, are a plural pair.

If we take a positive n offset, we see a word pair in one direction – take/takes. If we take a negative n offset, we see the word pair in the oppositve direction – takes/take..

How often do we see takes appear as the last word? Well, it turns out it’s only about as often we would expect for picking from a list of 2500 words – 1 in 2500 times.

Probability for nearby confusing pairs

If we pick one of the n offsets that cause problems – 7,419,139 causing take/takes – we can look at the chance that there are confusing pairs over a range of n. Let’s do that for n over our entire example cell.

For all squares in that cell, there are 876,541 squares offset by 7,419,139 that have two common words. That is 59% of the squares in the cell. 59% of the squares have the potential for the different word to be confused and that confusion to be within 7,419,139 distance in n away.

For those squares with two common words, 822 of them have the last word of takes. That means out of the entire cell, there is a 1 in 1800 chance that you pick a cell where the last word is takes and it could be changed to take to result in another location.

We need to check for negative n offset. There are 780 squares with the last word of take. This takes the chance of hitting one of the squares to around 1 in 930.

This wouldn’t be a problem if these linked squares were distant. But they aren’t. An n offset of 7,419,139 is only around 5 cells away.

This is easily demonstrated by testing for the pair take/takes on our cell and showing them on a map.

The cell to the right is the postive n offset, with the pair takes/take. The cell to the left is the negative n offset, with the pair take/takes.squares.

This behaviour is seen in every single cell in band 0. We can pick a cell, and there will be two other cells with take/takes relationships like this, around +/- 5 cells away.

Bigger n-offsets are not always distant

As we move down the list, the n offset gets larger and they start getting further and further away. bump/bumps – with an offset of 56,994,213 should be 38 cells away. So how are these a risk? First, let’s test for the pair bump/bumps on our example cell.

One of those is just over a single cell away – much closer than 38! Why has this happened?

The algorithm is designed to use only the first part of the word list in the most populous areas to improve usability.

We can show this on a map. All of the red cells are band 0 and yellow cells in band 2 (yes, there are no band 1 areas here).

All of those red cells are using a significantly limited dictionary of 2500 words instead of 40,000.

The band in which these cells fall is defined by setting the q value. We can find the q for cells using the API and show them on the map.

They are increasing left to right, and top to bottom. A more helpful representation is showing how many cells away from the start (q=0) they are. You can see London is right down at the start of the list, between 400 and 600 cells into the q values. These numbers might not be quite correct as I have just divided by an average cell size and rounded down.

Our example cell is 479 cells into q. bump/bumps should be around 38 cells away. 479 + 38 = 517. 479 -38 = 441. 517 and 441 are shown in red in the image blow.

It’s pretty clear to see these are the areas with a high density of issues on the plural pair bump/bumps.

This situation becomes even worse with some cells. Take a look at the centre of Birmingham here – the cell immediately below the one we are testing has a huge number of pairs.

All of these have the word pair fine/fines – an m difference of 1402, which is an n offset of 14,838,278. This is around 10 cells of n away.

We just need to look at the q values though.

267 and 257 – 10 cells apart. When two cells end up adjacent and exhibit this issue, you get a large number of troublesome plural pairs – 535 pairs with only a single “s” different between two squares, all with a maximum distance of 9.3km between them.

The number of plurals

In the first 2,500 words, there are 354 words that can have an “s” added to them and that word is also in the list. That means 708 out of the 2,500 can be confused by adding or removing an “s”. This is a significant proportion.

When you look at this cumulatively, it results in about 63% of cells containing one or more of these words that can be confused. The distance that this confusion results in is variable, but with the above issues, it can often result in them being nearby.

Cumulative impact

The examples prior to now have looked at single cells with a given n offset. When you look at the bigger picture, the weaknesses in the algorithm become very apparent. The below map shows all of the plural pair confusions for our example cell that are inside London. There are 62,988 in the single cell.

There are odds of 1 in 24 that you hit one of these squares.

Conclusion

There are multiple issues with the What3Words algorithm.

For many squares in cities in the UK, there are a series of other squares which will only vary by one of the three words i.e. two of the words are common e.g. ///bank.fine.fishery and ///bank.fine.even. These relationship between these squares is predictable.

The distance between these squares that only vary by a word is often below 50km, but can be under 1km.

For some of these squares, the only variation in that remaining different word with be a single letter “s” e.g. ///bank.fine.fishery and ///bank.fines.fishery.

Certain word pairs – such as take/takes, fine/fines, plot/plots, bump/bumps and space/spaces – are more likely to result in close pairs.

This same issue can impact other errors such as spelling mistakes and homophones.

The word list is limited to 2,500 words in cities in the UK, greatly increasing the probability of these errors happening.

A side effect of defining these densely populated areas results in some of these pairs of squares being even closer than they would be normally.

Combining these issues means that there are many squares under 10km apart that can be confused by changing a word to and from a plural.

Fixing What3Words?

Posted on April 29, 2021 by cybergibbons

There are an awful lot of people saying that fixing the issues with What3Words is “simple”: just remove the plurals! Get people to read it out twice! Spell the words!

This shows that people haven’t understood the problem fully, or looked at the other issues What3Words has.

What problem are we trying to solve?

What3Words, according to Wikipedia, was created for communicating locations, but there’s nothing saying it was for emergency services.

The system may be fine to use where the cost of making mistakes is low, when people aren’t under pressure, or when entering your location for a takeaway delivery.

That doesn’t mean it’s fine for emergency use. It doesn’t mean it’s fine for use in mountains.

Without clearly stating the problem, it’s very difficult to come up with a solution.

It’s not even clear that the problem is “verbal communication of location”. With the advent of smart phones and always on data connections, why are we using voice?

Why try and fix it?

People keep on comparing What3Words to lat/lon, saying that it’s clearly better. I can’t see any evidence around this. I kind of like evidence in safety critical applications.

But why not use a better system than What3Words?

There are many alternatives available that could be significantly better. I can’t see any studies or data comparing the available systems.

Another similar word based solution could be better. Take a step back before trying to fix a broken solution. Working within the constraints that exist with the current system will likley result in a suboptimal solution.

Just change the word list…

As a result of using 3 words and 5.7e13 locations, a word list of 40,000 long was needed.

This means it contains words like:

chauvinism
secede
veterinarian
pseudonym
paraphernalia
clairvoyant
ratatouille
adjournment
hierarchy
acquiesce
tithe
etiquette
parochial
idiosyncrasy
prerogative 
fuchsia
macaque
baccalaureate

Removing over 7000 plural words will make this list even more complex as you replace the words.

In my opinion, it’s already at the point where it would be difficult to use by people who are not excellent at reading and spelling. It’s certainly less usable for children – the age at which a child can read out letters and numbers is earlier than they can spell trosseau.

The algorithm has issues

Features of the algorithm mean that there are closely linked areas that have lots of addresses with 2 common words in them. Even if you fix the word list, this issue still exists.

With this issue still present, it is far more likely that a mistake in one word results in another close location.

Other errors can be made

There are not only plurals. There are homophones, missing letters, adding letters, transposition… all of these should be fixed, no?

Why 3 Words?

There’s no evidence that 3 words is a good or optimal solution.

If you used 4 words, the dictionary size drops to 2748. It’s a lot easier to avoid complex words and plurals with a list this short.

They said they wouldn’t change the addresses

In many locations they say they won’t change the addresses. That means the wordlist and algorithm can’t change.

People have made lots of signs using locations.

It’s offline

The What3Words algorithm is meant to work offline. It’s in dashcams, car navigation systems. You can’t quickly and easily update them all. Which leads to…

And there’s no versioning

If they make significant changes, there’s no versioning system. Incompatible systems could lead to massive issues.

Why What3Words is not suitable for safety critical applications

Posted on April 29, 2021 by cybergibbons

Many UK police, ambulance and fire services advocate the use of What3Words to report your location in an emergency. The idea is that it is easier to communicate three words than it is to read out a grid reference, and that a position is more helpful than an address in many situations.

Due to a series of design flaws in What3Words, I do not believe it is adequate for use in these safety critical applications.

I initially believed that What3Words prevented simple mistakes causing errors; it wasn’t until a friend found two addresses under 10 miles apart that I considered that this might not be true. I decided to look into it.

So how bad is the problem? Pretty bad.

Easily Confused Words

The What3Words word list is 40,000 words long. It is important that words in this list cannot be confused, otherwise they may be communicated incorrectly. For example band/banned, bare/bear, beat/beet are easily confused.

What3Words acknowledge this themselves.

The problem is that their “best” doesn’t appear to be very good.

A quick inspection of their word list finds the following words that sound very, very similar to one another:

wants         once
recede        reseed
census        senses
choral        coral
incite        insight 
liable        libel 
ordinance     ordnance 
overdo        overdue 
picture       pitcher 
verses        versus
secretary     secretory
assets        acids 
arrows        arose
clairvoyance  clairvoyants
collard       collared
confectionary confectionery
disburse      disperse
equivalence   equivalents
incidence     incidents
incite        insight
incompetence  incompetents
independence  independents
innocence     innocents
instance      instants
intense       intents
lightening    lightning
ordinance     ordnance
parse         pass
pokey         poky
precedence    precedents
purest        purist
variance      variants

There are also a huge number of plurals. Out of the 40,000 words, 7,697 also exist in their plural form. That means that 15,924 out of the 40,000 can be confused by misreading, mishearing, or mistyping a single letter “s”. That’s 40% of the available words!

Over 75% of What3Words addresses contain words that can be confused in this way.

What3Words themselves said that “people confuse plurals only about 5% of the time when hearing them read out loud”. That gives an overall chance that a What3Words address is confused 1 in 27 times!

And this doesn’t take into account simple typing errors like flip and slip.

Broken Algorithm

What3Words acknowledge that two locations with similar addresses being confused is a problem. Indeed, if ///limit.broom.flip and ///limit.broom.slip were in the same town, that would lead to confusion.

They state that their solution is to space these confusing addresses “as far apart as possible”.

As far apart as possible turns out to not be very far.

In this small blue area below, there 255 confusing address that result in another location less than 5km away.

If we increase the error to 20km, the situation gets far, far worse. There are now 3,268 locations that can be confused, just by adding or removing a character “s” to one of the words.

To put this into perspective, we can calculate the odds that you land on one such square. There are 1,456,332 individual 3m squares in that blue box.

With a maximum acceptable error of 5km, that means there is a 1 in 5712 chance you land on one of these squares.

With a maximum acceptable error of 20km, that means there is a 1 in 446 chance you land on one of these squares.

These odds vary depending on where you are, but in most urban areas of the UK that I have checked, there is a better than 1 in 1000 chance that a square has an address that can be confused with another under 20km away.

For several areas in London, there is a better than 1 in 25 chance you find a confusing pair located within the M25.

What3Words implied the odds were closer to 1 in 2.5million. There’s a very big difference between what I’ve seen and what they are saying.

The error when you make a mistake can range from as little as 10m to as much as 20,000km. There is no way to determine how far away the other address is.

It is a matter of opinion as to what an acceptable level of error is, but there are examples of mountain rescue being sent to a location almost 40km away because of a single character confusion. 20km may not be an issue in a city, but it certainly is for an accident in the mountains.

This issue is inherent in the way the What3Words algorithm was designed. You can read more about this here.

Examples

Here are some examples of how badly this could go wrong. Without significant additional information, that the person may not be able to give, you cannot determine which of the two addresses is correct.

“There’s been a train derailment, on the Clyde.” – 1.54km

https://what3words.com/likely.stage.sock

https://what3words.com/likely.stages.sock

“One of our party has fallen to the west of the ridge on the way up to Beinn Maol Chaluim, visibility is poor.” – 1.83km

https://what3words.com/reworked.sheets.lions

https://what3words.com/reworked.sheet.lions

Just to give you an idea of the terrain around here – that short distance could be a serious delay.

“I think I’m having a heart attack. I’m walking at North Mountain Park. Deep Pinks Start.” – 1053m.

(Try reading both out)

https://what3words.com/deep.pink.start

https://what3words.com/deep.pinks.start

Conclusion

By making a single character error in a What3Words address, there is a significant chance that the location will change to another that is less than 5km away. This level of error is dangerous and difficult to detect.

In my opinion, this makes it unsuitable for use in emergency situations.

Even What3Words themselves have identified that confusing pairs would be an issue, and have stated, tens of times, that they designed the system so that they were not close together. This is simply not the case – confusing pairs frequently exist in close proximity.

You need to ask yourself why What3Words pitch their system in this way. Why state that their system doesn’t suffer from these issues, when it does? Why did they decide that this is an issue, and claim it was fixed?

The same kind of errors happen if we make mistakes with most grid reference systems, phone numbers, or addresses. This is a known issue. We read back phone numbers to check they are correct. If I read out my address as “Southampton, Scotland”, there is an obvious means of detecting an error. What3Words has been sold as immune to these errors, so people don’t check for them. This is where the risk creeps in.

Nurserycam disclosure timeline

Posted on February 22, 2021 by cybergibbons

The issue described in the previous blog about Nurserycam has been present for a number of years.

This post collates the previous times that it was disclosed to Nurserycam that I know of. If you also reported an issue, please use the contact form on this page, or DM me on Twitter. I will not disclose any information you are not happy disclosing.

My opinion is that the root cause on all of these reports is identical. A direct connection is established to the DVR using admin credentials.

All four parents agree that this is the case.

There is the possibility that Nurserycam did have a system that didn’t rely on this mechanism to connect. This may have existed prior to 2015, or it may have existed on a subset of their customer systems. This misses the point – the weakest link in the chain is the one that matters.

February 2015

A parent reports security issues to NurseryCam and to their nursery. One of the issues (amongst others) is that the IP address, admin username, and admin password of the DVR in the nursery are leaked in the HTML source when viewing the cameras using ActiveX. The password is the one documented on the Nurserycam website. This makes it trivial to for any parent to access any camera in the nursery and bypass the (ineffective) access restrictions.

Furthermore, the non-activeX mode serves images from the DVR without authentication or encryption, so anyone with the URL can see the live images from any computer. Communications from NurseryCam infer that something illegal has happened.

This parent agrees that the issue they reported is the same as the issue in my blog.

January 2020

A parent reports to their nursery that the connection is made directly to the DVR, and that the username and password are leaked to parents. The password is a derivative of the one found on the Nurserycam website, and is found to be common across a multiple nurseries in a chain.

This parent agrees that the issue they reported is the same as the issue in my blog.

October 2020

A parent reports to their nursery that they can see the admin username and password in the browser. Nurserycam take some action to resolve the issue for this particular nursery. As before, the password is as documented on the NurseryCam website.

This parent agrees that the issue they reported is the same as the issue in my blog.

February 2021

Another parent reports security issuses via their nursery. Again, this concerns the disclosure of the IP address, username and password to the parents. The password is the one documented on the Nurserycam website, as in 2015. Nurserycam take some action to resolve the issue for this particular nursery.

This parent agrees that the issue they reported is the same as the issue in my blog.

February 2021

I disclose the same issue in NurseryCam, inferred from the reverse engineering of their mobile app. Once a parent had confirmed the issues had been disclosed previously, I publicly disclosed immediately.

A warning to users of NurseryCam

Posted on February 14, 2021 by cybergibbons

This blog post is intended for a less technical audience – specifically parents and nurseries using the NurseryCam system.

NurseryCam is a camera system that is installed in nurseries, allowing parents to view their children remotely. There are tens of nurseries stating that they use this system. News articles go back as far as 2004.

Serious security issues have been found in the system. The statements that NurseryCam make about the security of their system do not align with reality.

These issues would allow any parent, past or present, to access the video feeds from the nursery. There is also the chance that anyone on the Internet could have accessed them.

I am a full-time security consultant who specialises in the security of the Internet of Things, including camera systems. The issues with NurseryCam are about as serious as it gets. NurseryCam were informed of these as early as February 2015 – 6 years ago.

The System

A Digital Video Recorder (DVR) is installed in the nursery, connected to cameras. These are like normal CCTV DVRs, used across thousands of businesses and homes in the UK.

The DVR has a web interface that can be viewed in a browser, but it would normally only be possible to view this when you are connected directly to the nursery’s network. This is because the DVR is behind the router’s firewall.

Without port forwarding,the DVR cannot be accessed.

To allow the DVR to be viewed remotely, something called port forwarding is used. This opens a hole in the nursery’s firewall, allowing the DVR to be accessed from the Internet.

Port forwarding allows access to the DVR from the Internet

To log in to the DVR, you need to know the username, password, and IP address.

When a parent wants to view the cameras, they log in to the NurseryCam website or mobile application. In the background, the parent is given the details for the DVR, including the username and password.

The normal log in procedure for a parent

The parent then establishes a direct connection to the DVR, allowing them to view the camera.

The Issues

For all parents connecting to a given nursery, they are given the same username and password for the DVR. In the examples I have been shown, the username is admin and the password are obvious words followed by 888.

This means that the parents, past and present, have all been given the administrator password for the DVR.

There are no indications that this password changes over time.

There is no need for the parent to login to the NuseryCam website to access the DVR.

There is no need to login to the NurseryCam servers to login to the DVR

With these details, the parent could connect directly to the DVR at any time of the day, view it for however long they want, and view all of the cameras, including ones you have not given them permission to view.

You can lock or delete the parent account on the NurseryCam website, but the username and password for the DVR will not change.

There is no way to stop the parent from logging into the DVR directly.

Anyone logging into the DVR would be seen as the admin user. It would be incredibly difficult for a nursery to determine if the login was from a genuine parent or someone else.

To make matters worse, the connection to the DVR is using HTTP, not HTTPS. It is unencrypted, allowing someone to eavesdrop on the video feed, username, and password.

The Risks

Any given parent for a given nursery could login to the DVR and view any and all cameras.

This could include:

A current parent viewing cameras for longer than they are meant to.
A current parent viewing cameras that they are not entitled to, such as rooms their child does not use.
A parent whose child no longer attends the nursery viewing the cameras.
Any parent who has been prevented from accessing the system (e.g. separation, abuse) viewing the cameras.

Worse still, because the password for the DVRs is common across multiple nurseries and openly documented on NurseryCam’s website, there is the potential for anyone on the Internet to access the DVR.

Port forwarding does not restrict connections to be from parents – anyone can access the DVR

The only missing piece of the puzzle is the IP address of the nursery. It would be possible to scan the entire of the UK for DVRs using this username and password in a matter of days.

Staff at NurseryCam would know the password and be able to access the DVRs without restriction.

The Discrepancies

NurseryCam state that their system is “safer than online banking”.

This is certainly not the case with the system seen here.

A common, shared, and openly documented login for the DVRs is passed to each parent.

There is no encryption used. There are no VPNs.

This is analogous to your local bank giving you the keys to their vault and just trusting that you will only take your money.

The same claims are repeated across multiple nursery websites.

The Disclosure

When security researchers find problems like this, we try to report them to the company so that they can be fixed. The aim is to keep users of the system safe. We call this disclosure.

I reported these to NurseryCam on 6th February 2021.

On 12th February 2021, I blogged about these initial concerns and Tweeted them.

Former parents reading my Twitter feed got in touch, with one parent confirming that they had informed NurseryCam of almost identical issues in February 2015 – six years ago.

Even six years ago, the claims made about security did not line up with reality.

They have been aware of serious security issues for 6 years and have not fixed them.

How were these found?

The NurseryCam Android mobile application was downloaded and then examined. By viewing the code, it was possible to see how the system operates.

Several parent users of the system have contacted me. They confirmed that the system operated as I suspected and that the DVR usernames and passwords were the same each time they logged in.

There has been no attempt to hack NurseryCam webservers.

These issues were trivial to uncover, taking no more than 15 minutes.

What should you do?

In my professional opinion, you cannot quickly fix a system that is this badly broken. You also cannot regain the trust that has been lost by selling a product that is described so inaccurately.

If you, as a nursery, operate one of these systems:

Unplug the network connection from the DVR.
Contact NurseryCam and ask that they inform all impacted nurseries immediately.
Ask why the system you have been paying for isn’t the one that is described on the NurseryCam site.

If you are a parent, I would advise contacting your nursery and request that they carry out the above steps.

Inadequate Fixes

Changing the username and password for the DVR is not a genuine fix – the username and password are still sent to the parents.

Adding encryption to the connections is not a fix – the username and password are still sent to the parents.

My Opinion

These issues are obvious and fundamental. They should not have existed in the first place.

Without the system being almost completely redesigned, it is hard to see how it can be secured adequately.

I have not tested their website, or looked at any of their other security practices.

Ask yourself if you could ever trust this company again with children’s data.

Serious issues in NurseryCam

Posted on February 12, 2021 by cybergibbons

I am not disclosing these issues lightly. They impact real people: children and parents.

I am not following the disclosure policy I follow normally as part of professional work due to the severity.

Given how the owner of FootFallCam have behaved, I cannot hold these back. The business managing these systems have not demonstrated they can handle data this important, and cannot handle this honestly.

When I saw the issues with FootfallCam, I noticed that there was a related product called NurseryCam which allows CCTV monitoring of children in nurseries.

Given the serious issues in FootfallCam, it concerned me that the same company could be handling the sensitive data like this.

From the site, nurserycam.co.uk, there is a link to an Android mobile application:

This application is called com.metatechnology.nurseryapps.

This application has some serious issues. Recent reviews are from 2021, strongly indicating that this is still actively used.

Issue 1:

The application does not use TLS to secure communications between it and API endpoints. There is no reason to not use TLS in 2021.

This has been unacceptable for years.

Issue 2:

The application sends the username and password of the parent in the URL to the endpoint http://nurserycam.metatechnology.co.uk/Service1.svc/Camera/

The username and password are passed in the URL directly.

Issue 3:

When logging in, the parent is returned a list of “ParentAccessModel” as JSON. This data is passed in plaintext, unencrypted

ParrentAccessModel is a list of nursery IP addresses, ports, usernames, and passwords to connect directly to the DVRs.

These are not per-parent credentials.

The user is admin.

The password are obvious words followed by 888.

These are accessible to any parent using the system.

Today, now, and in the past.

Issue 4:

The connection, directly to the nurseries, is then made over HTTP with the username and password of the DVR passed in the URL.

There is no encryption.

Issue 5:

Any access control enforced by their API would not be enforced by the DVRs.

The parent already has the IP address, port, username and password for the DVR.

Any controls around time limits, or locking out of accounts would not work.

The parents have credentials for the DVRs.The access controls are ineffective. The child may even leave the nursery and the parents account could be revoked on the web plarform, but they still have access to the DVR.

Issue 7:

There is no means for NurseryCam to remotely audit access to these DVRs. If the usernames and passwords have been used by a malicious party, there are no means of knowing this.

Summary of issues

The system directly exposes the DVRs in nurseries on HTTP. The username and password of the parents are passed without encryption, and then return the connection details for the DVR without encryption. Then, without encryption, the parents can log directly into the DVR. There is no means to stop them viewing the DVR whenever they want.

This is a massive difference from the claims on their site:

https://www.nurserycam.co.uk/web_cam_security.htm

We are not talking about slight differences. The linked page is so far away from reality it’s unreal.

Conclusion:

These are issues of critical severity. The implementation of this system places children at direct risk.

I make the following requests of NurseryCam:

1. Take down the NurseryCam service (mobile application and web application) before Monday 8th February 2021, 0800 GMT.

2. Within the next week (by Saturday 13th February) take action to ensure that the nurseries running DVRs have either changed the DVR passwords to something secure, or stop them being exposed to the Internet.

3. Within the next 4 weeks (by Saturday 6th March) send a communication to all nurseries and parents (current and former) informing them of these security issues. You should inform these people that you have no reasonable way to determine if anyone has watched their children.

BC Vault – is their security model better?

Posted on February 2, 2020 by cybergibbons

Yesterday, on the back and forth about BC Vault, their CTO, Alen Salamun, kept on saying their wallet was more secure, based on their product needing 5 items to be breached, and other wallets just 1.

Simple math. What is more secure? 1 thing you have to keep safe or 5 of them? Guys give me a break and overthink what you are saying. Up to this point you did absolutely nothing to compromise anything. Claiming phishing is an attck vector. Yes A BIG one for whole world.
— Alen Salamun (@AlenSalamun) February 1, 2020

To access the funds on BC Vault, you need:

Global password
Global PIN
Wallet password
Wallet PIN
Device or backup file

To access the funds on other wallets, you need:

The BIP39 words

I don’t see how you can possibly claim that BC Vault is more secure based on this comparison. All you can say is that it is different. It certainly is not “simple math”.

My BIP39 words are stored on a piece of paper around 200 miles from here, in a safe. I was told I would only have to enter them should my hardware wallet lose the key material. I do not need access to them, and probably never will. I do not need these words to spend funds. These words have never been entered into a computer.

Each time I want to use a BC Vault, I need to enter the passwords (which are entered into a computer) and a PIN (entered into the device). Entering data into a computer puts you at risk of phishing. Entering the PIN puts you at risk of shoulder surfing, among other attacks. A user will need to keep this information at hand to use the wallet, unlike BIP39 words.

In fact, I didn’t keep the BIP39 words on my Trezor, and hence it is impossible to access the funds without the device. This clearly demonstrates that you do not need the words to use the wallet.

This “simple math” is comparing apples and oranges, and is exactly the same path Bitfi went down. Bitfi claimed that their model of entering everything each time you used it was clearly better than storing keys in a secure box.

All we can say is that these are different security models.

It was inferred that I said this was worse or the same. It’s interesting how many vendors go down this route – when people compare their system to others, they automatically assume you said it was worse.

Your statement about 1 thing as “24 words” agains 5 as being the same. Look it up. You are claiming needed possesion of backup/device is trivial and so on. You sir are wrong.
— Alen Salamun (@AlenSalamun) February 1, 2020

My issue isn’t that they are different. It’s the claim that it is clearly better. Prove that 5 regularly used items are more secure than 1 infrequently used.

It’s not as simple as 5 > 1.

BC Vault: Bitfi Mk2?

Posted on February 2, 2020 by cybergibbons

I wasn’t aware of BC Vault until a few days ago, when their CTO, Alen Salamun, popped up in response to a vulnerability disclosed in another hardware wallet.

Interesting yep…that’s why @bc_vault does things completely different from scratch. 1BTC still in Bounty Wallet after more than a year….;)
— Alen Salamun (@AlenSalamun) January 31, 2020

What’s that? Another bounty, loaded onto a wallet, and sent out?

Does this sound familiar to anyone?

They even frame this as a “Guaranteed Security”:

All this bounty does is demonstrate that someone cannot recover a key from a given device – the stolen device threat.

It doesn’t provide any assurance around phishing, evil maid attacks, or the usability of the system. The bounty provides no guarantee whatsoever.

Then Dimitry Fedotov, who deals with BC Vaults business development, laid down the gauntlet:

https://twitter.com/fedotov_twit/status/1223859094139756544

1 BTC is currently $9,400.

Day rates for hardware testing are $2,000.

That’s less than 5 days pay.

A full security review and penetration test of a hardware wallet would easily run to 25-30 days of work, and cover many more threats than “someone stole my wallet”.

This bounty is just another rigged fairground game.

Bitfi – Some Requests

Posted on July 16, 2019 by cybergibbons

It’s been almost 11 months since we showed that the Bitfi Knox MD40 wallet was vulnerable to two quite serious attacks:

The keys persisted in memory for days, allowing them to be recovered over USB in a few minutes – the cold boot attack.
The device could be easily rooted and backdoored over USB in a few minutes, allowing an attacker to receive your keys when you used them – the evil maid attack.

In those 11 months, Bitfi have not informed users of their device that they were vulnerable, or if they continue to be vulnerable:

Is the USB bootloader still open on some or all Bitfi devices?
Can some or all devices still be rooted trivially?
Have reasonable precautions been made to wipe the RAM?
How does a user determine if their device is vulnerable?

Without knowledge of the vulnerabilities on their devices, users cannot take appropriate actions to mitigate risk.

If you take their threat model as truth – that it is safe from nation states – you have no idea if your funds are at risk or not.

This is completely unacceptable, especially when one of their co-founders claims that not informing users of security issues puts lives at risk.

Further to this, Daniel Khesin has stated they believe the attacks take at least 10 minutes and have a 25-30% success rate. In reality, it was 2 minutes, and we didn’t see them fail. This suggests a massive disconnect on Bitfi’s side – they don’t actually understand the issue.

2/ and took anywhere from 10 – 30 minutes just to attempt. This means that you would have to leave a device unattended & within the physical reach of a highly skilled & sufficiently equipped attacker in order to be “at risk”. While it is a “feasible attack” it is not a “likely”
— Daniel Khesin (@KhesinDaniel) July 14, 2019

Without acknowledging the ease with which the attacks were carried out, there is no way they can actually fix them properly.

I have some very simple – and reasonable – requests for Bitfi:

Document the attacks clearly and concisely on your own site, bitfi.com, including which versions of hardware and software are still vulnerable.
Inform your customers, by way of both email and the dashboard, of these issues.

A company unwilling to take these actions is, by their own words, putting people at risk.

Without these basic courtesies in place, I’m not even going to entertain looking at the devices at Defcon.

What threat model is Bitfi working under? Not a realistic one.

Posted on July 15, 2019 by cybergibbons

Daniel Khesin – Bitfi’s not-CEO – has recently started differentiating Bitfi as the only one that can protect against state actors.

But why would he be recovering it? He is not the threat. The threat are state actors. Nobody expects Cybergibbons to crawl through their window and do something. No one gives a shit about you. Government seizure or someone else with those resources and it will all be taken.
— Daniel Khesin (@KhesinDaniel) July 13, 2019

This doesn’t hold up to scrutiny.

Daniel keeps on talking about “forensic labs” that can recover keys from Trezor’s STM32 MCU. This is a feasible task.

Yes you are promoting it. I don’t care what elaborate methods you have for storing your 0.0001 BTC. I do know that Bitfi stores nothing. And your Trezor will not hold up to forensic methods. Just be honest and say Trezor is only good for online attacks. Even then it has issues…
— Daniel Khesin (@KhesinDaniel) July 14, 2019

He also claims a “secure element” does very little to stop extraction. Recovering data from most secure elements is beyond the means of nearly every lab in the world. Those that could carry it out will be charging very large sums of money.

1/ First, a secure element does very little to stop extraction. Second, the whole argument against SE and why companies use off the shelf microcontrollers (usually from STM) is because firmware is open source. Benefit of that is you can audit the source, unlike with a secure
— Daniel Khesin (@KhesinDaniel) July 14, 2019

Finally, he claims that anything typed into a computing device can be recovered. As of 2019, there are no labs on this earth who can recover data from the RAM inside a powered down MCU.

And as we have learned, being solely focused on this, anything you type into some kind of computing device is retrievable with forensic methods unless complex memory management is handled properly which is not easy..
— Daniel Khesin (@KhesinDaniel) July 13, 2019

But, let’s assume these labs exist.

There are three serious logical errors here.

Firstly, how does the state actor have such advanced capabilities in recovering data from a secure element, but are unable to backdoor a Bitfi?

If a state actor wants to access your funds on Bitfi, all they need to do is backdoor the device. We showed how easy this was last year. It’s a much easier attack than cold boot.

Secondly, if a state actor wants your funds, they are either going to get them or make your life unpleasant. In other words, YOU’RE STILL GONNA GET MOSSAD’ED UPON.

Thirdly, if these labs are capable of recovering anything typed into a computing device, this means that these same labs can recover the seed and phrase from a Bitfi.

Bitfi have created a threat model to which their own device is incredibly vulnerable to.