
It could be argued you already had the phone number of your victim.

If mobile numbers in your country are in the 2________ range, how feasible is it to add millions of phone numbers to your contact list to find out the number of someone? I think this is nonsensical.


>If mobile numbers in your country are in the 2________ range, how feasible is it to add millions of phone numbers to your contact list to find out the number of someone? I think this is nonsensical.

If you're a state actor, probably pretty easy. Get a thousand rooted, remotely controllable Android devices (which you probably already have for other projects) and have them automatically add 10k phone numbers each. Then have them join public Telegram groups and check for matches. Now you have gone through 10 million phone numbers. Run it in a loop 10 times and you have 100 million. Might take a few days to set up and run.

I don't see why this is infeasible in any way if you have a moderate budget (i.e. a state actor).

edit: And if your target is in your jurisdiction then you probably have a good mapping of names to phone numbers already.
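The arithmetic above can be sketched in a few lines (the device count, batch size, and the +852 Hong Kong number format are illustrative assumptions, not anything Telegram-specific):

```python
# Back-of-the-envelope sketch of the enumeration the parent describes.
DEVICES = 1_000   # rooted devices (assumed; "a thousand")
BATCH = 10_000    # numbers added per device per pass
PASSES = 10

per_pass = DEVICES * BATCH   # 10,000,000 numbers checked per pass
total = per_pass * PASSES    # 100,000,000 after looping 10 times

def batches(start, devices=DEVICES, batch=BATCH):
    """Yield one contiguous block of candidate numbers per device."""
    for d in range(devices):
        lo = start + d * batch
        # Hypothetical format: Hong Kong +852 with 8-digit subscriber numbers
        yield [f"+852{n:08d}" for n in range(lo, lo + batch)]
```

Each device uploads its block as "contacts" and reports back which ones resolved to accounts.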


All this to get an app to make "do any of my contacts also use Signal" requests? You could probably just figure out what endpoint the mobile client calls and imitate it yourself, avoiding all the overhead of setting up the mobile devices. If you have to register to make the request, just provision a bunch of VOIP numbers and go to town.

Point being, if "who is using signal" is a question you want answered, it's far more trivial than having to acquire actual devices. Your oppressive regime could go from zero to black bag list in an afternoon.


I don't think you need a single device. Just bots with virtual numbers.


The impact is specifically related to Hong Kong, where the protesters are using Telegram to coordinate, and where, according to the bug report, the telephone number range is limited.


There's apparently at least one private company that gathered a database of account-to-number correlations precisely by adding over ten million numbers to Telegram's address books. Here's an article in Russian where one account is deanonymised: https://meduza.io/feature/2019/08/10/kto-takoy-tovarisch-may...

Dunno if Telegram has patched this in any way by now. However, I don't see why it would be difficult for a program to add numbers to the contact list incrementally. To my knowledge, computers have so far been pretty good at incrementing numbers. And if the contact list length is limited, the question is just how many phone numbers a company can buy.
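Incrementing through a range and chunking it to fit a per-account contact-list limit really is a few lines (the 5,000-contact limit and the number format here are made up for illustration):

```python
def contact_chunks(start, count, limit=5_000):
    """Split `count` sequential phone numbers into contact-list-sized chunks."""
    nums = (f"+{n}" for n in range(start, start + count))
    chunk = []
    for n in nums:
        chunk.append(n)
        if len(chunk) == limit:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final partial chunk
```

Each chunk would be imported under a fresh account, so the real cap is the number of accounts (i.e. bought numbers), not the per-account limit.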


Cell phones work by registering with a cell tower, so all they have to do is look at which phones were in the vicinity of the cell towers where the protest took place.


It could be argued you already had the phone number of your victim.

But you have no correlation between it and a Telegram user. This bug is about exactly that correlation.


Right, the key trick here is that Telegram is easily used as an oracle.

Telegram has essentially agreed to tell you whether any phone number is correct, so you can just guess all the phone numbers. Never allow this unless the thing an adversary has to guess is both _completely random_ and from a _very large keyspace_ (128 bits is where you can start to feel safe). If you find you're cornered into doing this anyway (e.g. a typical email + password login), aggressively rate limit it, so the adversary has to work harder and longer to take advantage and maybe gives up.

Phone numbers are neither random nor from a large keyspace; it's maybe 10^12 worldwide or something? Much too small.
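For a sense of scale (my arithmetic, using the 10^12 estimate above):

```python
phone_keyspace = 10 ** 12   # rough worldwide phone-number space
safe_keyspace = 2 ** 128    # the "start to feel safe" threshold

# How many times larger a 128-bit keyspace is than every phone number on Earth
ratio = safe_keyspace / phone_keyspace
print(f"2^128 is ~{ratio:.1e} times larger")   # ~3.4e+26
```

Even an adversary who can test millions of guesses per day exhausts 10^12 quickly relative to any country-sized sub-range, while 2^128 stays untouchable.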


They say they managed to add 0.1 million people at once. If you're after a group of people and getting only one of them is enough, the limits look pretty feasible to me, especially in small communities.


It's written in Python, so as soon as they have enough users it will be as slow as any webpage with tons of JS. Thankfully, that CPU load won't be on the client side... so it's still an improvement.


I recommend you relax your convictions about Python performance a bit.

Server-side bottlenecks are more often than not database reads/writes and other IO. And the few CPU-intensive operations can be delegated to libraries written in C.

Pure Python is slow for CPU-intensive tasks, but that doesn't mean that a Python webserver is necessarily slow.
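A quick way to see the "delegate to C" point (hashlib's SHA-256 runs its tight loop in C; the byte-by-byte loop below is a stand-in for CPU-intensive pure-Python work):

```python
import hashlib
import timeit

data = b"x" * 200_000

def c_backed():
    # The hashing loop executes inside a C extension
    return hashlib.sha256(data).hexdigest()

def pure_python():
    # The same volume of data, touched byte by byte in the interpreter
    acc = 0
    for b in data:
        acc = (acc * 31 + b) & 0xFFFFFFFF
    return acc

t_c = timeit.timeit(c_backed, number=5)
t_py = timeit.timeit(pure_python, number=5)
print(f"C-backed: {t_c:.4f}s  pure Python: {t_py:.4f}s")
```

The gap is typically one to two orders of magnitude, which is why a Python webserver that mostly shuffles bytes between the DB and C-backed libraries can still be fast.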


So what you're saying is that the author of Sourcehut is expected to rewrite Flask, Jinja2, etc. in C?

As an example, https://git.sr.ht/~sircmpwn/git.sr.ht/tree/master/gitsrht takes 600-800 ms to generate, and it's probably heavily cached already. What will happen when they get more users? The site will be unbearably slow, unless the guy starts spending thousands on servers.


I imagine that the CPU-intensive parts of Flask and Jinja2 are already written in C. Much of Python's standard library is.

800ms is a reasonable response time, and if they scale up according to their userbase, they will hopefully maintain that time.

(Also, we don't know how much of that 800ms is Python vs. IO)


https://github.com/pallets/flask

https://github.com/pallets/jinja

0% C

Also, 800 ms is NOT a reasonable response time to generate what is basically a bunch of text; that is absurd, but I guess this is the baseline in 2019.

I trust all IO is cached. The author can confirm it. This is just how slow Python is.


Almost none of the IO of that page is cached, actually [0]. The only things I'm sure are cached are the templates themselves. I'm pretty sure that neither git lookups nor DB accesses are cached, which is where you could save time. And mind you, this is served from a single data center in the USA, so latency can already eat up a lot of that. I'm in Europe and have a 300 ms ping to it, so it might be that you are simply far away from the physical location.

[0] https://git.sr.ht/~sircmpwn/git.sr.ht/tree/master/gitsrht/bl...


Switching to PyPy is likely to improve the overall performance if it's not database or I/O bound.


This. There are also several optimized versions of Python, other than stock, that let you increase performance based on your needs. I've shipped with many of them over the years.


I used to run an image board written in Python with around 15M PV/month on a Pentium 4 machine in 2004 or so. The first bottleneck I found was the database. Some query optimization and caching fixed it very quickly. It goes a really long way before Python becomes a bottleneck, and when it does, that can also be fixed relatively easily (image processing in my case, which was fixed by moving the whole thing to a worker).
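The caching that fixed it for me was nothing fancier than memoizing hot queries with a TTL. A minimal sketch (the decorator and the `thread_page` query it wraps are hypothetical, not from the actual board):

```python
import time

def ttl_cache(ttl=60):
    """Memoize a function's results for `ttl` seconds, keyed by args."""
    def deco(fn):
        store = {}
        def wrapper(*args):
            hit = store.get(args)
            if hit is not None and time.monotonic() - hit[1] < ttl:
                return hit[0]          # fresh cached result
            result = fn(*args)
            store[args] = (result, time.monotonic())
            return result
        return wrapper
    return deco

calls = {"n": 0}

@ttl_cache(ttl=30)
def thread_page(board, page):
    calls["n"] += 1                    # stand-in for the real DB hit
    return f"{board}/{page}"
```

With most page views hitting a handful of hot threads, even a naive cache like this collapses the query load.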


Are you sure? Looks like static pre-generated page to me:

  $ curl -s https://sourcehut.org/|egrep 'meta.*gen'
  <meta name="generator" content="Hugo 0.57.2" />
[0] https://gohugo.io/



The website/blog (https://sourcehut.org) is static, the Sourcehut app itself (https://sr.ht) is Python-based.

https://git.sr.ht/~sircmpwn/?search=sr.ht


Python isn't slow. Running a lot of Python is slow.


If it shards well and you have the capital to buy hardware, there's nothing wrong with using Python.

