Comments (13)
Another idea: Traceroute hops based geoip
- The API or a micro-service will run a traceroute towards an IP, in our case a probe whose location we don't know.
- In most cases the traceroute hostnames contain the airport codes of where each router is located.
- By parsing the last 3-4 hostnames and looking for airport codes in them, we could in theory reliably guess the location of the probe.
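The steps above can be sketched roughly as follows. This is a minimal illustration, not an existing implementation: `LOCATION_CODES` and `guess_location` are hypothetical names, the lookup table is a tiny hand-written sample, and the caller is assumed to pass only the hops that resolved to hostnames.

```python
import re

# Hypothetical, tiny lookup table for illustration. A real table would need
# every IATA code plus operator-specific city abbreviations.
LOCATION_CODES = {
    "phx": "Phoenix, US",
    "waw": "Warsaw, PL",
    "sea": "Seattle, US",
    "sef": "Sebring, US",
}


def guess_location(hostnames, last_n=4):
    """Return the most frequent known code among the last `last_n` hostnames."""
    counts = {}
    for host in hostnames[-last_n:]:
        # Split on dots and dashes, then drop trailing digits: "phx010" -> "phx".
        for token in re.split(r"[.\-]", host.lower()):
            code = re.sub(r"\d+$", "", token)
            if code in LOCATION_CODES:
                counts[code] = counts.get(code, 0) + 1
    if not counts:
        return None
    best = max(counts, key=counts.get)
    return LOCATION_CODES[best]
```

With a complete code table, ordinary tokens such as `net` or `bar` may collide with real airport codes, so some filtering of common hostname words would probably be needed.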
Example
root@ansible2:~# tracepath 171.22.117.64
1?: [LOCALHOST] pmtu 1500
1: _gateway 0.250ms
1: _gateway 0.517ms
2: 10.193.33.129 0.936ms
3: no reply
4: 10.193.0.4 0.915ms
5: ??? 0.987ms
6: ??? 1.059ms
7: ae59.bar4.Warsaw1.Level3.net 1.042ms asymm 8
8: ae2.3601.edge2.Phoenix1.level3.net 154.058ms asymm 18
9: xe5-2-5.bcr2.phx1.us.bb.symantec.net 153.391ms asymm 18
10: border5.ae2-bbnet2.phx010.pnap.net 151.237ms asymm 18
11: dedipath-62.edge1.phx.pnap.net 152.460ms asymm 18
12: 69.25.116.199 153.214ms asymm 14
13: 171.22.117.64 151.653ms reached
phx = Phoenix, USA
tracepath hetzner.com
1?: [LOCALHOST] 0.044ms pmtu 1500
1: _gateway 0.956ms
1: _gateway 0.515ms
2: no reply
3: no reply
4: 2001:bc8:1c00:1::e 2.588ms
5: 2001:bc8:1c00:1::6 1.377ms
6: ae62.bar3.Warsaw1.Level3.net 0.957ms asymm 7
7: no reply
8: AS33891-NET.edge3.Berlin1.Level3.net 11.210ms
9: ae8-2080.nbg20.core-backbone.com 21.959ms asymm 12
10: ae2-2015.nbg60.core-backbone.com 24.023ms asymm 12
11: ae1-2014.nbg40.core-backbone.com 24.395ms asymm 12
12: 2a01:4a0:1338:1ae::2 22.455ms asymm 8
13: core11.nbg1.hetzner.com 26.577ms asymm 9
14: ex9k2.dc1.nbg1.hetzner.com 24.246ms asymm 10
15: no reply
16: no reply
17: no reply
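nbg1 contains nbg, which stands for Nuremberg (Nürnberg), the correct location for the probe. Note that nbg is an operator abbreviation rather than the airport's IATA code, which is NUE.
But there are also edge cases where the airport is in a small town, and in theory we need to figure out the closest major city, e.g.: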
tracepath 91.196.223.248
1?: [LOCALHOST] pmtu 1500
1: _gateway 0.681ms
1: _gateway 0.503ms
2: 10.193.33.129 0.688ms
3: no reply
4: 10.193.0.2 1.101ms
5: ??? 0.889ms
6: ??? 0.994ms
7: no reply
8: be2486.ccr21.waw01.atlas.cogentco.com 1.468ms
9: be2484.ccr42.ham01.atlas.cogentco.com 13.799ms
10: be2815.ccr41.ams03.atlas.cogentco.com 21.241ms
11: be12488.ccr42.lon13.atlas.cogentco.com 112.421ms asymm 17
12: be2490.ccr42.jfk02.atlas.cogentco.com 116.660ms asymm 17
13: be2889.ccr21.cle04.atlas.cogentco.com 116.054ms asymm 15
14: be2718.ccr42.ord01.atlas.cogentco.com 114.562ms
15: be2831.ccr21.mci01.atlas.cogentco.com 130.559ms asymm 16
16: be3035.ccr21.den01.atlas.cogentco.com 141.534ms asymm 17
17: be3038.ccr32.slc01.atlas.cogentco.com 145.863ms
18: be2085.ccr21.sea02.atlas.cogentco.com 167.937ms
19: be2895.rcr21.sea03.atlas.cogentco.com 168.930ms
20: Internap-Network-Services.demarc.cogentco.com 170.209ms
21: border2.ae2-bbnet2.sef.pnap.net 169.585ms asymm 22
22: dedipath-64.edge2.sef003.pnap.net 168.898ms asymm 23
23: 69.25.117.196 168.671ms asymm 24
24: 91.196.223.248 168.063ms reached
Hops 21 and 22 contain sef, which is the Sebring airport in Florida. We could maybe leave it at that, but I think it would make more sense to assign the probe to Tampa or Orlando.
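Snapping a small airport to the closest major city could look something like the sketch below. The city list, its approximate coordinates, the `max_km` cutoff and the function names are all assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


# Approximate coordinates, for illustration only.
MAJOR_CITIES = {
    "Tampa": (27.95, -82.46),
    "Orlando": (28.54, -81.38),
    "Miami": (25.76, -80.19),
}


def nearest_major_city(airport_coord, cities=MAJOR_CITIES, max_km=200.0):
    """Return (city, distance_km) for the closest city within max_km, else None."""
    city, coord = min(cities.items(), key=lambda kv: haversine_km(airport_coord, kv[1]))
    dist = haversine_km(airport_coord, coord)
    return (city, dist) if dist <= max_km else None
```

When an airport sits at a similar distance from two cities, a pure nearest-city rule becomes sensitive to small coordinate differences, so a tie-breaker such as city population may be needed.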
Similar project https://www.caida.org/catalog/papers/2021_learning_extract_geographic_information/learning_extract_geographic_information.pdf
from globalping.
- We don't necessarily need an IP address nearby, just one in the general area, for instance the same city or state. We could extend the delay to 20-30ms just in case. We could also run tests against commercial IPs, Amazon/Google DCs for instance.
- We could combine this with the first solution.
- True. But on the other hand, we already trust users with the probe version they report.
from globalping.
- Well yes, that's what I meant. But that's not doable. E.g. if someone connects from Asia, South America or Africa, the chances of us having a reliable testing endpoint nearby are almost zero. And then how do you decide that 5-10ms is enough?
- Yeah, it's combinable with anything, but I am afraid that long-term it will do more harm than good.
- Yes, but the version is not that critical and gives the troll nothing. Faking locations would seriously hurt the service, so there is more motivation to do it.
from globalping.
Quick note: We need to replace Digital Element with something else, potentially DB-IP. In 90% of issues the fault lies with DE.
from globalping.
Maybe instead of measuring latency, we could rely on routing information. Make a request to a Cloudflare-hosted website and get the location it hit from the headers. Resolve the probe and POP locations to coordinates (there are APIs, we use one here), then check that the POP is among the top N (5?) closest CF locations (the list is easy to get, and distance based on coordinates is a simple calculation). If the list of CF locations is based on https://www.cloudflarestatus.com/ it can also take into account that some are temporarily rerouted.
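A rough sketch of that check, assuming the POP code is read from the `CF-Ray` response header (whose value ends in a dash followed by the data-center code) and that a table of POP coordinates is available. The function names and the `pop_coords` table are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def colo_from_cf_ray(cf_ray):
    """Extract the data-center code from a CF-Ray value like '230b030023ae2822-SJC'."""
    _, sep, colo = cf_ray.rpartition("-")
    return colo.upper() if sep and colo else None


def pop_is_plausible(probe_coord, colo, pop_coords, top_n=5):
    """True if `colo` is among the top_n POPs closest to the claimed probe location."""
    if colo not in pop_coords:
        return False
    ranked = sorted(pop_coords, key=lambda code: haversine_km(probe_coord, pop_coords[code]))
    return colo in ranked[:top_n]
```

A probe whose reported location is Warsaw but whose requests land on a POP outside its closest few would then be flagged for review rather than trusted.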
from globalping.
Cloudflare is using MaxMind as far as I know on their geoip endpoint, so it's nothing special. If you mean using the location of the POP, that's even worse, because their routing is all over the place and changes randomly. Poland was going to Frankfurt for months without any status updates until I manually reported it. And it happens all the time with all regions.
from globalping.
Yes, I meant routing directly. Putting that idea aside, what you want is probably this: https://arxiv.org/pdf/2004.07836.pdf (not necessarily this exact algorithm but the idea). Assuming we have maybe 200 static points with known correct locations, we can have them pinged, do some math, and have a probably good enough estimate of the new probe's location.
The 200 (not sure how many would really be needed for good results) static points can be not only public cloud services but also probes in our network. E.g. the API could have a list of those we run ourselves and whose location is 100% correct. Then those probes could be used to locate the other ones.
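As a very simple baseline for this idea (not the algorithm from the linked paper), one could take the landmark with the lowest RTT as the estimate and derive an uncertainty radius from the propagation speed in fiber. The 200 km/ms figure is the usual two-thirds-of-c approximation, and the function names are hypothetical.

```python
# Roughly 2/3 of the speed of light, a common approximation for fiber.
SPEED_KM_PER_MS = 200.0


def max_distance_km(rtt_ms):
    """Upper bound on the one-way distance implied by a round-trip time."""
    return (rtt_ms / 2) * SPEED_KM_PER_MS


def shortest_ping_estimate(measurements):
    """Pick the lowest-RTT landmark as the location estimate.

    measurements: list of (landmark_name, (lat, lon), rtt_ms).
    Returns (landmark_name, coord, radius_km).
    """
    name, coord, rtt = min(measurements, key=lambda m: m[2])
    return name, coord, max_distance_km(rtt)
```

The radius grows quickly with RTT: a 10ms round trip already allows up to 1000 km, which is why landmark density matters so much for city-level accuracy.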
from globalping.
I thought of that, I even want to make a public tool that uses our network for that. But it will only work in Europe and USA. Good luck guessing the location in Africa when all we have is 10 probes in South Africa :)
Maybe in a few years when we have thousands of probes everywhere
from globalping.
It shouldn't matter that much. It's true that the closer the probes, the better the results, but it should be good enough for this purpose even from a larger distance.
from globalping.
Why not? If someone is from Nigeria and we have probes in Egypt and South Africa, there is no way we can guess the country, not to mention the city, which is also very important info for us. And what do we do if the algorithm says Nigeria but geoip says Cameroon?
Also in most cases country is correctly provided by the DBs, maybe not all 3 but at least 1 will be correct. The issues always come when we're talking about cities.
from globalping.
I suppose your expectations are very different from what you outlined in the issue. The first post here talks about mixed up continents so I was aiming to identify a continent and roughly a country. For two neighboring countries, it won't impact the tests much anyway if we identify them incorrectly.
from globalping.
That was an example. I guess this solution could be used as a failover when 3 DBs have all different results and then choose the DB that agrees with this test. But at that point maybe it would make sense to just block that probe.
Without accurate geo information many tests could become useless. E.g. if I am debugging the routing of my CDN in Brazil, I don't want traceroutes from Mexico being reported as Brazil; it would only confuse the user.
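The failover idea above could be sketched as a consistency filter: keep only the DB candidates that lie within the RTT-implied radius of every landmark, and treat an empty result as a reason to block the probe. The DB names, the input shapes and the 100 km per ms of RTT bound (200 km/ms one way) are assumptions for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def consistent_candidates(candidates, measurements, km_per_rtt_ms=100.0):
    """Return the DB names whose location is physically possible.

    candidates: {db_name: (lat, lon)} as reported by each geoip DB.
    measurements: list of ((lat, lon), rtt_ms) from landmarks with known locations.
    """
    ok = []
    for db, coord in candidates.items():
        # A candidate is kept only if no landmark's RTT rules it out.
        if all(haversine_km(coord, lm) <= rtt * km_per_rtt_ms for lm, rtt in measurements):
            ok.append(db)
    return ok
```

If exactly one DB survives, its result can be used; if none or several do, the test does not settle the disagreement and blocking or manual review remains the fallback.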
from globalping.
Example list of probes that are all detected as Los Angeles, while ipinfo seems more accurate: https://gist.github.com/jimaek/b3fcd57908cb15272dd2a375a4872f1f
But even then it's wrong in many cases.
from globalping.