Did a quick dive to explore the viability of migrating to ipinfo. My idea was: use the Lite version to enrich everything, then use pay-as-you-go to enrich authenticated user sessions.
I couldn't get /lite/ to work. In a sample of IPs I tried, several return 404, while your website returns information for those same IPs. Are these just not included in the Lite dataset?
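For reference, here is roughly how I probed the endpoint. This is a sketch that assumes the Lite API lives at `https://api.ipinfo.io/lite/{ip}` with a `token` query parameter; the sample IPs and token are placeholders.

```python
import json
import urllib.error
import urllib.request

LITE_URL = "https://api.ipinfo.io/lite/{ip}?token={token}"

def lite_url(ip: str, token: str) -> str:
    """Build the Lite API URL for a single IP (assumed endpoint shape)."""
    return LITE_URL.format(ip=ip, token=token)

def probe(ips, token):
    """Map each IP to its Lite payload, or None where the API returns 404."""
    results = {}
    for ip in ips:
        try:
            with urllib.request.urlopen(lite_url(ip, token)) as resp:
                results[ip] = json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code == 404:
                results[ip] = None  # the 404s I am describing above
            else:
                raise
    return results
```

Running `probe()` over my sample is what surfaced the 404s.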
Turns out there is no pay-as-you-go tier; a subscription is the only option. Not a deal breaker, but a disappointing setup.
So many of my open questions answered in one answer. Thank you.
A follow-up based on new information: if a geofeed reports the wrong geolocation for an IP, and your method detects a different geolocation, what do I see as a consumer of your API? I am assuming the inferred data, but that also feels counter-intuitive, since it does not align with what the ASN/ISP is reporting.
How often does your active measurement data disagree with geofeed data?
How do you handle mobile/cellular IPs?
> Do you really need large scale IP address enrichment of all the IP addresses that visit your website? If yes, then for the first layer use our free data that provides ASN and country information.
If I am troubleshooting a support case that is days/weeks/months old, wouldn't this mean that enriching this information at a later date may give me different data than what it was associated with at the time the requests were made? My understanding was that IPs get re-assigned.
How frequently do IP-to-location mappings change in practice?
> I am assuming the inferred data, but that also feels counter-intuitive (since the data does not align with what ASN/ISP are reporting).
That is a very good question. Geofeed has no verification system. Active measurement is what we use to verify the ASN or ISP itself.
Even active measurement has its limitations. In cases where active measurement does not produce reliable data, we reach out to ISPs and ASNs to purchase a server in their facility. Geofeed is a voluntary system, and most major ISPs do not maintain or even publish one. For example, today I found that a major UK-based telecom geolocated 500k IP addresses in a town of 200k people. ISPs are not inherently incentivized to keep their self-reported, voluntarily published location data accurate. So we do proactive outreach to purchase a server from them, so we can provide consistently accurate data for their IP addresses.
For residential ISPs, we do a lot of outreach and open communication to build a good partnership with them. The goal is that we pay for the privilege to report accurate data for them.
> How often does your active measurement data disagree with geofeed data?
- Country-level: 92.0% accurate → 8% wrong country
- City-level: 79.6% accurate → 20.4% wrong city
Mobile Device GPS (169 devices, 24 countries):
- Country-level: 84.5% accurate
- City-level: 29.9% accurate → 70.1% wrong city
> How do you handle mobile/cellular IPs
Primarily through active measurement. We are also running a lot of research into more reliable mobile geolocation data.
Because our data is updated daily, I think the refresh rate gives us an accuracy advantage.
> If I am troubleshooting a support case that is days/weeks/months old, wouldn't this mean that enriching this information at a later date may give me different data than what it was associated with at the time the requests were made? My understanding was that IPs get re-assigned.
You will be surprised to know that there is not much demand for historical IP location data.
If you are evaluating a support case after some time, you should work with your current data. If the customer raises a question, address it in real time with their current IP address.
I generally do not recommend storing historical IP geolocation information. In most operations, enrichment happens in real time, within the day, unless you want to do some kind of periodic reporting.
Internally, we of course have the data, but because our IP geolocation is so granular, the dataset currently sits at around 700 MB. Adding a historical layer to it would mean a terabyte of data, and there is not much consumer need for that.
> How frequently do IP-to-location mappings change in practice?
Do you happen to know if anyone is compiling all of this data about VPNs into one place? It would be super interesting to know which VPNs are providing genuine server locations vs. masquerading them. Maybe even good SEO for you.
> I highly recommend that you work with current day's data.
Just to clarify: you are suggesting that we don't proactively enrich every IP address, just store the IPs, and only enrich them when troubleshooting something?
> Do you happen to know if anyone is compiling all of this data about VPNs into one place? It would be super interesting to know which VPNs are providing genuine services vs masquerading the locations. Maybe even an SEO for you.
We made that report independently, and according to our analysis, only three VPNs (Windscribe, Mullvad, and iVPN) do not use virtual VPN server locations.
> Just to clarify: You are suggesting that we don't pro-actively enrich every IP address, store IPs, and only enrich them when troubleshooting something?
I think you should experiment with this yourself a little. The Lite API is completely free, so you can try both ingest-time enrichment and post-hoc enrichment and see what works best for you.
I have a few flows I'm using it for and a growing list of things I want to automate. Basically, if there is a process that a human would otherwise have to do (like creating drafts or running scripts with variable data), I make axe do it.
1. I have a flow where I pass in a YouTube video: the first agent calls an API to get the transcript, the second converts that transcript into a blog-like post, and the third uploads that post to Instapaper.
2. Blog post drafting: I talk into my phone's notes app, which gets synced via Syncthing. The first agent takes that text and searches my note system for related information, then passes my raw text and notes to the next agent to draft a blog post. A third agent takes out all the em dashes, because I'm tired of removing them myself. Once that's all done, I read and edit it to be exactly what I want.
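Flow 1 above is essentially a linear pipeline of agent steps. Here is a minimal sketch of that shape; `get_transcript`, `summarize_as_post`, and `push_to_instapaper` are hypothetical stand-ins for the three agents, not real APIs.

```python
def run_pipeline(initial_input, steps):
    """Feed each step's output into the next step, like chained agents."""
    data = initial_input
    for step in steps:
        data = step(data)
    return data

def get_transcript(url):      # agent 1: fetch the transcript via an API
    return f"transcript of {url}"

def summarize_as_post(text):  # agent 2: rewrite the transcript as a post
    return f"post: {text}"

def push_to_instapaper(post): # agent 3: upload the finished post
    return {"uploaded": post}

result = run_pipeline(
    "https://youtube.com/watch?v=example",
    [get_transcript, summarize_as_post, push_to_instapaper],
)
```

Each agent only sees the previous agent's output, which is what makes these flows easy to compose and reorder.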
This should not add more latency than your average VPN, since the WebSocket overhead is minimal and the round-trip time is about the same.
At the moment, this runs on a single instance with no load balancing. The intended use case was streaming MCP SSE traffic, which is very lightweight, so I would expect it to handle a lot of traffic as-is. But if people start using the public instance for other use cases, I will need to think about ways to scale it.