Dear Job & all,
APNIC has been ready to do RPKI ROV and drop invalids for some time, having gained good experience doing so on our conference network and our testing network. We have also been providing training & technical assistance (& deployathons) on this for a few years, so we have gained good operational experience from others’ networks as well. Our AS4608 is just a bit special because it hosts our MyAPNIC system, which members have to use to create/update their ROAs.
As we are seeing thousands of invalid routes originated from hundreds of ASNs in our region, we want to do better change management before we make any big change. That means reminding members who are announcing invalid routes by mistake to correct their ROAs, and educating them to access MyAPNIC via alternate networks if incorrect ROAs cause connectivity issues from their home networks. In fact, those are the key messages of our blog post: https://blog.apnic.net/2021/04/28/cleaning-up-your-rpki-invalid-routes/
We don’t have a fixed timeline for deploying ROV on AS4608 yet, as we are still watching the progress of the dropping of invalid routes in our region. However, the effective change (i.e. ROV for AS4608) should now come earlier, as we have learnt that our transit providers will all drop invalids in a few months’ time, so the change management part is becoming more urgent.
We’re happy to hear any suggestions or input from the community.
And, when we receive the ROV deployment announcement from the last transit provider of ours, we will update you all.
In any case, let’s work together to reduce the number of invalid ROAs which were created by mistake.
Best regards,
Che-Hoo
From: Job Snijders via SIG Routing Security <sig-routingsecurity@apnic.net>
Reply to: Job Snijders <job@fastly.com>
Date: Monday, 10 May 2021 at 20:30
To: Geoff Huston <gih@apnic.net>
Cc: Aftab Siddiqui <aftab.siddiqui@gmail.com>, "sig-routingsecurity@apnic.net" <sig-routingsecurity@apnic.net>
Subject: [SIG-RoutingSecurity] Re: Should APNIC drop RPKI Invalids?
Dear group,
Should APNIC drop RPKI Invalids?
Yes.
At this stage I think it is more important for APNIC to focus on gaining
operational experience (which hopefully leads to increased Trust Anchor
stability), than to count how many RPKI invalid BGP routes (potential
hijacks!) APNIC's production network accepted.
On Thu, May 06, 2021 at 12:48:50PM +1000, Geoff Huston via SIG Routing Security wrote:
> If AS4608 turns on drop invalids, as you appear to be suggesting here,
> I for one would lose some aspects of visibility, and while we are all
> trying to deploy this and make it stable it seems to me to be a real
> shame if we were blinded from this deployment data at this stage.
Have the APNIC researchers considered the option to count 'invalid
routes rejected', instead of counting 'invalid routes accepted'?
Assuming APNIC's EBGP border routers are not from an ancient make &
model, those routers probably have support for RFC 7854 "BGP Monitoring
Protocol" (BMP) which can be used to enable researchers to inspect
"pre-policy Adj-RIB-In" data.
If the APNIC researchers set themselves up with a BMP feed, they won't
need to encourage the APNIC OPS team to degrade APNIC's routing security
posture for the sake of 'science'. Instead, APNIC OPS can develop a
sound routing policy focused on routing table hygiene, while the
research department can work with unfiltered data.
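
The "count invalid routes rejected" idea above can be sketched roughly as
follows. This is only an illustration under stated assumptions: it does
not parse BMP at all, and it assumes a collector has already decoded the
pre-policy and post-policy Adj-RIB-In views into simple records and
annotated each with its ROV state. The names Route and
count_rejected_invalids are hypothetical and not taken from any tool.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Route(NamedTuple):
    """One decoded route record (hypothetical shape, not a BMP structure)."""
    peer: str
    prefix: str
    origin_asn: int
    rov_state: str  # assumed to be "valid", "invalid" or "not-found"


def count_rejected_invalids(pre_policy: Iterable[Route],
                            post_policy: Iterable[Route]) -> Counter:
    """Count, per origin ASN, invalid routes seen pre-policy but not accepted."""
    accepted = set(post_policy)
    rejected = [r for r in pre_policy
                if r.rov_state == "invalid" and r not in accepted]
    return Counter(r.origin_asn for r in rejected)


# Documentation prefixes and private-use ASNs, for illustration only.
pre = [
    Route("peerA", "192.0.2.0/24", 64511, "invalid"),
    Route("peerB", "192.0.2.0/24", 64511, "invalid"),
    Route("peerA", "198.51.100.0/24", 64497, "valid"),
    Route("peerB", "203.0.113.0/24", 64500, "not-found"),
]
# With a drop-invalids policy, only the non-invalid routes survive.
post = [r for r in pre if r.rov_state != "invalid"]

rejected_by_origin = count_rejected_invalids(pre, post)
print(rejected_by_origin)
```

The point of the sketch is that the measurement moves from the routing
table (what was accepted) to the difference between the two views (what
was offered but filtered), so the policy and the research no longer
compete.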
> The question for me is what is the _real_ objective of APNIC dropping
> invalids and are there other ways to achieve this that do not impair
> the visibility of measurement tools that we run from within AS4608?
The real objective is to decrease the gap that currently exists between
those holding the keys to the kingdom and those trying to make a pig
fly in production environments.
The same considerations that apply to any multi-homed network, apply to
APNIC's network: when a multi-homed network uses RPKI ROV and rejects
invalids, they increase their chances of sending packets down a path
that leads to the authorized origin.
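
For readers less familiar with what "rejects invalids" means in practice,
here is a minimal sketch of the origin validation procedure described in
RFC 6811, written against Python's standard ipaddress module. It is an
illustration, not router configuration: the VRP list, the names VRP,
rov_state and accept, and the prefixes and ASNs (documentation and
private-use ranges) are all assumptions made up for the example.

```python
import ipaddress
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """Validated ROA Payload: prefix, maximum length, authorized origin AS."""
    prefix: str
    max_length: int
    asn: int


def rov_state(route_prefix: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Return 'valid', 'invalid' or 'not-found' for a route (RFC 6811 style)."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ipaddress.ip_network(vrp.prefix)
        # Only compare within the same address family.
        if route.version != vrp_net.version:
            continue
        # A VRP covers the route if the route lies inside the VRP prefix.
        if not route.subnet_of(vrp_net):
            continue
        covered = True
        # A match needs the same origin AS and a prefix no longer than
        # max_length; AS 0 never matches any route.
        if (vrp.asn != 0 and vrp.asn == origin_asn
                and route.prefixlen <= vrp.max_length):
            return "valid"
    return "invalid" if covered else "not-found"


def accept(route_prefix: str, origin_asn: int, vrps: Iterable[VRP]) -> bool:
    """Drop-invalids policy: accept everything except 'invalid' routes."""
    return rov_state(route_prefix, origin_asn, vrps) != "invalid"


VRPS = [
    VRP("192.0.2.0/24", 24, 64496),
    VRP("198.51.100.0/24", 24, 64497),
]

print(rov_state("192.0.2.0/24", 64496, VRPS))     # valid
print(rov_state("192.0.2.0/24", 64511, VRPS))     # invalid (wrong origin)
print(rov_state("198.51.100.0/25", 64497, VRPS))  # invalid (too specific)
print(rov_state("203.0.113.0/24", 64496, VRPS))   # not-found
```

Note that "not-found" routes are still accepted under this policy; only
routes that are covered by some VRP yet match none of them are dropped.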
An organization like APNIC should "walk like it talks", APNIC should
gladly suffer the same pain other network operators experience. What's
the worst that can happen? That APNIC OPS discover a problem before
members discover/report it, and then add a check to a pipeline to
prevent it from recurring?
If APNIC (who are a regular Autonomous System, like tens of thousands of
other ASs!) places themselves in the same boat and uses the same tools
and technologies (BGP-4, RPKI, HTTPS, DNS) that others have to use to
connect to the Internet, conversations about how to improve become much
easier.
We've seen in the last 2 years how a lack of clue and hands-on
experience led some people to believe that validating signatures,
validating manifests, CRLs, expiration dates, or detecting basic
encoding errors all are optional and just for funzies. Sadly we've also
observed how RSYNC and RRDP service endpoints can be broken for months
on end. It is time to stop merely masking problems and really solve
the underlying issue. Problems in this context are a thousand-fold
more difficult to discuss when one of the parties will never experience
any operational consequences from failures in the RPKI.
The credibility of Trust Anchor operators will increase if those
operators actually use the RPKI technology stack for production purposes
themselves.
Kind regards,
Job
ps. I'm happy to volunteer my time to help APNIC write router configs /
set up RTR servers based on my experiences with such work.