Draft - comments and corrections are solicited
Level 1 Attribution
1 Introduction
BAA 03-03-FH defines four
"levels" of attack attribution, of which the first is identification
of "the specific hosts involved in the attack". The next section of
this paper is devoted to a more careful definition of the problem as
we see it. After that we describe and evaluate work that is relevant
to this problem. We begin by presenting the evaluation criteria along
with the reasons we think they are important. The related work falls
into two general classes. One is "route based filtering", an
enforcement mechanism which drops packets that routers can recognize
as carrying incorrect source addresses. The other class consists of
various methods for identifying links through which a packet was
forwarded.
Note: It's also possible to identify forwarding routers,
but links turn out to be a little better.
Most of the related work is previous work, but we also describe
a few things that have not appeared before to our knowledge.
Finally we turn to some important general questions. The first is:
How can these methods be used to solve the level 1 attribution problem?
Note that neither class of related work directly solves the problem
as we define it.
All of the methods in the related work require cooperation from
machines in different locations. We expect in most real networks it
will not be possible to obtain universal cooperation. This leads to
the question:
How does the performance of various solutions depend on the number
and placement of cooperating machines?
Finally, since there are different approaches, it is likely that some
machines will be willing and able to cooperate with one approach while
other machines will be willing and able to cooperate with another.
This leads to the question:
How can evidence from different solutions be combined?
We present answers to these questions and point out some remaining
difficult problems that must be overcome.
2 Problem Statement
Our model is that an IP packet, P, is generated by a machine, G,
forwarded by a sequence of IP routers, and finally, if not dropped
along the way, delivered to a recipient machine.
The goal of "level 1 attribution" is, given P, to identify G. This
identification might in general be requested by the recipient, any of
the forwarding routers, or any other machine to which they communicate
information about P.
Note that every IP packet is supposed to contain the IP address of the
machine that generated it in the "source address" field. If this
requirement were enforced then there would be no level 1 attribution
problem.
Some further explanation is in order on a number of points.
2.1 What's an IP router and what does it mean to "forward"?
By IP router we mean a machine that acts substantially in accordance
with rfc1812. That
reference describes forwarding, in particular how packets received by
a router are transformed into packets that it subsequently sends. The
term "forward" is used only for the transformation, described in
rfc1812, that affects only a few fields of the IP header. It does
not apply to such things as replies to packets addressed to the router
or to ICMP replies that the router sends to source addresses of
packets it cannot forward. These are considered to be "generated" by
the router. Similarly NAT devices that alter source addresses are
considered to generate the packets they send with altered source
addresses. Although it is important to find the packets that "cause"
these packets to be generated, that problem is classified as "level 2
attribution" which is to be addressed elsewhere.
2.2 What does it mean to "generate" a packet?
A packet that is sent by a machine but not "forwarded" in the sense
above is considered to have been generated by the sending machine.
2.3 Why only IP?
In general, those who wish to attribute a packet are interested in
more than IP, e.g., attributing a non-IP packet, or attributing an IP
packet to a non-IP device. We limit the discussion to IP because
- the vast majority of actual (at least non-local) traffic to be
attributed is expected to be IP
- almost all of the work to date has concentrated on IP
- the methods applicable to IP and to non-IP traffic overlap
substantially
2.4 What does it mean to "identify" a machine?
Ideally this should allow us to find a physical box. In practice a
requester is generally very happy with an IP address that is unique
within the network visible to him. In some cases, the generating
machine has no such address. In this case the second choice would be
an identity for the first forwarding router that has such an identity.
This may be viewed as an approximate answer. Any further information,
such as the fact that the packet arrived at that machine from a
particular wire, improves the approximation. This is generally the
best that can be achieved with the techniques we review here, and even
this is only possible in the presence of very high levels of
cooperation.
In the general case, lower levels of cooperation will generate less
complete attribution data, and this restricts the source to a set of
possible IP addresses. The quality of the result is then related to
the size of that set, with smaller sets considered higher quality.
2.5 Physical signals
One area that we do not address here is characterization of physical
signals. This is very useful for identifying a machine that is one
hop away, but for the most part we are concerned here with machines
that are more than one hop away.
3 Evaluation Criteria
We classify methods in terms of the answers to the following questions
that are meant to point out important advantages and disadvantages of
these solutions relative to each other. The questions are phrased so
as to make "yes" the preferred answer.
- Does the mechanism allow attribution of a single packet?
Some solutions apply only to large sets of packets presumed to come
from the same place. These are generally designed to attribute
floods. "Yes" means that after you receive a packet (but possibly
only well after deciding that some class of packets is to be
attributed - see advance notice below)
you can find data for its attribution.
Note: For purposes of the intelligence community we
believe that single packet attribution is required.
- Does the solution work with existing routers?
Some solutions require changing the way routers work. These face an
obvious economic obstacle in the cost of replacing existing routers.
If the solution is to be used in high speed routers then an even more
severe problem arises. Most organizations are not in a position to
build high speed routers and it's very difficult to convince those
that are to incorporate the desired changes. It is relatively easy to
build custom routers starting from open source software running on
cheap standard hardware, but this currently limits the bandwidth to
about 1Gbit.
An interesting compromise seems practical for some cases. If a high
speed router can identify a relatively small subset of the arriving
packets that must be processed by the attribution mechanism, it could
route these packets to a separate lower speed but more programmable
machine, which could do the specialized processing, and then return
the (possibly modified) packets to be forwarded on. We will mention
this alternative below as applicable.
One other approach to this problem is worth mentioning. Our general
complaint with commercial routers is that the end user has limited
control over what they do, i.e., they can be "configured", but not
"reprogrammed". We propose a plausible design for arbitrarily high
speed programmable routers. This design requires a very small amount
of special purpose high speed hardware to handle the high speed
network interfaces, and an arbitrarily large number of slower
general purpose programmable machines, probably running something like
FreeBSD or Linux. The special purpose hardware has to divide the
incoming high speed traffic into approximately equal packet streams,
which are then distributed to the general purpose machines. These act
as low speed but programmable routers, forwarding packets to the
special purpose high speed hardware connected to the desired output.
This hardware then recombines the outgoing traffic from all of the low
speed routers into a single high speed outgoing link. We look forward
to the appearance of the proposed special purpose hardware.
Note: Anyone in a position to help make this happen is
invited (encouraged, begged) to contact the author!
- Does it work without advance notice of packets to be attributed?
Some solutions are too expensive to be used for all packets all the
time. One example is that the solution might slow down or even
disrupt normal traffic while in operation. As another, a monitor that
remembers something about what it has seen must limit the set of
packets it remembers. One solution is to remember only recent
packets. This, of course, limits attribution to recent packets.
Another is to remember only those determined in advance to be
interesting. In the latter case we have to tell it when we decide
that certain kinds of packets are interesting. And we have to do that
before these packets can be recorded. This is a fatal flaw for
cases where we don't know until a packet appears that we want to find
its origin.
- Does it work without additional communication after identification
of packets to be attributed?
Typically solutions that rely on memory at cooperating machines
require communication with those machines in order to ask what they
remember and receive an answer. This is a disadvantage in several
ways. First, communication has its own cost in network bandwidth and in
time before the attribution is completed. Also that communication can
be attacked. This is probably sufficient reason for the owner of the
cooperating machine to restrict access to his own network.
- Is it free of other "special" problems?
This is a place for collecting miscellaneous disadvantages relative
to other approaches.
While the questions above are important and useful for classification,
we devote most of the space below to the questions
- How does it work?
- How well does it work?
The second question involves such questions as whether the results
can be incorrect, how the method could be attacked and how those
attacks can be defended.
4 Related work
As mentioned earlier, we can classify the related work into link
identification, which we describe first, and packet filtering, which
is described second. Within each of these classes we identify a
number of subclasses, each of which may be populated by a variety
of more specific variants.
4.1 Link testing
Link testing methods are intended to trace ongoing flooding attacks.
They identify links through which the traffic of interest flows by
testing candidate links, i.e., monitoring them for the traffic to be
traced. One simple way to do this is described in
Characterizing and Tracing Packet Floods Using Cisco Router.
The person doing the trace presumably has administrative control over
a router known to be on the forwarding path. He reconfigures that
router to help him determine which incoming link(s) carry the traffic.
In the case of a flood, he might simply ask for a count of packets
addressed to the victim arriving on each interface. For other traffic
he would need to provide a description that the router can use to
identify the traffic of interest. Unless the traffic is being
generated by the router it must be arriving at one or more inputs.
This leads to the next upstream router(s) which can then be traced in
turn.
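For concreteness, the hop-by-hop procedure can be sketched as follows.
This is only an illustration (in Python); the helpers
query_interface_counts (per-interface counts of packets addressed to the
victim) and upstream_neighbor (a topology lookup) are hypothetical, and
in practice the steps are usually carried out by an operator using
router counters and ACLs.

    def trace_flood(start_router, victim, query_interface_counts, upstream_neighbor):
        """Walk upstream from a router known to be on the forwarding path,
        following the input interfaces that carry traffic addressed to the
        victim. Returns the routers on the reconstructed path."""
        path = [start_router]
        router = start_router
        while True:
            counts = query_interface_counts(router, victim)   # {interface: packets}
            if not counts:
                break
            iface, pkts = max(counts.items(), key=lambda kv: kv[1])
            if pkts == 0:
                break          # traffic is generated here, or the trail has gone cold
            upstream = upstream_neighbor(router, iface)        # topology lookup
            if upstream is None or upstream in path:
                break          # left our administrative domain (or hit a loop)
            path.append(upstream)
            router = upstream
        return path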
Similar techniques can be used to move upstream more than one link at
a time. One might, with automated support, simultaneously reconfigure
all the routers on some perimeter around the victim, such as all the
border routers of the ISP. One interesting variant of this is
described in
CenterTrack: An IP overlay network for tracking DoS floods.
This reroutes specified packets over tunnels from edge routers to
special tracking machines, which then can easily identify the edge
routers initially receiving the traffic.
Note: This idea will appear again in the section on
deterministic marking.
Another variant is described in
BlackHole Route Server and Tracking Traffic on an IP Network.
This uses routing mechanisms to first generate ICMP replies from any
router receiving traffic for the victim and then to collect some of
these at a central point in order to identify the routers that did
receive them. Unfortunately, this technique relies on random (or
otherwise stupid forms of) source spoofing.
It is also possible to do a hostile form of link testing without
cooperation from the owners of the forwarding routers, as described in
Tracing Anonymous Packets to Their Approximate Source
Note: This paper seems to contain the original idea for most
of the methods we describe here!
This involves attacking (typically flooding) various links. This has
the significant disadvantage of attacking other (presumably innocent)
traffic. The expectation is that an attack on a link that carries the
traffic to be traced will cause a reduction in the rate at which that
traffic flows through that link, and therefore a reduction in the rate
at which it arrives downstream. This requires cooperation from
machines in a position to attack the targeted links. In this paper
that cooperation was probably unintentional. It also requires
cooperation from machines in a position to detect the loss of traffic,
but typically the party that wants to trace the packets is already in
a position to detect the loss of traffic, i.e., at the destination
address. This variant also requires more knowledge of the routing
than other link identifying methods.
The attacking variant is the only one that does not require
administrative access to routers. In fact, this was the motivation for its
development. Normal link testing could actually be done without
cooperation of routers if cooperating machines could be placed at the
links to be tested, and if these were fast enough to process the
traffic at those links. However we regard this as a different
solution discussed below.
Classification
- single packet: no
- existing routers: yes
- no advance notice: no
Data collection doesn't start until an attribution problem arises.
- no additional communication: no*
Normally the operator has to communicate with upstream routers
to get results. Methods that use routing tricks avoid this.
The attack variant uses lots of additional communication.
- no special problems: no
The process is generally manual (controlled by people), hence slow.
It may disrupt normal traffic, and generally requires extra work on
the part of routers. For these reasons, link testing techniques are
generally considered to be practical only on a temporary (emergency)
basis.
Attacks:
Since link testing relies on an ongoing attack, changing the attack
pattern with time is likely to confuse matters greatly.
An attacker who knew details of the attribution testing (what's being
tested and when) could intentionally mislead it.
Since this technique requires administrative access to routers (other
than the attack variant), cooperation can be expected to be restricted
to the administrative domain owning the routers. To the extent that
further cooperation is available it will be manual, hence inconvenient
and slow.
The attack variant does not require access to routers but faces a
whole set of problems of its own. It still needs cooperating machines
in a position to affect the links to be tested and sensors (generally
at the victim, but more generally along the attack path) to detect
resulting changes in traffic. It also requires knowledge of network
topology and routing. The ability to identify a link further depends
on such things as how much bandwidth the cooperating machines have
available and details of how routers are configured. For instance the
ability to attack might depend on what kind of traffic is used to
attack and what kind is being attacked.
4.2 Itrace
Itrace describes a method in which routers randomly (probability
~5e-5) send extra packets to source or destination addresses of
forwarded packets to report that the packet traversed a given link
connected to the router. With sufficiently many routers doing this,
the recipient of a flood can reconstruct the path(s) of the flooding
packets.
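As an illustration only (not code from the Itrace proposal), the
router-side behavior amounts to something like the following sketch,
where the packet is represented as a dict and send is a hypothetical
transmit helper:

    import random

    ITRACE_PROBABILITY = 5e-5    # roughly the sampling rate discussed for Itrace

    def maybe_send_itrace(packet, incoming_link, send):
        """With small probability, emit an extra trace message reporting the
        link on which the forwarded packet arrived. packet is assumed to be
        a dict with 'src' and 'dst' addresses; send(addr, msg) transmits a
        message to an address."""
        if random.random() < ITRACE_PROBABILITY:
            trace = {"type": "itrace",
                     "traced_src": packet["src"],
                     "traced_dst": packet["dst"],
                     "link": incoming_link}
            send(packet["dst"], trace)    # could equally be sent to the source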
Classification
- single packet: no
Actually, it does attribute a single packet but with very low
probability, so if you really want to know where a particular
packet came from this is not a good solution.
- existing routers: no
Although this is described as being done by routers, it seems quite
practical to offload the task. If a router could be configured to
route packets probabilistically then it could route occasional
packets to a separate Itrace machine which would then forward both
the original packet and the trace packet.
Alternatively, if a router can be configured to sample packets and
send them to an analysis interface, that could be connected to an
Itrace machine that would simply send out trace packets.
- no advance notice: yes
- no additional communication: yes
The trace packets are, of course, additional communication relative
to non-Itrace operation, but these are sent as part of normal operation
independent of any attribution attempts.
- no special problems: yes
Attacks:
Attackers can send itrace packets to make it appear that attacks come
from other places, but generally cannot stop the "good" data from
telling the victim about the real attacker. The false (attack) trace
packets might be distinguished from the real trace packets by
authentication techniques, but it's much easier to make up false
packets than to check them, so this is another opportunity for the
attacker to impose additional cost on the victim.
4.3 Probabilistic packet marking (ppm)
The method described in
Practical Network Support for IP Traceback
is similar to Itrace in that a router has to do some extra work for
a small fraction of the packets it forwards, though in this case the
fraction is substantially larger than for Itrace (probability on the order of 0.05).
It replaces (presumably) non-essential data in forwarded packets with
data identifying the link over which the packet was forwarded. This allows
the recipient to reconstruct the forwarding path(s) of a flood.
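The sketch below is in the spirit of the edge-sampling idea from that
paper, but it is not a faithful implementation: the packet is shown as a
dict with three separate mark fields, whereas the actual proposal
squeezes partial marks into existing header space.

    import random

    MARK_PROBABILITY = 0.05      # order of magnitude used in the ppm proposals

    def ppm_mark(packet, router_id):
        """Probabilistic (edge-sampling style) marking sketch."""
        if random.random() < MARK_PROBABILITY:
            packet["mark_start"] = router_id      # start a new edge sample
            packet["mark_distance"] = 0
        elif packet.get("mark_distance") == 0:
            packet["mark_end"] = router_id        # close the edge begun one hop upstream
            packet["mark_distance"] = 1
        else:
            packet["mark_distance"] = packet.get("mark_distance", 0) + 1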
Classification
- single packet: no
- existing routers: no
As in the case of Itrace one could imagine offloading this task if the
router could probabilistically route packets to a marking machine.
However, the increased probability suggests that the marking machine
would have to be much faster than an Itrace machine, and for a fast
link several such machines might be necessary.
- no advance notice: yes
- no additional communication: yes
- no special problems: yes
Attacks:
Attacks involving spoofed traceback data are described in
GOSSIB vs. IP Traceback Rumors and
On the Effectiveness of Route-Based Packet Filtering for Distributed
DoS Attack Prevention in Power-Law Internets.
In general the two major problems in ppm reliability are
- The probabilistic nature of the algorithm causes some packets to not
be marked by cooperating routers, and these retain whatever marks are
given them by the senders. Attackers can then use these marks to
intentionally mislead the attribution mechanism.
- In an effort to avoid expanding packets, ppm uses (supposedly)
underutilized space in current IP packets. But this is not enough
to send the desired data, so the marks in a single packet are used
to provide only partial data. However this also leads to errors in
attribution, particularly in the presence of large numbers of paths.
Advanced and authenticated marking schemes for ip traceback
attempts to fix some of the problems.
Editorial opinion
Our view is that even if all of the problems are fixed, probabilistic
marking has no advantage over deterministic
marking (below), while deterministic marking has significant
advantages over probabilistic marking, particularly for attribution of
a single packet. High speed commercial routers don't currently
support any of the marking methods. None of them appear to
be significantly more difficult to support than any other. In the
unlikely event that we can influence router manufacturers to support
new features we might as well ask for the ones that are most useful.
The supposed benefit of probabilistic marking over deterministic
marking is that it does not require additional space in the packet.
This is considered to be a problem not because of the loss of payload
bandwidth but because the space will increase with each hop and this
is thought to be expensive for routers. The first problem can be
solved by allocating the space all at once at the first marking
router. The marking itself seems trivial, comparable in complexity
with decreasing TTL. In fact the work to allocate the extra space in
the packet at the first router is also relatively small. A more
realistic objection is that the extra space is likely to cause the
packet to exceed the MTU on the next link. This should not
be a problem since TCP/IP was designed to deal with such problems. In
the current internet it actually is something of a problem because of
widespread misconfiguration. See rfc2923 and mss.phildev.net for details. Note that
the same problems occur with other protocols that need to steal a
little bandwidth, e.g., tunneling. There are commonly used
"work-arounds" (perhaps more accurately described as "I have to break
my network in order to work with your broken network") for such
problems.
4.4 SPIE
This method, described in
Single-Packet IP Traceback, calls on
routers to remember, possibly for a very limited time, data that allows
them to answer with high confidence the question "did you recently
forward the following packet?". The forwarding path of a single packet
can be reconstructed by querying such routers soon after the packet is
observed. More recent work (private communication) moves the processing
from the router to a specialized machine observing traffic on a link.
This can be viewed as a variant of
remote monitor described below.
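The published SPIE work stores Bloom-filter digests of the packets it
forwards rather than the packets themselves. The toy sketch below
illustrates that idea; the table size, number of hash functions, and use
of SHA-256 are arbitrary choices of ours, and a real implementation
digests the invariant IP header fields plus a payload prefix and rotates
tables over time.

    import hashlib

    class PacketDigestStore:
        """Toy Bloom-filter style digest table: answers 'did I recently see
        this packet?' with no false negatives and a tunable false-positive
        rate, using a few bits per packet rather than storing packets."""

        def __init__(self, size_bits=1 << 20, num_hashes=3):
            self.size = size_bits
            self.k = num_hashes
            self.bits = bytearray(size_bits // 8)

        def _positions(self, invariant_bytes):
            # Derive k bit positions from a hash of the packet's invariant fields.
            digest = hashlib.sha256(invariant_bytes).digest()
            for i in range(self.k):
                chunk = digest[4 * i: 4 * i + 4]
                yield int.from_bytes(chunk, "big") % self.size

        def record(self, invariant_bytes):
            for pos in self._positions(invariant_bytes):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def probably_forwarded(self, invariant_bytes):
            return all(self.bits[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(invariant_bytes))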
Classification
- single packet: yes
- existing routers: no
As noted above, this task could be performed without the router.
In order to support very high speeds it would be necessary to
use special purpose hardware.
- no advance notice: yes
- no additional communication: no
As in other monitoring solutions one has to query the monitor.
- no special problems: no
The storage required is proportional to the number of packets one
wants to record. This method achieves high compression, on the order
of one or two bytes per packet, but this is still a lot of storage
per unit time for a fast router.
Attacks:
Attackers can attack the query/response communication, either the traffic
or the endpoints. For that reason access to traceback data will normally
be restricted to the administrative domain owning the routers, and possibly
a few other trusted places.
4.5 Remote Monitors
General purpose machines controlled by the party interested in attribution
can be distributed throughout the network to listen to traffic passing by.
They record whatever they are programmed to and then answer questions about
the recorded data.
Classification
- single packet: yes
Of course, we can only expect to attribute the packets that the sensors
have been instructed to look for.
Generally this is a small subset of all packets.
- existing routers: yes
- no advance notice: no
- no additional communication: no
You have to query the sensor.
- no special problems: yes
Attacks:
The same considerations described under SPIE apply here as well.
4.6 Deterministic packet marking
In this case routers mark all forwarded packets with path data.
Packets traversing the same path therefore are marked the same. The
simplest and oldest example of a similar capability is the IP "record
route" option. IP routers are supposed to add their own addresses to
such an option as they forward packets. This actually records a router
rather than a link, but if the option had room for two addresses, it
would record the first two routers, which would identify a link.
The first problem with the IP option as an attribution method is that
it's the choice of the sender to include the option. In the current
internet this allows attribution of senders who wish to be attributed,
but that's not very useful. This problem could be solved by
cooperating routers inserting the option into packets that do not
already have it.
The next problem is that routers are supposed to leave the record
route option unchanged if the space allocated for it is already filled
with data. An attacker could therefore spoof any data he wanted to.
The solution to this problem is that cooperating routers should
be configured to know which of their neighbors is "trustworthy" and
accept trace data only from such neighbors. Data from other neighbors
should be overwritten. This gives us the address of the last router
on the path that was preceded by an untrusted neighbor.
In the case where the first cooperating router is not connected to the
last in a continuous path of cooperating routers this loses some data,
namely the links traversed before the packet reached the last untrusted
router. Unfortunately, as will be shown later, this is actually the
most valuable data! The best solution to this problem that we have
found so far is tunnels between non-adjacent cooperating routers.
Note: We believe this to be a new result.
Similar techniques were mentioned under link testing. In effect, a
cooperating router must determine which cooperating router will be
the next to receive a given packet, and encapsulate that packet in a
tunnel to that next router. In the most general case this leads to
an overlay network of cooperating routers.
The design of such a network reflects trade offs between the efficiency
of routing and what attribution data is available at what places.
It should be mentioned that cooperation in a given method is not always
a binary property of a machine. In this case an ISP might be willing
to route packets addressed to your network via a tunnel to another
router cooperating with you, but still not be willing to route other
packets that way.
Another problem is that transitive trust as described above allows a
single compromised router to generate any attribution data. For this
reason it is better to gather at least some additional downstream data
for purposes of validation. We show below how that data can be used.
Providing for Responsibility in a Global Information
Infrastructure describes a form of packet marking that gathers
complete path data in a very compressed form, encoding each link in
just the number of bits to specify one of the neighbors of the machine
that received the packet on that link.
Note: This is the earliest reference we have seen for this
scheme. It was reinvented and implemented as part of the flooding
defense described in A Fair
Service Approach to Defenses Against Packet Flooding Attacks. We
now see some ways to do a little better, but as a practical matter
this is probably not necessary.
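A toy sketch of the compressed encoding just described: each hop costs
only enough bits to name one of the receiving router's neighbors. How
the bits are carried in the packet and how neighbors are numbered are
left abstract here.

    import math

    def append_link_mark(path_bits, neighbor_index, num_neighbors):
        """Append one link to a compressed path carried as (value, bit_length).
        The link is encoded as the index of the neighbor from which the
        marking router received the packet."""
        value, length = path_bits
        width = max(1, math.ceil(math.log2(num_neighbors)))
        return ((value << width) | neighbor_index, length + width)

    # Example: the packet arrives via neighbor 2 of a router with 5 neighbors,
    # then via neighbor 0 of a router with 3 neighbors.
    marks = (0, 0)
    marks = append_link_mark(marks, 2, 5)    # costs 3 bits
    marks = append_link_mark(marks, 0, 3)    # costs 2 bits
    print(marks)                             # (8, 5): the bit string 010 00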
As will be seen below, the data needed for attribution is the IP address
at the sending side of the first identified link. The compressed path
does not make extraction of that data convenient. In small networks
with static routing it is sufficient to simply store the association
between paths and their origins. In networks with single path routing
this can be easily generated by sending a ping to each address and
storing the path recorded on the reply packet.
Changing IP to Eliminate Source
Forgery suggests a modification that is more suitable in the
general case, where the sending IP address of the first recorded link
is added to the path data. The compressed path data is sufficient for
reconstructing the initial address (in fact, the entire path) but this
requires either information about network topology or additional
communication to provide such information. An easy test to see that
the initial IP address does actually correspond to the recorded path
is to send a ping packet to the IP address and see whether the reply
comes back with the same path. This, of course, still assumes a
single path and a stable route.
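The check itself is trivial to state; the sketch below assumes a
hypothetical probe_path helper that pings an address and returns the
path marked on the echo reply.

    def initial_address_consistent(claimed_addr, recorded_path, probe_path):
        """Under single-path, stable routing, a ping to the claimed initial
        address should come back carrying the same recorded path as the
        suspect packet."""
        return probe_path(claimed_addr) == recorded_path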
Note that a router can still forge paths that pass through itself.
Advanced and Authenticated Marking Schemes for IP Traceback
describes some techniques that are actually meant to address problems
in nondeterministic packet marking, but seem applicable to this problem
as well.
Pi: A new defense mechanism against IP spoofing and DDoS attacks
describes a deterministic marking method that is in some ways a
compromise between deterministic and nondeterministic marking. It is
deterministic, but like the nondeterministic methods, avoids expanding
the packet and generally provides less complete attribution data.
- Like ppm there may be a residue left over from the initial marks.
Unlike ppm, the residue is a particular set of bits that survive a
given path. This means that different marks may in fact come from the
same source.
- The marks from different paths are not necessarily different.
In general, then, the marks that arrive with PI do not identify a
single link as the other methods do. They do still identify a set of
possible links. This slightly complicates the analysis
below.
Some commercial routers can be configured to do a very limited form of
deterministic path marking. Cisco IOS allows routers to alter the TOS
field, which could be used by a small neighborhood of routers to identify
up to 64 places where a packet enters the neighborhood. Possibly other
commercial routers can be programmed in similar ways.
This could be used in conjunction with routers that do "real" path
marking to allow small clusters of legacy routers to cooperate in a
larger neighborhood. That is, the old routers would use TOS marking
at entry into the neighborhood, and at exit, a more capable router
would incorporate that mark into the "real" path data.
As in other methods above, it is also possible to get away without
changing routers in the case where a small portion of the traffic
coming into a router can be identified as worthy of attribution. In
this case the router could simply route that traffic to a separate
lower speed machine that does the marking. This machine could also
do the tunneling mentioned above.
Classification
- single packet: yes
- existing routers: no
A minor exception is TOS marking.
- no advance notice: yes
- no additional communication: yes
additional communication can be useful for validation
- no special problems: yes
Some people consider it a significant problem to increase the
packet size. We regard this as minor. This is discussed above
in the "editorial opinion" under probabilistic marking. The tunnels
used for communicating between non-adjacent cooperating machines may
be too expensive to allow large amounts of traffic, and this may serve
as a reason to restrict cooperation.
Attacks: An attacker who controls a trusted router can forge any path
up to that router unless some further authentication scheme is used.
Those paths that disagree with the routing algorithm can be recognized
by a simple check as described above. A router that trusts data from
an attacker effectively allows that attacker to act like a compromised
router. Authentication methods could be used, but these add significant
cost in the form of processing time and space in the marked packets.
4.7 Tunneling from cooperating machines to attribution machine
The overlay network described above suggests an additional solution.
Instead of tunneling to the next cooperating router, suppose we tunnel
to the last one. This is particularly easy in certain common cases,
such as the one where we are only interested in attribution of packets
sent to our own network. We simply ask the cooperating routers to
encapsulate traffic addressed to our network in a tunnel directly to a
router in our network. Note that the remote cooperating routers do
not have to do any marking, which means that they can be currently
available commercial routers. The receiving router can tell from the
tunnel on which a packet arrives which cooperating machine forwarded
it. In fact, the remote cooperating router can identify the link on
which it received a packet by using different tunnels to carry packets
that arrive on different links.
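At the receiving end of the tunnels, attribution then reduces to a table
lookup. A minimal sketch, with made-up tunnel identifiers and names:

    TUNNEL_TABLE = {
        # tunnel id -> (remote cooperating router, its input link)
        101: ("cooperating-router-1", "link from customer A"),
        102: ("cooperating-router-1", "link from peer X"),
        201: ("cooperating-router-2", "link from peer Y"),
    }

    def attribute_encapsulated_packet(tunnel_id):
        """Return the (router, link) pair implied by the arrival tunnel, or
        None if the packet arrived outside any known tunnel."""
        return TUNNEL_TABLE.get(tunnel_id)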
The tunneled traffic might carry authentication data but still be
unencrypted so as to allow other machines to attribute the
encapsulated packets. However, at the end of the tunnel, the original
packets emerge without the tunnel encapsulation which is now carrying
the attribution data. If attribution is to be done after that then
something must be done beforehand to save the attribution data. One
possibility is marking in the normal sense. Another is to summarize
the attribution data and store the result for later use.
Note: In either case the receiving machine will probably
have to be more programmable than a commercial router. Even if the
end of the tunnel is the host to which the original packet was
addressed, some additional processing must be done in the operating
system to make use of the attribution data, since it will not be
visible at the application level. How the attribution data is
stored and accessed after it arrives, however, is outside the scope
of this paper.
If no forwarding paths pass through more than two cooperating
machines then this method is the same as the overlay network. We should
therefore consider the difference in the case of a packet that passes
through more than two cooperating machines. In the overlay network, the
first machine sends the packet in a tunnel to the second machine, which
then adds a mark and sends it through another tunnel. Here the first
machine sends the packet in a tunnel directly to the final cooperating
machine. An intermediate cooperating machine receiving a packet already
encapsulated could try to verify that the encapsulation is not forged,
and if satisfied, simply forward the packet unchanged. A more
straightforward solution is to re-encapsulate the packet. The receiving machine
would then have additional work to do, but would find that the packet
had traversed both machines. In effect, we use encapsulation as a form
of marking. It is, however, an inefficient method, especially if
performed at many intermediate points. This method therefore seems
best suited to the case of sparse cooperation, which we expect to be a
common case.
Note: This use of tunnels is similar to that proposed in
DecIdUoiS: Decentralized Source Identification for Network-Based Intrusions.
They seem to have had in mind a somewhat different use from ours.
They propose to build tunnels after an attack is recognized, which would
only make sense for ongoing attacks. Also they seem to imagine a situation
where it is possible to build a tunnel from anywhere.
Classification
- single packet: yes
- existing routers: yes
This assumes that the router supports tunneling and that packets to be
attributed arrive at a low enough rate to be tunneled.
- no advance notice: yes
Except that you might want to change the set of packets to be attributed.
- no additional communication: yes
- no special problems: yes
This may be too expensive to allow large amounts of traffic, and that may
serve as a reason to restrict cooperation.
Attacks:
none
4.8 Preventing source spoofing - Ingress Filtering
RFC2827 suggests
that routers filter packets with "obviously" false source addresses.
Obvious in this case means that a given interface is only supposed to
carry traffic to/from an a-priori known fixed IP range. This is a
special case of "route based" filtering, discussed next.
Note: This idea seems first to have appeared in 1996 in
Internet Holes - Eliminating IP Address Forgery.
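A toy version of such a filter follows. The interface names and prefixes
are hypothetical, and a real deployment expresses this as router
configuration rather than code.

    import ipaddress

    # Each customer-facing interface is only supposed to carry traffic whose
    # source address lies in an a-priori known prefix (rfc2827).
    ALLOWED_SOURCE_PREFIXES = {
        "eth1": ipaddress.ip_network("192.0.2.0/24"),
        "eth2": ipaddress.ip_network("198.51.100.0/24"),
    }

    def ingress_permit(interface, source_address):
        """Return True if the source address is plausible for the interface the
        packet arrived on; interfaces with no configured prefix are unfiltered."""
        prefix = ALLOWED_SOURCE_PREFIXES.get(interface)
        if prefix is None:
            return True
        return ipaddress.ip_address(source_address) in prefix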
Classification
- single packet: yes
- existing routers: yes
- no advance notice: yes
- no additional communication: yes
- no special problems: no
This limits the possible origins of a packet to either the LAN of its
source address or any LAN that is not filtered. When a substantial
number of LANs are not filtered this is a very large set of possible
origins. Another problem is that it may not even be easy to find out
where such filtering is deployed.
Attacks:
none
4.9 Preventing source spoofing - Route Based Filtering
On the effectiveness of route-based packet filtering for distributed
DoS attack prevention in power-law internets describes a
generalization of ingress filtering in which routers filter packets
with source addresses that are inconsistent with routing data. The
generalization involves replacing the word "obvious" above with much
more sophisticated reasoning based on global routing data. We note
that it's not clear yet how to obtain sufficient routing data for the
internet. SAVE:
Source address validity enforcement protocol describes a protocol
by which such data could be collected, but we again have the usual
problem of how well partial compliance (in this case using the
protocol to generate the needed data) solves the problem.
To the extent that route based filtering is practical, it could be
combined with other link identifying methods to allow a router to filter
based not only on the interface at which a packet actually arrives, but
also on upstream link data, in effect simulating a filtering router further upstream.
The first reference (from Purdue) presents many results on how
effective filtering would be if it were done by, e.g., a randomly
chosen half of all routers in the internet, or a set of routers chosen
by a greedy algorithm to cover all links. All of the analysis is done
on the basis of AS's. Of course AS's can be large, so determining the
AS of origin might not be very satisfactory. It's not clear whether
finer grained routing data can be obtained outside your own AS. Park
reports that cooperation from the "right" 20% of AS's (that is, we get
to choose the 20%) localizes every IP address to 5 AS's. Since we don't
generally expect to be able to choose which places will cooperate, this
doesn't seem like a very practical result.
Classification
- single packet: yes
- existing routers: no
In some cases the set of filters will be simple enough to implement with
normal filtering methods.
- no advance notice: yes
- no additional communication: yes
- no special problems: no
The main problem is obtaining data. The next problem will be maintaining
the data in real time under routing updates. There may also be as yet
unknown implementation difficulties in performing the filtering.
And, like ingress filtering, it may be difficult to determine what
filters are actually in place.
Attacks:
Since it is not yet clear what problems will arise in trying to implement
this method, it's also not yet clear what an attacker might do to magnify
those problems. One possibility is causing routing updates and preventing
the filtering routers from learning how these affect what should be filtered.
The routers might then either filter legal packets or, in an effort to
avoid that, forward many illegal packets.
4.10 summary table
Note: + is good, - is bad, * means different answers for different
cases. Techniques that we consider to be closely related are grouped
into sections.
method                              | single packet | existing routers | no advance notice | no later communication | no other problems
------------------------------------+---------------+------------------+-------------------+------------------------+------------------
link testing                        |       -       |        +         |         -         |           *            | -
------------------------------------+---------------+------------------+-------------------+------------------------+------------------
Itrace                              |       -       |      -[1]        |         +         |           +            | +
probabilistic packet marking (PPM)  |       -       |      -[1]        |         +         |           +            | +
------------------------------------+---------------+------------------+-------------------+------------------------+------------------
SPIE                                |       +       |      -[2]        |         +         |           -            | - (memory vs time)
monitor                             |       +       |        +         |         -         |           -            | +
------------------------------------+---------------+------------------+-------------------+------------------------+------------------
deterministic packet marking        |       +       |      -[1]        |         +         |           +            | + (for tunnels see below)
tunneling                           |       +       | + (high speed?)  |         +         |           +            | - (expense limits speed, cooperation)
------------------------------------+---------------+------------------+-------------------+------------------------+------------------
ingress filtering                   |       +       |        +         |         +         |           +            | - (poor effectiveness to cooperation ratio)
route based filtering               |       +       |      -[3]        |         +         |           +            | - (obtaining routing data)
Table 1: Summary of methods and characteristics
Notes:
[1] This may be possible using (possibly nonstandard) router features to
route a subset of packets to another cooperating machine.
[2] Current work attempts to monitor links with
a separate machine. Very high speed links would still require new special
purpose hardware.
[3] Current routers are able to filter, and in some cases there will be
few enough filtering rules to use this mechanism without unacceptable
cost. However, we expect that in other cases this will not be the case.
In the general case filtering is similar in complexity to routing.
5 Combining the two classes into a solution
We remarked earlier that neither filtering nor link identification methods
directly solves the level 1 attribution problem as we have defined it.
Note: That's admittedly a slight exaggeration. Route based
filtering would solve the problem by providing perfect enforcement if
it were deployed universally, i.e., between every machine and the
network. This may even be practical in some networks.
However, the filtering approach does contain the information needed,
and in fact that information can be combined with the information
provided by the link identification methods to greatly improve the
solution. In order to describe this solution we need to describe
route based filtering at a high level.
5.1 Route Based Filtering overview
The routing in a network may be viewed as a set of permissible paths.
Route based filtering tests whether the link from which the router
receives a packet is on any of the permissible paths from the source
address of the packet to its destination address. If not, it must
have a falsified source address, so the router drops it.
Note: Routers may also use tests that are easier to perform
but allow more false source addresses.
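For illustration, the test can be sketched as follows, assuming the
routing data is available as an explicit set of permissible paths
(source prefix, destination prefix, and the links the path traverses).
A real router would compile this into per-interface filter state rather
than scan paths per packet.

    import ipaddress

    def permitted(incoming_link, src_addr, dst_addr, permissible_paths):
        """permissible_paths: iterable of (source_net, dest_net, links) where
        the nets are ipaddress networks and links is the set of links the
        path traverses. The packet passes the filter only if some permissible
        path from its claimed source to its destination uses the link on
        which it actually arrived."""
        src = ipaddress.ip_address(src_addr)
        dst = ipaddress.ip_address(dst_addr)
        for source_net, dest_net, links in permissible_paths:
            if src in source_net and dst in dest_net and incoming_link in links:
                return True
        return False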
5.2 Relation of Route Based Filtering to the solution
Filtering does not actually solve the level 1 attribution problem, in
the sense of giving the receiver of a packet the set of possible
origins for the packet. However, the information used by a router in
order to do that filtering is the same information that is needed for
level 1 attribution. The router needs to test whether a source
address is in the set of possible origins for packets received from
the link where it arrived, while level 1 attribution needs to generate
that same set of origin addresses. One could imagine changing a
filtering router to report the set of possible origins for each packet
it forwards. This would provide a direct level 1 attribution
solution.
Above we point out that filtering and attribution use the same data in
different ways. This would almost certainly lead to different data
representations. Other differences worth mentioning are that
filtering needs immediate access but only to current data, whereas
attribution need not be done very quickly (certainly not in the time
it takes to forward a packet), and one might wish to do it for packets
that arrived long ago. This suggests that while filtering would need
to track routing updates in real time, attribution might need access
to historical routing data. Also, certain approximations might be made
in filtering due to time or space constraints, whereas for attribution
it might be worth spending more effort in order to get a better
answer.
5.3 Extending the solution to link identifying methods
In order to use link identifying methods to solve the level 1
attribution problem, we can simulate the imaginary reporting router
described above, placed at the receiving side of the identified
link. More generally, given a
set of "positive" links, that a given packet is known to have
traversed (those identified by the link identifying methods), and a
set of "negative" links, that it is known not to have traversed (the
set of links that our link identifying methods did not report, but
would have reported if the packet had traversed them), we can conclude
that the possible origins are those from which at least one path to
the destination traverses all of the observed links and none of the
unobserved links.
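That rule can be written down directly. The sketch below is illustrative
only; it assumes we can enumerate, for each candidate source, the
permissible paths to the destination as sets of links, and obtaining
that routing data is itself the hard part, as discussed later.

    def possible_origins(candidate_sources, paths_to_destination,
                         positive_links, negative_links):
        """positive_links and negative_links are sets of links.
        paths_to_destination(source) is assumed to yield each permissible
        path from that source to the destination as a set of links. A source
        remains possible if at least one of its paths contains every positive
        link and none of the negative links."""
        origins = set()
        for source in candidate_sources:
            for path in paths_to_destination(source):
                if positive_links <= path and not (negative_links & path):
                    origins.add(source)
                    break
        return origins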
It is worth mentioning a few minor complications in this scheme.
One is that different methods are susceptible to different errors.
In some cases we might believe that a packet has traversed a link
when it has not, or that it has not when it really has. In these
cases, of course, the certainty of the conclusions must be adjusted
accordingly. In particular, some methods (such as probabilistic
marking) provide strong evidence that a link was traversed but very
weak evidence that it was not. Others (such as SPIE) provide strong
evidence that a link was not traversed and slightly weaker evidence
that it was. Also, some methods provide evidence for only some
packets, and we have to know which packets those are. For instance,
remote sensors may be looking for only certain types of packets.
The most interesting case is route based filtering itself, which
tells us that certain packets could not have traversed certain links.
Note: Actually it tells us that those that did traverse
those links were dropped immediately after they did, so the fact
that a packet arrived at some later place shows it did not traverse
that link.
However, in order to know which packets could not have traversed which
links we need the same data that the router uses to filter.
Despite the complications, we now see that there is a very
straightforward way to combine the evidence from all of the methods we have
described, both link identifying and filtering methods. There may
still be other ways of combining lower level conclusions of different
methods, i.e., data used to justify the conclusion that a link was or
was not traversed. We have not attempted to identify these lower
level conclusions or how they might be combined.
5.4 Remaining problems in Route Based filtering and Level 1 attribution
As described above, the ability to translate what can be observed from
cooperating machines into the desired set of IP addresses relies on
routing information. While the initial focus of this work was the
cooperation of the machines needed for identifying links, we now see
that another kind of cooperation is needed in order to obtain the
necessary routing data! Routing protocols are designed to tell each
router what it needs to know in order to forward packets. They do not
attempt to compute what sources could have sent packets that would
traverse a given link with a given destination address.
In a private network it may be possible to obtain complete routing
data. In the current internet this is apparently not possible at
present. There are sources of partial routing data in the internet.
What data can be obtained, how it can be combined and how well the
result can be used to determine the set of possible origins of packets
traversing a given link are all subjects of active research.
(*** citations? Kihong Park, Rajesh Talpade, me, who else?)
5.5 Quality of attribution as a function of cooperation
The other general question we hoped to explore was how the set of
cooperating machines was related to the quality of the attribution
result for various methods. Here we refer to the machines that
cooperate in identifying links or filtering packets, leaving aside the
new issue of what cooperation might be needed to obtain routing data.
It would be nice to be able to characterize the cooperation and the
quality of the solution in simple terms, such as "with n percent of
the routers cooperating we can expect to limit the set of possible
origins to size m". Unfortunately this seems to be asking too much.
The size of the sets of possible origins are generally not uniform,
and the results depend strongly on such factors as routing and which
routers are cooperating.
We instead present some general facts and examples that we hope will
serve to help the reader understand how the placement of
cooperating machines affects the attribution quality. We mention to
begin with that the mechanisms for identifying links and filtering
require the cooperation of remote machines, either the routers that
actually do the forwarding or other machines in position to see the
traffic in transit. In all of these cases the cooperating machine
must be connected to the link to be filtered or identified. The rest
of our explanations concentrate on links that are known to be positive
or negative. It is to be understood that this maps in the obvious way
to the locations of cooperating machines.
Actual routing algorithms are much more constrained than the notion of
a set of paths would suggest. Each router makes a local decision
based on the content of the packet (generally only the destination
address). That is, the router normally doesn't even know where the
packet has been before so the next hop it chooses cannot depend on
that. Given two links, l1 and l2, with the property that some path,
p, traverses first l1 and later l2 on the way to destination d, we
will say that l1 is "upstream" of l2 (or l2 is "downstream" of l1).
Note: There are actually two different conditions of
interest, which turn out to be the same in the common case where there
is only one next hop from a given router to a given destination. For
the rest of this paper we assume that is the case, but for
completeness we mention here that for positive links the interesting
property is the one above, meaning that the set of possible sources
for a packet traversing the link is the set of machines at the sending
side of the upstream links. For negative links, the corresponding
property is that ALL paths through l1 on the way to d later go through
l2. This means that if the packet did NOT traverse l2 then it could
not have traversed l1, or more to the point, it could not have
originated at any point upstream of l2 in this (negative) sense.
The fact that routers choose next hops independent of previous hops
means that the set of possible origins for the upstream link is a
subset of the set of possible origins for the downstream link. In
general, positive results from upstream links provide more information
than from downstream links, while negative results from downstream
links provide more information than negative results from upstream
links. Since the positive links must all be on a common path, all of
the positive information is contained in the most upstream positive
link. On the other hand, all of the negative information is contained
in the set of negative links that are farthest downstream (meaning
that no other negative link is downstream of them).
Note: Some link identifying methods identify multiple
links, or even complete paths. We see here that only the most
upstream link is important. Other links may be useful as consistency
checks. There are two main reasons that other links are even
collected. One is that the methods sometimes can't tell which links
are upstream of which. The other is that the methods were invented
for goals other than attribution in our sense. Mostly they are meant
as components in defense against flooding attacks, and attributing the
packets to ip addresses is not necessary for this task.
 Figure 1
Figure 1 attempts to summarize the possible situations. The boxes
represent routers and the lines connecting them represent links. In
order to remove any doubts about routing the sample graph shows no
loops. D is the destination of the packet in question. An arc is
meant to represent the subset of the IP addresses or machines upstream
of the link crossed by the arc. Plus signs represent positive links
(known to have been traversed by the packet), while minus signs
represent negative links (known not to have been traversed).
Cooperating routers either justify a plus sign on one link other than
the one leading toward D (the one where the packet arrived) or, by
failing to do that, justify a minus sign on the link from the router
toward D. In this case router A has not seen the packet so we put a
minus sign on link D-A. Of course, this would also justify minus
signs on all the other links connected to A, but we can omit these
since they add no more information. Router C, on the other hand,
reports that the packet arrived on link C-E which justifies a plus
sign on that link. This similarly implies a plus sign on link D-C but
again that adds no information. It is also possible for a machine
other than a router to monitor a set of links, which might justify
such marks as the plus sign on E-F and the minus sign on G-H.
As mentioned earlier, all of the plus signs must be on a common path,
specifically on the path from the true source to D. If there are any
positive links we can ignore all but the one farthest upstream, in
this case link E-F. Also if there are any positive links then we can
ignore any negative links other than those upstream of the farthest
upstream positive link. In this case the minus sign on link D-A is
not upstream of E-F so it adds no new information. We also mentioned
earlier that we can ignore any minus sign that is upstream of another
minus sign. In this case, if there were a minus sign on link F-G then
we could ignore the minus sign on link G-H. The set of possible
origins is then the set of machines upstream of the farthest upstream
positive link minus (that is, set difference) the set of machines
upstream of the remaining negative links. In this example there is
only one negative link upstream of the most upstream positive link, so
the result is the set of machines upstream of E-F minus the set of
those upstream of G-H.
If there are no positive links then we must use the entire address
space as the positive set and subtract from that the addresses
upstream of the negative links. That case can be visualized by
erasing the two plus signs. In that case the answer is the entire
address space minus the addresses upstream of D-A and G-H. This
illustrates a problem with negative links. They typically provide
much less information than positive links. In order to provide the
same amount of information as one positive link it would take negative
links that cover the entire address space outside that positive link.
In particular, negative links far upstream provide very little
information. Ingress filtering is a case in point. An ingress filter
amounts to a minus sign at the outside of the diagram (very far
upstream). The common complaint is that ingress filtering does little
good unless it's deployed universally. For instance, if half the
networks in the internet did ingress filtering, this would allow you
to attribute an arbitrary packet to either the network containing its
source address or the half of the networks that don't filter. That
seems a very weak result considering that half of the networks in the
internet are cooperating.
Note: Despite the complaints, ingress filtering is
currently the most widely deployed technology relevant to level 1
attribution. Even at low levels of deployment it's worth using
where available. The greatest cost in doing so may be simply
finding out where it is deployed. If others start to deploy more
general filtering technology, there will also be a cost in
determining what exactly is being filtered.
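To make the computation concrete: under the single-next-hop assumption
the answer is a set difference, sketched below. The helper
upstream_of(link) is assumed to return, from routing data, the set of
addresses upstream of that link with respect to the destination; with no
positive links the whole address space is the starting point.

    def origin_set(positive_links, negative_links, upstream_of, all_addresses):
        """Possible origins under single-next-hop routing, as described above."""
        if positive_links:
            # All positive links lie on one path, so the farthest upstream one
            # (the one with the smallest upstream set) carries all the
            # positive information.
            most_upstream = min(positive_links,
                                key=lambda link: len(upstream_of(link)))
            candidates = set(upstream_of(most_upstream))
        else:
            candidates = set(all_addresses)
        # Each negative link removes everything upstream of it; negative links
        # outside the candidate set simply remove nothing.
        for link in negative_links:
            candidates -= set(upstream_of(link))
        return candidates

    # In figure 1: origin_set({"C-E", "E-F"}, {"D-A", "G-H"}, upstream_of, all_addrs)
    # yields the machines upstream of E-F minus those upstream of G-H.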
We now present a variety of what we hope are instructive examples.
-
As an example where negative links are very useful, consider a
perimeter of sensors around the destination, all reporting negative
results. This shows that the packet originated from inside the
perimeter.
-
Even a single negative link can be very useful by eliminating a large
part of possible address space. In figure 1 this would be the case
if most of the machines upstream of link E-F were also upstream of
link G-H. Alternatively, in figure 1, if most of the address space
were upstream of link D-A then that single negative link would be
very useful even in the absence of any other data.
-
A cooperating firewall allows you to determine whether a packet
originated inside or outside the firewall. The set of possible
sources is generally much larger outside, but you might be much more
interested in knowing which packets originated inside.
-
We next consider a filtering firewall which does not identify packets
that it forwards. It simply drops inbound packets with inside source
addresses and outbound packets with outside addresses. The fact that
negative results are most useful close to the destination might
suggest that filters, which provide only negative results, would be
particularly useful close to the destination. The problem is that
filters provide negative information only for those addresses that
would be filtered. A packet that arrives on the inside with an
outside source address would not have been filtered, so we gain no
information about its origin (it could have come from either the
inside or outside). This does allow us to attribute to the inside any
packets with inside source addresses observed on the inside. For
purposes of attribution filters would obviously be more useful if they
also reported links for packets they forwarded.
-
Suppose you want to be able to determine whether an arbitrary packet
that arrives in your network originated in Brazil.
Note: We restrict this to packets arriving at some
particular place because in order to even see every packet originating
in Brazil (without even trying to determine its origin) it would be
necessary to have a sensor on every link between machines in Brazil.
The first task is to determine what IP addresses are in Brazil. We do
not address the problem of how to do that. In fact that problem may
already be unsolvable. The next task is to determine the routing from
those addresses to you. So far we don't have a good solution to that
problem either. Of course, routing is subject to change. It is
clearly useful for both the party interested in attribution and his
adversary to control routing. The next step is to find the links on
these routes between machines inside Brazil and machines outside. We
need to monitor these links. (Just in case the previous problems
didn't seem difficult enough!). Note that there may be some links
between inside Brazil and outside that we don't need to monitor.
These are the links that only carry traffic that could never get to your
network. We conclude that a packet originated in Brazil if the
farthest upstream link of those above originates in Brazil.
Referring again to figure 1, we can imagine that links E-F and
G-H are the only ones connecting the inside of Brazil to the outside.
G-H carries traffic from Argentina which reaches you through Brazil.
E-F carries traffic from Brazil to Florida. In this case a packet
originated in Brazil if it was first marked on link E-F (which implies
that it was not marked on link G-H).
5.6 Summary of the attribution process
The party that wants to do attribution initially arranges to gain the
cooperation of some set of machines in various places around the
network. Generally he has some control, but not complete control over
where such cooperating machines can be placed. Where he would like to
place cooperative machines depends on what sort of attribution
problems he wants to solve. The first question is what class of
packets he would like to attribute. The answer is typically those
with a particular set of destinations, but sometimes those with a
particular set of origins, and sometimes only a subset of such
packets. The second question is what sort of answer he needs,
typically expressed as a set of equivalence classes, e.g., he wants to
be able to determine whether the packet originated in his own internal
network or elsewhere and within his own network he wants to be able to
attribute each packet to a single machine. Finally he needs some idea
of the network routing to be expected. This determines whether a set
of cooperating locations is adequate to satisfy the requirements, and
so drives the placement.
When a packet arrives the cooperating machines report that the packet
traversed certain links and did not traverse certain others. This,
along with detailed routing information (obtained we do not yet know
how) implies that it must have originated from one of the places in a
set that is computed.
6 Summary
The first contribution of this paper is the definition of the level 1
attribution problem in terms of identifying sets of possible IP
addresses from partial link traversal data. In the past traceback has
been viewed as identifying the entire forwarding path, which requires
levels of cooperation that can be expected only in theory.
We also point out the relation between this problem and route based
filtering. We further point out that many different methods that have
been proposed all produce similar data, namely links traversed (or not)
by the packets to be attributed. We show how partial results for all
such methods can be combined to produce the best result and characterize
the quality of that result. Finally, we survey and characterize a wide
variety of techniques, some of which appear not to have been discussed
in the literature, at least in this context.