IP

Network layer

Path-selection algorithms is implemented in

  • routing protocols: OSPF (inside 1 ISP), BGP (over multiple ISPs)
  • SDN controller: Used in DC (data center)

Network layer uses IP protocol, but ICMP protocol is also used.
ICMP is a network layer protocol that reports error between routers.

IP Datagram

IP Datagram

Each header is 32 bits (4 bytes).
Theoretical maximum length of datagram is 64K bytes, but typically IP datagram's maximum length is 1500 bytes or less.

IP header size is 20 bytes in IPv4. (40 bytes in IPv6)
Theorically IP header can include more options, but it is not used.

Payload is typically a TCP or UDP segment.

First header

  • First four bits is IP protocol version numer (4, 6)
    The IPv5 protocol used to exist as an experiment, but it is no longer in use.
  • Header length (If header length is 5, actual header length is 5 * 4bytes = 20 bytes)
  • Type of service (e.g. ECN)
  • Total datagram length as bytes

Second header

Second header is used for fragmentation.

  • 16-bit packet identifier
  • Flags (Is fragmented?)
  • Fragment offset (divided by 8)

IP fragmentation

IP packet size can't exceed MTU (Maximum Transmission Unit). MTU is usually 1500 bytes.
If IP packet is larger than MTU, router will split packet into multiple MTU size packets. (fragmentation)

If offset is 0, this packet is the first fragment.
If flag is OFF, this packet is the last fragment.

e.g. Let's assume we need to send 4000 byte datagram, which has 20 bytes IP header.

Initial header looks like this: length=4000, ID=x, fragflag=0, offset=0.
This datagram is split into 3 packets:

  • length=1500, ID=x, fragflag=1, offset=0
  • length=1500, ID=x, fragflag=1, offset=185 (1480/8)
  • length=1040, ID=x, fragflag=0, offset=370

IP reassembly

We can reassembly IP packet with 16-bit identifier.

However, if packet is lost serious problem happens.
Since this is done in network layer, (at the router) we don't have retransmission and timeout.

Instead, we prove path MTU (smallest MTU in the path) and don't use fragmentation at all. (i.e. limit IP packet size to less than or equal to path MTU)

There is a path MTU discovery protocol, but it's not widely used (and can be abused).
Instead, we just think everybody is using 1500 byte MTU.

Other headers

  • TTL (Time to live): Remaining max hops (decremented at each router)
  • Upper layer protocol (TCP/UDP)
  • Header checksum (only header, doesn't check payload)
  • 32-bit source IP address
  • 32-bit destionation IP address

IP addressing

IP address is a 32-bit identifier associated with each host or router interface.
It is usually represented as a dotted-decimal IP address notation. e.g. 223.1.1.1

Interface is a connection between host/router and physical link.
Router typically have multiple interfaces, and host typically has one or two interfaces. Typically host use wired Ethernet and wireless 802.11 (Wi-Fi).

Subnet

If we detach each interface from its host or router, we'll have islands of isolated networks, called subnet.
Subnet is a device interfaces that can physically reach each other without passing through an intervening router.

CIDR (Classless InterDomain Routing)

Subnet is represented as a CIDR format a.b.c.d/x. e.g. 223.1.3.0/24
x is the number of bits in subnet part of IP address. (i.e. high-order x bits are same)

  • Subnet part: Devices in same subnet have common high order bits.
  • Host part: Remaining low order bits are different for each devices.

Initially, IP address was classful, e.g. Class A,B,C,D,E IP addresses.

  • Class A: Starts with 0 (0~127), subnet part is 8 bits.
  • Class B: Starts with 10 (128~191), subnet part is 16 bits.
  • Class C: Starts with 110 (192~223), subnet part is 24 bits.
  • Class D: Starts with 1110 (224~239), subnet is not allowed.
  • Class E: Experimental, subnet is not allowed.

But today we use classless IP address, and we can choose arbitrary length of subnet part.
Why? Because IPv4 address space is not enough!
We want a better utilization of IP addresses.

DHCP (Dynamic Host Configuration Protocol)

If host joins the network, it dynamically get IP address from server.
Widely used in wireless networks (Wi-Fi), but wired networks can also use DHCP.

DHCP can return more than just allocated IP address, e.g. name and IP address of DNS server, subnet mask, address of first-hop router for client.

Problem: What if DHCP server is malicious and gives already assigned IP address?
Modern DHCP server have authentication and its own defense mechanism (because host can be malicious too).

DHCP messages

  1. Host broadcasts DHCP discover message. (optional)
  2. DHCP server responds with DHCP offer message. (optional)
  3. Host requests IP address with DHCP request message.
  4. DHCP server sends address with DHCP ack message.

Host use 0.0.0.0 as a source IP address until it receives DHCP ack message. (i.e. until host gets IP address)
Host/DHCP server use 255.255.255.255 as a destination IP address because host doesn't have a IP address.
If destination is 255.255.255.255, this message will be broadcast to all machines in the network.

Since we cannot distinguish hosts with IP address, we use transaction ID.

Network's IP address

Network gets allocated portion of its provider ISP's address space.

e.g. ISP has 200.23.16.0/20 IP space.
This ISP can allocate its address space to 8 organization, such as:
200.23.16.0/23, 200.23.18.0/23, ..., 200.23.30.0/23

ISP receives all IP packets destined for 200.23.16.0/20.
Then ISP can send IP packets to organizations depending on IP address, e.g. 200.23.16.0/23 goes to organization 1.

Recall) This is same as longest prefix match in routers!

ISP's IP address

ICANN (Internet Corporation for Assigned Names and Numbers) allocates IP addresses through 5 regional registries (RRs).
ICANN also manages DNS root zone, delegates individual TLD, etc.

ICANN allocated last chunk of IPv4 addresses to RRs in 2011.
IPv6 has 128-bit address space, which is enough for assigning different IP addresses to each device.

However, IPv6 is not widely used because IPv4 was enough and IPv6 didn't make internet faster...
In fact, router can have smaller forwarding table if there are fewer networks.

NAT (Network address translation)

All devices in local network share just one IPv4 address for the internet!
(Sharing multiple IP address is possible, but typically only 1 IP address is shared.)

Advantages:

  • Only one IP address is needed from provider ISP
  • Can change address of host in local network without notifying outside world
  • Can change ISP without changing addresses of devices in local network
  • Security: Devices inside local network is not directly addressable & visible by outside world

NAT router

  • For each outgoing datagram, NAT router replaces source (local IP address + port number) to (NAT IP address + new port number).
  • NAT router store translation pairs in NAT translation table.
  • For each incoming datagram, NAT router replaces destination (NAT IP address + port number) to (local IP address + port number) using NAT translation table.

Criticism of NAT

  • Router should only process up to network layer, but NAT process up to transport layer (replaces port number)
  • Address shortage is supposed to be solved by IPv6
  • Violates end-to-end argument (network-layer device should only transport packets)
  • Client can't connect to server behind NAT (other method such as port forwarding is required)

But NAT is extensively used in home networks, institutional networks, and 4G/5G cellular networks.

IPv6

Motivation: 32-bit IPv4 address space is completely allocated

Other motivations: More options - uses 40 byte fixed length header (c.f. IPv4 uses 20 byte header)
e.g. enable different network-layer tratment of flows, speed processing/forwarding

How does network operate with mixed IPv4 and IPv6 routers?
We use tunneling: IPv6 datagram is carried as payload in IPv4 datagram among IPv4 routers.
Routers determine IPv4 subpass, and use tunneling inside IPv4 pass. (We use IPv6 datagram directly outside IPv4 pass.)

IPv6 datagram format

  • Priority: Identify priority among datagrams in flow
  • Flow label: Identify datagrams in same flow (but concept of flow is not well defined...)
  • Payload length
  • Hop limit: Same as TTL
  • Source/destination address (128-bit)

Unlike IPv4, IPv6 doesn't have checksum, fragmentation/reassembly, options.
IPv6 options are available as upper-layer, next-header protocol at router.

IPv6 adoption

Google claims that about 48% of clients use IPv6. However 52% of clients still uses IPv4...
Korea tends to use IPv4, and Japan tends to use IPv6. China have to use IPv6.

IPv6 came out 25 years ago, why is it taking so long?
There are hundreds of thousands of ISPs... They don't want to move to IPv6 because it doesn't make them money.

SDN (Software-Defined Networking)

Review: each router contains a forwarding table that forwards packets based on destination IP address.

SDN is software-defined, so we can use other fields and other operations.
Also, we can use algorithms other than shortest path, e.g. use backup links, prioritize some packets.

Generalized forwarding

We can abstract generalized forwarding into "Match + Action" abstraction.

  • Match: Any pattern values in any packet header fields are matched. (e.g. MAC address, port number, etc.)
  • Actions: In addition to forwarding, you can also drop, copy, modify, and log packets. (Traditional router should only transport packets.)

Generalized forwarding examples

  • Router matches longest destination IP prefix, then forward it out a link.
  • Firewall matches IP addresses and TCP/UDP port numbers, then permit or deny it.
  • NAT matches IP address and port, then rewrite IP address and port.

Middleboxes

Middlebox is an any intermediary box performing functions apart from normal, standard functions of an IP router on the data path between a source host and destination host.

e.g. NAT, Firewalls, Web cache

Initially middleboxes started with blackbox hardware solutions, but it moved toward whitebox hardware implementing open API.
It can be implemented with programmable local actions via match + action, and it is moving towards innovation/differentiation in software.

IP hourglass

Criticism of IP?

There are many protocols in physical, link, transport, and application layers.
e.g. HTTP/SMTP/QUIC/DASH, TCP/UDP, Ethernet/Wi-Fi/Bluetooth, copper/radio/fiber

But there is only one network layer protocol: IP.
Why? IP is very simple and reliability is not needed.
IP must be implemented by every internet-connected devices.

Middleboxes are trying to solve this issue?
There is only one network layer protocol, but there are many middleboxes operating inside network layer.