I’m mentoring a highly-motivated high school student through a senior project, as he’s interested in network security and wants to do a penetration test of his high school. He’s got permission and I’ve got the spare cycles, so I agreed to mentor him.
What will hopefully follow is a series of blog posts of the compressed education I’ll give. I’m trying to constrain this to 40-ish hours, and also trying to pass enough education so he knows to a reasonable extent what he’s doing by the end, not just showing off a series of tools. With such a short time limitation, this will obviously be a whirlwind of topics, so advanced readers will have to forgive me glossing over some rather important details.
Module One – Overview of Practical Networking
I’m a big believer in learning networking from a practical point of view, as it helps with many aspects of troubleshooting. Troubleshooting, as I use the word, is having a goal and working through or bypassing the obstacles encountered, often using many different approaches to the problem. It’s okay to have eleven failed solutions, as long as you have twelve different ways to solve the problem. Penetration testing is just troubleshooting your way to Domain Admin, so it relates well.
Though in a classroom environment I’d also teach the OSI model, for the defined purpose of this class the TCP/IP model fits better, so we’ll work with that. The one-sentence descriptions of each layer below are my own attempt to sum up the intent of each layer.
I’d be foolish not to link to Wikipedia’s page on the TCP/IP model, because it is very well fleshed out. Specifically, the section on encapsulation, which I’ll quote below, explains the concept very well:
The Internet protocol suite uses encapsulation to provide abstraction of protocols and services. Encapsulation is usually aligned with the division of the protocol suite into layers of general functionality. In general, an application (the highest level of the model) uses a set of protocols to send its data down the layers, being further encapsulated at each level.
The layers of the protocol suite near the top are logically closer to the user application, while those near the bottom are logically closer to the physical transmission of the data. Viewing layers as providing or consuming a service is a method of abstraction to isolate upper layer protocols from the details of transmitting bits over, for example, Ethernet and collision detection, while the lower layers avoid having to know the details of each and every application and its protocol.
Layer 1 – Network Access Layer – “Physical network interface to network interface communication, within the same subnet”
The network access layer is scoped to just allowing hosts (or more precisely, their network interfaces) on the same network to communicate. This also includes the physical components (such as cabling and interfaces) and protocols for sending and receiving the physical signals. As an example protocol, an Ethernet address is assigned to the network card by the manufacturer and is supposed to be unique.
The link light on an Ethernet card, for example, is solely indicating whether the involved interface believes it is connected to another device speaking the same protocol. A laptop connected via Ethernet to a switch with no other hosts attached, for example, would still have an active link light, because both the switch and network interface speak Ethernet.
Moving packets from one interface to another at the link layer is called switching. A switch is a network device that connects hosts within the same subnet (or “switches packets”), and therefore operates at layer one.
Layer 2 – Internet Layer – “Logical host to host communication, across separate subnets (or within a subnet)”
As a building block above layer one, the Internet layer allows hosts in different subnets to communicate. Note that while network access layer addresses are hardware addresses assigned by the manufacturer, Internet layer addresses are assigned by the user to particular hosts. As such, a machine can have different Internet layer addresses while in different subnets (such as a laptop with a different IP address at home and at a coffee shop), whereas a network access layer device will always have the same address (assuming no MAC spoofing shenanigans).
Moving packets from one subnet to another is called routing. A router is a network device that connects multiple subnets (or “routes packets”), and therefore operates at layer two.
Layer 3 – Transport Layer – “Service to service communication”
The transport layer builds upon the Internet layer (starting to get the theme, here?) by allowing multiple services on a single host. Look at the IPv4 header, for example. There’s a field for destination address, but how do you speak to a particular service on a host? Imagine a server that runs both HTTP and FTP. How do I, as a client, tell the server which service I want to talk to? Using solely IPv4, I can only send a packet to the host as a whole — there isn’t a field for which service I mean to talk to. The transport layer, at a minimum, provides this service through the concept of ports, of which there are 65,535 (or 2^32 – 1, since port zero is reserved).
Particular services are by convention found on particular ports, and vice versa (see IANA Assigned Port Numbers). If you see port 80 is open, for example, you’d expect a web server (HTTP) to be running on that port. However, there is no “Internet police” regulating this, so people can and do run services on non-standard ports, for a multitude of reasons. High ports are commonly used for a client to connect from (i.e., as an ephemeral port), so it’s very common to see a client connect from port 49,273 (for example) to port 80, in order to connect to a web server.
As for the two major protocols used at this layer, TCP (Transport Control Protocol) and UDP (User Datagram Protocol), there are some fundamental differences.
User Datagram Protocol (UDP)
UDP is connectionless, meaning messages are sent directly, without any of the overhead that TCP has and as minimal as possible, barely providing anything beyond source and destination ports. For IPv4, even the checksum field is optional. In general, the protocols that emphasize low latency (Voice over IP, real-time video) or extremely high throughput with low overhead (DNS primarily) will tend to prefer UDP. However, since reliability isn’t guaranteed, protocols choosing UDP must be able to work despite occasional loss of packets. Voice over IP, for example, will just skip a few phonemes, where a missed DNS request will just result in sending the request again after a short timeout.
Transmission Control Protocol (TCP)
TCP is more complex, and provides reliable delivery of data in proper order. Even if some packets are dropped, if the overall connection *can* pass packets successfully, then eventually the two application-layer protocols above TCP will get their data. On a lossless network (which almost all local networks should be), the overhead is very minimal and high-speed communications are in no way hampered by TCP. As shown in the TCP header, all these features mean the protocol header itself has many more fields to track these options. We’ll skip going over the TCP fields at this level, for now.
Layer 4 – Application Layer – “Application to application communication, often on behalf of a user”
As the top-most layer in the networking stack, the application layer is the one that carries traffic from a particular application. As such, there’s a ton of variation in this layer, and many, many protocols. I’ll touch on this concept in a further blog post, but application layer packets are, in blunt terms, the entire point of the communication. Though there’s a lot of scaffolding in lower layers to get an HTTP client connected to an HTTP server, the point of all of those connections it the HTTP (which is application layer) communication. The HTTP communication (in this example) is what the user requested, which is an exceedingly common theme.