[Networking] The OSI Model
In an earlier attempt (two, actually) I explained the request-to-response journey from a user clicking a URL to seeing a web page displayed on the screen. Those two articles focused on the web server: how the request is handled inside, and how web servers interact with others. But that was just part of the story. How exactly are the data handled? How are they generated, encoded, transported, decoded, and presented? This article, therefore, is another attempt to walk through the clicking-a-URL-to-seeing-a-result process, this time with the focus on data processing. To do that, we will use the Open Systems Interconnection (OSI) model to put ideas and concepts that would otherwise get messy (as we will soon see) into a structured and understandable form. Hopefully.
OSI: The Open Systems Interconnection Model
The OSI model is a conceptual model that characterizes and standardizes the communication functions of a telecommunication or computing system without regard to its underlying internal structure and technology. –Wikipedia
Basically, the OSI model constructs a “language”, or convention, for networks to transfer data. It describes how computers should behave so that the data we send and receive arrive properly and correctly. Some features and characteristics of the OSI model:
- There are seven layers in this model. From bottom (where the actual data transfer occurs) to top (closest to the end user) are Physical Layer → Data Link Layer → Network Layer → Transport Layer → Session Layer → Presentation Layer → Application Layer.
- Each layer has its own concerns and functions in this data transfer procedure.
- Each time a data transfer occurs, the data pass through the seven layers in order. On the source computer, the data flow from top to bottom, with each layer wrapping what it receives from the layer above (the encoding, or encapsulation, process). Once the data have been sent across networks and reach the receiving computer, they travel up from the bottom of the OSI model, with each layer unwrapping its own part (the decoding process), and are displayed to the end user.
- If we draw a mental picture, the data flow would be like a “U”, where the starting point (upper left) is the application layer of the source computer, and the end (upper right) is the corresponding application layer of the receiving computer.
- Between computers, each layer logically communicates only with its counterpart. That is, if computers A and B are sending data to each other, Layer 1 in computer A would only be talking to Layer 1 in computer B, Layer 2 in computer A to Layer 2 in computer B, and so on.
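To make the “U” shape a bit more concrete, here is a toy sketch in Python. It is not how any real network stack is written: the bracketed “headers” are made up for illustration, and the functions `send` and `receive` are hypothetical names. It only shows that each layer adds its own wrapper on the way down and reads only its counterpart's wrapper on the way up.

```python
# Toy model of the "U"-shaped flow: every layer wraps the data with its own
# (made-up) header on the way down and strips only its own header on the way up.
LAYERS = ["application", "presentation", "session", "transport",
          "network", "data link", "physical"]


def send(message: str) -> str:
    """Source computer: walk down the stack, top to bottom."""
    data = message
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data


def receive(data: str) -> str:
    """Receiving computer: walk up the stack, bottom to top."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        # Each layer only understands the header written by its counterpart.
        if not data.startswith(header):
            raise ValueError(f"{layer} layer cannot read this data")
        data = data[len(header):]
    return data


wire = send("GET /index.html")
print(wire)           # outermost wrapper belongs to the physical layer
print(receive(wire))  # the original message comes back out at the top
```

Because the last layer to wrap the data on the way down is the first to unwrap it on the way up, the application on the receiving side sees exactly what the application on the sending side handed over.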
Why do we need the OSI model?
Before we go on, let us talk about why we need the OSI model. First of all, it sorts the plethora of concepts and protocols in the networking world into categories and gives them a clear structure, so we can discuss each concern separately and design or debug networks more methodically and efficiently. Another important advantage of the OSI model is encapsulation. Layers in the model do not interfere with each other. They only pass data along, and keep their implementation details to themselves. The benefit is that when something goes wrong, it is easier to pin down the source of the error.
With that in mind, to the OSI model we go.
Layer Seven: Application Layer
Network Process to Application
As the top layer of the OSI model, the application layer is the layer that we (users) interact with directly. Application-layer functions typically include identifying communication partners (where to transmit the data), determining resource availability (is the destination available), and synchronizing communication (send the data).
In our URL-clicking case, the application is the browser. The browser selects which web server to send data to, contacts the server, and sends the request. When a response is returned, the browser displays the desired content on the web page.
Layer Six: Presentation Layer
Data Representation and Encryption
Sometimes called the syntax layer, the presentation layer maps different syntaxes to a unified data representation. It translates between application and network formats, and transforms data into the format that the application accepts. Other functions of this layer include data encryption and data compression.
Usually a web page contains different formats of data: HTML files, JavaScript files, image files, etc. For these to be displayed in the browser, both sides have to agree on how the bytes are represented: for example the character encoding (such as UTF-8), any compression (such as gzip), and any encryption (such as TLS for HTTPS). In the OSI protocol suite, Abstract Syntax Notation One (ASN.1) is the notation used to describe data structures independently of any one machine’s representation.
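As a rough illustration of the translation and compression duties mentioned for this layer, the sketch below encodes some text into bytes and compresses it, then reverses both steps on the “receiving” side. The page content is made up, and UTF-8 plus gzip are just one common pairing, not something the OSI model prescribes.

```python
import gzip

page = "<h1>Café OSI</h1>" * 50  # hypothetical page content

encoded = page.encode("utf-8")       # text -> bytes in an agreed encoding
compressed = gzip.compress(encoded)  # bytes -> fewer bytes for the trip

# The receiving side undoes each step in reverse order.
restored = gzip.decompress(compressed).decode("utf-8")

print(len(encoded), len(compressed))  # compressed size is much smaller here
print(restored == page)
```

The application on either end never has to care that the data travelled as compressed bytes; it hands over text and gets text back.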
Layer Five: Session Layer
Interhost Communication
The session layer controls the connections between two computers by establishing, managing, and terminating sessions. Whenever we visit a website, our computer creates a session with the web server. In applications that use Remote Procedure Calls, the session layer is commonly implemented explicitly.
When we request a web page, the web browser opens a TCP connection (explained below) to the web server. The web server sends back the web page, and the connection is closed either right away or after being reused for further requests. Loosely speaking, each such connection can be thought of as a session.
Layer Four: Transport Layer
End-to-End Communication and Reliability
This layer provides host-to-host communication services for applications. That is, it is responsible for delivering data to the appropriate application process on the host computers. It also coordinates the data transfer process: how much data to send, at what rate, with what behaviors, and so on. Two important protocols for this layer are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). Each describes a very different approach to this transfer process.
TCP: Transmission Control Protocol
We (well, if you don’t, I) often hear the term “TCP/IP” in the context of computer networking. TCP is the original protocol that provides delivery of data between hosts, or end systems, over an IP network. Over time, “TCP/IP” came to refer to the entire network model (comparable to the OSI model that we are walking through). We will go into that in the next post. Now, some characteristics of TCP:
- It is a connection-oriented protocol. It requires handshaking (that is, acknowledging each side is there and ready) to set up communications.
- It is reliable. TCP ensures that data arrive intact and in the right order. It manages acknowledgments, retransmissions, and timeouts. If a segment is lost along the way, the sender notices the missing acknowledgment and retransmits it.
- It is heavyweight. TCP’s handshake requires three packets to set up a connection before any user data can be sent. It also carries more header and bookkeeping overhead in order to provide the features mentioned above.
- A brief TCP connection walk-through, seen from the server side: new socket → bind → listen → (a client connects and the handshake takes place) → (connection established) → accept
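The walk-through above maps fairly directly onto Python's standard `socket` module. The following is a minimal sketch, assuming everything runs on one machine over the loopback address; the helper `serve_once` and the “echo” behavior are made up for the example.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    # accept: pick up a connection whose handshake has already completed
    conn, _addr = server.accept()
    with conn:
        conn.sendall(b"echo: " + conn.recv(1024))


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # new socket
server.bind(("127.0.0.1", 0))  # bind; port 0 asks the OS for any free port
server.listen()                # listen: incoming handshakes are now accepted
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: connecting is what triggers the three-way handshake.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    chunks = []
    while chunk := client.recv(1024):  # read until the server closes
        chunks.append(chunk)
reply = b"".join(chunks)

worker.join()
server.close()
print(reply)
```

Note that no user data moves until the connection is established, which is exactly the up-front cost the “heavyweight” bullet refers to.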
UDP: User Datagram Protocol
Like TCP, UDP is a protocol that governs how data are transported at the transport layer. Yet UDP’s principle is quite different from TCP’s. TCP gets things done in a slower but thorough manner. UDP, on the other hand, values speed over reliability.
Some characteristics:
- UDP uses a simple connectionless communication model. It has no hand-shaking dialogues, so a message can be sent from one end point to another without prior arrangement.
- UDP does not keep track of lost packets, nor does it care about packet arrival order.
- As a result, UDP requires fewer computer resources.
- UDP is suitable for purposes where error checking and correction are either not necessary or are performed in the application, like streaming media applications and real-time applications.
- Since it is transaction-oriented, UDP is suitable for simple query-response protocols such as the Domain Name System (where queries must be fast and typically consist of a single request followed by a single reply packet).
- It provides datagrams, the basic transfer units of a packet-switched network, which makes it suitable for carrying other protocols, as in IP tunneling or Remote Procedure Call.
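For comparison with TCP, here is a small sketch of a single query and reply over UDP using Python's `socket` module, again assuming both ends live on the loopback address. The message contents are invented; the point is that there is no connect/accept step at all.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
receiver.settimeout(2.0)         # UDP gives no delivery guarantee, so don't wait forever
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.settimeout(2.0)

# No handshake: the first thing on the wire is already the query itself.
sender.sendto(b"ping", ("127.0.0.1", port))

query, sender_addr = receiver.recvfrom(1024)
receiver.sendto(b"reply: " + query, sender_addr)

answer, _addr = sender.recvfrom(1024)

sender.close()
receiver.close()
print(answer)
```

If either datagram were lost, nothing in UDP itself would resend it; the timeouts are the application's way of noticing, which is the trade-off described in the list above.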
Layer Three: Network Layer
Path Determination and Logical Addressing
This is the layer responsible for transferring variable-length network packets from a source to a destination, possibly through several intermediate networks. The router plays an important part here: it connects the local networks formed by switches. An important protocol for this layer is the Internet Protocol (IP).
IP: Internet Protocol
IP is a protocol that operates at the Network Layer of the OSI model. It provides addressing (IP addresses), connectionless communication, routing, and unicast/broadcast/multicast delivery.
Every device connected to an internet has an IP address. These addresses can change, but at any given time a public address is meant to identify a single network interface. For example, “104.27.187.82” would be a valid IP address for the website codecharms.me.
Router/Gateway
A router/gateway is a specialized host responsible for forwarding packets between networks. The reason why it exists is that many networks are partitioned into subnetworks and connect to other networks for wide-area communication.
Sending a Packet to Router
If we know both the IP address and the MAC address (explained below) of our destination, we can send the data. If the source and destination are not on the same network (subnet), the source sends the data to the router first, and the router forwards them on.
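The “same network or not” decision can be sketched with Python's standard `ipaddress` module. The function name `next_hop` and the 192.168.1.x addresses are assumptions made up for this example; a real host consults its routing table, which can hold more than one rule.

```python
import ipaddress


def next_hop(source_iface: str, destination: str, gateway: str) -> str:
    """Return the address the packet should be handed to first."""
    iface = ipaddress.ip_interface(source_iface)  # address plus prefix length
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        return str(dest)  # same network: deliver directly
    return gateway        # different network: hand it to the router


# Hypothetical home network 192.168.1.0/24 with its router at 192.168.1.1.
print(next_hop("192.168.1.10/24", "192.168.1.42", "192.168.1.1"))   # 192.168.1.42
print(next_hop("192.168.1.10/24", "104.27.187.82", "192.168.1.1"))  # 192.168.1.1
```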
Layer Two: Data Link Layer
Physical Addressing
At this layer, data are packaged into frames. The layer detects, and where possible handles, errors coming from the physical layer, and takes care of flow control and frame synchronization. The data link layer has two sub-layers: the Medium Access Control (MAC) layer and the Logical Link Control (LLC) layer. The device most commonly associated with this layer today is the network switch.
MAC: Medium Access Control
This sub-layer controls how the computer hardware gains access to the transmission medium and permission to transmit data on it. A MAC address (Media Access Control address) is a unique identifier for a device in communications within a network segment. It is used as a network address by technologies including Wi-Fi, Bluetooth and Ethernet.
MAC addresses are usually assigned by the manufacturer of the network interface card, so they are sometimes referred to as burned-in addresses. A MAC address looks like this: ‘3A-34-52-C4-69-B8’.
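A MAC address is simply 48 bits written as six two-digit hexadecimal groups. The sketch below parses and re-formats one; `parse_mac` and `format_mac` are hypothetical helper names, and accepting both “-” and “:” as separators is a choice made for this example.

```python
def parse_mac(text: str) -> bytes:
    """Turn a textual MAC address into its six raw bytes."""
    parts = text.replace(":", "-").split("-")
    if len(parts) != 6 or any(len(part) != 2 for part in parts):
        raise ValueError(f"not a MAC address: {text!r}")
    return bytes(int(part, 16) for part in parts)


def format_mac(raw: bytes) -> str:
    """Write six raw bytes in the dash-separated, upper-case style."""
    return "-".join(f"{byte:02X}" for byte in raw)


mac = parse_mac("3a:34:52:c4:69:b8")
print(len(mac) * 8)     # 48
print(format_mac(mac))  # 3A-34-52-C4-69-B8
```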
LLC: Logical Link Control
The LLC sublayer acts as an interface between the MAC sublayer and the network layer. It controls frame synchronization, flow control and error checking, and makes it possible for several network protocols (IP and AppleTalk, for example) to coexist.
Network Switch
A switch is a networking device that connects devices on a network. It uses hardware addresses (physical addresses, or MAC addresses) to process and forward data at the data link layer. It also forwards port to port, so data are sent only to the devices concerned, and it buffers frames when needed.
Layer One: Physical Layer
Media, Signal and Binary Transmission
The physical layer is where the raw data are transported in the form of bits, 0 and 1, across the network (i.e. where the magic happens). The signal can take the form of electrical voltages, light pulses, or radio waves.
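To see what “in the form of bits” means for even a tiny message, here is a short sketch that spells out the 0s and 1s behind two characters. The helper names `to_bits` and `from_bits` are made up, and real hardware applies further line coding on top of this before anything reaches the wire.

```python
def to_bits(data: bytes) -> str:
    """Spell out each byte as eight 0/1 characters."""
    return " ".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    """Rebuild the bytes from space-separated groups of eight bits."""
    return bytes(int(chunk, 2) for chunk in bits.split())


bits = to_bits(b"Hi")
print(bits)             # 01001000 01101001
print(from_bits(bits))  # b'Hi'
```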
That’s it
When you click on a URL, the browser immediately translates the event into a request. The request is sent down the OSI model and across networks, then climbs up the layers on the destination server, where the data are received and processed before a response is sent back. And it all happens in milliseconds (well, if nothing crashes and it’s not a badly structured website). Amazing, isn’t it?
The next post will be on TCP/IP, a more compact model that is often compared with the OSI model. Be excited.
This article is also published on Medium.