Master TCP/IP Programming: Build a Health Checker with Go
Welcome! Today you’ll gain a solid understanding of the foundations of TCP/IP, and together we’ll write a TCP server from scratch, plus a health checker for network monitoring.

If you already know TCP and are only interested in the Go implementation, I have provided timestamps so you can skip this explanation.

TCP, the Transmission Control Protocol, is the backbone of reliable data streaming between nodes in a network. Without it, the internet as we know it would not exist; it underpins our daily digital interactions, from simple web browsing to complex cloud services. But before we dive into writing TCP code and establishing network connections, it’s crucial to understand the process behind this reliability: the TCP handshake, which sets up the session for smooth communication.

Once we’ve covered the theory, we’ll get our hands dirty and implement a TCP session in Go. We’ll go over key operations such as dialing, listening, accepting client connections, and the proper way to **handle session termination**.

Lastly, we’ll implement a TCP health checker using only Go and its standard library. Let’s get started.
Why a TCP health checker?
Before continuing, let me know in the comments below if you like this type of video and want more of it. Topic suggestions and feedback are welcome too.
1. Ensure Service Availability
A TCP health checker is able to continuously verify if services (like websites or databases) are online and reachable. If a service goes down, you know right away, ensuring minimal downtime.
2. Identify Network Issues Early
It helps detect connection problems like high latency or packet loss. By catching these early, you can prevent performance issues from impacting users.
3. Optimize Troubleshooting
A TCP health checker gives you clear insights into where the problem is—whether it’s the network, the server, or another issue. This makes fixing problems faster.
4. Track Network Performance Metrics
Measuring packet loss, latency, and response times provides key data to analyze the network’s health. This helps optimize both speed and reliability.
5. Automate Monitoring
You can automate the monitoring of critical infrastructure. This reduces the need for manual checks and alerts you immediately when something goes wrong.
6. Support Scaling Infrastructure
As systems grow, maintaining connectivity across multiple servers and networks becomes harder. A health checker can help monitor and manage this complexity effectively.
What makes TCP Reliable?
TCP is a reliable protocol because it can handle problems like packet loss and packets arriving out of order. This is something you won’t get from a protocol like UDP.

One common mistake beginners make is thinking that TCP solves all network issues. That’s not the case. Let’s compare TCP with UDP.

UDP is connectionless: it keeps sending and receiving datagrams even if some are lost or arrive in the wrong order, and it never retransmits them. This might seem like a problem, but it’s actually helpful for certain types of applications. For example, live video streaming or online games would struggle with too much delay if every lost packet had to be resent. With UDP, you can keep things moving smoothly even if some data gets lost.

TCP allows for reliable data sharing between nodes on a network. Without TCP we wouldn’t have web browsing, email, text messaging, file transfers, and bank transactions as we know them, to name a few. So while UDP is better for speed in some cases, TCP is essential when data accuracy is critical.

Let’s talk about packet loss. This happens when data fails to reach its destination, typically because of transmission errors or network congestion. Imagine trying to send 100 Mbps of data over a link that only supports 10 Mbps. The network gets overwhelmed, and the nodes along the path drop the excess data, causing **data loss**.

TCP handles this by controlling the rate at which data is sent, so it can send as quickly as possible without losing too much along the way. If the network gets slower, for example because your Wi-Fi signal weakens, TCP slows down so the data still arrives properly; this is called congestion control. If the device you’re communicating with becomes overloaded, it advertises a smaller receive window and the sender slows down to match; this is called flow control.

But keep in mind that TCP isn’t magic. It can’t fix a bad network or poor hardware. What it does is make the best of the available network conditions by keeping track of the data being sent. If any data goes missing (unacknowledged packets), TCP automatically resends it.

Another neat trick TCP does is handling out-of-order packets. Because of how networks are built, packets don’t always take the same path, so they can arrive at different times or in the wrong order. TCP uses sequence numbers to rearrange everything so the data is processed in the correct sequence.

Together with flow control and retransmission, these properties allow TCP to overcome packet loss and make it easy to deliver data to the recipient. As an engineer working with TCP, you only have to focus on the data you send and receive.
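To make the reordering idea concrete, here is a toy sketch in Go. It is not how a real TCP stack works internally (that logic lives in the operating system). The `segment` type and the `reassemble` function are hypothetical names invented for this illustration, and sequence numbers are simplified to byte offsets starting at 0.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// segment is a hypothetical, simplified stand-in for a TCP segment:
// seq is the byte offset of the payload within the stream.
type segment struct {
	seq     int
	payload string
}

// reassemble orders segments by sequence number, skips duplicates
// (as produced by retransmission), and stops at the first gap.
func reassemble(segs []segment) string {
	sorted := append([]segment(nil), segs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].seq < sorted[j].seq })

	var b strings.Builder
	next := 0 // next byte offset the receiver expects
	for _, s := range sorted {
		if s.seq < next {
			continue // data already delivered: a duplicate
		}
		if s.seq > next {
			break // gap: a real receiver would wait for a retransmission
		}
		b.WriteString(s.payload)
		next += len(s.payload)
	}
	return b.String()
}

func main() {
	arrived := []segment{
		{seq: 7, payload: "world"},
		{seq: 0, payload: "hello, "},
		{seq: 0, payload: "hello, "}, // retransmitted duplicate
	}
	fmt.Println(reassemble(arrived)) // prints "hello, world"
}
```

The application only ever sees the reassembled stream, which is why you can focus on the data itself when working with TCP.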
Working with TCP Sessions
A TCP session allows you to deliver a stream of data of any size to a recipient. In most cases, the length of the stream will be determined by higher-level protocols, and TCP will “break” this data into smaller packets or chunks to ensure reliable delivery.

One of TCP’s best features is getting confirmation when the data is received. This means you don’t waste time sending large amounts of data only to find out later that something went wrong.

But what happens if a packet gets lost? TCP uses timeouts to detect it. After sending off a packet, the sender starts a timer and puts the packet in a retransmission queue. If the timer runs out and the sender has not yet received an ACK from the recipient, it automatically resends the missing packet.

You can think about a TCP session as a conversation between two nodes. It starts with a greeting, progresses into a conversation, and ends with a proper goodbye.
In Go, the `net` package handles most of the hard work, making it easier for you to manage network connections without the usual headaches.
Establishing a Session with the TCP Handshake
The TCP handshake sets up the sequence numbers, acknowledgements, and other settings the session needs for smooth communication. It is like a greeting between two nodes, in this case the client and the server. (TCP itself has no concept of client and server, only a session between two nodes: the dialing node and the listening node.) A TCP connection uses a three-way handshake to introduce the client to the server and the server to the client.

Before a TCP session can be established, the server must be listening for incoming connections, ready to react to them. That’s why the server is the listening node.

**First step of the handshake:** The client sends a packet with the synchronize (SYN) flag set to the server. This SYN packet informs the server of the client’s capabilities and preferred window settings for the rest of the conversation.

**Second step of the handshake:** The server responds with its own packet, with both the acknowledgement (ACK) and SYN flags set. The ACK flag tells the client that the server acknowledges receipt of the client’s SYN packet. The server’s SYN tells the client **what settings it has agreed to for the duration of the conversation.**

**Third and final step of the handshake:** The client replies with an ACK packet to acknowledge the server’s SYN packet, completing the three-way handshake. This establishes the TCP session, and the nodes may then exchange data.

Once a TCP session is set up, it stays open even if no data is being sent. But leaving a session idle for too long can waste memory. Later in the video, we’ll talk about how to manage these idle connections in your code.

When you start a TCP connection in Go, the call returns either a connection object or an error. If you get the connection object, the TCP handshake (the setup process) worked. The good news? You don’t need to handle the handshake yourself; Go and the operating system take care of it for you.
Establishing a TCP Connection with Go’s Standard Library
Go’s `net` package makes it easy to create servers and clients that use TCP (and other protocols). But it’s still important to handle connections properly. Our software should pay attention to incoming data and always try to close the connection cleanly when it’s done.

We’re going to write a TCP server that will:

1. Listen for incoming TCP connections
2. Allow clients to connect
3. Handle each connection in the background (asynchronously)
4. Exchange data between the server and client
5. Close the connection gracefully
Establishing a TCP listener / server with Go’s Standard Library
To create a TCP server capable of listening for incoming connections (called a listener) in Go, we use the `net.Listen` function. There are other functions for this purpose, such as `net.ListenTCP`, but `net.Listen` is the most general one. It returns an object that implements the `net.Listener` interface. Let’s start by creating a listener (or server).
Understanding `net.Listen` in Go
The `net.Listen` function is how we start creating a TCP server in Go. It needs two things:

1. A network type (like `tcp`, `tcp4`, or `tcp6`).
2. An IP address and port, combined into a single string and separated by a colon, like `"localhost:4321"`.

When you run `net.Listen`, it gives you two things back:

* A listener object (this is the server, ready to receive connections).
* An error (in case something went wrong).

If it works, your server is now “bound” to the IP address and port you gave it. Binding means the operating system has reserved that port for your server, and no other program can listen on it. If you try to use a port that’s already in use, `net.Listen` returns an error.

Always remember to close your listener when you’re done with it. You can do this with the `Close` method. It’s a good habit to use `defer` to close the listener automatically when your function finishes.

If you forget to close the listener, your program might run into problems, like leaked resources or goroutines stuck waiting for a connection that will never come. When you close the listener, any `Accept` calls that are blocked waiting for a connection are unblocked and return an error.
What if You Don’t Provide an IP or Port?
You don’t always have to give both an IP address and a port.

* If you leave the port empty or set it to `0`, an available port is picked for you.
* If you leave the IP empty, the server will listen on all available IP addresses (this includes both IPv4 and IPv6).

To see what IP and port your listener ended up with, you can call the `Addr` method on the listener.
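As a small illustration, the sketch below asks for port `0` on the loopback interface and then reads the assigned port back through `Addr`. The function name `listenAnyPort` is hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// listenAnyPort asks for port 0 on the loopback interface and returns the
// listener together with the port that was actually assigned.
func listenAnyPort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	// For TCP listeners, Addr returns a *net.TCPAddr.
	tcpAddr, ok := ln.Addr().(*net.TCPAddr)
	if !ok {
		ln.Close()
		return nil, 0, fmt.Errorf("unexpected address type %T", ln.Addr())
	}
	return ln, tcpAddr.Port, nil
}

func main() {
	ln, port, err := listenAnyPort()
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer ln.Close()
	fmt.Printf("listening on %s (port %d)\n", ln.Addr(), port)
}
```

This pattern is handy in tests, where hard-coding a port risks colliding with something already running on the machine.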
Network Types (`tcp`, `tcp4`, `tcp6`)
The first argument of `net.Listen` is the network type. The function accepts other stream-oriented networks as well (such as `unix`), but let’s focus on TCP:

* `tcp` is the most common and works with both IPv4 and IPv6.
* Use `tcp4` to limit your server to only IPv4 addresses.
* Use `tcp6` to limit it to IPv6 addresses.
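To see the difference in practice, the sketch below tries the same IPv4 loopback address with each network type. `listenAddr` is a hypothetical helper; with an IPv4 address, `tcp` and `tcp4` should bind while `tcp6` should be rejected.

```go
package main

import (
	"fmt"
	"net"
)

// listenAddr opens a listener on the given network and address, records
// the address it was bound to, and closes the listener again.
func listenAddr(network, address string) (string, error) {
	ln, err := net.Listen(network, address)
	if err != nil {
		return "", err
	}
	defer ln.Close()
	return ln.Addr().String(), nil
}

func main() {
	for _, network := range []string{"tcp", "tcp4", "tcp6"} {
		addr, err := listenAddr(network, "127.0.0.1:0")
		if err != nil {
			fmt.Printf("%-4s -> error: %v\n", network, err)
			continue
		}
		fmt.Printf("%-4s -> bound to %s\n", network, addr)
	}
}
```

Restricting the network type this way is useful when a service must only be reachable over one address family.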
How a listener / server can accept new TCP connections
In a real-world server, you’ll want to handle multiple connections at the same time, not just one. In Go, unless you want to accept only one connection, we use a `for` loop to keep the server accepting connections.

Once a connection comes in, we handle it in a goroutine. A goroutine is Go’s way of doing things in the background so that the server can get back to accepting new connections without waiting for the first one to finish.
The Process Step-by-Step
**1. Accepting connections**

Inside the `for` loop, the server waits for a connection by using the `Accept` method on the listener. This method blocks (pauses) until a client connects and the TCP handshake is complete.

**2. Getting a connection**

Once a client connects, `Accept` returns two things:

* A connection object (`net.Conn`) that represents the link between the client and server.
* An error (if something went wrong during the handshake or if the server stopped listening).

**3. Handling the connection**

We start a new goroutine to handle the connection asynchronously. This way, the server can keep listening for new connections while it processes the current one in the background.

**4. Closing the connection**

After we’re done interacting with the client, we need to call `Close` on the connection. This gracefully shuts down the connection by sending a FIN packet (this tells the client the connection is done).

Using goroutines lets us handle multiple clients simultaneously, taking advantage of Go’s concurrency strengths. Without them, the server would get stuck waiting for each connection to finish before moving on to the next one, which would be really inefficient.

Now that we understand how to make a TCP listener, let’s implement it, and then let’s build our TCP health checker.
Making the TCP health checker
TCP listener code
```go
package main

import (
	"errors"
	"log"
	"net"
	"time"
)

func main() {
	TCPListener()
}

// TCPListener binds to 127.0.0.1:4321 and serves each accepted
// connection in its own goroutine.
func TCPListener() {
	listener, err := net.Listen("tcp", "127.0.0.1:4321")
	if err != nil {
		log.Fatalf("net.Listen() error: %v", err)
	}
	defer func() {
		// log.Printf rather than log.Fatalf: exiting from a deferred
		// call would skip any other deferred cleanup.
		if err := listener.Close(); err != nil {
			log.Printf("listener.Close() error: %v", err)
		}
	}()

	log.Printf("Listening on %s", listener.Addr())

	// Accept new connections until the listener is closed.
	for {
		conn, err := listener.Accept()
		if err != nil {
			// A closed listener never recovers, so stop instead of spinning.
			if errors.Is(err, net.ErrClosed) {
				return
			}
			log.Printf("listener.Accept() error: %v", err)
			continue
		}

		// Handle the accepted connection in the background.
		go handleConnection(conn)
	}
}

// handleConnection greets the client, then closes the connection.
func handleConnection(conn net.Conn) {
	start := time.Now()
	defer func() {
		if err := conn.Close(); err != nil {
			log.Printf("conn.Close() error: %v", err)
		}
		// time.Duration prints its own unit, so no "ms" suffix is needed.
		log.Printf("Closed connection from %s. Connection duration: %v", conn.RemoteAddr(), time.Since(start))
	}()

	log.Printf("Accepted connection from %s", conn.RemoteAddr())

	if _, err := conn.Write([]byte("Hello, client!")); err != nil {
		log.Printf("conn.Write() error: %v", err)
	}
}
```
HealthChecker Code
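```go
package main

import (
	"fmt"
	"io"
	"log"
	"net"
	"strconv"
	"time"
)

func main() {
	checker := NewTCPChecker(net.ParseIP("127.0.0.1"), 4321, 10)
	checker.Timeout = 1 * time.Second

	logOutput := log.Writer()
	result := checker.CheckWithRetries(5, 4*time.Second, logOutput)
	fmt.Println("Result:", result.Message)
}

// Target describes the endpoint to check.
type Target struct {
	IP      net.IP
	Host    net.IP // not used by the checks below
	Port    int
	Packets int // not used by the checks below
}

// TCPChecker checks whether a Target accepts TCP connections.
type TCPChecker struct {
	Target
	Timeout time.Duration
}

// Result is the outcome of a single health check.
type Result struct {
	Success bool
	Message string
}

func NewTCPChecker(ip net.IP, port int, packets int) *TCPChecker {
	return &TCPChecker{
		Target: Target{
			IP:      ip,
			Port:    port,
			Packets: packets,
		},
	}
}

func (hc *TCPChecker) addr() string {
	// JoinHostPort brackets IPv6 addresses, which "%s:%d" would not.
	return net.JoinHostPort(hc.IP.String(), strconv.Itoa(hc.Port))
}

// Check dials the target once and reads whatever the server sends
// until the server closes the connection or the timeout expires.
func (hc *TCPChecker) Check(timeout time.Duration) *Result {
	conn, err := net.DialTimeout("tcp", hc.addr(), timeout)
	if err != nil {
		return &Result{Success: false, Message: fmt.Sprintf("Failed to connect: %v", err)}
	}
	defer conn.Close() // gracefully close the connection

	// Bound the reads as well; otherwise a server that accepts the
	// connection but never closes it would block this check forever.
	if timeout > 0 {
		if err := conn.SetReadDeadline(time.Now().Add(timeout)); err != nil {
			log.Printf("conn.SetReadDeadline() error: %v", err)
		}
	}

	// Here you can add further checks on the connection, such as TLS,
	// packet loss rate, and more.
	buf := make([]byte, 1024)
	for {
		n, err := conn.Read(buf)
		if n > 0 {
			log.Printf("Received: %q", buf[:n])
		}
		if err != nil {
			if err != io.EOF {
				log.Printf("conn.Read() error: %v", err)
			}
			break
		}
	}

	return &Result{Success: true, Message: "Connected successfully"}
}

// CheckWithRetries runs Check up to retries times, waiting retryDelay
// between failed attempts, and logs each attempt to logOutput.
func (hc *TCPChecker) CheckWithRetries(retries int, retryDelay time.Duration, logOutput io.Writer) *Result {
	result := &Result{Success: false, Message: "No attempts made"}

	for i := 0; i < retries; i++ {
		start := time.Now()
		result = hc.Check(hc.Timeout)
		duration := time.Since(start)

		fmt.Fprintf(logOutput, "Health Check Attempt %d - Success: %v, Latency: %v, MSG: %s\n", i+1, result.Success, duration, result.Message)

		if result.Success {
			return result
		}

		// If not successful, wait and try again (no wait after the last attempt).
		if i < retries-1 {
			time.Sleep(retryDelay)
		}
	}

	return result
}
```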