Your website (or online application or service) isn’t just a website. It’s also the backend, plus the hosting medium (local hardware, cloud, etc.), plus the IP connection, plus search engine visibility, plus an open-ended (and growing) list of other factors. And one of the most basic factors is the flow of traffic. In many ways, it is the key factor in the dynamic environment in which any website or service lives. Without any traffic, your site doesn’t really exist in the online universe.
Your site’s traffic defines its relationship with your user base, and with the broader internet user base in general, so understanding web application traffic flow is critical. Amazon’s Virtual Private Cloud (VPC) Flow Logging is intended to help you with that.
VPC Flow Logging is a service provided by Amazon which allows you to create separate logs of the flow of traffic for an entire VPC, for a specified subnet, or for a specific cloud-based network interface; when you log a VPC or a subnet, all of its subordinate network interfaces are included in the log. The logs themselves aren’t real-time; they are broken up into capture windows, each of which represents about ten minutes’ worth of activity. While this precludes real-time response to log events, it does allow you to respond (or set up automated responses) with a ten-minute time granularity. Within this limit, the log data provides an extremely valuable record of specific events, along with statistical information which can be used to analyze a variety of metrics.
Amazon VPC itself is designed for near-maximum flexibility; you can use it to do, if not everything, at least a large enough subset of everything to meet the DevOps deployment needs of most organizations. This includes creating a precisely-controlled mix of private and public-facing resources.
Each line in a VPC Flow Log is a record of the network flow from a specific IP/port combination to a specific IP/port combination using a specific protocol during the capture window. The record captures the source and destination IPs and ports, the protocol, the amount of data transferred in packets and bytes, the capture window start and end times, and whether the traffic was accepted or rejected according to your security group and network access control list (ACL) rules. Each record also includes some basic housekeeping information, such as the IDs of the account and the network interface, and the status of the log record itself (normal logging, no data during the capture window, or data skipped during the window).
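To make the record layout concrete, here is a minimal Python sketch that splits one space-separated record into named fields. It assumes the default (version 2) field order; the function name `parse_record` and the sample values are illustrative, not part of any AWS tooling.

```python
from typing import Dict, Optional, Union

# Field order of the default (version 2) flow log record format (assumed here).
FIELDS = ("version", "account_id", "interface_id", "srcaddr", "dstaddr",
          "srcport", "dstport", "protocol", "packets", "bytes",
          "start", "end", "action", "log_status")
INT_FIELDS = {"version", "srcport", "dstport", "protocol",
              "packets", "bytes", "start", "end"}

def parse_record(line: str) -> Dict[str, Optional[Union[int, str]]]:
    """Turn one flow log line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = {}
    for name, value in zip(FIELDS, values):
        if value == "-":              # "-" marks a field with no data
            record[name] = None
        elif name in INT_FIELDS:
            record[name] = int(value)
        else:
            record[name] = value
    return record

# Illustrative sample lines (made-up account and interface IDs).
ok_line = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
           "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
nodata_line = ("2 123456789010 eni-1235b8ca123456789 - - - - - - - "
               "1431280876 1431280934 - NODATA")

rec = parse_record(ok_line)
empty = parse_record(nodata_line)
print(rec["srcaddr"], rec["dstport"], rec["action"])
```

Records whose status is not normal logging carry "-" in most fields, which is why the sketch maps "-" to `None` instead of trying to convert it to a number.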
VPC Flow Logs Impact on DevOps
What does this mean for AWS DevOps? The logs are filterable, so, for example, you can create a log of nothing but records with the REJECT status, allowing you to quickly examine and analyze all rejected connection requests. You can also create alarms, which will send notifications of specific types of events, such as a high rate of rejected requests within a specified time period, or unusual levels of data transfer. And you can use API commands to pull data from the logs for use with your own automation tools.
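The filtering and alarm ideas can be sketched locally in Python. This is only an illustration of the logic, not the managed filter or alarm mechanism itself: it assumes records in the default space-separated format have already been pulled from the logs, and the names `rejected` and `noisy_sources` are hypothetical.

```python
from collections import Counter

ACTION, SRCADDR, FIELD_COUNT = 12, 3, 14  # positions in the default record format

def rejected(lines):
    """Yield the split fields of every record whose action is REJECT."""
    for line in lines:
        fields = line.split()
        if len(fields) == FIELD_COUNT and fields[ACTION] == "REJECT":
            yield fields

def noisy_sources(lines, threshold):
    """Return source IPs with at least `threshold` rejected flows."""
    counts = Counter(fields[SRCADDR] for fields in rejected(lines))
    return {src: n for src, n in counts.items() if n >= threshold}

# Illustrative records using documentation-range IP addresses.
template = "2 123456789010 eni-aaaa1111 {src} 10.0.0.5 40000 22 6 1 40 1418530010 1418530070 {action} OK"
sample = [
    template.format(src="203.0.113.12", action="REJECT"),
    template.format(src="203.0.113.12", action="REJECT"),
    template.format(src="203.0.113.12", action="REJECT"),
    template.format(src="198.51.100.7", action="ACCEPT"),
    template.format(src="198.51.100.9", action="REJECT"),
]

alerts = noisy_sources(sample, threshold=3)
print(alerts)
```

In a real deployment the threshold and the time period would be tuned to the traffic you expect; the fixed value of 3 is only there to keep the example small.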
The ability to create logs for specific subnets and interfaces makes it possible to monitor and control individual elements of your VPC independently. This is an important capability in a system which combines private and public-facing elements, particularly when the private and public-facing regions interface with each other. VPC Flow Logging can, for example, help you diagnose security and access problems in such a mixed environment by indicating at what point in an interaction an unexpected ACCEPT or REJECT response occurred, and which direction it came from. You can also tailor automated responses to the role of a specific subnet or interface, based on the user behavior you expect there. For instance, a high volume of data transfer or repeated REJECT responses on a private subnet could trigger a different response than the same pattern on a public-facing interface (or even on another private subnet).
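One way to picture role-based responses is a per-role threshold table. The Python sketch below is an assumption-heavy illustration: the interface IDs, the role mapping, and the limits are all hypothetical, and in practice the mapping would come from your own inventory of subnets and interfaces.

```python
from collections import Counter

# Hypothetical inventory: which role each network interface plays.
ROLE_BY_INTERFACE = {"eni-aaaa1111": "public", "eni-bbbb2222": "private"}
# Hypothetical limits: rejects tolerated per batch of records, by role.
REJECT_LIMITS = {"public": 100, "private": 5}

def interfaces_over_limit(lines):
    """Return (interface, role, reject_count) for interfaces above their role's limit."""
    rejects = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) == 14 and fields[12] == "REJECT":
            rejects[fields[2]] += 1          # fields[2] is the interface ID
    flagged = []
    for eni, count in rejects.items():
        role = ROLE_BY_INTERFACE.get(eni, "private")  # unknown interfaces get the strict limit
        if count > REJECT_LIMITS[role]:
            flagged.append((eni, role, count))
    return sorted(flagged)

def make_line(eni, action):
    """Build an illustrative record for the given interface and action."""
    return (f"2 123456789010 {eni} 198.51.100.7 10.0.0.5 "
            f"40000 22 6 1 40 1418530010 1418530070 {action} OK")

sample = ([make_line("eni-aaaa1111", "REJECT")] * 6
          + [make_line("eni-bbbb2222", "REJECT")] * 6)

flagged = interfaces_over_limit(sample)
print(flagged)
```

Here the same six rejects are ignored on the public-facing interface but flagged on the private one, which is the kind of role-dependent behavior described above.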
The biggest impact that VPC Flow Logging has on DevOps may come from the ability of third-party analytics applications to interface with the Flow Logs. An analytics suite with access to Flow Log data, for example, can map such things as ACCEPT/REJECT and dropped data statistics by region and time. It can also provide detailed insight into user traffic for specific elements of a site or specific services; transferring the raw Flow Log data to a visual analytics dashboard can make specific traffic patterns or anomalies stand out instantly.
Although the data contained in individual Flow Log records may appear to be somewhat basic at first glance, an analytics suite can provide a sophisticated and detailed picture of your site’s traffic by correlating data within individual Flow Logs, with data from other Flow Logs for the same system, and with data from outside sources. When displayed on a visual analytics dashboard with drill-down capabilities, correlated log data can provide you with an in-depth picture of your site’s traffic at varying levels of resolution, and it can provide you with a detailed view of any anomalies, suspicious access attempts, or configuration errors.
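As a small-scale illustration of this kind of correlation, the Python sketch below combines records from more than one interface into a per-port summary of accepted flows, rejected flows, and bytes. It is a toy stand-in for what an analytics suite does, assuming the default record format; the `summarize` helper and the sample records are hypothetical.

```python
from collections import defaultdict

def summarize(lines):
    """Aggregate flow counts and byte totals per destination port."""
    totals = defaultdict(lambda: {"ACCEPT": 0, "REJECT": 0, "bytes": 0})
    for line in lines:
        fields = line.split()
        # Skip malformed lines and records without traffic data (status other than OK).
        if len(fields) != 14 or fields[13] != "OK":
            continue
        port, nbytes, action = int(fields[6]), int(fields[9]), fields[12]
        if action in ("ACCEPT", "REJECT"):
            totals[port][action] += 1
            totals[port]["bytes"] += nbytes
    return dict(totals)

# Illustrative records from two interfaces, as if merged from two Flow Logs.
sample = [
    "2 123456789010 eni-aaaa1111 198.51.100.7 10.0.0.5 40000 443 6 10 5000 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-bbbb2222 198.51.100.8 10.0.1.9 40001 443 6 4 1200 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-aaaa1111 203.0.113.12 10.0.0.5 40002 22 6 1 40 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-bbbb2222 - - - - - - - 1418530010 1418530070 - NODATA",
]

summary = summarize(sample)
for port in sorted(summary):
    print(port, summary[port])
```

A dashboard would layer time, region, and outside data on top of a summary like this; the sketch only shows the first aggregation step.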
Sumo Logic App for Amazon VPC Flow Logs
The Sumo Logic App for Amazon VPC Flow Logs leverages this data to provide real-time visibility and analysis of your environment. The app consists of predefined searches, and Live and Interactive Dashboards that give you real-time visibility and data visualization of where network traffic is coming from and going to, the actions that were taken (accepted or rejected), and data volumes (number of packets and byte counts). You can also share relevant dashboards and alerts with your security teams to streamline operational, security and auditing processes. Specifically, the Sumo Logic application for Amazon VPC Flow Logs allows you to:
- Understand where there are latency and failures in your network.
- Monitor trending behaviors and traffic patterns over time.
- Generate alarms for anomalies and outliers observed in the network traffic, based on fields such as source/destination IP address, number of packets accepted/rejected, and byte count.
It’s worth noting that Sumo Logic also provides a community-supported collection method using a Kinesis stream. If you are using Amazon Kinesis and are interested in this option, log into the Sumo Logic Support Portal and see the instructions in Sumo Logic App for Amazon VPC Flow Logs using Kinesis. But for most use cases, the Sumo Logic App for Amazon VPC Flow Logs described here is the preferred solution.
Summary
Because the VPC Flow Log system is native to the Amazon Cloud environment, it eliminates the overhead and the limitations associated with agent-based logging, including the limited scope of network flow that is visible to most agents. VPC Flow Logs provide a multidimensional, multi-resolution view of your network flow, along with an API that is versatile enough for both automation and sophisticated analytics. All in all, the VPC Flow Log system is one of the most valuable and flexible cloud-based tools available for DevOps today.
About the Author
Michael Churchman started as a scriptwriter, editor, and producer during the anything-goes early years of the game industry. He spent much of the 90s in the high-pressure bundled software industry, where the move from waterfall to faster release was well under way, and near-continuous release cycles and automated deployment were already de facto standards. During that time he developed a semi-automated system for managing localization in over fifteen languages. For the past ten years, he has been involved in the analysis of software development processes and related engineering management issues.