
Customers Share their AWS Logging with Sumo Logic Use Cases


In June Sumo Dojo (our online community) launched a contest to learn more about how our customers are using Amazon Web Services like EC2, S3, ELB, and AWS Lambda. The Sumo Logic service is built on AWS and we have deep integration into Amazon Web Services. And as an AWS Technology Partner we’ve collaborated closely with AWS to build apps like the Sumo Logic App for Lambda.

So we wanted to see how our customers are using Sumo Logic to do things like collecting logs from CloudWatch to gain visibility into their AWS applications. We thought you’d be interested in hearing how others are using AWS and Sumo Logic, too. So in this post I’ll share their stories along with announcing the contest winner.

The contest narrowed down to two finalists. SmartThings, a Samsung company, operates in the home automation industry and provides access to a wide range of connected devices to create smarter homes that enhance comfort, convenience, security and energy management for the consumer.

WHOmentors, Inc. our second finalist, is a publicly supported scientific, educational and charitable corporation, and fiscal sponsor of Teen Hackathon. The organization is, according to their site, “primarily engaged in interdisciplinary applied research to gain knowledge or understanding to determine the means by which a specific, recognized need may be met.”

At stake was a DJI Phantom 3 Drone. All entrants were awarded a $10 Amazon gift card.


AWS Logging Contest Rules

The Drone winner was selected based on the following criteria:

  • You had to be a user of Sumo Logic and AWS.
  • To enter the contest, a comment had to be placed on this thread in Sumo Dojo.
  • The post could not be anonymous – you were required to log in to post and enter.
  • Submissions closed August 15th.

As noted in the Sumo Dojo posting, the winner would be selected based on our own editorial judgment and community reactions to the post (in the form of comments or “likes”), with a preference for the entry that was most interesting, useful and detailed.

SmartThings

SmartThings has been working on a feature to enable over-the-air (OTA) firmware updates of Zigbee devices on users’ home networks. For the uninitiated, Zigbee is an IEEE specification for a suite of high-level communication protocols used to create personal area networks with small, low-power digital radios. See the Zigbee Alliance for more information.

According to one of the firmware engineers at SmartThings, there are a lot of edge cases and potential points of failure for an OTA update including:

  • The Cloud Platform
  • An end user’s hub
  • The device itself
  • Power failures
  • RF interference on the mesh network

Disaster in this scenario would be a user’s device ending up in a broken state. As Vlad Shtibin related:

“Our platform is deployed across multiple geographical regions, which are hosted on AWS. Within each region we support multiple shards, furthermore within each shard we run multiple application clusters. The bulk of the services involved in the firmware update are JVM based application servers that run on AWS EC2 instances.

Our goal for monitoring was to be able to identify as many of these failure points as possible and implement a recovery strategy. Identifying these points is where Sumo Logic comes into the picture. We use a key-value logger with a specific key/value for each of these failure points as well as a correlation ID for each point of the flow. Using Sumo Logic, we are able to aggregate all of these logs by passing the correlation ID when we make calls between the systems.


We then created a search query (eventually a dashboard) to view the flow of the firmware updates as they went from our cloud down to the device and back up to the cloud to acknowledge that the firmware was updated. This query parses the log messages to retrieve the correlation ID, hub, device, status, firmware versions, etc. These values are then fed into a Sumo Logic transaction, enabling us to easily view the state of a firmware update for any user in the system at a micro level and the overall health of all OTA updates on the macro level.

Depending on which part of the infrastructure the OTA update failed in, engineers are then able to dig deeper into the specific EC2 instance that had a problem. Because our application servers produce logs at the WARN and ERROR level, we can see if the update failed because of a timeout from the AWS ElastiCache service, or from a problem with a query on AWS RDS. Having quick access to logs across the cluster enables us to identify issues across our platform regardless of which AWS service we are using.

As Vlad noted, this feature is still being tested and hasn’t been fully rolled out in production yet. “The big takeaway is that we are much more confident in our ability to identify updates, triage them when they fail and ensure that the feature is working correctly because of Sumo Logic.”
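For readers who want to experiment with a similar approach, here is a rough sketch of what a correlation-ID query could look like. The source category, key names, and states below are hypothetical (they are not SmartThings’ actual schema), and the exact transaction operator syntax is covered in Sumo Logic’s documentation:

_sourceCategory=ota/firmware
| parse "correlation_id=*," as correlation_id
| parse "status=*," as status
| transaction on correlation_id
  with states cloud_sent, hub_received, device_updated, cloud_acked in status
  results by transactions

Grouping on the correlation ID is what lets a single dashboard panel follow each update end to end, from cloud to hub to device and back.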

WHOmentors.com

WHOmentors.com, Inc. is a nonprofit scientific research organization and the 501(c)(3) fiscal sponsor of Teen Hackathon. To facilitate their training in languages and runtimes like Java, Python, and Node.js, each individual participant begins with the Alexa Skills Kit, a collection of self-service application program interfaces (APIs), tools, documentation and code samples that make it fast and easy for teens to add capabilities to Alexa-enabled products such as the Echo, Tap, or Dot.

According to WHOmentors.com CEO Rauhmel Fox, “The easiest way to build the cloud-based service for a custom Alexa skill is by using AWS Lambda, an AWS offering that runs inline or uploaded code only when it’s needed and scales automatically, so there is no need to provision or continuously run servers.

With AWS Lambda, WHOmentors.com pays only for what it uses. The corporate account is charged based on the number of requests for created functions and the time the code executes. While the AWS Lambda free tier includes one million free requests per month and 400,000 gigabyte (GB)-seconds of compute time per month, cost becomes a concern when the students create complex applications that tie Lambda to other, more expensive services, or when their Lambda programs grow too large.

Ordinarily, someone would be assigned to use Amazon CloudWatch to monitor and troubleshoot the serverless system architecture and multiple applications using existing AWS system, application, and custom log files. Unfortunately, there isn’t a central dashboard to monitor all created Lambda functions.

With the integration of a single Sumo Logic collector, WHOmentors.com can automatically route all Amazon CloudWatch logs to the Sumo Logic service for advanced analytics and real-time visualization using the Sumo Logic Lambda functions on Github.”
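For anyone wiring up something similar, pointing a CloudWatch Logs group at a collector Lambda generally comes down to adding a subscription filter. The log group, function name, region, and account ID below are placeholders rather than WHOmentors.com’s actual setup, and the target Lambda must already allow invocation by CloudWatch Logs:

# Hypothetical names; forwards every event in the log group to the collector function
aws logs put-subscription-filter \
  --log-group-name "/aws/lambda/alexa-skill-demo" \
  --filter-name "sumo-forwarder" \
  --filter-pattern "" \
  --destination-arn "arn:aws:lambda:us-east-1:123456789012:function:SumoCWLogsLambda"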

Using the Sumo Logic Lambda Functions

“Instead of a “pull data” model, the “Sumo Logic Lambda function” grabs files and sends them to the Sumo Logic web application immediately. Their online log analysis tool offers reporting, dashboards, and alerting as well as the ability to run specific advanced queries as needed.

The real-time log analysis provided by the “Sumo Logic Lambda function” helps me quickly catch and troubleshoot performance issues, such as the request rate of concurrent executions from both stream-based and non-stream-based event sources, rather than having to wait hours to identify whether there was an issue.

I am most concerned about AWS Lambda limits (i.e., code storage) that are fixed and cannot be changed at this time. By default, AWS Lambda limits the total concurrent executions across all functions within a given region to 100. Why? The default limit is a safety limit that protects the corporate account from costs due to potential runaway or recursive functions during initial development and testing.

As a result, I can quickly determine the performance of any Lambda function and clean up the corporate account by removing Lambda functions that are no longer used, or figure out how to reduce the code size of the Lambda functions that should not be removed, such as apps in production.”

The biggest relief for Rauhmel is that he is able to encourage the trainees to focus on coding their applications instead of pressuring them to worry about the logs associated with the Lambda functions they create.

And the Winner of AWS Logging Contest is…

Just as at the end of an epic World Series battle between two MLB teams, you sometimes wish both could be declared the winner. Alas, there can only be one. We looked closely at the use cases, which were very different from one another. Weighing factors like the breadth of usage across the Sumo Logic and AWS platforms added to the drama. While SmartThings uses Sumo Logic broadly to troubleshoot and prevent failure points, WHOmentors.com’s use case is specific to AWS Lambda. But we couldn’t ignore the cause of helping teens learn to write code in popular programming languages, and building skills that may one day lead them to a job.

Congratulations to WHOmentors.com. Your Drone is on its way!


Using HTTP Request Builders to Create Repeatable API Workflows


As an API Engineer, you’ve probably spent hours carefully considering how your API will be consumed by client software, what data you are making available at which points within particular workflows, and strategies for handling errors that bubble up when a client insists on feeding garbage to your API. You’ve written tests for the serializers and expected API behaviors, and you even thought to mock those external integrations so you can dive right into the build. As you settle in for a productive afternoon of development, you notice a glaring legacy element in your otherwise modern development setup:

  • Latest and greatest version of your IDE: Check.
  • Updated compiler and toolchain: Installed.
  • Continuous Integration: Ready and waiting to put your code through its paces.
  • That random text file containing a bunch of clumsily ordered cURL commands.

…one of these things is not like the others.

It turns out we’ve evolved…and so have our API tools

Once upon a time, that little text file was state-of-the-art in API development. You could easily copy-paste commands into a terminal and watch your server code spring into action; however, deviating from previously built requests required careful editing. Invariably, a typo would creep into a crucial header declaration, or revisions to required parameters were inconsistently applied, or perhaps a change in HTTP method resulted in a subtly different API behavior that went unnoticed release over release.

HTTP Request Builders were developed to take the sting out of developing and testing HTTP endpoints by reducing the overhead of building and maintaining test harnesses, freeing you to write better code, faster. Two of the leaders in the commercial space are Postman and Paw, and they provide a number of key features that will resonate with those who either create or consume APIs:

  • Create HTTP Requests in a visual editor: See the impact of your selected headers and request bodies on the request before you send it off to your server. Want to try an experiment? Toggle parameters on or off with ease or simply duplicate an existing request and try two different approaches!
  • Organize requests for your own workflow…or collaborate with others: Create folders, reorder, and reorganize requests to make it painless to walk through sequential API calls.
  • Test across multiple environments: Effortlessly switch between server environments or other variable data without having to rewrite every one of your requests.
  • Inject dynamic data: Run your APIs as you would expect them to run in production, taking data from a previous API as the input to another API.

From here, let’s explore the main features of HTTP Request Builders via Paw and show how those features can help make your development and test cycles more efficient. Although Paw will be featured in this post, many of these capabilities exist in other HTTP Builder packages such as Postman.

How to Streamline your HTTP Request Pipeline

Command-line interfaces are great for piping together functionality in one-off tests or when building out scripts for machines to follow, but quickly become unwieldy when you have a need to make sweeping changes to the structure or format of an API call. This is where visual editors shine, giving the human user an easily digestible view of the structure of the HTTP request, including its headers, querystring and body so that you can review and edit requests in a format that puts the human first. Paw’s editor is broken up into three areas. Working from left to right, these areas are:

  • Request List: Each distinct request in your Paw document gets a new row in this panel and represents the collection of request data and response history associated with that specific request.
  • HTTP Request Builder: This is the primary editor for constructing HTTP requests. Tabs within this panel allow you to quickly switch between editing headers, URL parameters, and request bodies. At the bottom of the panel is the code generator, allowing you to quickly spawn code for a variety of languages including Objective-C, Swift, Java, and even cURL!
  • HTTP Exchange: This panel reflects the most recent request and associated response objects returned by the remote server. This panel also offers navigation controls for viewing historical requests and responses.


Figure 1. Paw Document containing three sample HTTP Requests and the default panel arrangement.

As you work through building up the requests that you use in your API workflows, you can easily duplicate, edit, and execute a request all in a matter of a few seconds. This allows you to easily experiment with alternate request formats or payloads while also retaining each of your previous versions. You might even score some brownie points with your QA team by providing a document with templated requests they can use to kick-start their testing of your new API!

Organize Request Lists for Yourself and Others

The Request List panel also doubles as the Paw document’s organization structure. As you add new requests, they will appear at the bottom of the list; however, you can customize the order by dragging and dropping requests, or create folders to group related requests together. The order and names attached to each request help humans understand what the request does, but in no way impact the actual requests made of the remote resource. Use these organization tools to make it easy for you to run through a series of tests or to show others exactly how to replicate a problem.

If the custom sort options don’t quite cover your needs, or if your document starts to become too large, Sort and Filter bars appear at the bottom of the Request List to help you focus only on the requests you are actively working with. Group by URL or use the text filter to find only those requests that contain the URL you are working with.


Figure 2. Request List panel showing saved requests, folder organization, and filtering options.

Dealing with Environments and Variables

Of course, many times you want to be able to test out behaviors across different environments — perhaps your local development instance, or the development instance updated by the Continuous Integration service. Or perhaps you may even want to compare functionality to what is presently available in production.

It would be quite annoying to have to edit each of your requests and change the URL from one host to another. Instead, let Paw manage that with a quick switch in the UI.


Figure 3. Paw’s Environment Switcher changes variables with just a couple of clicks.

The Manage Environments view allows you to create different “Domains” for related kinds of variables, and add “Environments” as necessary to handle permutations of these values:


Figure 4. Paw’s Environment Editor shows all Domains and gives easy access to each Environment.

This allows you flexibility in adjusting the structure of a payload with a few quick clicks instead of having to handcraft an entirely new request. The Code Generator pane at the bottom of the Request Builder pane updates to show you exactly how your payload changes:


Figure 5. Paw Document showing the rebuilt request based on the Server Domain’s Environment.

One of the most common setups is to have a Server Domain with Environments for the different deployed versions of code. From there, you could build out a variable for the Base URL, or split it into multiple variables so that the protocol could be changed, independent of the host address — perhaps in order to quickly test whether HTTP to HTTPS redirection still works after making changes to a load balancer or routing configuration. Paw’s variables can even peer into other requests and responses and automatically rewrite successive APIs.

Many APIs require some form of authentication to read or write privileged material. Perhaps the mechanism is something simple like a cookie or authentication header, or something more complex like an OAuth handshake. Either way, there is a bit of data in the response of one API that should be included in the request to a subsequent API. Paw variables can parse data from prior requests and prior responses, dynamically updating subsequent requests:


Figure 6. Paw Document revealing the Response Parsed Body Variable extracting data from one request and injecting it into another.

In the case shown above, we’ve set a “Response parsed body” variable as a Querystring parameter to a successive API, specifically grabbing the UserId key for the post at index 0 in the Top 100 Posts Request. Any indexable path in the response of a previous request is available in the editor. You may need to extract a session token from the sign-in API and apply it to subsequent authenticated-only requests. Setting this variable gives you the flexibility to change server environments or users, execute a sign-in API call, then proceed to hit protected endpoints in just a few moments rather than having to make sweeping edits to your requests.
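For context, here is roughly the manual flow that Paw’s response-parsing variables automate for you. The endpoint, JSON field names, and token format are hypothetical, and the sketch assumes curl and jq are available:

# Sign in and pull a session token out of the JSON response (hypothetical API)
TOKEN=$(curl -s -X POST https://api.example.com/v1/sign-in \
  -H "Content-Type: application/json" \
  -d '{"username":"demo","password":"secret"}' | jq -r '.sessionToken')

# Reuse the extracted token against a protected endpoint
curl -s https://api.example.com/v1/me -H "Authorization: Bearer $TOKEN"

Every time the server environment or user changes, those hand edits have to be repeated, which is exactly the busywork the request builder takes off your plate.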

Request Builders: Fast Feedback, Quick Test Cycles

HTTP Request Builders help give both API developers and API consumers a human-centric way of interacting with what is primarily a machine-to-machine interface. By making it easy to build and edit HTTP requests, providing mechanisms to organize, sort, and filter requests, and allowing fast or automatic substitution of request data, they make working with any API much easier to digest. The next time someone hands you a bunch of cURL commands, take a few of those minutes you’ve saved from use of these tools, and help a developer join us here in the future!

Using HTTP Request Builders to Create Repeatable API Workflows is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Bryan Musial (@BKMu) is a Full-Stack Senior Software Engineer with the San Francisco-based startup, Tally (www.meettally.com), working to make managing your credit cards automatic, simple, and secure. Previously, Bryan worked for the Blackboard Mobile team where he built and shipped iOS and Android applications used by millions of students and teachers every day.

Integrated Container Security Monitoring with Twistlock


Twistlock provides dev-to-production security for the container environment. More specifically, the Twistlock container security suite offers four major areas of functionality:

  • Vulnerability management that inspects the full stack of components in a container image and allows you to eradicate vulnerabilities before deployment.
  • Compliance which enforces compliance with industry best practices and configuration policies, with 90+ built-in settings covering the entire CIS Docker benchmark.
  • Access control that applies granular policies to managing user access to Docker, Swarm, and Kubernetes APIs.  This capability builds on Twistlock’s authorization plugin framework that’s been shipping as a part of Docker itself since 1.10.
  • Runtime defense, which combines static analysis, machine learning, Twistlock Labs research, and active threat feeds to protect container environments at scale, without human intervention.

Integration with Sumo Logic

Because Twistlock has a rich set of data about the operations of a containerized environment, integrating with powerful operational analytics tools like Sumo Logic is a natural fit.  In addition to storing all event data in its own database, Twistlock also writes events out via standard syslog messages so it’s easy to harvest and analyze using tools like Sumo Logic.

Setting up the integration is easy: simply follow the standard steps for collecting logs from a Linux host, which Sumo Logic has already automated. After a collector is installed on a host Twistlock is protecting, configure Sumo Logic to harvest the log files from /var/lib/twistlock/log/*.log:


In this case, the log collection is named “twistlock_logs” to make it easy to differentiate these entries from standard Linux logs.

Note that Twistlock produces two main types of logs, aligned with our distributed architecture:

  • Console logs track centralized activities such as rule management, configuration changes, and overall system health.
  • Defender logs are produced on each node that Twistlock protects and are local in scope.  These logs track activities such as authentication to the local node and runtime events that occur on the node.


Once log files are collected, searching, slicing, and visualizing data is done using the standard Sumo Logic query language and tools. Here’s a simple example of just looking across all Twistlock logs using the _sourceCategory=twistlock_logs query:

Of course, the real power of a tool like Sumo Logic is being able to easily sort, filter, and drill down into log data.  So, let’s assume you want to drill down a little further and look for process violations that Twistlock detected on a specific host.  This is a common incident response scenario and this illustrates the power of Twistlock and Sumo Logic working together to identify the anomaly and to understand it more completely.  To do this, we simply add a little more logic to the query:

(_sourceCategory=twistlock_logs (Process violation)) AND _sourcehost = “cto-stable-ubuntu.c.cto-sandbox.internal”

Perhaps you’re looking for a specific action that an attacker took, like running netcat, something that should likely never happen in your production containers.  Again, because of Twistlock’s runtime defense, this anomaly is automatically detected as soon as it occurs without any human having to create a rule to do so.  Because Twistlock understands the entrypoint on the image, how the container was launched via Docker APIs, and builds a predictive runtime model via machine learning, it can immediately identify the unexpected process activity.  Once this data is in Sumo Logic, it’s easy to drill down even further and look for it:

(_sourceCategory=twistlock_logs (Process violation)) AND _sourcehost = “cto-stable-ubuntu.c.cto-sandbox.internal” AND nc


Of course, with Sumo Logic, you could also build much more sophisticated queries, for example, looking for any process violation that occurs on hosts called prod-* and is not caused by a common shell launching.  Even more powerfully, you can correlate and visualize trends across multiple hosts.  To take our example further, imagine we wanted to not just look for a specific process violation, but instead to visualize trends over time.  The Twistlock dashboard provides good basic visualizations for this, but if you want to have full control of slicing and customizing the views, that’s where a tool like Sumo Logic really shines.

Here’s an example of us looking for process violations over time, grouped in 5 minute timeslices, and tracked per host, then overlaid on a line chart:

_sourceCategory=twistlock_logs (Process violation)| timeslice 5m | count as count by _timeslice, _sourceHost| transpose row _timeslice column _sourceHost


Of course, this just touches on some of the capabilities once Twistlock’s container security data is in a powerful tool like Sumo Logic.  You may also build dashboards to summarize and visualize important queries, configure specific views of audit data to be available to only specific teams, and integrate container security event alerting into your overall security alert management process.  Once the data is in, the possibilities are limitless.

Create a dashboard

Here we go over the steps to create a dashboard in Sumo Logic to show and analyze some of this data:

  • Login to Sumo Logic
  • Create a new search
  • Use the following query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
    • _sourceCategory=twistlock/example (violation) | timeslice 24h | count by _timeslice | order by _timeslice desc
    • Run the query and select the Aggregates tab
    • You should be looking at a list of dates and their total count of violations


  • Select the single value viewer from the Aggregate Tab’s toolbar


  • Click the “Add to dashboard” button on the right hand side to start creating a new dashboard by adding this chart as a panel
  • Create the new panel
    • Enter a title for example: Violations (last 24 hours)
    • Enter a new dashboard name for example: Overview Dashboard


  • Click Add


  • As an optional step you can set coloring ranges for these values. This will help you quickly identify areas that need attention.
    • When editing the value choose Colors by Value Range… from the cog in the Aggregate Tab’s toolbar


    • Enter 1 – 30 and choose green for the color
    • Click Save
    • Enter 31-70 and choose orange for the color
    • Enter 71 – (leave blank) and choose red for the color
    • Click Save


  • Create single value viewers using the same process as above for each of the queries below: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
    1. Network Violations
      • _sourceCategory=twistlock/example (Network violation) | timeslice 24h | count by _timeslice | order by _timeslice desc
    2. Process Violations
      • _sourceCategory=twistlock/example (Process violation) | timeslice 24h | count by _timeslice | order by _timeslice desc
  • Your dashboard should look similar to this


  • Create another chart using the same process as above but this time use the search query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
  • _sourceCategory=”twistlock/example” (violation) | timeslice 1d | count by _timeslice | order by _timeslice asc
  • Run the query and select the Aggregates tab
  • You should be looking at a list of dates and their total number of violations
  • Select the area chart from the Aggregate Tab’s toolbar
  • Click the “Add to dashboard” button on the right hand side to start creating a new dashboard by adding this chart as a panel
  • Create the new dashboard panel
    • Enter a title for example: Violations by day
    • Select Overview Dashboard as the dashboard


    • Click Add
    • Resize the area chart so it extends the full width of the dashboard by clicking and dragging on the bottom right corner of the area chart panel
    • Your dashboard should now look similar to the one below


  • Use the following query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
    • _sourceCategory=”twistlock/example” (Denied)|parse “The command * * for user * by rule *’” as command, action, user, rulename | count by user | order by user asc
    • Run the query and select the Aggregates tab
    • You should be looking at a list of users and their total count of violations


  • Select the column chart icon from the Aggregate Tab’s toolbar


  • Click the “Add to dashboard” button on the right hand side to start creating a new dashboard by adding this chart as a panel
  • Create the new panel
    • Enter a title for example: Top Users with Violations
    • Enter a new dashboard name for example: Overview Dashboard


  • Click Add


  • Create another chart using the same process as above but this time use the search query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
  • _sourceCategory=”twistlock/example” (violation) | parse “.go:* * violation ” as linenumber, violation_type | count by violation_type | order by _count desc
  • Create the new panel
    • Enter a title for example: Top Violation by Types
    • Select Overview Dashboard as the dashboard


  • Click Add
  • Your completed dashboard should now look similar to the one below


In summary, integrating Twistlock and Sumo Logic gives users powerful and automated security protection for containers and provides advanced analytic capabilities to fully understand and visualize that data in actionable ways.  Because both products are built around open standards, integration is easy and users can begin reaping the benefits of this combined approach in minutes.

5 Log Monitoring Moves to Wow Your Business Partner


Looking for some logging moves that will impress your business partner? In this post, we’ll show you a few. But first, a note of caution:

If you’re going to wow your business partner, make a visiting venture capitalist’s jaw drop, or knock the socks off of a few stockholders, you could do it with something that has a lot of flash and not much more. Or you could show them something that has real and lasting substance and will make a difference in your company’s bottom line. We’ve all seen business presentations filled with flashy fireworks, and we’ve all seen how quickly those fireworks fade away.

Around here, though, we believe in delivering value—the kind that stays with your organization, and gives it a solid foundation for growth. So, while the logging moves that we’re going to show you do look good, the important thing to keep in mind is that they provide genuine, substantial value—and discerning business partners and investors (the kind that you want to have in your corner) will recognize this value quickly.

Why Is Log Monitoring Useful?

What value should logs provide? Is it enough just to accumulate information so that IT staff can pick through it as required? That’s what most logs do, varying mostly in the amount of information and the level of detail. And most logs, taken as raw data, are very difficult to read and interpret; the most noticeable result of working with raw log data, in fact, is the demand that it puts on IT staff time.

5 Log Monitoring Steps to Success

Most of the value in logs is delivered by means of systems for organizing, managing, filtering, analyzing, and presenting log data. And needless to say, the best, most impressive, most valuable logging moves are those which are made possible by first-rate log management. They include:

  • Quick, on-the-spot, easy-to-understand analytics. Pulling up instant, high-quality analytics may be the most impressive move that you can make when it comes to logging, and it is definitely one of the most valuable features that you should look for in any log management system. Raw log data is a gold mine, but you need to know how to extract and refine the gold. A high-quality analytics system will extract the data that’s valuable to you, based on your needs and interests, and present it in ways that make sense. It will also allow you to quickly recognize and understand the information that you’re looking for.
  • Monitoring real-time data. While analysis of cumulative log data is extremely useful, there are also plenty of situations where you need to see what is going on right at the moment. Many of the processes that you most need to monitor (including customer interaction, system load, resource use, and hostile intrusion/attack) are rapid and transient, and there is no substitute for a real-time view into such events. Real-time monitoring should be accompanied by the capacity for real-time analytics. You need to be able to both see and understand events as they happen.
  • Fully integrated logging and analytics. There may be processes in software development and operations which have a natural tendency to produce integrated output, but logging isn’t one of them. Each service or application can produce its own log, in its own format, based on its own standards, without reference to the content or format of the logs created by any other process. One of the most important and basic functions that any log management system can perform is log integration, bringing together not just standard log files, but also event-driven and real-time data. Want to really impress partners and investors? Bring up log data that comes from every part of your operation, and that is fully integrated into useful, easily-understood output.
  • Drill-down to key data. Statistics and aggregate data are important; they give you an overall picture of how the system is operating, along with general, system-level warnings of potential trouble. But the ability to drill down to more specific levels of data—geographic regions, servers, individual accounts, specific services and processes —is what allows you to make use of much of that system-wide data. It’s one thing to see that your servers are experiencing an unusually high level of activity, and quite another to drill down and see an unusual spike in transactions centered around a group of servers in a region known for high levels of online credit card fraud. Needless to say, integrated logging and scalability are essential when it comes to drill-down capability.
  • Logging throughout the application lifecycle. Logging integration includes integration across time, as well as across platforms. This means combining development, testing, and deployment logs with metrics and other performance-related data to provide a clear, unified, in-depth picture of the application’s entire lifecycle. This in turn makes it possible to look at development, operational, and performance-related issues in context, and see relationships which might not be visible without such cross-system, full lifecycle integration.

Use Log Monitoring to Go for the Gold

So there you have it—five genuine, knock-’em-dead logging moves. They’ll look very impressive in a business presentation, and they’ll tell serious, knowledgeable investors that you understand and care about substance, and not just flash. More to the point, these are logging capabilities and strategies which will provide you with valuable (and often crucial) information about the development, deployment, and ongoing operation of your software.

Logs do not need to be junkpiles of unsorted, raw data. Bring first-rate management and analytics to your logs now, and turn those junk-piles into gold.

5 Log Monitoring Moves to Wow Your Business Partner is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Michael Churchman started as a scriptwriter, editor, and producer during the anything-goes early years of the game industry. He spent much of the ‘90s in the high-pressure bundled software industry, where the move from waterfall to faster release was well under way, and near-continuous release cycles and automated deployment were already de facto standards. During that time he developed a semi-automated system for managing localization in over fifteen languages. For the past ten years, he has been involved in the analysis of software development processes and related engineering management issues.

Setting Up a Docker Environment Using Docker Compose


Docker Compose is a handy tool for solving one of the biggest inherent challenges posed by container-based infrastructure. That challenge is this: While Docker containers provide a very easy and convenient way to make apps portable, they also abstract your apps from the host system — since that is the whole point of containers. As a result, connecting one container-based app to another — and to resources like data storage and networking — is tricky.

If you’re running a simple container environment, this isn’t a problem. A containerized web server that doesn’t require multiple containers can exist happily enough on its own, for example.

But if life were always simple, you wouldn’t need containers in the first place. To do anything serious in your cloud, you will probably want your containers to be able to interact with one another and to access system resources.

That’s where Docker Compose comes in. Compose lets you define the containers and services that need to work together to power your application. Compose allows you to configure everything in plain text files, then use a simple command-line utility to control all of the moving parts that make your app run.

Another way to think of Compose is as an orchestrator for a single app. Just as Swarm and Kubernetes automate management of all of the hundreds or thousands of containers that span your data center, Compose automates a single app that relies on multiple containers and services.

Using Docker Compose

Setting up a Docker environment using Compose entails multiple steps. But if you have any familiarity with basic cloud configuration — or just text-based interfaces on Unix-like operating systems — Compose is pretty simple to use.

Deploying the tool involves three main steps. First, you create a Dockerfile to define your app. Second, you create a Compose configuration file that defines app services. Lastly, you fire up a command-line tool to start and control the app.

I’ll walk through each of these steps below.

Step 1. Make a Dockerfile

This step is pretty straightforward if you are already familiar with creating Docker images. Using any text editor, open up a blank file and define the basic parameters for your app.

The Dockerfile contents will vary depending on your situation, but the format should basically look like this:


FROM [ name of the base Docker image you're using ]
ADD . [ /path/to/workdir ]
WORKDIR [ directory where your code lives ]
RUN [ command(s) to run to set up app dependencies ]
CMD [ command you'll use to call the app ]

Save your Dockerfile. Then build the image by calling docker build -t [ image name ] . (note the trailing dot, which tells Docker to use the current directory as the build context).
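As a concrete illustration, here is a minimal sketch for a small Node.js app (the base image, paths, and commands are assumptions you would swap for your own stack):

# Sketch only: adjust the base image and commands for your application
FROM node:6
ADD . /usr/src/app
WORKDIR /usr/src/app
RUN npm install
CMD ["node", "server.js"]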

Step 2. Define Services

If you can build a Dockerfile, you can also define app services. Like the first step, this one is all about filling in fields in a text file.

You’ll want to name the file docker-compose.yml and save it in the workdir that you defined in your Dockerfile. The contents of docker-compose.yml should look something like this:
version: '2'
services:
  [ name of a service ]:
    build: [ code directory ]
    ports:
      - "[ tcp and udp ports ]"
    volumes:
      - .:[ /path/to/code directory ]
    depends_on:
      - [ name of dependency image ]
  [ name of another service ]:
    image: [ image name ]

You can define as many services, images and dependencies as you need. For a complete overview of the values you can include in your Compose config file, check out Docker’s documentation.
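To make that concrete, a small two-service setup with a web app and a Redis dependency (purely illustrative names, ports, and paths) might be sketched like this:

version: '2'
services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/usr/src/app
    depends_on:
      - redis
  redis:
    image: redis

Here the web service is built from the Dockerfile in the current directory, while redis is pulled as a ready-made image, mirroring the build/image distinction in the template above.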

Don’t forget that another cool thing you can do with Compose is configure log collection using Powerstrip and the Sumo Logic collector container.

Step 3. Run the app

Now comes the really easy part. With your container image built and the app services defined, you just need to turn the key and get things running.

You do that with a command-line utility called (simply enough) docker-compose.

The syntax is pretty simple, too. To start your app, call docker-compose up from within your project directory.

You don’t need any arguments (although you can supply some if desired; see below for more on that). As long as your Dockerfile and Compose configuration file are in the working directory, Compose will find and parse them for you.

Even sweeter, Compose is smart enough to build dependencies automatically, too.

After being called, docker-compose will respond with some basic output telling you what it is doing.

To get the full list of arguments for docker-compose, call it with the help flag:

docker-compose --help

When you’re all done, just run (you guessed it!) docker-compose down to turn off the app.

Some Docker Compose Tips

If you’re just getting started with Compose, knowing about a few of the tool’s quirks ahead of time can save you from confusion.

One is that there are multiple ways to start an app with Compose. I covered docker-compose up above. Another option is docker-compose run.

Both of these commands do the same general thing — start your app — but run is designed for starting a one-time instance, which can be handy if you’re just testing out your build. up is the command you want for production scenarios.

There’s also a third option: docker-compose start. This call only restarts containers that already exist. Unlike up, it doesn’t build the containers for you.

Another quirk: You may find that Compose seems to hang or freeze when you tell it to shut down an app using docker-compose stop. Panic not! Compose is not hanging. It’s just waiting patiently for the container to shut down in response to a SIGTERM system call.

If the container doesn’t shut down within ten seconds, Compose will hit it with SIGKILL, which should definitely shut it down. (If your containers aren’t responding to standard SIGTERM requests, by the way, you may want to read more about how Docker processes signals to figure out why.)
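If your containers legitimately need more than ten seconds to wind down, docker-compose stop (and docker-compose down) accept a timeout flag, for example:

docker-compose stop -t 30

That gives each container 30 seconds to exit cleanly before Compose resorts to SIGKILL.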

That’s Compose in a nutshell — or about a thousand words, at least. For all of the nitty-gritty details, you can refer to Docker’s Compose reference guide.

Setting Up a Docker Environment Using Docker Compose is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Hemant Jain is the founder and owner of Rapidera Technologies, a full service software development shop. He and his team focus a lot on modern software delivery techniques and tools. Prior to Rapidera he managed large scale enterprise development projects at Autodesk and Deloitte.

Getting the Most Out of SaltStack Logs


Learn about SaltStack log storage and customization, and how to analyze the logs with Sumo Logic to gain useful insights into your server configuration.

SaltStack, also known simply as Salt, is a handy configuration management platform. Written in Python, it’s open source and allows ITOps teams to define “Infrastructure as Code” in order to provision and orchestrate servers.

But SaltStack’s usefulness is not limited to configuration management. The platform also generates logs, and like all logs, that data can be a useful source of insight in all manner of ways.

This article provides an overview of SaltStack logging, as well as a primer on how to analyze SaltStack logs with Sumo Logic.

Where does SaltStack store logs?

The first thing to understand is where SaltStack logs live. The answer to that question depends on where you choose to place them.

You can set the log location by editing your SaltStack configuration file on the salt-master. By default, this file should be located at /etc/salt/master on most Unix-like systems.

The variable you’ll want to edit is log_file. If you want to store logs locally on the salt-master, you can simply set this to any location on the local file system, such as /var/log/salt/salt_master.

Storing Salt logs with rsyslogd

If you want to centralize logging across a cluster, however, you will benefit by using rsyslogd, a system logging tool for Unix-like systems. With rsyslogd, you can configure SaltStack to store logs either remotely or on the local file system.

For remote logging, set the log_file parameter in the salt-master configuration file according to the format:

<file|udp|tcp>://<host|socketpath>:<port>/<log-facility>

For example, to connect to a server named mylogserver (whose name should be resolvable on your local network DNS, of course) via UDP on port 2099, you’d use a line like this one:

log_file: udp://mylogserver:2099
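The receiving side has to be listening on that UDP port, of course. On mylogserver, that means enabling rsyslogd’s UDP input; a minimal fragment using the legacy configuration syntax (shown here as a sketch; adapt it to your distribution’s rsyslog layout) would be:

# /etc/rsyslog.conf on the receiving server: accept syslog over UDP on port 2099
$ModLoad imudp
$UDPServerRun 2099

Restart rsyslogd after the change so the new input takes effect.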

Colorizing and bracketing your Salt logs

Another useful configuration option that SaltStack supports is custom colorization of console logs. This can make it easier to read the logs by separating high-priority events from less important ones.

To set colorization, you change the log_fmt_console parameter in the Salt configuration file. The colorization options available are:

'%(colorlevel)s' # log level name colorized by level
'%(colorname)s' # colorized module name
'%(colorprocess)s' # colorized process number
'%(colormsg)s' # log message colorized by level

Log files can’t be colorized; that would be less useful, since the program you use to read the log file may not support color output. They can, however, be padded and bracketed to distinguish different event levels. The parameter you’ll set here is log_fmt_logfile and the options supported include:

'%(bracketlevel)s' # equivalent to [%(levelname)-8s]
'%(bracketname)s' # equivalent to [%(name)-17s]
'%(bracketprocess)s' # equivalent to [%(process)5s]
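Putting the two together, the relevant lines in /etc/salt/master might look like the sketch below; the exact mix of fields is up to you, and %(asctime)s and %(message)s are standard Python logging placeholders rather than Salt-specific ones:

# Console output colorized by level; log file entries padded and bracketed
log_fmt_console: '%(colorlevel)s %(colormsg)s'
log_fmt_logfile: '%(asctime)s %(bracketlevel)s %(bracketname)s %(message)s'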

How to Analyze SaltStack logs with Sumo Logic

So far, we’ve covered some handy things to know about configuring SaltStack logs. You’re likely also interested in how you can analyze the data in those logs. Here, Sumo Logic, which offers easy integration with SaltStack, is an excellent solution.

Sumo Logic has an official SaltStack formula, which is available from GitHub. To install it, you can use GitFS to make the formula available to your system, but the simpler approach (for my money, at least) is simply to clone the formula repository in order to save it locally. That way, changes to the formula won’t break your configuration. (The downside, of course, is that you also won’t automatically get updates to the formula, but you can always update your local clone of the repository if you want them.)

To set up the Sumo Logic formula, run these commands:

mkdir -p /srv/formulas # or wherever you want to save the formula
cd /srv/formulas
git clone https://github.com/saltstack-formulas/sumo-logic-formula.git

Then simply edit your configuration by adding the new directory to the file_roots parameter, like so:

file_roots:
  base:
    - /srv/salt
    - /srv/formulas/sumo-logic-formula

Restart your salt-master and you’re all set. You’ll now be able to analyze your SaltStack logs from Sumo Logic, along with any other logs you work with through the platform.
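Depending on how you roll out states, you may also want to target the formula at your minions explicitly. Assuming the formula exposes a state named sumo-logic (check the repository’s README for the actual name and its pillar settings), a top.sls entry would look roughly like this:

# /srv/salt/top.sls
base:
  '*':
    - sumo-logic

followed by a highstate run such as salt '*' state.apply.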

Getting the Most Out of SaltStack Logs is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

Integrating Apps with the Sumo Logic Search API


The Sumo Logic Web app provides a search interface that lets you parse logs. This provides a great resource for a lot of use cases — especially because you can take advantage of a rich search syntax, including wildcards and various operators (documented here), directly from the Web app.

But we realize that some people need to be able to harness Sumo Logic search data from within external apps, too. That’s why Sumo Logic also provides a robust RESTful API that you can use to integrate other apps with Sumo Logic search.

To provide a sense of how you can use the Sumo Logic Search Job API in the real world, this post offers a quick primer on the API, along with a couple of examples of the API in action. For more detailed information, refer to the Search Job API documentation.

Sumo Logic Search Integration: The Basics

Before getting started there are a few essentials you should know about the Sumo Logic Search Job API.

First, the API uses the HTTP GET method. That makes it pretty straightforward to build the API into Web apps you may have (or any other type of app that uses the HTTP protocol). It also means you can run queries directly from the CLI using any tool that supports HTTP GET requests, like curl or wget. Sound easy? It is!

Second, queries should be directed to https://api.sumologic.com/api/v1/logs/search. You simply append your GET requests and send them on to the server. (You also need to make sure that your HTTP request contains the credentials for connecting to your Sumo Logic account; with curl, for example, you would specify these using the -u flag: curl -u user@example.com:VeryTopSecret123 your-search-query.)

Third, the server delivers query responses in JSON format. That approach is used because it keeps the search result data formatting consistent, allowing you to manipulate the results easily if needed.

Fourth, know that the Search Job API can return up to one million records per search query. API requests are limited to four per second and 240 per minute across all API calls from a customer. If the rate is exceeded, a rate limit exceeded (429) error is returned.

 

Sumo Logic Search API Example Queries

As promised, here are some real-world examples.

For starters, let’s say you want to identify incidents where a database connection failure occurred. To do this, specify “database connection error” as your query, using a command like this:


curl -u user@example.com:VeryTopSecret123 "https://api.sumologic.com/api/v1/logs/search?q=database connection error"

(That’s all one line, by the way.)

You can take things further, too, by adding date and time parameters to the search. For example, if you wanted to find database connection errors that happened between about 1 p.m. and 3 p.m. on April 4, 2012, you would add some extra data to your query, making it look like this:


curl -u user@example.com:VeryTopSecret123 "https://api.sumologic.com/api/v1/logs/search?q=database connection error&from=2012-04-04T13:01:02&to=2012-04-04T15:01:02"

Another real-world situation where the search API can come in handy is to find login failures. You could locate those in the logs with a query like this:


curl -u user@example.com:VeryTopSecret123 "https://api.sumologic.com/api/v1/logs/search?q=failed login"

Again, you could restrict your search here to a certain time and date range, too, if you wanted.
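Because the results come back as JSON, you can also pipe them straight into a tool like jq (assuming it is installed) to pretty-print the output or pull out individual fields:

curl -u user@example.com:VeryTopSecret123 "https://api.sumologic.com/api/v1/logs/search?q=failed login" | jq .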

Another Way to Integrate with Sumo Logic Search: Webhooks

Most users will probably find the Sumo Logic search API the most extensible method of integrating their apps with log data. But there is another way to go about this, too, which is worth mentioning before we wrap up.

That’s Webhook alerts, a feature that was added to Sumo Logic last fall. Webhooks make it easy to feed Sumo Logic search data to external apps, like Slack, PagerDuty, VictorOps and Datadog. I won’t explain how to use Webhooks in this post, because that topic is already covered on our blog.

Integrating Apps with the Sumo Logic Search API is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Dan Stevens is the founder of StickyWeb (stickyweb.biz), a custom Web Technology development company. Previously, he was the Senior Product Manager for Java Technologies at Sun Microsystems and for broadcast video technologies at Sony Electronics, Accom and Ampex.

How to Configure a Docker Cluster Using Swarm


If your data center were a beehive, Docker Swarm would be the pheromone that keeps all the bees working efficiently together.

Here’s what I mean by that. In some ways, Docker containers are like bumblebees. Just as an individual bee can’t carry much of anything on her own, a single container won’t have a big impact on your data center’s bottom line.

It’s only by deploying hundreds or thousands of containers in tandem that you can leverage their power, just like a colony of bees prospers because of the collective efforts of each of its individual members.

Unlike bumblebees, however, Docker containers don’t have pheromones that help them coordinate with one another instinctively. They don’t automatically know how to pool their resources in a way that most efficiently meets the needs of the colony (data center). Instead, containers on their own are designed to operate independently.

So, how do you make containers work together effectively, even when you’re dealing with many thousands of them? That’s where Docker Swarm comes in.

Swarm is a cluster orchestration tool for Docker containers. It provides an easy way to configure and manage large numbers of containers across a cluster of servers by turning all of them into a virtual host. It’s the hive mind that lets your containers swarm like busy bees, as it were.

Why Use Swarm for Cluster Configuration?

There are lots of similar cluster orchestration tools beyond Swarm. Kubernetes and Mesos are among the most popular alternatives, but the full list of options is long.

Deciding which orchestrator is right for you is fodder for a different post. I won’t delve too deeply into that discussion here. But it’s worth briefly noting a couple of characteristics about Swarm.

First, know that Swarm happens to be Docker’s homegrown cluster orchestration platform. That means it’s as tightly integrated into the rest of the Docker ecosystem as it can be. If you like consistency, and you have built the rest of your container infrastructure with Docker components, Swarm is probably a good choice for you.

Docker also recently published data claiming that Swarm outperforms Kubernetes. Arguably, the results in that study do not necessarily apply to all real-world data centers. (For a critique of Docker’s performance claims by Kelsey Hightower, an employee of the company — Google — where Kubernetes has its roots, click here.) But if your data center is similar in scale to the one used in the benchmarks, you might find that Swarm performs well for you, too.

Setting Up a Docker Swarm Cluster

Configuring Swarm to manage a cluster involves a little bit of technical know-how. But as long as you have some basic familiarity with the Docker CLI interface and Unix command-line tools, it’s nothing you can’t handle.

Here’s a rundown of the basic steps for setting up a Swarm cluster:

Step 0. Set up hosts. This is more a prerequisite than an actual step. (That’s why I labeled it step 0!) You can’t orchestrate a cluster till you have a cluster to orchestrate. So before all else, create your Docker images — including both the production containers that comprise your cluster and at least one image that you’ll use to host Swarm and related services.

You should also make sure your networking is configured to allow SSH connections to your Swarm image(s), since I’ll use this later on to access them.

Step 1. Install Docker Engine. Docker Engine is a Docker component that lets images communicate with Swarm via a CLI or API. If it’s not already installed on your images, install it with:

curl -sSL https://get.docker.com/ | sh

Then start Engine to listen for Swarm connections on port 2375 with a command like this:

sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock

Step 2. Create a discovery backend. Next, you need to launch a discovery service (Consul, in this example) that Swarm can use to find and keep track of the hosts that are part of the cluster.

To do this, SSH into the host that you want to use for the discovery backend. Then run this command:

docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap

This will fire up the discovery backend on port 8500 on that host.

Step 3. Start Swarm. With that out of the way, the last big step is to start the Swarm manager. For this, SSH into the host you want to use to run Swarm. Then run:

docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise <manager_ip>:4000 consul://<consul_ip>:8500

Fill in the <manager_ip> and <consul_ip> fields in the command above with the IP addresses of the hosts you used in steps 1 and 2 for setting up Engine and the discovery backend, respectively. (It's fine if you do all of this on the same server, but you can use different ones if you like.)

Step 4. Connect to Swarm. The final step is to join your remaining hosts to Swarm as nodes. You do that on each node with a command like this:

docker run -d swarm join --advertise=<node_ip>:2375 consul://<consul_ip>:8500

<node_ip> is the IP address of the node you're joining to the cluster, and <consul_ip> is the IP address of the discovery backend from step 2.

Using Swarm: Commands

The hard part’s done! Once Swarm is set up as per the instructions above, using it to manage clusters is easy. Just run the docker command with the -H flag and the Swarm port number to monitor and control your Swarm instance.

For example, this command would give information about your cluster if it is configured to listen on port 4000:

docker -H :4000 info

You can also use a command like this to start an app on your cluster directly from Swarm, which will automatically decide how best to deploy it based on real-time cluster metrics:

docker -H :4000 run some-app

Getting the Most out of Swarm

Here are some quick pointers for getting the best performance out of Swarm at massive scale:

  • Consider creating multiple Swarm managers and nodes to increase reliability (see the sketch after this list).
  • Make sure your discovery backend is running on a highly available host, since it needs to be up for Swarm to work.
  • Lock down networking so that connections are allowed only for the ports and services (namely, SSH, HTTP and the Swarm services themselves) that you need. This will increase security.
  • If you have a lot of nodes to manage, you can use a more sophisticated method for allowing Swarm to discover them. Docker explains that in detail here.
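
To make the first bullet concrete, here is a minimal sketch of adding a second manager replica. It reuses the <consul_ip> discovery backend from step 2; the <second_manager_ip> placeholder is illustrative.

# On another host, start a second Swarm manager pointing at the same discovery backend.
# With --replication enabled, the managers elect a primary and fail over automatically.
docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise <second_manager_ip>:4000 consul://<consul_ip>:8500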

If you’re really into Swarm, you might also want to have a look at the Swarm API documentation. The API is a great resource if you need to build custom container-based apps that integrate seamlessly with the rest of your cluster (and that don’t already have seamless integration built-in, like the Sumo Logic log collector does).
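
Because the Swarm manager set up above speaks the standard Docker Remote API, you can also poke at it with plain HTTP. A quick sketch, using the manager address and port assumed earlier:

# Cluster-wide info, equivalent to "docker -H :4000 info"
curl http://<manager_ip>:4000/info
# List containers running anywhere in the cluster
curl http://<manager_ip>:4000/containers/json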

How to Configure a Docker Cluster Using Swarm is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Hemant Jain is the founder and owner of Rapidera Technologies, a full service software development shop. He and his team focus a lot on modern software delivery techniques and tools. Prior to Rapidera he managed large scale enterprise development projects at Autodesk and Deloitte.


Setting Up a Docker Environment Using Docker Compose


Docker Compose is a handy tool for solving one of the biggest inherent challenges posed by container-based infrastructure. That challenge is this: While Docker containers provide a very easy and convenient way to make apps portable, they also abstract your apps from the host system — since that is the whole point of containers. As a result, connecting one container-based app to another — and to resources like data storage and networking — is tricky.

If you’re running a simple container environment, this isn’t a problem. A containerized web server that doesn’t require multiple containers can exist happily enough on its own, for example.

But if life were always simple, you wouldn’t need containers in the first place. To do anything serious in your cloud, you will probably want your containers to be able to interact with one another and to access system resources.

That’s where Docker Compose comes in. Compose lets you define the containers and services that need to work together to power your application. Compose allows you to configure everything in plain text files, then use a simple command-line utility to control all of the moving parts that make your app run.

Another way to think of Compose is as an orchestrator for a single app. Just as Swarm and Kubernetes automate management of all of the hundreds or thousands of containers that span your data center, Compose automates a single app that relies on multiple containers and services.

Using Docker Compose

Setting up a Docker environment using Compose entails multiple steps. But if you have any familiarity with basic cloud configuration — or just text-based interfaces on Unix-like operating systems — Compose is pretty simple to use.

Deploying the tool involves three main steps. First, you create a Dockerfile to define your app. Second, you create a Compose configuration file that defines app services. Lastly, you fire up a command-line tool to start and control the app.

I’ll walk through each of these steps below.

Step 1. Make a Dockerfile

This step is pretty straightforward if you are already familiar with creating Docker images. Using any text editor, open up a blank file and define the basic parameters for your app.

The Dockerfile contents will vary depending on your situation, but the format should basically look like this:


FROM [ name of the base Docker image you're using ]
ADD . [ /path/to/workdir ]
WORKDIR [ directory where your code lives ]
RUN [ command(s) to run to set up app dependencies ]
CMD [ command you'll use to call the app ]

Save your Dockerfile. Then build the image by calling docker build -t [ image name ] . (the trailing dot tells Docker to use the current directory as the build context).
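
To make the template concrete, here is a minimal sketch of a Dockerfile for a hypothetical Python web app. The base image, paths and commands are illustrative assumptions, not part of the original example:

# Hypothetical base image
FROM python:3.6
# Copy the project into the image
ADD . /app
# Directory where the code lives
WORKDIR /app
# Install app dependencies
RUN pip install -r requirements.txt
# Command used to start the app
CMD ["python", "app.py"]

You would then build it with docker build -t my-web-app . (the image name, again, is just an example).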

Step 2. Define Services

If you can build a Dockerfile, you can also define app services. Like the first step, this one is all about filling in fields in a text file.

You’ll want to name the file docker-compose.yml and save it in the workdir that you defined in your Dockerfile. The contents of docker-compose.yml should look something like this:
version: '2'
services:
  [ name of a service ]:
    build: [ code directory ]
    ports:
      - "[ tcp and udp ports ]"
    volumes:
      - .:[ /path/to/code directory ]
    depends_on:
      - [ name of a service this one depends on ]
  [ name of another service ]:
    image: [ image name ]

You can define as many services, images and dependencies as you need. For a complete overview of the values you can include in your Compose config file, check out Docker’s documentation.

Don’t forget that another cool thing you can do with Compose is configure log collection using Powerstrip and the Sumo Logic collector container.
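
As a concrete sketch, a two-service app (say, a hypothetical web front end plus a Redis cache) might be described like this; the service names, ports and image tags are assumptions for illustration:

version: '2'
services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/app
    depends_on:
      - redis
  redis:
    image: redis:3.2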

Step 3. Run the app

Now comes the really easy part. With your container image built and the app services defined, you just need to turn the key and get things running.

You do that with a command-line utility called (simply enough) docker-compose.

The syntax is pretty simple, too. To start your app, call docker-compose up from within your project directory.

You don’t need any arguments (although you can supply some if desired; see below for more on that). As long as your Dockerfile and Compose configuration file are in the working directory, Compose will find and parse them for you.

Even sweeter, Compose is smart enough to build dependencies automatically, too.

After being called, docker-compose will respond with some basic output telling you what it is doing.

To get the full list of arguments for docker-compose, call it with the help flag:

docker-compose --help

When you’re all done, just run (you guessed it!) docker-compose down to turn off the app.
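
In day-to-day use, a typical sequence looks something like this (all of these are standard docker-compose subcommands):

docker-compose up -d      # build if needed, then start services in the background
docker-compose ps         # list the running services
docker-compose logs -f    # follow the combined service logs
docker-compose down       # stop and remove the app's containers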

Some Docker Compose Tips

If you’re just getting started with Compose, knowing about a few of the tool’s quirks ahead of time can save you from confusion.

One is that there are multiple ways to start an app with Compose. I covered docker-compose up above. Another option is docker-compose run.

Both of these commands do the same general thing — start your app — but run is designed for starting a one-time instance, which can be handy if you’re just testing out your build. up is the command you want for production scenarios.

There’s also a third option: docker-compose start. This call only restarts containers that already exist. Unlike up, it doesn’t build the containers for you.

Another quirk: You may find that Compose seems to hang or freeze when you tell it to shut down an app using docker-compose stop. Panic not! Compose is not hanging. It’s just waiting patiently for the container to shut down in response to a SIGTERM system call.

If the container doesn’t shut down within ten seconds, Compose will hit it with SIGKILL, which should definitely shut it down. (If your containers aren’t responding to standard SIGTERM requests, by the way, you may want to read more about how Docker processes signals to figure out why.)

That’s Compose in a nutshell — or about a thousand words, at least. For all of the nitty-gritty details, you can refer to Docker’s Compose reference guide.

Setting Up a Docker Environment Using Docker Compose is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Hemant Jain is the founder and owner of Rapidera Technologies, a full service software development shop. He and his team focus a lot on modern software delivery techniques and tools. Prior to Rapidera he managed large scale enterprise development projects at Autodesk and Deloitte.

Getting the Most Out of SaltStack Logs


Learn about SaltStack log storage and customization, and how to analyze the logs with Sumo Logic to gain useful insights into your server configuration.

SaltStack, also known simply as Salt, is a handy configuration management platform. Written in Python, it’s open source and allows ITOps teams to define “Infrastructure as Code” in order to provision and orchestrate servers.

But SaltStack’s usefulness is not limited to configuration management. The platform also generates logs, and like all logs, that data can be a useful source of insight in all manner of ways.

This article provides an overview of SaltStack logging, as well as a primer on how to analyze SaltStack logs with Sumo Logic.

Where does SaltStack store logs?

The first thing to understand is where SaltStack logs live. The answer to that question depends on where you choose to place them.

You can set the log location by editing your SaltStack configuration file on the salt-master. By default, this file should be located at /etc/salt/master on most Unix-like systems.

The variable you’ll want to edit is log_file. If you want to store logs locally on the salt-master, you can simply set this to any location on the local file system, such as /var/log/salt/salt_master.

Storing Salt logs with rsyslogd

If you want to centralize logging across a cluster, however, you will benefit by using rsyslogd, a system logging tool for Unix-like systems. With rsyslogd, you can configure SaltStack to store logs either remotely or on the local file system.

For remote logging, set the log_file parameter in the salt-master configuration file according to the format:

<file|udp|tcp>://<host|socketpath>:<port-if-required>/<log-facility>

For example, to connect to a server named mylogserver (whose name should be resolveable on your local network DNS, of course) via UDP on port 2099, you’d use a line like this one:

log_file: udp://mylogserver:2099

Colorizing and bracketing your Salt logs

Another useful configuration option that SaltStack supports is custom colorization of console logs. This can make it easier to read the logs by separating high-priority events from less important ones.

To set colorization, you change the log_fmt_console parameter in the Salt configuration file. The colorization options available are:

'%(colorlevel)s' # log level name colorized by level
'%(colorname)s' # colorized module name
'%(colorprocess)s' # colorized process number
'%(colormsg)s' # log message colorized by level

Log files themselves can’t be colorized (that wouldn’t be as useful anyway, since the program you use to read the log file may not support color output), but they can be padded and bracketed to distinguish different event levels. The parameter you’ll set here is log_fmt_logfile, and the options supported include:

'%(bracketlevel)s' # equivalent to [%(levelname)-8s]
'%(bracketname)s' # equivalent to [%(name)-17s]
'%(bracketprocess)s' # equivalent to [%(process)5s]
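
Putting these options together, the relevant lines of /etc/salt/master might look something like the sketch below. The format strings and log level are illustrative choices, not required values:

log_file: /var/log/salt/salt_master
log_level_logfile: warning
log_fmt_console: '[%(colorlevel)s] %(colormsg)s'
log_fmt_logfile: '%(asctime)s [%(bracketname)s][%(bracketlevel)s] %(message)s'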

How to Analyze SaltStack logs with Sumo Logic

So far, we’ve covered some handy things to know about configuring SaltStack logs. You’re likely also interested in how you can analyze the data in those logs. Here, Sumo Logic, which offers easy integration with SaltStack, is an excellent solution.

Sumo Logic has an official SaltStack formula, which is available from GitHub. To install it, you can use GitFS to make the formula available to your system, but the simpler approach (for my money, at least) is simply to clone the formula repository in order to save it locally. That way, changes to the formula won’t break your configuration. (The downside, of course, is that you also won’t automatically get updates to the formula, but you can always update your local clone of the repository if you want them.)

To set up the Sumo Logic formula, run these commands:

mkdir -p /srv/formulas # or wherever you want to save the formula
cd /srv/formulas
git clone https://github.com/saltstack-formulas/sumo-logic-formula.git

Then simply edit your configuration by adding the new directory to the file_roots parameter, like so:

file_roots:
  base:
    - /srv/salt
    - /srv/formulas/sumo-logic-formula

Restart your salt-master and you’re all set. You’ll now be able to analyze your SaltStack logs from Sumo Logic, along with any other logs you work with through the platform.
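
On a systemd-based master, that restart plus a quick test run might look like this (standard Salt commands, shown as a sketch):

sudo systemctl restart salt-master
# Apply your states, including the newly available formula, to all minions
salt '*' state.highstate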

Getting the Most Out of SaltStack Logs is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

Managing Container Data Using Docker Data Volumes


Docker data volumes are designed to solve one of the deep paradoxes of containers, which is this: For the very same reasons that containers make apps highly portable — and, by extension, create more nimble data centers — they also make it hard to store data persistently. That’s because, by design, containerized apps are ephemeral. Once you shut down a container, everything inside it disappears. That makes your data center more flexible and secure, since it lets you spin up apps rapidly based on clean images. But it also means that data stored inside your containers disappears by default.

How do you resolve this paradox? There are actually several ways. You could jerry-rig a system for loading data into a container each time it is spun up (via SSH, for example), then exporting it somehow, but that’s messy. You could also turn to traditional distributed storage systems, like NFS, which you can access directly over the network. But that won’t work well if you have a complicated (software-defined) networking situation (and you probably do in a large data center). You’d think someone would have solved the Docker container storage challenge in a more elegant way by now — and someone has! Docker data volumes provide a much cleaner, straightforward way to provide persistent data storage for containers.

That’s what I’ll cover here. Keep reading for instructions on setting up and deploying Docker data volumes (followed by brief notes on storing data persistently directly on the host).

Creating a Docker Data Volume

To use a data volume in Docker, you first need to create a container to host the volume. This is pretty basic. Just use a command like:

docker create -v /some/directory --name mydatacontainer debian

This command tells Docker to create a new container named mydatacontainer based on the Debian Docker image. (You could use any of Docker’s other OS images here, too.) Meanwhile, the -v flag in the command above sets up a data volume at the directory /some/directory inside the container.

To repeat: That means the data is stored at /some/directory inside the container called mydatacontainer — not at /some/directory on your host system.

The beauty of this, of course, is that we can now write data to /some/directory inside this container, and it will stay there as long as the container remains up.

Using a Data Volume in Docker

So that’s all good and well. But how do you actually get apps to use the new data volume you created?

Pretty easily. The next and final step is just to start another container, using the --volumes-from flag to tell Docker that this new container should store data in the data volume we created in the first container.

Our command would look something like this:

docker run --volumes-from mydatacontainer debian

Now, any data changes made at /some/directory inside the new container will be saved in the data volume that mydatacontainer exposes.

And the data will stay there even if you stop that container, which means this is a persistent data storage solution. (Of course, if you remove mydatacontainer with docker rm -v, then you’ll also lose the volume and the data inside it.)

You can have as many data volumes as you want, by the way. Just specify multiple ones when you run the container that will access the volumes.
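
For example, a quick sketch with two volumes (the directory and container names are illustrative):

# One data container exposing two volumes
docker create -v /data/app -v /data/logs --name mydatavolumes debian
# Any container started with --volumes-from sees both volumes
docker run --rm --volumes-from mydatavolumes debian ls /data
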
Data Storage on the Host Instead of a Container?

You may be thinking, “What if I want to store my data directly on the host instead of inside another container?”

There’s good news. You can do that, too. We won’t use data storage volumes for this, though. Instead, we’ll run a command like:

docker run -v /host/dir:/container/dir -i image

This starts a new container based on the image image and maps the directory /host/dir on the host system to the directory /container/dir inside the container. That means that any data that is written by the container to /container/dir will also appear inside /host/dir on the host, and vice versa.

There you have it. You can now have your container data and eat it, too. Or something like that.

About the Author

Hemant Jain is the founder and owner of Rapidera Technologies, a full service software development shop. He and his team focus a lot on modern software delivery techniques and tools. Prior to Rapidera he managed large scale enterprise development projects at Autodesk and Deloitte.

Managing Container Data Using Docker Data Volumes is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

Application Containers vs. System Containers: Understanding the Difference


When people talk about containers, they usually mean application containers. Docker is automatically associated with application containers and is widely used to package applications and services. But there is another type of container: system containers. Let us look at the differences between application containers vs. system containers and see how each type of container is used:

 

Application containers vs. system containers at a glance:

Images
  • Application containers: application/service-centric, with a growing tool ecosystem
  • System containers: machine-centric, with a more limited tool ecosystem

Infrastructure
  • Application containers: security concerns, networking challenges, hampered by base OS limitations
  • System containers: datacenter-centric, isolated and secure, with optimized networking

The Low-Down on Application Containers

Application containers are used to package applications without launching a virtual machine for each app or each service within an app. They are especially beneficial when making the move to a microservices architecture, as they allow you to create a separate container for each application component and provide greater control, security and process restriction. Ultimately, what you get from application containers is easier distribution. The risks of inconsistency, unreliability and compatibility issues are reduced significantly if an application is placed and shipped inside a container.

Docker is currently the most widely adopted container service provider with a focus on application containers. However, there are other container technologies like CoreOS’s Rocket. Rocket promises better security, portability and flexibility of image sharing. Docker already enjoys the advantage of mass adoption, and Rocket might just be too late to the container party. Even with its differences, Docker is still the unofficial standard for application containers today.

System Containers: How They’re Used

System containers play a similar role to virtual machines, as they share the kernel of the host operating system and provide user space isolation. However, system containers do not use hypervisors. (Any container that runs an OS is a system container.) They also allow you to install different libraries, languages, and databases. Services running in each container use resources that are assigned to just that container.

System containers let you run multiple processes at the same time, all under the same OS and not a separate guest OS. This lowers the performance impact, and provides the benefits of VMs, like running multiple processes, along with the new benefits of containers like better portability and quick startup times.

Useful System Container Tools

Joyent’s Triton is a Container-as-a-Service offering built on Joyent’s own operating system, SmartOS. It not only focuses on packing apps into containers but also provides the benefits of added security, networking and storage, while keeping things lightweight, with very little performance impact. The key differentiator is that Triton delivers bare-metal performance. With Samsung’s recent acquisition of Joyent, it remains to be seen how Triton progresses.

Giant Swarm is a hosted cloud platform that offers a Docker-based virtualization system that is configured for microservices. It helps businesses manage their development stack, spend less time on operations setup, and more time on active development.

LXD is a fairly new OS container manager, released in 2016 by Canonical, the creators of Ubuntu. It combines the speed and efficiency of containers with the famed security of virtual machines. Since Docker and LXD containers share the same host kernel, it is easy to run Docker containers inside LXD containers.

Ultimately, understanding the differences and values of each type of container is important. Using both to provide solutions for different scenarios can’t be ruled out, either, as different teams have different uses. The development of containers, just like any other technology, is quickly advancing and changing based on newer demands and the changing needs of users.

Monitoring Your Containers

Whatever the type of container, monitoring and log analysis is always needed. Even with all of the advantages that containers offer as compared to virtual machines, things will go wrong.

That is why it is important to have a reliable log-analysis solution like Sumo Logic. One of the biggest challenges of Docker adoption is scalability, and monitoring containerized apps. Sumo Logic addresses this issue with its container-native monitoring solution. The Docker Log Analysis app from Sumo Logic can visualize your entire Docker ecosystem, from development to deployment. It uses advanced machine learning algorithms to detect outliers and anomalies when troubleshooting issues in distributed container-based applications. Sumo Logic’s focus on containers means it can provide more comprehensive and vital log analysis than traditional Linux-based monitoring tools.

About the Author

Twain began his career at Google, where, among other things, he was involved in technical support for the AdWords team. His work involved reviewing stack traces, and resolving issues affecting both customers and the Support team, and handling escalations. Later, he built branded social media applications, and automation scripts to help startups better manage their marketing operations. Today, as a technology journalist he helps IT magazines, and startups change the way teams build and ship applications.

Application Containers vs. System Containers: Understanding the Difference is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

Monitoring and Analyzing Puppet Logs With Sumo Logic


The top Puppet question on ServerFault is How can the little guys effectively learn and use Puppet? Learning Puppet requires learning a DSL that’s thorny enough that the final step in many migrations is to buy Puppet training classes for the team.

While there is no getting around learning the Puppet DSL, the “little guys” can be more effective if they avoid extending Puppet beyond the realm of configuration management (CM). It can be tempting to extend Puppet to become a monitoring hub, a CI spoke, or many other things. After all, if it’s not in Puppet, it won’t be in your environment, so why not build on that powerful connectedness?

The cons of Puppet for log analysis and monitoring

Here’s one anecdote from scriptcrafty explaining some of the problems with extending beyond CM:

Centralized logic where none is required, Weird DSLs and templating languages with convoluted error messages, Deployment and configuration logic disembodied from the applications that required them and written by people who have no idea what the application requires, Weird configuration dependencies that are completely untestable in a development environment, Broken secrets/token management and the heroic workarounds, Divergent and separate pipelines for development and production environments even though the whole point of these tools is to make things re-usable, and so on and so forth.

Any environment complex enough to need Puppet is already too complex to be analyzed with bash and PuppetDB queries. These tools work well for spot investigation and break/fix, but do not extend easily into monitoring and analysis.

I’ll use “borrow-time” as an example. To paraphrase the Puppet analytics team, “borrow-time” is the amount of time that the JRuby instances handling Puppet tasks spend on each request. If this number gets high, then there may be something unusually expensive going on. For instance, when the “borrow-timeout-count” metric is > 0, some request has timed out waiting for a JRuby instance.

It’s tempting to think that the problem is solved by setting a “borrow-timeout-count” trigger in PuppetDB for >0. After all, just about any scripting language will do, and then analysis can be done in the PuppetDB logs. Puppet even has some guides for this in Puppet Server – What’s Going on in There?

Monitoring a tool with only its own suggested metrics is not just a convenience sample, but one that is also blind to the problem at hand—uptime and consistency across an inconsistent and complex environment. Knowing that some request has gone unhandled is a good starting point.

A closer look at Puppet logs and metrics

But look at everything else that Puppet shows when pulling metrics. Here is what a single “borrow-time” metrics pull brings up on the Puppet server (pe-jruby-metrics->status->experimental->metrics):


"metrics": {
"average-borrow-time": 75,
"average-free-jrubies": 1.86,
"average-lock-held-time": 0,
"average-lock-wait-time": 0,
"average-requested-jrubies": 1.8959058782351241,
"average-wait-time": 77,
"borrow-count": 10302,
"borrow-retry-count": 0,
"borrow-timeout-count": 0,
"borrowed-instances": [
{
"duration-millis": 2888,
"reason": {
"request": {
"request-method": "post",
"route-id": "puppet-v3-catalog-/*/",
"uri": "/puppet/v3/catalog/foo.puppetlabs.net"
}
},
},
...],
"num-free-jrubies": 0,
"num-jrubies": 4,
"num-pool-locks": 0,
"requested-count": 10305,
"requested-instances": [
{
"duration-millis": 134,
"reason": {
"request": {
"request-method": "get",
"route-id": "puppet-v3-file_metadata-/*/",
"uri": "/puppet/v3/file_metadata/modules/catalog_zero16/catalog_zero16_impl83.txt"
}
},
},
...],
"return-count": 10298
}

If you are lucky, you’ll have an intuitive feeling about the issue before asking whether the retry count is too high, or if it was only a problem in a certain geo. If the problem is severe, you won’t have time to check the common errors (here and here); you’ll want context.

How Sumo Logic brings context to Puppet logs

Adding context (such as timeseries, geo, tool, and user) is the primary reason to use Sumo for Puppet monitoring and analysis. Here is an overly simplified example Sumo Logic query where JRuby borrowing is compared with Apache 2xx/4xx/5xx response codes:

_sourceName=*jruby-metrics* AND _sourceCategory=*apache*
| parse using public/apache/access
| if(status_code matches "2*", 1, 0) as successes
| if(status_code matches "5*", 1, 0) as server_errors
| if(status_code matches "4*", 1, 0) as client_errors
| if (num-free-jrubies matches "0",1,0) as borrowrequired
| timeslice by 1d
| sum(successes) as successes, sum(client_errors) as client_errors, sum(server_errors) as server_errors, sum(borrowrequired) as borrowed_jrubies by _timeslice

Centralizing monitoring across the environment means not only querying and joining siloed data, but also allowing for smarter analysis. By appending an “outlier” query to something like the above, you can set baselines and spot trends in your environment instead of guessing and then querying.


| timeslice 15d
| max(borrowed_jrubies) as borrowed_jrubies by _timeslice
| outlier borrowed_jrubies

source: help.sumologic.com/Search/Search_Query_Language/Search_Operators/outlier

About the Author

Alex Entrekin served on the executive staff of Cloudshare where he was primarily responsible for advanced analytics and monitoring systems. His work extending Splunk into actionable user profiling was featured at VMworld: “How a Cloud Computing Provider Reached the Holy Grail of Visibility.” Alex is currently an attorney, researcher and writer based in Santa Barbara, CA. He holds a J.D. from the UCLA School of Law.

Monitoring and Analyzing Puppet Logs With Sumo Logic is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

Working With Field Extraction Rules in Sumo Logic


Field extraction rules compress queries into short phrases, filter out unwanted fields and drastically speed up query times. Fifty at a time can be stored and used in what Sumo Logic calls a “parser library.”

These rules are a must once you move from simple collection to correlation and dashboarding. Since they tailor searches prior to source ingestion, the rules never collect unwanted fields, which can drastically speed up query times. Correlations and dashboards require many queries to load simultaneously, so the speed impact can be significant.

Setting Up Field Extraction Rules

The Sumo Logic team has written some templates to help you get started with common logs like IIS and Apache. While you will need to edit them, they take a lot of the pain out of writing regex parsers from scratch (phew). And if you write your own reusable parsers, save them as a template so you can help yourself to them later.

To get started, find a frequently used query snippet. The best candidates are queries that (1) are used frequently and (2) take a while to load. These might pull from dense sources (like IIS) or just crawl back over long periods of time. You can also look at de facto high-usage queries saved in dashboards, alerts and pinned searches.

Once you have the query, first take a look at what the source pulls without any filters. This is important both to ensure that you collect what’s needed, and that you don’t include anything that will throw off the rules. Since rules are “all or nothing,” only include persistent fields. In the example below, I am pulling from a Safend collector. Here’s an example log line for a USB device event:

2014-10-09T15:12:33.912408-04:00 safend.host.com [Safend Data Protection] File Logging Alert details: User: user@user.com, Computer: computer.host.com, Operating System: Windows 7, Client GMT: 10/9/2014 7:12:33 PM, Client Local Time: 10/9/2014 3:12:33 PM, Server Time: 10/9/2014 7:12:33 PM, Group: , Policy: Safend for Cuomer Default Policy, Device Description: Disk drive, Device Info: SanDisk Cruzer Pattern USB Device, Port: USB, Device Type: Removable Storage Devices, Vendor: 0781, Model: 550A, Distinct ID: 3485320307908660, Details: , File Name: F:\SOME_FILE_NAME, File Type: PDF, File Size: 35607, Created: 10/9/2014 7:12:33 PM, Modified: 10/9/2014 7:12:34 PM, Action: Write

There are certainly reasons to collect all of this (and note that the rule won’t limit collection on the source collector) but I only want to analyze a few parameters.

To get it just right, filter it in the Field Extraction panel:

(Screenshot: the Field Extraction Rules panel in Sumo Logic.)

Below is the simple parse expression I used. Note that other parsing operators are supported too, and they can extract nearly anything that a regular query can. But in this case, I just used parse and nodrop.

(Screenshot: the parse expression built with parse and nodrop.)

Nodrop tells the query to pass results along even if the query returns nothing from that field. In this case, it acts like an OR function that concatenates the first three parse functions along with the last one. So if ‘parse regex “Action…”‘ returns nothing, nodrop commands the query to “not drop”, return a blank, and in this case, continue to the next function.
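
Since the screenshot of the parse expression isn’t reproduced here, the sketch below shows what such an expression could look like for the Safend log above. The field names and exact anchors are illustrative, not the author’s actual rule:

parse "User: *," as user nodrop
| parse "File Name: *," as file_name nodrop
| parse "File Type: *," as file_type nodrop
| parse regex "Action: (?<action>.*)$" nodrop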

Remember that Field Extraction Rules are “all or nothing” with respect to fields. If you add a field that doesn’t exist, nodrop will not help since it only works within existing fields.

Use Field Extraction Rules to Speed Up Dashboard Load Time

The above example would be a good underlying rule for a larger profiling dashboard. It returns file information only—Action on the File, File ID, File Size, and Type. Another extraction rule might return only User and User Activities, while yet another might include only host server actions.

These rules can then be surfaced as dashboard panes, combined into profiles and easily edited. They load only the fields extracted, significantly improving load time, and the modularity of the rules provides a built-in library that makes editing and sharing useful snippets much simpler.

Working With Field Extraction Rules in Sumo Logic is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Alex Entrekin served on the executive staff of Cloudshare where he was primarily responsible for advanced analytics and monitoring systems. His work extending Splunk into actionable user profiling was featured at VMworld: “How a Cloud Computing Provider Reached the Holy Grail of Visibility.” Alex is currently an attorney, researcher and writer based in Santa Barbara, CA. He holds a J.D. from the UCLA School of Law.

Sumo Logic Launches Ultimate Log Bible Project


Greetings, Sumo Logic community members! Today we are pleased to announce the launch of the Sumo Logic Log Bible project.


** Put your log expertise to work and win $100! **

Log data contains a wealth of operational and security information, although it’s not always as easy to extract as you’d like. In an effort to help drive some insight into logging basics, and the different kinds of log data out there, we are kicking off the Sumo Logic Ultimate Log Bible project.

The goal is to create a repository of log data reference materials that can be shared across our community of IT Operations, Developers and Security Professionals. And to do this we need your help. This will be a community-sourced project where the resulting deliverables will be an aggregation of all of our combined efforts.

We created the following taxonomy of technologies that generate log data. They are broken down into 11 categories, and 46 sub-categories as follows:

(Image: the Log Bible taxonomy of 11 categories and 46 sub-categories.)

To help get things started, we have created the following 4 examples that can serve as templates for additional reference materials to be created:

As an incentive, we are offering a $100 Amazon or Starbucks gift card for community members who post log bible entries from the taxonomy list above (not including the ones already posted). In general, the documents added to the community topic should be about 2-3 pages in length and cover the following topic areas:

  1. What is it
  2. Key log files
  3. Location of log files
  4. Example log line(s) with explanations of what the entries mean
  5. Source links back to Sumo Logic help (e.g., Windows logging points to help.sumologic.com/apps/windows_app)

How to Submit a New Log Reference to the Project

We are using the Sumo Logic Community, Sumo Dojo, to facilitate the Ultimate Log Bible project. Sumo Dojo enables the community to comment on, review and approve new entries.

To post your submission, do the following:

  1. Log into Sumo Dojo using your Sumo Logic credentials.
  2. Select Ultimate Log Bible from the Topics menu.
  3. Click the POST button on the right to create a new Thread.

(Screenshot: the POST button in Sumo Dojo.)

You should then see the following dialog box appear:

(Screenshot: the new thread dialog box.)

  1. Under Discussion, enter the title of your Log entry. For example, if you’re creating a new reference for Apache Server, enter “Apache Server Log Bible Reference”
  2. Under Details, please use the rich-text editor to copy/paste your log reference information. You can also attach your document using the paperclip option (see icon bottom left). However it’s best practice to do both.
  3. Under Add Topic, please add the topic you are writing about (in this example, one would use Apache Server Logs).
  4. Click the “Ask” button at the bottom of the form to submit your new Log reference to the project.

In the event you are unable to log into the community, you can email your log bible entry to log-bible@sumologic.com.

Next Steps

Once submitted, an internal team will review your submission and reach out with any comments or questions. We’ll also coordinate delivery of your gift card.

We thank you in advance for your help and cooperation in this project that will benefit everyone.


Best Practices for Analyzing Elastic Load Balancer Logs


The Amazon Elastic Compute Cloud (EC2) service offers a simple, robust vehicle for performing real-time load balancing of applications hosted within the Amazon Cloud. Elastic Load Balancer (ELB) is designed to optimize performance and scalability, and maximize resource utilization by balancing loads across multiple AWS instances. To use AWS effectively, you need continuous monitoring, detection, troubleshooting and reporting. For these tasks, analyzing Elastic Load Balancer logs is crucial. This post explains what you need to know about load balancing logs, and how to analyze them.

Overview of Elastic Load Balancer Logs

EC2 load balancing works by taking a single cloud-based application and creating two or more EC2 instances. Each instance is capable of resolving an access request in its own right. Access requests can be routed in real time to the EC2 instance under least load.

By recording each and every access request made to the EC2 platform, the resulting Elastic Load Balancing logs can be used to:

  • Analyze access and traffic patterns
  • Troubleshoot applications
  • Perform security monitoring
  • Improve the user experience
  • Discover and debug problems with the EC2 platform

Planning for ELB Log Analysis

To get the most from ELB logs, you should perform the following tasks before you begin logging:

  • Turn on access logging at every point possible within ELB.
  • Make a plan for log storage. Determine where logs will be stored, how long they will be kept, and which users will be able to access them.
  • Develop a set of processes for analyzing logs. This includes setting up multiple analytics views over the same logs to discern specific facts. It also includes defining how frequently each type of log entry needs to be analyzed.
  • Set up a standard convention for naming log entries, reports and all other log-related data.

Using ELB Logs

Elastic Load Balancer logs can be produced by EC2 at a rate ranging from every five minutes to every 60 minutes. Deciding how frequently logs need to be produced will depend on how often there is a need to re-analyze logs.

Each load balancer will have its own log, and the filename of each log created will have the following format:

bucket[/prefix]/AWSLogs/aws-account-id/elasticloadbalancing/region/yyyy/mm/dd/aws-account-id_elasticloadbalancing_region_load-balancer-name_end-time_ip-address_random-string.log

A full explanation of how this filename is composed is available in the AWS documentation.

Each log file entry also has a standard format, which looks like this:

timestamp elb client:port backend:port request_processing_time backend_processing_time response_processing_time elb_status_code backend_status_code received_bytes sent_bytes "request" "user_agent" ssl_cipher ssl_protocol

Once again, the AWS documentation provides additional info on this format.
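
To give a feel for what analysis looks like once the logs are collected, here is a hedged sketch of a Sumo Logic query that parses the format above and counts requests by ELB status code. The _sourceCategory value and field names are assumptions about how you have set up collection:

_sourceCategory=aws/elb
| parse "* * *:* *:* * * * * * * * \"* * *\" \"*\" * *" as datetime, elb_name, client_ip, client_port, backend_ip, backend_port, request_pt, backend_pt, response_pt, elb_status_code, backend_status_code, received_bytes, sent_bytes, method, url, http_version, user_agent, ssl_cipher, ssl_protocol
| count by elb_status_code
| sort by _count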

Using Analytics Tools to Examine EC2 Logs

You can begin the process of analyzing ELB logs manually by downloading log files in a suitable format and feeding them into a spreadsheet or database application. But that would be very time-consuming and inefficient. It would also be very difficult to derive real value from a large amount of log data if you attempt to analyze it by hand.

A much better and more effective approach is to leverage an analytics platform like Sumo Logic. Sumo Logic offers an analytics app designed specifically for ELB. It provides quick visualizations to help users interpret traffic data, discover choke points in AWS app performance, and so on.

Plus, the Sumo Logic app for AWS ELB can go further than simple analysis by allowing you to configure triggers to automate changes to the ELB configuration in response to given events. This feature allows you to correct load balancing issues automatically in order to prevent them from affecting users.

Additional Best Practices for ELB Logging

Whether you are going to export raw EC2 logs and perform analysis by hand or use a pre-built application such as Sumo Logic, there is a need to operate in a methodical and logical manner. Towards this end, here are some additional best practices that apply to all forms of logging, not just ELB logs:

  • Keep it simple: Develop a bare-minimum approach to log file analysis. Log files are there to help, not guide. Spending too much time obsessing over log files takes valuable time away from other tasks such as application support and development.
  • Log as much as you can: If you can log it, then log it. Even if you never actually analyze the additional things you log, you never know when you may need to in the future. Log it now, so that you have a full set of historic log data.
  • Keep logs: Log files are so small, and current storage technology so vast in volume, that there is rarely a good reason for deleting historical log files. Keep them, as they can be used as part of a long-term analysis of application efficiency. They might also be necessary for auditing purposes.
  • Centralize Storage: Keep all log files in a secure, centralized repository with easy access for real-time log monitoring and analysis.

About the Author

Ali Raza is a DevOps consultant who analyzes IT solutions, practices, trends and challenges for large enterprises and promising new startup firms.

Best Practices for Analyzing Elastic Load Balancer Logs is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

How We Survived the Dyn DNS Outage


What Happened?

On Friday, October 21st, Dyn, a major DNS provider, started having trouble due to a DDoS attack. Many companies including PagerDuty, Reddit, Twitter, and others suffered significant downtime. Sumo Logic had a short blip of failures, but stayed up, allowing our customers to continue to seamlessly use our service for monitoring and troubleshooting within their organizations.

 

How did Sumo Logic bear the outage?

Several months ago, we suffered a DNS outage and had a postmortem that focused on being more resilient to such incidents. We decided to create a primary-secondary setup for DNS. After reading quite a bit about how this should work in theory, we implemented a solution with two providers: Neustar and Dyn. This setup saved us during today’s outage. I hope you can learn from our setup and make your service more resilient as well.

 

How is a primary-secondary DNS setup supposed to work?

  • You maintain the DNS zone on the primary only. Any update to that zone gets automatically replicated to the secondary via two methods: A push notification from the primary and a periodic pull from the secondary. The two providers stay in sync and you do not have to worry about maintenance of the zone.
  • Your registrar is configured with nameservers from both providers.
    • Order does NOT matter.
    • DNS Resolvers do not know which nameservers are primary and which are secondary. They just choose between all the configured nameservers.
    • Most DNS Resolvers choose which name server to use based on latency of the prior responses.
    • The rest of the DNS Resolvers choose at random.
    • If you have 4 nameservers with 1 from one provider and 3 from another, the more simplistic DNS Resolvers will split traffic 1/4 to 3/4, whereas the ones that track latency will still hit the faster provider more often.
  • When there is a problem contacting a nameserver, DNS Resolvers will pick another nameserver from the list until one works.

 

How to set up a primary-secondary DNS?

  1. Sign up for two different companies who provide high-speed DNS services and offer primary/secondary setup.
    • My recommendation is: NS1, Dyn, Neustar (ultradns) and Akamai.
    • Currently Amazon’s Route53 does not provide transfer ability and therefore cannot support primary/secondary setup. ( You would have to change records in both providers and keep them in sync.)
    • Slower providers will not take on as much traffic as fast ones, so you have to be aware of how fast the providers are for your customers.
  2. Configure one to be primary. This is the provider who you use when you make changes to your DNS.
  3. Follow the primary provider’s and secondary provider’s instructions to set up the secondary provider.
    • This usually involves configuring whitelisting the secondary’s IPs at the primary, adding notifications to primary, and telling the secondary what IPs to use to get the transfer at the primary.
  4. Ensure that the secondary is syncing your zones with the primary. (Check on their console and try doing a dig @nameserver domain against the secondary’s nameservers; see the example after this list.)
  5. Configure your registrar with both the primary’s and secondary’s name servers.
    • We found out that the order does not matter at all.
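
As a sketch of that verification step, comparing the SOA serial returned by each provider’s nameservers is a quick way to confirm the zones are in sync (the domain here is just an example):

dig @ns1.p29.dynect.net sumologic.com SOA +short
dig @udns1.ultradns.net sumologic.com SOA +short
# The serial numbers in the two answers should match once the transfer has completed.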

 

Our nameserver setup at the registrar:

  • ns1.p29.dynect.net
  • ns2.p29.dynect.net
  • udns1.ultradns.net
  • udns2.ultradns.net

 

What happened during the outage?

We got paged at 8:53 AM for a DNS problem hitting service.sumologic.com. The pages came from our internal as well as external monitors. The on-calls ran a dig against all four of our nameservers and discovered that Dyn was down hard.

We knew that we had a primary/secondary DNS setup, but neither provider had experienced any outages since we set it up. We also knew that it would take DNS Resolvers some time to decide to use Neustar nameservers as opposed to Dyn ones. Our alarms had gone off, so we posted a status page telling our customers that we were experiencing an incident with our DNS and asking them to let us know if they saw a problem.

Less than an hour later, our alarms stopped going off (although Dyn was still down). No Sumo Logic customers reached out to Support to let us know that they had issues.

Here is a graph of the traffic decreases for one of the Sumo Logic domains during the Dyn Outage:

Here is a graph of Neustar (UltraDNS) pulling in more traffic during the outage:


 

In conclusion:

This setup worked for Sumo Logic. We do not have control over DNS providers, but we can prevent their problems from affecting our customers. You can easily do the same.

Logging S3 API Calls with CloudTrail


Amazon Simple Storage Service, or Amazon S3 for short, allows for the simple storage of data in many forms. Users can upload and access data through the Amazon console, or by executing calls against the S3 API, either programmatically or through the command line. The method for logging S3 API calls may not be immediately obvious.

Manipulating data with S3 is as simple as uploading a file, or it can be done through an easy-to-construct API call. But this simplicity can come with a problem. Let’s consider the potential situation wherein you realize that the data in your S3 bucket is not quite right. Perhaps files are missing, files have been modified when they shouldn’t have been, or some suspect files have started to appear. This need not be the result of nefarious activity. It may simply be the result of a microservice mishandling data due to a software bug.

To detect and address issues with your S3 data, you need a way to monitor and audit calls to the S3 API. That is the subject of this post.

Logging S3 API Calls and Tracking Changes with CloudTrail

Monitoring API calls wasn’t always easy, at least not before the introduction in late 2013 of AWS CloudTrail. Essentially, CloudTrail is an AWS Service which tracks calls to the APIs in your account, keeping track of:

  • Time of the API call
  • Identity of the caller, including the IP address
  • Request parameters
  • The resulting response

In order to enable CloudTrail on your S3 API calls, log into your AWS Management Console and navigate to the AWS CloudTrail home page. Alternately, you can simply append /cloudtrail/home to the URL for your AWS console. The resultant URL should look similar to the following, depending on which region you are in:


https://us-west-2.console.aws.amazon.com/cloudtrail/home


Click on the Get Started Now button. Select a name for the CloudTrail service, and determine if you want to enable this service for all regions. Then determine if you want to monitor an existing S3 bucket, or if you want to create a new bucket, and choose the appropriate option. If you intend to monitor an existing bucket, select it from the available option next to the S3 bucket label. Other options are available, but this will give you visibility into basic operations on the selected bucket.

Enabling CloudTrail

Select Turn On
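
If you prefer the command line to the console, roughly the same setup can be sketched with the AWS CLI. The trail and bucket names are placeholders, and the bucket must already exist with a bucket policy that allows CloudTrail to write to it:

aws cloudtrail create-trail --name s3-api-trail --s3-bucket-name my-cloudtrail-logs
aws cloudtrail start-logging --name s3-api-trail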

Data is great, but I want to know the instant things go awry

With a trail enabled on your S3 bucket, any actions through the console, or through API calls to that bucket, are going to be logged. Upload a new file, or perform another action on your “trailed” S3 bucket and then return to the CloudTrail home page. You should now see a listing of events detailing the actions you just performed. You can click to view more information on each.

This is really handy if you know what you’re looking for, but you’ll probably want to know exactly when things start performing contrary to what you expect.

In order to do that, you’ll want to feed your CloudTrail logs into CloudWatch. From CloudWatch you can set up alarms, track metrics and monitor trends and maintain a view of activity in your S3 buckets in addition to your other services in the AWS ecosystem.

Configuring CloudTrail to Send Logs to CloudWatch

Return to the CloudTrail home page, and click on the name of your trail. This will bring you to the configuration page for the trail. Scroll to the bottom of the page, and click on the Configure button under the CloudWatch Logs header.


First, you’ll want to select either an existing Log Group to send the logs to, or create a new one. Specify either the name of the existing group or enter a new group name, and click Continue.


We will now need to create a role to allow the interaction between CloudTrail and CloudWatch. AWS will walk you through this process. Click on the View Details link. By default, the IAM role should be set to CloudTrail_CloudWatchLogs. If you want to use a different IAM role, you can specify it here.


If you have an existing policy you would like to use, you can select it next to Policy Name, or you can select Create a New Role Policy in that same drop-down. Finally, click on Allow to finish the process and return yourself to the CloudTrail configuration screen. You should now see the configuration for the CloudWatch logs under the section of the same name, and all CloudTrail logs should be included in your CloudWatch logs.
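
As a rough sketch of where that leads, once events are flowing into a CloudWatch Logs group you can carve a metric out of them and alarm on it with the AWS CLI. The log group name, filter pattern and threshold below are illustrative only:

aws logs put-metric-filter \
    --log-group-name CloudTrail/S3Logs \
    --filter-name S3DeleteObjectCalls \
    --filter-pattern '{ $.eventName = "DeleteObject" }' \
    --metric-transformations metricName=S3DeleteObjectCount,metricNamespace=CloudTrailMetrics,metricValue=1

aws cloudwatch put-metric-alarm \
    --alarm-name s3-delete-spike \
    --namespace CloudTrailMetrics \
    --metric-name S3DeleteObjectCount \
    --statistic Sum --period 300 --threshold 10 \
    --comparison-operator GreaterThanThreshold \
    --evaluation-periods 1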

Further reading

For more information on setting up alarms and general use of CloudWatch, check out the CloudWatch user guide.

About the Author

Mike Mackrory is a Global citizen who has settled down in the Pacific Northwest – for now. By day he works as a Senior Engineer on a Quality Engineering team and by night he writes, consults on several web based projects and runs a marginally successful eBay sticker business. When he’s not tapping on the keys, he can be found hiking, fishing and exploring both the urban and the rural landscape with his kids. Always happy to help out another developer, he has a definite preference for helping those who bring gifts of gourmet donuts, craft beer and/or Single-malt Scotch.

Solaris Containers: What You Need to Know


Solaris and containers may not seem like two words that go together—at least not in this decade. For the past several years, the container conversation has been all about platforms like Docker, CoreOS and LXD running on Linux (and, in the case of Docker, now Windows and Mac OS X, too).

But Solaris, Oracle’s Unix-like OS, has actually had containers for a long time. In fact, they go all the way back to the release of Solaris 10 in 2005 (technically, they were available in the beta version of Solaris 10 starting in 2004), long before anyone was talking about Linux containers for production purposes. And they’re still a useful part of the current version of the OS, Solaris 11.3.

Despite the name similarities, Solaris containers are hardly identical to Docker or CoreOS containers. But they do similar things by allowing you to virtualize software inside isolated environments without the overhead of a traditional hypervisor.

Even as Docker and the like take off as the container solutions of choice for Linux environments, Solaris containers are worth knowing about, too—especially if you’re the type of developer or admin who finds himself, against his will, stuck in the world of proprietary, commercial Unix-like platforms because some decision-maker in his company’s executive suite is still wary of going wholly open source….

Plus, as I note below, Oracle now says it is working to bring Docker to Solaris containers—which means containers on Solaris could soon integrate into the mainstream container and DevOps scene.

Below, I’ll outline how Solaris containers work, what makes them different from Linux container solutions like Docker, and why you might want to use containers in a Solaris environment.

The Basics of Solaris Containers

Let’s start by defining the basic Solaris container architecture and terminology.

On Solaris, each container lives within what Oracle calls a local zone. Local zones are software-defined boundaries to which specific storage, networking and/or CPU resources are assigned. The local zones are strictly isolated from one another in order to mitigate security risks and ensure that no zone interferes with the operations of another.

Each Solaris system also has a global zone. This consists of the host system’s resources. The global zone controls the local zones (although a global zone can exist even if no local zones are defined). It’s the basis from which you configure and assign resources to local zones.

Each zone on the system, whether global or local, gets a unique name (the name of the global zone is always “global”—boring, I know, but also predictable) and a unique numerical identifier.

So far, this probably sounds a lot like Docker, and it is. Local zones on Solaris are like Docker containers, while the Solaris global zone is like the Docker engine itself.

Working with Zones and Containers on Solaris

The similarities largely end there, however, at least when it comes to the ways in which you work with containers on Solaris.

On Docker or CoreOS, you would use a tool like Swarm or Kubernetes to manage your containers. On Solaris, you use Oracle’s Enterprise Manager Ops Center to set up local zones and define which resources are available to them.

Once you set up a zone, you can configure it to your liking (for the details, check out Oracle’s documentation), then run software inside the zones.
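
Solaris also ships lower-level command-line tools for zone management: zonecfg to define a zone and zoneadm to install, boot and list zones. If you wanted to script basic zone inspection from the global zone, a rough Python sketch might look like the following (the zone name "webzone" is a hypothetical example).

# Rough sketch: list configured zones and boot one from the global zone by
# shelling out to the standard zoneadm tool. "webzone" is a hypothetical name.
import subprocess

def list_zones():
    # "zoneadm list -cv" prints ID, name, status and path for all configured zones.
    result = subprocess.run(["zoneadm", "list", "-cv"],
                            capture_output=True, text=True, check=True)
    return result.stdout

def boot_zone(name):
    # Boot an installed zone by name.
    subprocess.run(["zoneadm", "-z", name, "boot"], check=True)

print(list_zones())
boot_zone("webzone")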

One particularly cool thing that Solaris containers let you do is migrate a physical Solaris system into a zone. You can also migrate zones between host machines. So, yes, Solaris containers can come in handy if you have a cluster environment, even though they weren’t designed for native clustering in the same way as Docker and similar software.

Solaris Containers vs. Docker/CoreOS/LXD: Pros and Cons

By now, you’re probably sensing that Solaris containers work differently in many respects from Linux containers. You’re right.

In some ways, the differences make Solaris a better virtualization solution. In others, the opposite is true. Mostly, though, the distinction depends on what is most important to you.

Solaris’s chief advantages include:

  • Easy configuration: As long as you can point and click your way through Enterprise Manager Ops Center, you can manage Solaris containers. There’s no need to learn something like the Docker CLI.
  • Easy management of virtual resources: On Docker and CoreOS, sharing storage or networking with containerized apps via tools like Docker Data Volumes can be tedious. On Solaris it’s more straightforward (largely because you’re slicing up the host resources of only a single system, not a cluster).

But there are also drawbacks, which mostly reflect the fact that Solaris containers debuted more than a decade ago, well before people were talking about the cloud and hyper-scalable infrastructure.

Solaris container cons include:

  • Solaris container management doesn’t scale well. With Enterprise Manager Ops Center, you can only manage as many zones as you can handle manually.
  • You can’t spin up containers quickly based on app images, as you would with Docker or CoreOS, at least for now. This makes Solaris containers impractical for continuous delivery scenarios. But Oracle says it is working to change that by promising to integrate Docker with Solaris zones. So far, though, it’s unclear when that technology will arrive in Solaris.
  • There’s not much choice when it comes to management. Unlike the Linux container world, where you can choose from dozens of container orchestration and monitoring tools, Solaris only gives you Oracle solutions.

The bottom line: Solaris containers are not as flexible or nimble as Linux containers, but they’re relatively easy to work with. And they offer powerful features, especially when you consider how old they are. If you work with Oracle data centers, Solaris containers are worth checking out, despite being a virtualization solution that gets very little press these days.

Solaris Containers: What You Need to Know is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

Benchmarking Microservices for Fun and Profit


Why should I benchmark microservices?

The ultimate goal of benchmarking is to better understand the software, and test out the effects of various optimization techniques for microservices. In this blog, we describe our approach to benchmarking microservices here at Sumo Logic.

Create a spreadsheet for tracking your benchmarking

We found that a convenient way to document a series of benchmarks is a Google Spreadsheet. It allows collaboration and provides the features you need to analyze and summarize your results. Structure your spreadsheet as follows:

  • Title page
    • Goals
    • Methodology
    • List of planned and completed experiments (evolving as you learn more)
    • Insights
  • Additional pages
    • Detailed benchmark results for various experiments

Be clear about your benchmark goals

Before you engage in benchmarking, clearly state (and document) your goal. Examples of goals are:

“I am trying to understand how input X affects metric Y”

“I am running experiments A, B and C to increase/decrease metric X”

Pick one key metric (Key Performance Indicator – KPI)

State clearly which one metric you are concerned about and how that metric affects users of the system. If you choose to capture additional metrics for your test runs, ensure that the key metric stands out.

Think like a scientist

You’re going to be performing a series of experiments to better understand which inputs affect your key metric, and how. Consider and document the variables you plan to test, and create a standard control set to compare against. Design your series of experiments so that it leads to understanding with the least amount of time and effort.

Define, document and validate your benchmarking methodology

Define a methodology for running your benchmarks. It is critical your benchmarks be:

  • Fairly fast (several minutes, ideally)
  • Reproducible in the exact same manner, even months later
  • Documented well enough so another person can repeat them and get identical results

Document your methodology in detail. Also document how to re-create your environment. Include all details another person needs to know:

  • Versions used
  • Feature flags and other configuration
  • Instance types and any other environmental details

Use load generation tools, and understand their limitations

In most cases, to accomplish repeatable, rapid-fire experiments, you need a synthetic load generation tool. Find out whether one already exists. If not, you may need to write one.

Understand that load generation tools are at best an approximation of what is going on in production. The better the approximation, the more relevant the results you’re going to obtain. If you find yourself drawing insights from benchmarks that do not translate into production, revisit your load generation tool.
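
To make the idea concrete, here is a deliberately minimal HTTP load generator sketch in Python. The endpoint, payload, concurrency and request count are made-up example values; a real tool would add think time, realistic payloads and much better error handling.

# Minimal synthetic HTTP load generator sketch. Endpoint, payload, concurrency
# and request count are arbitrary example values.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

TARGET = "http://localhost:8080/ingest"   # hypothetical endpoint
CONCURRENCY = 16
REQUEST_COUNT = 10_000

def one_request(_):
    start = time.monotonic()
    response = requests.post(TARGET, json={"payload": "x" * 512}, timeout=5)
    return response.status_code, time.monotonic() - start

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    results = list(pool.map(one_request, range(REQUEST_COUNT)))

errors = sum(1 for status, _ in results if status >= 400)
latencies = sorted(latency for _, latency in results)
print(f"errors: {errors}")
print(f"median latency: {latencies[len(latencies) // 2]:.3f}s")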

Validate your benchmarking methodology

Repeat a baseline benchmark at least 10 times and calculate the standard deviation over the results. You can use the following spreadsheet formula:

=STDEV(<range>)/AVERAGE(<range>)

Format this number as a percentage, and you’ll see how big the relative variance in your result set is. Ideally, you want this value to be < 10%. If your benchmarks have larger variance, revisit your methodology. You may need to tweak factors like:

  • Increase the duration of the tests.
  • Eliminate variance from the environments.
    • Ensure all benchmarks start in the same state (e.g., cold caches, freshly launched JVMs, etc.).
    • Consider the effects of Hotspot/JITs.
  • Simplify/stub components and dependencies on other microservices that add variance but aren’t key to your benchmark.
    • Don’t be shy about making hacky code changes and pushing binaries you’d never ship to production.

Important: Determine the number of results you need to get the standard deviation below a good threshold. Run each of your actual benchmarks at least that many times. Otherwise, your results may be too random.
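
The same check is easy to script outside the spreadsheet. A small sketch, assuming your per-run results are plain numbers (the values below are made up):

# Sketch: relative standard deviation (coefficient of variation) of repeated
# baseline runs, mirroring the =STDEV(<range>)/AVERAGE(<range>) formula.
from statistics import mean, stdev

baseline_runs = [1021, 998, 1044, 1012, 987, 1030, 1005, 1019, 996, 1027]  # example QPS values

cv = stdev(baseline_runs) / mean(baseline_runs)
print(f"relative variance: {cv:.1%}")  # aim for < 10% before trusting comparisons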

Execute the benchmark series

Now that you have developed a sound methodology, it’s time to gather data. Tips:

  • Only vary one input/knob/configuration setting at a time.
  • For every run of the benchmark, capture start and end time. This will help you correlate it to logs and metrics later.
  • If you’re unsure whether the input will actually affect your metric, try extreme values to confirm it’s worth running a series.
  • Script the execution of the benchmarks and collection of metrics.
  • Interleave your benchmarks to make sure what you’re observing isn’t a slow drift in your test environment. Instead of running AAAABBBBCCCC, run ABCABCABCABC (see the sketch below).
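
A bare-bones sketch of such a driver, with run_benchmark() as a placeholder for whatever actually configures the system and drives your load tool:

# Sketch: interleave experiment variants (ABCABC... rather than AAAABBBB...)
# and record start/end timestamps for later correlation with logs and metrics.
import time

VARIANTS = ["baseline", "double-threads", "bigger-batches"]   # hypothetical knobs
ROUNDS = 4

def run_benchmark(variant):
    # Placeholder: configure the system for `variant`, drive load, return the KPI.
    raise NotImplementedError

results = []
for round_number in range(ROUNDS):
    for variant in VARIANTS:
        started = time.time()
        kpi = run_benchmark(variant)
        results.append({
            "variant": variant,
            "round": round_number,
            "start": started,
            "end": time.time(),
            "kpi": kpi,
        })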

Create enough load to be able to measure a difference

There are two different strategies for generating load.

Strategy 1: Redline it!

In most cases, you want to ensure you’re creating enough load to saturate your component. If you do not manage to accomplish that, how would you see that you’ve increased its throughput?

If your component falls apart at redline (i.e. OOMs, throughput drops, or otherwise spirals out of control), understand why, and fix the problem.

Strategy 2: Measure machine resources

In cases where you cannot redline the component, or you have reason to believe it behaves substantially differently in less-than-100%-load situations, you may need to resort to OS metrics such as CPU utilization and IOPS to determine whether you’ve made a change.

Make sure your load is large enough for changes to be visible. If your load causes 3% CPU utilization, a 50% improvement in performance will be lost in the noise.

Try different amounts of load and find a sweet spot, where your OS metric measurement is sensitive enough.
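
If you do fall back to OS metrics, sample them over the whole run rather than relying on a single reading. One way to do that, assuming the third-party psutil library is available:

# Sketch: sample CPU utilization once per second for the duration of a run,
# then report the average. Requires the third-party psutil package.
import psutil

def average_cpu(duration_seconds=60):
    samples = []
    for _ in range(duration_seconds):
        # cpu_percent(interval=1) blocks for one second and returns utilization.
        samples.append(psutil.cpu_percent(interval=1))
    return sum(samples) / len(samples)

print(f"average CPU utilization: {average_cpu():.1f}%")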

Add new benchmarking experiments as needed

As you execute your benchmarks and develop a better understanding of the system, you are likely to discover new factors that may impact your key metric. Add new experiments to your list and prioritize them over the previous ones if needed.

Hack the code

In some instances, the code may not have configuration or control knobs for the inputs you want to vary. Find the fastest way to change the input, even if it means hacking the code, commenting out sections or otherwise manipulating the code in ways that wouldn’t be “kosher” for merges into master. Remember: The goal here is to get answers as quickly as possible, not to write production-quality code—that comes later, once we have our answers.

Analyze the data and document your insights

Once you’ve completed a series of benchmarks, take a step back and think about what the data is telling you about the system you’re benchmarking. Document your insights and how the data backs them up.

It may be helpful to:

    • Calculate the average for each series of benchmarks you ran and use that to calculate the difference (in percent) between series — i.e. “when I doubled the number of threads, QPS increased by 23% on average” (see the sketch below).
    • Graph your results — is the relationship between your input and the performance metric linear? Logarithmic? Bell curve?
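
A minimal sketch of that comparison, with made-up KPI values:

# Sketch: compare the average KPI of two benchmark series and report the
# percentage change, e.g. "doubling threads changed QPS by +23% on average".
from statistics import mean

baseline = [1010, 998, 1022, 1005]          # example QPS values
double_threads = [1250, 1231, 1244, 1238]   # example QPS values

change = (mean(double_threads) - mean(baseline)) / mean(baseline)
print(f"QPS change vs. baseline: {change:+.0%}")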

Present your insights

  1. When presenting your insights to management and/or other engineering teams, apply the Pyramid Principle. Engineers often make the mistake of explaining methodology, results and concluding with the insights. It is preferable to reverse the order and start with the insight. Then, if needed/requested, explain methodology and how the data supports your insight.
  2. Omit nitty-gritty details of any experiments that didn’t lead to interesting insights.
  3. Avoid jargon, and if you cannot, explain it. Don’t assume your audience knows the jargon.
  4. Make sure your graphs have meaningful, human-readable units.
  5. Make sure your graphs can be read when projected onto a screen or TV.