Open Source Projects at Sumo Logic

Someone recently asked me, rather smugly I might add, “who’s ever made money from open source?” At the time I naively answered with the first person who came to mind: Rod Johnson, the creator of Java’s Spring Framework. My mind quickly began retrieving other examples, but in the process I began to wonder about the motivation behind the question.

The inference, of course, was that open source is free. Such a sentiment speaks not only to monetization but to the premise of open source, which raises a good many questions. As Karim R. Lakhani and Robert G. Wolf wrote, “Many are puzzled by what appears to be irrational and altruistic behavior… giving code away, revealing proprietary information, and helping strangers solve their technical problems.”

While many assumed that better jobs, career advancement, and so on are the main drivers, Lakhani and Wolf discovered that how creative a person feels when working on the project (what they call “enjoyment-based intrinsic motivation”) is the strongest and most pervasive driver. They also found that user need, intellectual stimulation derived from writing code, and improving programming skills are top motivators for project participation.

Open Source Projects at Sumo Logic

Here at Sumo Logic, we have some very talented developers on the engineering team, and they are passionate about both the Sumo Logic application and giving back. To showcase some of the open-source projects our developers are working on, as well as other commits from our community, we’ve created a gallery on our developer site where you can quickly browse projects and dive into the repos, code, and gists we’ve committed. Here’s a sampling of what you’ll find:

Sumoshell

Parsing out fields on the command line can be cumbersome, aggregating is basically impossible, and there is no good way to view the results; grep can’t even tell that some log entries span multiple lines. Written by Russell Cohen, Sumoshell is a collection of CLI utilities written in Go that improve the way you analyze log files. Each individual command acts as a phase in a pipeline that gets you to the answer you want, bringing a lot of the functionality of Sumo Logic to the command line.

Sumobot

As our Chief Architect, Stefan Zier, explains in this blog post, all changes to production environments at Sumo Logic follow a well-documented change management process. In the past, we manually tied together JIRA and Slack to get from a proposal to an approved change in the most expedient manner. So we built a plugin for our sumobot Slack bot. Check out both the post and the plugin.

Sumo Logic Python SDK

Written by Yoway Buorn, the SDK provides a Python interface to the Sumo Logic REST API. The idea is to make it easier to hit the API in Python code. Feel free to add your scripts and programs to the scripts folder.

Sumo Logic Java Client

Sumo Logic provides a cloud-based log management solution that can process and analyze log files at petabyte scale. This library provides a Java client to execute searches on the data collected by the Sumo Logic service.

Growing Number of Projects

Machine data and analytics is about more than just server logging and aggregation; there are some interesting problems yet to be solved. Currently, you’ll find numerous appenders for .NET and Log4j, search utilities for Ruby and Java, Chef cookbooks, and more. We could use additional examples that call our REST APIs from different languages. As we build our developer community, we’d like to invite you to contribute. Check out the open-source projects landing page and browse through the projects. Feel free to fork a project and share, or add examples to folders where indicated.


Introducing Sumo Logic Live Tail

In my last post I wrote about how DevOps’ emphasis on frequent release cycles leads to the need for more troubleshooting in production, and how developers are frequently being drawn into that process. Troubleshooting applications in production isn’t always easy: for developers, the first course of action is to drop down to a terminal, ssh into the environment (assuming you have access) and begin tailing log files to determine the current state. When the problem isn’t immediately obvious, they might tail -f the logs to a file, then grep for specific patterns. But there’s no easy way to search log tails in real time. Until now.

Now developers and team members have a new tool, called Sumo Logic Live Tail, that lets you tail log files into a window, filter for specific conditions and utilize other cool features to troubleshoot in real time. Specifically, Live Tail lets you:

  • Pause the log stream, scroll up to previous messages, then jump to the latest log line and resume the stream.
  • Create keywords that will then be used to highlight occurrences within the log stream.
  • Filter log files on the fly, in real time.
  • Tail multiple log files simultaneously by multi-tailing.
  • Launch Sumo Logic Search in the context of Sumo Logic Live Tail (and vice versa).

Live Tail is immediately available from within the Sumo Logic environment, and coming soon is a command line interface (CLI) that will allow developers to launch Live Tail directly from the command line.

What Can I Do With Live Tail?

Troubleshoot Production Logs in Real Time

You can now troubleshoot without having to log into business-critical applications. Users can also harness the power of Sumo Logic by launching Search in the context of Live Tail and vice versa. There is simply no need to jump between different tools to get the data you need.

Save Time Requesting and Exporting Log Files

As I mentioned, troubleshooting applications in production with tail -f isn’t always easy. First, you need to gain access to production log files, and for systems that manage sensitive data, admins may be reluctant to grant that access. Live Tail allows you to view your most recent logs in real time, analyze them in context, copy and share them via secure email when there’s an outage, and set up searches in Sumo Logic based on Live Tail results.

Consolidate Tools to Reduce Costs

In the past, you may have toggled between two tools: one for tailing your logs and another for advanced analytics for pattern recognition to help with troubleshooting, proactive problem identification and user analysis. With Sumo Logic Live Tail, you can now troubleshoot from the Sumo Logic browser interface or from a Sumo Logic Command Line Interface without investing in a separate solution for live tail, thereby reducing the cost of owning licenses for multiple tools.

Getting Started

There are a couple of ways to initiate a Live Tail session. From the Sumo Logic web app:

  • Go directly to Live Tail by hovering over the Search menu and clicking on the Live Tail menu item;  or
  • From an existing search, click the Live Tail link (just below the search interface).

In both instances, you’ll need to enter the name of the _sourceCategory, _sourceHost, _sourceName, _source, or _collector of the log you want to tail, along with any filters. Click Run to initiate the search query. That will bring up a session similar to Figure 1.

Figure 1. A Live Tail session.

To find specific information, such as errors and exceptions, you can filter by keyword. Just add your keywords to the Live Tail query and click Run or press Enter. The search will be rerun with the new filter, and those keywords will be highlighted on incoming messages, making it easy to spot the conditions you care about. The screen clears, and new results automatically scroll.

Figure 2. Using Keyword Highlighting to quickly locate items in the log stream.

To highlight keywords that appear in your running Live Tail, click the A button. A dialog will open — enter the term you’d like to highlight. You may enter multi-term keywords separated by spaces. Hit enter to add additional keywords. The different keywords are then highlighted using different colors, so that they are easy to find on the screen. You can highlight up to eight keywords at a time.

Multi-tailing

A single log file doesn’t always give you a full view. Using the multi-tail feature, you can tail multiple logs simultaneously. For example, after a database reboot, you can check if it was successful by validating that the application is querying the database. But if there’s an error on one server, you’ll need to check the other servers to see if they may be affected.

You can start a second Live Tail session from the Live Tail page or from the Search page; the browser opens in split-screen mode and streams 300-400 messages per minute. You can also open, or “pop out,” a running Live Tail session into a new browser window. This way, you can move the new window to another screen, or watch it separately from the browser window where Sumo Logic is running.

Figure 3. Multi-tailing in split screen mode

Launch In Context

One of the highlights of Sumo Logic Live Tail is the ability to launch in context, which allows you to seamlessly alternate between Sumo Logic Search and Live Tail in the browser. For example, when you are on the Search page and need to start tailing a log file to view the most recent raw log lines coming in, you click a button to launch the Live Tail page from Search, and the source name is carried forward automatically. If you want to perform more advanced operations like parsing, using operators, or increasing the time range to the previous day, simply click “Open in Search.” This action launches a new search tab that automatically includes the parameters you entered on the Live Tail page, so there’s no need to re-enter them.

For more information about using Live Tail, check out the documentation in Sumo Logic Help.

Who Broke My Test? A Git Bisect Tutorial

Git bisect is a great way to help narrow down who (and what) broke something in your test—and it is incredibly easy to learn. This Git Bisect tutorial will guide you through the basic process.

So, what is Git bisect?

What exactly is it? Git bisect is a binary search to help find the commit that introduced a bug. Not only is it a great tool for developers, but it’s also very useful for testers (like myself).

I left for the weekend with my Jenkins dashboard looking nice and green. It was a three-day weekend—time to relax! I came in Tuesday, fired up my machine, logged into Jenkins…RED. I had to do a little bit of detective work.

Luckily for me, a developer friend told me about git bisect (I’m relatively new to this world), and helped me quickly track down which commit broke my tests.

Getting started with Git bisect

First, I had to narrow down the timeline. (Side note—this isn’t really a tool if you’re looking over the last few months, but if it’s within recent history—days—it’s handy). Looking at my build history in Jenkins, I noted the date/times I had a passing build (around 11 AM), and when it started showing up red (around 5 PM).

I went into SourceTree and found a commit from around 11 AM that I thought would be good. A simple double-click of that commit and I was in. I ran some tests against that commit, and all passed, confirming I had a good build. It was time to start my bisect session!


git bisect start
git bisect good

Narrowing down the suspects

Now that I’d established my GOOD build, it was time to figure out where I thought the bad build occurred. Back to SourceTree! I found a commit from around 5 PM (where I noticed the first failure), so I thought I’d check that one out. I ran some more tests. Sure enough, they failed. I marked that as my bad build.


git bisect bad

I had a bunch of commits between the good and bad (in my case, 15), and needed to find which one between our 11 AM and 5 PM run broke our tests. Now, without bisect, I might have had to pull down each commit, run my tests, and see which started failing between good and bad. That’s very time-consuming. But git bisect prevents you from having to do that.

When I ran git bisect bad, I got a message in the following format:

Bisecting: <N> revisions left to test after this (roughly <X> steps)

[<commit number>] <commit description>


This helped identify the middle commit between what I identified as good and bad, cutting my options in half. It told me how many revisions were between the identified commit and my bad commit (previously identified), how many more steps I should have to find the culprit, and which commit I next needed to look at.

Then, I needed to test the commit that bisect came up with. So—I grabbed it, ran my tests—and they all passed.

git bisect good

This narrowed down my results even further and gave me a similar message. I continued to grab the suggested commit, run my tests, and mark it good until I found my culprit—and it took me only three steps (versus running through about 15 commits)! When my tests finally failed, I marked that commit as bad.


git bisect bad

<commit number> is the first bad commit
<commit number>
Author: <name>
Date: <date and time of commit>

Aha! I knew the specific commit that broke my test, who did it, and when. I had everything I needed to go back to the engineer (with proof!) and start getting my tests green again.


Getting to the Root Cause

When you are in the process of stabilizing tests, it can be fairly time-consuming to determine whether a failure is the result of the test itself or of an actual bug. Using git bisect can help reduce that time and pinpoint exactly what went wrong. In this case, we were able to quickly go to the engineer, point to the specific commit that broke the tests, and work together to understand why and how to fix it.
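
As a side note, if your test run can be wrapped in a script, git can drive the whole loop for you with git bisect run. The sketch below assumes a hypothetical run-tests.sh that exits 0 when the tests pass and non-zero when they fail; git uses that exit code to mark each commit and keeps halving the range until it prints the first bad commit.

# Mark the known endpoints, then let git do the stepping.
git bisect start
git bisect bad                   # the current (failing) commit
git bisect good <commit number>  # the last known-good commit, e.g. the 11 AM build

# Hypothetical test script; exit 0 = good, 1-127 (except 125) = bad.
git bisect run ./run-tests.sh

# Return to where you started once you have your answer.
git bisect reset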

Of course, in my perfect world, it wouldn’t only be my team that monitors and cares about the results. But until I live in Tester’s Utopia, I’ll use git bisect.

Who Broke My Test? A Git Bisect Tutorial is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Ashley Hunsberger is a Quality Architect at Blackboard, Inc. and co-founder of Quality Element. She’s passionate about making an impact in education and loves coaching team members in product and client-focused quality practices. Most recently, she has focused on test strategy implementation and training, development process efficiencies, and preaching Test Driven Development to anyone that will listen. In her downtime, she loves to travel, read, quilt, hike, and spend time with her family.

Building Software Release Cycle Health Dashboards in Sumo Logic

Gauging the health and productivity of a software release cycle is notoriously difficult. Atomic-age metrics like “man months” and LOCs may be discredited, but they are too often a reflexive response to DevOps problems.

Instead of understanding the cycle itself, management may hire a “DevOps expert” or homebrew one by taking someone off their project and focusing them on “automation.” Or they might add man months and LOCs with more well-intentioned end-to-end tests.

What could go wrong? Below, I’ve compiled some metrics and tips for building a release cycle health dashboard using Sumo Logic.

Measuring Your Software Release Cycle Speed

Jez Humble points to some evidence that delivering faster not only shortens feedback but also makes people happier, even on deployment days. Regardless, shorter feedback cycles do tend to bring in more user involvement in the release, resulting in more useful features and fewer bugs. Even if you are not pushing for only faster releases, you will still need to allocate resources between functions and services. Measuring deployment speed will help.

Change lead time: Time between ticket accepted and ticket closed.
Change frequency: Time between deployments.
Recovery Time: Time between a severe incident and resolution.

To get this data into Sumo Logic, ingest your SCM and incident management tools. While these are not typical log streams, their tags and timestamps are necessary for tracking the pipeline. You can pull deployment data from your release management tools.

Tracking Teams and Services with the Github App

To avoid averaging out insights, separately tag services and teams in each of the metrics above. For example, if a user logic group works on identities and billing, track billing and identity services separately.

For Github users, there is an easy solution, the Sumo Logic App for Github, which is currently available in preview. It generates pre-built dashboards in common monitoring areas like security, commit/pipeline and issues. More importantly, each panel provides queries that can be repurposed for separately tagged, team-specific panels.

Reusing these queries allows you to build clear pipeline visualizations very quickly. For example, let’s build a “UI” team change frequency panel.

First, create a lookup table designating UserTeams. Pin it to saved queries as it can be used across the dashboard to break out teams:


"id","user","email","team",
"1","Joe","joe@example.com","UI"
"2","John","john@example.com","UI"
"3","Susan","susan@example.com","UI"
"4","John","another_john@example.com","backspace"
"5","John","yet_another_john@example.com","backspace"

Next, copy the “Pull Requests by Repository” query from the panel:


_sourceCategory=github_logs and ( "opened" or "closed" or "reopened" )
| json "action", "issue.id", "issue.number", "issue.title" , "issue.state",
"issue.created_at", "issue.updated_at", "issue.closed_at", "issue.body",
"issue.user.login", "issue.url", "repository.name", "repository.open_issues_count"
as action, issue_ID, issue_num, issue_title, state, createdAt, updatedAt,
closedAt, body, user, url, repo_name, repoOpenIssueCnt
| count by action,repo_name
| where action != "assigned"
| transpose row repo_name column action

Then, pipe in the team identifier with a lookup command:


_sourceCategory=github_logs and ( "opened" or "closed" or "reopened" )
| json "action", "issue.id", "issue.number", "issue.title" , "issue.state",
"issue.created_at", "issue.updated_at", "issue.closed_at", "issue.body",
"issue.user.login", "issue.url", "repository.name", "repository.open_issues_count"
as action, issue_ID, issue_num, issue_title, state, createdAt, updatedAt,
closedAt, body, user, url, repo_name, repoOpenIssueCnt
| lookup team from https://toplevelurlwithlookups.com/UserTeams.csv
on user=user
| count by action,repo_name, team
| where action != "assigned"
| transpose row repo_name team column action

The resulting query tracks pull request actions — opened, closed or reopened — by team. The visualization can be controlled in the panel editor, and the lookup can easily be piped into other queries to break out the pipeline by team.

Don’t Forget User Experience

It may seem out of scope to measure user experience alongside a deployment schedule and recovery time, but it’s a release cycle health dashboard, and nothing is a better measure of a release cycle’s health than user satisfaction.

There are two standards worth including: Apdex and Net Promoter Score.

Apdex: measures application performance on a 0-1 satisfaction scale, calculated as (satisfied samples + tolerating samples / 2) / total samples, where “satisfied” and “tolerating” are defined by the response-time thresholds you choose.

If you want to build an Apdex solely in Sumo Logic, you could read through this blog post and use the new Metrics feature in Sumo Logic, a set of numeric metrics tools for performance analysis. It will allow you to set, then tune, your satisfied and tolerating thresholds without resorting to a third-party tool.

Net Promoter Score: How likely is it that you would recommend our service to a friend or colleague? This one-question survey correlates with user satisfaction, is simple to embed anywhere in an application or marketing channel, and can easily be forwarded to a Sumo Logic dashboard through a webhook. When visualizing these UX metrics, do not use a single numerical callout. Take advantage of Sumo Logic’s time-series capabilities by tracking a line chart with standard deviation. Over time, this will give you an expected range of satisfaction and visual cues of spikes in dissatisfaction that sit on the same timeline as your release cycle.
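
As a rough sketch of that webhook hop, a survey backend could POST each NPS response to a Sumo Logic hosted HTTP Source. The collector URL token and the JSON field names below are placeholders; substitute the unique endpoint generated for your own HTTP Source.

# Forward one NPS response to a Sumo Logic hosted HTTP Source (placeholder URL and fields).
curl -X POST "https://collectors.sumologic.com/receiver/v1/http/YOUR_UNIQUE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"survey": "nps", "score": 9, "userId": "12345", "timestamp": "2016-06-01T12:00:00Z"}'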

Controlling the Release Cycle Logging Deluge

A release cycle has a few dimensions that involve multiple sources, which allow you to query endlessly. For example, speed requires ticketing, CI and deployment logs. Crawling all the logs in these sources can quickly add up to TBs of data. That’s great fun for ad hoc queries, but streams like comment text are not necessary for a process health dashboard, and their verbosity can result in slow dashboard load times and costly index overruns.

To avoid this, block this and other unnecessary data by partitioning sources in Sumo Logic’s index tailoring menus. You can also speed up the dashboard by scheduling the underlying queries to run once a day. A health dashboard doesn’t send alerts, so it doesn’t need to run in real time.

More Resources:

Building Software Release Cycle Health Dashboards in Sumo Logic is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Alex Entrekin served on the executive staff of Cloudshare where he was primarily responsible for advanced analytics and monitoring systems. His work extending Splunk into actionable user profiling was featured at VMworld: “How a Cloud Computing Provider Reached the Holy Grail of Visibility.”

Alex is currently an attorney, researcher and writer based in Santa Barbara, CA. He holds a J.D. from the UCLA School of Law.

AWS Elastic Load Balancing: Load Balancer Best Practices

You are probably aware that Amazon Web Services provides Elastic Load Balancing (ELB) as its preferred load balancing solution on the AWS platform. ELB distributes incoming application traffic across multiple Amazon EC2 instances in the AWS Cloud. Considering the costs involved, and the expertise required to set up your own load balancing solution, ELB can be a very attractive option for rapid deployment of your application stack.

What is often overlooked are the few tips and tricks that ensure you are utilizing ELB to best support your application use case and infrastructure. Following is a list of some of the top tips and best practices I’ve found helpful when helping others set up Elastic Load Balancing.

Load Balancer Best Practices

Read the Docs!

This may seem obvious, but reading the docs and having a good fundamental understanding of how things work will save you a lot of trouble in the long run. There’s nothing like a short hands-on tutorial to get you started while conveying key features. Amazon provides very detailed documentation on how to set up and configure ELB for your environment; a few helpful docs for ELB have been included in the references section of this article. On that note, Amazon also provides recordings of their events and talks on YouTube. These can be very helpful when trying to understand the various AWS services and how they work.

Plan your Load Balancer Installation

Thorough planning is another key practice that can often be overlooked in the rush to implement a new application stack. Take the time to understand your application requirements thoroughly so you know the expected application behavior—especially as it relates to ELB.

Be sure to factor in considerations like budget and costs, use case, scalability, availability, and disaster recovery. This can’t be stressed enough. Proper planning will mean fewer unforeseen issues down the road, and can also provide your business with a knowledge base when others need to understand how your systems work.

SSL Termination

Along the lines of planning, if you are using SSL for your application (and you should be), plan to terminate it on the ELB. You have the option of terminating your SSL connections on your instances, or on the load balancer.

Terminating on the ELB effectively offloads the SSL processing to Amazon, saving CPU time on your instances that is better spent on application processing. It can also save on costs, because you will need fewer instances to handle the application load.

This can also save administrative overhead since ELB termination effectively moves the management to a single point in the infrastructure, rather than requiring management of the SSL certs on multiple servers.

Unless your security requirements dictate otherwise, this can make your life quite a bit simpler.

Configure Cross-Zone Load Balancing

Don’t just place all of your instances into one availability zone inside of an AWS region, and call it a day. When mapping out your infrastructure, plan on placing your instances into multiple AZs to take advantage of cross-zone load balancing.

This can help with application availability and resiliency, and can also make your maintenance cycles easier as you can take an entire availability zone offline at a time to perform your maintenance, then add it back to the load balancer, and repeat with the remaining availability zones.
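
If you manage your ELBs from the command line, cross-zone load balancing on a Classic ELB can be switched on with a single AWS CLI call. Here is a minimal sketch, assuming a load balancer named my-load-balancer already exists:

# Enable cross-zone load balancing on an existing Classic ELB.
aws elb modify-load-balancer-attributes \
  --load-balancer-name my-load-balancer \
  --load-balancer-attributes "{\"CrossZoneLoadBalancing\":{\"Enabled\":true}}"

# Confirm the attribute is now enabled.
aws elb describe-load-balancer-attributes --load-balancer-name my-load-balancer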

It’s worth noting that ELB does not currently support cross-region load balancing, so you’ll need to find another way to support multiple regions. One way of doing that is to implement global load balancing.

Global Load Balancing

AWS has a feature in Route 53 called routing policies. With routing policies, you can direct global traffic in a variety of ways, but more specifically direct client traffic to whichever data center is most appropriate for your client and application stack.

This is a fairly advanced topic, with a lot of ins and outs that can’t be covered in this article. See my post, Global Load Balancing Using AWS Route 53, for more details. In short, the best way to learn about this feature in Route 53 is probably to start with the docs, then try implementing it through the use of Route 53 traffic flow.
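
To give a flavor of what a routing policy looks like in practice, here is a rough sketch of creating one latency-based record with the AWS CLI. The hosted zone ID, domain name, and IP address are placeholders; you would create a similar record for each region so that Route 53 answers every client from its lowest-latency endpoint.

# One latency-routed A record for the us-east-1 endpoint (placeholder values).
cat > latency-us-east-1.json <<'EOF'
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": "app-us-east-1",
        "Region": "us-east-1",
        "TTL": 60,
        "ResourceRecords": [ { "Value": "203.0.113.10" } ]
      }
    }
  ]
}
EOF

aws route53 change-resource-record-sets \
  --hosted-zone-id Z1EXAMPLE12345 \
  --change-batch file://latency-us-east-1.json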

Pre-warming your ELB

Are you expecting a large spike in traffic for an event? Get in touch with Amazon and have them pre-warm your load balancers. ELBs scale best with a gradual increase in traffic load. They do not respond well to spiky loads, and can break if too much flash traffic is directed their way.

This is covered under the ELB best practices, and can be critical to having your event traffic handled gracefully, or being left wondering what just happened to your application stack when there is a sudden drop in traffic. Key pieces of information to relay to Amazon when contacting them are the total number of expected requests, and the average request response size.

One other key piece of information has to do with operating at scale. If you are terminating your SSL on the ELB, and the HTTPS request response size is small, be sure to stress that point with Amazon support. Small request responses coupled with SSL termination on the ELB may result in overloading the ELBs, even though Amazon will have scaled them to meet your anticipated demand.

Monitoring

One final item, but a very important one: be sure to monitor your ELBs. This can be done through the AWS dashboard, and can provide a great deal of insight into application behavior and help identify problems as they arise. You can also use the Sumo Logic App for Elastic Load Balancing for greater visibility into events that, in turn, help you understand the overall health of your EC2 deployment. For example, you can use the Sumo Logic App to analyze raw Elastic Load Balancing data to investigate the availability of applications running behind Elastic Load Balancers.
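
For a quick check outside the dashboards, the same ELB metrics are also exposed through CloudWatch and can be pulled from the CLI. A minimal sketch; the load balancer name and time window are placeholders:

# Average backend latency for a Classic ELB over one hour, in 5-minute buckets.
aws cloudwatch get-metric-statistics \
  --namespace AWS/ELB \
  --metric-name Latency \
  --dimensions Name=LoadBalancerName,Value=my-load-balancer \
  --start-time 2016-06-01T00:00:00Z \
  --end-time 2016-06-01T01:00:00Z \
  --period 300 \
  --statistics Average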

Conclusion

In this article, we’ve covered best practices for Amazon Elastic Load Balancing, including documentation, planning, SSL termination, regional and global load balancing, ELB pre-warming, and monitoring. Hopefully, these tips and tricks will help you to better plan and manage your AWS ELB deployment.

AWS Elastic Load Balancing: Load Balancer Best Practices is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

References

About the Author

Steve Tidwell has been working in the tech industry for over two decades, and has done everything from end-user support to scaling a global data ingestion and analysis platform to handle data analysis for some of the largest streaming events on the Web. He is currently Lead Architect for a well-known tech news site, where he plots to take over the world with cloud-based technologies from his corner of the office.

A Beginner’s Guide to GitHub Events

Do you like GitHub, but don’t like having to log in to check on the status of your project or code? GitHub events are your solution.

GitHub events provide a handy way to receive automated status updates from your GitHub repos concerning everything from code commits to new users joining a project. And because they are accessible via a Web API as GET requests, it’s easy to integrate them into the notification system of your choosing.

Keep reading for a primer on GitHub events and how to get the most out of them.

What GitHub events are, and what they are not

Again, GitHub events provide an easy way to keep track of your GitHub repository without monitoring its status manually. They’re basically a notification system that offers a high level of customizability.

You should keep in mind, however, that GitHub events are designed only as a way to receive notifications. They don’t allow you to interact with your GitHub repo. You can’t trigger events; you can only receive notifications when specific events occur.

That means that events are not a way for you to automate the maintenance of your repository or project. You’ll need other tools for that. But if you just want to monitor changes, they’re a simple solution.

How to use GitHub events

GitHub event usage is pretty straightforward. You simply send GET requests to https://api.github.com, specifying the type of information you want by completing the URL path accordingly.

For example, if you want information about the public events performed by a given GitHub user, you would send a GET request to this URL:

https://api.github.com/users/:username/events

(If you are authenticated, this request will generate information about private events that you have performed.)

Here’s a real-world example, in which we send a GET request using curl to find information about public events performed by Linus Torvalds (the original author of Git), whose username is torvalds:

curl -i -H "Accept: application/json" -H "Content-Type: application/json" -X GET https://api.github.com/users/torvalds/events

Another handy request lets you list events for a particular organization. The URL to use here looks like:

https://api.github.com/users/:username/events/orgs/:org

The full list of events, with their associated URLs, is available from the GitHub documentation.

Use GitHub Webhooks for automated events reporting

So far, we’ve covered how to request information about an event using a specific HTTP request. But you can take things further by using GitHub Webhooks to automate reporting about events of a certain type.

Webhooks allow you to “subscribe” to particular events and receive an HTTP POST response (or, in GitHub parlance, a “payload”) to a URL of your choosing whenever that event occurs. You can create a Webhook in the GitHub Web interface that allows you to specify the URL to which GitHub should send your payload when an event is triggered.

Alternatively, you can create Webhooks via the GitHub API using POST requests.
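
For example, a webhook that delivers a JSON payload to your endpoint on every push can be created with one authenticated POST. The repository name and endpoint URL below are placeholders:

# Create a push webhook on the owner/repo repository (placeholder names).
curl -u "username" -X POST https://api.github.com/repos/owner/repo/hooks \
  -H "Content-Type: application/json" \
  -d '{
        "name": "web",
        "active": true,
        "events": ["push"],
        "config": {
          "url": "https://example.com/github-events",
          "content_type": "json"
        }
      }'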

However you set them up, Webhooks allow you to monitor your repositories (or any public repositories) and receive alerts in an automated fashion.

Like most good things in life, Webhooks are subject to certain limitations, which are worth noting. Specifically, you can only configure up to a maximum of twenty events for each GitHub organization or repository.

Authentication and GitHub events

The last bit of information we should go over is how to authenticate with the GitHub API. While you can monitor public events without authentication, you’ll need to authenticate in order to keep track of private ones.

Authentication via the GitHub API is detailed here, but it basically boils down to having three options. The simplest is to do HTTP authentication using a command like:

curl -u "username" https://api.github.com

If you want to be more sophisticated, you can also authenticate using OAuth2 via either key/secrets or tokens. For example, authenticating with a token would look something like:

curl https://api.github.com/?access_token=OAUTH-TOKEN

If you’re monitoring private events, you’ll want to authenticate with one of these methods before sending requests about the events.

Further reading

If you want to dive deeper into the details of GitHub events, the following resources are useful:

A Beginner’s Guide to GitHub Events is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

Solaris Containers: What You Need to Know

Solaris and containers may not seem like two words that go together—at least not in this decade. For the past several years, the container conversation has been all about platforms like Docker, CoreOS and LXD running on Linux (and, in the case of Docker, now Windows and Mac OS X, too).

But Solaris, Oracle’s Unix-like OS, has actually had containers for a long time. In fact, they go all the way back to the release of Solaris 10 in 2005 (technically, they were available in the beta version of Solaris 10 starting in 2004), long before anyone was talking about Linux containers for production purposes. And they’re still a useful part of the current version of the OS, Solaris 11.3.

Despite the name similarities, Solaris containers are hardly identical to Docker or CoreOS containers. But they do similar things by allowing you to virtualize software inside isolated environments without the overhead of a traditional hypervisor.

Even as Docker and the like take off as the container solutions of choice for Linux environments, Solaris containers are worth knowing about, too—especially if you’re the type of developer or admin who finds himself, against his will, stuck in the world of proprietary, commercial Unix-like platforms because some decision-maker in his company’s executive suite is still wary of going wholly open source….

Plus, as I note below, Oracle now says it is working to bring Docker to Solaris containers—which means containers on Solaris could soon integrate into the mainstream container and DevOps scene.

Below, I’ll outline how Solaris containers work, what makes them different from Linux container solutions like Docker, and why you might want to use containers in a Solaris environment.

The Basics of Solaris Containers

Let’s start by defining the basic Solaris container architecture and terminology.

On Solaris, each container lives within what Oracle calls a local zone. Local zones are software-defined boundaries to which specific storage, networking and/or CPU resources are assigned. The local zones are strictly isolated from one another in order to mitigate security risks and ensure that no zone interferes with the operations of another.

Each Solaris system also has a global zone. This consists of the host system’s resources. The global zone controls the local zones (although a global zone can exist even if no local zones are defined). It’s the basis from which you configure and assign resources to local zones.

Each zone on the system, whether global or local, gets a unique name (the name of the global zone is always “global”—boring, I know, but also predictable) and a unique numerical identifier.

So far, this probably sounds a lot like Docker, and it is. Local zones on Solaris are like Docker containers, while the Solaris global zone is like the Docker engine itself.

Working with Zones and Containers on Solaris

The similarities largely end there, however, at least when it comes to the ways in which you work with containers on Solaris.

On Docker or CoreOS, you would use a tool like Swarm or Kubernetes to manage your containers. On Solaris, you use Oracle’s Enterprise Manager Ops Center to set up local zones and define which resources are available to them.

Once you set up a zone, you can configure it to your liking (for the details, check out Oracle’s documentation), then run software inside the zones.
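
If you prefer the shell to Ops Center, zones can also be created directly from the global zone with the zonecfg and zoneadm utilities. A minimal sketch follows; the zone name and zonepath are placeholders, and the exact steps differ slightly between Solaris 10 and 11.

# From the global zone: define, install and boot a local zone.
zonecfg -z webzone "create; set zonepath=/zones/webzone"
zoneadm -z webzone install
zoneadm -z webzone boot

# Check zone status, then log in to the new zone.
zoneadm list -cv
zlogin webzone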

One particularly cool thing that Solaris containers let you do is migrate a physical Solaris system into a zone. You can also migrate zones between host machines. So, yes, Solaris containers can come in handy if you have a cluster environment, even though they weren’t designed for native clustering in the same way as Docker and similar software.

Solaris Containers vs. Docker/CoreOS/LXD: Pros and Cons

By now, you’re probably sensing that Solaris containers work differently in many respects from Linux containers. You’re right.

In some ways, the differences make Solaris a better virtualization solution. In others, the opposite is true. Mostly, though, the distinction depends on what is most important to you.

Solaris’s chief advantages include:

  • Easy configuration: As long as you can point and click your way through Enterprise Manager Ops Center, you can manage Solaris containers. There’s no need to learn something like the Docker CLI.
  • Easy management of virtual resources: On Docker and CoreOS, sharing storage or networking with containerized apps via tools like Docker Data Volumes can be tedious. On Solaris it’s more straightforward (largely because you’re splicing up the host resources of only a single system, not a cluster).

But there are also drawbacks, which mostly reflect the fact that Solaris containers debuted more than a decade ago, well before people were talking about the cloud and hyper-scalable infrastructure.

Solaris container cons include:

  • Solaris container management doesn’t scale well. With Enterprise Manager Ops Center, you can only manage as many zones as you can handle manually.
  • You can’t spin up containers quickly based on app images, as you would with Docker or CoreOS, at least for now. This makes Solaris containers impractical for continuous delivery scenarios. But Oracle says it is working to change that by promising to integrate Docker with Solaris zones. So far, though, it’s unclear when that technology will arrive in Solaris.
  • There’s not much choice when it comes to management. Unlike the Linux container world, where you can choose from dozens of container orchestration and monitoring tools, Solaris only gives you Oracle solutions.

The bottom line: Solaris containers are not as flexible or nimble as Linux containers, but they’re relatively easy to work with. And they offer powerful features, especially when you consider how old they are. If you work with Oracle data centers, Solaris containers are worth checking out, despite being a virtualization solution that gets very little press these days.

Solaris Containers: What You Need to Know is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

Benchmarking Microservices for Fun and Profit

Why should I benchmark microservices?

The ultimate goal of benchmarking is to better understand the software, and test out the effects of various optimization techniques for microservices. In this blog, we describe our approach to benchmarking microservices here at Sumo Logic.

Create a spreadsheet for tracking your benchmarking

We’ve found that a Google Spreadsheet is a convenient way to document a series of benchmarks. It allows collaboration and provides the necessary features to analyze and sum up your results. Structure your spreadsheet as follows:

  • Title page
    • Goals
    • Methodology
    • List of planned and completed experiments (evolving as you learn more)
    • Insights
  • Additional pages
    • Detailed benchmark results for various experiments

Be clear about your benchmark goals

Before you engage in benchmarking, clearly state (and document) your goal. Examples of goals are:

“I am trying to understand how input X affects metric Y”

“I am running experiments A, B and C to increase/decrease metric X”

Pick one key metric (Key Performance Indicator – KPI)

State clearly which one metric you are concerned about and how that metric affects users of the system. If you choose to capture additional metrics for your test runs, ensure that the key metric stands out.

Think like a scientist

You’re going to be performing a series of experiments to better understand which inputs affect your key metric, and how. Consider and document the variables you devise, and create a standard control set to compare against. Design your series of experiments in a fashion that leads to the understanding in the least amount of time and effort.

Define, document and validate your benchmarking methodology

Define a methodology for running your benchmarks. It is critical your benchmarks be:

  • Fairly fast (several minutes, ideally)
  • Reproducible in the exact same manner, even months later
  • Documented well enough so another person can repeat them and get identical results

Document your methodology in detail. Also document how to re-create your environment. Include all details another person needs to know:

  • Versions used
  • Feature flags and other configuration
  • Instance types and any other environmental details

Use load generation tools, and understand their limitations

In most cases, to accomplish repeatable, rapid-fire experiments, you need a synthetic load generation tool. Find out whether one already exists. If not, you may need to write one.

Understand that load generation tools are at best an approximation of what is going on in production. The better the approximation, the more relevant the results you’re going to obtain. If you find yourself drawing insights from benchmarks that do not translate into production, revisit your load generation tool.

Validate your benchmarking methodology

Repeat a baseline benchmark at least 10 times and calculate the standard deviation over the results. You can use the following spreadsheet formula:

=STDEV(<range>)/AVERAGE(<range>)

Format this number as a percentage, and you’ll see how big the relative variance in your result set is. Ideally, you want this value to be < 10%. If your benchmarks have larger variance, revisit your methodology. You may need to tweak factors like:

  • Increase the duration of the tests.
  • Eliminate variance from the environments.
    • Ensure all benchmarks start in the same state (i.e. cold caches, freshly launched JVMs, etc).
    • Consider the effects of Hotspot/JITs.
  • Simplify/stub components and dependencies on other microservices that add variance but aren’t key to your benchmark.
    • Don’t be shy to make hacky code changes and push binaries you’d never ship to production.

Important: Determine the number of results you need to get the standard deviation below a good threshold. Run each of your actual benchmarks at least that many times. Otherwise, your results may be too random.

Execute the benchmark series

Now that you have developed a sound methodology, it’s time to gather data. Tips:

  • Only vary one input/knob/configuration setting at a time.
  • For every run of the benchmark, capture start and end time. This will help you correlate it to logs and metrics later.
  • If you’re unsure whether the input will actually affect your metric, try extreme values to confirm it’s worth running a series.
  • Script the execution of the benchmarks and collection of metrics.
  • Interleave your benchmarks to make sure what you’re observing aren’t slow changes in your test environment. Instead of running AAAABBBBCCCC, run ABCABCABCABC (see the driver sketch after this list).
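
Here is a minimal driver sketch along those lines. It assumes hypothetical per-experiment scripts (run_a.sh, run_b.sh, run_c.sh) that each execute one benchmark run and print the KPI as a single number:

#!/usr/bin/env bash
# Interleave experiments A, B and C, recording start/end times and the KPI
# for each run in a CSV you can paste into the results spreadsheet.
set -euo pipefail

RUNS=4
echo "experiment,start,end,kpi" > results.csv

for i in $(seq 1 "$RUNS"); do
  for exp in a b c; do
    start=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    kpi=$(./run_${exp}.sh)        # hypothetical script: prints one number
    end=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    echo "${exp},${start},${end},${kpi}" >> results.csv
  done
done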

Create enough load to be able to measure a difference

There are two different strategies for generating load.

Strategy 1: Redline it!

In most cases, you want to ensure you’re creating enough load to saturate your component. If you do not manage to accomplish that, how would you see that you increased its throughput?

If your component falls apart at redline (i.e. OOMs, throughput drops, or otherwise spirals out of control), understand why, and fix the problem.

Strategy 2: Measure machine resources

In cases where you cannot redline the component, or you have reason to believe it behaves substantially different in less-than-100%-load situations, you may need to resort to OS metrics such as CPU utilization and IOPS to determine whether you’ve made a change.

Make sure your load is large enough for changes to be visible. If your load causes 3% CPU utilization, a 50% improvement in performance will be lost in the noise.

Try different amounts of load and find a sweet spot, where your OS metric measurement is sensitive enough.

Add new benchmarking experiments as needed

As you execute your benchmarks and develop a better understanding of the system, you are likely to discover new factors that may impact your key metric. Add new experiments to your list and prioritize them over the previous ones if needed.

Hack the code

In some instances, the code may not have configuration or control knobs for the inputs you want to vary. Find the fastest way to change the input, even if it means hacking the code, commenting out sections or otherwise manipulating the code in ways that wouldn’t be “kosher” for merges into master. Remember: The goal here is to get answers as quickly as possible, not to write production-quality code—that comes later, once we have our answers.

Analyze the data and document your insights

Once you’ve completed a series of benchmarks, take a step back and think about what the data is telling you about the system you’re benchmarking. Document your insights and how the data backs them up.

It may be helpful to:

    • Calculate the average for each series of benchmarks you ran and to use that to calculate the difference (in percent) between series — i.e. “when I doubled the number of threads, QPS increased by 23% on average.”
    • Graph your results — is the relationship between your input and the performance metric linear? Logarithmic? Bell curve?

Present your insights

  1. When presenting your insights to management and/or other engineering teams, apply the Pyramid Principle. Engineers often make the mistake of explaining methodology, results and concluding with the insights. It is preferable to reverse the order and start with the insight. Then, if needed/requested, explain methodology and how the data supports your insight.
  2. Omit nitty-gritty details of any experiments that didn’t lead to interesting insights.
  3. Avoid jargon, and if you cannot, explain it. Don’t assume your audience knows the jargon.
  4. Make sure your graphs have meaningful, human-readable units.
  5. Make sure your graphs can be read when projected onto a screen or TV.

How Hudl and Cloud Cruiser use Sumo Logic Unified Logs and Metrics

We launched the Sumo Logic Unified Logs and Metrics (ULM) solution a couple of weeks ago, and we are already seeing massive success and adoption. So how are real-world customers using the Sumo Logic ULM product?

Today, we ran a webinar with a couple of our early ULM product customers and got an inside view into their processes, team makeup, and how ULM is changing the way they monitor and troubleshoot. The webinar was hosted by Ben Newton, Sumo Logic Product Manager extraordinaire for ULM, and featured two outstanding customer speakers: Ben Abrams, Lead DevOps Engineer, Cloud Cruiser, and Jon Dokulil, VP of Engineering, Hudl.

Ben and Jon both manage mission-critical AWS-based applications for their organizations and are tasked with ensuring excellent customer experience. Needless to say, they know application and infrastructure operations well.

In the webinar, Ben and Jon described their current approaches to operations (paraphrased below for readability and brevity):

Jon: Sumo Logic log analytics is a critical part of the day to day operations at Hudl. Hudl engineers use Sumo Dashboards to identify issues when they deploy apps; they also use Sumo Logic reports extensively to troubleshoot application and infrastructure performance issues.

Ben: Everything in our world starts with an alert. And before ULM, we had to use many systems to correlate the data. We use Sumo Logic extensively in the troubleshooting process.

Ben and Jon also described their reasons to consider Sumo Logic ULM:

Ben: Both logs and metrics tell critical parts of the machine data story and we want to see them together in one single pane of glass so that we can correlate the data better and faster and reduce our troubleshooting time. Sumo Logic ULM provides this view to us.

Jon: We have many tools serving the DevOps team, and the team needs to check many systems when things go wrong and not all team members are skilled in all tools. Having a single tool that can help diagnose problems is better, so consolidating across logs and metrics has provided us significant value.

ULM Dashboards at Hudl

Finally, the duo explains where they want to go with Sumo Logic ULM:

Ben: We would like to kill off our siloed metrics solution. We would also like to use AWS auto-scale policies to automate the remediation process, without human intervention.

Jon: We would like to provide full log and metrics visibility with the IT alerts so that the DevOps team can get full context and visibility to fix issues quickly.

All in all, this was a fantastic discussion, and it validates why IT shops tasked with meeting 100% performance SLAs should consider the Sumo Logic Unified Logs and Metrics solution. To hear the full story, check out the full webinar on-demand.

If you are interested in trying out the Sumo Logic ULM solution, sign up for Sumo Logic Free.

Tutorial: How to Run Artifactory as a Container

If you use Artifactory, JFrog’s artifact repository manager, there’s a good chance you’re already invested in a DevOps-inspired workflow. And if you do DevOps, you’re probably interested in containerizing your apps and deploying them through Docker or another container system. That’s a key part of being agile, after all.

From this perspective, it only makes sense to want to run Artifactory as a Docker container. Fortunately, you can do that easily enough. While Artifactory is available for installation directly on Linux or Windows systems, it also runs well on Docker. In fact, running Artifactory as a container gives you some handy features that would not otherwise be available.

In this tutorial, I’ll explain how to run Artifactory as a container, and discuss some of the advantages of running it this way.

Pulling the Artifactory Container

Part of the reason why running Artifactory as a Docker container is convenient is that pre-built images for it already exist. The images come with the Nginx Web server and Docker repositories built in.

The Artifactory container images are available from Bintray. You can pull them with a simple Docker pull command, like so:

docker pull docker.bintray.io/jfrog/artifactory-oss:latest

This would pull the image for the latest version of the open source edition of Artifactory. If you want a different version, you can specify that in the command. Images are also available for the Pro Registry and Pro versions of Artifactory.

Running Artifactory as a Container

Once you’ve pulled the container image, start it up. A command like this one would do the trick:


docker run -d --name artifactory -p 80:80 -p 8081:8081 -p 443:443 \
-v $ARTIFACTORY_HOME/data \
-v $ARTIFACTORY_HOME/logs \
-v $ARTIFACTORY_HOME/backup \
-v $ARTIFACTORY_HOME/etc \
docker.bintray.io/jfrog/artifactory-oss:latest

The -v flags specify volume mounts to use. You could use whichever volume mounts you like, but the ones specified above follow JFrog’s suggestions. To configure them correctly, you should run


export ARTIFACTORY_HOME=/var/opt/jfrog/artifactory

prior to starting the Artifactory container, so that the ARTIFACTORY_HOME environment variable is set correctly.

Configuring the Client

Artifactory is now up and running as a container. But there’s a little extra tweaking you need to do to make it accessible on your local machine.

In particular, edit /etc/hosts so that it includes this line:

127.0.0.1 localhost docker-virtual.art.local docker-dev-local2.art.local docker-prod-local2.art.local

Also run:

DOCKER_OPTS="$DOCKER_OPTS --insecure-registry \ docker-virtual.art.local --insecure-registry \
docker-dev-local2.art.local --insecure-registry \
docker-prod-local2.art.local --insecure-registry \
docker-remote.art.local"

This setting tells Docker to treat those registries as insecure and accept the self-signed certificate that the Artifactory container image has built in. (You would want to change this if you were running Artifactory in production, of course.)

After this, restart Docker and you’re all set. Artifactory is now properly configured, and accessible from your browser at http://localhost:8081/artifactory.
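
To confirm the registries are reachable, you can log in to one of the hostnames you just added to /etc/hosts and push a test image. This is a rough sketch: the repository hostnames come from the hosts entries above and depend on how your Docker repositories are laid out in Artifactory, and the credentials are whatever you configured for your instance.

# Log in to the local Docker repository served through Artifactory.
docker login docker-dev-local2.art.local

# Tag a small image against that registry and push it.
docker pull hello-world
docker tag hello-world docker-dev-local2.art.local/hello-world:1.0
docker push docker-dev-local2.art.local/hello-world:1.0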

Why run Artifactory as a Container?

Before wrapping up the tutorial, let’s go over why you might want to run Artifactory as a container in the first place. Consider the following benefits:

  • It’s easy to install. You don’t have to worry about configuring repositories or making your Linux distribution compatible with the Artifactory packages. You can simply install Docker, then run the Artifactory package.
  • It’s easy to get a particular version. Using RPM or Debian packages, pulling a particular version of an app (and making sure the package manager doesn’t automatically try to update it) can be tricky. With the container image, it’s easy to choose whichever version you want.
  • It’s more secure and isolated. Rather than installing Artifactory to your local file system, you keep everything inside a container. That makes removing or upgrading clean and easy.
  • It’s easy to add to a cluster. If you want to make Artifactory part of a container cluster and manage it with Kubernetes or Swarm, you can do that in a straightforward way by running it as a container.

How to Run Artifactory as a Container is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

5 Bintray Security Best Practices

Bintray, JFrog’s software hosting and distribution platform, offers lots of exciting features, like CI integration and REST APIs.

If you’re like me, you enjoy thinking about those features much more than you enjoy thinking about software security. Packaging and distributing software is fun; worrying about the details of Bintray security configurations and access control for your software tends to be tedious (unless security is your thing, of course).

Like any other tool, however, Bintray is only effective in a production environment when it is run securely. That means that, alongside all of the other fun things you can do with Bintray, you should plan and run your deployment in a way that mitigates the risk of unauthorized access, the exposure of private data, and so on.

Below, I explain the basics of Bintray security, and outline strategies for making your Bintray deployment more secure.

Bintray Security Basics

Bintray is a cloud service hosted by JFrog’s data center provider. JFrog promises that the service is designed for security, and hardened against attack. (The company is not very specific about how it mitigates security vulnerabilities for Bintray hosting, but I wouldn’t be either, since one does not want to give potential attackers information about the configuration.) JFrog also says that it restricts employee access to Bintray servers and uses SSH over VPN when employees do access the servers, which adds additional security.

The hosted nature of Bintray means that none of the security considerations associated with on-premises software apply. That makes life considerably easier from the get-go if you’re using Bintray and are worried about security.

Still, there’s more that you can do to ensure that your Bintray deployment is as robust as possible against potential intrusions. In particular, consider adopting the following policies.

Set up an API key for Bintray

Bintray requires users to create a username and password when they first set up an account. You’ll need those when getting started with Bintray.

Once your account is created, however, you can help mitigate the risk of unauthorized access by creating an API key. This allows you to authenticate over the Bintray API without using your username or password. That means that even if a network sniffer is listening to your traffic, your account won’t be compromised.
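
As a quick sketch of what this looks like in practice, the API key is used in place of your password with HTTP basic authentication against the Bintray REST API. The username, key, and endpoint below are placeholders for illustration; consult the Bintray REST API documentation for the exact calls you need.

# List the repositories owned by a subject (user or organization),
# authenticating with a username and API key rather than a password
curl -u myuser:MY_API_KEY https://api.bintray.com/repos/myorg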

Use OAuth for Bintray Authentication

Bintray also supports authentication using the OAuth protocol. That means you can log in using credentials from a GitHub, Twitter or Google+ account.

Chances are that you pay closer attention to one of these accounts (and get notices from the providers about unauthorized access) than you do to your Bintray account. So, to maximize security and reduce the risk of unauthorized access, make sure your Bintray account itself has login credentials that cannot be brute-forced, then log in to Bintray via OAuth using an account from a third-party service that you monitor closely.

Sign Packages with GPG

Bintray supports optional GPG signing of packages. To do this, you first have to configure a key pair in your Bintray profile. For details, check out the Bintray documentation.

GPG signing is another obvious way to help keep your Bintray deployment more secure. It also keeps the users of your software distributions happier, since they will know that your packages are GPG-signed, and therefore, are less likely to contain malicious content.
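
If you haven’t worked with GPG before, a minimal sketch of preparing a key pair with GnuPG looks like the following. The email address is a placeholder, and the ASCII-armored public key output is what you would paste into your Bintray profile.

# Generate a new key pair (interactive prompts for name, email and passphrase)
gpg --gen-key

# Export the public key in ASCII-armored form for your Bintray profile
gpg --armor --export you@example.com

# Export the private key only if you intend to let Bintray sign packages on your behalf
gpg --armor --export-secret-keys you@example.com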

Take Advantage of Bintray’s Access Control

The professional version of Bintray offers granular control over who can download packages. (Unfortunately this feature is only available in that edition.) You can configure access on a per-user or per-organization basis.

While tighter security shouldn’t be the main reason you use granular access control (the feature is primarily designed to help you fine-tune your software distribution), it doesn’t hurt to take advantage of it in order to reduce the risk that certain software becomes available to a user to whom you don’t want to give access.

5 Bintray Security Best Practices is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

Getting Started with AWS Kinesis Streams


In December 2013, Amazon Web Services released Kinesis, a managed, dynamically scalable service for the processing of streaming big data in real-time. Since that time, Amazon has been steadily expanding the regions in which Kinesis is available, and as of this writing, it is possible to integrate Amazon’s Kinesis producer and client libraries into a variety of custom applications to enable real-time processing of streaming data from a variety of sources.

Kinesis acts as a highly available conduit to stream messages between data producers and data consumers. Data producers can be almost any source of data: system or web log data, social network data, financial trading information, geospatial data, mobile app data, or telemetry from connected IoT devices. Data consumers will typically fall into the category of data processing and storage applications such as Apache Hadoop, Apache Storm, Amazon Simple Storage Service (S3), and Elasticsearch.

Understanding Key Concepts in Kinesis

It is helpful to understand some key concepts when working with Kinesis Streams.

Kinesis Stream Shards

The basic unit of scale when working with streams is a shard. A single shard is capable of ingesting up to 1MB or 1,000 PUTs per second of streaming data, and emitting data at a rate of 2MB per second.

Shards scale linearly, so adding shards to a stream will add 1MB per second of ingestion, and emit data at a rate of 2MB per second for every shard added. Ten shards will scale a stream to handle 10MB (10,000 PUTs) of ingress, and 20MB of data egress per second. You choose the number of shards when creating a stream, and it is not possible to change this via the AWS Console once you’ve created a stream.
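
For example, creating a stream with a fixed shard count from the AWS CLI might look like the following sketch; the stream name and shard count are arbitrary placeholders.

# Create a stream with 10 shards, then check its status
aws kinesis create-stream --stream-name example-stream --shard-count 10
aws kinesis describe-stream --stream-name example-stream --query 'StreamDescription.StreamStatus'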

It is possible to dynamically add or remove shards from a stream using the AWS Streams API. This is called resharding. Resharding cannot be done via the AWS Console, and is considered an advanced strategy when working with Kinesis. A solid understanding of the subject is required prior to attempting these operations.

Adding shards essentially splits shards in order to scale the stream, and removing shards merges them. Data is not discarded when adding (splitting) or removing (merging) shards. It is not possible to split a single shard into more than two, nor to merge more than two shards into a single shard at a time.
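
As a rough illustration of what resharding looks like against the Streams API, the sketch below uses the AWS SDK for Java to split one shard and merge two adjacent shards. The stream name, shard IDs, and hash key are placeholders; in practice you would read them from DescribeStream, choose a starting hash key inside the parent shard’s range, and wait for the stream to return to ACTIVE between operations.

import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.MergeShardsRequest;
import com.amazonaws.services.kinesis.model.SplitShardRequest;

public class KinesisReshardExample {
    public static void main(String[] args) {
        AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();

        // Split one shard into two; the new starting hash key must fall inside
        // the parent shard's hash key range (placeholder midpoint shown here).
        kinesis.splitShard(new SplitShardRequest()
                .withStreamName("example-stream")
                .withShardToSplit("shardId-000000000000")
                .withNewStartingHashKey("170141183460469231731687303715884105728"));

        // Merge two adjacent shards back into a single shard.
        kinesis.mergeShards(new MergeShardsRequest()
                .withStreamName("example-stream")
                .withShardToMerge("shardId-000000000001")
                .withAdjacentShardToMerge("shardId-000000000002"));
    }
}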

Adding and removing shards will increase or decrease the cost of your stream accordingly. Per the Amazon Kinesis Streams FAQ, there is a default limit of 10 shards per region. This limit can be increased by contacting Amazon Support and requesting a limit increase. There is no limit to the number of shards or streams in an account.

Kinesis Stream Records

Records are units of data stored in a stream and are made up of a sequence number, partition key, and a data blob. Data blobs are the payload of data contained within a record. The maximum size of a data blob before Base64-encoding is 1MB, and is the upper limit of data that can be placed into a stream in a single record. Larger data blobs must be broken into smaller chunks before putting them into a Kinesis stream.

Partition keys are used to determine which shard in a stream a given record is written to, and allow a data producer to distribute data across shards.

Sequence numbers are unique identifiers for records inserted into a shard. They increase monotonically, and are specific to individual shards.
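
To make these pieces concrete, here is a minimal sketch of writing a single record with the AWS SDK for Java. The stream name, partition key, and JSON payload are placeholder values; the sequence number in the response is assigned by the shard that the partition key hashes to.

import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.PutRecordRequest;
import com.amazonaws.services.kinesis.model.PutRecordResult;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class KinesisPutExample {
    public static void main(String[] args) {
        AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();

        // The partition key is hashed to pick a shard; the data blob is the payload.
        PutRecordRequest request = new PutRecordRequest()
                .withStreamName("example-stream")
                .withPartitionKey("sensor-42")
                .withData(ByteBuffer.wrap(
                        "{\"temperature\": 21.5}".getBytes(StandardCharsets.UTF_8)));

        PutRecordResult result = kinesis.putRecord(request);
        System.out.println("Wrote to " + result.getShardId()
                + " with sequence number " + result.getSequenceNumber());
    }
}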

Amazon Kinesis Offerings

Amazon Kinesis is currently broken into three separate service offerings.

Kinesis Streams is capable of capturing large amounts of data (terabytes per hour) from data producers, and streaming it into custom applications for data processing and analysis. Streaming data is replicated by Kinesis across three separate availability zones within AWS to ensure reliability and availability of your data.

Kinesis Streams is capable of scaling from a single megabyte up to terabytes per hour of streaming data. You must manually provision the appropriate number of shards for your stream to handle the volume of data you expect to process. Amazon helpfully provides a shard calculator when creating a stream to correctly determine this number. Once created, it is possible to dynamically scale up or down the number of shards to meet demand, but only with the AWS Streams API at this time.

It is possible to load data into Streams using a number of methods, including HTTPS, the Kinesis Producer Library, the Kinesis Client Library, and the Kinesis Agent.

By default, data is available in a stream for 24 hours, but can be made available for up to 168 hours (7 days) for an additional charge.
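
Assuming a reasonably recent AWS CLI, extending retention is a single call; the stream name below is a placeholder.

# Extend retention for the stream to the 168-hour (7-day) maximum
aws kinesis increase-stream-retention-period --stream-name example-stream --retention-period-hours 168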

Monitoring is available through Amazon CloudWatch. If you want to add more verbose visualizations of that data, you can use Sumo Logic’s open source Kinesis Connector to fetch data from the Kinesis Stream and send it to the Sumo Logic service. Kinesis Connector is a Java connector that acts as a pipeline between an Amazon Kinesis stream and a Sumo Logic Collection. Data gets fetched from the Kinesis Stream, transformed into a POJO and then sent to the Sumo Logic Collection as JSON.

Kinesis Firehose is Amazon’s data-ingestion product offering for Kinesis. It is used to capture and load streaming data into other Amazon services such as S3 and Redshift. From there, you can load the streams into data processing and analysis tools like Elastic Map Reduce, and Amazon Elasticsearch Service. It is also possible to load the same data into S3 and Redshift at the same time using Firehose.

Firehose can scale to gigabytes of streaming data per second, and allows for batching, encrypting and compressing of data. It should be noted that Firehose will automatically scale to meet demand, which is in contrast to Kinesis Streams, for which you must manually provision enough capacity to meet anticipated needs.

As with Kinesis Streams, it is possible to load data into Firehose using a number of methods, including HTTPS, the Kinesis Producer Library, the Kinesis Client Library, and the Kinesis Agent. Currently, it is only possible to stream data via Firehose to S3 and Redshift, but once stored in one of these services, the data can be copied to other services for further processing and analysis.

Monitoring is available through Amazon Cloudwatch.

Kinesis Analytics is Amazon’s forthcoming product offering that will allow running of standard SQL queries against data streams, and send that data to analytics tools for monitoring and alerting.

This product has not yet been released, and Amazon has not published details of the service as of this date.

Kinesis Pricing

Here’s a pricing guide for the various Kinesis offerings.

Kinesis Streams

There are no setup or minimum costs associated with using Amazon Kinesis Streams. Pricing is based on two factors — shard hours, and PUT Payload Units, and will vary by region. US East (Virginia), and US West (Oregon) are the least expensive, while regions outside the US can be significantly more expensive depending on region.

At present, shard hours in the US East (Virginia) region are billed at $0.015 per hour, per shard. If you have 10 shards, you would be billed at a rate of $0.15 per hour.

PUT Payload Units are counted in 25KB chunks. If a record is 50KB, then you would be billed for two units. If a record is 15KB, you will be billed for a single unit. Billing per 1 million units in the US East (Virginia) region is $0.014.
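
As a rough back-of-the-envelope example using the US East (Virginia) rates above: a stream with 10 shards ingesting 1,000 records per second, each 5KB (so one PUT Payload Unit per record), would cost about 10 shards x 720 hours x $0.015 = $108 per month in shard hours, plus roughly 2,592 million units x $0.014 per million = about $36 per month in PUT Payload Units, for a total of around $144 per month before any extended retention charges.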

Extended Data Retention for up to 7 days in the US East (Virginia) region is billed at $0.020 per shard hour. By default, Amazon Kinesis stores your data for 24 hours. You must enable Extended Data Retention via the Amazon API.

Kinesis Streams is not available in the AWS Free Tier. For more information and pricing examples, see Amazon Kinesis Streams Pricing.

Kinesis Firehose

There are also no setup or minimum costs associated with using Amazon Kinesis Firehose. Pricing is based on a single factor — data ingested per GB. Data ingested by Firehose in the US East (Virginia) region is billed at $0.035 per GB.
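
As a rough example at that rate, ingesting 1TB per day works out to about 1,024 GB x 30 days x $0.035 = roughly $1,075 per month for the Firehose ingestion itself.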

You will also be charged separately for data ingested by Firehose and stored in S3 or Redshift. Kinesis Firehose is not available in the AWS Free Tier. For more information and pricing examples, see Amazon Kinesis Firehose Pricing.

Kinesis vs SQS

Amazon Kinesis is differentiated from Amazon’s Simple Queue Service (SQS) in that Kinesis is used to enable real-time processing of streaming big data. SQS, on the other hand, is used as a message queue to store messages transmitted between distributed application components.

Kinesis provides routing of records using a given key, ordering of records, the ability for multiple clients to read messages from the same stream concurrently, replay of messages from as long as seven days in the past, and the ability for a client to consume records at a later time. Kinesis Streams will not dynamically scale in response to increased demand, so you must provision enough shards ahead of time to meet the anticipated demand of both your data producers and data consumers.

SQS provides for messaging semantics so that your application can track the successful completion of work items in a queue, and you can schedule a delay in messages of up to 15 minutes. Unlike Kinesis Streams, SQS will scale automatically to meet application demand. SQS has lower limits to the number of messages that can be read or written at one time compared to Kinesis, so applications using Kinesis can work with messages in larger batches than when using SQS.

Competitors to Kinesis

There are a number of products and services that provide similar feature sets to Kinesis. Three well-known options are summarized below.

Apache Kafka is a high performance message broker originally developed by LinkedIn, and now a part of the Apache Software Foundation. It is downloadable software written in Scala. There are quite a few opinions as to whether one should choose Kafka or Kinesis, but there are some simple use cases to help make that decision.

If you are looking for an on-prem solution, consider Kafka since you can set up and manage it yourself. Kafka is generally considered more feature-rich and higher-performance than Kinesis, and offers the flexibility that comes with maintaining your own software. On the other hand, you must set up and maintain your own Kafka cluster(s), and this can require expertise that you may not have available in-house.

It is possible to set up Kafka on EC2 instances, but again, that will require someone with Kafka expertise to configure and maintain. If your use case requires a turnkey service that is easy to set up and maintain, or integrate with other AWS services such as S3 or Redshift, then you should consider Kinesis instead.

There are a number of comparisons on the web that go into more detail about features, performance, and limitations if you’re inclined to look further.

Microsoft Azure Event Hubs is Microsoft’s entry in the streaming messaging space. Event Hubs is a managed service offering similar to Kinesis. It supports AMQP 1.0 in addition to HTTPS for reading and writing of messages. Currently, Kinesis only supports HTTPS and not AMQP 1.0. (There is an excellent comparison of Azure Event Hubs vs Amazon Kinesis if you are looking to see a side-by-side comparison of the two services.)

Google Cloud Pub/Sub is Google’s offering in this space. Pub/Sub supports both HTTP access, as well as gRPC (alpha) to read and write streaming data.

At the moment, adequate comparisons of this service to Amazon Kinesis (or Azure Event Hubs) are somewhat lacking on the web. This is expected; Google only released version 1 of this product in June of 2015. Expect to see more sometime in the near future.

Google provides excellent documentation on using the service in their Getting Started guide.

Beginner Resources for Kinesis

Amazon has published an excellent tutorial on getting started with Kinesis in their blog post Building a Near Real-Time Discovery Platform with AWS. It is recommended that you give this a try first to see how Kinesis can integrate with other AWS services, especially S3, Lambda, Elasticsearch, and Kibana.

Once you’ve taken Kinesis for a test spin, you might consider integrating with an external service such as Sumo Logic to analyze log files from your EC2 instances using their Amazon Kinesis Connector. (The code has been published in the Sumo Logic GitHub repository.)

Getting Started with AWS Kinesis Streams is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Steve Tidwell has been working in the tech industry for over two decades, and has done everything from end-user support to scaling a global data ingestion and analysis platform to handle data analysis for some of the largest streaming events on the Web. He is currently Lead Architect for a well-known tech news site, where he plots to take over the world with cloud based technologies from his corner of the office.

Resources

Using Logs to Speed Your DevOps Workflow


Log aggregation and analytics may not be a central part of the DevOps conversation. But effective use of logs is key to leveraging the benefits associated with the DevOps movement and the implementation of continuous delivery.

Why is this the case? How can organizations take advantage of logging to help drive their DevOps workflow? Keep reading for a discussion of how logging and DevOps fit together.

Defining DevOps

If you are familiar with DevOps, you know that it’s a new (well, new as of the past five years or so) philosophy of software development and delivery. It prioritizes the following goals:

  • Constant communication. DevOps advocates the removal of “silos” between different teams. That way, everyone involved in development and delivery, from programmers to QA folks to production-environment admins, can collaborate readily and remain constantly aware of the state of development.
  • Continuous integration and delivery. Rather than making updates to code incrementally, DevOps prioritizes a continuous workflow. Code is updated continuously, the updates are tested continuously, and validated code is delivered to users continuously.
  • Flexibility. DevOps encourages organizations to use whichever tool is the best for solving a particular challenge, and to update tool sets regularly as new solutions emerge. Organizations should not wed themselves to a particular framework.
  • Efficient testing. Rather than waiting until code is near production to test it, DevOps prioritizes testing early in the development cycle. (This strategy is sometimes called “shift-left” testing.) This practice assures that problems are discovered when the time required to fix them is minimal, rather than left until fixing them requires a great deal of backpedaling.

DevOps and Logging

None of the goals outlined above deal with logging specifically. And indeed, because DevOps is all about real-time workflows and continuous delivery, it can be easy to ignore logging entirely when you’re trying to migrate to a DevOps-oriented development strategy.

Yet logs are, in fact, a crucial part of an efficient DevOps workflow. Log collection and analysis help organizations to optimize their DevOps practices in a number of ways.

Consider how log aggregation impacts the following areas.

Communication and Workflows in DevOps

Implementing effective communication across your team is about more than simply breaking down silos and ensuring that everyone has real-time communication tools available. Logs also facilitate efficient communication. They do this in two ways.

First, they maximize visibility into the development process for all members of the team. By aggregating logs from across the delivery pipeline—from code commits to testing logs to production server logs—log analytics platforms assure that everyone can quickly locate and analyze information related to any part of the development cycle. That’s crucial if you want your team members to be able to remain up to speed with the state of development.

Second, log analytics help different members of the organization understand one another. Your QA team is likely to have only a basic ability to interpret raw code commits. Your programmers are not specialists in reading test results. Your admins, who deploy software into production, are experts in only that part of the delivery pipeline.

Log analytics, however, can be used to help any member of the team interpret data associated with any part of the development process. Rather than having to understand raw data from a part of the workflow with which they are not familiar, team members can rely on analytics results to learn what they need quickly.

Continuous visibility

To achieve continuous delivery, you need to have continuous visibility into your development pipeline. In other words, you have to know, on a constant, ongoing basis, what is happening with your code. Otherwise, you’re integrating in the dark.

Log aggregation and analytics help deliver continuous visibility. They provide a rich source of information about your pipeline at all stages of development. If you want to know the current quality and stability of your code, for example, you can quickly pull analytics from testing logs to find that information. If you want to know how your app is performing in production, you can do the same with server logs.

Flexibility from log analytics

In order to switch between development frameworks and programming languages at will, you have to ensure that moving between platforms requires as little change as possible to your overall workflow. Without log aggregation, however, you’ll have to overhaul the way you store and analyze logs every time you add or subtract a tool from your pipeline.

A better approach is to use a log aggregation and analytics platform like Sumo Logic. By supporting a wide variety of logging configurations, Sumo Logic assures that you can modify your development environment as needed, while keeping your logging solution constant.

Faster testing through log analytics

Performing tests earlier in the development cycle leads to faster, more efficient delivery only if you are able to fix the problems discovered by tests quickly. Otherwise, the bugs that your tests reveal will hold up your workflow, no matter how early the bugs are found.

Log analytics are a helpful tool for getting to the bottom of bugs. By aggregating and analyzing logs from across your pipeline, you can quickly get to the source of a problem with your code. Logs help keep the continuous delivery pipeline flowing smoothly, and maximize the value of shift-left testing.

Moving Towards a DevOps Workflow

Log aggregation and analytics may not be the first things that come to mind when you think of DevOps. But effective collection and interpretation of log data is a crucial part of achieving a DevOps-inspired workflow.

Logging on its own won’t get you to continuous delivery, of course. That requires many other ingredients, too. But it will get you closer if you’re not already there. And if you are currently delivering continuously, effective logging can help to make your pipeline even faster and more efficient.

Using Logs to Speed Your DevOps Workflow is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.

Integrate Azure Functions with Sumo Logic Schedule Search


Azure Functions

Azure Functions are event-driven pieces of code that can be used to integrate systems, build APIs, process data, and trigger any action or reaction to events without having to worry about the infrastructure that runs them.
More info on Azure functions can be found here.

 

Sumo Logic Scheduled Search

Scheduled searches are standard saved searches that are executed on a schedule you set. Once configured, scheduled searches run continuously, making them a great tool for continuously monitoring your stack, infrastructure, process, build environment, etc.

Why integrate Sumo Logic Scheduled Search with Azure Function?

The answer is simple: using Sumo Logic’s machine learning and search capabilities, you can monitor and alert on key metrics and KPIs in real time to rapidly identify problems, detect outliers and abnormal behavior using dynamic thresholds, or catch any other event that is important to you. Once you have detected the event for your use case, you can have an Azure function respond to it and take the appropriate action.

More info on real time monitoring using Sumo Logic can be found here.

Three-Step Guide to Integrating an Azure Function with a Sumo Logic Scheduled Search

Case in point: a web app scheduled search detects an outage -> Sumo Logic triggers an Azure Function via a Webhook Connection -> the Azure function executes and takes preventive/corrective action.

Step 1: Create Azure Function and write the preventive/corrective action you want to take.

Step 2: Set up a Sumo Logic Webhook Connection, which will trigger the Azure Function created in Step 1. To set up the connection, follow the steps under ‘Setting up Webhook Connections’.


Step 3: Create a Scheduled Search that will monitor your infrastructure for any outage and call the Webhook Connection created in Step 2.


Example

Sumo Logic, with its machine learning capabilities, can detect an outlier in incoming traffic. Given a series of time-stamped numerical values, using Sumo Logic’s Outlier operator in a query can identify values in a sequence that seem unexpected and should trigger an alert or violation, for example, from a scheduled search. To do this, the Outlier operator tracks the moving average and standard deviation of the value, and detects or alerts when a value differs from the mean by some multiple of the standard deviation, for example, 3 standard deviations.

In this example, we want to trigger an Azure Function whenever there is an outlier in incoming traffic for Azure Web Apps.


 

Step 1: Create an Azure function – for this example I have the following C# script function:

#r "Newtonsoft.Json"
using System;
using System.Net;
using Newtonsoft.Json;

public static async Task<object> Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info($"Webhook was triggered Version 2.0!");
    string jsonContent = await req.Content.ReadAsStringAsync();
    dynamic data = JsonConvert.DeserializeObject(jsonContent);
    log.Info($"Webhook was triggered - TEXT: {data.text}!");
    log.Info($"Webhook was triggered - RAW : {data.raw} !");
    log.Info($"Webhook was triggered - NUM : {data.num} !");
    log.Info($"Webhook was triggered - AGG : {data.agg}!");
   /* Add More Logic to handle an outage */
    return req.CreateResponse(HttpStatusCode.OK, new {
        greeting = $"Hello"
    });
}

Copy and paste the Function URL into a separate notepad; you will need it in Step 2.

Step 2: Create Sumo Logic Webhook Connection.

From your Sumo Logic account: Go to Manage -> Connections, Click Add and then click Webhook.

        1. Provide appropriate name and Description.
        2. Copy paste the Azure Function Url (from step #1) in URL field.
        3. For Payload, add the following JSON.
        4. Test connection, and Save it.
{
    "text": "$SearchName ran over $TimeRange at $FireTime",
    "raw": "$RawResultsJson",
    "num": "$NumRawResults",
    "agg": "$AggregateResultsJson"
}

Step 3: Create a Schedule Search.

Scheduled searches are saved searches that run automatically at specified intervals. When a scheduled search is configured to send an alert, it can be sent to another tool via a Webhook Connection.

From your Sumo Logic account, copy and paste the following search and click Save As:

_sourceCategory=Azure/webapp
| parse regex "\d+-\d+-\d+ \d+:\d+:\d+ (?<s_sitename>\S+) (?<cs_method>\S+) (?<cs_uri_stem>\S+) (?<cs_uri_query>\S+) (?<src_port>\S+) (?<src_user>\S+) (?<client_ip>\S+) (?<cs_user_agent>\S+) (?<cs_cookie>\S+) (?<cs_referrer>\S+) (?<cs_host>\S+) (?<sc_status>\S+) (?<sc_substatus>\S+) (?<sc_win32_status>\S+) (?<sc_bytes>\S+) (?<cs_bytes>\S+) (?<time_taken>\S+)"
| timeslice 5m
| count by _timeslice
| outlier _count
| where _count_violation=1

Note: This assumes you have _sourceCategory set up with Azure/webapp. If you don’t have this source set up, then you can use your own search to schedule it.


  • In the Save Search As dialog, enter a name for the search and an optional description.
  • Click Schedule this search.
  • Choose 60 Minutes for Run Frequency
  • Last 60 Minutes for Time range for scheduled search
  • For Alert condition, choose Send notification only if the condition below is satisfied
  • Number of results Greater than > 0
  • For Alert Type choose Webhook
  • For Webhook, choose the Webhook connection you created in Step 2 from dropdown.
  • Click Save

Depending upon the Run Frequency of your scheduled search, you can check the logs of your Azure function from the portal to confirm it got triggered.

2016-08-25T20:50:36.349 Webhook was triggered Version 2.0!
2016-08-25T20:50:36.349 Webhook was triggered - TEXT: Malicious  Client ran over 2016-08-25 19:45:00 UTC - 2016-08-25 20:45:00 UTC at 2016-08-25 20:45:00 UTC!
2016-08-25T20:50:36.349 Webhook was triggered - RAW :  !
2016-08-25T20:50:36.349 Webhook was triggered - NUM : 90 !
2016-08-25T20:50:36.351 Webhook was triggered - AGG : [{"Approxcount":13,"client_ip":"60.4.192.44"},{"Approxcount":9,"client_ip":"125.34.187"},{"Approxcount":6,"client_ip":"62.64.0.1"},{"Approxcount":6,"client_ip":"125.34.14"}]!
2016-08-25T20:50:36.351 Function completed (Success, Id=72f78e55-7d12-49a9-aa94-8bb347f72672)
2016-08-25T20:52:25  No new trace in the past 1 min(s).
2016-08-25T20:52:49.248 Function started (Id=d22f92cf-0cf7-4ab2-ad0e-fa2f23e25e09)
2016-08-25T20:52:49.248 Webhook was triggered Version 2.0!
2016-08-25T20:52:49.248 Webhook was triggered - TEXT: Errors Last Hour ran over 2016-08-25 19:45:00 UTC - 2016-08-25 20:45:00 UTC at 2016-08-25 20:45:00 UTC!
2016-08-25T20:52:49.248 Webhook was triggered - RAW :  !
2016-08-25T20:52:49.248 Webhook was triggered - NUM : 90 !
2016-08-25T20:52:49.248 Webhook was triggered - AGG : [{"server_errors":39.0}]!
2016-08-25T20:52:49.248 Function completed (Success, Id=d22f92cf-0cf7-4ab2-ad0e-fa2f23e25e09)

Summary

We created a scheduled search which runs every 60 minutes to find an outlier in the last 60 minutes of incoming traffic data. If there is an outlier, the webhook connection is activated and triggers the Azure function.

More Reading

Building Great Alerts

  • Actionable – there should be a playbook for every alert received
  • Directed – there should be an owner to follow the playbook
  • Dynamic – Static thresholds can have “false positives”

Customers Share their AWS Logging with Sumo Logic Use Cases


In June Sumo Dojo (our online community) launched a contest to learn more about how our customers are using Amazon Web Services like EC2, S3, ELB, and AWS Lambda. The Sumo Logic service is built on AWS and we have deep integration into Amazon Web Services. And as an AWS Technology Partner we’ve collaborated closely with AWS to build apps like the Sumo Logic App for Lambda.

So we wanted to see how our customers are using Sumo Logic to do things like collecting logs from CloudWatch to gain visibility into their AWS applications. We thought you’d be interested in hearing how others are using AWS and Sumo Logic, too. So in this post I’ll share their stories along with announcing the contest winner.

The contest narrowed down to two finalists – SmartThings, a Samsung company that operates in the home automation industry and provides access to a wide range of connected devices to create smarter homes that enhance the comfort, convenience, security and energy management for the consumer.

WHOmentors.com, Inc., our second finalist, is a publicly supported scientific, educational and charitable corporation, and fiscal sponsor of Teen Hackathon. The organization is, according to their site, “primarily engaged in interdisciplinary applied research to gain knowledge or understanding to determine the means by which a specific, recognized need may be met.”

At stake was a DJI Phantom 3 Drone. All entrants were awarded a $10 Amazon gift card.


AWS Logging Contest Rules

The Drone winner was selected based on the following criteria:

  • You have to be a user of Sumo Logic and AWS
  • To enter the contest, a comment had to be placed on this thread in Sumo Dojo.
  • The post could not be anonymous – you were required to log in to post and enter.
  • Submissions closed August 15th.

As noted in the Sumo Dojo posting, the winner would be selected based on our own editorial judgment and community reactions to the post (in the form of comments or “likes”) to select one that’s most interesting, useful and detailed.

SmartThings

SmartThings has been working on a feature to enable Over-the-air programming (OTA) firmware updates of Zigbee Devices on users’ home networks. For the uninitiated, Zigbee is an IEEE specification for a suite of high-level communication protocols used to create personal area networks with small, low-power digital radios. See the Zigbee Alliance for more information.

According to one of the firmware engineers at SmartThings, there are a lot of edge cases and potential points of failure for an OTA update including:

  • The Cloud Platform
  • An end user’s hub
  • The device itself
  • Power failures
  • RF interference on the mesh network

Disaster in this scenario would be a user’s device ending up in a broken state. As Vlad Shtibin related:

“Our platform is deployed across multiple geographical regions, which are hosted on AWS. Within each region we support multiple shards, furthermore within each shard we run multiple application clusters. The bulk of the services involved in the firmware update are JVM based application servers that run on AWS EC2 instances.

Our goal for monitoring was to be able to identify as many of these failure points as possible and implement a recovery strategy. Identifying these points is where Sumo Logic comes into the picture. We use a key-value logger with a specific key/value for each of these failure points as well as a correlation ID for each point of the flow. Using Sumo Logic, we are able to aggregate all of these logs by passing the correlation ID when we make calls between the systems.

Using Sumo Logic, we are able to aggregate all of these logs by passing the correlation ID when we make calls between the systems.

We then created a search query (eventually a dashboard) to view the flow of the firmware updates as they went from our cloud down to the device and back up to the cloud to acknowledge that the firmware was updated. This query parses the log messages to retrieve the correlation ID, hub, device, status, firmware versions, etc.. These values are then fed into a Sumo Logic transaction enabling us to easily view the state of a firmware update for any user in the system at a micro level and the overall health of all OTA updates on the macro level.

Depending on which part of the infrastructure the OTA update failed, engineers are then able to dig in deeper into the specific EC2 instance that had a problem. Because our application servers produce logs at the WARN and ERROR level we can see if the update failed because of a timeout from the AWS ElasticCache service, or from a problem with a query on AWS RDS. Having quick access to logs across the cluster enables us to identify issues across our platform regardless of which AWS service we are using.

As Vlad noted, this feature is still being tested and hasn’t been fully rolled out in production yet. “The big takeaway is that we are much more confident in our ability to identify updates, triage them when they fail and ensure that the feature is working correctly because of Sumo Logic.”

WHOmentors.com

WHOmentors.com, Inc. is a nonprofit scientific research organization and the 501(c)(3) fiscal sponsor of Teen Hackathon. To facilitate their training in languages like Java, Python, and Node.js, each individual participant begins with the Alexa Skills Kit, a collection of self-service application program interfaces (APIs), tools, documentation and code samples that make it fast and easy for teens to add capabilities for use with Alexa-enabled products such as the Echo, Tap, or Dot.

According to WHOmentors.com CEO Rauhmel Fox, “The easiest way to build the cloud-based service for a custom Alexa skill is by using AWS Lambda, an AWS offering that runs inline or uploaded code only when it’s needed and scales automatically, so there is no need to provision or continuously run servers.

With AWS Lambda, WHOmentors.com pays only for what it uses. The corporate account is charged based on the number of requests for created functions and the time the code executes. While the AWS Lambda free tier includes one million free requests per month and 400,000 gigabyte (GB)-seconds of compute time per month, it becomes a concern when the students create complex applications that tie Lambda to other expensive services or the size of their Lambda programs is too large.

Ordinarily, someone would be assigned to use Amazon CloudWatch to monitor and troubleshoot the serverless system architecture and multiple applications using existing AWS system, application, and custom log files. Unfortunately, there isn’t a central dashboard to monitor all created Lambda functions.

With the integration of a single Sumo Logic collector, WHOmentors.com can automatically route all Amazon CloudWatch logs to the Sumo Logic service for advanced analytics and real-time visualization using the Sumo Logic Lambda functions on Github.”

Using the Sumo Logic Lambda Functions

“Instead of a “pull data” model, the “Sumo Logic Lambda function” grabs files and sends them to Sumo Logic web application immediately. Their online log analysis tool offers reporting, dashboards, and alerting as well as the ability to run specific advanced queries as needed.

The real-time log analysis combination of the “SumoLogic Lambda function” assists me to quickly catch and troubleshoot performance issues such as the request rate of concurrent executions that are either stream-based event sources, or event sources that aren’t stream-based, rather than having to wait hours to identify whether there was an issue.

I am most concerned about AWS Lambda limits (i.e., code storage) that are fixed and cannot be changed at this time. By default, AWS Lambda limits the total concurrent executions across all functions within a given region to 100. Why? The default limit is a safety limit that protects the corporate from costs due to potential runaway or recursive functions during initial development and testing.

As a result, I can quickly determine the performance of any Lambda function and clean up the corporate account by removing Lambda functions that are no longer used or figure out how to reduce the code size of the Lambda functions that should not be removed such as apps in production.”

The biggest relief for Rauhmel is he is able to encourage the trainees to focus on coding their applications instead of pressuring them to worry about the logs associated with the Lambda functions they create.

And the Winner of AWS Logging Contest is…

Just as at the end of an epic World Series battle between two MLB teams, you sometimes wish both could be declared the winner. Alas, there can only be one. We looked closely at the use cases, which were very different from one another. Weighing factors like the breadth of usage of the Sumo Logic and AWS platforms added to the drama. While SmartThings uses Sumo Logic broadly to troubleshoot and prevent failure points, WHOmentors.com’s use case is specific to AWS Lambda. But we couldn’t ignore the cause of helping teens learn to write code in popular programming languages, and building skills that may one day lead them to a job.

Congratulations to WHOmentors.com. Your Drone is on its way!


Using HTTP Request Builders to Create Repeatable API Workflows


As an API Engineer, you’ve probably spent hours carefully considering how your API will be consumed by client software, what data you are making available at which points within particular workflows, and strategies for handling errors that bubble up when a client insists on feeding garbage to your API. You’ve written tests for the serializers and expected API behaviors, and you even thought to mock those external integrations so you can dive right into the build. As you settle in for a productive afternoon of development, you notice a glaring legacy element in your otherwise modern development setup:

  • Latest and greatest version of your IDE: Check.
  • Updated compiler and toolchain: Installed.
  • Continuous Integration: Ready and waiting to put your code through its paces.
  • That random text file containing a bunch of clumsily ordered cURL commands.

…one of these things is not like the others.

It turns out we’ve evolved…and so have our API tools

Once upon a time, that little text file was state-of-the-art in API development. You could easily copy-paste commands into a terminal and watch your server code spring into action; however, deviating from previously built requests required careful editing. Invariably, a typo would creep into a crucial header declaration, or revisions to required parameters were inconsistently applied, or perhaps a change in HTTP method resulted in a subtly different API behavior that went unnoticed release over release.

HTTP Request Builders were developed to take the sting out of developing and testing HTTP endpoints by reducing the overhead in building and maintaining test harnesses, allowing you to get better code written with higher quality. Two of the leaders in the commercial space are Postman and Paw, and they provide a number of key features that will resonate with those who either create or consume APIs:

  • Create HTTP Requests in a visual editor: See the impact of your selected headers and request bodies on the request before you send it off to your server. Want to try an experiment? Toggle parameters on or off with ease or simply duplicate an existing request and try two different approaches!
  • Organize requests for your own workflow…or collaborate with others: Create folders, reorder, and reorganize requests to make it painless to walk through sequential API calls.
  • Test across multiple environments: Effortlessly switch between server environments or other variable data without having to rewrite every one of your requests.
  • Inject dynamic data: Run your APIs as you would expect them to run in production, taking data from a previous API as the input to another API.

From here, let’s explore the main features of HTTP Request Builders via Paw and show how those features can help make your development and test cycles more efficient. Although Paw will be featured in this post, many of these capabilities exist in other HTTP Builder packages such as Postman.

How to Streamline your HTTP Request Pipeline

Command-line interfaces are great for piping together functionality in one-off tests or when building out scripts for machines to follow, but quickly become unwieldy when you have a need to make sweeping changes to the structure or format of an API call. This is where visual editors shine, giving the human user an easily digestible view of the structure of the HTTP request, including its headers, querystring and body so that you can review and edit requests in a format that puts the human first. Paw’s editor is broken up into three areas. Working from left to right, these areas are:

  • Request List: Each distinct request in your Paw document gets a new row in this panel and represents the collection of request data and response history associated with that specific request.
  • HTTP Request Builder: This is the primary editor for constructing HTTP requests. Tabs within this panel allow you to quickly switch between editing headers, URL parameters, and request bodies. At the bottom of the panel is the code generator, allowing you to quickly spawn code for a variety of languages including Objective-C, Swift, Java, and even cURL!
  • HTTP Exchange: This panel reflects the most recent request and associated response objects returned by the remote server. This panel also offers navigation controls for viewing historical requests and responses.

Figure 1. Paw Document containing three sample HTTP Requests and the default panel arrangement.

As you work through building up the requests that you use in your API workflows, you can easily duplicate, edit, and execute a request all in a matter of a few seconds. This allows you to easily experiment with alternate request formats or payloads while also retaining each of your previous versions. You might even score some brownie points with your QA team by providing a document with templated requests they can use to kick-start their testing of your new API!

Organize Request Lists for Yourself and Others

The Request List panel also doubles as the Paw document’s organization structure. As you add new requests, they will appear at the bottom of the list; however, you can customize the order by dragging and dropping requests, or create folders to group related requests together. The order and names attached to each request help humans understand what the request does, but in no way impact the actual requests made of the remote resource. Use these organization tools to make it easy for you to run through a series of tests or to show others exactly how to replicate a problem.

If the custom sort options don’t quite cover your needs, or if your document starts to become too large, Sort and Filter bars appear at the bottom of the Request List to help you focus only on the requests you are actively working with. Group by URL or use the text filter to find only those requests that contain the URL you are working with.

Figure 2. Request List panel showing saved requests, folder organization, and filtering options.

Dealing with Environments and Variables

Of course, many times you want to be able to test out behaviors across different environments — perhaps your local development instance, or the development instance updated by the Continuous Integration service. Or perhaps you may even want to compare functionality to what is presently available in production.

It would be quite annoying to have to edit each of your requests and change the URL from one host to another. Instead, let Paw manage that with a quick switch in the UI.

Figure 3. Paw’s Environment Switcher changes variables with just a couple of clicks.

The Manage Environments view allows you to create different “Domains” for related kinds of variables, and add “Environments” as necessary to handle permutations of these values:

Figure 4. Paw’s Environment Editor shows all Domains and gives easy access to each Environment.

This allows you flexibility in adjusting the structure of a payload with a few quick clicks instead of having to handcraft an entirely new request. The Code Generator pane at the bottom of the Request Builder pane updates to show you exactly how your payload changes:

Figure 5. Paw Document showing the rebuilt request based on the Server Domain’s Environment.

One of the most common setups is to have a Server Domain with Environments for the different deployed versions of code. From there, you could build out a variable for the Base URL, or split it into multiple variables so that the protocol could be changed, independent of the host address — perhaps in order to quickly test whether HTTP to HTTPS redirection still works after making changes to a load balancer or routing configuration. Paw’s variables can even peer into other requests and responses and automatically rewrite successive APIs.

Many APIs require some form of authentication to read or write privileged material. Perhaps the mechanism is something simple like a cookie or authentication header, or something more complex like an oAuth handshake. Either way, there is a bit of data in the response of one API that should be included in the request to a subsequent API. Paw variables can parse data from prior requests and prior responses, dynamically updating subsequent requests:

Figure 6. Paw Document revealing the Response Parsed Body Variable extracting data from one request and injecting it into another.

In the case shown above, we’ve set a “Response parsed body” variable as a Querystring parameter to a successive API, specifically grabbing the UserId key for the post at index 0 in the Top 100 Posts Request. Any indexable path in the response of a previous request is available in the editor. You may need to extract a session token from the sign-in API and apply it to subsequent authenticated-only requests. Setting this variable gives you the flexibility to change server environments or users, execute a sign-in API call, then proceed to hit protected endpoints in just a few moments rather than having to make sweeping edits to your requests.

Request Builders: Fast Feedback, Quick Test Cycles

HTTP Request Builders help give both API developers and API consumers a human-centric way of interacting with what is primarily a machine-to-machine interface. By making it easy to build and edit HTTP requests, and providing mechanisms to organize, sort, and filter requests, and allowing for fast or automatic substitution of request data, working with any API becomes much easier to digest. The next time someone hands you a bunch of cURL commands, take a few of those minutes you’ve saved from use of these tools, and help a developer join us here in the future!

Using HTTP Request Builders to Create Repeatable API Workflows is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Bryan Musial (@BKMu) is a Full-Stack Senior Software Engineer with the San Francisco-based startup, Tally (www.meettally.com), working to make managing your credit cards automatic, simple, and secure. Previously, Bryan worked for the Blackboard Mobile team where he built and shipped iOS and Android applications used by millions of students and teachers every day.

Integrated Container Security Monitoring with Twistlock


Twistlock provides dev-to-production security for the container environment. More specifically, the Twistlock container security suite offers four major areas of functionality:

  • Vulnerability management that inspects the full stack of components in a container image and allows you to eradicate vulnerabilities before deployment.
  • Compliance which enforces compliance with industry best practices and configuration policies, with 90+ built-in settings covering the entire CIS Docker benchmark.
  • Access control that applies granular policies to managing user access to Docker, Swarm, and Kubernetes APIs.  This capability builds on Twistlock’s authorization plugin framework that’s been shipping as a part of Docker itself since 1.10.
  • Runtime defense, which combines static analysis, machine learning, Twistlock Labs research, and active threat feeds to protect container environments at scale, without human intervention.

Integration with Sumo Logic

Because Twistlock has a rich set of data about the operations of a containerized environment, integrating with powerful operational analytics tools like Sumo Logic is a natural fit.  In addition to storing all event data in its own database, Twistlock also writes events out via standard syslog messages so it’s easy to harvest and analyze using tools like Sumo Logic.

Setting up integration is easy, simply follow the standard steps for collecting logs from a Linux host that Sumo Logic has already automated.  After a collector is installed on a host Twistlock is protecting, configure Sumo Logic to harvest the log files from /var/lib/twistlock/log/*.log:


In this case, the log collection is named “twistlock_logs” to make it easy to differentiate between standard Linux logs.

Note that Twistlock produces 2 main types of logs, aligned with our distributed architecture as illustrated below.

  • Console logs track centralized activities such as rule management, configuration changes, and overall system health.
  • Defender logs are produced on each node that Twistlock protects and are local in scope.  These logs track activities such as authentication to the local node and runtime events that occur on the node.


Once log files are collected, searching, slicing, and visualizing data is done using the standard Sumo Logic query language and tools.  Here’s a simple example of just looking across all Twistlock logs using the _sourceCategory="twistlock_logs" query:

Of course, the real power of a tool like Sumo Logic is being able to easily sort, filter, and drill down into log data.  So, let’s assume you want to drill down a little further and look for process violations that Twistlock detected on a specific host.  This is a common incident response scenario and this illustrates the power of Twistlock and Sumo Logic working together to identify the anomaly and to understand it more completely.  To do this, we simply add a little more logic to the query:

(_sourceCategory=twistlock_logs (Process violation)) AND _sourcehost = "cto-stable-ubuntu.c.cto-sandbox.internal"

Perhaps you’re looking for a specific action that an attacker took, like running netcat, something that should likely never happen in your production containers.  Again, because of Twistlock’s runtime defense, this anomaly is automatically detected as soon as it occurs without any human having to create a rule to do so.  Because Twistlock understands the entrypoint on the image, how the container was launched via Docker APIs, and builds a predictive runtime model via machine learning, it can immediately identify the unexpected process activity.  Once this data is in Sumo Logic, it’s easy to drill down even further and look for it:

(_sourceCategory=twistlock_logs (Process violation)) AND _sourcehost = "cto-stable-ubuntu.c.cto-sandbox.internal" AND nc


Of course, with Sumo Logic, you could also build much more sophisticated queries, for example, looking for any process violation that occurs on hosts called prod-* and is not caused by a common shell launching.  Even more powerfully, you can correlate and visualize trends across multiple hosts.  To take our example further, imagine we wanted to not just look for a specific process violation, but instead to visualize trends over time.  The Twistlock dashboard provides good basic visualizations for this, but if you want to have full control of slicing and customizing the views, that’s where a tool like Sumo Logic really shines.

Here’s an example of us looking for process violations over time, grouped in 5 minute timeslices, and tracked per host, then overlaid on a line chart:

_sourceCategory=twistlock_logs (Process violation) | timeslice 5m | count as count by _timeslice, _sourceHost | transpose row _timeslice column _sourceHost

Of course, this just touches on some of the capabilities once Twistlock’s container security data is in a powerful tool like Sumo Logic.  You may also build dashboards to summarize and visualize important queries, configure specific views of audit data to be available to only specific teams, and integrate container security event alerting into your overall security alert management process.  Once the data is in, the possibilities are limitless.

Create a dashboard

Here are the steps to create a dashboard in Sumo Logic to display and analyze some of this data:

  • Login to Sumo Logic
  • Create a new search
  • Use the following query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
    • _sourceCategory=twistlock/example (violation) | timeslice 24h | count by _timeslice | order by _timeslice desc
    • Run the query and select the Aggregates tab
    • You should be looking at a list of dates and their total count of violations

  • Select the single value viewer from the Aggregate Tab’s toolbar

  • Click the “Add to dashboard” button on the right hand side to start creating a new dashboard by adding this chart as a panel
  • Create the new panel
    • Enter a title, for example: Violations (last 24 hours)
    • Enter a new dashboard name, for example: Overview Dashboard

  • Click Add

  • As an optional step you can set coloring ranges for these values. This will help you quickly identify areas that need attention.
    • When editing the value, choose Colors by Value Range… from the cog in the Aggregate Tab’s toolbar

    • Enter 1 – 30 and choose green for the color
    • Click Save
    • Enter 31 – 70 and choose orange for the color
    • Enter 71 – (leave blank) and choose red for the color
    • Click Save

  • Create single value viewers using the same process as above for each of the queries below: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
    1. Network Violations
      • _sourceCategory=twistlock/example (Network violation) | timeslice 24h | count by _timeslice | order by _timeslice desc
    2. Process Violations
      • _sourceCategory=twistlock/example (Process violation) | timeslice 24h | count by _timeslice | order by _timeslice desc
  • Your dashboard should look similar to this

  • Create another chart using the same process as above but this time use the search query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
  • _sourceCategory="twistlock/example" (violation) | timeslice 1d | count by _timeslice | order by _timeslice asc
  • Run the query and select the Aggregates tab
  • You should be looking at a list of dates and their total number of violations
  • Select the area chart from the Aggregate Tab's toolbar
  • Click the “Add to dashboard” button on the right hand side to start creating a new dashboard by adding this chart as a panel
  • Create the new dashboard panel
    • Enter a title, for example: Violations by day
    • Select Overview Dashboard as the dashboard

    • Click Add
    • Resize the area chart so it extends the full width of the dashboard by clicking and dragging on the bottom right corner of the panel
    • Your dashboard should now look similar to the one below

  • Use the following query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
    • _sourceCategory="twistlock/example" (Denied) | parse "The command * * for user * by rule *'" as command, action, user, rulename | count by user | order by user asc
    • Run the query and select the Aggregates tab
    • You should be looking at a list of users and their total count of violations

  • Select the column chart icon from the Aggregate Tab's toolbar

  • Click the “Add to dashboard” button on the right hand side to start creating a new dashboard by adding this chart as a panel
  • Create the new panel
    • Enter a title, for example: Top Users with Violations
    • Select Overview Dashboard as the dashboard

  • Click Add

  • Create another chart using the same process as above but this time use the search query: (Replace twistlock/example with the tags you used when creating the Twistlock collector)
  • _sourceCategory="twistlock/example" (violation) | parse ".go:* * violation " as linenumber, violation_type | count by violation_type | order by _count desc
  • Create the new panel
    • Enter a title, for example: Top Violations by Type
    • Select Overview Dashboard as the dashboard

  • Click Add
  • Your completed dashboard should now look similar to the one below

In summary, integrating Twistlock and Sumo Logic gives users powerful and automated security protection for containers and provides advanced analytic capabilities to fully understand and visualize that data in actionable ways.  Because both products are built around open standards, integration is easy and users can begin reaping the benefits of this combined approach in minutes.

5 Log Monitoring Moves to Wow Your Business Partner

Looking for some logging moves that will impress your business partner? In this post, we'll show you a few. But first, a note of caution:

If you want to wow your business partner, make a visiting venture capitalist's jaw drop, or knock the socks off of a few stockholders, you could do it with something that has a lot of flash and not much more. Or you could show them something with real, lasting substance that will make a difference in your company's bottom line. We've all seen business presentations filled with flashy fireworks, and we've all seen how quickly those fireworks fade away.

Around here, though, we believe in delivering value—the kind that stays with your organization, and gives it a solid foundation for growth. So, while the logging moves that we’re going to show you do look good, the important thing to keep in mind is that they provide genuine, substantial value—and discerning business partners and investors (the kind that you want to have in your corner) will recognize this value quickly.

Why Is Log Monitoring Useful?

What value should logs provide? Is it enough just to accumulate information so that IT staff can pick through it as required? That’s what most logs do, varying mostly in the amount of information and the level of detail. And most logs, taken as raw data, are very difficult to read and interpret; the most noticeable result of working with raw log data, in fact, is the demand that it puts on IT staff time.

5 Log Monitoring Steps to Success

Most of the value in logs is delivered by means of systems for organizing, managing, filtering, analyzing, and presenting log data. And needless to say, the best, most impressive, most valuable logging moves are those which are made possible by first-rate log management. They include:

  • Quick, on-the-spot, easy-to-understand analytics. Pulling up instant, high-quality analytics may be the most impressive move that you can make when it comes to logging, and it is definitely one of the most valuable features that you should look for in any log management system. Raw log data is a gold mine, but you need to know how to extract and refine the gold. A high-quality analytics system will extract the data that’s valuable to you, based on your needs and interests, and present it in ways that make sense. It will also allow you to quickly recognize and understand the information that you’re looking for.
  • Monitoring real-time data. While analysis of cumulative log data is extremely useful, there are also plenty of situations where you need to see what is going on right at the moment. Many of the processes that you most need to monitor (including customer interaction, system load, resource use, and hostile intrusion/attack) are rapid and transient, and there is no substitute for a real-time view into such events. Real-time monitoring should be accompanied by the capacity for real-time analytics. You need to be able to both see and understand events as they happen.
  • Fully integrated logging and analytics. There may be processes in software development and operations which have a natural tendency to produce integrated output, but logging isn’t one of them. Each service or application can produce its own log, in its own format, based on its own standards, without reference to the content or format of the logs created by any other process. One of the most important and basic functions that any log management system can perform is log integration, bringing together not just standard log files, but also event-driven and real-time data. Want to really impress partners and investors? Bring up log data that comes from every part of your operation, and that is fully integrated into useful, easily-understood output.
  • Drill-down to key data. Statistics and aggregate data are important; they give you an overall picture of how the system is operating, along with general, system-level warnings of potential trouble. But the ability to drill down to more specific levels of data (geographic regions, servers, individual accounts, specific services and processes) is what allows you to make use of much of that system-wide data. It's one thing to see that your servers are experiencing an unusually high level of activity, and quite another to drill down and see an unusual spike in transactions centered around a group of servers in a region known for high levels of online credit card fraud. Needless to say, integrated logging and scalability are essential when it comes to drill-down capability.
  • Logging throughout the application lifecycle. Logging integration includes integration across time, as well as across platforms. This means combining development, testing, and deployment logs with metrics and other performance-related data to provide a clear, unified, in-depth picture of the application’s entire lifecycle. This in turn makes it possible to look at development, operational, and performance-related issues in context, and see relationships which might not be visible without such cross-system, full lifecycle integration.

Use Log Monitoring to Go for the Gold

So there you have it—five genuine, knock-’em-dead logging moves. They’ll look very impressive in a business presentation, and they’ll tell serious, knowledgeable investors that you understand and care about substance, and not just flash. More to the point, these are logging capabilities and strategies which will provide you with valuable (and often crucial) information about the development, deployment, and ongoing operation of your software.

Logs do not need to be junk piles of unsorted, raw data. Bring first-rate management and analytics to your logs now, and turn those junk piles into gold.

5 Log Monitoring Moves to Wow Your Business Partner is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Michael Churchman started as a scriptwriter, editor, and producer during the anything-goes early years of the game industry. He spent much of the ‘90s in the high-pressure bundled software industry, where the move from waterfall to faster release was well under way, and near-continuous release cycles and automated deployment were already de facto standards. During that time he developed a semi-automated system for managing localization in over fifteen languages. For the past ten years, he has been involved in the analysis of software development processes and related engineering management issues.

Setting Up a Docker Environment Using Docker Compose

Docker Compose is a handy tool for solving one of the biggest inherent challenges posed by container-based infrastructure. That challenge is this: While Docker containers provide a very easy and convenient way to make apps portable, they also abstract your apps from the host system — since that is the whole point of containers. As a result, connecting one container-based app to another — and to resources like data storage and networking — is tricky.

If you’re running a simple container environment, this isn’t a problem. A containerized web server that doesn’t require multiple containers can exist happily enough on its own, for example.

But if life were always simple, you wouldn’t need containers in the first place. To do anything serious in your cloud, you will probably want your containers to be able to interact with one another and to access system resources.

That’s where Docker Compose comes in. Compose lets you define the containers and services that need to work together to power your application. Compose allows you to configure everything in plain text files, then use a simple command-line utility to control all of the moving parts that make your app run.

Another way to think of Compose is as an orchestrator for a single app. Just as Swarm and Kubernetes automate management of all of the hundreds or thousands of containers that span your data center, Compose automates a single app that relies on multiple containers and services.

Using Docker Compose

Setting up a Docker environment using Compose entails multiple steps. But if you have any familiarity with basic cloud configuration — or just text-based interfaces on Unix-like operating systems — Compose is pretty simple to use.

Deploying the tool involves three main steps. First, you create a Dockerfile to define your app. Second, you create a Compose configuration file that defines app services. Lastly, you fire up a command-line tool to start and control the app.

I’ll walk through each of these steps below.

Step 1. Make a Dockerfile

This step is pretty straightforward if you are already familiar with creating Docker images. Using any text editor, open up a blank file and define the basic parameters for your app.

The Dockerfile contents will vary depending on your situation, but the format should basically look like this:


FROM [ name of the base Docker image you're using ]
ADD . [ /path/to/workdir ]
WORKDIR [ directory where your code lives ]
RUN [ command(s) to run to set up app dependencies ]
CMD [ command you'll use to call the app ]

Save your Dockerfile. Then build the image by calling docker build -t [ image name ] . (note the trailing dot, which tells Docker to use the current directory as the build context).
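
To make the template concrete, here is a minimal sketch for a hypothetical Python app; the base image, file names, and commands are assumptions you would swap for your own:

# Start from an official Python base image
FROM python:3
# Copy the project into the image
ADD . /app
# Work from the directory where the code lives
WORKDIR /app
# Install the app's dependencies
RUN pip install -r requirements.txt
# Command used to launch the app
CMD ["python", "app.py"]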

Step 2. Define Services

If you can build a Dockerfile, you can also define app services. Like the first step, this one is all about filling in fields in a text file.

You’ll want to name the file docker-compose.yml and save it in your project directory, alongside your Dockerfile. The contents of docker-compose.yml should look something like this:
version: '2'
services:
  [ name of a service ]:
    build: [ code directory ]
    ports:
      - "[ tcp and udp ports ]"
    volumes:
      - .:[ /path/to/code directory ]
    depends_on:
      - [ name of a dependency service ]
  [ name of another service ]:
    image: [ image name ]

You can define as many services, images and dependencies as you need. For a complete overview of the values you can include in your Compose config file, check out Docker’s documentation.
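
For illustration, a filled-in docker-compose.yml for a hypothetical web service backed by Redis might look like the following; the service names, ports, and paths are assumptions:

version: '2'
services:
  web:
    # Build the image from the Dockerfile in the current directory
    build: .
    ports:
      - "5000:5000"
    volumes:
      # Mount the project directory into the container at /app
      - .:/app
    depends_on:
      - redis
  redis:
    # Use the official Redis image from Docker Hub
    image: redis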

Don’t forget that another cool thing you can do with Compose is configure log collection using Powerstrip and the Sumo Logic collector container.

Step 3. Run the app

Now comes the really easy part. With your container image built and the app services defined, you just need to turn the key and get things running.

You do that with a command-line utility called (simply enough) docker-compose.

The syntax is pretty simple, too. To start your app, call docker-compose up from within your project directory.

You don’t need any arguments (although you can supply some if desired; see below for more on that). As long as your Dockerfile and Compose configuration file are in the working directory, Compose will find and parse them for you.

Even sweeter, Compose is smart enough to build dependencies automatically, too.

After being called, docker-compose will respond with some basic output telling you what it is doing.

To get the full list of arguments for docker-compose, call it with the help flag:

docker-compose --help

When you’re all done, just run (you guessed it!) docker-compose down to turn off the app.

Some Docker Compose Tips

If you’re just getting started with Compose, knowing about a few of the tool’s quirks ahead of time can save you from confusion.

One is that there are multiple ways to start an app with Compose. I covered docker-compose up above. Another option is docker-compose run.

Both of these commands do the same general thing — start your app — but run is designed for starting a one-time instance, which can be handy if you’re just testing out your build. up is the command you want for production scenarios.

There’s also a third option: docker-compose start. This call only restarts containers that already exist. Unlike up, it doesn’t build the containers for you.
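
As a quick reference, here is how those commands might be used in practice; the service name web is a hypothetical stand-in for one of the services defined in your docker-compose.yml:

# Build (if needed) and start every service defined in docker-compose.yml
docker-compose up
# Run a one-off instance of a single service, handy for testing a build
docker-compose run web
# Restart containers that already exist, without rebuilding them
docker-compose start
# Stop and remove the containers when you're finished
docker-compose down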

Another quirk: You may find that Compose seems to hang or freeze when you tell it to shut down an app using docker-compose stop. Panic not! Compose is not hanging. It’s just waiting patiently for the container to shut down in response to a SIGTERM system call.

If the container doesn’t shut down within ten seconds, Compose will hit it with SIGKILL, which should definitely shut it down. (If your containers aren’t responding to standard SIGTERM requests, by the way, you may want to read more about how Docker processes signals to figure out why.)

That’s Compose in a nutshell — or about a thousand words, at least. For all of the nitty-gritty details, you can refer to Docker’s Compose reference guide.

Setting Up a Docker Environment Using Docker Compose is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Hemant Jain is the founder and owner of Rapidera Technologies, a full service software development shop. He and his team focus a lot on modern software delivery techniques and tools. Prior to Rapidera he managed large scale enterprise development projects at Autodesk and Deloitte.

Getting the Most Out of SaltStack Logs

SaltStack, also known simply as Salt, is a handy configuration management platform. Written in Python, it’s open source and allows ITOps teams to define “Infrastructure as Code” in order to provision and orchestrate servers.

But SaltStack’s usefulness is not limited to configuration management. The platform also generates logs, and like all logs, that data can be a useful source of insight in all manner of ways.

This article provides an overview of SaltStack logging, as well as a primer on how to analyze SaltStack logs with Sumo Logic.

Where does SaltStack store logs?

The first thing to understand is where SaltStack logs live. The answer to that question depends on where you choose to place them.

You can set the log location by editing your SaltStack configuration file on the salt-master. By default, this file should be located at /etc/salt/master on most Unix-like systems.

The variable you’ll want to edit is log_file. If you want to store logs locally on the salt-master, you can simply set this to any location on the local file system, such as /var/log/salt/salt_master.
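
For example, a local setup in /etc/salt/master might look like the sketch below; the path and log level shown are assumptions:

# Store master logs on the local file system
log_file: /var/log/salt/salt_master
# Only write warnings and above to the log file
log_level_logfile: warning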

Storing Salt logs with rsyslogd

If you want to centralize logging across a cluster, however, you will benefit by using rsyslogd, a system logging tool for Unix-like systems. With rsyslogd, you can configure SaltStack to store logs either remotely or on the local file system.

For remote logging, set the log_file parameter in the salt-master configuration file according to the format:

<file|udp|tcp>://<host|socketpath>:<port-if-required>/<log-facility>

For example, to connect to a server named mylogserver (whose name should, of course, be resolvable by your local network’s DNS) via UDP on port 2099, you’d use a line like this one:

log_file: udp://mylogserver:2099

Colorizing and bracketing your Salt logs

Another useful configuration option that SaltStack supports is custom colorization of console logs. This can make it easier to read the logs by separating high-priority events from less important ones.

To set colorization, you change the log_fmt_console parameter in the Salt configuration file. The colorization options available are:

'%(colorlevel)s' # log level name colorized by level
'%(colorname)s' # colorized module name
'%(colorprocess)s' # colorized process number
'%(colormsg)s' # log message colorized by level

Log files themselves can’t be colorized, and that wouldn’t be as useful anyway, since the program you use to read the log file may not support color output. They can, however, be padded and bracketed to distinguish different event levels. The parameter you’ll set here is log_fmt_logfile, and the options supported include:

'%(bracketlevel)s' # equivalent to [%(levelname)-8s]
'%(bracketname)s' # equivalent to [%(name)-17s]
'%(bracketprocess)s' # equivalent to [%(process)5s]
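
Putting these together, the format settings in /etc/salt/master might look like the sketch below; the exact layouts are assumptions, so adjust the placeholders to taste:

# Colorize console output by log level
log_fmt_console: '%(colorlevel)s %(colormsg)s'
# Pad and bracket log file entries so the levels line up
log_fmt_logfile: '%(asctime)s,%(msecs)03d %(bracketname)s %(bracketlevel)s %(message)s'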

How to Analyze SaltStack logs with Sumo Logic

So far, we’ve covered some handy things to know about configuring SaltStack logs. You’re likely also interested in how you can analyze the data in those logs. Here, Sumo Logic, which offers easy integration with SaltStack, is an excellent solution.

Sumo Logic has an official SaltStack formula, which is available from GitHub. To install it, you can use GitFS to make the formula available to your system, but the simpler approach (for my money, at least) is simply to clone the formula repository in order to save it locally. That way, changes to the formula won’t break your configuration. (The downside, of course, is that you also won’t automatically get updates to the formula, but you can always update your local clone of the repository if you want them.)

To set up the Sumo Logic formula, run these commands:

mkdir -p /srv/formulas # or wherever you want to save the formula
cd /srv/formulas
git clone https://github.com/saltstack-formulas/sumo-logic-formula.git

Then simply edit your configuration by adding the new directory to the file_roots parameter, like so:

file_roots:
  base:
    - /srv/salt
    - /srv/formulas/sumo-logic-formula

Restart your salt-master and you’re all set. You’ll now be able to analyze your SaltStack logs from Sumo Logic, along with any other logs you work with through the platform.
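
If you also want the formula applied automatically, you would typically reference its state in your top file and supply whatever pillar values the formula's README requires. A minimal sketch, assuming the formula exposes a state named sumo-logic, might look like this:

# /srv/salt/top.sls
base:
  '*':
    # State name is an assumption; check the formula's README for the exact name and required pillar data
    - sumo-logic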

Getting the Most Out of SaltStack Logs is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.

About the Author

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO.
