On modern enterprise networks, 100% uptime has become table stakes. Most organizations can no longer rely on a single circuit for internet connectivity. We look to carrier circuits for the redundancy and guaranteed uptime that our organizations need. When carrier outages occur, network engineers find themselves in a hot seat they can do little about. However, if we do our homework, we can improve our organizations' uptime by taking care as we provision connectivity.
Causes of carrier outages
Most network engineers have experienced the rapid-fire text messages and flurry of questions when the Internet stops working. It’s important to understand the upstream causes of these outages so we can work with our carriers to mitigate them. The first, and most common, is the dreaded fiber cut. Regardless of the cause, a fiber cut in the wrong location can have widespread impacts. Second, an upstream provider issue can interrupt service. While less frequent than a fiber cut, these outages can be frustrating because, although your circuits and peerings are healthy, traffic does not flow properly. Third, DDoS attacks, whether directly targeting your organization or another customer on your provider’s network, can have a crippling impact on service availability.
Managing Around a Fiber Cut
A few different approaches can help mitigate the impacts of a fiber cut. Your organization can purchase circuit diversity from a single carrier. In this scenario, your carrier will engineer multiple circuits into your facility. As part of this service, they will evaluate the physical path each circuit follows and ensure the circuits do not ride the same cable or poles. For true diversity, you’ll need to be certain that circuits take different paths into your facility. And, if circuits terminate into a powered cabinet, you must verify the reliability of the power source for that gear. Ask lots of questions and hold your carrier accountable. Be certain that they are contractually obligated to provide diversity and that there are penalties if they fail to do so. Work with an engineer from your carrier; don’t take the sales rep’s word for it. A single provider should have a complete view of the physical path for your circuits and be able to guarantee physical diversity. Unfortunately, however, using a single carrier puts you at a higher risk of an upstream configuration or routing failure with that provider.
The Multiple Carrier Route
Instead of ordering path diversity from a single carrier, you can order two circuits from different providers. This option reduces your reliance on a single carrier, but makes it more difficult to ensure full path diversity. You will need to talk to your carriers about sharing the physical path information for the circuits with you or with one another. You’ll still want to be certain the circuits enter the building via a different conduit and terminate into properly powered equipment. If you use different carriers, you will need to pay special attention to your BGP configuration to verify that the path in and out of your network is what you expect.
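As a rough sketch of what that dual-carrier BGP setup might look like, here is a hedged, Cisco IOS-style configuration fragment. All AS numbers, neighbor addresses, and prefixes are placeholders (documentation ranges and private ASNs), not values from the post. The idea is to prefer Carrier A for outbound traffic with local-preference and discourage inbound traffic via Carrier B with AS-path prepending:

```
! Hypothetical dual-homed BGP sketch; every number here is a placeholder.
router bgp 64512
 network 192.0.2.0 mask 255.255.255.0
 neighbor 198.51.100.1 remote-as 64496          ! Carrier A (preferred)
 neighbor 198.51.100.1 route-map CARRIER-A-IN in
 neighbor 203.0.113.1 remote-as 64511           ! Carrier B (backup)
 neighbor 203.0.113.1 route-map CARRIER-B-OUT out
!
route-map CARRIER-A-IN permit 10
 set local-preference 200    ! prefer Carrier A for outbound traffic
!
route-map CARRIER-B-OUT permit 10
 set as-path prepend 64512 64512    ! make the path via Carrier B look longer inbound
```

After applying something like this, verifying the actual best paths with `show ip bgp` is how you confirm that traffic in and out of your network follows the path you expect.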
An Important Note about Grooming
Even if you do everything right — you validate proper path diversity when you order a circuit, you pay special attention to the entrances into your building, and you verify that all vendor equipment is properly powered — things can change. Carriers will periodically groom circuits to change the path they follow through their network. An industrious provider engineer may see that a circuit follows a less-than-optimal path through their network and then diligently re-engineer it to be more efficient. You will not be notified when the grooming takes place; it will be transparent to you, the customer. The only way to prevent grooming is to communicate clearly with your carrier and ask that they mark circuits that have been carefully engineered for path diversity to prevent them from being groomed.
As with most topics in networking, there are many factors to consider and tradeoffs to be made when ordering connectivity for your organization. You cannot have complete control over carrier-provided connectivity, but you can be diligent throughout the process, communicate the challenges clearly with your leadership, and be clear with your service provider about your expectations and the level of service being provided.
Portability means that you can move an application from one host environment to another, including cloud to cloud such as from Amazon Web Services to Microsoft Azure. The work needed to complete the porting of an application from one platform to another depends upon the specific circumstances.
Containers are one technology meant to make such porting easier, by encapsulating the application and its operating system dependencies into a bundle that can run on any platform supporting that container format, such as Docker, or an orchestrator such as Kubernetes. But containers are no silver bullet.
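To make the idea of that bundle concrete, here is a minimal Dockerfile sketch for packaging a hypothetical Go service. The base images, paths, and service name are illustrative assumptions, not details from the post:

```dockerfile
# Build stage: compile a hypothetical Go service inside a throwaway image.
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN go build -o /app ./cmd/server

# Runtime stage: copy only the binary into a minimal base image.
FROM gcr.io/distroless/base-debian12
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

The same image can then be run on a laptop, a VM, or a cloud container service, which is the portability argument in a nutshell.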
We’ve all heard the saying, "What you see is what you get." Life isn’t quite so simple for those focused on security, as what you don’t see is more likely to be what you get. Luckily, there are ways to gain visibility in places that are often overlooked.
Security policies have always included the protection of key assets such as servers, network infrastructure, and data center and perimeter devices. This approach will always be the first line of defense. And for those who are new to the security space, this is the best place to start.
More recently, security policies have been extended to the user level. The number of endpoint protection solutions has grown markedly over the last few years as security administrators have understood that protection from attacks initiated from inside an organization is critical. These attacks are able to leverage users and their devices because, like it or not, people do download things they shouldn’t, they visit websites they shouldn’t, they share files, they let their kids use their company assets, and they often fall prey to social engineering.
Endpoint Protection (EPP) has existed since the 1980s in the form of virus-scanning clients. Over the years, the EPP landscape has become a battle of the Advanced Endpoint Protection (AEP) products. AEPs are next-gen technology, combining EPP functions, like anti-virus, with endpoint detection and response (EDR) technology that provides detection, blocking, and forensic analysis capabilities. In addition, operating systems like Windows provide a selection of endpoint tools that can be enabled out of the box.
In the Microsoft world, implementing an endpoint protection strategy can start with an often overlooked feature: Windows Event Logging. Event logging provides visibility into the activities performed on the workstation by grouping application, security, and system events into a single view. The workstation event console may then be configured to forward a customized set of these events to a log aggregator, like a domain controller, allowing the administrator to have a consolidated view of the activities on the workstations in the domain. These consolidated events can then be further forwarded to a SIEM and used as an alert trigger (detection of an APT) or to provide contextual value (workstation state for a specific user on a device that attempted a brute force attack on a key server). More on this in a later blog.
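The forwarding described above is typically built on Windows Event Forwarding. As a rough sketch of the built-in commands involved (the subscription file name is hypothetical, and the exact steps vary by environment):

```
:: On each source workstation: enable WinRM so the collector can subscribe.
winrm quickconfig -q

:: On the collector (e.g., a domain controller): enable the Event Collector service.
wecutil qc /q

:: On the collector: create a subscription from an XML definition that lists
:: the source computers and the event IDs to forward (file name is illustrative).
wecutil cs security-events-subscription.xml
```

Group Policy can push the source-side settings at scale, so you aren’t touching each workstation by hand.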
To decide if Workstation Event Logs have a place in your overall security strategy, consider these use cases:
- Access: How secure are the local authentication policies of individual workstations? If an attempt is made to log in to a device using a local access credential rather than a domain-controlled account, it will be logged in the workstation event log only.
- Persistence: Registry changes made by an attacker to provide a foothold into the system that persists over system reboots must be tracked.
- Discovery: Indicators of compromise (IoCs) can be recognized by anomalous actions, for example, events reporting misspelled service names, uncommon service paths, or atypical application crashes due to buffer overflows.
- Reconnaissance: The running of tools that suggest scanning, recon, or brute force attacks have been attempted can be logged.
- Forensics: In the case of a breach, building an event timeline from initial compromise to detection is critical to recognizing the extent of the compromise across multiple machines and understanding how to remediate these systems.
- Behavioral Analysis: Changes in user behavior or inappropriate use of company assets can have both security and legal implications. If certain event types, like failed logins or privilege escalation attempts, begin to occur, or known exploitation tools are installed on a system, this could be a sign of a compromise or a potential issue with an employee.
As with any logging tool, the trick is to create a configuration and deployment strategy. One of the downsides to event collection is that a poorly tuned system can generate far too many events to be useful or even viable. Admins must identify critical events to collect based on how they impact their environment and have an action plan defined for addressing issues. This ensures an understanding of the context and implications of an event; the rule of thumb is that proactive beats reactive.
If this post has you thinking about workstation logging, future blogs will provide more information about defining your security policy, configuring endpoints, and forwarding events to an aggregation device and making use of logs in SIEMs. Stay tuned.
I’m in Austin this week, filming an episode of SolarWinds Lab. I heard there may be snow in the forecast there. I’m starting to get the sense that Winter hates me.
As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!
Considering they spent six months working on this, I’m not surprised. What does surprise me is the comment that they shared their fix with other companies, including competitors. Maybe we are starting to see the beginnings of cooperation that will result in better security for us all.
Move along, nothing to see here.
We aren’t far away from robot armies.
For all of the money that Microsoft spends on marketing the various products and services they have to offer, I am surprised that they didn’t jump at the chance to have Cortana featured at CES.
Well, at least Cortana wasn’t to blame. And I think this shows that we are only one 24-hour blackout away from descending into total chaos as a nation.
Good checklist to consider, especially the DON’T PANIC part.
The puzzle: Why do scientists typically respond to legitimate scientific criticism in an angry, defensive, closed, non-scientific way? The answer: We’re trained to do this during the process of responding to peer review.
Easily the longest title ever for an Actuator link. Have a read and think about how scientists are very, very human. We are all trained to be defensive; I find this especially true in the field of IT. I’ve certainly seen this happen in meetings and in online forums.
Containers provide a lightweight way to make application workloads portable, like a virtual machine but without the overhead and bulk typically associated with VMs. With containers, apps and services can be packaged up and moved freely between physical, virtual, or cloud environments.
Hardware virtualization was a great step forward in application hosting compared to the days of bare metal. Hypervisors allowed us to isolate multiple applications within one hardware platform, freeing us to use hardware resources more efficiently by hosting heterogeneous workloads on the same infrastructure. Still, virtual machines have massive overhead in terms of resource consumption, because each VM runs a fully dedicated operating system.
Containerization advances the benefits of virtualization much further by allowing containers to share the OS kernel, networking stack, file system, and other system resources of the host machine, all while using less memory and CPU overhead.
Amazon Web Services has added Google’s Go language (Golang) to the roster of supported languages on its AWS Lambda serverless computing platform. Also added is support for Microsoft’s .Net Core 2.0 when developing in the C# language.
How to get started with Go and .Net Core on AWS Lambda
To help Go developers ramp up on AWS Lambda, AWS is offering libraries, samples, and tools for developing AWS Lambda functions on GitHub.