observability Archives - SD Times
https://sdtimes.com/tag/observability/

Harness announces new feature to proactively identify errors
https://sdtimes.com/monitor/harness-announces-new-feature-to-proactively-identify-errors/
Wed, 10 May 2023 16:11:36 +0000

The new Harness Continuous Error Tracking (CET) release is designed to provide developer-first observability for modern applications, proactively identifying and solving errors across the SDLC.

Harness CET provides several advantages to developers, such as minimizing the number of defects that go undetected, removing the need for manual troubleshooting, and enabling quicker resolution of customer problems. This lets teams identify and resolve issues within minutes instead of weeks, resulting in enhanced satisfaction for both developers and end users.

“Our goal is to empower developers by providing a solution that addresses the pain points unmet by traditional error tracking and observability tools,” said Jyoti Bansal, CEO and co-founder of Harness. “Harness Continuous Error Tracking offers unparalleled visibility and context, enabling teams to quickly identify, diagnose, and resolve issues, ultimately ensuring a better experience for both developers and customers.”

The tool includes runtime code analysis that provides complete visibility into every exception’s source code, variables, and environment state. These issues are routed directly to the right developer for faster resolution. CET also provides the full context of errors including code variables and objects up to ten levels deep into the heap.
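The kind of error context described above can be sketched in plain Python: walk an exception's traceback and snapshot each frame's code location and local variables, up to a fixed depth. This is a minimal illustration of the idea, not Harness's implementation; the `capture_error_context` helper and the `checkout` function are hypothetical.

```python
import traceback  # stdlib; imported for completeness, the walk below uses the raw traceback

def capture_error_context(exc: BaseException, max_depth: int = 10) -> list[dict]:
    """Walk the traceback and record, per frame, the code location
    and the local variables that were live when the error occurred."""
    frames = []
    tb = exc.__traceback__
    while tb is not None and len(frames) < max_depth:
        frame = tb.tb_frame
        frames.append({
            "file": frame.f_code.co_filename,
            "function": frame.f_code.co_name,
            "line": tb.tb_lineno,
            # Snapshot locals as strings so the report is serializable.
            "locals": {k: repr(v) for k, v in frame.f_locals.items()},
        })
        tb = tb.tb_next
    return frames

def checkout(cart_total):
    discount = None
    return cart_total - discount  # raises TypeError: None is not a number

try:
    checkout(100)
except TypeError as err:
    report = capture_error_context(err)
```

A real error-tracking agent would ship such a report to a backend and route it to the owning developer; here `report[-1]` holds the failing `checkout` frame with `discount` visible as `None`.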

CET creates guardrails to ensure only high-quality code advances which prevents unreliable releases from being promoted to staging and production environments.

In addition, release stability allows developers to compare current or past releases to understand trends in new, critical, and resurfaced errors.

The tool integrates with monitoring solutions such as AppDynamics, Dynatrace, Datadog, New Relic, and Splunk. It also natively integrates into Harness build and deployment pipelines or it can be used as a standalone solution.

The post Harness announces new feature to proactively identify errors appeared first on SD Times.
vFunction enables continuous monitoring, detection, and drift issues with latest release
https://sdtimes.com/monitor/vfunction-enables-continuous-monitoring-detection-and-drift-issues-with-latest-release/
Tue, 04 Apr 2023 20:36:55 +0000

The vFunction Continuous Modernization Manager (CMM) platform is now available, enabling software architects to shift left and find and fix application architecture anomalies. vFunction also announced a new version of vFunction Modernization Hub and updates to vFunction Assessment Hub.

CMM observes Java and .NET applications and services to set baselines and monitor for any architectural drift and erosion. It can help companies detect critical architectural anomalies such as new dead code in the application or the emergence of unnecessary code.
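The "new dead code" check can be sketched with Python's `ast` module: compare the functions a module defines against the names it actually references. This is a toy stand-in for the kind of analysis described (CMM itself targets Java and .NET); `find_dead_functions` and the sample module are hypothetical.

```python
import ast

def find_dead_functions(source: str) -> set[str]:
    """Flag functions that are never referenced anywhere else in the
    module -- a rough stand-in for 'new dead code' detection."""
    tree = ast.parse(source)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    # Collect every plain name and attribute reference in the module.
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    used |= {n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)}
    return defined - used

code = """
def active():
    return legacy_helper() + 1

def legacy_helper():
    return 2

def orphan():
    return 3

print(active())
"""
dead = find_dead_functions(code)
```

A continuous tool would run a check like this against a baseline on every build and alert only when the set of unreferenced code grows; here only `orphan` is flagged.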

“Application architects today lack the architectural observability, visibility, and tooling to understand, track, and manage architectural technical debt as it develops and grows over time,” said Moti Rafalin, the founder and CEO at vFunction. “vFunction Continuous Modernization Manager allows architects to shift left into the ongoing software development lifecycle from an architectural perspective to manage, monitor, and fix application architecture anomalies on an iterative, continuous basis before they erupt into bigger problems.”

The platform also identifies the introduction of a new service or domain and newly identified common classes that can be added to a common library to prevent further technical debt. 

Finally, it monitors and alerts when new dependencies are introduced that expand architectural technical debt, and identifies the highest technical debt classes that contribute to application complexity. Users are notified of changes through Slack, email, and the vFunction Notifications Center, allowing architects to then configure schedules for learning, analysis, and baseline measurements through the vFunction Continuous Modernization Manager.

The latest release of vFunction Modernization Hub 3.0 allows modernization teams to collaborate more effectively by working on different measurements in parallel and later merging them into one measurement. Additionally, the vFunction Assessment Hub now includes a Multi-Application Assessment Dashboard that allows users to track and compare different parameters for hundreds of applications, such as technical debt, aging frameworks, complexity, and state, among others. 

All three products are available in the company’s Application Modernization Platform. 

The post vFunction enables continuous monitoring, detection, and drift issues with latest release appeared first on SD Times.
Qt launches Qt Insight to provide developers with better customer insights
https://sdtimes.com/software-development/qt-launches-qt-insight-to-provide-developers-with-better-customer-insights/
Mon, 20 Mar 2023 15:02:36 +0000

The new Qt Insight platform provides real customer insights into the usage of applications or devices.

The platform reveals how users navigate devices, identifies customer pain points, analyzes performance, and creates concrete, evidence-based development plans to optimize product development and lower running costs by eliminating redundant, unused features based on session activity and metrics such as button clicks and time on screen.
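The two metrics called out above, button clicks and time on screen, can be derived from a raw stream of session events. The sketch below is a hypothetical aggregation in Python, not Qt Insight's implementation; the event tuples and `summarize_sessions` helper are invented for illustration.

```python
from collections import defaultdict

def summarize_sessions(events):
    """Aggregate raw UI events into two usage metrics:
    clicks per control and time spent on each screen."""
    clicks = defaultdict(int)
    screen_time = defaultdict(float)
    current_screen, entered_at = None, None
    for ts, kind, target in sorted(events):
        if kind == "screen":  # user navigated to a new screen
            if current_screen is not None:
                screen_time[current_screen] += ts - entered_at
            current_screen, entered_at = target, ts
        elif kind == "click":
            clicks[target] += 1
    return dict(clicks), dict(screen_time)

# (timestamp_seconds, event_kind, target) -- an anonymized session trace
events = [
    (0.0, "screen", "home"),
    (2.5, "click", "start"),
    (5.0, "screen", "settings"),
    (9.0, "screen", "home"),
]
clicks, screen_time = summarize_sessions(events)
```

Summed across many anonymized sessions, numbers like these are what let a team spot screens nobody visits and controls nobody clicks, the "redundant, unused features" the article mentions.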

“Understanding customer behaviour, needs, and pain points is essential to delivering an outstanding customer experience,” says Marko Kaasila, the senior vice president of product management at Qt Group. “We are delighted to see such a high level of interest in Qt Insight from a wide range of industries, including industrial automation, consumer electronics, medical and automotive. With the launch of Qt Insight, we are providing businesses with the information they need to truly understand their users, making it possible to develop evidence-based UX strategies that are truly tailored to customers.”

The platform is part of Qt's portfolio of integrated software development solutions, which includes Qt Design Studio, Qt Creator, Qt Quality Assurance, and Qt Digital Advertising, and it will be available as a SaaS product. The solution also supports desktop applications.

Companies can ensure compliance with GDPR and address modern data privacy requirements by using Qt Insight, as it anonymizes their application data as standard. The tool is especially useful for developers, designers, marketers, and product owners. 

More details about the platform are available on the Qt website.

The post Qt launches Qt Insight to provide developers with better customer insights appeared first on SD Times.
New Relic announces JFrog integration to provide a single point of access for monitoring
https://sdtimes.com/monitoring/new-relic-announces-jfrog-integration-to-provide-a-single-point-of-access-for-monitoring/
Wed, 15 Mar 2023 15:44:34 +0000

Observability company New Relic and DevOps company JFrog today announced an integration to give engineering teams a single point of access to monitor software development operations.

With this integration, users are able to access real-time visibility into CI/CD pipelines, APIs, and web application development workflows so that DevOps and security leaders can solve software supply chain performance and security issues.

Additionally, site reliability engineering, security, and operations teams can consistently monitor health, security, and usage trends through each stage of the software development lifecycle.

The integration allows engineering teams to track key metrics and generate alerts in New Relic to identify performance degradation so that administrators can manage performance, mitigate risks, and remediate any issues in a single view. 
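The "alert on performance degradation" step can be sketched generically: compare a recent metric window against a baseline window and fire when the ratio crosses a threshold. This is a hedged sketch of the alerting logic, not the New Relic API; `check_degradation` and its threshold are hypothetical.

```python
from statistics import mean

def check_degradation(baseline, recent, threshold=1.5):
    """Return an alert payload when the recent average latency exceeds
    the baseline average by the given factor, else None."""
    base_avg, recent_avg = mean(baseline), mean(recent)
    if recent_avg > base_avg * threshold:
        return {
            "metric": "latency_ms",
            "baseline_avg": base_avg,
            "recent_avg": recent_avg,
            "factor": round(recent_avg / base_avg, 2),
        }
    return None

# Baseline window is healthy; the recent window has doubled in latency.
alert = check_degradation([100, 110, 90], recent=[210, 190, 200])
```

In a real setup the windows would come from pipeline or artifact-repository telemetry and the payload would be routed to an administrator's single view rather than returned as a dict.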

“Today’s developers need a 360-degree view of applications to monitor and remediate both performance and security, no matter if they’re running on-premises, in the cloud, or at the edge,” said Omer Cohen, executive vice president of strategy at JFrog. “Our integration with New Relic gives DevOps, security, and operations teams the real-time insights needed to optimize their software supply chain environment and accelerate time to market.”

Preconfigured New Relic dashboards also bring a complete view of performance data, artifact usage, and security metrics from JFrog Artifactory and JFrog Xray environments alongside their telemetry data.

To get started, visit the New Relic website.

The post New Relic announces JFrog integration to provide a single point of access for monitoring appeared first on SD Times.
New Relic introduces metrics to deliver insights into performance at the code level
https://sdtimes.com/monitoring/new-relic-introduces-metrics-to-deliver-insights-into-performance-at-the-code-level/
Tue, 14 Mar 2023 20:04:52 +0000

Observability company New Relic announced CodeStream code-level metrics and service-level telemetry in order to offer users deeper insights into software performance down to the code level. 

This allows developers to find issues quickly before they make it into production, as well as increase engineering velocity.

According to the company, providing developers with telemetry data right where they build allows them to access the data they need without leaving the IDE, relying on operations teams, or waiting on customer feedback about issues.

This release also supports all of the core languages, including .NET, Java, PHP, Python, Ruby, Go, and Node.js.

“Observability as an engineering practice presents a future where essential workflows are fueled by data,” said Peter Pezaris, SVP of strategy and experience at New Relic. “By bringing production telemetry data into the IDE with New Relic CodeStream, our customers are able to tighten feedback loops and produce better performing software without impacting existing workflows or requiring expensive context switches. New Relic is focusing on shifting left, and CodeStream empowers engineers to get ahead of issues before they hit production and accelerate their development cycles, saving valuable time and money.”

Additionally, users gain access to service-level performance metrics. With this, metrics for related services are surfaced so issues can be more easily identified. Performance can also be tracked against service-level objectives to ensure overall service health.
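Tracking performance against a service-level objective boils down to a simple computation: what fraction of requests met the objective, and is that fraction above the target? The sketch below is a generic illustration, not New Relic's implementation; `slo_compliance` and its defaults are hypothetical.

```python
def slo_compliance(latencies_ms, slo_ms=300, target=0.99):
    """Return (fraction of requests under the latency objective,
    whether the service is meeting its target)."""
    within = sum(1 for v in latencies_ms if v <= slo_ms)
    ratio = within / len(latencies_ms)
    return ratio, ratio >= target

# 98 fast requests, one slow outlier, one borderline-fast request:
# 99 of 100 meet the 300 ms objective, exactly hitting a 99% target.
latencies = [100] * 98 + [500, 120]
ratio, meets_slo = slo_compliance(latencies)
```

Surfacing a boolean like `meets_slo` next to the code in the IDE is what turns raw latency samples into an actionable signal of overall service health.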

Lastly, CodeStream code-level metrics and service-level telemetry displays important telemetry data within pull requests and feedback requests to improve code in production. 

The post New Relic introduces metrics to deliver insights into performance at the code level appeared first on SD Times.
Why the world needs OpenTelemetry
https://sdtimes.com/monitor/why-the-world-needs-opentelemetry/
Thu, 02 Mar 2023 18:19:09 +0000

Observability has really taken off in the past few years, and while in some ways observability has become a bit of a marketing buzzword, one of the main ways companies are implementing observability is not with any particular company’s solution, but with an open-source project: OpenTelemetry.

Since 2019, it has been incubating at the Cloud Native Computing Foundation, but the project has its origins in two different open-source projects: OpenCensus and OpenTracing, which were merged into one to form OpenTelemetry. 

“It has become now the de facto in terms of how companies are willing to instrument their applications and collect data because it gives them flexibility back and there’s nothing proprietary, so it helps them move away from data silos, and also helps connect the data end to end to offer more effective observability,” said Spiros Xanthos, SVP and general manager of observability at Splunk.

OpenTelemetry is one of the most successful open-source projects, depending on what you measure. According to Austin Parker, head of DevRel at Lightstep and a maintainer of OpenTelemetry, it is the second-highest-velocity project within the CNCF in terms of contributions and improvements, behind only Kubernetes.

According to Parker, one of the reasons why OpenTelemetry has just exploded in use is that cloud native development and distributed systems have “eaten the world.” This in turn leads to increased complexity. And what do you need when complexity increases? Observability, visibility, a way to understand what is actually going on in your systems. 


Parker feels that for the past few decades, a real struggle companies have run into is that everyone has a different tool for each part of observability. They have a tool for tracing, something for handling logs, something to track metrics, etc. 

“There’s scaling issues, lack of data portability, lack of vendor agnosticism, and a lack of ability to easily correlate these things across different dimensions and across different signal types,” said Parker. “OpenTelemetry is a project whose time has come in terms of providing a single, well-supported, vendor-agnostic solution for making telemetry a built-in part of cloud native systems.” 

Morgan McLean, director of product management at Splunk and co-founder of OpenTelemetry, has seen first-hand how the project has exploded in use as it has matured. He explained that a year ago, he was having conversations with prospective users who at the time felt that OpenTelemetry didn’t meet all of their needs. Now, with a more complete feature set, “it’s become a thing that organizations are now much more comfortable and confident using,” McLean explained.

Today when he meets with someone to tell them about OpenTelemetry, often they will say they’re already using it. 

“OpenTelemetry is maybe the best starting point in that it has universal support from all vendors,” said Xanthos. “It’s a very robust set of, let’s say, standards and open source implementation. So first of all, I know that it will be something that will be around for a while. It is, let’s say, the state of the art on how to instrument applications and collect data. And it’s supported universally. So essentially, I’m betting on something that is a standard accepted across the industry, that is probably going to be around for a while, and gives me control over the data.”

It’s not just the enterprise that has jumped on board with OpenTelemetry; the open-source community as a whole has also embraced it. 

Now there are a number of web frameworks, programming languages, and libraries stating their support for OpenTelemetry. For example, OpenTelemetry is now integrated into .NET, Parker explained. 

A healthy open-source ecosystem is crucial to success

There are a lot of vendors in the observability space, and OpenTelemetry “threatens the moat around most of the existing vendors in the space,” said Parker. It has taken a lot of work to build a community that brings in people that work for those companies and have them say “hey, here’s what we’re going to do together to make this a better experience for our end users, regardless of which commercial solution they might pick, or which open-source project they’re using,” said Parker. 

According to Xanthos, the reason an open-source standard has become the de facto and not something from a vendor is because of demand from end users. 

“End users essentially are asking vendors to have open-source standards-based data collection, so that they can have more effective observability tools, and they can have control over the data,” said Xanthos. “So because of this demand from end users, essentially all vendors either decided or were forced to support OpenTelemetry. So essentially, there is no major vendor in observability that doesn’t support it today.”

OpenTelemetry’s governance committee seats are tied to people, not companies, which is the case for some other open-source projects as well. 

“We try to be cognizant of the fact that we all work for people that have commercial interests here, but at the end of the day, we’re people and we are not avatars of our corporate overlords,” said Parker. 

For example, McLean and Parker work for two separate companies that are direct competitors, but in the OpenTelemetry space they come together to do things for the project like forming end-user working groups or running events.

“It doesn’t matter who signs the paycheck,” Parker said. “We are all in this space for a reason. It’s because we believe that by enabling observability for our end users through OpenTelemetry, we are going to make their professional lives better, we’re going to help them work better, and make that world of work better.”

What’s next?

OpenTelemetry has a lot planned for the future, and recently published an official project roadmap.

The original promise of OpenTelemetry back when it was first announced was to deliver capabilities to allow people to capture distributed traces and metrics from applications and infrastructure, then send that data to a backend analytics system for processing. 

The project has largely achieved that, which presents the opportunity to sit down and ask what comes next. 

For example, logging is important to a large portion of the community, so that is one focus. “We want to be able to capture logs as an adjacent signal type to distributed traces and to metrics,” said McLean.

Another long-term focus will be capturing profiles from applications so that developers can delve into the performance of their code.
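The kind of profile data this points at can be illustrated with Python's built-in `cProfile` module, which records where time is spent per function. This is only a sketch of the signal type, not OpenTelemetry's profiling work; `slow_sum` is a hypothetical hot function.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """A deliberately naive loop that shows up as a hotspot."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Render the five most expensive entries, sorted by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
profile_report = buf.getvalue()
```

A continuous profiler would collect samples like these in production and correlate them with traces, letting a developer jump from a slow span straight to the offending code.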

The maintainers are also working on client instrumentation. They want OpenTelemetry to be able to extract data from web, mobile, and desktop applications. 

“OpenTelemetry is very focused on back end infrastructure, back end services, the stuff that people run inside of AWS or Azure or GCP,” McLean explained. “There’s also a need to monitor the performance and get crash reports from their client applications, like front end websites or mobile applications or desktop applications, so they can judge the true end to end performance of everything that they’ve built, not just the parts that are running in various data centers.”

The promise of unified telemetry

At the end of the day, it’s important to remember the main goal of the project, which is to unify telemetry. Developers and operators are dealing with increasing amounts of data, and OpenTelemetry’s purpose is to unify those streams of data and be able to do something with it. 

Parker noted the importance of using this data to deliver great user experiences. Customers don’t care whether you’re using Kubernetes or OpenTelemetry, he said. 

“Am I able to buy this PS5? Am I able to really easily put my shopping list into this app and order my groceries for the week?” According to Parker this is what really matters to customers, not what technology is making this happen. 

“OpenTelemetry is a foundational component of tying together application and system performance with end user experiences,” said Parker. “That is going to be the next generation of performance monitoring for everyone. This isn’t focused on just the enterprise; this isn’t a particular vertical. This, to me, is going to be a 30-year project almost, in terms of the horizon, where you can definitely see OpenTelemetry being part of how we think about these questions for many years to come.”

The post Why the world needs OpenTelemetry appeared first on SD Times.
How observability prevents developers from flying blind
https://sdtimes.com/monitoring/how-observability-prevents-developers-from-flying-blind/
Thu, 02 Feb 2023 17:15:45 +0000

When changing lanes on the highway, one of the most important things for drivers to remember is to always check their blind spot. Failing to do this could lead to an unforeseen, and ultimately avoidable, accident. 

The same is true for development teams in an organization. Failing to provide developers with insight into their tools and processes could lead to unaddressed bugs and even system failures in the future.

This is why the importance of providing developers with ample observability cannot be overstated. Without it, the job of the developer becomes one big blind spot. 

Why is it important? 

“One of the important things that observability enables is the ability to see how your systems behave,” said Josep Prat, open-source engineering director at data infrastructure company Aiven. “So, developers build features which belong to a production system, and then observability gives them the means to see what is going on within that production system.”

He went on to say that developer observability tools don’t just function to inform the developer when something is wrong; rather, they dig even deeper to help determine the root cause of why that thing has gone wrong. 

David Caruana, UK-based software architect at content services company Hyland, stressed that these deep insights are especially important in the context of DevOps. 

“That feedback is essential for continuous improvement,” Caruana said. “As you go around that loop, feedback from observability feeds into the next development iteration… So, observability really gives teams the tools to increase the quality of service for customers.” 

The in-depth insights it provides are what sets observability apart from monitoring or visibility, which tend to address what is going wrong on a more surface level. 

According to Prat, visibility tools alone are not enough for development teams to address flaws with the speed and efficiency that is required today. 

The deeper insights that observability brings to the table need to work in conjunction with visibility and monitoring tools. 

With this, developers gain the most comprehensive view into their tools and processes. 

“It’s more about connecting data as well,” Prat explained. “So, if you look at monitoring or visibility, it’s a collection of data. We can see these things and we can understand what happened, which is good, but observability gives us the connection between all of these pieces that are collected. Then we can try to make a story and try to find out what was going on in the system when something happened.” 
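Prat's "connecting data" idea can be sketched concretely: stamp every signal emitted while handling a request with one shared trace id, then join on that id to reconstruct the story. This is a hypothetical toy, not any vendor's implementation; the helper names and the in-memory `store` are invented.

```python
import uuid

def new_request_context():
    """Mint one trace id that every signal emitted while handling
    the request will carry, so they can be joined later."""
    return {"trace_id": uuid.uuid4().hex}

def log(store, ctx, message):
    store.append({"trace_id": ctx["trace_id"], "kind": "log", "message": message})

def record_metric(store, ctx, name, value):
    store.append({"trace_id": ctx["trace_id"], "kind": "metric", "name": name, "value": value})

def signals_for(store, trace_id):
    """The 'story' of one request: every signal sharing its trace id."""
    return [s for s in store if s["trace_id"] == trace_id]

store = []
ctx = new_request_context()
log(store, ctx, "cache miss for user 42")
record_metric(store, ctx, "db.query_ms", 87)
story = signals_for(store, ctx["trace_id"])
```

With plain monitoring each entry in `store` is an isolated data point; the shared `trace_id` is what upgrades the collection into a narrative of what the system was doing when something happened.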

John Bristowe, community director at deployment automation company Octopus Deploy, expanded on this, explaining that observability empowers development teams to make the best decisions possible going forward.

These decisions affect things such as increasing reliability and fixing bugs, leading to major performance enhancements. 

“And developers know this… There are a lot of moving parts and pieces and it is kind of akin to ‘The Wizard of Oz’ … ‘ignore the man behind the curtain,’” Bristowe said. “When you pull back that curtain, you’re seeing the Wizard of Oz and that is really what observability gives you.” 

According to Vishnu Vasudevan, head of product at the continuous orchestration company Opsera, developer interest in observability is still somewhat new. 

He explained that in the last five years, as DevOps has become the standard for organizations, developer interest in observability has grown exponentially. 

“Developers used to think that they can push products into the market without actually learning about anything around security or quality because they were focusing only on development,” Vasudevan said. “But without observability… the code might go well at first but sometime down the line it can break and it is going to be very difficult for development teams to fix the issue.”

The move to cloud native 

In recent years, the transition to cloud native has shaken up the software development industry. Caruana said that he believes the move into the cloud has been a major driver for observability.

He explained that with the complexity that cloud native introduces, gaining deep insights into the developer processes and tooling is more essential than ever before. 

“If you have development teams that are looking to move towards cloud-native architectures, I think that observability needs to be a core part of that conversation,” Caruana said. “It’s all about getting that data, and if you want to make decisions… having the data to drive those decisions is really valuable.” 

According to Prat, this shift to cloud native has also led to observability tools becoming more dynamic.

“When we had our own data centers, we knew we had machines A, B, C, and D; we knew that we needed to connect to certain boxes; and we knew exactly how many machines were running at each point in time,” he said. “But, when we go to the cloud, suddenly systems are completely dynamic and the number of servers that we are running depends on the load that the system is having.”

Prat explained that because of this, it is no longer enough to just know which boxes to connect to; teams now need a full understanding of which machines are entering and leaving the system so that connections can be made and the development team can determine what is going on.

Bristowe also explained that while the shift to cloud native can be a positive thing for the observability space, it has also made it more complicated.

“Cloud native is just a more complex scenario to support,” he said. “You have disparate systems and different technologies and different ways in which you’ll do things like logging, tracing, metrics, and things of that sort.”

Because of this, Bristowe emphasized the importance of integrating proper tooling and processes in order to work around any added complexities. 

Prat believes that the transition to cloud native not only brings new complexities, but a new level of dynamism to the observability space. 

“Before it was all static and now it is all dynamic because the cloud is dynamic. Machines come, machines go, services are up, services are down and it is just a completely different story,” he said. 

Opsera’s Vasudevan also stressed that moving into the cloud has put more of an emphasis on the security benefits that observability can offer. 

He explained that while moving into the cloud has helped the velocity of deployments, it has added a plethora of possible security vulnerabilities. 

“And this is where that shift happened and developers really started to understand that they do need to have this observability in place to understand what the bottlenecks and the inefficiencies are that the development team will face,” he said.

The risks of insufficient observability  

When companies fail to provide their development teams with a high level of observability, Prat said it can feel like regressing to the dark ages.

He explained that without observability, the best developers can do is venture a guess as to why things are behaving the way that they are. 

“We would need to play a lot of guessing games and do a lot more trial and error to try and reproduce mistakes… this leads to countless hours and trying to understand what the root cause was,” said Prat.

This, of course, reduces an organization’s ability to remain competitive, something that companies cannot afford to risk. 

He emphasized that while investing in observability is not some kind of magic cure-all for bugs and system failures, it can certainly help in remediation as well as prevention. 

Bristowe went on to explain that observability is really all about the DevOps aspect of investing in people, processes, and tools alike. 

He said that while there are some really helpful tools available in the observability space, making sure developers are on board to learn these tools and integrate them properly into their processes is the key to successful observability.

Observability and productivity 

Prat also emphasized that investing in observability heavily correlates to more productivity in an organization. This is because it enables developers to feel more secure in the products they are building.

He said that this sense of security also helps when applying user feedback and implementing new features per customer requests, leading to heightened productivity as well as strengthening the organization’s relationship with its customer base. 

With proper observability tools, a company will be able to deliver better features more quickly as well as constantly work to improve the resiliency of its systems. Ultimately, this provides end users with a faster, better overall experience. 

“The productivity will improve because we can develop features faster, because we can know better when things break, and we can fix the things that break much faster because we know exactly why things are being broken,” Prat said. 

Vasudevan explained that when code is pushed to production without developers truly understanding it, technical debt and bottlenecks are pretty much a guarantee, resulting in a poorer customer experience. 

“If you don’t have the observability, you will not be able to identify the bottlenecks, you will not be able to identify the inefficiencies, and the code quality is going to be very poor when it goes into production,” he said.

Bristowe also explained that there are times when applications are deployed into production and yield unplanned results. Without observability, the development team may not even notice this until damage has already been caused. 

“The time to fix bugs, time to resolution, and things like that are critical success factors and you want to fix those problems before they are discovered in production,” Bristowe said. “Let’s face it, there is no software that’s perfect, but having observability will help you quickly discover bottlenecks, inefficiencies, bugs, or whatever it may be, and being able to gain insight into that quickly is going to help with productivity for sure.” 

Aiven’s Prat noted that observability also enables developers to see where and when they are spending most of their time so that they can tweak certain processes to make them more efficient.

When working on a project, developers strive for immediate results. Observability helps them when it comes to understanding why certain processes are not operating as quickly as desired. 

“So, if we are spending more time on a certain request, we can try and find why,” Prat explained. “It turns out there was a query on the database or that it was a system that was going rogue or a machine that needed to be decommissioned and wasn’t, and that is what observability can help us with.”
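The kind of investigation Prat describes can be sketched with a toy timer: measure each named step of a request and see which one dominates. A real service would use a tracing library rather than hand-rolled timing, and the step names and sleeps below are invented stand-ins:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulate wall-clock time per named operation across a request.
timings = defaultdict(float)

@contextmanager
def span(name):
    """Record how long the enclosed block takes under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] += time.perf_counter() - start

def handle_request():
    with span("parse"):
        time.sleep(0.01)   # stand-in for request parsing
    with span("db_query"):
        time.sleep(0.05)   # stand-in for a slow database query
    with span("render"):
        time.sleep(0.01)   # stand-in for response rendering

handle_request()
slowest = max(timings, key=timings.get)
print(slowest)  # the database query dominates this request
```

With timings per step in hand, "we can try and find why" becomes a lookup rather than a guessing game.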

Automation and observability 

Bristowe emphasized the impact that AI and automation can have on the observability space. 

He explained that tools such as ChatGPT have really brought strong AI models into the mainstream and showcased the power that this technology holds. 

He believes this same power can be brought to observability tools. 

“Even if you are gathering as much information as possible, and you are reporting on it, and doing all these things, sometimes even those observations still aren’t evident or apparent,” he said. “But an AI model that is trained on your dataset, can look and see that there is something going on that you may not realize.”

Caruana added that AI can help developers better understand what the natural health of a system is, as well as quickly alert teams when there is an anomaly. 

He predicts that in the future we will start to see automation play a much bigger role in observability tools, such as filtering through alerts to select the key, root cause alerts that the developer should focus on.

“I think going forward, AI will actually be able to assist in the resolution of those issues as well,” Caruana said. “Even today, it is possible to fix things and to resolve issues automatically, but with AI, I think resolution will become much smarter and much more efficient.” 
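As a rough illustration of what such anomaly detection does under the hood, here is a minimal statistical stand-in (not any vendor's AI model): flag metric points that stray far from the baseline.

```python
import statistics

def find_anomalies(values, threshold=2.5):
    """Flag points more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]

# A steady latency baseline with one spike, as an alerting system might see it.
latencies_ms = [100, 102, 98, 101, 99, 100, 350, 101, 100, 99]
print(find_anomalies(latencies_ms))  # index 6 stands out
```

Production tools learn the "natural health" of a system from far richer baselines (seasonality, correlated signals), but the core idea is the same: model normal, then alert on deviation.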

Both Bristowe and Caruana agreed that AI observability tools will yield wholly positive results for both development teams and the organization in general.  

Bristowe explained that this is because the more tooling brought in and the more insights offered to developers, the better off organizations will be. 

However, Vishnu Vasudevan, head of product at the continuous orchestration company Opsera, had a slightly different take. 

He said that bringing automation into the observability space may end up costing organizations more than they would gain.

Because of this risk, he stressed that organizations would need to be sure to implement the right automation tools so that teams can gain the actionable intelligence and the predictive insights that they actually need.

“I would say that having a secure software supply chain is the first thing and then having observability as that second layer and then the AI and automation can come in,” Vasudevan said. “If you try to build AI into your systems and you do not have those first two things, it may not add any value to the customer.”

How to approach observability 

When it comes to making sure developers are provided with the highest level of observability possible, Prat has one piece of advice: utilize open-source tooling.

He explained that with tools like these, developers are able to connect several different solutions rather than feeling boxed into one single tool. This ensures that they are able to have the most well-rounded and comprehensive approach to observability.

“You can use several tools and they can probably play well together, and if they are not then you can always try and build a connection between them to try and help to close the gap between two tools so that they can talk to each other and share data and you can get more eyes looking at your problem,” Prat said. 

Caruana also explained the importance of implementing observability with room for evolution.

He said that starting small and building observability out based on feedback from developers is the best way to be sure teams are being provided with the deepest insights possible. 

“As you do with all agile processes, iteration is really key, so start small, implement something, get that feedback, and make adjustments as you go along,” Caruana said. “I think a big bang approach is a high risk approach, so I choose to evolve, and iterate, and see where it leads.”

The post How observability prevents developers from flying blind appeared first on SD Times.

New Relic introduces new capability for change tracking observability https://sdtimes.com/software-development/new-relic-introduces-new-capability-for-change-tracking-observability/ Tue, 31 Jan 2023 15:41:42 +0000
The observability provider New Relic has announced the launch of a new change tracking solution to provide full visibility of change events throughout an application’s life cycle.

According to New Relic, change events are the cause of most degradations in software performance.

“With New Relic change tracking, every engineer, regardless of the specialty, can now understand the impact of a change anywhere in the tech stack to take the friction out of detection and resolution,” said Manav Khurana, chief growth officer and GM at New Relic. 

With this new solution, developers can track changes like deployments, configuration changes, and business events. By correlating changes with performance data, developers will be able to troubleshoot faster and improve efficiency. 

The new feature also integrates with the rest of the CI/CD toolchain and allows developers to mark charts with change details. 

There is also an interactive dashboard with analysis of changes to help developers see a change’s effect over time. 

During triage, developers will be able to select a change notification and see why the change happened, which makes it easier to roll back if needed. 
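The correlation idea behind change tracking can be sketched generically. This is a hypothetical toy, not New Relic's actual API: given a list of change events and a performance metric, find the changes that landed near a spike. All event names and numbers below are invented.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-ins for change events and an error-rate metric;
# a real system would pull these from its observability backend.
changes = [
    {"what": "deploy v2.4.1", "at": datetime(2023, 1, 31, 14, 0)},
    {"what": "config: raise pool size", "at": datetime(2023, 1, 31, 16, 30)},
]
error_rate = [  # (timestamp, errors per minute)
    (datetime(2023, 1, 31, 13, 50), 2),
    (datetime(2023, 1, 31, 14, 10), 40),
    (datetime(2023, 1, 31, 16, 40), 3),
]

def changes_near(spike_time, window=timedelta(minutes=30)):
    """Return change events within `window` of a metric spike -- the core of
    correlating changes with performance data."""
    return [c["what"] for c in changes if abs(c["at"] - spike_time) <= window]

# Find the worst spike, then ask what changed around it.
spike = max(error_rate, key=lambda p: p[1])[0]
print(changes_near(spike))
```

Once a suspect change is attached to the spike, rolling it back (as described above for triage) becomes the obvious first move.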

The post New Relic introduces new capability for change tracking observability appeared first on SD Times.

Platform engineering vs. SRE https://sdtimes.com/devops/platform-engineering-vs-sre/ Fri, 06 Jan 2023 15:42:30 +0000
Although the roles of the site reliability engineer (SRE) and the platform engineer share some similarities and are at times conflated, they’re still distinct. 

Platform engineers are responsible for designing, developing and maintaining the underlying platform that the application runs on including the infrastructure, operating systems, databases and other components that enable the application to function. SREs, on the other hand, focus on the reliability, scalability and performance of the application itself. 

“The self-serviceability aspect comes under the realm of a platform engineering team that is trying to provide self-service capabilities for product teams to consume,” Gartner’s Betts said. “SRE is going to be involved in looking at some of the tools that are used to help with that, but their focus is very much on removal of repeatable manual tasks that could potentially go wrong.”

However, SREs can be placed within platform engineering teams to help with some of the tasks. 

“As the SRE teams mature, they get into the platform side of the business where they’re actually calling out gaps in the self-service capabilities so the development teams and the product teams can fix it and benefit from it,” Red Hat’s Raghavan said.

While in large organizations, there’s a division between the two roles, the more resource-constrained ones might have the same person performing both roles, according to Ellis. 

Gear up your SRE 

Here are some of the tools to help gear the SRE up for battle, as provided by Forrester’s report “Role Profile: Site Reliability Engineer”:

  • Automation: SREs will need to use scripting, code, or orchestration tools to manage a system or environment. This can include tools like Ansible, CircleCI, GitLab, Jenkins, and Google Cloud Build.
  • App modernization: This can be used to migrate legacy applications to newer ones through revising the code base or rewriting the code using Docker, Git, Google Cloud Run, Kubernetes, and more. 
  • Chaos engineering: SREs can use this method to find faults in a system by injecting specific faults in a testing or production environment using Chaos Machine, Chaos Mesh, Chaos Monkey, Chaos Toolkit, and more. 
  • Networking: This is all about analyzing the communication process among various computing devices or computer systems using Nagios, Netdata, SolarWinds, Terraform, and more.
  • Observability: SREs need to manage observability to monitor and generate insights about a platform, site, or environment under management using DataDog, Dynatrace, Google Error Reporting, New Relic, and a host of others.
  • Security: SREs also take part in safeguarding an environment through strategies, policies, processes, and technology at every part of the life cycle using tools like Chef InSpec, Google Cloud Audit Logs, Sysdig, and VirusTotal.
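The chaos engineering entry above is easy to sketch: inject faults into a call path on purpose and confirm that callers survive them. This toy Python decorator is a hand-rolled illustration, not one of the named tools; the failure rate and exception type are invented for the example.

```python
import functools
import random

def chaos(failure_rate, exc=ConnectionError, rng=random):
    """Toy chaos-engineering helper: make the wrapped call fail at
    `failure_rate`, so callers' retry/fallback paths get exercised."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if rng.random() < failure_rate:
                raise exc(f"chaos: injected fault in {fn.__name__}")
            return fn(*args, **kwargs)
        return inner
    return wrap

@chaos(failure_rate=0.5, rng=random.Random(42))  # seeded for repeatability
def fetch_user(uid):
    return {"id": uid}

# Callers must now tolerate injected ConnectionErrors.
results = []
for uid in range(6):
    try:
        results.append(fetch_user(uid))
    except ConnectionError:
        results.append(None)
print(results)
```

Real chaos tools inject faults at the infrastructure level (killed pods, dropped packets) rather than in-process, but the discipline is the same: break things deliberately, in a controlled way, before production breaks them for you.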

To read more, click here.

The post Platform engineering vs. SRE appeared first on SD Times.

]]>
The perfect SRE doesn’t exist, but the right one might already be in your organization https://sdtimes.com/devops/the-perfect-sre-doesnt-exist-but-the-right-one-might-already-be-in-your-organization/ Fri, 06 Jan 2023 15:17:31 +0000 https://sdtimes.com/?p=49999 There’s been an explosion of interest in SRE over the last 18 months and a lot of this has been from companies that are looking at scaling their DevOps or DevSecOps initiatives to look at the reliability concerns of their customers.  Vendors are recognizing this and a lot of general software interfaces (GSIs) and Managed … continue reading

The post The perfect SRE doesn’t exist, but the right one might already be in your organization appeared first on SD Times.

]]>
There’s been an explosion of interest in SRE over the last 18 months and a lot of this has been from companies that are looking at scaling their DevOps or DevSecOps initiatives to look at the reliability concerns of their customers. 

Vendors are recognizing this, and a lot of global system integrators (GSIs) and managed service providers (MSPs) are offering some form of SRE-as-a-service, according to Brent Ellis, senior analyst at Forrester.

Since the role emerged at Google in 2003 to build reliable and high-quality services while reducing costs, it has since evolved, according to Narayanan Raghavan, senior director of site reliability engineering at Red Hat.

“I think the core SRE function, in many ways, becomes a foundation and then you build on top of it. So as the teams that focus on SRE capabilities start to mature, you get into ‘how do I get into robust CI/CD practices?’” Raghavan said. “How do I build capabilities for my development teams to onboard quickly and easily because it then makes my life easier as an SRE, it makes the developers’ lives easier because they don’t have to worry about things like observability, logging, metrics, alerting. They don’t need to think about disaster recovery, incident management, or incident rehearsals.”

For SRE to work in an organization, other teams also need to be receptive to the input that SREs offer, and this receptiveness differs based on the maturity of the organization. This level of engagement can be divided into three buckets, according to Raghavan. 

One is that toil for SREs should become tech debt for development almost immediately, so as to avoid a separate prioritization process. 

The second is that when developers actually start to architect a component that’s completely new, they need to pull in the SREs and engage with SREs up front, according to Raghavan. This is so the SREs can participate and think about how to scale that particular component. In mature organizations, this becomes an important bucket in which developers start to engage out of their own volition instead of being told that they have to do something. 

Then, the third bucket is that as the SRE practice matures and is creating the building blocks that matter to all teams (observability, logging, metrics, and alerting) it’s also engaging development teams up front. 

“That becomes important because it’s the development teams that are then adopting those self-service capabilities that SREs are putting out,” Raghavan said. 

SREs can also lead things like blameless post-mortems in which they’ll look to get to the bottom of what caused the problem. They won’t blame any person, but will look at the processes or the technology that enabled that to take place, according to Daniel Betts, senior director analyst at Gartner.

“If you want to get full value from your SRE, try not to use them as a developer resource,” Betts said. “They should be more of like a reliability focused engineer who’s looking at the overall picture of what’s going on across the product or service that you have.” 

SREs often come in at the beginning of the product life cycle and work to help the product team or the platform engineering teams build a product that is very reliable and robust, that meets the customers’ needs, he added. From there, they can perform tasks across the whole development life cycle. 

“They can be involved throughout the life cycle to the point where the actual product is highly automated and incredibly reliable. It’s now running that product quite maturely and it has very effective automation, monitoring, and observability in place,” Betts said. “The SRE may actually just be keeping an eye on or looking after that product from a standpoint of the dashboards or monitoring tools or observability tools to see if it’s doing what we expect it to do. It doesn’t need that much attention anymore. They can now focus on other solutions to help with the automation and improvement of those.”

Unleash the SRE from within

With potential hiring freezes and budget cuts looming, organizations often look for would-be SREs already within their company. 

“The perfect SRE is a myth. That perfect SRE would get bored a month, two months down the road, they’d say ‘been there, done that, give me something else, give me something new, I want to learn something different.’ So I am generally looking for people with potential,” Red Hat’s Raghavan said. “And when I say potential, these are people that are, in some cases, traditional software engineers.” 

These software engineers would already have a systems mindset with which they can think about systems at scale and approach problems that way. A good pool of potential SREs can also exist with systems engineers that can understand software engineering principles.

“So I am from a hiring practice perspective looking for people that fall in that bucket specifically, because then I know that I can invest in them. And as I invest in them, and as they learn the space, they invest back into the company and back in the team,” Raghavan said. “So I am not looking for a perfect fit. I’m in fact, looking for people who are, in many ways eager to learn, can understand technology and understand how to pick up different spaces quickly.”

It’s also important to assign new SREs to a production process early on and to have a mentor guide them.

Gartner’s Betts sees that some organizations that want to start an SRE practice just wind up rebranding an existing IT operations team or person in that role, which is the wrong approach. 

“An SRE is giving value not just by focusing on things like incident problems, operational improvements, monitoring, and being able to have better insights,” Betts said. “It’s also looking at how we can take some of that software engineering or engineering mindsets to the world of infrastructure operations and look at how we can have reusable modules, efficient infrastructure delivery, efficient response to incidents, and being able to scale capacity.” 

In their day to day work, SREs are often embedded into a product team like a development product team where they’ll act as a reliability consultant to inform the team of expectations around reliability in the organization, help to look for some of the toil, and will look to automate some of those practices as part of the backlog in that product team, according to Betts. 

“In the early maturity stages, having a completely decentralized model makes a lot of sense, because you’re a lot more nimble and agile. But as the product matures, having a more central function to think about reliability at scale becomes important,” Red Hat’s Raghavan continued.

SRE…the social butterfly?

One skill set that often goes overlooked for this role is soft skills, which should instead be called ‘critical skills’, according to Gartner’s Betts.

SREs need to be great communicators because part of the job function is to communicate effectively, both in terms of the data they see with service level objectives (SLOs), error budgets, and other things. They also need to show that they can empathize with customers and talk about specific things that are impacting customers’ experience. SREs are often the ones interacting with customers, partners, development teams, product managers, and more.

“So if you’re talking to maybe a product owner or a strategy person, you take it to a higher level, you’re talking to someone that’s in the team, as an engineer or a developer, you need to get maybe down into the depths and talk a little bit more detail with them,” Betts said. 

Red Hat’s Raghavan added that these soft skills are even more important for an SRE than the technical skills. This is because technical skills are trainable, but it’s often much harder to find people with both soft skills and technical skills. 

“That mindset and the ability to articulate that is absolutely vital for a reliability engineering function, because then we start to look at if something really matters to the customer, you should probably be looking at the specific causes that matter and therefore the symptoms that show up to the customer and what it is that we need to get alerted on,” Raghavan said. 
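The SLO and error-budget arithmetic an SRE communicates is simple enough to sketch. The figures below are invented for illustration; real budgets come from the organization's own SLO definitions.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Return (allowed_failures, remaining_budget_fraction) for a
    request-based SLO such as 99.9% availability."""
    allowed = total_requests * (1 - slo_target)
    remaining = 1 - failed_requests / allowed if allowed else 0.0
    return allowed, remaining

# A month at 99.9%: 10M requests allow 10,000 failed ones.
allowed, remaining = error_budget(0.999, 10_000_000, 2_500)
print(int(allowed), round(remaining, 2))
```

Framing incidents as spend against this budget is what lets an SRE "take it to a higher level" with a product owner while still drilling into specifics with an engineer.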

To read more, click here.

The post The perfect SRE doesn’t exist, but the right one might already be in your organization appeared first on SD Times.
