Vercel introduces a suite of serverless storage solutions
https://sdtimes.com/data/vercel-introduces-a-suite-of-serverless-storage-solutions/ (Mon, 01 May 2023)

Vercel announced a suite of serverless storage solutions: Vercel KV, Vercel Postgres, and Vercel Blob. The products are meant to make it easier to server-render just-in-time data, part of the company’s effort to “make databases a first-class part of the frontend cloud.”

Vercel KV is an easy-to-use, durable, serverless Redis solution powered by Upstash. With Vercel KV, developers can create Redis-compatible databases that can be written to and read from Vercel’s Edge Network in regions they designate, requiring only minimal configuration.
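
For readers who want a sense of the developer experience, a minimal sketch of reading and writing through the @vercel/kv client follows; the key name, value, and expiry are hypothetical.

```ts
import { kv } from '@vercel/kv';

// Cache a small JSON value with a one-hour expiry, then read it back.
// Connection details come from the KV environment variables Vercel provisions.
await kv.set('user:session:42', { name: 'Ada', plan: 'pro' }, { ex: 3600 });

const session = await kv.get('user:session:42');
console.log(session); // { name: 'Ada', plan: 'pro' }
```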

Vercel Postgres is a serverless SQL database built for the frontend, powered by Neon. It provides a fully managed, fault-tolerant, and highly scalable database with excellent performance and low latency for web applications. It is designed to work seamlessly with the Next.js App Router and Server Components, as well as other frameworks like Nuxt and SvelteKit, making it easy to retrieve data from a Postgres database and use it to render dynamic content on the server as quickly as static content.
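
As an illustration of the developer-facing API, here is a hedged sketch of querying Vercel Postgres with the sql tag from @vercel/postgres; the table and column names are hypothetical.

```ts
import { sql } from '@vercel/postgres';

// Tagged-template queries are parameterized automatically, so the interpolated
// value below is sent as a bound parameter rather than concatenated SQL.
const minViews = 100;
const { rows } = await sql`
  SELECT id, title
  FROM posts
  WHERE views >= ${minViews}
  ORDER BY created_at DESC
  LIMIT 10
`;
console.log(rows);
```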

Lastly, Vercel Blob, powered by Cloudflare R2, enables users to upload and serve files at the edge. It can store files like images, PDFs, CSVs, or other unstructured data, and is useful for files normally kept in an external file storage solution such as Amazon S3, files that are programmatically uploaded or generated in real time, and more.
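
A minimal sketch of uploading a generated file with @vercel/blob follows; the path, contents, and public access setting are illustrative assumptions.

```ts
import { put } from '@vercel/blob';

// Upload a programmatically generated CSV and get back a URL it can be served from.
const csv = 'id,total\n1,42\n2,17\n';
const { url } = await put('reports/q1-summary.csv', csv, { access: 'public' });
console.log(url);
```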

“Frameworks have become powerful tools to manipulate backend primitives. Meanwhile, backend tools are being reimagined as frontend-native products. This convergence means bringing data to your application is easier than ever, and we wanted to remove the final friction point: getting started,” Vercel stated. 

Vercel KV, Vercel Postgres, and Vercel Blob are built on open standards and protocols, are designed for low latency and efficient data fetching, and are fully integrated with Vercel’s existing tools and workflows.

Additional details are available in Vercel’s announcement.

InfluxDB 3.0 released with rebuilt database and storage engine for time series analytics
https://sdtimes.com/data/influxdb-3-0-released-with-rebuilt-database-and-storage-engine-for-time-series-analytics/ (Wed, 26 Apr 2023)

InfluxData announced expanded time series capabilities across its product portfolio with the release of InfluxDB 3.0, the company’s rebuilt database and storage engine for time series analytics.

“InfluxDB 3.0 is a major milestone for InfluxData, developed with cutting-edge technologies focused on scale and performance to deliver the future of time series,” said Evan Kaplan, CEO at InfluxData. “Built on Apache Arrow, the most important ecosystem in data management, InfluxDB 3.0 delivers on our vision to analyze metric, event, and trace data in a single datastore with unlimited cardinality. InfluxDB 3.0 stands as a massive leap forward for both time series and real-time analytics, providing unparalleled speed and infinite scalability to large data sets for the first time.”

The solution was originally developed as the open-source project InfluxDB IOx and was built in Rust. It was then rebuilt as a columnar database that leverages the scale and performance of the Apache Arrow data structure to deliver real-time query responses. 

Users also benefit from unlimited cardinality and high throughput for continuously ingesting, transforming, and analyzing billions of time series data points, as well as low-cost object storage and SQL language support. 
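
To give a flavor of that SQL support, here is an illustrative query of the kind InfluxDB 3.0 is designed to answer; the measurement, tag, and field names are hypothetical, and the statement would be submitted through whichever InfluxDB 3.0 client or API a team uses.

```ts
// Average CPU usage per host in five-minute windows over the last hour (illustrative only).
const query = `
  SELECT
    date_bin(INTERVAL '5 minutes', time) AS bucket,
    host,
    avg(usage_percent) AS avg_usage
  FROM cpu
  WHERE time >= now() - INTERVAL '1 hour'
  GROUP BY date_bin(INTERVAL '5 minutes', time), host
  ORDER BY bucket
`;
console.log(query);
```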

The new version is available now in InfluxData’s cloud products, including the fully managed service InfluxDB Cloud Dedicated. InfluxData also announced InfluxDB 3.0 Clustered and InfluxDB 3.0 Edge, which will give developers next-gen time series capabilities in a self-managed database; InfluxDB 3.0 will be available in those products later in the year.

Slack’s new platform makes it easier for developers to build and distribute apps
https://sdtimes.com/software-development/slacks-new-platform-makes-it-easier-for-developers-to-build-and-distribute-apps/ (Mon, 24 Apr 2023)

Slack has launched its next-generation platform with new features and capabilities to make it easier for developers to build and distribute apps on the Slack platform. 

The platform includes a modular architecture grounded in building blocks like functions, triggers, and workflows. They’re remixable, reusable, and hook into everything flowing in and out of Slack. 

It also includes new tools, such as the Slack CLI and a TypeScript SDK, that simplify and clarify the most tedious parts of building on top of Slack. Developers can easily share what they’ve built anywhere in Slack: with a link trigger, a workflow becomes portable and can be shared in a message, added to bookmarks, put in a canvas, and more.
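
As a rough sketch of how those building blocks fit together in the Deno-based TypeScript SDK, the snippet below defines a workflow with one built-in step; the callback_id, titles, and input are illustrative rather than taken from Slack’s announcement.

```ts
import { DefineWorkflow, Schema } from "deno-slack-sdk/mod.ts";

// A workflow is a reusable building block; a link trigger (defined separately)
// is what makes it shareable in a message, a bookmark, or a canvas.
export const GreetingWorkflow = DefineWorkflow({
  callback_id: "greeting_workflow",
  title: "Send a greeting",
  input_parameters: {
    properties: {
      channel: { type: Schema.slack.types.channel_id },
    },
    required: ["channel"],
  },
});

// Reuse Slack's built-in SendMessage function as a step in the workflow.
GreetingWorkflow.addStep(Schema.slack.functions.SendMessage, {
  channel_id: GreetingWorkflow.inputs.channel,
  message: "Hello from a next-gen Slack workflow!",
});
```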

Lastly, developers now have access to secure deployment, data storage, and authentication powered by Slack-managed serverless infrastructure, along with a fast, Deno-based TypeScript runtime that keeps them focused on their code and their users.

Overall, the next-gen platform aims to provide a more seamless and streamlined experience for both developers and Slack users.

“Listening to developers, admins, and users is critical to building, maintaining, and evolving a platform like ours. We know that it’s been too darn difficult building custom integrations, ensuring that they’re enterprise-ready from day one, and keeping them fresh whenever new Slack features are released, regardless of experience level or interest,” said Taylor Singletary, head of developer relations at Slack. “After witnessing our customers’ enormous success in automating work with Workflow Builder, we knew we had to bring that automation power to even more people.”

Slack stated that Workflow Builder will soon become a no-code tool that puts the power of automating Slack and integrating everyday tools directly into the hands of users. The functions and workflows will become remixable as users discover new ways to combine triggers, inputs, and outputs with functions for the software they use most. 

UserTesting announces friction testing capability
https://sdtimes.com/software-development/usertesting-announces-friction-testing-capability/ (Wed, 12 Apr 2023)

UserTesting announced machine learning innovations to the UserTesting Human Insight Platform to help businesses gain the context needed to understand and address user needs.

One update is friction detection, powered by machine learning, which visually identifies moments, both in individual video sessions and across multiple videos, where people exhibit friction behaviors like excessive clicking or scrolling while using digital products, including prototypes, apps, and websites, the company said in a blog post.

Organizations now have the ability to merge behavioral data with video feedback to obtain a comprehensive understanding of where users struggle and enhance the likelihood of successful outcomes. This functionality is especially beneficial for product and design teams, as it empowers them to visualize the user’s journey and refine processes before committing costly development resources.
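
UserTesting has not published how its model works, but a simplified illustration of the kind of behavioral signal involved, a naive “excessive clicking” heuristic over raw click events, might look like the following; the thresholds are arbitrary.

```ts
interface ClickEvent { timestamp: number; x: number; y: number; }

// Flag a burst when `threshold` consecutive clicks land within `radiusPx` of the
// first click in the run and within `windowMs` of it.
function detectExcessiveClicks(
  clicks: ClickEvent[],
  windowMs = 1000,
  radiusPx = 30,
  threshold = 4,
): ClickEvent[][] {
  const bursts: ClickEvent[][] = [];
  let run: ClickEvent[] = [];
  for (const click of clicks) {
    const anchor = run[0];
    const sameBurst =
      anchor !== undefined &&
      click.timestamp - anchor.timestamp <= windowMs &&
      Math.hypot(click.x - anchor.x, click.y - anchor.y) <= radiusPx;
    run = sameBurst ? [...run, click] : [click];
    if (run.length === threshold) bursts.push(run);
  }
  return bursts;
}
```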

The update also includes an integration with Microsoft Teams, which will allow joint users of UserTesting and Microsoft Teams to easily share videos and related content with colleagues without leaving the UserTesting platform.

UserTesting also introduced expanded capabilities for Invite Network to help teams gain access to more audiences with increased privacy. The company stated that it will soon offer an integrated login experience for customers when they access the UserTesting, UserZoom, and EnjoyHQ platforms.

Android updates data deletion policy to provide more transparency to users
https://sdtimes.com/data/android-updates-data-deletion-policy-to-provide-more-transparency-to-users/ (Fri, 07 Apr 2023)

Google announced a new data deletion policy to provide users with more transparency and authority when it comes to managing their in-app data.

For applications that allow the creation of user accounts, developers will soon be required to include an option for users to initiate the deletion of their account and associated data, both within the app and online. This option will have to be linked to the app’s Data safety form. 

“While Play’s Data safety section already lets developers highlight their data deletion options, we know that users want an easier and more consistent way to request them,” said Bethel Otuteye, the senior director of product management at Android App Safety in a blog post. “By creating a more intuitive experience with this policy, we hope to better educate our shared users on the data controls available to them and create greater trust in your apps and in Google Play more broadly.”

With this new option, users who do not want to delete their entire account will be able to delete specific data instead, such as activity history, images, or videos. 

Developers who need to keep certain data for legitimate reasons, such as security, fraud prevention, or regulatory compliance, are required to disclose those practices clearly.

Google is asking developers to submit answers to the new deletion questions in their app’s Data safety form by December 7th. Google Play users will begin to see the changes reflected in an app’s store listing by early next year. 

Google now shows datasets in search results
https://sdtimes.com/data/google-now-shows-datasets-in-search-results/ (Mon, 06 Mar 2023)

The new ability to see datasets in search results is aimed at helping people working in scientific research, business analysis, or public policy get access to data quickly, according to Natasha Noy, a research scientist, and Omar Benjelloun, a software engineer, at Google Research in a blog post.

Google search engine users can click on any of the top three results to get to the dataset page, or explore further by clicking “More datasets.” Users get essential metadata about datasets, and previews of the data where available, and can then go to the repositories that host the datasets. 

This feature is powered by Dataset Search, a search engine specifically designed for datasets, which has indexed over 45 million datasets from over 13,000 websites. Dataset Search gathers information from various areas including government, scientific, and commercial datasets.

Dataset Search indexes dataset pages that contain schema.org structured data and displays key elements such as description, license, temporal and spatial coverage, and available download formats.

Google encourages dataset authors to ensure that their web pages have machine-readable metadata so that Dataset Search can find it more easily. The best way to do this is to publish in a dataset repository that automatically includes this metadata. 
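
A minimal sketch of that markup, expressed here as a TypeScript object that would be serialized into a <script type="application/ld+json"> tag on the dataset’s landing page, is shown below; the dataset itself is made up.

```ts
// schema.org Dataset metadata that Dataset Search can crawl and index.
const datasetMetadata = {
  "@context": "https://schema.org/",
  "@type": "Dataset",
  name: "City air quality readings, 2010-2022", // hypothetical dataset
  description: "Hourly PM2.5 and ozone readings from municipal sensors.",
  license: "https://creativecommons.org/licenses/by/4.0/",
  temporalCoverage: "2010-01-01/2022-12-31",
  spatialCoverage: "Example City",
  distribution: [
    {
      "@type": "DataDownload",
      encodingFormat: "CSV",
      contentUrl: "https://example.com/air-quality.csv",
    },
  ],
};
console.log(JSON.stringify(datasetMetadata, null, 2));
```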

“In the scientific community and throughout various levels of the public sector, reproducibility and transparency are essential for progress, so sharing data is vital. For one example, in the United States a recent new policy requires free and equitable access to outcomes of all federally funded research, including data and statistical information along with publications,” the blog authors wrote. “As data sharing continues to grow and evolve, we will continue to make datasets as easy to find, access, and use as any other type of information on the web.”

Talend Winter ‘23 release introduces cloud migration capabilities
https://sdtimes.com/data/talend-winter-23-release-introduces-cloud-migration-capabilities/ (Tue, 28 Feb 2023)

Data integration company Talend has announced updates to Talend Data Fabric, which is an end-to-end platform for data discovery, transformation, governance, and sharing. 

The Winter ‘23 release adds capabilities for automating cloud migrations and data management, expanding data connectivity, and improving data visibility, quality, control, and access. 

To ease migrations, Talend has added the ability to publish workflows created on-premises to cloud platforms. 

There is a new AI-powered feature called Smart Services that automates task scheduling and job orchestration. Through Smart Services, users can pause and resume tasks based on smart timeouts, which reduces computing time and increases efficiency.

Through its Universal Spark capability, the company also is adding support for newer Apache Spark releases, enabling data scientists to develop Spark jobs once and quickly switch them to run on different Spark versions as needed. 

Also included in this release are new connectors for SAP S/4HANA and SAP Business Warehouse on HANA; ad platforms like TikTok, Snapchat, and Twitter; and cloud databases like Amazon Keyspaces, Azure SQL Database, Google Bigtable, and Neo4j Aura Cloud. 

There are also new observability features to help data scientists uncover blind spots in their data. For example, they can check validity and usage of data types, apply contextual data quality rules, use the Talend Trust Score to see how data quality evolves over time, and gain visibility into sharing policies. 

Talend also made updates to Stitch, which is a fully managed cloud ETL service. It now has role-based access controls and new pipeline monitoring capabilities, enabling teams to get metrics on data ingestion, including data volumes, data freshness, and schema changes. 

“Winter ’23 is based on direct customer feedback and continuing to support those on the front lines charged with the increasingly daunting task of extracting maximum, ongoing value from corporate data,” said Jason Penkethman, chief product officer at Talend. “As well as continuing to drive operational efficiency and expedite data modernization efforts and returns, with Winter ’23, we are also empowering our customers to continuously monitor data throughout its lifecycle and understand and impact how it evolves and moves to fuel positive business outcomes.”

Don’t let data compliance block software innovation; automation is the key
https://sdtimes.com/software-development/dont-let-data-compliance-block-software-innovation-automation-is-the-key/ (Fri, 24 Feb 2023)

The need for the digital transformation of business processes, operations, and products is nearly ubiquitous. This is putting development teams under immense pressure to accelerate software releases, despite time and budget constraints. At the same time, compliance with data privacy and protection mandates, as well as other risk mitigation efforts (e.g., zero trust), often choke the rate of innovation by making it harder for development teams to acquire and use high-quality test data. Is it possible to achieve both of these seemingly opposing requirements, speed and protection? 

The answer lies in a familiar tactic: automation. Development teams are increasingly adept at automating huge chunks of their work, from setting up the necessary infrastructure environments to building, integrating, testing, and releasing software. Call it DevOps or CI/CD, the tactic is the same: ruthlessly automate mundane or repetitive tasks. To ensure compliance requirements don’t hinder development, IT leaders must similarly prioritize automating data profiling and protection as a normal part of their development pipelines. 

The growing impact of data privacy on software development

The regulatory landscape for data privacy and protection continues to grow, resulting in ever-increasing, and increasingly complex, compliance requirements. In fact, McKinsey found that three-quarters of all countries have adopted data localization rules, which have “major implications for the IT footprints, data governance, and data architectures of companies, as well as their interactions with local regulators.” 

Existing data privacy regulations such as GDPR in the EU and HIPAA in the US, updates to older mandates (e.g., the recently updated Federal Trade Commission (FTC) Safeguards Rule mandated by the Gramm-Leach-Bliley Act), and new and emerging laws (e.g., the Virginia Consumer Data Protection Act and Canada’s Consumer Privacy Protection Act) all threaten to slow down software development and innovation by adding layers of security requirements onto the development process. 

Even without the introduction of new privacy mandates, the impact of data privacy and security requirements on development is almost certain to grow. For one thing, development and testing environments have proven to be rich attack targets for threat actors. From source code management systems to infrastructure such as virtual test servers to the test data itself, all are attractive targets for bad actors seeking to compromise systems and data. Add in the many different cloud development platforms like Salesforce and SAP, and it’s clear there is plenty of opportunity for a hungry hacker with nefarious intentions. 

Therefore, teams must ensure the entire application lifecycle is secure, including development and test environments, whether on-prem or in the cloud. How do IT and security accomplish this without slowing development and release cycles? The answer lies in test data automation.

Test data management meets DevOps

The software development process is reliant on access to fresh test data. Traditional methods for managing and provisioning test data are typically manual and tremendously slow – think ticketing systems and siloed, request-fulfill models that can take days or even weeks. These processes are very much at odds with modern development methods such as DevOps and CI/CD, which demand fast, iterative release cycles. 

This is where application innovation often grinds to a halt. DevOps and DevSecOps processes have automated quality assurance testing and security and compliance testing throughout the CI/CD pipeline, but data provisioning and governance have remained a manual and time-consuming practice. Enter DevOps test data management (TDM), which automates the “last mile” of DevOps and provides fast delivery of lightweight, protected data in minutes instead of days, weeks, or months. With DevOps TDM, organizations can accelerate development and testing and, in turn, increase compliance and innovation.

Just how much can DevOps TDM accelerate software innovation? Consider one example from Dell Technologies. The technology giant’s developers needed quick access to fresh test data, but, like many other organizations, manually provisioning the data was a slow, tedious process. 

By automating DevOps test data management, Dell significantly increased the speed and efficiency of its test data provisioning and governance. Now, 92% of Dell’s ~160 global, non-production database environments are refreshed automatically on a bi-weekly basis, and developers can initiate releases through their CI/CD pipelines in just 17 minutes. This has allowed the Dell team to run 6 million pipelines in the first quarter of 2022, and more than 50 million since implementing this standardized, automated approach. 

Shrink the surface area of private data

Antiquated approaches to test data management often rely on scripts or otherwise poorly integrated processes that result in the proliferation of sensitive data throughout the enterprise. It’s not uncommon for each development environment to have its own copy of sensitive production data for testing purposes. And often developers maintain their own copies for coding and unit testing. Many enterprises end up with hundreds or even thousands of uncontrolled copies of sensitive data.

Privacy mandates and security policies treat these copies of sensitive data no differently than the production databases from which they were spawned. Sensitive data such as personally identifiable information (PII) or cardholder data must be secured to the same degree, whether or not it’s in production. This often translates into requiring encryption both at rest and in transit, as well as carefully managed access controls and other protections. And then there are the near-universal requirements for the right to be forgotten. Privacy mandates regularly require businesses to destroy personal data upon request. It does not matter where that data lives.

The solution is eliminating the replication of sensitive data through the use of data masking. To provide production-quality data to your teams and non-production environments without multiplying the burden of security and privacy protections, DevOps TDM approaches — when implemented properly — automate the masking of sensitive data. In effect, this step shrinks the surface area that you must protect. This reduces your compliance and security risks as well as the impact on your budget.

Quite simply, having less sensitive data strewn about your business means there is less for you to protect. Automation can make that possible.

Starting small but thinking big

Automating with DevOps TDM may appear overwhelming at first: where do you start? This is one change where it is very easy to start small, automating test data delivery and masking for just one or a handful of applications. Many businesses begin by addressing their most sensitive environments and those where CI/CD pipelines already exist, such as customer-facing apps. Here, the need for protection and the underlying automation framework (i.e., the DevOps toolchains) already exist.

But businesses should also think big as they evaluate solutions. The number of distinct data sources is likely to expand over time. You might have a SQL database on AWS today, but then add your Salesforce platform and mainframe DB2 into the mix in the future. Masking these data sources while preserving referential integrity across them may prove challenging but is essential to effective integration and user acceptance testing.
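
One common way to square masking with referential integrity is deterministic pseudonymization: the same input always maps to the same token, so masked copies still join correctly across systems. A minimal sketch follows; the field names and key handling are illustrative only.

```ts
import { createHmac } from 'node:crypto';

// Deterministic masking: identical inputs always produce identical tokens, so a
// customer masked in the CRM copy still matches the same customer in the billing copy.
function maskValue(value: string, secret: string): string {
  return createHmac('sha256', secret).update(value).digest('hex').slice(0, 16);
}

const secret = process.env.MASKING_KEY ?? 'dev-only-secret'; // manage this key centrally in practice

const crmRecord = { customerId: 'C-1001', email: 'jane@example.com' };
const billingRecord = { customerId: 'C-1001', invoice: 'INV-88' };

const maskedCrm = {
  ...crmRecord,
  customerId: maskValue(crmRecord.customerId, secret),
  email: maskValue(crmRecord.email, secret),
};
const maskedBilling = { ...billingRecord, customerId: maskValue(billingRecord.customerId, secret) };

console.log(maskedCrm.customerId === maskedBilling.customerId); // true: the join still works
```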

Ultimately, businesses should centralize DevOps TDM while giving their development teams autonomy over the acquisition and use of test data. Centralization means you can apply policies for the masking of sensitive fields and use database virtualization to cost-effectively provision data. 

The benefits of DevOps TDM are substantial. Not only do businesses improve compliance and mitigate risks, they also speed up development and reduce costs. It represents one of those rare instances where a tradeoff between faster, better (safer) and cheaper is no longer required.

Developers have to keep pace with the rise of data streaming
https://sdtimes.com/data/developers-have-to-keep-pace-with-the-rise-of-data-streaming/ (Mon, 13 Feb 2023)

The rise of data streaming has forced developers to either adapt and learn new skills or be left behind. The data industry evolves at supersonic speed, and it can be challenging for developers to constantly keep up.

SD Times recently had a chance to speak with Michael Drogalis, the principal technologist at Confluent, a company that provides a complete set of tools needed to connect and process data streams. (This interview has been edited for clarity and length.)

SD Times: Can you set the context for how much data streaming is growing today, and how important it is that developers pay more attention to it?

Drogalis: I remember back in 2013 or 2014, I attended the Strange Loop Conference, which was really great. And as I was walking around, I saw there was this talk on the main stage by Jay Kreps, who’s now Confluent’s CEO, and it was about Apache Kafka. I walked away with two things on my mind. Number one, this guy was super tall, like 6 foot 8, which left quite an impression. And the other was that there were at least two people in the world who cared about streaming, which was basically the vibe back then; it was a very new technology.

There were a lot of academic papers about it, and there were clearly patches of interest in the technology landscape that could be put together, but none of them had really broken out. 

The other project at that time was Apache Storm, which was a real-time stream processor, but it kind of just lacked the components around it. And so there was like a set of people: a small community. 

And then fast forward to today, and it’s just a completely different world. I have the privilege of working here and seeing companies, every size, every vertical, every industry, every use case, and with every latency requirement. And the transition is kind of just shocking to me that you don’t see a lot of technologies break out that quickly over the course of a decade.

SD Times: Are there any projects around this that you’re seeing are interesting?

Drogalis: I saw a few stats that are interesting this year. The Apache Foundation’s Kafka is one of the most active projects, which is pretty cool, because the Apache Foundation now has a huge number of projects that it incubates. And I also saw on the StackOverflow annual developer survey that Kafka was ranked as one of the most loved or one of the most recognizable technologies. To see it break out from being an undercurrent to something that’s really important and on peoples’ minds is pretty great.

SD Times: What are some of the challenges of handling data streaming today?

Drogalis: It’s kind of like driving on the opposite side of the road than you’re used to. You go to school, and you’re taught to program in maybe Java or Python. And so the basic paradigm everyone is taught is, you have a blob of data in a data structure in a file, and you suck it up, and then you process it, and then you spit it out somewhere. And you do this over and over again until you perform your data processing task, or you do whatever needs to be done. 

And streaming really turns this all on its head. You have this inversion of flow, and instead of bounded data structures, you have unbounded data structures. The data continuously comes in and you have to constantly process the very next thing that shows up. You really can’t arbitrarily scan into the future, because you don’t really know what’s coming. Events may be arriving out of order, and you don’t know if you have the complete picture yet. Everything is effectively asynchronous by default. And it takes some getting used to since it’s becoming an increasingly robust paradigm. 
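
To make the “unbounded” idea concrete, here is a minimal consumer loop using the kafkajs client; the broker address, topic, and event shape are placeholders.

```ts
import { Kafka } from 'kafkajs';

// There is no end-of-file in a stream: eachMessage fires for whatever arrives next.
const kafka = new Kafka({ clientId: 'demo-app', brokers: ['localhost:9092'] }); // placeholder broker
const consumer = kafka.consumer({ groupId: 'clickstream-readers' });

await consumer.connect();
await consumer.subscribe({ topic: 'page-clicks', fromBeginning: false });
await consumer.run({
  eachMessage: async ({ message }) => {
    const event = JSON.parse(message.value?.toString() ?? '{}');
    // Process the very next event; the "complete picture" may never exist.
    console.log(event);
  },
});
```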

But, it certainly is a big change to get your head around. I kind of liken it to when people were starting to adopt JavaScript on the server, and it’s async. So it definitely takes a little bit of getting used to but the power makes it worth it.

SD Times: So what are some of the best practices and most common skills that are needed to deal with the growth of data streaming?

Drogalis: A lot of it kind of comes down to experience. I mean, this is sort of a newer technology that’s kind of evolved somewhat recently. So a lot of it is just getting your hands dirty, going out and figuring out how does it work? What will work best? 

As far as best practices, I think a couple of things jump out to me. Number one is getting your head around the idea of data retention. When you work with batch-oriented systems, the idea is generally to just keep all your data forever, which can work. You may have some expiration policy that works in the background, where you mop up data that you don’t need at some point, but streaming systems have this idea of retention built into them, where you age out old data. You make this trade-off between what you keep and what you throw away, and what you keep is kind of the boundary of what you’re able to process. 

The second thing that’s worth studying up on is being intentional about your designs and the idea of time. With streaming, your data can come out of order. A classic example of this is collecting events that are coming off of cell phones: maybe somebody takes a cell phone and drives into the Amazon rainforest, and they have no connectivity. Then they come out, they reconnect, and they upload data from last week. The systems that you design have to be intelligent enough to look at it and say, this data didn’t actually just happen, it’s from a week ago. There’s power and there’s complexity. The power is that you can retroactively update your view of the world, and you can take all kinds of special actions depending on what you want to do in your domain. The complexity is that you have to figure out how to deal with that and factor it into your programming model. 
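
A toy sketch of that idea is below: state is keyed by event time rather than arrival time, so a late upload lands in the window it belongs to. The event shape and hourly bucketing are made up for illustration.

```ts
interface SensorEvent { deviceId: string; eventTime: number; value: number; }

// Aggregate by the hour the event actually happened in (event time),
// not the hour it arrived at the processor (processing time).
const hourlyTotals = new Map<string, number>();

function ingest(e: SensorEvent): void {
  const hourBucket = new Date(e.eventTime).toISOString().slice(0, 13); // e.g. "2023-02-06T10"
  const key = `${e.deviceId}:${hourBucket}`;
  hourlyTotals.set(key, (hourlyTotals.get(key) ?? 0) + e.value);
}

// A phone that reconnects and uploads week-old data still updates last week's bucket.
ingest({ deviceId: 'phone-1', eventTime: Date.parse('2023-02-06T10:15:00Z'), value: 1 });
console.log(hourlyTotals);
```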

Zero-Copy Integration standard made available to public
https://sdtimes.com/data/zero-copy-integration-standard-made-available-to-public/ (Wed, 08 Feb 2023)

The Digital Governance Council and Data Collaboration Alliance have announced the public release of Zero-Copy Integration, which is a Canadian standard that provides a framework for meeting strict data protection regulations and dealing with the risks related to data silos and copy-based data integration. 

The Zero-Copy Integration framework establishes the following principles: data management via a shared data architecture, data sharing via access-based data collaboration, data protection via universal access controls set in the data layer, data governance via data products and federated stewardship, prioritization of data-centricity and active metadata, and prioritization of solution modularity. 

It supports a project-by-project approach to reducing data silos and avoiding “rip and replace” approaches that don’t factor in existing IT investments. 

Zero-Copy Integration is ideal for developing new applications, predictive analysis, digital twins, customer 360 views, AI/ML operationalization, workflow automations, and legacy system modernization. 

Currently the Digital Governance Standards Institute is working on plans to advance the standard in international standards organizations. 

“By eliminating silos and copies from new digital solutions, Zero-Copy Integration offers great potential in public health, social research, open banking, and sustainability,” said Keith Jansa, CEO at the Digital Governance Council. “These are among the many areas in which essential collaboration has been constrained by the lack of meaningful control associated with traditional approaches to data sharing.”
