Majority of Data Engineers Spending Half or More of Their Time Handling Data Issues, Soda Reveals
BRUSSELS, June 14, 2023 — Soda has released research finding that 61% of Data Engineers are spending half or more of their time handling data issues. Meanwhile, the research also found that just 12% of engineers have managed to keep the handling of data issues below 20% of their time.
Gartner defines data engineers as having a “key role in building and managing data pipelines, and promoting data and analytics use cases to production.” However, according to Soda’s research, which is based on a survey of almost 100 data engineers and subject matter experts, engineers are increasingly finding the bulk of their time consumed by data issues. Two-thirds of data engineers also believe poor source data is the root cause of these issues, underlining the importance of data quality checks at the beginning of pipelines.
Conversely, the research reveals that data engineers working with subject matter experts are less affected by data issues, with 57% of respondents able to keep the handling of data issues below 20% of their time. This suggests that the volumes of data in larger organizations are overwhelming data engineers with data quality challenges at the source.
When asked in the survey how they currently find, analyze and resolve data issues, data engineers confirmed that current processes are lengthy and inefficient: “we get an email or (Microsoft) Teams message flagging a problem, end up emailing back and forth whilst investigating, and finally either fix it or find out that the issue is normal,” said one. Another confirmed that after analyzing errors and attempting to fix issues, their team often needs to “recreate corrected data from source manually.”
“Data engineers have a huge amount of innovation and creativity to offer organizations, but time and again their roles are being swamped with trying to overcome data quality challenges,” said Maarten Masschelein, CEO and Co-Founder of Soda. “It is absolutely critical for businesses wanting to get maximum value from their data that engineers are freed from the burden of solving repetitive data issues. To best address this problem, organizations need to ensure that they are preventing issues as early as possible to prevent breakages at source before they wreak havoc further and require ever more complex fixes. It also requires subject-matter experts to be empowered to self-serve and autonomously turn business requirements into data quality checks, democratizing data whilst also further freeing engineers from an endless stream of data issues.”
The research also found that 60% of data engineers surveyed identified a lack of resources and time for data and analytics engineering teams as a primary cause of bottlenecks in resolving data issues. Half of data engineers also identified getting tools and processes right for both data consumers and data producers as important to overcoming these bottlenecks.
About Soda
Soda is the data quality platform that delivers end-to-end data quality management to test, prevent, and resolve data issues, and empower everyone to share and use reliable, high-quality data to improve decision making in a data-informed culture. Soda’s platform has received industry recognition from Gartner, ThoughtSpot, and Modern Data Stack. Soda serves over 100 customers around the world. For more information, visit soda.io.
Source: Soda