Tag: LinkedIn
LinkedIn Implements New Data Trigger Solution to Reduce Resource Usage For Data Lakes
With its vast user base and the numerous interactions that occur daily, LinkedIn generates an enormous amount of data every day. The billions of data points fuel various applications, from rankings to search. The additio Read more…
LinkedIn to Open Source Its Data Lakehouse Management Tool OpenHouse
LinkedIn has announced the open sourcing of OpenHouse, a management framework for data lakehouses. OpenHouse offers a control plane that gives users an interface with managed tables in open-source data lakehouse deploy Read more…
Navigating the AI Skills Revolution in the Age of GenAI: LinkedIn Report
The launch of ChatGPT and similar generative AI technologies is reshaping the skills required in the workplace, according to a new report from LinkedIn. “The Future of Work Report: AI at Work” found the pace at wh Read more…
Cloud Analytics Firm StarTree Receives $24 Million in Series A Funding
Even coming off the momentum of 2019, 2020 turned into a stunning turning point for the cloud takeover of AI and analytics (our own editor Alex Woodie described it as “the great cloud migration [starting] to look like Read more…
LinkedIn Open Sources Dagli to Simplify ML Pipeline Building
LinkedIn yesterday announced that it has open sourced Dagli, a Java-based framework for building and deploying machine learning pipelines. While the number and quality of tools for developing machine learning models h Read more…
LinkedIn Unveils Open-Source Toolkit for Detecting AI Bias
As AI becomes increasingly integrated in our day-to-day lives, the implications of bias in AI grow more and more worrisome. Training data that appears impartial is often influenced by historical and socioeconomic factors Read more…
Navigating the AI and Analytics Job Market During COVID-19
The market for AI and analytics jobs has not been spared from the wrath of COVID-19, which has directly led to the loss of more than 30 million American jobs over the past four months. It may not appear to be an ideal ti Read more…
LinkedIn Unleashes ‘Nearline’ Data Streaming
LinkedIn is releasing its Brooklin data ingestion service to the open source community. Brooklin has been running in production on the social media platform since 2016. The stateless and distributed service is used pr Read more…
Dr. Elephant Leads the Performance Parade
I started working on big data infrastructure in 2009 when I joined Cloudera, which at the time was a small startup with about 10 engineers. It was a fun place to work. My colleagues and I got paid to work on open source Read more…
How Kafka Redefined Data Processing for the Streaming Age
The Apache Kafka phenomenon reached a new high today when Confluent announced a $50 million investment from the venture capital firm Sequoia. The investment signals renewed confidence that Kafka is fast becoming a new an Read more…
Dr. Elephant Steps Up to Cure Hadoop Cluster Pains
Getting jobs to run on Hadoop is one thing, but getting them to run well is something else entirely. With a nod to the pain that parallelism and big data diversity bring, LinkedIn unveiled a new release of Dr. Elephant Read more…
LinkedIn Adds to Growing List of ML Tools
LinkedIn is releasing to the open source community its machine-learning tool used to train the ranking algorithm for its newsfeed, advertising and customer recommendations. The world's largest professional network (NY Read more…
Kafka Creators Tackle Consistency Problem in Data Pipelines
One of the big questions surrounding the rise of real-time stream processing applications is consistency. When you have a distributed application involving thousands of data sources and data consumers, how can you be sur Read more…
LinkedIn Diagnostics Help Tune Hadoop Jobs
An open source tool released last by LinkedIn developers is intended to help Hadoop and Spark users analyze, tune and improve the performance of their workflows. The self-service performance-tuning tool for Hadoop dub Read more…
Kafka Gets a Stream-Processing Makeover
A new library for building streaming applications seeks to shift the focus from analytics to developing the core applications used to process data streams. Confluent Inc. announced a technical preview this week of a new te Read more…
What Data Science Skills Employers Want Now
There's good news if you're looking for a job in data science in 2016 -- the number of job openings in the field appears to be rising as companies look to leverage big data for competitive advantage. But actually landing a covet Read more…
Kafka Tops 1 Trillion Messages Per Day at LinkedIn
There is data in motion, and then there is really big data in motion. The folks at LinkedIn gave us a compelling example of the latter today when it announced that it's using the distributed messaging system Kafka to pro Read more…
One on One with LinkedIn’s VP of Engineering
Why are data scientists tripping over themselves to get their hands on LinkedIn’s data? What’s it like to run one of the world’s biggest social media sites, and how can machine learning algorithms contribute to the Read more…
LinkedIn Spinoff Confluent to Extend Kafka
The LinkedIn team that built the Apache Kafka real-time messaging service has left to form a new company called Confluent. The startup said it would offer a "real-time data platform" built around Apache Kafka. Along w Read more…
LinkedIn Centralizing Data Plumbing with Kafka
As company data grows, so too does the complexity of the data stream. This can lead to a convoluted crisscross of specialized pipelines that don’t scale or play well with each other. Recently at the Hadoop Summit, engineer Joel Koshy discussed how LinkedIn attacked this problem and built a real-time data pipeline that it keeps current across its geographically dispersed datacenters. Read more…