Yelp's AI Review Filter, Zillow's Real-Time Search, & Kafka's Seamless Migration 🚀

Explore how AI and advanced technologies are enhancing safety, speeding up searches, and simplifying massive data migrations.

Did someone forward you this email? Be one of the first to join an exclusive newsletter that delivers curated, actionable tech insights straight to your inbox. Subscribe here!

This Week's Engineering Insights:

  • AI on Patrol: Yelp uses AI to catch harmful reviews before you see them.

  • How Zillow Makes Real-Time Recommendations Fly: Discover the secrets behind its lightning-fast search results.

  • Mission: Impossible... Migrated! Zendesk shares their zero-downtime migration strategy.

YELP AI GATEKEEPERS

Yelp’s AI pipeline for inappropriate language detection in reviews

Yelp, the go-to platform for finding great local businesses, takes user safety seriously. To combat harmful content like hate speech and harassment, they've turned to the power of Artificial Intelligence.

  • The Challenge: Automatically detecting inappropriate language in reviews is tricky. Mistakes can delay good reviews or let bad ones slip through.

  • The Solution: Large Language Models (LLMs) step in! These models excel at understanding nuanced language, making them well suited to flagging egregious content (a rough sketch follows this list).

  • Training the AI: Yelp curated a massive dataset of flagged reviews, including hate speech, harassment, and lewdness. This data helped the AI learn the nuances of inappropriate language.

  • Real-World Impact: Since deploying LLMs, Yelp has stopped over 23,600 harmful reviews from ever reaching the public. That's a major win for user safety!
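
Yelp hasn't published its models or code, so treat the snippet below as a rough illustration only: a zero-shot classifier built on a publicly available Hugging Face model (not Yelp's fine-tuned pipeline) that scores a review against the same categories of inappropriate content. The category labels and threshold are assumptions for the sketch.

```python
# Illustration only: Yelp fine-tunes LLMs on its own curated dataset of flagged
# reviews; this sketch substitutes an off-the-shelf zero-shot classifier.
from transformers import pipeline

# facebook/bart-large-mnli is a public NLI model commonly used for zero-shot
# classification; it is not the model Yelp uses.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

CATEGORIES = ["hate speech", "harassment", "lewdness", "acceptable"]

def flag_review(review_text: str, threshold: float = 0.7) -> dict:
    """Score a review and decide whether to hold it for a human moderator."""
    result = classifier(review_text, candidate_labels=CATEGORIES)
    top_label, top_score = result["labels"][0], result["scores"][0]
    return {
        "label": top_label,
        "score": round(top_score, 3),
        # Keep a human in the loop (insight 3 below): hold the review for
        # moderation only when the model is confident it isn't acceptable.
        "needs_review": top_label != "acceptable" and top_score >= threshold,
    }

print(flag_review("The staff here were rude and the food arrived cold."))
```

The `needs_review` flag is deliberate: the model routes suspicious reviews to moderators rather than making the final call itself.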

Actionable Insights:

  1. Consider using LLMs for tasks requiring nuanced language understanding, like sentiment analysis or spam detection.

  2. Data quality is crucial! Clean and diverse training data helps AI models perform at their best.

  3. Human oversight is still important. AI can be a powerful tool, but human judgment is irreplaceable for making final calls.

As AI continues to evolve, how can we leverage its power to build safer and more positive online communities?

Read More
by Gouthami Senthamaraikkannan ⸱ Yelp ⸱ 8 min read

ZILLOW ELASTICSEARCH

Optimizing Elasticsearch for Low Latency, Real-Time Recommendations

Zillow personalizes home searches with real-time recommendations, but ensuring lightning-fast response times is crucial. Here's how they optimized their Elasticsearch setup for peak performance:

  • Identifying the Bottleneck: Candidate generation, the core of recommending listings, was causing delays. Elasticsearch was spending too much time fetching document details.

  • Optimizing Fetch Methods: Switching from the default "source" method to "doc values" significantly reduced fetch time and network strain. This targeted approach retrieved only the necessary fields, minimizing processing overhead (see the sketch after this list).

  • Cache Magic: Zillow strategically leveraged caching throughout the system. Redis cached full responses, while Elasticsearch's query cache stored frequently used query components, further accelerating response times.

  • Finding the Balance: Zillow tuned Elasticsearch to favor query speed over instant data freshness, accepting slightly delayed index updates in exchange for a smooth, low-latency user experience.
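
The post doesn't include the exact query, so here's a minimal sketch of the fetch-method idea: telling Elasticsearch to skip `_source` and return the candidate fields from doc values instead. The index and field names (`listings`, `region_id`, `listing_id`, `price`, `bedrooms`) are made up for illustration.

```python
# Minimal sketch: fetch candidate fields from doc values instead of _source.
# Index name and field names are hypothetical; adjust to your own mapping.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

response = es.search(
    index="listings",
    body={
        "query": {"term": {"region_id": 12345}},
        "_source": False,  # skip loading and parsing the full _source JSON
        "docvalue_fields": ["listing_id", "price", "bedrooms"],
        "size": 500,
    },
)

# With doc values, the requested fields come back under "fields", not "_source".
candidates = [hit["fields"] for hit in response["hits"]["hits"]]
```

Newer Python clients prefer keyword arguments over `body=`, but the request shape is the same. And if you can trade a little freshness for speed (the "Finding the Balance" point above), relaxing the index's `refresh_interval` is a common companion lever.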

Actionable Insights:

  1. Analyze your Elasticsearch usage: pinpoint performance bottlenecks like Zillow did with the candidate generation phase.

  2. Consider fetch methods: Choose the most efficient approach (source, stored fields, doc values) based on your specific data retrieval needs.

  3. Implement a caching strategy: Utilize caching mechanisms like Redis and Elasticsearch's built-in caches to minimize redundant computations.
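
For insight 3, here's a minimal read-through cache sketch using redis-py. The key scheme and TTL are arbitrary choices for illustration; the article doesn't describe Zillow's caching layers at this level of detail.

```python
# Read-through cache sketch: serve a cached response if present, otherwise
# compute it (e.g. run the Elasticsearch candidate query) and cache the result.
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def recommendations_for(user_id: str, compute_fn, ttl_seconds: int = 120):
    key = f"recs:{user_id}"        # hypothetical key scheme
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: skip Elasticsearch entirely
    result = compute_fn(user_id)   # cache miss: run the expensive query
    r.setex(key, ttl_seconds, json.dumps(result))
    return result
```

Elasticsearch's built-in caches then pick up whatever the response cache misses, reusing frequently repeated query components server-side.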

How can we optimize search technologies to deliver real-time results while maintaining data consistency?

Read More
by Kelsey Juraschka ⸱ Zillow ⸱ 9 min read

KAFKA TO KUBERNETES

Migrating Kafka Cluster to Kubernetes

Imagine migrating a massive Kafka cluster, handling billions of messages daily, without a hitch! Zendesk engineers achieved this remarkable feat, transitioning their Kafka infrastructure to Kubernetes with zero downtime and minimal client impact.

  • The Challenge: Moving a critical Kafka cluster with minimal disruption and client-side changes.

  • The Flawed Approach: Shifting complexity to clients with individual cutovers proved inefficient.

  • The Winning Strategy: Broker-level migration within the existing cluster.

    • New brokers join the cluster on Kubernetes.

    • Data is seamlessly transferred from the old brokers to the new ones (a reassignment sketch follows this list).

    • Clients remain connected to the same cluster throughout, experiencing no interruption.
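
Zendesk's post describes the strategy rather than the code, so the snippet below is only a sketch of the core mechanic: generating a partition-reassignment plan that swaps old broker IDs for their Kubernetes-hosted replacements, then applying it with Kafka's stock tooling. The broker IDs, topic name, and throttle value are invented for illustration.

```python
# Sketch: build a Kafka partition-reassignment plan that moves replicas from
# the old brokers to their replacements running on Kubernetes.
import json

OLD_TO_NEW = {1: 101, 2: 102, 3: 103}  # hypothetical old -> new broker IDs

def build_plan(topic: str, current_assignment: dict) -> dict:
    """current_assignment maps partition number -> list of replica broker IDs."""
    partitions = [
        {
            "topic": topic,
            "partition": partition,
            "replicas": [OLD_TO_NEW.get(broker, broker) for broker in replicas],
        }
        for partition, replicas in current_assignment.items()
    ]
    return {"version": 1, "partitions": partitions}

plan = build_plan("ticket-events", {0: [1, 2, 3], 1: [2, 3, 1], 2: [3, 1, 2]})
with open("reassignment.json", "w") as f:
    json.dump(plan, f, indent=2)

# Apply it with Kafka's bundled tool, throttled so replication traffic doesn't
# starve production clients:
#   kafka-reassign-partitions.sh --bootstrap-server $BROKERS \
#     --reassignment-json-file reassignment.json --throttle 50000000 --execute
```

Because clients keep talking to the same cluster the whole time, they simply follow partition leadership as it moves to the new brokers; there's no client-side cutover at all.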

Actionable Insights:

  1. Analyze your migration complexity: Identify bottlenecks like Zendesk did.

  2. Consider broker-level migration: This innovative approach minimizes client disruption.

  3. Orchestrate and monitor: Utilize tools and automation for a smooth, controlled process.

How can we further streamline complex migrations while ensuring data consistency?

Read More
by Frankie ⸱ Zendesk ⸱ 13 min read

🙏 Support the newsletter

This newsletter is free, and it takes 50% of my time.

I have just started this newsletter, and the best way to support me is to help it grow:

Recommend the newsletter: it is very effective!
This is our third edition, and we have already passed 45 subscribers within a week.
Thanks for supporting!

Okay, that's it for now. Thanks for reading today's newsletter. See you next Tuesday!
