ScORe – Schema On Read for Spark SQL

Lior Chaga | 27 May 2021 | Big Data

Tags: open source, parquet, pruning, schema, Spark, Spark-SQL

The world is not flat, it’s highly nested. With over 4 billion page views per day and over Read More...

TEST in PRODUCTION – should you?

Tal Bar Zvi | 07 Apr 2021 | System

Tags: automation, ci, development cycle, devops, Grafana, Jenkins, production, release engineering, testing, testing cycle, testing in production

You wrote your code. You even tested it. And now, you are eager to git push it. But Read More...

The Challenges Of Uploading 150TB/day From Spark To BigQuery – Part 2

Itai Barel and Gaash Hazan | 25 Mar 2021 | Big Data

Tags: Airflow, BigQuery, Cloud Storage, Google Cloud, Lessons, performance, Scale, Spark

In part 1 of the series we shared the architecture of Taboola’s PV2Google service which uploads over 150TB/day Read More...

The Challenges Of Uploading 150TB/day From Spark To BigQuery – Part 1

Itai Barel and Gaash Hazan | 25 Mar 2021 | Big Data

Tags: Airflow, BigQuery, Cloud Storage, Google Cloud, performance, Scale, Spark

Have you ever tried building an infrastructure to upload 150TB a day? Have you ever tried querying over Read More...

Don’t just run DNS, run it FAST

Ariel Pisetzky | 03 Nov 2020 | System

By Ariel Pisetzky and Tarek Shama. Taken for granted. That’s the way most users and Read More...

If you fall, fall right – a tale of SRE critical incident management

Ariel Pisetzky | 31 Oct 2020 | Culture

Tags: Culture, Engineering Culture

By Yehuda Levi, Tal Valani, Read More...

Anomaly detection using LSTM with Autoencoder

Gali Katz | 14 Sep 2020 | Big Data

Tags: autoencoder, LSTM, Metrics

Taboola is one of the largest content recommendation companies in the world. We maintain hundreds of servers in Read More...

Running a Post Mortem Following “The Moment”

Ariel Pisetzky | 17 Aug 2020 | Culture

Failure. I need to talk about failure, and not any failure, my failure. I need to share it Read More...

That Moment

Ariel Pisetzky | 12 Aug 2020 | Culture

So, the firefighters are in your data center, there is no electricity, and the pager is Read More...

High Scale Service Deployment: Taboola’s Recommended Flow

Tidhar Klein Orbach | 01 Jul 2020 | Java

Tags: Deployment, devops, high scale service, release engineering

This post is not about K8S – nor is it about AWS. It is not about containers – Read More...

Using Spark Dynamic Allocation

Igor Berman | 24 Jun 2020 | Big Data

Tags: big data, dynamic allocation, infra, mesos, performance, Spark

The story starts with metrics. Every mature software company needs to have a metric system to monitor resource Read More...

Collaborative Trial: On Optimizing Recommendation Testing

Maoz Cohen | 09 Jun 2020 | Big Data

Tags: a/b testing, algorithms, big data, data, data science, Monitoring, performance, statistics, testing

Taboola is responsible for billions of daily recommendations, and we are doing everything we can to make those Read More...

Stop waking up at night over MySQL replication

Ariel Pisetzky | 26 May 2020 | Big Data

MySQL Slave Replication Optimization. Written by Yossi Kalif & Ariel Pisetzky. MySQL in Taboola: So you love Read More...

Fear of breaking production? Use Grafana!

Tal Bar Zvi | 07 May 2020 | Big Data

Tags: Grafana, Monitoring, Observability

At Taboola, we deal with scale, huge scale. A small issue might turn into a disaster in a Read More...

Growing by Learning – DIY

Dar Cohen | 29 Feb 2020 | Uncategorized

Tags: Culture, deep learning, Engineering Culture, mentorship

To facilitate flexibility and technological hype, you want to work with people who know how to learn. This Read More...