TechTalks@Foursquare: Tony Zhang and Vladimir Zamiatovskii
Join us for the sixth edition of TechTalks@Foursquare, where we unite the tech community for engaging discussions, fueled by pizza and beer. Held at the vibrant Foursquare office in Dom Sindikata, this event promises an evening of insightful conversations and networking opportunities. Address: Dom Sindikata, Trg Nikole Pasica 5 (entrance next to cafe Soljica), VI Floor. Here is a video showing how to find us: https://www.youtube.com/watch?v=K9dKdEWLsXg
Timetable:
6:00 pm - Meet & greet over pizza & beer
6:15 pm - Vladimir Zamiatovskii: "Efficient PySpark Deployment on EMR Serverless: Strategies and Tools from the Engineering Trenches", 30 min + 15 min Q&A
7:00 pm - Break, 15 min
7:15 pm - Tony Zhang: "From Databases to Data Lakes - A Journey through Evolution and Innovation", 30 min + 15 min Q&A
8:00 pm - Networking and discussions over pizza & beer
8:30 pm - End
Vladimir Zamiatovskii, Software Engineer: "Efficient PySpark Deployment on EMR Serverless: Strategies and Tools from the Engineering Trenches"
Deploying PySpark applications to Amazon EMR (Elastic MapReduce) serverless environments presents a unique set of challenges and opportunities for engineering teams. In this talk, we delve into the intricacies of how our engineering team has developed tailored tools and methodologies to streamline and optimize this deployment process.
We will discuss the architecture and design principles behind our custom deployment tools, which are crafted to seamlessly integrate with EMR serverless environments while maximizing efficiency and minimizing overhead. From packaging and dependency management to configuration and orchestration, attendees will gain insights into the key considerations and best practices for deploying PySpark applications at scale.
Vladimir holds a Master's degree focused on developing and researching methods for improving machine translation and NLP approaches for building knowledge bases from plain text. He is a seasoned professional with a deep understanding of Natural Language Processing (NLP) and Machine Learning (ML). Vladimir has extensive experience with the Apache stack, including Hadoop, Spark, Kafka, Flume, Sqoop, and Airflow, and is proficient with NoSQL databases such as MongoDB and Neo4j.
===
Tony Zhang, Sr. Staff Engineer: "From Databases to Data Lakes - A Journey through Evolution and Innovation"
Join us on a journey through the transformative landscape of data management as we explore the evolution from traditional databases to the expansive realm of data lakes. In this talk, Tony will delve into the historical context, examining the limitations of traditional database systems and the catalysts that drove the emergence of data lakes. From structured to unstructured data, we'll dissect the paradigm shift towards scalability, flexibility, and real-time analytics, facilitated by the advent of data lakes. Drawing upon industry insights and best practices, we'll elucidate the strategic advantages, challenges, and considerations inherent in this evolutionary trajectory. Whether you're a seasoned data professional or just beginning your journey, this talk promises to illuminate the path forward in harnessing the power of data lakes to unlock unprecedented insights and drive innovation in the digital age.
With a focus on GeoAnalytics tools facilitating spatial-temporal analysis at scale, Tony is at the forefront of distributed geo-spatial analytics, spatial-temporal clustering, and batch ETL pipeline development. His expertise extends to the development of fine-tuned unsupervised machine learning models tailored for GPS data clustering, infused with human intuition to enhance accuracy and relevance. Beyond his contributions to GeoAnalytics, Tony was actively engaged in researching autonomous vehicle behavior and fleet management simulation, exploring the intricate dynamics of modern transportation systems. Additionally, his involvement in advancing pavement recycling technology underscores his commitment to leveraging technology for sustainable infrastructure solutions.
When / Where
- When: 2024-05-16 20:00:00 UTC
- Starting: 2024-05-16 20:00:00 UTC
- Upto: 2024-05-16 20:00:00 UTC