Spark Workshop

Dan Serban

Dan Serban is a data engineer who occasionally teaches advanced data engineering workshops using Spark as the big data framework.

Interested in learning the practical applications of a modern, streaming data analytics pipeline? Meet Apache Spark, the big data framework that helps reduce data interaction complexity, increase processing speed and enhance data-intensive, near-real-time applications with deep intelligence.

This 2-hour, intensely hands-on workshop introduces Apache Spark, the open-source cluster computing framework with in-memory processing and streaming capabilities that makes analytics applications up to 100 times faster compared to Hadoop. The workshop is aimed at seasoned developers with an interest in understanding the streaming data pipelines that power today’s real-time analytics engines.

Agenda

- Interactive Data Analytics Overview
- Creating Spark DataFrames From Publicly Available Datasets
- Spark Streaming Overview
- Time Series Analytics Overview
- Graph Analytics With Spark GraphX

All the tools we use during the workshop will be inside one Docker container per attendee on a cloud server. This will make it possible for attendees to continue experimenting at home on their own laptops.

Offline-first apps with WebComponents

AMahdy AbdElAziz

AMahdy AbdElAziz is an international technical speaker, Google Developer Expert (GDE), trainer and developer advocate. He is passionate about web and mobile app development, including PWAs, offline-first design, in-browser databases, and cross-platform tools. He is also interested in Android internals, such as building custom ROMs and customizing AOSP for embedded devices.

PWA, offline-first design, framework.JS and so on: there have been a lot of hype words recently, haven’t there? Let’s explore them in detail and see how they are affecting both mobile and web development.

We will explore how to boost the usability of web and mobile-web apps by implementing offline-first functionality, which is the only way to guarantee a 100% always-on user experience. Low signal or no connectivity should no longer be a blocker for the user, so we will discuss the available solutions for caching, in-browser databases, and data replication. We will also take a look at how Web Components libraries such as Polymer and Vaadin Elements help solve those issues out of the box. There will be a live coding demo to show how simple it is to manipulate a large dataset completely offline.

Java Libraries You Can’t Afford to Miss

Andres Almiray

Andres Almiray is a Java/Groovy developer and a Java Champion with more than 17 years of experience in software design and development. He has been involved in web and desktop application development since the early days of Java. Andres is a true believer in open source and has participated in popular projects like Groovy, Griffon, and DbUnit, as well as starting his own projects (Json-lib, EZMorph, GraphicsBuilder, JideBuilder). He is a founding member of the Griffon framework and the Hackergarten community event.

The Java language has passed its 20th anniversary, and with it comes an incredible range of tools and libraries to choose from; sometimes there are actually too many choices for the same task. This presentation covers those libraries that have risen to the top, having proved themselves to be worthy of a place in every developer’s toolbox, for both production and testing code. It also discusses some fairly new libraries that are bound to make a big impact in the ecosystem.

Process your big data in a blink using Spark

Dan Serban

Dan Serban is a data engineer who occasionally teaches advanced functional programming as well as data engineering (using Spark as the big data framework).

This 2-hour, intensely hands-on workshop introduces Apache Spark, the open-source cluster computing framework with in-memory processing that makes analytics applications up to 100 times faster compared to technologies in wide deployment today. Highly versatile in many environments, and with a strong foundation in functional programming, Spark is known for its ease of use in creating exploratory code that scales up to production-grade quality relatively quickly (REPL-driven development).

The plan is to start with a few publicly available datasets and gradually work our way through them until we extract some useful insights, gaining a deep understanding of Spark’s rich collections API in the process.

Time permitting, we are going to look at a very simple Spark Streaming example (stream of integers / moving average).
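
To give a feel for the moving-average idea before the workshop, here is a minimal sketch in plain Python. It does not use Spark Streaming at all, and it is not the workshop's actual example: the function name `moving_average` and the window size of 3 are assumptions made purely for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def moving_average(stream: Iterable[int], window: int = 3) -> Iterator[float]:
    """Yield the mean of the most recent `window` integers seen so far.

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque(maxlen=window)  # keeps at most `window` values
    running_sum = 0
    for value in stream:
        if len(recent) == window:
            # The oldest value is about to be evicted by append().
            running_sum -= recent[0]
        recent.append(value)
        running_sum += value
        # Until the window fills up, average over the values seen so far.
        yield running_sum / len(recent)


if __name__ == "__main__":
    for average in moving_average([1, 2, 3, 4, 5], window=3):
        print(average)
```

The generator consumes one integer at a time and keeps only the current window in memory, which mirrors how a streaming job sees its input. A Spark Streaming version would express the same windowing through Spark's own operators rather than a hand-rolled deque.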

During the workshop, participants are encouraged to exchange URLs and code snippets with one another via the issues section of this GitHub repository (https://github.com/dserban/SparkVoxxed).