travel
Madagascar: humans
Some pictures taken in Madagascar. Use them as you please with only one restriction: if used in public (blog, presentation, news article...), cite me or this website/post.
Physicist and data engineer working with data processing tools like Spark and pandas. Contributor to @opendevstack. I love to travel, so some posts might be about it.
travel
Some pictures taken in Madagascar. Use them as you please with only one restriction: if used in public (blog, presentation, news article...), cite me or this website/post.
travel
Some pictures taken in Madagascar. Use them as you please with only one restriction: if used in public (blog, presentation, news article...), cite me or this website/post. They can't be used to train AI/ML algorithms.
kafka
In this fourth part we will see how to use Avro in Kafka, and we will try to do it without any extra help, the old way. This way we will experience the pain of not having a good integration ecosystem (e.g. a native library and the schema
kafka
The first big step in working with Kafka is putting data into a topic, and that is the purpose of this post. Since JSON is the most common way to intercommunicate, and carries the schema along with the data, we will explore how we build a producer in Scala to start
travel
Some pictures taken in Norway: Trondheim, Bodø and Oslo. Use them as you please with only one restriction: if used in public (blog, presentation, news article...), cite me or this website/post. They can't be used to train AI/ML algorithms.
Spark
Imagine we have a table with a kind of primary key where information is added or partially updated: not all the columns for a key are updated each time, but we now want a consolidated view of the information, with just one value of the key containing the
kafka
In this second part of the Kafka in MapR series we are going to de-duplicate identical messages using HBase. NOTE: I am using MapR, so not all configurations are the same as in its open-source counterparts (e.g. I don't need to add the ZooKeeper nodes). This
kafka
While Spark continues to thrive as the main big data processing framework for batch and streaming, alternatives emerging from the 1970s actor model and the Reactive Manifesto are gaining traction. Akka is widely known in the Scala community, and in March 2016 Confluent released its library Kafka Streams. One of
Spark
So today I was trying to use the handy function sha1() provided by Spark and I needed to concatenate all my columns into just one, since it did not support multiple columns. The solution seemed easy at first: use concat(). However, something odd was happening. It turns out I had
scala
Similar to the previous post Reading Sequences of Scala classes in Play JSON [https://8vi.cat/sequences-of-classes-in-play-json/], I found some issues when trying to parse and write Timestamp using the JSON module from Play Framework. The objective: I want to be able to write and read the Timestamp format. I have
scala
A wise man once said: "Try to improve 1% every day so you will be an expert without you noticing". Following this advice, in my personal projects I try to use new libraries, and this time it is the turn of JSON with Play Framework. The objective of
travel
Some pictures taken in New York, US. Use them as you please with only one restriction: if used in public (blog, presentation, news article...), cite me or this website/post. They can't be used to train AI/ML algorithms. Pollution everywhere. A Gothic cathedral in the heart of NY.