aggregate - 63 orders of magnitude

Spark

Grid neighbour operations in Spark

In physics or biology you sometimes simulate processes on a two-dimensional lattice, or discrete space. In those cases you usually compute local interactions between "cells" and derive a result from them. One example is the Ising model, which was proposed in 1920 for

Spark

Finding latest non-null values in columns

Imagine we have a table with a kind of primary key where information is added or updated partially: not all the columns for a key are updated each time. We now want a consolidated view of the information, with just one entry per key containing the

Spark

Pivoting a table with Spark

A pivot table is an old friend of Business Intelligence: it lets you summarize the data of another table, usually by performing aggregations (sum, mean, max, ...) on grouped data. Pivot tables are found in multiple reporting tools (e.g. Qlik, Tableau and Excel), and analytical RDBMSs (e.g. Oracle) also implement