Flowman — A Declarative ETL Framework powered by Apache Spark. Don’t reinvent the wheel by writing more boilerplate code. Focus on critical business logic and delegate the tricky details to a clever… Jun 2, 2023
Published in Towards Data Science. Rethinking the Roles of Data Scientists, Engineers and Architects. What the wording tells about the roles — and why some companies should rethink their approach and expectations from data projects. Jan 20, 2021
Published in Towards Data Science. Using Permutation Tests to proof the Climate Change. A simple statistical test shows that average temperatures are very unlikely to increase due to “bad luck”. Dec 23, 2020
Published in Towards Data Science. Data Engineering at Scale. How to speed up building your Big Data ETL pipelines and getting them into production. Dec 17, 2020
Published in Towards Data Science. Investigating the Climate Change with Python and Spark, Part 3. Create your own Insights on Global Warming using publicly available Data. Dec 15, 2020
Published in Towards Data Science. Using Python and Spark to research the Climate Change, Part 2. Create your own Insights on Global Warming using publicly available Data. Dec 11, 2020
Published in Towards Data Science. Using Python and Spark to research the Climate Change, Part 1. Create your own Insights on Global Warming using publicly available Data. Dec 8, 2020
Published in Towards Data Science. Do I need Big Data? And if so, how much? Many companies follow the hype of big data without understanding the implications of the technology. Nov 16, 2020
Published in Towards Data Science. Spark vs Pandas, part 4 — Recommendations. Why neither Spark nor Pandas is better than the other. Or: Always choose the right tool for the right job. Nov 14, 2020
Published in Towards Data Science. Spark vs Pandas, part 3 — Scala vs Python. Why programming languages matter. Oct 26, 2020