September 22, 2021 at 6:56 PM
Pretty nice article by Eugene.
TLDR: become familiar with the data and the problem you’re trying to solve before calling model.fit().
https://eugeneyan.com/writing/first-rule-of-ml/

Working for a non-profit?
We offer a 50% discount on all our courses, be it public or in-company.
We’re already helping Médecins Sans Frontières (MSF) in the Netherlands and the US by training their staff under this arrangement.
Get in touch if you want to be onboarded to this program!
#godatadriven #training #nonprofit
The power of online training:
This morning I facilitated a session with people in Shanghai, Hong Kong, and Guangzhou. 🍜🥟🍲
This afternoon it’s time for Palo Alto!
California here I come 😎🏄🏻♂️
We had a very successful Optimizing #apachespark session today with our star Roman Ivanov, hosted by the fine folks of GoDataDriven | Part of Xebia.
So I created a poll, open for two weeks, where you can chime in. The winning option is going to be the next session!
Where did Roman learn to optimize Spark?
Story time and bonus points if you guess the company’s name based on the story!
While working there, one of us started using a Google library to interact with GCP. This library, unknown to us, had a bug that caused it to list the content of a bucket recursively.
Listing the content of a bucket is in principle not a big deal because the cost is minimal, but at the scale of this company the cost quickly added up.
How much?
Well, after pushing the code with this library to production, we went to get lunch.
After coming back (a good hour later, as this company, hint, has terrific lunch facilities) the API calls looked completely off the charts: we had racked up $50,000 in charges because of this bug during LUNCH (remember: implement monitoring! Had we not had it, finding out at the end of the month would have meant MILLIONS of $ in charges).
We quickly rolled back the changes, looked into why that was happening, fixed the issue, filed a PR to the Google library, and called it just a normal day. (Kudos to Google, BTW, for compensating us for the $50,000.)
So what company are we talking about here?
And, more importantly, what kind of scale do they have if a tiny bug listing the content of a bucket costs you $50,000 during lunch?
Well, this is the company Roman learned from. So you’d be a fool not to be at his event (and it’s free!)
Are you from an underrepresented or marginalized #community and you want to learn #dbt, THE tool that every #analytics #engineer should master?
We have good news: we have dates (the 25th and 26th of October) and a website to apply for the scholarship: https://gdd.li/scholarship
You’ll learn dbt with Lucy Sheppard from GoDataDriven, Part of Xebia and somebody from the product team of dbt Labs.
Are you not from a marginalized or underrepresented community? Even better news for you: if you get your tickets through https://gdd.li/tickets-dbt we will donate $300 for EACH ticket to sponsor the scholarship.
What are you waiting for?!
Lenovo sent me an email at the end of July.
“Back to school promotions! Order your laptop now to be ready for the new year.”
So I ordered a laptop for my daughter.
Fool.
Today Lenovo sent me an email, saying the laptop will arrive in DECEMBER, 3 months after the schools have started.
Whaaaat?
Best part: want to cancel your order? There’s no way to log in to the portal to do so!
#epicfail

This month marks 5 years since I wrote a blog post about production-ready data science, and I think it has aged really well.
https://godatadriven.com/blog/production-ready-data-science/

A participant in our Analytics Translator training just said to me
“Time flies when you’re having fun!”
♥️
#analyticstranslator GoDataDriven, Part of Xebia #training
Until 2018 (notice the old logo in the screenshot) GoDataDriven used to help refugees get up to speed with programming in Python and data analysis — for free.
It was a great way to help less fortunate people. We used to do so through https://restart.network which — unfortunately — closed their doors in early 2019.
Our drive behind the effort is not gone, though, and I would still love for GoDataDriven, proudly part of Xebia, to help refugees and minorities by offering free Python / data analysis / data science classes. Our then-colleague Rodrigo Agundez explained on our blog why he taught three different classes: https://godatadriven.com/blog/python-masterclass-with-restart-network/ (Robert Rodger also taught a couple of times).
It would be great if people in my network were to help me get in contact with similar organizations again.
If you don’t know any, please like or share for reach!
#datascience #python #machinelearning #programming #help
