Not only do they have a great logo: they also have great speakers!
This time, after the usual over-the-top 🍕 (usage of the emoji is not an endorsement of its graphics), we’ll get to listen to “Workload orchestration with SageMaker Pipelines” by Simon Stiebellehner.
Simon works for Transaction Monitoring Netherlands (TMNL).
TMNL was established by the largest Dutch banks in the collective fight against money laundering and the financing of terrorism.
A single bank has heaps of transaction data — but usually a limited view of the Dutch market as there’s no monopoly.
Putting together transactions from different entities gives a better overview of what’s going on.
Of course it comes with challenges as well, so be sure to check out Simon’s talk!
The conference season has barely started and already #godatadriven has made an entrance!
We organized a whole track and two tutorials (including the most popular tutorial of the WHOLE conference) at the Applied Machine Learning Conference in Lausanne 🇨🇭
We had 4 speakers at PyData Berlin: Daniël Willemsen, 👋Jordi Smit, Marysia Winkels, and James Hayward 🇩🇪
We had Lucy Sheppard talk at PyCon US 🇺🇸
Juan Manuel Perafan presented at the Big Data Technology Warsaw conference 🇵🇱
We now have Rogier van der Geer who got his talk accepted at EuroPython in Dublin 🇮🇪
And Marysia Winkels whose talk about data-centric AI got accepted at PyData London 🇬🇧
And we are hosting in-person meetups again, with the best pizza of northern Europe!
Coming out of my PhD, I had no idea what it meant to work in industry.
I found out the hard way when I joined KPMG, but I would have cherished the chance to ask future colleagues what to expect, where to invest my time before starting, and more.
This is your chance to avoid making many mistakes.
And to get to know the company with the best coffee in Amsterdam.
Algorithms can have serious consequences for the lives of the people around you!
I’ve already posted about the scandal the Dutch tax office was involved in.
They used second nationality as a feature in their model to find possibly fraudulent behavior in their allowances scheme.
There were two problems with their approach:
First of all, it was unlawful in the Netherlands. This was the biggest issue, algorithm or no algorithm.
The second was that the algorithm didn’t say why it flagged an individual.
Is this problematic?
Yes, it is! If you don’t know why someone is flagged, you will look into everything, trying to find something that is wrong. And sometimes that something is a technicality such as forgetting to sign a form, a far cry from committing fraud!
So how do you do it right?
A couple of years ago, I was called by a bank that had a very high-performing machine learning model (an isolation forest) to flag suspicious correspondent banking transactions.
The problem is that isolation forests are not very explainable: you don’t know why they flag something.
However, the bank found it unacceptable for the model to just report a transaction to an analyst without saying why.
The analyst would have engaged in the same behavior the Dutch tax office engaged in: looking for anything that was not 100% kosher. Of course, if you’re not 100% within the lines, it doesn’t mean you’re committing fraud. It can be as silly as forgetting to sign a form.
What I did back then was to develop a geometric model that would explain why the isolation forest model was flagging transactions.
Please do the same with models that can have nefarious effects. I don’t care if you’re wrong about my taste in fashion when I browse Amazon.
I very much care if my life gets destroyed though!
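To make “explaining a flag” a bit more concrete, here is a minimal sketch. It is not the geometric model I built for the bank, and it does not look inside the isolation forest at all. It simply takes a transaction that some model has already flagged and ranks its features by how far they sit from the bulk of the data, using a robust z-score (distance from the median, divided by the median absolute deviation). The feature names (`amount`, `hour`, `hops`) and the numbers are made up for illustration.

```python
from statistics import median


def robust_deviations(reference, flagged):
    """Per-feature robust z-scores of `flagged` against `reference`.

    reference: list of dicts (feature name -> number), the "normal" population
    flagged:   dict with the same keys, the transaction a model flagged
    """
    scores = {}
    for name, value in flagged.items():
        column = [row[name] for row in reference]
        center = median(column)
        # Median absolute deviation: a spread estimate that outliers barely move.
        mad = median([abs(x - center) for x in column])
        # Assumption: fall back to a scale of 1.0 when the column barely varies.
        scale = mad if mad > 0 else 1.0
        scores[name] = (value - center) / scale
    return scores


def explain(reference, flagged, top=2):
    """Return the `top` features that deviate most, largest first."""
    scores = robust_deviations(reference, flagged)
    ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top]


# Hypothetical transactions, for illustration only.
reference = [
    {"amount": 100, "hour": 10, "hops": 1},
    {"amount": 120, "hour": 11, "hops": 1},
    {"amount": 90, "hour": 9, "hops": 2},
    {"amount": 110, "hour": 14, "hops": 1},
    {"amount": 105, "hour": 13, "hops": 2},
]
flagged = {"amount": 5000, "hour": 12, "hops": 2}

for feature, score in explain(reference, flagged):
    print(f"{feature}: {score:+.1f} robust deviations from typical")
```

With this toy data the amount dominates the ranking, which gives an analyst a starting point: look at the amount first, not at whether every form was signed. A per-feature view like this misses unusual combinations of features, which is exactly what isolation forests are good at catching, so treat it as a first step rather than a full explanation.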