This was a great year for machine learning. The amount of content produced about data is overwhelming. To try to keep up, I follow people on Twitter, participate in ML groups on Telegram, and subscribe to newsletters, podcasts and so on. As a result I open a lot of links every day, and some of them are well worth going back to.

Because of those many sources, I don’t have a centralized way of knowing what I read, so I decided to take a data mining approach: I gathered the links from my tweets, articles sent to Kindle, YouTube history, newsletter emails and to-do lists, and dumped them here.
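If you want to try something similar, here is a minimal sketch of the idea in Python. It is not the script I actually used: it assumes each source has already been exported to plain text, and the names `collect_links`, `normalize` and `URL_PATTERN` are made up for this example. It pulls out anything that looks like a URL, normalizes it lightly, and remembers which sources mentioned it.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Rough URL matcher (an assumption for this sketch, not a full URL grammar).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def normalize(url):
    """Lowercase the host, drop the fragment and any trailing slash or punctuation."""
    parts = urlsplit(url.rstrip(".,;:!?"))
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def collect_links(sources):
    """Map each normalized URL to the sorted list of sources it appeared in.

    `sources` is a dict of {source name: raw exported text}.
    """
    seen = {}
    for name, text in sources.items():
        for match in URL_PATTERN.findall(text):
            seen.setdefault(normalize(match), set()).add(name)
    return {url: sorted(names) for url, names in seen.items()}


if __name__ == "__main__":
    # Hypothetical exports; real ones would be read from files.
    exports = {
        "tweets": "Worth reading: https://Example.com/post/ today.",
        "kindle": "Saved https://example.com/post and http://foo.org/a#notes",
    }
    for url, names in collect_links(exports).items():
        print(url, names)
```

Links that show up in more than one source are probably the ones most worth keeping, which is why the sketch tracks where each one came from instead of just building a flat set.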

Most of the links are content created in 2018; some are much older, but I only came across them this year. I did this for myself, as I will go back to those links from time to time, but I hope it can be useful to you as well!

About Bias

Towards fairness in ML with adversarial networks
Bias and Fairness in Machine Learning
Selection bias: The elephant in the room
Compliance bias in mobile experiments

About Interpretability

FairML: Auditing Black-Box Predictive Models
A Brief History of Machine Learning Models Explainability
How (and how not) to fix AI

Learning Material

If Correlation Doesn’t Imply Causation, Then What Does? (the whole series is amazing)
Determining The Optimal Number Of Clusters: 3 Must Know Methods
Understanding Activation Functions in Deep Learning
Machine Learning for Recommender systems
Object Detection with 10 lines of code
Fizz Buzz in Tensorflow
Rules of Machine Learning: Best Practices for ML Engineering
Introducing the Facebook Field Guide to Machine Learning video series
Machine Learning Done Wrong
Building a Language and Compiler for Machine Learning

Data Visualization

The 45 Best — And Weirdest — Charts We Made In 2018
How to visualize decision trees

Novel Approaches

Ad-versarial: Defeating Perceptual Ad-Blocking
Privacy Preserving Deep Learning with PyTorch & PySyft
Large Scale GAN Training For High Fidelity Natural Image Synthesis
You Won’t Believe How We Optimize our Headlines
Multitask NLP Model
Can agents learn inside of their own dreams?
Turning Design Mockups Into Code With Deep Learning
Fake News Detection on Social Media: A Data Mining Perspective
The Current Best of Universal Word Embeddings and Sentence Embeddings
Neural Arithmetic Logic Units (NALU) — A new beginning?
AdamW and Super-convergence is now the fastest way to train neural nets
Learning explanatory rules from noisy data
The Case for a Synthetic Approach to Narrow AI
On-page SEO for NLP
Detect pressed keys via microphone audio capture in real-time

Math and Statistics

Kill Math
Analyzing Experiment Outcomes: Beyond Average Treatment Effects
math-as-code
Learning Math for Machine Learning
How to Read Mathematics
A Mathematician’s Lament
Defining Churn Rate
Common Probability Distributions: The Data Scientist’s Crib Sheet

Tools and Products

Snorkel: A System for Fast Training Data Creation
Dedupe Python Library
Papers with Code
TensorFlow Advent
TensorFlow Ecosystem

Some Criticism

Troubling Trends in Machine Learning Scholarship
Deep Learning: A Critical Appraisal
The deepest problem with deep learning
Out of shape? Why deep learning works differently than we thought

Fun & Amazing

Is It A Bug Or Is It A Story?
BIG DATA — 3.0 — “L1ZY”
This robot uses AI to find Waldo, thereby ruining Where’s Waldo
Why so many poor kids who get into college don’t end up enrolling
I used TensorFlow.js to make Amazon Echo respond to sign language
DolphinAttack: Inaudible Voice Command
Convolutional Network Demo from 1993
When AI is the Product: The Rise of AI-Based Consumer Apps

Comments

If you’d like to add a comment, please send a merge request adding your comment here, copying this block as an example:

@yourusername optional-link.com

Rogério Chaves

Machine Learning Link Dump 2018