Who are we? 🕵️
We’re the research team at TruEra, where we help data scientists and ML practitioners build, debug, and monitor trustworthy ML models. In short, we think about trustworthy ML every day.
Research we're reading
Lately, we’ve been reading papers arguing that interpretable models are inherently better models. Put simply, making your model more interpretable can lead to a more performant and robust model, and making it more robust can make it more interpretable. How’s that possible?
Researchers at Carnegie Mellon University found that adversarially robust deep networks are more interpretable, in large part because robust models have smoother decision boundaries, which yield cleaner gradient-based attributions 💪. (Spoiler alert: TruEra cofounder Anupam Datta is a co-author of this paper.)
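For intuition, here’s a minimal sketch of the simplest gradient-based attribution, input-gradient saliency. The toy model and tensor shapes are our own illustration, not the paper’s setup:

```python
import torch
import torch.nn as nn

def input_gradient_saliency(model: nn.Module, x: torch.Tensor, target: int) -> torch.Tensor:
    """Simplest gradient-based attribution: |d(logit_target) / d(input)|."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    logits[0, target].backward()  # gradient of the target logit w.r.t. the input
    return x.grad.detach().abs()

# Toy usage: a small MLP stands in for a (robust or non-robust) classifier.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
x = torch.randn(1, 10)
print(input_gradient_saliency(model, x, target=0))
```

Roughly, the paper’s finding is that maps like this come out cleaner, with large entries lining up better with features a human would point to, when the network is adversarially robust.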
In the other direction, enforcing interpretability during training can also pay off in performance, as argued in a paper whose method eliminates features with low global importance during training to boost interpretability 📈. This sparked some interesting discussion during our research reading group this week...
Arri's take: eliminating features mid-training may leave the model stuck in local minima, and the paper's evidence that performance actually improves could be clearer.
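To ground the discussion, here’s a minimal sketch of the idea as we read it. This is not the paper’s actual algorithm: the gradient-times-input importance score, the single pruning step halfway through training, and the keep-top-5 cutoff are all our assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 20)
y = (X[:, 0] + X[:, 1] > 0).long()  # only the first two features truly matter

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
mask = torch.ones(20)  # 1 = feature kept, 0 = feature eliminated

for step in range(200):
    opt.zero_grad()
    inputs = (X * mask).requires_grad_(True)
    loss = loss_fn(model(inputs), y)
    loss.backward()
    opt.step()
    if step == 99:  # halfway through: estimate global importance, prune the rest
        importance = (inputs.grad * X).abs().mean(dim=0)  # gradient-times-input, batch mean
        mask = (importance >= importance.topk(5).values.min()).float()  # keep top 5

print("features kept:", mask.nonzero().flatten().tolist())
```

Whether a hard cut like this helps or hurts optimization is exactly what Arri’s local-minima worry is about.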
Interested in participating in our public reading group? Join here to hear our takes live and jump into the discussion!