Techno Blender
Digitally Yours.
Browsing Tag

Dar

11 Burning Questions We Have After The Marvels

In the comics, Binary was Carol Danvers’ superhero identity in the ’80s, when she was a close ally of the X-Men after her abilities as Ms. Marvel had been depowered; it was in fact her de facto identity for much of the ’80s and ’90s. Give or take, she has kept those cosmic abilities on top of her usual ones ever since she returned to the Ms., and eventually Captain, Marvel mantles. Which means on film at least, we have now seen more universes where Maria Rambeau became Captain Marvel

Jake Gyllenhaal, Dar Salim on doing the right thing

“Guy Ritchie’s The Covenant” stays with you long after the last frame. It’s supposed to. Its star, Jake Gyllenhaal, says it’s just that kind of movie. The Oscar nominee is convinced that the R-rated “Covenant,” opening April 21 in area theaters, succeeds as both edge-of-the-seat entertainment and a potent parable about the core values — integrity, valor and a dogged desire to do the right thing — that are the essence of the American ideal. “It’s about being the kind of American I want to be,” said Gyllenhaal during a…

Weight Decay is Useless Without Residual Connections | by Guy Dar | Feb, 2023

How do residual connections secretly fight overfitting?

The idea in broad strokes is fairly simple: we can render weight decay practically useless by making it arbitrarily small. A quick recap: weight decay is a regularization technique used to prevent neural networks from converging to solutions that do not generalize to unseen data (overfitting). If we train the neural network to only minimize the loss on the training data we might find a…
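For readers who want the recap of weight decay in concrete form, here is a minimal sketch, assuming the common formulation in which an L2 penalty is added to the training loss and optimized with plain SGD. The function name `sgd_step` and all numbers are hypothetical and chosen for illustration; none of this is taken from the article itself.

```python
def sgd_step(weights, grads, lr=0.1, weight_decay=0.0):
    """One plain SGD step on: training loss + (weight_decay / 2) * ||w||^2.

    `grads` holds the gradient of the training loss only. The penalty
    contributes an extra gradient of weight_decay * w, which pulls every
    weight toward zero. (Assumed formulation, for illustration.)
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


weights = [1.0, -2.0]
zero_grads = [0.0, 0.0]  # pretend the training loss is already flat here

# Without decay the weights stay put; with decay they shrink toward zero.
no_decay = sgd_step(weights, zero_grads, lr=0.1, weight_decay=0.0)
with_decay = sgd_step(weights, zero_grads, lr=0.1, weight_decay=0.5)
print(no_decay)
print(with_decay)
```

Even when the training loss has nothing left to say, the decay term keeps shrinking the weights, which is the pressure toward simpler solutions that the recap describes.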

Speaking Probes: Self-Interpreting Models? | by Guy Dar | Jan, 2023

Can language models aid in their interpretation?

In this post, I experiment with the idea that language models can be coaxed to explain vectors coming from their parameters. It turns out to work better than you might expect, but still much work needs to be done. As is customary in scientific papers, I use “we” instead of “I” (among other reasons, because it makes the text sound a bit less self-centered). This is not really a complete work, but more like a preliminary report on an idea…

Analyzing Transformers in Embedding Space — Explained | by Guy Dar

How and why to look through the embedding space prism

In this post, I present the paper “Analyzing Transformers in Embedding Space” (2022) by Guy Dar, Mor Geva, Ankit Gupta, and Jonathan Berant. Guy Dar is me :) In this paper, we propose a new method to interpret Transformers by making their parameters more interpretable. We show that some Transformer weights can be “persuaded” to explain what they mean. We use a simple and very efficient technique to translate the model’s weights into tokens.…
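As a rough illustration of what "translating weights into tokens" could look like, here is a toy sketch that scores a weight vector against every row of an embedding matrix and reads off the best-matching tokens. This is an assumption about the general idea, not the paper's actual procedure: the vocabulary, the 2-dimensional embeddings, and the helper `top_tokens` are all made up for the example.

```python
def top_tokens(weight_vector, embeddings, vocab, k=2):
    """Return the k vocabulary tokens whose embeddings have the largest
    dot product with `weight_vector`. (Hypothetical helper, illustration only.)"""
    scores = [
        sum(w * e for w, e in zip(weight_vector, row))
        for row in embeddings
    ]
    # Sort token indices by score, highest first, and keep the top k.
    ranked = sorted(range(len(vocab)), key=lambda i: scores[i], reverse=True)
    return [vocab[i] for i in ranked[:k]]


# Toy vocabulary with hand-picked 2-d embeddings: the first axis is
# "animal-like", the second is "vehicle-like".
vocab = ["cat", "dog", "car", "road"]
embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]

animal_direction = [1.0, 0.0]  # stand-in for a vector taken from the model's weights
print(top_tokens(animal_direction, embeddings, vocab, k=2))
```

In a real model the embedding matrix has tens of thousands of rows and the vector would come from a trained parameter matrix, but reading a vector through its nearest tokens works the same way.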