My journey through neural networks has finally brought me to a point where I feel equipped with enough foundational knowledge to delve into the attention mechanism and transformer models in deep learning. I can’t help but start this post to record what I’ve learned so far, even as I continue...
Now, I’ve set off on a new learning journey into Large Language Models (LLMs). As always, I prefer to start from the fundamentals and gradually build up my understanding. Recently, I came across the Stanford course — CS224N: NLP with Deep Learning — available on YouTube. After going through its...
The Linear Map Lemma is a fundamental result in linear algebra that essentially says a linear map is completely and uniquely determined by what it does to a set of basis vectors. This lemma is crucial because it allows us to understand and work with linear maps in a more manageable...
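For reference, here is the statement as I understand it (my paraphrase in Axler-style notation; the exact wording and numbering in Linear Algebra Done Right may differ):

```latex
% Linear Map Lemma (paraphrase; notation assumed).
% Suppose $v_1, \dots, v_n$ is a basis of $V$ and $w_1, \dots, w_n \in W$. Then
\[
  \exists!\, T \in \mathcal{L}(V, W) \quad \text{such that} \quad
  T v_k = w_k \quad \text{for } k = 1, \dots, n .
\]
% Once the values on the basis are fixed, linearity pins down $T$ on all of $V$:
\[
  T(c_1 v_1 + \cdots + c_n v_n) = c_1 w_1 + \cdots + c_n w_n .
\]
```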
It’s been a while since my last post on neural networks, where I explored some mathematical details out of personal interest. My original plan was to move on to attention mechanisms and transformers. But the book Linear Algebra Done Right by Sheldon Axler, which had been sitting on my desk...
So far, I am quite happy with the progress of my neural network learning journey. I have covered the basics of neural networks and convolutional neural networks, and have also gotten my hands on coding a vanilla neural network and a convolutional neural network from “scratch”. Now, it is time to dive into...
Last time, I summarized what I had learned about the basic concepts of Convolutional Neural Networks (CNNs) and a basic neural network implementation from scratch. In this post, I will continue sharing my experience and notes on coding a CNN from “scratch”, which is still part of my nn-learn project.
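To make this a little more concrete: the core operation inside a CNN layer is just a sliding dot product. The snippet below is not taken from nn-learn; it is only a minimal sketch of a single-channel, stride-1, “valid” 2D convolution (strictly speaking, cross-correlation, as most deep learning code implements it) using NumPy.

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Naive single-channel 2D cross-correlation ('valid' padding, stride 1).

    Minimal illustration only -- not the actual nn-learn implementation.
    """
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Dot product between the kernel and the current image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Quick check: a 3x3 vertical-edge-style kernel on a 5x5 image -> 3x3 output.
img = np.arange(25, dtype=float).reshape(5, 5)
k = np.array([[1.0, 0.0, -1.0]] * 3)
print(conv2d_valid(img, k))
```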
I’ve set out on a journey to learn and understand AI — with the ultimate goal of grasping the essence of large language models (LLMs) and exploring the frontier of research in this field freely. This pursuit is driven by a deeper curiosity: a desire to understand the origins of...
After completing my initial exploration and study of neural networks — and documenting it in Neural Network Notes: the Basics and Backpropagation — I am now moving on to the next fascinating topic: Convolutional Neural Networks (CNNs). CNNs are a type of deep learning architecture specifically designed to process data...
As an individual, I feel incredibly lucky to live in an era largely free from major national or global wars, while both fundamental sciences and engineering are advancing rapidly, reaching new peaks one after another at an unprecedented pace. Among these breakthroughs, Artificial Intelligence — particularly Large Language Models (LLMs)...
It’s been several years since I first encountered the Gimbal Lock problem, and back then, I didn’t quite grasp its underlying mechanics. However, after revisiting the issue more recently, I believe I’ve finally cracked the parts that had confused me for so long. In this blog, I want to share...
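As a quick illustration of the kind of effect the post digs into (using one common convention, not necessarily the one discussed there): with intrinsic ZYX Euler angles, setting the pitch to 90° makes the roll and yaw rotations act about the same axis, so only their difference matters and one degree of freedom is lost. The hypothetical snippet below demonstrates this numerically with NumPy.

```python
import numpy as np

def rot_x(a):  # roll
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):  # pitch
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):  # yaw
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def euler_zyx(yaw, pitch, roll):
    # Intrinsic ZYX (yaw-pitch-roll) composition, one common convention.
    return rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)

# At pitch = 90 degrees, different (yaw, roll) pairs with the same
# difference roll - yaw produce the *same* rotation matrix:
R1 = euler_zyx(np.radians(10), np.radians(90), np.radians(40))
R2 = euler_zyx(np.radians(25), np.radians(90), np.radians(55))
print(np.allclose(R1, R2))  # True -> one rotational degree of freedom is lost
```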