Build Your Own Git With TypeScript
Let’s build some cool git features
In today’s experiment, I want to show you how to build a few Git features with TypeScript. There are certainly better ways to do it, but this is how I see and solve the problem. It’s a simplified view of a complex tool, and I hope you enjoy it.
Before we start, you’ll need some previous experience with a few tools and technologies if you want to follow along and code with me. Here’s the list:
- VS Code
- Node
- TypeScript
- Git basics
Here’s a link where you can check the repository with the final code:
What we’re going to build
In this article, we’re going to build 6 basic features. For a better explanation than the one I’m going to give you here, please check the official documentation; you can start here.
- Repository
- Commit
- Commit chaining
- Branching
- Stage a file
- User input and commands
Configuring project
Let’s start by configuring our project with some dependencies.
Create a babel.config.js file.
After adding these dependencies, make sure to test jest with a smoke test, for example:
With everything settled, let’s start to build our things =)
Repository
A repository is the first thing you create when you want to use Git in your project, so we’re going to start with it. Since the repository wraps all the features we’re going to implement, we’ll call it Git in our project. Let’s start by creating an interface called GitI.
In this interface, we’re going to declare what our Git class will look like.
Now, let’s implement this interface in our Repository (Git) class.
For now, our repository is only responsible for holding its own name.
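As a minimal sketch of this step: the names GitI and Git come from the text above, while the exact shape (a single name property set through the constructor) is an assumption about what the original code looked like.

```typescript
// Sketch only: a repository that, for now, just knows its own name.
interface GitI {
  name: string;
}

class Git implements GitI {
  name: string;

  constructor(name: string) {
    this.name = name;
  }
}

const repo = new Git("my-repo");
console.log(repo.name); // my-repo
```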
After creating the class, we can easily test it now with jest:
This class corresponds to the command:
git init
git init initializes a repository: it creates all the files that encapsulate everything and allows us to use other commands, such as git log.
Commit
In simple words, a commit is a snapshot of the content of the files you staged. Let’s start by creating our Commit interface.
To generate our id, we’re going to use the sha-1 library to produce something close to a Git hash. For now, let’s use the commit message as the content that generates the id.
We’re also not going to set the author of the commit yet. You can do it if you want to, but I won’t focus on tracking commits by user.
Now, let’s test our Commit class to verify that the id is generated with the proper hash. Keep in mind that for a given string the output is always the same, so the string message will always produce: 6f9b9af3cd6e8b8a73c2cdced37fe9f59226e27d
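Here is a rough sketch of a Commit whose id is derived from its message. It assumes Node’s built-in crypto module in place of the sha-1 library, to keep the example self-contained; the small sha1 helper and the CommitI shape are illustrative, not necessarily the original code.

```typescript
import { createHash } from "node:crypto";

// Assumption: Node's built-in crypto stands in for the sha-1 library.
const sha1 = (content: string): string =>
  createHash("sha1").update(content, "utf8").digest("hex");

interface CommitI {
  id: string;
  message: string;
}

class Commit implements CommitI {
  id: string;
  message: string;

  constructor(message: string) {
    this.message = message;
    // For now the id is derived from the message alone.
    this.id = sha1(message);
  }
}

const first = new Commit("message");
const second = new Commit("message");
console.log(first.id === second.id); // true: same input, same hash
console.log(first.id.length); // 40 (hex characters)
```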
A simplistic way to see how Git knows that a file has changed, and can therefore be staged, is to compare hashes. Let’s see an example:
// index.ts
export * from "./git"
export * from "./commit"
We can think of this file as nothing more than a formatted string, and that string produces a specific hash. If we pass the content of the file to the sha1 function, it returns this hash:
const sha = sha1(fs.readFileSync('./index.ts', {encoding: 'utf-8'}))
console.log(sha)
// sha = 0280670383ddc7cd6640f60f6e4a10eb1799f807
Now, if we modify the file and add a new line, we expect the function to generate a different hash:
// index.ts
export * from "./git"
export * from "./commit"
export * from './branch'
const sha = sha1(fs.readFileSync('./index.ts', {encoding: 'utf-8'}))
console.log(sha)
// sha = 84d89b5dbc2a304b1564d1761a68e8938ee7ea07
Notice that the hash has changed. In simplified terms, this is how Git knows that a file has changed.
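The same idea can be sketched without touching the file system by hashing two strings. The hasChanged helper is hypothetical, and Node’s crypto module is again assumed in place of the sha-1 library.

```typescript
import { createHash } from "node:crypto";

const sha1 = (content: string): string =>
  createHash("sha1").update(content, "utf8").digest("hex");

// Hypothetical helper: content counts as changed when its hash differs
// from the hash that was recorded earlier.
const hasChanged = (previousHash: string, currentContent: string): boolean =>
  sha1(currentContent) !== previousHash;

const before = 'export * from "./git"\nexport * from "./commit"\n';
const after = before + "export * from './branch'\n";

const recorded = sha1(before);
console.log(hasChanged(recorded, before)); // false
console.log(hasChanged(recorded, after)); // true
```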
Commit history
After creating our Commit class, we can now relate commits to each other with a commit history.
The history is a relation between older and newer commits, where new commits are added to the head of the list and older commits sit closer to the tail.
Let’s start by adding some new functionalities to the Commit interface:
Notice that we’ve added two new things to the Commit interface: the parent property and the getCommitLog function.
The parent property is how we’re going to relate commits to each other. For our purposes, the commit history of a branch can be modeled as a linked list.
A linked list is a data structure made of a sequence of elements in memory, where each element is linked either to both of its neighbors (doubly linked) or only to the next element (singly linked).
The Head is where the linked list starts, and the last item of the list points to null.
Our implementation will be basically the same, but when appending new commits to our list we’re going to set the new commit as the Head of the list.
Before we create our getCommitLog function, let’s modify our constructor a bit.
Let’s start by creating the parent commit property and initializing it in our constructor. The class or file that creates an instance of the Commit class is responsible for telling it which commit this instance is related to.
Now we can build our commit history by creating the getCommitLog function in the Commit class.
This function returns an array of ids (our hashes). You can customize it as you like. I was satisfied with an array of hashes, but if you want to return the whole class or a custom object, please do, and share with me how you customized the function.
Now let’s go through what is happening line by line, starting with commitAux. This is an auxiliary variable of type CommitI or null, so it can be the Head, the Tail, or any element of our list. At first, it receives our own class instance as its value. The other variable is history, where we keep the ids of our list.
Next, look at the while loop, which uses commitAux as its condition. This means it keeps looping while commitAux is not null. Inside the loop, we add commitAux.id to history and then set commitAux to the parent of the current commit. At some point commitAux will be null, the loop ends, and the function returns our array.
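The walk described above can be sketched like this. The names commitAux, history, parent, and getCommitLog follow the text; passing the id straight into the constructor is an assumption made only to keep the sketch free of the hashing dependency.

```typescript
interface CommitI {
  id: string;
  message: string;
  parent: CommitI | null;
  getCommitLog(): string[];
}

class Commit implements CommitI {
  id: string;
  message: string;
  parent: CommitI | null;

  // Assumption: the id is supplied by the caller instead of being hashed.
  constructor(id: string, message: string, parent: CommitI | null = null) {
    this.id = id;
    this.message = message;
    this.parent = parent;
  }

  getCommitLog(): string[] {
    let commitAux: CommitI | null = this; // start at this commit (the Head)
    const history: string[] = [];

    while (commitAux) {
      history.push(commitAux.id);
      commitAux = commitAux.parent; // step toward the Tail
    }
    return history;
  }
}

const root = new Commit("a1", "first commit");
const middle = new Commit("b2", "second commit", root);
const head = new Commit("c3", "third commit", middle);
console.log(head.getCommitLog()); // [ 'c3', 'b2', 'a1' ]
```

The newest commit comes first because the walk starts at the Head and follows parent links until it reaches null.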
I’ll leave testing this part of the code until we implement the Branch class, so let’s continue with our experiment.
Branch
Branches are how we can separate code in Git.
Let me give you an example. You’re on a team with 2 other developers working on the same code base, which is hosted in the cloud. You have a feature ready to go to production, so you need to generate a build with that feature and send it to the code base in the cloud. Meanwhile, your colleagues are developing a test feature that should not go to production, but they also need to send it to the cloud. Branches solve this easily: a branch is a copy of the code base where you can make whatever changes you want without messing up the original, avoiding conflicts and separating responsibilities across specific branches. In this example, we could create two branches: one that only accepts code that is ready for production, and another that is a test (or develop) branch.
At the end of the day, a branch points to one modification. We save modifications in the shape of commits, which means that a branch points to a commit, which points to another commit, and so on.
In the image below, consider that the code that can be delivered to production is in master, and that the testing branch is where things can be tried out freely. Both branches point to the same commit.
Now consider that your colleagues pushed their changes to the testing branch. This is what it will look like:
The changes your colleagues make will not affect master, because they are on a different branch of this tree.
Now, if you push your changes to master, this is what it will look like:
The changes do not affect each other, but both point back to the same starting point.
Now we just need to replicate this with TypeScript.
Let’s start by creating our Branch interface. We’ll begin by adding a name and a commit as properties of a branch.
Implementing this interface will look like this:
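A minimal sketch of that interface and class could look like the following. BranchI is an assumed name (by analogy with GitI and CommitI), and CommitI is reduced to the two fields this example needs.

```typescript
// Reduced commit shape, just enough for this sketch.
interface CommitI {
  id: string;
  parent: CommitI | null;
}

// Assumed interface name, following the GitI / CommitI pattern.
interface BranchI {
  name: string;
  commit: CommitI | null;
}

class Branch implements BranchI {
  name: string;
  commit: CommitI | null;

  constructor(name: string, commit: CommitI | null = null) {
    this.name = name;
    this.commit = commit;
  }
}

const initial: CommitI = { id: "a1", parent: null };
const master = new Branch("master", initial);
const testing = new Branch("testing", master.commit);
console.log(master.commit === testing.commit); // true: both point to the same commit
```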
Simple as that. Now let’s test our Branch class:
That’s cool, but it isn’t doing much yet: we only have a class that returns a name and a commit.
Let’s make some modifications and add some real functionality to our Git interface:
Adding those features to our Git class:
Wow, there’s a lot happening in the Git class now. Let’s start with the new properties and leave the checkout function for last.
The branch property is the current branch of the repository, that is, the code the repository is looking at right now.
The branches property is private, which means it is only accessible inside the class and is not exposed to the outside world. It’s an array where we store the branches we create.
In the constructor, we initialize branches as an empty array. Then we create our first branch, called “main”, add it to the branches array, and set it as our current branch.
The private method add pushes the new branch to our array.
Finally, we get to the checkout method, which receives as a parameter the name of the branch you want to check out.
If the branch exists in our repository, we switch to it:
// checkout
const branchIdx = this.branches?.findIndex((branch) => branch.name === name);
/* If branch already exists changes to existing branch */
if(branchIdx !== undefined && branchIdx !== -1 && this.branches?.length) {
  this.branch = this.branches[branchIdx];
  console.info(`Switched to branch: ${this.branch.name}`)
  return this.branch;
}
If it doesn’t exist, we’re going to create and switch:
// checkout
/* Create if does not exists */
this.branch = new Branch(name, this.branch?.commit);
this.add(this.branch)
console.info(`Created and Switched to: ${name}`)
return this.branch;
If no name is passed it returns the current branch:
// checkout
if(!name){
  console.info(`Current branch: ${this.branch.name}`)
  return this.branch;
}
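The three fragments above can be assembled into a single class roughly as follows. This is a sketch under a few assumptions: the no-name check runs first, find replaces findIndex for brevity, the console.info calls are omitted, and Branch and CommitI are reduced to the minimum needed here.

```typescript
interface CommitI {
  id: string;
  parent: CommitI | null;
}

class Branch {
  constructor(public name: string, public commit: CommitI | null = null) {}
}

class Git {
  name: string;
  branch: Branch; // current branch
  private branches: Branch[] = [];

  constructor(name: string) {
    this.name = name;
    this.branch = new Branch("main");
    this.add(this.branch);
  }

  private add(branch: Branch): void {
    this.branches.push(branch);
  }

  checkout(name?: string): Branch {
    // No name: just report the current branch.
    if (!name) return this.branch;

    // Existing branch: switch to it.
    const existing = this.branches.find((branch) => branch.name === name);
    if (existing) {
      this.branch = existing;
      return existing;
    }

    // Unknown branch: create it from the current commit and switch.
    this.branch = new Branch(name, this.branch.commit);
    this.add(this.branch);
    return this.branch;
  }
}

const repo = new Git("my-repo");
repo.branch.commit = { id: "a1", parent: null };
const testing = repo.checkout("testing");
console.log(testing.commit?.id); // a1
console.log(repo.checkout("main").name); // main
console.log(repo.checkout().name); // main
```

A new branch starts at whatever commit the current branch points to, which matches the picture of two branches sharing the same starting point.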
Now, after implementation, we can finally test our new code and the commit history.
Let’s create these new test cases:
Stage a file
Staging a file is nothing more than letting Git know that a file, or a group of files, is ready to be committed.
This action is represented by the command
git add file.txt
Each file in your repository is in one of two states: tracked or untracked.
Tracked files are files that were in the last snapshot (commit), as well as any newly staged files. In short, tracked files are files that Git knows about.
Untracked files are everything else: any file in your working directory that was not in the last snapshot and is not in your staging area.
Please check the doc for further information
For now, we’re going to implement the action of staging a tracked file that was modified.
Let’s create this action and try to represent it with TypeScript.
Start by creating the Add interface.
This interface has a single method called stageFile, which receives a file path as a parameter.
After this, we can implement our interface in the Add class.
For now, we’ll only consider staging one file at a time.
Let’s understand what is happening in the Add class.
First of all, we have a private property called dbPath. This is the path of the DB (a simple .txt file) where we store the files that are staged and ready to be committed. We initialize this property in our constructor with either a custom path or our default path, store.txt.
In the stageFile method, we check whether the given path is valid. If it is, we write the path to the DB file and return true; if not, we return false.
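Here is one way the Add class described above might look. Treating “valid path” as “an existing regular file” is an assumption, as is the AddI interface name; the usage writes to a temporary directory so nothing in your project is touched.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed interface name, following the GitI / CommitI pattern.
interface AddI {
  stageFile(filePath: string): boolean;
}

class Add implements AddI {
  private dbPath: string;

  constructor(dbPath: string = "store.txt") {
    this.dbPath = dbPath;
  }

  stageFile(filePath: string): boolean {
    // Assumption: only an existing regular file counts as a valid path.
    if (!fs.existsSync(filePath) || !fs.statSync(filePath).isFile()) {
      return false;
    }
    // One file at a time: the store holds only the latest staged path.
    fs.writeFileSync(this.dbPath, filePath, { encoding: "utf-8" });
    return true;
  }
}

// Usage in a throwaway temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "qti-"));
const file = path.join(dir, "file.txt");
const store = path.join(dir, "store.txt");
fs.writeFileSync(file, "hello");

const add = new Add(store);
const staged = add.stageFile(file);
const rejected = add.stageFile(path.join(dir, "missing.txt"));
console.log(staged); // true
console.log(rejected); // false
console.log(fs.readFileSync(store, "utf-8") === file); // true
```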
Now we can test the stageFile function.
Now we must expose this method in our Git class and interface.
Let’s start by declaring the stageFile function in the Git interface:
After adding this declaration we can implement this function in our Git class:
The function is only a call to the Add class.
File content in Commit
After we stage our file in the store.txt file, we need to convert the content at that path to a hash.
Let’s modify a few things in the Commit class:
Notice that the hash is now generated from the content of the file whose path is stored in store.txt. The store is read by the private function getStore. After reading the file and generating the hash, we clear the store with the clearStore function.
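The modified Commit could be sketched as below. getStore and clearStore are the names used in the text; the dbPath constructor parameter and the use of Node’s crypto module instead of the sha-1 library are assumptions made so the example runs on its own.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

class Commit {
  id: string;
  message: string;
  private dbPath: string;

  // Assumption: the store path is injectable, defaulting to store.txt.
  constructor(message: string, dbPath: string = "store.txt") {
    this.message = message;
    this.dbPath = dbPath;

    // Hash the content of the staged file rather than the message.
    const content = fs.readFileSync(this.getStore(), "utf-8");
    this.id = createHash("sha1").update(content, "utf8").digest("hex");
    this.clearStore();
  }

  // Reads the staged file path from the store.
  private getStore(): string {
    return fs.readFileSync(this.dbPath, "utf-8").trim();
  }

  // Empties the store once the commit has been created.
  private clearStore(): void {
    fs.writeFileSync(this.dbPath, "");
  }
}

// Usage in a throwaway temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "qti-"));
const file = path.join(dir, "index.ts");
const store = path.join(dir, "store.txt");

fs.writeFileSync(file, 'export * from "./git"\n');
fs.writeFileSync(store, file); // stage the file by hand
const first = new Commit("add git export", store);

fs.writeFileSync(file, 'export * from "./git"\nexport * from "./commit"\n');
fs.writeFileSync(store, file);
const second = new Commit("add commit export", store);

console.log(first.id !== second.id); // true: different content, different id
console.log(fs.readFileSync(store, "utf-8") === ""); // true: store was cleared
```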
User input and commands
So far, we’ve built a few Git features: Repository, Branch, Commit, and Add.
Up to now, we’ve only exercised these features with jest. That’s good, but not very functional, so let’s create a more real-world experience by reading user input.
Start by creating the file where your program will start. I’m going to call it index.ts.
This file will be the entry point of the program, so it’s nice to give the user some information.
I’m going to use ts-node to run our application locally, so make sure to add the start script to your package.json:
Back in index.ts, create an instance of the readline module from Node:
After creating this instance, let’s create an instance of our Git class to access its features, and read the user input with the question function.
Here, we create a function called readCommand and make it recursive: after every answer from the user, the function is called again to ask for a new command.
Finally, add the on listener to detect when the program is over and say “Goodbye” to the user.
You may notice that I haven’t yet explained what the syntaxValidator function does, so let’s dig into it. I started by creating an array of valid commands, very similar to the Git ones, except for qti, which is an abbreviation of quati, a really cute animal.
After defining these valid values, I created the syntaxValidator function. There’s a lot going on inside it, with a lot of ifs and elses. Honestly, it’s not the cleanest code you’ll ever see, so if you have any ideas to make it better or more readable, please leave them in the comments below or open a PR in the repo. I’ll be more than glad to hear your opinion.
I’m not going too deep into those functions because they’re really simple: they just check whether the user input is valid, so you can customize them as you want.
The isCommandStartingProperly function checks that the user input starts with the “qti” command and has a valid command after “qti”.
The isCheckoutCommandValid function checks whether the user input includes a branch name and, depending on that, passes it or not to the git.checkout function.
The isBranchCommandValid function does almost the same as isCheckoutCommandValid, but it validates the branch -m command.
The isLogCommandValid function checks that the log command is valid. If there’s a commit in the history, it returns the history; if not, it tells the user to make a commit first.
The isCommitCommandValid function checks that the commit -m command is valid and passes the message to our commit.
The isAddCommandValid function checks that the add command is valid and sends the file path to the stageFile function.
The syntaxValidator function is the entry point.
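As an alternative to a chain of ifs and elses, the validation can be sketched as a single parser that returns either a parsed command or an error. Everything here (parseInput, VALID_COMMANDS, the Parsed type, and the exact rules per command) is hypothetical and not the code from the repository; it only mirrors the commands listed above.

```typescript
// Hypothetical list of sub-commands accepted after the "qti" prefix.
const VALID_COMMANDS = ["checkout", "branch", "log", "commit", "add"] as const;
type Command = (typeof VALID_COMMANDS)[number];

type Parsed =
  | { ok: true; command: Command; args: string[] }
  | { ok: false; error: string };

function parseInput(input: string): Parsed {
  // Split on whitespace, keeping "quoted strings" together, then drop the quotes.
  const tokens = (input.match(/"[^"]*"|\S+/g) ?? []).map((token) =>
    token.replace(/^"|"$/g, "")
  );
  const [prefix, command, ...args] = tokens;

  if (prefix !== "qti") {
    return { ok: false, error: 'Commands must start with "qti"' };
  }
  if (!(VALID_COMMANDS as readonly string[]).includes(command)) {
    return { ok: false, error: `Unknown command: ${command ?? ""}` };
  }
  if (command === "commit" && (args[0] !== "-m" || !args[1])) {
    return { ok: false, error: 'Usage: qti commit -m "message"' };
  }
  if (command === "add" && args.length !== 1) {
    return { ok: false, error: "Usage: qti add <file path>" };
  }
  return { ok: true, command: command as Command, args };
}

const commitResult = parseInput('qti commit -m "first commit"');
const wrongPrefix = parseInput("git log");
const missingPath = parseInput("qti add");
const bareCheckout = parseInput("qti checkout");

console.log(commitResult.ok); // true
console.log(wrongPrefix.ok); // false
console.log(missingPath.ok); // false
console.log(bareCheckout.ok); // true
```

Returning a value instead of calling Git directly keeps the parser easy to test; the caller can then switch on command and forward args to checkout, stageFile, and so on.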
After implementing the functions, you should see something close to this:
Now just test your commands and the functionality you’ve implemented.
Conclusion
That’s it for today’s article. We built some cool Git features with TypeScript, and I really hope you enjoyed it. If you have any suggestions or corrections, please let me know in the comments.
References and inspirations
- https://github.com/codecrafters-io/build-your-own-x
- https://kushagra.dev/blog/build-git-learn-git/
- https://www.youtube.com/watch?v=_7nISfpofec
- https://git-scm.com/