Build Your Own Git With TypeScript

Let’s build some cool git features

Maxsuel Silva
Better Programming



In today’s experiment, I want to show you how to build a few Git features with TypeScript. There are certainly better ways to do it, but this is how I see and solve the problem. It’s a simplified view of a complex tool, and I hope you enjoy it.

Before we start, you’ll need some prior experience with a few tools and technologies if you want to follow along and code with me. Here’s the list:

  • VS Code
  • Node
  • TypeScript
  • Git basics

Here’s a link where you can check the repository with the final code:

What we’re going to build

In this article, we’re going to build six basic features. For a better explanation than the one I give here, please check the official documentation; you can start here.

  • Repository
  • Commit
  • Commit chaining
  • Branching
  • Stage a file
  • User input and commands

Configuring project

Let’s start by configuring our project with some dependencies:

Create a babel.config.js file:

After adding these dependencies, make sure to test jest with a smoke test, for example:

With everything settled, let’s start to build our things =)

Repository

A repository is the first thing you create when you want to use Git in a project, so we’ll start there. Since the repository wraps every feature we’re going to implement, we’ll call it Git in our project. Let’s start by creating an interface called GitI.

In this interface, we’re going to declare what our Git class will look like.

Now, let’s implement this interface in our Repository (Git) class.

For now, our repository is only responsible for holding its own name.
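As a minimal sketch of the step described above, the interface and class could look like this. The exact shape is an assumption based on the text: the only thing the article states is that the repository holds a name.

```typescript
// Hypothetical shape: the article only says the repository stores its name.
interface GitI {
  name: string;
}

class Git implements GitI {
  name: string;

  constructor(name: string) {
    this.name = name;
  }
}

// Usage: create a repository and read its name back.
const repo = new Git("my-repo");
console.log(repo.name);
```

Keeping the interface separate from the class makes it easy to grow both together as new features are added.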

After creating the class, we can easily test it now with jest:

This class corresponds to the command:

git init

git init initializes a repository: it creates all the files that encapsulate the repository’s data and lets us use other commands, such as git log.

Commit

In simple words, a commit is a snapshot of the content of the files you staged. Let’s start by creating our Commit interface.

To generate our id, we’re going to use the sha-1 library to produce something close to a Git hash. For now, let’s use the commit message as the content that generates the id.

At this stage we’re not going to set the author of the commit. You can do it if you want to, but I won’t focus on tracking commits by user.

Now, let’s test our Commit class to verify that the id is generated with the proper hash. Keep in mind that for a given string the output will always be the same, so the string message will always produce: 6f9b9af3cd6e8b8a73c2cdced37fe9f59226e27d
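Here is a minimal sketch of a Commit whose id is the SHA-1 of its message. As an assumption, it uses Node’s built-in crypto module in place of the sha-1 library mentioned above, and the interface name CommitI is borrowed from later in the article.

```typescript
import { createHash } from "crypto";

interface CommitI {
  id: string;
  message: string;
}

// Assumption: Node's crypto module stands in for the sha-1 library.
function sha1(content: string): string {
  return createHash("sha1").update(content, "utf8").digest("hex");
}

class Commit implements CommitI {
  id: string;
  message: string;

  constructor(message: string) {
    this.message = message;
    // For now the id depends only on the message.
    this.id = sha1(message);
  }
}

// The same message always produces the same 40-character hex id.
const commit = new Commit("message");
console.log(commit.id);
```

Because SHA-1 is deterministic, a test can compare the id against a fixed expected string, which is exactly what the check described above relies on.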

A simplistic way to see how Git knows that a file has changed, and can therefore be staged, is to compare hashes. Let’s see it in an example:

// index.ts
export * from "./git"
export * from "./commit"

We can think of this file as nothing more than a formatted string, and that string generates a specific hash. If we take the content of the file and pass it to the sha1 function, it returns this hash:

// assumes fs (Node) and sha1 (the hashing library) are already imported
const sha = sha1(fs.readFileSync('./index.ts', {encoding: 'utf-8'}))
console.log(sha)
// sha = 0280670383ddc7cd6640f60f6e4a10eb1799f807

Now, if we modify our file and add a new line, the same function generates a different hash:

// index.ts
export * from "./git"
export * from "./commit"
export * from './branch'

const sha = sha1(fs.readFileSync('./index.ts', {encoding: 'utf-8'}))
console.log(sha)
// sha = 84d89b5dbc2a304b1564d1761a68e8938ee7ea07

Notice that the hash has changed. This is how Git knows that a file changed.

Commit history

After creating our Commit class, we can now relate commits to each other with a commit history.

The history is a relation between older and newer commits, where new commits are added at the head of the list and older commits sit closer to the tail.

Let’s start by adding some new functionalities to the Commit interface:

Notice that we’ve added two new things to the Commit interface: the parent property and the getCommitLog function.

The parent property is how we’re going to relate commits to each other. On a branch without merges, Git’s commit history behaves like a linked list, because every commit keeps a reference to its parent.

A linked list is a data structure that is essentially a sequence of elements in memory, each linked either to both of its neighbors (doubly linked) or only to the next element (singly linked).

The Head is where the linked list starts, and the last item of the list points to null.

Our implementation is basically the same, except that when we append a new commit to our list of commits, we set the new commit as the Head of the list.

Before we create our getCommitLog function let’s modify our constructor a bit:

Let’s start by adding the parent commit and initializing it in our constructor. Whichever class or file creates an instance of Commit is responsible for telling it which commit it is related to.

Now, we can build our commit history by creating getCommitLog in the Commit class.

This function returns an array of ids (our hashes). I was satisfied with an array of hashes, but you can customize this as you like. If you’d rather return the whole class or a custom object, please do, and share with me how you customized the function.

Now let’s understand what is happening line by line, starting with commitAux. This is an auxiliary variable of type CommitI or null, so it can be the Head, the Tail, or any element of our list. At first, it receives our own instance as its value. The other variable is history, where we keep the ids of our list.

Next comes the while loop, which takes commitAux as its condition, so it keeps looping while commitAux is not null. Inside the loop, we add commitAux.id to history and then set commitAux to the parent of the current commit. At some point commitAux becomes null, the loop ends, and the function returns our array.
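The walk described above can be sketched as follows. The names commitAux, history, parent, and getCommitLog come from the text; hashing the message with Node’s crypto module is an assumption standing in for the sha-1 library.

```typescript
import { createHash } from "crypto";

interface CommitI {
  id: string;
  message: string;
  parent: CommitI | null;
  getCommitLog(): string[];
}

class Commit implements CommitI {
  id: string;
  message: string;
  parent: CommitI | null;

  constructor(message: string, parent: CommitI | null = null) {
    this.message = message;
    this.parent = parent;
    // Assumption: Node's crypto replaces the sha-1 library here.
    this.id = createHash("sha1").update(message).digest("hex");
  }

  getCommitLog(): string[] {
    // Start at this commit (the Head) and follow parent links to the tail.
    let commitAux: CommitI | null = this;
    const history: string[] = [];
    while (commitAux) {
      history.push(commitAux.id);
      commitAux = commitAux.parent;
    }
    return history;
  }
}

// Usage: the newest commit comes first, the oldest last.
const first = new Commit("first");
const second = new Commit("second", first);
console.log(second.getCommitLog());
```

Since each commit only knows its parent, the log is always read from the newest commit backwards, which matches the order in which git log prints history.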

I’ll leave testing this part of the code until we implement the Branch class, so let’s continue with our experiment.

Branch

Branches are how we can separate code in Git.

Let me give you an example. You’re on a team with two other developers working on the same codebase, which is hosted in the cloud. You have a feature that is ready for production, so you need to generate a build with it and send it to the codebase in the cloud. Meanwhile, your colleagues are developing a test feature that should not go to production, but they also need to send it to the cloud. Branches solve this easily: a branch behaves like a copy of the codebase where you can make whatever changes you want without messing up the original, which avoids conflicts and separates responsibilities. In this example, we could create two branches: one that only accepts code that is ready for production, and another that serves as a test (or develop) branch.

So at the end of the day, a branch points to one modification. We save modifications in the shape of commits, which means that a branch points to a commit, which points to another commit, and so on.

In the image below, the code that can be delivered to production lives in master, and the testing branch is where things can be tested with no restrictions. Both branches point to the same commit.

Now consider that your colleagues pushed their changes to the testing branch. This is what it will look like:

Now the changes your colleagues make will not affect master, because they live on a different path of the tree.

Now if you push your changes to master, this is what it will look like:

The changes don’t affect each other, but both branches share the same starting point.

Now we just need to replicate this with TypeScript.

Let’s start by creating our Branch interface, with a name and a commit as the properties of a branch.

Implementing this interface will look like this:
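A minimal sketch of that interface and class, assuming a branch may start without any commit (the checkout fragments later in the article pass a possibly missing commit to the constructor). The name BranchI and the trimmed CommitI shape are assumptions made for illustration.

```typescript
// Trimmed-down commit shape, enough for a branch to point at.
interface CommitI {
  id: string;
  parent: CommitI | null;
}

// Hypothetical interface name, following the GitI / CommitI convention.
interface BranchI {
  name: string;
  commit: CommitI | null;
}

class Branch implements BranchI {
  name: string;
  commit: CommitI | null;

  constructor(name: string, commit?: CommitI | null) {
    this.name = name;
    // A brand-new branch may not point at any commit yet.
    this.commit = commit ?? null;
  }
}

// Usage: two branches pointing at the same commit.
const root: CommitI = { id: "abc123", parent: null };
const master = new Branch("master", root);
const testing = new Branch("testing", master.commit);
console.log(master.commit === testing.commit);
```

A branch here is only a named pointer: creating a new one copies the reference to a commit, not the commit itself.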

Simple as that. Now let’s test our Branch class:

That’s cool, but it isn’t doing much. We only have a class that can return a name and a commit, which is basically nothing.

Let’s make some modifications and add some real functionality to our Git interface:

Adding those features to our Git class:

Wow, there is a lot happening in the Git class now, so let’s start with the new properties and leave the checkout function for last.

The branch property is the current branch of the repository, that is, the code the repository is looking at right now.

The branches property is private, which means it is only accessible inside the class and not exposed to the outside world. It’s an array where we store the branches we create.

In the constructor, we initialize branches as an empty array. Then we create our first branch, called “main”, add it to the branches array, and set it as our current branch.

The private method add pushes the new branch to our array.

Finally, we get to the checkout method, which receives as a parameter the name of the branch you want to check out.

If the branch exists in our repository, we switch to it:

// checkout
const branchIdx = this.branches?.findIndex((branch) => branch.name === name);
/* If the branch already exists, switch to it */
if (branchIdx !== undefined && branchIdx !== -1 && this.branches?.length) {
  this.branch = this.branches[branchIdx];
  console.info(`Switched to branch: ${this.branch.name}`);
  return this.branch;
}

If it doesn’t exist, we’re going to create and switch:

// checkout
/* Create the branch if it does not exist */
this.branch = new Branch(name, this.branch?.commit);
this.add(this.branch);
console.info(`Created and Switched to: ${name}`);
return this.branch;

If no name is passed, it returns the current branch. This check has to run before the other two, otherwise it would never be reached:

// checkout
if (!name) {
  console.info(`Current branch: ${this.branch.name}`);
  return this.branch;
}
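Putting the three fragments together, the Git class might look like the sketch below. It is an assembly under assumptions: Branch and CommitI are reduced to the minimum needed here, and a lookup with find replaces the findIndex version shown above.

```typescript
interface CommitI {
  id: string;
  parent: CommitI | null;
}

class Branch {
  constructor(public name: string, public commit: CommitI | null = null) {}
}

class Git {
  name: string;
  branch: Branch; // the branch the repository is currently on
  private branches: Branch[];

  constructor(name: string) {
    this.name = name;
    this.branches = [];
    const main = new Branch("main");
    this.add(main);
    this.branch = main;
  }

  private add(branch: Branch): void {
    this.branches.push(branch);
  }

  checkout(name?: string): Branch {
    // 1. No name: just report the current branch.
    if (!name) {
      console.info(`Current branch: ${this.branch.name}`);
      return this.branch;
    }
    // 2. Existing branch: switch to it.
    const existing = this.branches.find((branch) => branch.name === name);
    if (existing) {
      this.branch = existing;
      console.info(`Switched to branch: ${existing.name}`);
      return existing;
    }
    // 3. Unknown branch: create it from the current commit and switch.
    this.branch = new Branch(name, this.branch.commit);
    this.add(this.branch);
    console.info(`Created and Switched to: ${name}`);
    return this.branch;
  }
}

// Usage: create a branch, then go back to main.
const git = new Git("my-repo");
const dev = git.checkout("dev");
git.checkout("main");
```

Note that this checkout also creates missing branches, which is closer to git checkout -b than to plain git checkout.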

Now, after implementation, we can finally test our new code and the commit history.

Let’s create these new test cases:

Stage a file

Staging a file is nothing more than letting Git know that a file, or a group of files, is ready to be committed.

This action is represented by the command:

git add file.txt

Each file in your working directory can be in one of two states: tracked or untracked.

Tracked files are files that were in the last snapshot (commit), as well as any newly staged files. In short, tracked files are files that Git knows about.

Untracked files are everything else: any file in your working directory that was not in the last snapshot and is not in your staging area.

Please check the docs for further information.

For now, we’re going to implement staging a tracked file that was modified.

Let’s create this action and try to represent this with TypeScript

Start by creating the Add interface

This interface has a single method called stageFile, which receives a file path as its parameter.

After this, we can implement our interface in the Add class.

For now, we’ll only consider staging one file at a time.

Let’s understand what is happening in the Add class.

First of all, we have a private property called dbPath. This is the path of the DB (a simple .txt file) where we store the files that are staged and ready to be committed. We initialize this property in our constructor with either a custom path or our default path, store.txt.

In the stageFile method, we check whether the path is valid. If it is, we write the path to the file and return true; if not, we return false.
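A sketch of the Add class as described above. Two details are assumptions: the interface is called AddI here, and a “valid path” is taken to mean a path that points to an existing regular file.

```typescript
import * as fs from "fs";

// Hypothetical interface name, following the GitI / CommitI convention.
interface AddI {
  stageFile(path: string): boolean;
}

class Add implements AddI {
  private dbPath: string;

  constructor(dbPath: string = "store.txt") {
    this.dbPath = dbPath;
  }

  stageFile(path: string): boolean {
    // Assumption: "valid" means the path is an existing regular file.
    if (!fs.existsSync(path) || !fs.statSync(path).isFile()) {
      return false;
    }
    // Only one staged file at a time, so the store is overwritten.
    fs.writeFileSync(this.dbPath, path, { encoding: "utf-8" });
    return true;
  }
}

// Usage (commented out so the sketch has no side effects when loaded):
// const staged = new Add().stageFile("./index.ts");
// console.log(staged ? "staged" : "invalid path");
```

Overwriting the store keeps the one-file-at-a-time rule trivial; staging several files would mean appending one path per line instead.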

Now we can test the stageFile function

Now we must expose this method in our Git class and interface.

Let’s start by declaring the stageFile function in the Git interface:

After adding this declaration we can implement this function in our Git class:

The function is only a call to the Add class.

File content in Commit

After we stage our file in store.txt, we need to convert the content found at that path into a hash.

Let’s modify a few things in the Commit class:

You may notice that the hash is now generated from the content of the file whose path is stored in store.txt. That content is read by the private function getStore. After reading it and generating the hash, we clear the store with the clearStore function.
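The change described above might look like the following sketch. The names getStore and clearStore come from the text; the dbPath constructor parameter, the fallback to the message when nothing is staged, and the use of Node’s crypto module instead of the sha-1 library are assumptions.

```typescript
import * as fs from "fs";
import { createHash } from "crypto";

class Commit {
  id: string;
  message: string;
  parent: Commit | null;
  private dbPath: string;

  constructor(message: string, parent: Commit | null = null, dbPath: string = "store.txt") {
    this.message = message;
    this.parent = parent;
    this.dbPath = dbPath;
    // Assumption: fall back to the message when no file is staged.
    const content = this.getStore() ?? message;
    this.id = createHash("sha1").update(content).digest("hex");
    this.clearStore();
  }

  // Reads the staged path from the store, then the content of that file.
  private getStore(): string | null {
    if (!fs.existsSync(this.dbPath)) return null;
    const stagedPath = fs.readFileSync(this.dbPath, "utf-8").trim();
    if (!stagedPath || !fs.existsSync(stagedPath)) return null;
    return fs.readFileSync(stagedPath, "utf-8");
  }

  // Empties the store so the same file is not committed twice.
  private clearStore(): void {
    if (fs.existsSync(this.dbPath)) fs.writeFileSync(this.dbPath, "");
  }
}
```

With this version, two commits made from the same file content get the same id, just like the hash comparison shown earlier for index.ts.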

User input and commands

So far, we’ve built a few Git features: Repository, Branch, Commit, and Add.

Up to now we’ve only exercised the features with jest. That’s good, but not very practical, so let’s create a more real-world experience by reading user input.

Start by creating the file where your program will start. I’m going to call it index.ts.

This file will be the entry point of the program, so it’s nice to give the user some information.

I’m going to use ts-node to run our application locally, so make sure to add a start script to your package.json:

Back in index.ts, create an interface with Node’s readline module:

After creating this instance, let’s create an instance of our Git class to access its features, and read the user input with the question function.

Here, we create a recursive function called readCommand: after every answer from the user, the function calls itself again and asks for a new command.
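The recursive prompt can be sketched like this. To keep the example testable without a terminal, readCommand takes the asking function as a parameter; that parameter, the Ask type, and the "qti> " prompt text are assumptions, not the article’s exact code.

```typescript
// A function that shows a prompt and later calls back with the answer.
// readline's rl.question(query, callback) fits this shape.
type Ask = (prompt: string, onAnswer: (answer: string) => void) => void;

function readCommand(ask: Ask, run: (input: string) => void): void {
  ask("qti> ", (answer) => {
    run(answer.trim());
    // Ask again after every answer.
    readCommand(ask, run);
  });
}

// Wiring it to Node's readline (commented out so the sketch does not
// block waiting for input when loaded):
//
// import * as readline from "readline";
// const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
// rl.on("close", () => console.log("Goodbye"));
// readCommand((prompt, cb) => rl.question(prompt, cb), (input) => console.log(input));
```

With the real readline interface each callback runs asynchronously, so the recursion does not grow the call stack, and it simply stops once the interface is closed.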

Finally, add the on listener to detect when the program is over and say “Goodbye” to the user.

Now, you may notice that I haven’t explained yet what the syntaxValidator function does, so let’s dig into it. I started by creating an array of valid commands, very similar to the Git ones, except for qti, which is an abbreviation of quati (Portuguese for coati), a really cute animal.

After defining these valid values, I created the syntaxValidator function. There’s a lot going on inside it, with many ifs and elses, and it’s honestly not the cleanest code you’ll see. If you have any idea to make it better or more readable, please leave it in the comments below or open a PR in the repo. I’ll be more than glad to hear your opinion.

I’m not going too deep into the helper functions because they are really simple: they just check whether the user input is valid, so you can customize them as you want.

The isCommandStartingProperly function checks that the user input starts with “qti” and that a valid command follows it.

The isCheckoutCommandValid function checks whether the user input includes a branch name and, if it does, passes it to the git.checkout function.

The isBranchCommandValid function does almost the same as isCheckoutCommandValid, but it validates the branch -m command.

The isLogCommandValid function checks that the log command is valid. If there’s a commit in the history, it returns the history; if not, it tells the user to make a commit first.

The isCommitCommandValid function checks that the commit -m command is valid and passes the message to our commit.

The isAddCommandValid function checks that the add command is valid and sends the file path to the stageFile function.

The syntaxValidator is the entry point.
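To give an idea of the flow, here is a compact, hypothetical sketch of the entry point. It is not the repository’s code: the list of commands is inferred from the helpers named above, the Parsed result type is invented for illustration, and the per-command checks are folded into a single switch instead of separate functions.

```typescript
// Assumed command list, inferred from the helper functions described above.
const validCommands = ["checkout", "branch", "log", "commit", "add"];

type Parsed =
  | { ok: true; command: string; args: string[] }
  | { ok: false; error: string };

function isCommandStartingProperly(tokens: string[]): boolean {
  return tokens[0] === "qti" && validCommands.includes(tokens[1] ?? "");
}

function syntaxValidator(input: string): Parsed {
  const tokens = input.trim().split(/\s+/);
  if (!isCommandStartingProperly(tokens)) {
    return { ok: false, error: "Commands must look like: qti <command>" };
  }
  const command = tokens[1] ?? "";
  const args = tokens.slice(2);
  switch (command) {
    case "commit":
      // qti commit -m <message>
      if (args[0] !== "-m" || args.length < 2) {
        return { ok: false, error: "Usage: qti commit -m <message>" };
      }
      return { ok: true, command, args: [args.slice(1).join(" ")] };
    case "add":
      // qti add <path>
      if (args.length !== 1) {
        return { ok: false, error: "Usage: qti add <path>" };
      }
      return { ok: true, command, args };
    default:
      // checkout, branch, and log are passed through unchanged here.
      return { ok: true, command, args };
  }
}

// Usage: parse a line typed by the user.
console.log(syntaxValidator("qti commit -m first commit"));
```

Returning a result object keeps validation separate from execution, so the caller decides whether to run git.checkout, stageFile, and so on, or to print the error.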

After implementing the functions, you should see something close to this:

Now all that’s left is to test your commands and the functionality you’ve implemented.

Conclusion

That’s it for today’s article. We built some cool Git features with TypeScript, and I really hope you enjoyed it. If you have any suggestions or corrections, please let me know in the comments.
