3 Git Basics

Git is a powerful and widely used version control system that allows developers to track changes in their code over time, collaborate on projects of any scale, and maintain a history of their modifications. Its efficient handling of large projects, robust branching and merging capabilities, and distributed architecture make it suitable for both small- and large-scale development. This section introduces the basic concepts and operations of Git. Whether you’re new to Git or looking to strengthen your understanding, it provides the foundational knowledge required to navigate and use this version control system effectively. For more depth, refer to the official Git documentation at https://git-scm.com/docs/user-manual.

3.1 Installing Git

Choose your operating system: Git is compatible with Windows, macOS, and Linux, so make sure you choose the version that is compatible with your operating system.
Download and install Git. Go to the official Git website at https://git-scm.com/downloads and download the appropriate installer for your operating system. Follow the installation instructions provided by the installer.
Verify Git installation: Once Git is installed, open your terminal or command prompt and type git --version to verify that Git is installed correctly and check the version number.
Set up Git configuration: After installing Git, you need to set up your Git configuration. Open your terminal or command prompt and run the following commands:
```
git config --global user.name "Your Name"
git config --global user.email "youremail@example.com"
```
Replace Your Name with your actual name and youremail@example.com with your actual email address.

3.2 Initializing a Repository and Checking Status

git init initializes a new Git repository in the current directory by creating a hidden .git folder.

ls -a lists all files and folders, including hidden ones, so you can confirm that .git was created.

git status tells you which branch you are on and shows the current state of your repository. It displays information about any changes that have been made to your working directory or staging area, as well as any new files or directories that have been added.

When you run git status, Git will show you the following:

Information	Description
Branch information	tells you which branch you are currently on, and whether there are any changes to that branch that have not yet been committed.
Staging area	shows you any files that have been modified, deleted, or added to the staging area, which is the area where changes are prepared for the next commit.
Untracked files	shows you any files in your working directory that are not being tracked by Git.

By running git status, you can get a quick overview of what changes have been made to your repository and what still needs to be committed or added. This command is often used as a first step when working with Git, to ensure that you are working with the correct branch and that all changes have been properly tracked.
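
The commands in this section can be tried safely in a throwaway directory. The following is a minimal sketch, assuming a temporary directory created with mktemp and a hypothetical file named notes.md; neither comes from a real project.

```shell
# Create a scratch repository in a temporary directory (hypothetical location).
repo="$(mktemp -d)"
cd "$repo"
git init -q              # creates the hidden .git directory
ls -a                    # the listing now includes .git

# Add an untracked file and ask Git what it sees.
echo "draft" > notes.md
git status               # long format: branch, staged changes, untracked files
status_short="$(git status --short)"   # compact format, one line per path
echo "$status_short"     # "??" marks an untracked file
```

The short format is convenient once you are familiar with the long output, because each path is reduced to a two-character state code.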

3.3 Creating a Gitignore File

The .gitignore file is used to specify which files and directories should be ignored by Git when committing changes to a repository.

In the following example, the first line specifies that all files with the .txt extension should be ignored. The second line specifies that all files in the logs directory should be ignored. The third line specifies that the temp.log file in the root directory should be ignored. The fourth line specifies that all .md files in the docs directory and its subdirectories should be ignored.

The fifth line specifies that all files in the build directory should be ignored, except for the build/.keep file. This is achieved by using a negation pattern (i.e. !) to specify that the build/.keep file should not be ignored. The sixth line specifies that any directory named data, at any level of the repository, should be ignored.

 
# ignore all .txt files
*.txt

# ignore all files in the logs directory
logs/

# ignore the temp.log file in the root directory (the leading slash anchors the pattern to the root)
/temp.log

# ignore all .md files in the docs directory and its subdirectories
docs/**/*.md

# ignore all files in the build directory, except for the build/.keep file
build/*
!build/.keep

# ignore every directory named data, at any level, to avoid exposing PHI
data/
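
One way to confirm what a set of patterns does is to ask Git directly with git check-ignore. The sketch below is illustrative only: it builds a scratch repository with patterns modeled on the example above and a handful of hypothetical files. It uses /temp.log with a leading slash, since a pattern without a slash, such as temp.log, would match a file of that name in any directory.

```shell
# Scratch repository with patterns modeled on the example (all paths are hypothetical).
repo="$(mktemp -d)"
cd "$repo"
git init -q
printf '%s\n' '*.txt' 'logs/' '/temp.log' 'docs/**/*.md' \
  'build/*' '!build/.keep' 'data/' > .gitignore

# Create sample files so that every pattern has something to match.
mkdir -p logs docs/guide build data src
touch notes.txt logs/app.log temp.log src/temp.log docs/guide/intro.md \
  README.md build/out.bin build/.keep data/patients.csv

# check-ignore -q exits with 0 when the path is ignored and 1 when it is not.
report() {
  if git check-ignore -q "$1"; then echo "ignored: $1"; else echo "kept:    $1"; fi
}
for path in notes.txt logs/app.log temp.log src/temp.log docs/guide/intro.md \
    README.md build/out.bin build/.keep data/patients.csv; do
  report "$path"
done
```

Running git check-ignore -v on a single path additionally prints which line of which file produced the match, which helps when patterns interact.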

3.3.1 Ethical and Legal Considerations

Safeguarding Sensitive Health Data: The Importance of HIPAA Compliance and Responsible Data Management

When managing sensitive information, particularly health-related data governed by stringent regulations such as the Health Insurance Portability and Accountability Act (HIPAA), it’s crucial to take every possible measure to prevent the exposure of protected health information (PHI). This may include patient identification numbers, medical records, or any other personally identifiable data that, if misused, could lead to breaches of privacy or confidentiality.

To minimize the risk of accidental disclosure or unauthorized access, never upload or expose this sensitive data to a remote repository, even if certain elements of the data, such as patient IDs, have been anonymized. The .gitignore file supports this precaution: it allows you to specify files and directories that Git should not track, so they are never staged, committed, or pushed to a remote. By adding sensitive data directories to .gitignore before the data are ever committed, you keep them out of version control. Note that .gitignore has no effect on files Git already tracks, and it does not remove anything from existing history.

Despite these measures, it’s important to note that the ultimate responsibility of protecting sensitive information falls on you, as the end-user. Given the severe legal implications and potential financial costs associated with data breaches, it’s in your best interest to take all necessary steps to safeguard the data you handle, ensuring that it complies with all relevant data protection and privacy laws, including HIPAA. It’s not only a matter of legal compliance, but also one of ethical responsibility towards the individuals whose data you are entrusted with.

3.4 Exploring a Repository’s History

git log is a command in Git that displays the commit history of a repository. When you run git log, Git will show you a list of the commits reachable from your current branch, starting with the most recent commit and going back in time.

For each commit, git log will display the following information:

Information	Description
Commit hash	A unique identifier for the commit.
Author	The name and email address of the person who made the commit.
Date	The date and time when the commit was made.
Commit message	A brief description of the changes made in the commit.

By default, git log will show the entire commit history of the current branch. However, you can also use various options to filter and customize the output. For example, you can use git log -n <number> (such as git log -n 5) to show only the most recent commits, or git log --grep to search for commits that contain a specific keyword in the commit message.

git log is a powerful tool for exploring the history of a repository, tracking changes over time, and identifying when and where bugs or other issues may have been introduced. It is often used by developers to understand the development history of a codebase, and to collaborate and coordinate with other team members on a project.

Example:

 git log --grep="fixes a bug"
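
To see these options on a small history, the following sketch creates three commits in a scratch repository and then filters them. The identity, file name, and commit messages are placeholders chosen for illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

# Three small commits so there is some history to explore.
for msg in "add README" "fixes a bug in the parser" "update docs"; do
  echo "$msg" >> README.md
  git add README.md
  git commit -q -m "$msg"
done

git log                                    # full history, newest first
git log -n 2 --oneline                     # only the two most recent commits
last_two="$(git log -n 2 --format=%s)"     # commit subjects only
bug_fixes="$(git log --grep="fixes a bug" --format=%s)"
echo "$bug_fixes"                          # commits whose message matches the keyword
```

Note that --grep is case-sensitive unless you also pass -i.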

3.5 Adding, Committing, and Pushing Changes

git add: The git add command is used to add changes to the staging area. When you make changes to your codebase, these changes are initially only in your working directory. You can use the git add command to add these changes to the staging area. Once changes are in the staging area, you can commit them to the repository.
1. git add . is used to stage all changes in the current directory and its subdirectories for inclusion in the next commit. This includes newly created (untracked) files, modified files, and deleted files, but not files matched by .gitignore.
2. git add name_of_file.extension is used to stage specific files or directories by specifying their names in the command. For example, the following command will stage changes made to the file myfile.txt:
```
 git add myfile.txt 
```
git commit: The git commit command is used to create a new commit in your local repository. A commit is like a snapshot of your code at a specific point in time. When you commit changes, you provide a commit message that describes the changes you’ve made. Commit messages should be descriptive and help other developers understand what you’ve changed. For example, if you want to commit changes with the message “Updated the homepage content”, you can use the following command:
```
 git commit -m "Updated the homepage content" 
```
git push: The git push command is used to upload commits from your local repository to a remote repository, such as one hosted on GitHub or GitLab. When you push, the remote branch is updated to match your local branch, making your changes available to other developers.
The general form of the command is shown below, where <remote> is the name of the remote repository (often origin) and <branch> is the name of the local branch you want to push.
```
git push <remote> <branch>
```
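
The three steps can be tried end to end without a hosting account. In the sketch below, a bare repository in a temporary directory stands in for a remote such as GitHub; that substitution, along with the identity and the file index.html, is an assumption made so the example runs offline.

```shell
# A bare repository plays the role of the hosted remote (assumption for the demo).
remote="$(mktemp -d)"
git init -q --bare "$remote"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$remote"

echo "<h1>Welcome</h1>" > index.html
git add index.html                               # stage the new file
git commit -q -m "Updated the homepage content"  # record the snapshot locally

branch="$(git rev-parse --abbrev-ref HEAD)"      # name of the current branch
git push -q origin "$branch"                     # upload the branch to the remote
git ls-remote --heads origin                     # the remote now lists the branch
```

With a real hosting service, only the URL given to git remote add changes; the add, commit, and push steps stay the same.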

3.6 Navigating the Git Tree: Understanding Branch Creation and Switching

Git offers powerful features that support efficient version control, including the ability to create and switch between different branches. The git checkout command is a fundamental Git operation, enabling you to navigate the “tree” structure of your repository, create new branches, and restore your working directory to a previous state.

3.6.1 The Anatomy of a Git Tree

In a typical Git repository, you’ll find a default branch, commonly named main or master. This branch represents the primary codebase, where all stable and production-ready code resides. However, for managing multiple versions of the project, accommodating parallel development efforts, or isolating changes for specific features or bug fixes, Git allows the creation of additional branches. You can visualize the structure as a tree, with the main or master branch as the trunk, and other branches representing the branches of the tree.

3.6.2 Branch Creation

With git checkout, you can create a new branch and simultaneously switch to it, allowing you to commence work on a new feature or bug fix without disturbing the main development branch. To illustrate, creating a new branch named new_features can be accomplished as follows:

 git checkout -b new_features

3.6.3 Branch Switching

The git checkout command also facilitates switching between branches in your repository. Switching branches prompts Git to update your working directory to mirror the content of the new branch. As an example, let’s assume you’ve created a new branch called new_models. To switch to this branch from the main or master branch, you would enter:

 git checkout new_models

By effectively navigating the Git tree, developers can maintain a clean and stable main branch, while concurrently developing, testing, and implementing new features or fixes on separate branches.
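
The effect of switching on the working directory is easiest to see with a file that differs between branches. The following sketch assumes a scratch repository and a hypothetical file app.txt; it reads the starting branch name from Git instead of assuming main or master.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
start_branch="$(git rev-parse --abbrev-ref HEAD)"  # main or master, depending on configuration

git checkout -q -b new_features        # create the branch and switch to it
echo "experimental" >> app.txt
git commit -q -am "Start work on new features"

git checkout -q "$start_branch"        # switch back: app.txt returns to its earlier content
content_on_start="$(cat app.txt)"
git checkout -q new_features           # the experimental line is back
content_on_feature="$(cat app.txt)"
git branch                             # lists both branches; * marks the current one
```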

3.7 Git and Shell Commands for File and Directory Management

It’s crucial to understand the difference between Git-specific commands and those used in the shell. Although they might look somewhat similar, they serve different purposes and have different impacts on your project.

3.7.1 Untracking Files and Directories in Git

The git rm -r --cached command is specifically designed for Git usage. It’s employed when you want to stop Git from tracking certain files in your repository. While this command removes files from the index (the staging area for commits), it does not delete them from your working directory.

For instance, suppose you’ve added a directory called internal_notes/ to your repository but later decide that Git shouldn’t track the files in internal_notes/. In this case, you would execute:

 git rm -r --cached internal_notes/

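This action removes all files within the internal_notes/ directory from Git tracking, but they remain on your local filesystem. The removal is staged as a deletion and takes effect in the repository once you commit it.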
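
The following sketch shows the difference between the index and the filesystem after untracking. It assumes a scratch repository in which internal_notes/ was committed by mistake; the file todo.txt is hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir internal_notes
echo "private" > internal_notes/todo.txt
git add internal_notes
git commit -q -m "Add notes by mistake"

git rm -r -q --cached internal_notes/     # remove from the index only
git commit -q -m "Stop tracking internal notes"

tracked="$(git ls-files internal_notes)"  # empty: Git no longer tracks the directory
ls internal_notes                         # todo.txt is still on disk
git status --short                        # the directory now shows up as untracked
```

Because the directory reappears as untracked, it is usually paired with a .gitignore entry so that it is not added again by accident.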

3.7.2 Removing Files and Directories with Shell Commands

Conversely, the rm -rf command is a shell command unrelated to Git. It forcefully removes files or directories from your filesystem. When this command executes, it permanently deletes the specified files or directories.

For example, to delete a directory named internal_notes/, you’d use:

 rm -rf internal_notes/

Executing this command permanently removes the internal_notes/ directory and all its contents from your filesystem. Be extremely cautious with this command, as it doesn’t simply untrack files—it erases them for good.

3.7.3 Modifying the .gitignore File

Sometimes, you’ll want to instruct Git to ignore certain files or directories, meaning it won’t track changes to those files. To do this, you can add entries to a .gitignore file in your repository.

To add additional files or directories to your .gitignore file, you can use the echo command. This command, when combined with the >> operator, appends the specified string to the end of a file.

For example, to add internal_notes/ to your .gitignore file, you’d use:

 echo "internal_notes/" >> .gitignore

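This command appends internal_notes/ to your .gitignore file. This addition tells Git to ignore the internal_notes/ directory and not track its contents. Keep in mind that .gitignore only affects files Git is not already tracking; a directory that has been committed must first be untracked with git rm -r --cached. You can use this command to add any number of files or directories to your .gitignore file as your project requires.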
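
To watch the entry take effect, compare git status before and after appending it. This is a small sketch in a scratch repository with a hypothetical, never-committed internal_notes/ directory.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

mkdir internal_notes
echo "private" > internal_notes/todo.txt
before="$(git status --short)"            # the directory is reported as untracked

echo "internal_notes/" >> .gitignore      # >> appends, creating the file if it is missing
after="$(git status --short)"             # only .gitignore itself is reported now
echo "before: $before"
echo "after:  $after"
cat .gitignore
```

Using >> matters here: a single > would overwrite any entries already in the file.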

3.8 Undoing Changes in Git: Understanding Revert and Reset

Git provides powerful features that enable developers to navigate their repositories, experiment freely, and make mistakes without fear. Two critical Git commands that facilitate undoing changes are git revert and git reset. These commands differ in their approaches and are used in distinct scenarios.

3.8.1 Git Revert

The git revert command is used when you want to reverse the effect of a specific commit while preserving the history of your repository. It does this by creating a new commit that undoes the changes made in the commit you’re reverting. This approach is safe and non-destructive, making it ideal for public or shared repositories where preserving history and avoiding force pushes is crucial.

For instance, if you want to revert the last commit, you can use the following command:

 git revert HEAD

For reverting a specific commit, replace HEAD with the commit hash:

 git revert [commit_hash]
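
The sketch below illustrates that a revert adds to the history instead of rewriting it. The repository, identity, and file config.txt are placeholders; --no-edit is used only so that Git accepts the default message without opening an editor.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > config.txt
git add config.txt
git commit -q -m "Add config"
echo "broken" > config.txt
git commit -q -am "Change config"

git revert --no-edit HEAD                     # new commit that undoes "Change config"
git log --oneline                             # three commits: the history is preserved
commit_count="$(git rev-list --count HEAD)"
cat config.txt                                # the file is back to "stable"
```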

3.8.2 Git Reset

On the other hand, git reset alters the commit history, which can be a powerful but potentially destructive operation if not used carefully. It moves the HEAD pointer to a specified commit and optionally changes the staging area or working directory to match the state at that commit.

The command git reset --hard changes the HEAD, staging area, and working directory to match the state at the specified commit, effectively discarding all changes since that commit:

 git reset --hard [commit_hash]

If you want to discard all changes in the working directory and staging area and go back to the state of the last commit, use:

 git reset --hard HEAD

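Note: This command is “destructive” because it discards uncommitted changes in the working directory and staging area, and Git cannot bring those back. Commits dropped by a reset can usually still be located with git reflog for a limited time, but you should not rely on this.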

When you’re uncertain, use git revert to safely undo a commit. Save git reset for situations where you’re confident that the lost changes won’t be needed in the future or for local changes that haven’t been pushed yet.

These commands, when used wisely, offer powerful control over your Git repository, allowing you to experiment freely, and safely undo changes when necessary. However, always remember that with great power comes great responsibility. Be mindful of the implications when altering the commit history, particularly in shared repositories.
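
For contrast with revert, the following sketch resets a scratch repository to its first commit. All names are placeholders, and the example is meant for a local, unpushed history only.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First"
first="$(git rev-parse HEAD)"                 # hash of the commit to return to
echo "two" > file.txt
git commit -q -am "Second"
echo "uncommitted edit" >> file.txt           # a change that was never committed

git reset -q --hard "$first"                  # move the branch back and overwrite files
commit_count="$(git rev-list --count HEAD)"   # "Second" is no longer in the history
cat file.txt                                  # "one": the uncommitted edit is gone as well
git reflog -n 3                               # the dropped commit is still listed here
```

Here the history shrinks from two commits to one, whereas a revert would have grown it to three.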

3.8.3 Restoring a file to a previous state

If you want to revert a specific file to the version from a past commit, you can use the git checkout command with the commit hash and the file name. In the command below, <commitid> stands for the SHA hash of the commit that has the version of the file you want to restore, and <filename> stands for the name of that file.

The command is as follows:

 git checkout <commitid> <filename>

This command will update the specified file in your working directory and staging area to match the version from the specified commit. It does not create a commit or change the repository’s history; to keep the restored version, commit it as you would any other change.

Remember to replace <commitid> with the actual commit hash (e.g., a1b2c3d4) and <filename> with the actual file name (e.g., myfile.txt).

For example:

 git checkout a1b2c3d4 myfile.txt

This command can be very useful when you have made unwanted changes to a file and want to restore it to a previous state. Be careful, as this operation will discard any uncommitted changes to the file in your working directory.
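
A short sketch of the full round trip follows. It assumes a scratch repository in which myfile.txt has two committed versions; the variable good_commit stands in for a hash such as a1b2c3d4, and the optional -- separator is added to make clear where the commit ends and the path begins.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > myfile.txt
git add myfile.txt
git commit -q -m "Add myfile"
good_commit="$(git rev-parse --short HEAD)"   # stands in for a hash such as a1b2c3d4
echo "version 2" > myfile.txt
git commit -q -am "Rewrite myfile"

git checkout "$good_commit" -- myfile.txt     # restore the older version of one file
cat myfile.txt                                # version 1
restored_status="$(git status --short)"       # the restored version is already staged
echo "$restored_status"
history_len="$(git rev-list --count HEAD)"    # no commits were added or removed
```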

3.9 Additional Notes

git branch lists your local branches and marks the one you are currently on; git branch <name> creates a new branch without switching to it.
git branch -d <name> deletes a branch, and refuses if the branch contains commits that have not been merged.
git add name_of_file.extension only works if you run it from the directory where the file is stored; otherwise, include the path to the file.
Once you have set up your virtual environment, make sure to add its directory to your .gitignore file, if you have not already.
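
The branch commands in these notes can be sketched as follows, assuming a scratch repository with a single commit and a hypothetical branch named experiment.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git branch experiment                          # create a branch without switching to it
git branch                                     # list branches; * marks the current one
current="$(git rev-parse --abbrev-ref HEAD)"   # still the starting branch
git branch -d experiment                       # delete it; allowed because nothing is unmerged
remaining="$(git branch --format='%(refname:short)')"
echo "$remaining"
```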