You already have your own forked copy of the NumPy repository (see Create a NumPy fork and Make the local copy), you have configured git by following Git configuration, and you have linked the upstream repository as explained in Linking your repository to the upstream repo.
What is described below is a recommended workflow with Git.
In short:
Start a new feature branch for each set of edits that you do. See below.
Hack away! See below
When finished:
Contributors: push your feature branch to your own Github repo, and create a pull request.
Core developers: If you want to push changes without further review, see the notes below.
This way of working helps to keep work well organized and the history as clear as possible.
See also
There are many online tutorials to help you learn git. For discussions of specific git workflows, see these discussions on linux git workflow, and ipython git workflow.
First, fetch new commits from the upstream repository:
git fetch upstream
Then, create a new branch based on the master branch of the upstream repository:
git checkout -b my-new-feature upstream/master
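The two commands above can be tried safely in a sandbox. The following is a minimal sketch, assuming a throwaway bare repository stands in for the real upstream; the directory names and the identity exported at the top are hypothetical and exist only so the script runs without any global git configuration.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical "upstream": a bare repo seeded with one commit on master.
git init -q --bare "$work/upstream.git"
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "hello" > README
git add README
git commit -q -m "DOC: add README"
git push -q "$work/upstream.git" master

# Local repository with a remote named "upstream", as in the text.
git init -q "$work/clone"
cd "$work/clone"
git remote add upstream "$work/upstream.git"

# Fetch new commits, then start the feature branch from upstream/master.
git fetch -q upstream
git checkout -q -b my-new-feature upstream/master
git rev-parse --abbrev-ref HEAD   # prints my-new-feature
```

Because the branch starts from upstream/master rather than from a possibly stale local master, it contains exactly the commits upstream had at fetch time.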
# hack hack
git status # Optional
git diff # Optional
git add modified_file
git commit
# push the branch to your own Github repo
git push origin my-new-feature
Make some changes. When you feel that you’ve made a complete, working set of related changes, move on to the next steps.
Optional: Check which files have changed with git status (see git status). You’ll see a listing like this one:
# On branch my-new-feature
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#  modified:   README
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#  INSTALL
no changes added to commit (use "git add" and/or "git commit -a")
Optional: Compare the changes with the previous version using git diff (see git diff). This brings up a simple text browser interface that highlights the difference between your files and the previous version.
Add any relevant modified or new files using git add modified_file (see git add). This puts the files into a staging area, which is a queue of files that will be added to your next commit. Only add files that have related, complete changes. Leave files with unfinished changes for later commits.
To commit the staged files into the local copy of your repo, do git commit. At this point, a text editor will open up to allow you to write a commit message. Read the commit message section to be sure that you are writing a properly formatted and sufficiently detailed commit message. After saving your message and closing the editor, your commit will be saved. For trivial commits, a short commit message can be passed in through the command line using the -m flag. For example, git commit -am "ENH: Some message".
In some cases, you will see this form of the commit command: git commit -a. The extra -a flag automatically commits all modified files and removes all deleted files. This can save you some typing of numerous git add commands; however, it can add unwanted changes to a commit if you’re not careful. For more information, see why the -a flag? - and the helpful use-case description in the tangled working copy problem.
Push the changes to your forked repo on github:
git push origin my-new-feature
For more information, see git push.
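The difference between staging files one at a time and using the -a flag can be seen in a scratch repository. This is a minimal sketch; the file names a.txt, b.txt and c.txt and the exported identity are hypothetical.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q

# Two tracked files in an initial commit.
printf 'one\n' > a.txt
printf 'two\n' > b.txt
git add a.txt b.txt
git commit -q -m "ENH: add a.txt and b.txt"

# Modify both tracked files and create a new, untracked one.
printf 'one, edited\n' > a.txt
printf 'two, edited\n' > b.txt
printf 'new\n' > c.txt

# Stage only a.txt: the commit contains that change and nothing else.
git add a.txt
git commit -q -m "MAINT: edit a.txt"
git status --short   # b.txt is still modified, c.txt is untracked

# -a stages every modified *tracked* file; the untracked c.txt is left out.
git commit -q -a -m "MAINT: edit b.txt"
git status --short   # only the untracked c.txt remains
```

Note that -a never picks up new files, so a forgotten git add for a new file is a common way to end up with an incomplete commit.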
Note
Assuming you have followed the instructions in these pages, git will create a default link to your github repo called origin. In git >= 1.7 you can ensure that the link to origin is permanently set by using the --set-upstream option:
git push --set-upstream origin my-new-feature
From now on git will know that my-new-feature is related to the my-new-feature branch in your own github repo. Subsequent push calls are then simplified to the following:
git push
You have to use --set-upstream for each new branch that you create.
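The effect of --set-upstream can be checked locally. In this sketch a bare repository plays the role of your GitHub fork; its path, the file names and the exported identity are assumptions made for illustration.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical stand-in for your fork on GitHub.
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git remote add origin "$work/origin.git"

echo "base" > base.txt
git add base.txt
git commit -q -m "ENH: base commit"

git checkout -q -b my-new-feature
echo "first" > notes.txt
git add notes.txt
git commit -q -m "ENH: add notes"

# First push: record origin/my-new-feature as the branch's upstream.
git push -q --set-upstream origin my-new-feature
git rev-parse --abbrev-ref 'my-new-feature@{upstream}'   # prints origin/my-new-feature

# Later pushes from this branch need no arguments.
echo "second" >> notes.txt
git commit -q -am "ENH: extend notes"
git push -q
```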
It may be the case that while you were working on your edits, new commits have been added to upstream that affect your work. In this case, follow the Rebasing on master section of this document to apply those changes to your branch.
Commit messages should be clear and follow a few basic rules. Example:
ENH: add functionality X to numpy.<submodule>.

The first line of the commit message starts with a capitalized acronym
(options listed below) indicating what type of commit this is. Then a
blank line, then more text if needed. Lines shouldn't be longer than 72
characters. If the commit is related to a ticket, indicate that with
"See #3456", "See ticket 3456", "Closes #3456" or similar.
Describing the motivation for a change, the nature of a bug for bug fixes or some details on what an enhancement does are also good to include in a commit message. Messages should be understandable without looking at the code changes. A commit message like MAINT: fixed another one is an example of what not to do; the reader has to go look for context elsewhere.
Standard acronyms to start the commit message with are:
API: an (incompatible) API change
BENCH: changes to the benchmark suite
BLD: change related to building numpy
BUG: bug fix
DEP: deprecate something, or remove a deprecated object
DEV: development tool or utility
DOC: documentation
ENH: enhancement
MAINT: maintenance commit (refactoring, typos, etc.)
REV: revert an earlier commit
STY: style fix (whitespace, PEP8)
TST: addition or modification of tests
REL: related to releasing numpy
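The mechanical parts of these rules (known acronym, blank second line, 72-character limit) are easy to check automatically. The function below is a hypothetical helper written for this illustration, not part of NumPy's tooling; it cannot judge whether a message is actually informative.

```shell
# Hypothetical helper: checks the format rules described above.
# Usage: check_commit_message FILE   (FILE contains the commit message)
check_commit_message() {
    msg_file=$1
    prefixes='API|BENCH|BLD|BUG|DEP|DEV|DOC|ENH|MAINT|REV|STY|TST|REL'

    # Rule 1: the first line starts with a known acronym and a colon.
    head -n 1 "$msg_file" | grep -Eq "^($prefixes): " \
        || { echo "bad prefix"; return 1; }

    # Rule 2: if there is a second line, it must be blank.
    second=$(sed -n '2p' "$msg_file")
    [ -z "$second" ] || { echo "second line not blank"; return 1; }

    # Rule 3: no line is longer than 72 characters.
    if awk 'length($0) > 72 { found = 1 } END { exit !found }' "$msg_file"; then
        echo "line too long"
        return 1
    fi
    echo "ok"
}

msg=$(mktemp)
printf 'ENH: add functionality X\n\nLonger explanation.\n' > "$msg"
check_commit_message "$msg"           # prints "ok"
printf 'fixed another one\n' > "$msg"
check_commit_message "$msg" || true   # prints "bad prefix"
```

A check like this could be wired into a local commit-msg hook, but that is a choice for each contributor and is not something this document requires.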
If you plan a new feature or API change, it’s wisest to first email the NumPy mailing list asking for comment. If you haven’t heard back in a week, it’s OK to ping the list again.
When you feel your work is finished, you can create a pull request (PR). Github has a nice help page that outlines the process for filing pull requests.
If your changes involve modifications to the API or addition/modification of a function, add a release note to the doc/release/upcoming_changes/ directory, following the instructions and format in the doc/release/upcoming_changes/README.rst file.
We review pull requests as soon as we can, typically within a week. If you get no review comments within two weeks, feel free to ask for feedback by adding a comment on your PR (this will notify maintainers).
If your PR is large or complicated, asking for input on the numpy-discussion mailing list may also be useful.
This updates your feature branch with changes from the upstream NumPy github repo. If you do not absolutely need to do this, try to avoid doing it, except perhaps when you are finished. The first step will be to update the remote repository with new commits from upstream:
git fetch upstream
Next, you need to update the feature branch:
# go to the feature branch
git checkout my-new-feature
# make a backup in case you mess up
git branch tmp my-new-feature
# rebase on upstream master branch
git rebase upstream/master
If you have made changes to files that have changed also upstream, this may generate merge conflicts that you need to resolve. See below for help in this case.
Finally, remove the backup branch upon a successful rebase:
git branch -D tmp
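The whole rebase procedure can be rehearsed in a sandbox before trying it on real work. This is a sketch under stated assumptions: a local bare repository stands in for upstream, new upstream commits are simulated by pushing from the local master, and all file names and the exported identity are hypothetical.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical upstream plus a local repository that tracks it.
git init -q --bare "$work/upstream.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git remote add upstream "$work/upstream.git"
echo base > base.txt
git add base.txt
git commit -q -m "ENH: base"
git push -q upstream master

# Feature work on its own branch.
git checkout -q -b my-new-feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "ENH: feature"

# Meanwhile, upstream master moves ahead (simulated from local master).
git checkout -q master
echo newer > upstream.txt
git add upstream.txt
git commit -q -m "ENH: upstream work"
git push -q upstream master

# The procedure from the text: fetch, back up, rebase, drop the backup.
git fetch -q upstream
git checkout -q my-new-feature
git branch tmp my-new-feature
git rebase -q upstream/master
git --no-pager log --oneline
git branch -q -D tmp
```

After the rebase, the feature commit sits on top of the newest upstream commit, so the branch history stays linear.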
Rebasing on master is preferred over merging upstream back to your branch. Using git merge and git pull is discouraged when working on feature branches.
Sometimes, you mess up merges or rebases. Luckily, in Git it is relatively straightforward to recover from such mistakes.
If you mess up during a rebase:
git rebase --abort
If you notice you messed up after the rebase:
# reset branch back to the saved point
git reset --hard tmp
If you forgot to make a backup branch:
# look at the reflog of the branch
git reflog show my-feature-branch

8630830 my-feature-branch@{0}: commit: BUG: io: close file handles immediately
278dd2a my-feature-branch@{1}: rebase finished: refs/heads/my-feature-branch onto 11ee694744f2552d
26aa21a my-feature-branch@{2}: commit: BUG: lib: make seek_gzip_factory not leak gzip obj
...

# reset the branch to where it was before the botched rebase
git reset --hard my-feature-branch@{2}
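Recovery through the reflog can be practiced without risk. In this sketch the mistake is simulated with a hard reset rather than a botched rebase, which is an assumption made to keep the script short; the recovery step is the same. File names and the exported identity are hypothetical.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/my-feature-branch

echo one > file.txt
git add file.txt
git commit -q -m "ENH: first"
echo two >> file.txt
git commit -q -am "ENH: second"
good=$(git rev-parse HEAD)   # the state we will want back

# Simulated mistake: the second commit disappears from the branch.
git reset -q --hard HEAD~1

# The reflog still remembers every position the branch has had.
git --no-pager reflog show my-feature-branch

# @{0} is the reset itself, @{1} is the branch just before it.
git reset -q --hard 'my-feature-branch@{1}'
```

The index in braces depends on how many operations happened since the good state, so always read the reflog output before choosing which entry to reset to.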
If you didn’t actually mess up but there are merge conflicts, you need to resolve those. This can be one of the trickier things to get right. For a good description of how to do this, see this article on merging conflicts.
Do this only for your own feature branches.
There’s an embarrassing typo in a commit you made? Or perhaps you made several false starts you would like posterity not to see.
This can be done via interactive rebasing.
Suppose that the commit history looks like this:
git log --oneline
eadc391 Fix some remaining bugs
a815645 Modify it so that it works
2dec1ac Fix a few bugs + disable
13d7934 First implementation
6ad92e5 * masked is now an instance of a new object, MaskedConstant
29001ed Add pre-nep for a couple of structured_array_extensions.
...
and 6ad92e5 is the last commit in the master branch. Suppose we want to make the following changes:
Rewrite the commit message for 13d7934 to something more sensible.
Combine the commits 2dec1ac, a815645, eadc391 into a single one.
We do as follows:
# make a backup of the current state
git branch tmp HEAD
# interactive rebase
git rebase -i 6ad92e5
This will open an editor with the following text in it:
pick 13d7934 First implementation
pick 2dec1ac Fix a few bugs + disable
pick a815645 Modify it so that it works
pick eadc391 Fix some remaining bugs

# Rebase 6ad92e5..eadc391 onto 6ad92e5
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like "squash", but discard this commit's log message
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#
To achieve what we want, we will make the following changes to it:
r 13d7934 First implementation
pick 2dec1ac Fix a few bugs + disable
f a815645 Modify it so that it works
f eadc391 Fix some remaining bugs
This means that (i) we want to edit the commit message for 13d7934, and (ii) collapse the last three commits into one. Now we save and quit the editor.
Git will then immediately bring up an editor for editing the commit message. After revising it, we get the output:
[detached HEAD 721fc64] FOO: First implementation
 2 files changed, 199 insertions(+), 66 deletions(-)
[detached HEAD 0f22701] Fix a few bugs + disable
 1 files changed, 79 insertions(+), 61 deletions(-)
Successfully rebased and updated refs/heads/my-feature-branch.
and the history looks now like this:
0f22701 Fix a few bugs + disable
721fc64 ENH: Sophisticated feature
6ad92e5 * masked is now an instance of a new object, MaskedConstant
If it went wrong, recovery is again possible as explained above.
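The squashing part of the example above can be reproduced without an editor. This is a sketch, not the interactive session itself: it assumes the GIT_SEQUENCE_EDITOR environment variable is used to run a small script in place of the editor, it skips the reword step (which would need a second, message-editing script), and the script name edit-todo.sh is hypothetical.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

echo base > file.txt
git add file.txt
git commit -q -m "ENH: base"
base=$(git rev-parse HEAD)

# Four commits mirroring the history in the example.
for msg in "First implementation" "Fix a few bugs + disable" \
           "Modify it so that it works" "Fix some remaining bugs"; do
    echo "$msg" >> file.txt
    git commit -q -am "$msg"
done

# Stand-in for the editor: turn the 3rd and 4th "pick" into "fixup".
cat > "$work/edit-todo.sh" <<'EOF'
#!/bin/sh
# $1 is the todo file that git would normally open in an editor.
sed -e '3,4s/^pick/fixup/' "$1" > "$1.new" && mv "$1.new" "$1"
EOF

# make a backup of the current state, then rebase
git branch tmp HEAD
GIT_SEQUENCE_EDITOR="sh $work/edit-todo.sh" git rebase -q -i "$base"
git --no-pager log --oneline
```

The last three commits end up as one, while the file contents are identical to the tmp backup, which is a quick way to confirm that only the history changed.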
git checkout master
# delete branch locally
git branch -D my-unwanted-branch
# delete branch on github
git push origin --delete my-unwanted-branch
See also: https://stackoverflow.com/questions/2003505/how-do-i-delete-a-git-branch-locally-and-remotely
If you want to work on some stuff with other people, where you are all committing into the same repository, or even the same branch, then just share it via github.
First fork NumPy into your account, as from Create a NumPy fork.
Then, go to your forked repository github page, say https://github.com/your-user-name/numpy
Click on the ‘Admin’ button, and add anyone else to the repo as a collaborator.
Now all those people can do:
git clone git@github.com:your-user-name/numpy.git
Remember that links starting with git@ use the ssh protocol and are read-write; links starting with git:// are read-only.
Your collaborators can then commit directly into that repo with the usual:
git commit -am 'ENH - much better code'
git push origin my-feature-branch # pushes directly into your repo
To see a graphical representation of the repository branches and commits:
gitk --all
To see a linear list of commits for this branch:
git log
You can also look at the network graph visualizer for your github repo.
Backporting is the process of copying new features/fixes committed in numpy/master back to stable release branches. To do this you make a branch off the branch you are backporting to, cherry pick the commits you want from numpy/master, and then submit a pull request for the branch containing the backport.
First, you need to make the branch you will work on. This needs to be based on the older version of NumPy (not master):
# Make a new branch based on numpy/maintenance/1.8.x,
# backport-3324 is our new name for the branch.
git checkout -b backport-3324 upstream/maintenance/1.8.x
Now you need to apply the changes from master to this branch using git cherry-pick:
# Update remote
git fetch upstream
# Check the commit log for commits to cherry pick
git log upstream/master
# This pull request included commits aa7a047 to c098283 (inclusive)
# so you use the .. syntax (for a range of commits), the ^ makes the
# range inclusive.
git cherry-pick aa7a047^..c098283
...
# Fix any conflicts, then if needed:
git cherry-pick --continue
You might run into some conflicts cherry picking here. These are resolved the same way as merge/rebase conflicts, except that here you can use git blame to see the difference between master and the backported branch to make sure nothing gets screwed up.
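The inclusive range syntax is easy to get wrong, so it is worth trying on a scratch repository first. In this sketch everything is local: a branch named maintenance/1.8.x is created directly in the scratch repository rather than fetched from upstream, and the file names and exported identity are hypothetical.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "ENH: base"
git branch maintenance/1.8.x   # local stand-in for the stable branch

# Three later commits on master, each touching its own file.
for n in 1 2 3; do
    echo "change $n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "BUG: fix number $n"
done
first=$(git rev-parse HEAD~1)   # "fix number 2"
last=$(git rev-parse HEAD)      # "fix number 3"

# Backport fixes 2 and 3 only; the ^ makes the range include $first.
git checkout -q -b backport-3324 maintenance/1.8.x
git cherry-pick "$first^..$last" > /dev/null
git --no-pager log --oneline
```

Without the ^, the range would start after the first commit and only "fix number 3" would be picked.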
Push the new branch to your Github repository:
git push -u origin backport-3324
Finally, make a pull request using Github. Make sure it is against the maintenance branch and not master; Github will usually suggest you make the pull request against master.
Requires commit rights to the main NumPy repo.
When you have a set of “ready” changes in a feature branch ready for NumPy’s master or maintenance branches, you can push them to upstream as follows:
First, merge or rebase on the target branch.
If there are only a few, unrelated commits, prefer rebasing:
git fetch upstream
git rebase upstream/master
See Rebasing on master.
If all of the commits are related, create a merge commit:
git fetch upstream
git merge --no-ff upstream/master
Check that what you are going to push looks sensible:
git log -p upstream/master..
git log --oneline --graph
Push to upstream:
git push upstream my-feature-branch:master
It’s usually a good idea to use the -n flag to git push to check first that you’re about to push the changes you want to the place you want.
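The effect of the -n (dry run) flag can be verified against a sandbox remote. This is a minimal sketch, assuming a local bare repository stands in for the main NumPy repo; the paths, file names and exported identity are hypothetical.

```shell
set -e
# Throwaway identity so commits work without global git configuration.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical upstream with one commit on master.
git init -q --bare "$work/upstream.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git remote add upstream "$work/upstream.git"
echo base > base.txt
git add base.txt
git commit -q -m "ENH: base"
git push -q upstream master

# One new commit on a feature branch.
git checkout -q -b my-feature-branch
echo feature > feature.txt
git add feature.txt
git commit -q -m "ENH: feature"

before=$(git --git-dir="$work/upstream.git" rev-parse master)

# Dry run: reports what would be updated, but sends nothing.
git push -n upstream my-feature-branch:master
after_dry=$(git --git-dir="$work/upstream.git" rev-parse master)

# Real push: upstream master now points at the feature commit.
git push -q upstream my-feature-branch:master
after=$(git --git-dir="$work/upstream.git" rev-parse master)
```

Comparing the three recorded values shows that the dry run leaves upstream master untouched, while the real push moves it.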