Pro Git, 2nd Edition (git-scm.com)
411 points by petercooper on Oct 25, 2014 | 79 comments


I'm planning to write all of this up soon, but there are a few really interesting things about the second edition.

For one, everything was changed from Markdown to Asciidoc and the entire book can now be generated in multiple formats in the asciidoctor toolchain. Also, we're using O'Reilly's Atlas platform to generate amazingly high-quality PDF, ePub and Mobi versions automatically with every push to master. In every language. This is a massive improvement over the previous version.

The other really interesting thing is the production process. We used GitHub and prose diffs for the entire production of the book. I think we used about 100 Pull Requests to get to where we are now. This is a massive improvement over how I collaborated with editors for the first version, the first few chapters of which were actually done by sending Word documents back and forth.


It's interesting you talked about the writing process and not the content. After reading your comment I realised that I was overlooking a huge aspect of providing this resource. As a consumer of Pro Git, all I ever see is the end result. Thank you for reminding me there are serious man hours behind this, and thank you for your time and effort and the time and effort of all the people that contributed.


Thanks for all your hard work.

But why did you choose Asciidoc and Asciidoctor? I think Markdown together with Pandoc is capable of generating all kinds of formats too.


I wrote a short book recently [1] and went through a similar process. Started writing using Markdown & Pandoc but soon realized that Markdown isn't a very good fit for longer technical documents. As an example, support for asides aka admonition blocks is nonexistent. I switched to Asciidoc + Asciidoctor and the process was much more productive.

[1] Python for the busy Java Developer. http://antrix.net/py4java



If I understand correctly, the gitbook toolchain isn't freely available. Nor does gitbook address any of the limitations of Markdown that prompted me to switch to Asciidoc.


20% per transaction is just too much.


Pandoc is a powerful tool but it has some serious limitations. Nothing more sophisticated than its own flavor of Markdown is supported. "Simple" things like internal references to anything other than headings are impossible. I had to abandon it after realizing how restricted its ReST parser is compared to the official Docutils.


Hi Scott, this is very exciting. Thank you.

I'd just like to point out that Google Play Books (play.google.com/books) gave me an error after I uploaded progit-en.31.epub:

"This file cannot be processed."

Then again, I've had lots of problems with Play Books and ePubs from APress.


This is interesting, I've never used Google Play Books. Is there an actual error message that you could put into our issues? The file is actually generated with O'Reilly's Atlas platform, not Apress, so they would be interested in why that doesn't work, I'm sure.


OK, Play appears to be pretty strict about xrefs for some reason. I'll look into it. You can track the issue here:

https://github.com/progit/progit2/issues/111


Is there an open source alternative to the Atlas Platform that has similar features? Thanks!


Take a look at Softcover (http://www.softcover.io/), which is a 100% open-source ebook production toolchain used to produce the Ruby on Rails Tutorial book (among others).


There is a simple build process using asciidoctor-pdf and asciidoctor-epub that can also create all 3 formats locally. The Atlas ones are just a little more professional looking, but the asciidoctor chain is still really great.

https://github.com/progit/progit2#how-to-generate-the-book


See also this previous HN thread about Pollen (written on top of Racket) by Matthew Butterick of Butterick's Practical Typography fame.

https://news.ycombinator.com/item?id=7822057


Hi, not 100% sure but I think there might be a typo on the front page "Everything you need to know about GITT". I'm new so might be missing something.


Yup, that's the cover that Apress made and we (lol) screenshot from Amazon. I was thinking of Photoshopping it out, but decided to just wait until they fixed it and do another screenshot.


This book is super helpful and I referred to it in the past while I was learning Git. I want to note though that Git is pretty complex and trying to learn it all at once is going to be extremely frustrating. Instead, just start by learning the basics (push, pull, commit, clone, add, and maybe a few others). Then, when you have a specific problem that you don't know how to solve (How do I combine all these tiny commits into one substantial one?... or, I screwed everything up, how do I revert to a previously good commit?), look it up or Google for it. I still learn new things about Git this way and I think I know my way around it pretty well. It's amazing that pretty much anything I can think of Git already has implemented.


> Instead, just start by learning the basics (push, pull, commit, clone, add, and maybe a few others).

That's how I learned it but quickly I felt like I was missing important things and didn't really understand the tool. Then I took the time to read pro git and realized it is actually pretty simple.

Git has become so ubiquitous that I think it's worth spending some time to understand it thoroughly.


"understand it thoroughly" is key. I deal with a lot of programmers who are new to git, and the first thing I always do is to take them through the material in sections 10.1-10.3 of this book. Once a person has those fundamentals (which are not complicated at all), they're most of the way there.

A great deal of the confusion around more advanced tasks (rebasing, etc) is removed when working from those first principles of the system.


I just taught git to my boss and a coworker. They both use Windows exclusively and don't touch a command line. I was able to teach them the basics (push, pull, clone, commit, status) and they learned very quickly. I told them to ignore the other commands for now, but look up how to use them when they need them.

Great advice!


What's the best approach to using git on Windows? I am comfortable with git basics on the Linux command line, and I want to teach it to students who are using Windows.

Is there a windows command line version, or is there a GUI version I should start them on?


If you run the default Git windows installer, it installs something called "Git bash" which is basically msys (part of mingw) with git included, which gives you command line access to git very easily. If you want to teach them the CLI way from the start, that is the way to go.


I believe that "Git for Windows" has been EOL'd in favor of https://github.com/git-for-windows/sdk (which hasn't seen a first official release yet).


>This is an Inno Setup based wrapper around MinGW's mingw-get which installs a development environment for building Git for Windows using GCC.

This installs the dependencies and environment needed for building Git from source. I think that's overkill for most users.


You can teach them how to use the git command-line interface without forcing them to use Bash, though. Nothing wrong with simply putting `git` in your path.


As someone who used Windows for development for a long time, depending on what languages they're learning, using Bash is a much better idea than `cmd.exe` or PowerShell. The latter is quite nice, but when you're using *nix based software, the differences between the two are great enough to cause (at least for me personally) a bit of confusion.


I'm coming at it from the other direction: I'm a long time Unix guy who has crash landed on Windows. I find any Unix-like environment on Windows to just be an atrocity. Cygwin and mingw are both embarrassing compromises that are so middling as to be nearly pointless. And I like the Bourne shell syntax.

It seems cruel to make somebody relearn their command-line interface only to give them the neutered, confused interface that Git Bash provides.


Sourcetree is nice and free.


Personally I install git from their website and tick the middle radio option to use git from the normal command line. I live in the powershell.exe console all the time.

You might want to look at the cmder project; it offers a build that includes everything ready to go. http://bliker.github.io/cmder/


FWIW, GitHub has a fantastic Metro-esque app for Windows - https://windows.github.com/


I prefer SourceTree from Atlassian, but GitHub for Windows may be easier for beginners.


There's also Git Extensions: https://code.google.com/p/gitextensions/

I was discouraged by early releases of SourceTree for Windows because it was buggy. I believe it has improved by now, but I got used to Git Extensions already. It's open source and written in C#, by the way - I keep on planning to fork it and make it more to my liking, just can't find time for that :)

If I recall well, the GitHub app doesn't provide any branch visualization, correct me if I'm wrong.

Git Extensions could be a good choice for teaching, because you get a preview of what commands are being executed by the client. It's not a blackbox that hides what Git does underneath.


GitHub for Windows is definitely too simplistic, though attractive. I use the TK-based `git gui` which comes with git regardless of OS.


I know that real programmers only use command line (and chew bees), but GUI client is really helpful for grokking a complex history with lots of branches, merges etc.


I hate to make a "what he said" comment, but this is really excellent advice for beginners with git.

Just learning the very basic actions will get you up and running to gain experience. I have found that many nuances of git are difficult to learn by reading theoretical examples, but fairly easy to understand once you encounter and research a similar problem in your real workflow.


One big thing that helped me get started was following this workflow: http://nvie.com/posts/a-successful-git-branching-model/

If you have co-workers whose workflow you can copy then I don't think it is needed, but if you want to start a project from scratch with Git then having a plan like this can be really helpful.


That is dangerous - you could very well be missing out on very important features all along the way without ever knowing it, wasting tons of time and productivity in process. This happens also when you directly start using an appliance without reading its manual first.


This is the book that I recommend to colleagues who are learning Git. I still refer back to it on occasion when I get into tricky situations with Git. Looking forward to reading through this new edition.


I don't think you can go wrong with anything Git related by Scott Chacon.

There used to be screencasts by him that were really good, quick tutorials of running through common scenarios (branching, merging, rebasing, managing your stash). I haven't been able to find them in a while. Those were really great when I was learning how to do an interactive rebase.


A search for GitCasts gave some results, such as:

https://www.youtube.com/playlist?list=PLttwD7NyH3omQLyVtan0C...


lol, gitcasts, nice! I totally forgot about these. Man, I should probably redo them, I liked that series.


I also recommend people to use the included "gitk" tool for visualized repository histories. It's absolutely criminal how most git tutorials never mention this tool.


I really liked the 1st edition, good and full featured book on Git. Does anyone know what the diff is with the 1st edition?


A lot. (Coauthor here.)

Not every sentence was rewritten, but 4 years is a long time. There was a lot of content that was either inaccurate or out of date. We added content about two-way bridges and migration to other VCSes, graphical clients, shell integration, and lots more. There are also new chapters on GitHub and embedded Git (Libgit2 and JGit).


Chacon sketches the differences in the Preface. As I understand the Preface (I haven't read the new book nor compared the respective TOCs), the second edition aims to give deeper treatments of these three topics:

1. HTTP as the simplest protocol for Git network transactions

2. GitHub and the overall Git community

3. Git on MS Windows


It's interesting how github even got its own section in this book. Poor bitbucket never gets any love. :(


Honestly, the reason is size. If you're a newcomer to OSS, it's very likely that you'll end up with a GitHub user account, and much less so with BitBucket.

We do mention BitBucket in the [forking workflows section](http://git-scm.com/book/en/v2/Distributed-Git-Contributing-t...), and many of the lessons from the GitHub chapter will carry over; the two sites have a lot in common.


GitHub is indeed the 800 lb. Git gorilla in the OSS world, so I don't doubt that your main motivation is to serve the interests of your readers, but it's probably a good idea to disclose the potential conflict of interest (namely, that Pro Git coauthor Scott Chacon was GitHub's first employee and still works there).


I do actually cover this in the preface: https://github.com/progit/progit2/blob/master/book/preface.a...

The conflict of interest does bother me, so I did try to be as clear as I could about it and to point out at the beginning of the chapter that you could easily skip it if you don't like or don't want to use GitHub. I also stayed away from everything I could that required payment.

The truth is that I've been approached to write books specifically about GitHub for several years so I thought that the demand for that sort of information was high enough to warrant moving it to its own chapter. Had I been approached about writing about other resources I would probably have also included them.

The more difficult decision was whether to include information on Gerrit. I actually started writing a section on it, then considered a chapter or appendix. The main deciding factor was how many people I thought might benefit from it.


Whoa. The Amazon link on that page links to the 1st edition. I almost ordered it. Whew ^_^


Ugh, yeah, this is complex. The second edition isn't going to be properly out for probably months. While it's online now and still undergoing some copy and technical editing, I'm not sure that sending you to the Amazon page for a pre-order months down the road is the best. It's hard to say though. :(


Huh, we'll have to fix that. You can find the pre-order for the 2nd edition if you search for it:

http://www.amazon.com/gp/product/1484200772/ref=as_li_tl?ie=...


Noticed that the price is double the price of the 1st edition. Is that due to the number of authors doubling? ;)


Amazon pretty much sets the price themselves here, we have nothing to do with it (nor do we get paid differently if you buy it for $50 or $20). I believe generally Amazon (or any reseller) pays Apress about $15 per book, since that's what our royalties are based off of.


This book just achieved the impossible and helped me understand the rebase command properly. I'm ordering the printed version as soon as it's available.


I'm liking asciidoc but wondering if anyone has married it to high quality website "skins". I googled and found some websites done in asciidoc but they all look pretty basic. Has anyone done anything that looks more professional, like a marketing company did it? If it costs money that is fine.


I still wonder what are the advantages of git over mercurial.


Perhaps the main advantage is that Git has won the mindshare war vs. Mercurial, and there are benefits to using what everyone else is using. Indeed, über-hacker Eric Raymond mentioned as recently as yesterday that he supports git even though he wishes hg had won instead. [1]

[1]: http://esr.ibiblio.org/?p=6476


It might be better to wonder what are the advantages of mercurial over git.

Better Windows support ? I think a huge proportion of OSS devs these days use linux / osx.

It might be useful for mercurial to focus on making itself the premier OSS DVCS for "enterprise"; that would create space between it and git / the git userbase, and would allow for it to specialize in a meaningful way that is perhaps less-easily forklifted into git.


I'm a git user, but the one killer feature that Mercurial has that I'd love to see appear in Git some day is revsets [1]. It's a DSL for selecting commits from the repo's history that match a predicate, and you can easily create complex predicates with it that would be much more difficult or impossible to specify with git's brittle command line options.

[1] http://www.selenic.com/hg/help/revsets
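
To make the "predicate DSL" idea concrete, here is a minimal Python sketch of selecting commits with composable predicates. It is only loosely modelled on the idea of revsets and is not Mercurial's implementation or API: the `Commit` record, the helper names (`author`, `keyword`, `both`, `negate`, `select`) and the sample history are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Commit:
    """Hypothetical, simplified commit record (not Mercurial's data model)."""
    rev: int
    author: str
    branch: str
    message: str


Predicate = Callable[[Commit], bool]


def author(name: str) -> Predicate:
    """Match commits written by exactly this author."""
    return lambda c: c.author == name


def branch(name: str) -> Predicate:
    """Match commits made on this branch."""
    return lambda c: c.branch == name


def keyword(text: str) -> Predicate:
    """Match commits whose message contains the given text."""
    return lambda c: text in c.message


def both(a: Predicate, b: Predicate) -> Predicate:
    """Logical AND of two predicates."""
    return lambda c: a(c) and b(c)


def negate(a: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda c: not a(c)


def select(commits: Iterable[Commit], pred: Predicate) -> List[int]:
    """Return the revision numbers of all commits matching the predicate."""
    return [c.rev for c in commits if pred(c)]


# Invented sample history, purely for illustration.
history = [
    Commit(0, "alice", "default", "initial import"),
    Commit(1, "bob", "default", "fix typo in README"),
    Commit(2, "alice", "stable", "fix typo in docs"),
    Commit(3, "alice", "default", "add parser"),
]

# "Alice's commits on default that are not typo fixes"
query = both(both(author("alice"), branch("default")), negate(keyword("typo")))
matching = select(history, query)
```

The point of the design is that small predicates compose into arbitrarily complex queries, which is hard to express with a fixed set of command-line flags.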


Those look very, very useful. Thanks for the pointer.


While I'm generally uninterested in "git vs hg" these days, I am interested in this Windows question. Though I don't use Windows, I'm of the impression that Git is as solid on Windows today as Hg (or any other command line based tool) is. If you install GitHub for Windows, we will also install the command line version with good defaults automatically. It's fast and easy and works well. Is there any way you're aware of that Git for Windows is not as solid as Git on Linux or Mac?


> I think a huge proportion of OSS devs these days use linux / osx.

For web development, maybe, but a huge amount of C++ development (games in particular) is done almost exclusively on Windows.


Out of the box support for rebase and interactive rebase. Interactive commands to stage/unstage/commit.

Better underlying data format: Checking out any version in history is a cheap operation.

And generally better acceptance of the tool/wider familiarity.


> Out of the box support for rebase and interactive rebase. Interactive commands to stage/unstage/commit.

Mercurial has those features, too, if you desire them (whether they're a good thing to have is a different argument). These features are not enabled by default because they can either lose history (rebase) or are complicated for new users (or are of dubious benefit).

You're probably confused by the fact that they're called extensions. I've always argued that the term "extensions" is misleading in this case. In actual fact, the bundled extensions are modules of the core system, which are maintained and supported at the same level as "core" Mercurial. See, e.g., this discussion: http://www.selenic.com/pipermail/mercurial/2013-November/046...

> Better underlying data format: Checking out any version in history is a cheap operation.

Git actually has some serious scalability problems in places and no good way to overcome them, since the storage format is hardcoded. For example: https://lists.gnu.org/archive/html/emacs-devel/2014-01/msg02...

I can't think of any version control system where checking out any version in history isn't fundamentally a cheap operation, either. This is not a Git-specific thing.


> are not enabled by default because they can either lose history (rebase) or are complicated for new users (or are of dubious benefit).

Yet for me and many users, this means we cannot go to any coworker's terminal and use those features, which many of us can no longer live without.

> Git actually has some serious scalability problems in places and no good way to overcome them

How so? You can upgrade the format in new versions (perhaps some abstractions to make it easy are missing, but there's enough development effort behind git that it doesn't have to be easy).

I don't think hg is more scalable than git.

> I can't think of any version control system where checking out any version in history isn't fundamentally a cheap operation, either. This is not a Git-specific thing.

I read that hg uses some sort of a linear-append history format, meaning that going back to past versions is O(distance) in time. Is that false?


> I read that hg uses some sort of a linear-append history format, meaning that going back to past versions is O(distance) in time. Is that false?

This is pretty much completely false. Mercurial stores repository data (except for the parts that relate to the current working tree) in revlog format [1], which is specifically designed to allow retrieval of arbitrary revisions in time proportional to the file size (modulo what seek times spinning platter disks may impose for uncached sectors). The revlog format is designed to be append-only, but that actually is done to keep things fast by enabling random access based on an index; it does not lead to O(history size) operations.
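
As a rough illustration of why "append-only" does not imply "linear scan", here is a toy Python sketch of an append-only data store with a side index. It is not the actual revlog format: it stores each revision as full text rather than anything more compact, keeps everything in memory, and the class and method names are made up for this example.

```python
import io


class AppendOnlyLog:
    """Toy append-only store: data is only ever appended, and an index
    records where each revision lives so it can be read with one seek."""

    def __init__(self):
        self._data = io.BytesIO()  # stands in for the on-disk data file
        self._index = []           # one (offset, length) entry per revision

    def append(self, text: bytes) -> int:
        """Append a new revision and return its revision number."""
        offset = self._data.seek(0, io.SEEK_END)
        self._data.write(text)
        self._index.append((offset, len(text)))
        return len(self._index) - 1

    def read(self, rev: int) -> bytes:
        """Fetch any revision directly via the index, without scanning
        the revisions stored before it."""
        offset, length = self._index[rev]
        self._data.seek(offset)
        return self._data.read(length)


log = AppendOnlyLog()
first = log.append(b"v1")
second = log.append(b"version two")
third = log.append(b"v3")
```

Reading `second` costs one index lookup plus one read of that revision's bytes, regardless of how many revisions were appended before or after it.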

> I don't think hg is more scalable than git.

Git actually has made a number of conscious design decisions that limit its scalability that other version control systems haven't made. To understand that this is largely intentional and not likely to go away in any meaningful way anytime soon, consider this post [2] by Linus Torvalds on the Git mailing list in 2007 regarding the "git blame" problem. There is a surprising willingness there to sacrifice usability and scalability on the altar of Git's super-simple repository format.

Git's storage format is essentially a very simple transactional key-value store. While that has the benefit of allowing for a simple and robust implementation, it also makes certain operations more expensive than they need be (such as "git blame"). Some things Git can't even do, because it lacks the necessary metadata (such as avoiding spurious conflicts from first cherry-picking from a branch and later merging with the same branch). Others inherently require O(history length) or O(repository size) operations.

Git was designed by people who considered a repository with a couple of hundred MB a big project; Git really starts breaking down once you hit the 1GB barrier. Check out the gcc repository (about 1.4GB), for example, then do "git gc", and take a coffee break, because it'll take a couple of minutes.

Somewhat famously, Facebook documented [3] some of their troubles with making Git scale.

Git, unlike Bzr or Mercurial, does not have a pluggable storage layer that can be used to easily ameliorate these shortcomings. Bzr [4], as an extreme, can cheerfully operate on a remote repository in the exact same way as a local one, because it treats local file storage vs. ssh vs. http(s) vs. sftp etc. just as different implementations of the storage layer. Facebook extended Mercurial [5] to accomplish largely the same goal.

Note that the above does not mean "Git sucks". For the most part, Git is a fine version control system, but where scalability is concerned, I'd pick a number of other VCSs first (which is, frankly, why a lot of shops still use SVN over either Git or Mercurial; SVN may be more cumbersome to use, but it is a known quantity when it comes to handling large repositories). Git in particular is too much written like the typical C program, with a lot of concern for keeping constants low, but often with little concern for asymptotic worst-case complexity or extensibility.

> Yet for me and many users, this means we cannot go to any coworker's terminal and use those features, many of us can no longer live without.

This seems to be a workplace policy issue, not a technical concern. You can have these extensions preconfigured in a central place as part of your Mercurial installation if that's desirable.

[1] http://mercurial.selenic.com/wiki/Revlog (note that the specific format did change in v0.9, but the principles remain the same).

[2] http://marc.info/?l=git&m=116991865311836

[3] https://news.ycombinator.com/item?id=3548824

[4] Yes, I know that Bzr is not really being maintained actively by Canonical anymore, which is a shame, because in many respects, its design is far superior to either Git or Mercurial. Still, it's one of the reasons why quite a few people stick to it despite the lack of maintenance, because it bridges the gap between a fully distributed and a fully centralized VCS.

[5] https://code.facebook.com/posts/218678814984400/scaling-merc...


Git is simpler. If you understand the (very trivial) data model, then you understand what actions are possible, which actions are simple/complex or heavy/cheap, how to compose solutions for more complicated usecases, and how to reason yourself out of complex situations.
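
For anyone curious what the core of that data model looks like, here is a minimal Python sketch of the content-addressed idea, limited to blobs. The ID computation (SHA-1 over a `blob <size>\0` header plus the content) matches how Git names blob objects; the `ObjectStore` class is a hypothetical in-memory stand-in, not Git's real on-disk layout, and it ignores trees, commits and packfiles.

```python
import hashlib
import zlib


def blob_bytes(content: bytes) -> bytes:
    """Prefix content with the header Git uses for blob objects."""
    return b"blob " + str(len(content)).encode("ascii") + b"\0" + content


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    return hashlib.sha1(blob_bytes(content)).hexdigest()


class ObjectStore:
    """Toy in-memory key-value store keyed by object ID (hypothetical,
    for illustration only)."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        """Store a blob and return its ID; identical content always
        yields the same key, so duplicates are stored once."""
        oid = git_blob_id(content)
        self._objects[oid] = zlib.compress(blob_bytes(content))
        return oid

    def get(self, oid: str) -> bytes:
        """Look up a blob by ID and strip the header again."""
        raw = zlib.decompress(self._objects[oid])
        _header, _sep, body = raw.partition(b"\0")
        return body


store = ObjectStore()
oid = store.put(b"hello from a toy object store\n")
same_oid = store.put(b"hello from a toy object store\n")
```

Because the key is derived from the content, storing the same bytes twice gives back the same ID, and an ID can only ever refer to one piece of content.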


I used mercurial for a couple of years before switching to git precisely because mercurial was touted as the simpler choice. I believe I 'outgrew' mercurial, but I probably didn't invest time into discovering the various extensions and more complex components.

That said, in no way is git simpler if you view "simple" as the time between knowing nothing and being able to push your commits.


User friendliness and simplicity are completely orthogonal. For instance, Microsoft Word is very user friendly, yet it is an immensely complex piece of software.

I value simplicity over user friendliness.

Despite git having several usability blemishes, it is insanely simple. The only simpler system that I have ever used is quilt (used to manage patch files). When mercurial advocates talk about simplicity, what they are actually talking about is superficial user friendliness.


The mobi file is 105MB, which is over all of my email attachment limits.

Any ideas on how to send to my Kindles without uploading for each device?


So I just crushed the pngs which brings this down to 80M, but the problem is the mobi format that is produced by kindlegen. I cover the issue here: https://github.com/progit/progit2/issues/116

But basically Amazon produces a file with three files in it - the ePub source, the older mobi7 format and the newer k8 format. This file is meant to be uploaded to Amazon so you can download it from them. Amazon figures out what Kindle you have and sends you the correct version from this file instead of the whole 80M. It's unfortunate, but I'm not sure if I even _can_ take a single format out of the bundle.

If you have a newer Kindle, you should just be able to send the ePub or PDF to it. I'm not sure if it would be a lot worse or not though, I'll try it out.


The images are pretty big, so they'll look good in print and on retina screens. If you run the build locally, you could include a step that ImageMagicks them down to a more reasonable size for a Kindle.
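
A local downscaling step could look roughly like the Python sketch below. It is an assumption-laden illustration, not part of the book's build: it assumes an ImageMagick install that provides a `magick` binary (older installs ship `convert` instead), that the images are PNGs, and the directory arguments and the 1024-pixel limit are placeholders. The `WxH>` geometry suffix asks ImageMagick to shrink only images larger than the given box.

```python
import subprocess
from pathlib import Path


def resize_command(src, dst, max_px=1024, binary="magick"):
    """Build an ImageMagick command that shrinks src to fit within
    max_px x max_px and writes the result to dst. The trailing '>' in
    the geometry means images already smaller are left at their size."""
    return [binary, str(src), "-resize", f"{max_px}x{max_px}>", str(dst)]


def shrink_images(src_dir: Path, dst_dir: Path, max_px: int = 1024) -> None:
    """Write a downscaled copy of every PNG in src_dir into dst_dir.
    Requires ImageMagick to be installed; raises if a command fails."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.png")):
        subprocess.run(resize_command(src, dst_dir / src.name, max_px), check=True)


# Building the command does not run anything, so it is safe to inspect.
example = resize_command("images/figure.png", "small/figure.png", max_px=800)
```

Writing to a separate output directory keeps the full-resolution originals intact for the print and PDF builds.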


Use the Kindle's "experimental browser" to open the .mobi file from the web somewhere.


Somewhat offtopic, but I wonder why git-scm.com has still not been transferred to the Software Freedom Conservancy http://whois.domaintools.com/git-scm.com


I believe I've actually offered to do this, but they haven't taken me up on it. I'm almost positive that I actually got a domain transfer code for it once but it was never transferred (I could be misremembering that though).

I think this just recently came up again when there was some issue with Heroku legacy routing stuff and people seemed to think that it was better that I continue to deal with it.


Thank you!


Thank you!



