How I stopped missing Darcs and started loving Git

Posted by Tom Moertel Mon, 10 Dec 2007 21:52:00 GMT

About three years ago, I switched to Darcs as my primary source-code management system. It was simple, intuitive, and powerful, and it made managing my projects more fun and less frustrating than any centralized VCS ever had. That it was written in Haskell, one of my favorite programming languages, made it even better. I was hooked.

Since then, the distributed SCM landscape has changed. Darcs hasn’t improved much, but its competitors have made long strides, especially Git and Mercurial. Both are crazy fast, vigorously developed, and widely used on large, highly active real-world projects, such as the Linux kernel and Mozilla 2. In comparison, Darcs has stagnated.

When I started working for a new company recently, I had to consider whether to advocate Darcs or something else. In the end, I decided that Darcs would be a hard sell. Nobody else at the company uses Haskell, and having to explain how to avoid the occasional corner case seemed like a losing proposition.

After researching and playing around with Git and Mercurial, I settled on Git. I like Git’s underlying hashed-blobs model better than Mercurial’s revlogs, and Git seems to have slightly more development momentum. Still, it was a close call. Either choice would have been completely reasonable.

Missing Darcs

When I started using Git on real projects, the one thing I really missed was the ability to easily amend earlier patches, something Darcs made trivial. Let me explain. The typical development workflow goes something like this:

  1. Check out a copy of the upstream code base.
  2. Implement feature X.
  3. Commit.
  4. Implement independent feature Y.
  5. Commit.
  6. Implement independent feature Z.
  7. Commit.
  8. Push new features back upstream.

Now, what really happens is that when I’m implementing Y or Z, I’ll realize that I made a mistake in X. The trick is then fixing X so that my fix is part of the changeset/patch for X that ultimately gets pushed upstream in the last step. That way, the upstream folks will see only a single, clean patch for feature X – not a mishmash of patches that together represent X.

In Darcs, amending the original patch is easy because its patch theory lets me tweak the patch for X independently of the other patches. Darcs will simply ask me which patch I want to amend, and I’ll select the original patch for X:

$ emacs               # fix X
$ darcs amend-record  # amend original patch for X

Mon Dec 10 14:43:13 EST 2007  Tom Moertel <tom@moertel.com>
  * Implemented Z
Shall I amend this patch? [yNvpq], or ? for help: n

Mon Dec 10 14:42:12 EST 2007  Tom Moertel <tom@moertel.com>
  * Implemented Y
Shall I amend this patch? [yNvpq], or ? for help: n

Mon Dec 10 14:41:46 EST 2007  Tom Moertel <tom@moertel.com>
  * Implemented X
Shall I amend this patch? [yNvpq], or ? for help: y
hunk ./x 1
-X1
+X2
Shall I add this change? (1/?)  [ynWsfqadjkc], or ? for help: y
Finished amending patch:
Mon Dec 10 14:43:25 EST 2007  Tom Moertel <tom@moertel.com>
  * Implemented X

That’s it. The exact same process will work regardless of when I realize I need to fix X: before I start Y, while I’m implementing Y, after I’ve committed Y, while I’m working on Z, or after I’ve committed Z.

Learning to love Git

With Git, however, I can amend a commit only if I haven’t committed anything else before making my fix. In Git’s mind, Y depends on X, and Z depends on Y, even if they really are independent of one another.

So if I commit the original patch for X and then immediately realize I need to make a fix, before I start working on Y or Z, it’s easy:

$ emacs               # implement X
$ git commit -a -m 'Implemented X'

# discover problem in X

$ emacs               # fix X
$ git commit -a --amend  # amend original patch (-a stages edits to tracked files)

More typically, it’s only while I’m working on Y that I’ll realize I need to fix X. Then it’s more complicated to amend the original commit:

$ emacs               # implement X
$ git commit -a -m 'Implemented X'
$ emacs               # start working on Y

# discover problem in X

$ git stash           # stash away half-completed work on Y
$ emacs               # fix X
$ git commit -a --amend  # amend original patch for X
$ git stash apply     # restore work on Y
$ emacs               # continue working on Y

While not as convenient as Darcs’s workflow, it’s perfectly workable.
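
The stash-and-amend sequence above can be replayed end to end in a throwaway repository. This is only a sketch: the file names, contents, and demo identity are made up, and the half-finished work on Y is modeled as an edit to an already tracked file, because a plain git stash leaves untracked files alone.

```shell
set -e
# Throwaway repository; all file names and contents are made up for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Initial checkin with placeholder files so that later edits touch tracked files.
echo X0 > x
echo Y0 > y
git add x y
git commit -q -m 'Initial checkin'

echo X1 > x                           # implement X
git commit -q -a -m 'Implemented X'

echo Y-wip > y                        # start working on Y (left uncommitted)

git stash -q                          # stash away half-completed work on Y
echo X2 > x                           # fix X
git commit -q -a --amend --no-edit    # fold the fix into the commit for X
git stash apply -q                    # restore work on Y

git log --format=%s                   # still a single commit for X
git status --short                    # y is reported as modified again
```

The --no-edit flag keeps the original commit message, which stands in for saving the message unchanged in the editor.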

Now let’s consider another fairly typical case: I commit X and Y and then start working on Z before I notice the problem in X. I used to think that Git couldn’t handle this case, but it can, thanks to git rebase --interactive:

$ emacs               # implement X
$ git commit -a -m 'Implemented X'
$ emacs               # implement Y
$ git commit -a -m 'Implemented Y'
$ emacs               # start working on Z

# discover problem in X

$ git stash           # stash away half-completed work on Z
$ emacs               # fix X
$ git commit -a -m 'Fixed X'
$ git rebase --interactive HEAD~3  # see comments below
$ git stash apply     # restore work on Z
$ emacs               # continue working on Z

The git rebase --interactive command is powerful. What the command does, as called in the snippet above, is invoke my editor of choice on a text file describing the last 3 commits (that’s the HEAD~3 part):

# Rebasing 3ad99a7..b9a8405 onto 3ad99a7
#
# Commands:
#  pick = use commit
#  edit = use commit, but stop for amending
#  squash = use commit, but meld into previous commit
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
pick 0885540 Implemented X
pick 320b115 Implemented Y
pick b9a8405 Fixed X

I can then edit the file to reorder, merge (squash), and/or remove the commits. In this example, I want to merge the fix for X into the original commit that implemented X. So I edit the file like so:

pick 0885540 Implemented X
squash b9a8405 Fixed X
pick 320b115 Implemented Y

Then I save the file, at which point Git takes over and makes the requested changes, merging the fix for X into the original commit for X. Now the log shows the original implementation and fix as one commit:

$ git log
commit f387d650976246c0854d028b040cca40e542be56
Author: Tom Moertel <tom@moertel.com>
Date:   Mon Dec 10 15:11:26 2007 -0500

    Implemented Y

commit 82a1c849ffd1bd688d5bc9d99be0e63548a89c4c
Author: Tom Moertel <tom@moertel.com>
Date:   Mon Dec 10 15:13:03 2007 -0500

    Implemented X

    Fixed X

commit 3ad99a7ef537b7ae99e435e0d2b4b0d03de92c65
Author: Tom Moertel <tom@moertel.com>
Date:   Mon Dec 10 15:11:14 2007 -0500

    Initial checkin
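
For anyone who wants to replay the squash without an interactive editor, here is a rough sketch in a throwaway repository. It assumes a Git recent enough to honor the GIT_SEQUENCE_EDITOR environment variable; the commit_file helper, the file names, and the demo identity are all made up for illustration. Instead of editing the todo file by hand, the script prepares the reordered list in advance and copies it over the one Git generates.

```shell
set -e
# Throwaway repository; names and contents below are made up for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# commit_file FILE CONTENT MESSAGE: write CONTENT to FILE and commit it.
commit_file() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit_file base base 'Initial checkin'
commit_file x X1 'Implemented X'
commit_file y Y1 'Implemented Y'
commit_file x X2 'Fixed X'

# The edited todo list, written up front: X, then the fix squashed into it, then Y.
todo=$(mktemp)
printf 'pick %s\nsquash %s\npick %s\n' \
    "$(git rev-parse HEAD~2)" "$(git rev-parse HEAD)" "$(git rev-parse HEAD~1)" > "$todo"

# GIT_SEQUENCE_EDITOR replaces the interactive editing session by copying the
# prepared list over Git's todo file; GIT_EDITOR=true accepts the combined
# commit message that squash proposes.
GIT_SEQUENCE_EDITOR="cp '$todo'" GIT_EDITOR=true git rebase -q --interactive HEAD~3

git log --format=%s    # one subject line per remaining commit
```

After the rebase, the history holds three commits, and the one for X already contains the fix.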

Once I figured out how to use git rebase --interactive, I stopped missing Darcs and started loving Git.

Comments

  1. Håkon said about 1 hour later:

    Hilarious timing, as Darcs 2-pre1 was announced/released a few hours ago, fixing the dreaded conflict bug, etc.

  2. Eric said about 2 hours later:

    Håkon,

    Unfortunately simply fixing the conflict bug does not address the remaining usability concerns nor does it make darcs less likely to trash your data either intentionally or through one of its many bugs.

  3. Håkon said about 11 hours later:

    Eric,

    This is true. However, the prerelease shows that darcs development is carrying on, and I don’t think they’ve only fixed that one bug. To me, all the other distributed SCMs are still playing catch-up. And Git scares me (well, it did last time I read about it. Maybe I should give it another chance).

  4. Kurt said about 15 hours later:

    Okay, so Git includes an oddly named command that lets you edit a kind of “script” of the source control hierarchy, with its own oddly named commands, to overcome a deficiency in the software regarding changeset dependencies. Thanks anyway.

    In addition to the changeset smarts, the major feature Darcs has going for it is the focus on usability. Git is starting to sound as bad as arch.

  5. she said about 15 hours later:

    What else do you want to see Kurt? The kernel folks use git, not darcs … it makes no sense to assume that git is any worse than darcs or that the kernel folks are idiots on what they do (but maybe some of them actually are idiots… sometimes flamewars make everyone involved look like an idiot there).

  6. Jonas said about 17 hours later:

    I have a very superficial understanding of git, having never used it in practice, but aren’t you supposed to branch the code between different independent functions? And then submit each feature as an independent patch. I had the impression that git made branches as cheap as commits and that’s what made it great.

  7. Kurt said about 19 hours later:

    I thought my point was pretty clear. I want to see more usable software. Unfortunately, designing for usability is hard, which means it often doesn’t get done. I have no doubt that the Git designers could come up with a better way to perform the task that Tom describes in the article. Will they? Probably not, because what they have “works”, and they don’t care about putting people off.

    Why does it matter? Since they’re the kernel developers, lots of people will use what they use by example. Lots of developers will be stuck with cumbersome software, and moreover, they’ll think it’s okay. That means they won’t put more thought into their own software, and they’ll end up writing cumbersome software. It’s a self-perpetuating cycle. “Well, I can deal with complexity, so my users can deal with it, too.”

  8. Kurt said about 19 hours later:

    From another point of view, every user who thinks that some convoluted software process is okay is a point against usability. “Well the users don’t complain or even say they like it this way, so no point in making it better.” That mindset is contagious.

  9. CV said about 19 hours later:

    Kurt, I think you should learn a bit more about how git works these days. Its usability has improved by leaps and bounds in the last few months. The git-rebase command, which you criticized for its strange name, is spectacular (quite aside from its --interactive flag), and is one of the few things which helps me stay sane while I deal with Subversion repositories every day.

    I used darcs for a project about six months ago, and liked it quite a bit; I have tremendous respect for the darcs team, and I’m looking forward to the improvements in version 2. That said, since git 1.5 came out, I haven’t missed darcs at all, either from a usability or reliability point of view.

  10. Håkon said about 20 hours later:

    @she: Git isn’t necessarily better than darcs, just because the kernel guys use it. Linus is certainly no god, or an authority on UI design (remember his misguided flaming over UI issues in the past). This, however, doesn’t mean that the Git interface can’t improve.

    As a mathematician, darcs appeals to me in a way Git never will. I also think darcs will become the better SCM when t -> inf.

  11. Johan said 1 day later:

    I agree with Jonas. Branch more!

  12. Tom Moertel said 1 day later:

    Kurt, I find git rebase --interactive to be very usable, and I say that as a 3-year Darcs user who still loves Darcs. Git’s approach is basically equivalent to darcs amend-record: where Darcs asks you to select the patch to amend and the hunks to merge via a series of interactive prompts, Git asks you the same questions via a single interactive editing session. Both approaches are easy when the changes you want to make are small, such as in my example in the article above, but the Git approach is considerably more manageable when the changes are larger and more complicated.

    So far, in my short time using Git, I have found that its UI is highly optimized for non-trivial, real-world coding situations. It makes the easy things easy, and the hard things surprisingly manageable.

    Cheers,
    Tom

  13. Tom Moertel said 1 day later:

    Johan and Jonas: In Git, each repo represents an independent line of development, in effect a separate branch. This is the way most distributed SCMs work.

    In Git, however, you can also maintain multiple additional branches in your local repo. So, when working on feature X, I could create a new “topic branch” for X in my local repo. Then, after I was done working on X, I could merge the topic branch back into the master branch. Then I would probably delete the topic branch since it’s no longer needed. The end result is that X would get added into the master branch.
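
    A minimal sketch of that topic-branch flow, assuming a throwaway repository and a hypothetical branch name, feature-x (neither comes from the comment itself):

```shell
set -e
# Throwaway repository; the branch and file names are made up for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base
git add base
git commit -q -m 'Initial checkin'

# Remember the default branch name ("master" or "main", depending on configuration).
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b feature-x       # create the topic branch for X and switch to it
echo X1 > x
git add x
git commit -q -m 'Implemented X'

git checkout -q "$main_branch"     # back to the main line of development
git merge -q feature-x             # bring X in (a fast-forward here)
git branch -d feature-x            # the topic branch is no longer needed

git log --format=%s
```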

    If X were a big feature, I might develop it that way, in its own topic branch. But, in most cases, when X is small, I just do the work in the master branch of my local repo, committing it to that branch when I think it’s done. I won’t push it upstream until later, though. I will let the commit “rest” a while (generally while working on Y or Z) to make sure it’s really done.

    That’s how I got the X, Y, Z scenario I wrote about in the main article. It’s fairly common for me.

    Cheers,
    Tom

  14. Phil Toland said 8 days later:

    I have used both Git and Darcs extensively and I think both have their pluses and minuses. Git’s branching is handy as is git stash. On the other hand, Darcs’ interactive workflow just spanks Git in terms of ease of use. Git expects you to remember arcane sequences of commands in order to accomplish something whereas Darcs is focused on making common workflow tasks easier and more intuitive. Just because Git 1.5 is a vast improvement on previous versions does not make it “great”. I believe that Kurt’s analysis is correct and that the team behind Git puts a low priority on usability.

    All that having been said, I am in the same position as Tom and need to decide on a DVCS to recommend for work. In spite of its interface flaws, Git still seems to be a better choice than Darcs. Hopefully Darcs 2 will rectify that.

  15. kevin said 22 days later:

    I love love love darcs, and refuse to learn the myriad switches and complexity of git (I’ve done it at least twice and forgotten them already!). But, I’m faced with a dev team with a bunch of windows developers. so I’m pushing for mercurial. easy to learn, and windows friendly too.

  16. kevin said 22 days later:

    oh, and mercurial’s revlogs take up less disk space and are often faster than git’s blobs! :)

  17. Mark Stosberg said 33 days later:

    Eric,

    I’m curious about one of the examples where darcs 2 will “trash your data”. I’ve been looking through the bug tracker over the past few days and writing test cases for various bugs for testing with darcs 2, and I can’t think of a serious data-trashing problem that happens when you start with darcs 2 and the new darcs-2 format.

    I’ve been using darcs-1 since before the 1.0 release for a 40k+ line project with a few other developers, and never ran into a serious data corruption with it either.

    In fact, I was very pleased with how easy it was to recover from the usually minor problems I did run into, which usually were triggered by some user error, like running a command with the wrong permissions to complete it.

    Mark

  18. Nihil Est said 37 days later:

    Phil (@14), consider Kevin (@15)’s advice. Mercurial seems to strike a balance between usability and speed, being on par with Git for speed, and with a simple core command set that grows as you need it to (via extensions).

    I say this as someone who has tried all the VCSes out of curiosity. Git is improving in the UI sense, but it still has an everything-and-the-kitchen-sink feel to it. It has a core command set that’s learnable, but it’s not easy to extract that from the documentation. I expect this will improve as more people write tutorials, but Mercurial will get you there quicker, at least for now.

  19. Anonymous@anonymous.com said 64 days later:

    Life is too short for anyone to spend more than 30 minutes, in your whole career, learning how to use version control software.

    “nobody wants to become an expert in their revision management software, so it should be really easy to learn (flat learning curve) and still very powerful to do all the things you need and want to do.”

    Revision control software should be so easy to use that the person who can’t handle programming anything more than MS-Excel macros should be an expert in the VCS in less than a half hour, and should be able to remember most of the commands after not using the VCS, at all, for a year.

    Especially roll-back, revert, or whatever you want to call the operation.

    the quote above is from a nice darcs vs. bzr article at http://www.kdedevelopers.org/node/2024

  20. Prasinos said 360 days later:

    I’d like to point out that darcs usability didn’t just happen by accident, or because the darcs developers are usability experts. It came about because darcs has a very sound theoretical model about version control (and is probably the only vcs that does). Even with its flaws or even if it stagnates (which does not seem likely given the recent release of darcs2) the ideas behind darcs will probably find their way into another vcs.

    To me the most important part of the article is this: “In Git’s mind, Y depends on X, and Z depends on Y, even if they really are independent of one another.” (an assumption that most vcs make). This alone shows the advantages of the darcs approach.

    As for the “Linus/kernel uses git” argument, it is rather hollow. Not only the kernel is quite different from most projects but also the kernel developers are quite different from most other developers.

  21. Tom Moertel said 360 days later:

    @Prasinos:

    Thanks for your comment.

    Here’s why your argument doesn’t persuade me: I’ve used Darcs for about three years and Git for about one, and Git is better. It’s that simple: I’ve measured both systems via actual use, and Git wins.

    Don’t get me wrong, I like Haskell, I like theory in general, and I like the ideas behind Darcs in particular. I even used Darcs first and let it shape my thoughts about how DVCSs ought to work. Still, after starting to use Git for production work, I found that Git just works better. And after using Git now for about a year, that conclusion has only gotten stronger.

    For me, then, the interesting question is, why does Git work better, even though Darcs has the more powerful theory behind it? The answer, I’m coming to realize, is that Git was designed with practice in mind, with the knowledge of somebody who understood, arguably better than anybody else on the planet, how to manage the source code for large, highly collaborative, fast-moving software projects. Thus Git doesn’t have a rigorous theory behind it; rather it brute-forces the problem, snap-shotting every interesting point in your project history and making it easy – trivial mostly – to name interesting points and slice and dice the history until it tells any story you want it to. That model turns out to work great in practice: it’s easy to think about, and it’s easy to work with using the tools Git provides.

    I want to be clear, that’s not an argument for Git being better; it’s my best explanation for why Git is better for me in practice.

    Let me go back and re-examine my earlier statement:

    “In Git’s mind, Y depends on X, and Z depends on Y, even if they really are independent of one another.”

    If you don’t use Git, it’s hard to appreciate how easy it is to rewrite dependencies, and therefore it’s easy to overestimate the cost of having a fictional dependency in a project history. If X and Y are independent, for example, you might think it’s somehow burdensome to have your DVCS think that Y depends on X, but with Git it doesn’t matter because it’s trivial to swap them if you care (and most times you don’t).

    In sum, the reason I now prefer Git is because I’ve found it (by actual, long-term production use) to be the best DVCS in the running.

    Thanks again for your comment.

  22. Prasinos said 361 days later:

    Tom, I don’t want to persuade you (or anybody else) about darcs. I know firsthand that darcs can be very frustrating at times ;-)

    My comment was about making a point that sometimes we at the “software industry” forget: complex problems need sound theory. And distributed version control is a difficult problem.

    We can often produce a solution without that much of a theoretical foundation but as problems grow more complex these types of solutions become more brittle.

    The situation is not unlike that of programming languages, where we see that “practical” languages (e.g., C++/Java) are much more widespread than more “elegant” ones (e.g., Haskell). However, as problems grow more complex (e.g., concurrency), we see ideas from functional programming enter the mainstream (the C++0x standard is a nice example).

    In any case, thanks for an informative article (we’ve had enough of articles that are nothing more than useless feature by feature comparisons).

  23. Dan said 427 days later:

    The problem with “Sound Theory” is that while beautiful, the performance often sucks. Until recently, Haskell’s performance sucked. Yes, I’ve learned a lot by studying it, but it used to be infuriatingly slow.

    C has warts, but it is fast and gets the job done. Same with Java.

    And the same with Git vs Darcs. Sure, Darcs has a beautiful theory behind it, and it probably will influence more practical DVCS, but purity is rarely the best solution.

    cf Conflicts in DARCs. Order of patches can cause runaway exponential conflict issues? Wow, pretty theory, not good if it brings down the repo or your computer!

    Git is not pretty, but you can slice and dice nearly any thing with it. It doesn’t have a ‘cute’ theory, but it also doesn’t blow up because of conflicts.

  24. Sitaram Chamarty said 438 days later:

    I was expecting to see at least one comment here mention that the GHC development has gone to git.

    I’m too old to be snarky so please don’t take it that way—it is a genuine question from someone who tried learning Haskell and liked it very much, although I admit I’m not a big fan of ‘theory’ from people not in the trenches.

  25. aam said 526 days later:

    This may be a really old blog post, but it’s the first thing that comes up if you do a google search of “darcs git.” Quite an informative post. After reading Sitaram’s comment, I was curious as to WHY ghc moved to git, so I did a quick google search which came up with this:

    http://hackage.haskell.org/trac/ghc/wiki/DarcsEvaluation

    In particular, you may find the external references part interesting (in fact, this post is listed as a linkback):

    http://hackage.haskell.org/trac/ghc/wiki/DarcsEvaluation#Externalreferences

  26. Jedai said 988 days later:

    Given that, as aam said, this article continues to be easy to find via Google, I think it is important to say that GHC did not switch to git in the end. Simply put, darcs development was invigorated by this news, and the current version of darcs has improved its performance enough that the GHC team decided to stay on darcs. Darcs is still being worked on and is now much closer to git in terms of performance, surpassing mercurial.

    You may still prefer Git UI (ew ;) but for most reasonable projects (GHC is pretty big) darcs performance will now be adequate.

  27. Jerome Martin said 1033 days later:

    I know it’s been years, but I cannot resist jumping in. I hope my comment will still be of use, at least to people stumbling upon that thread a bit late, like I did.

    @prasinos, @Tom Moertel:

    • “In Git’s mind, Y depends on X, and Z depends on Y, even if they really are independent of one another.”
    • “If you don’t use Git, it’s hard to appreciate how easy it is to rewrite dependencies, [...].”

    Both comments completely miss the point. They talk about “dependencies” where there are none.

    With git, commits are all independent. You can cherrypick any commit from any branch and apply it to any other branch arbitrarily. You can do that with an arbitrary group of commits, in arbitrary order too, either producing a single large commit in the destination branch (squashed commits) or many individual commits.

    What you call “dependencies” here is just your project commits being ordered in the specific timeline you applied them on top of each other, in a tree-like fashion (a single commit can appear several times in that tree, i.e. in various branches).

    This is what we call your “history”, as can be seen in the git log.

    The only reason why git separates the functionality that allows you to rewrite history (git rebase, which could really be aliased to git change-history) from the other ones (like commit --amend), is because rewriting history is a very special operation in a DVCS, not because of technical limitations (git has none of that sort), but because of human workflow and principle of least surprise: If you already pushed your local repo history to a central/shared repo, and then you rewrite history, it could make your coworkers go cuckoo you see… “I could have sworn there was a commit there, it suddenly disappeared”, etc.

    However, if you decide that you want to rewrite history, either because you haven’t pushed your changes to a central repo yet, because your coworkers expect it or because you work alone on the project, whatever, then you can. This is exactly what the rebase command does, it allows you to rewrite your repo’s history.

    I hope this clarifies this “dependency” misconception and clearly explains why “amending the LAST-n commit” works the way it does, on purpose, and, IM(not so)HO, is a very wise decision from a UI standpoint.

    As for that notion that “Life is too short for anyone to spend more than 30 minutes, in your whole career, learning how to use version control software.”, it really underestimates the importance of the VCS in any sizeable software engineering project.

    Note: I used darcs too years ago, was initially attracted to it by the theory too. But as the saying goes, “Theory and practice are the same … in theory. But often not in practice.” :-)

  28. Jerome Martin said 1033 days later:

    Just to bounce back on the end of that last comment of mine, I’ll risk a hazardous and (as they all are) inaccurate parallel: mercurial/git are to darcs what erlang/ocaml are to haskell.

    In a positive way, that could mean that, like I would recommend haskell to anyone who wishes to put serious work into understanding the real power of functional programming and expand his mental horizon, maybe darcs could try to play that role for D. version control systems ?

  29. Tom Moertel said 1035 days later:

    @Jerome Martin,

    When I write that “In Git’s mind, X depends on Y”, I mean that Git doesn’t know that X and Y are order independent (commutable). In Darcs, this knowledge is represented in the repo (and often inferred automatically), but in Git, this knowledge is represented in the programmer’s head. Thus, while you write that there are “none” of these dependencies, what you really mean is that Git doesn’t know about them.

    The dependencies are real, however. While you can cherry-pick any commit into any branch, it’s up to you to know whether doing so will break something. What do you think this knowledge represents?

    Cheers,
    Tom

  30. Jerome Martin said 1036 days later:

    @Tom Moertel

    Wow, this thread is alive :-) Tom, Git does not have any notion of commutable vs dependent commits a priori.

    And contrary to what you state you do not have to know if cherry picking will work in order to not break anything, because git will detect the eventual conflicts and tell you so, refusing to commit until the conflict has been resolved by a human being.

    The knowledge you mention represents the human operator notion of dependency between individual commits, yes, you are right about that. However, this is different from “In Git’s mind, X depends on Y”.

    It is blatant that you either misunderstood or did not express clearly how it works at the time you wrote the initial comments and article years ago, which is perfectly fine (we all lack deep understanding on many topics, especially when doing first-time evaluation). But please, do not try to turn that around now, it is useless and might confuse people reaching this thread looking for a better understanding.

  31. Tom Moertel said 1036 days later:

    @Jerome Martin:

    Yes, you do have to know if cherry-picking will break something. When cherry-picking, it is your responsibility to make sure the branch you end up with represents a project state that makes sense. Neither Git nor Darcs can detect, for example, that you’re cherry-picking a commit that calls library functions that the current branch lacks. If you go ahead with that cherry-pick, you’ll get a valid commit that represents a broken project.
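
    That failure mode can be reproduced with a small sketch. The repository, the branch name with-lib, and the files lib.sh and app.sh are all hypothetical stand-ins for “a commit that calls library functions that the current branch lacks”:

```shell
set -e
# Throwaway repository; branch and file names are made up for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > base
git add base
git commit -q -m 'Initial checkin'
main_branch=$(git symbolic-ref --short HEAD)

# On a side branch, add a tiny library and then a script that calls into it.
git checkout -q -b with-lib
printf 'greet() { echo hello; }\n' > lib.sh
git add lib.sh
git commit -q -m 'Add library'
printf '. ./lib.sh\ngreet\n' > app.sh
git add app.sh
git commit -q -m 'Add app that calls greet'
app_commit=$(git rev-parse HEAD)

# Cherry-pick only the app commit onto the main branch, which has no lib.sh.
git checkout -q "$main_branch"
git cherry-pick "$app_commit" > /dev/null   # applies cleanly: no textual conflict

# The commit is valid, but the project state it represents does not run.
if sh app.sh 2> /dev/null; then state=works; else state=broken; fi
echo "cherry-picked state: $state"
```

    Git reports no conflict because the two commits touch different files; whether the resulting tree makes sense is left to the person doing the cherry-pick.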

    Getting back to, “In Git’s mind, X depends on Y”, what do you think those “parent” lines represent in Git commit objects? Now, you and I may know that a particular commit and its “parent” represent changes that are independent of one another and, therefore, that the order dependency represented in the parent field is arbitrary and can be reversed (or even dissolved by removing one of the commits), but Git doesn’t. It maintains the convenient fiction – literally records it in the repo – that the child requires the parent, that there is no valid state in the project’s history in which the child participates and the parent doesn’t.
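
    The “parent” line is easy to see by printing a raw commit object with git cat-file. The following is a small sketch in a throwaway repository with made-up file names; the sed expression simply pulls the recorded parent hash out of the object header:

```shell
set -e
# Throwaway repository; file names are made up for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo X1 > x
git add x
git commit -q -m 'Implemented X'
echo Y1 > y
git add y
git commit -q -m 'Implemented Y'

# Raw commit object for Y: tree, parent, author, committer, then the message.
git cat-file -p HEAD

# The parent recorded inside Y's commit object is exactly the commit for X,
# even though the two changes touch unrelated files.
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
actual_parent=$(git rev-parse HEAD~1)
echo "recorded: $recorded_parent"
echo "HEAD~1:   $actual_parent"
```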

    This fiction, however, is advantageous. It’s easy to understand and hard to screw up. And Git users know that Git repos tell an overly simple this-builds-on-that project story that is not intended to capture the full complexity of the relationships within a project’s development history; that’s because Git users know it’s their responsibility to understand those relationships and to tell Git how to draw the history when those relationships change.

    That’s the strength of Git. It doesn’t try to anticipate all the weird situations that programmers might encounter; it just gives them a simple, general representation and the tools to shape that representation into whatever form might be needed.

    Cheers,
    Tom

    P.S. Other than this postscript, I’m ignoring the cheap shot about my motives at the end of your comment.

  32. Jerome Martin said 1038 days later:

    @Tom Moertel:

    I really do not understand the long talk here. You seem to agree that git does not enforce any dependency explicitly, and that was most of my original point, in disagreement with what your article and subsequent comments implied. Either that was not clear from my previous comments or my “cheap shot” still holds.

    As for the parent of a commit object, this is indeed most interesting, and I can see now that you might have misinterpreted what is frequently called a dependency in git literature, aka node dependency. In effect, a node is dependent on its parent nodes because one needs some sort of reference to access a node. But a commit in itself is more than just a node. It is, in fact, a whole tree. And yes, a commit can have a parent (can, not must, i.e. detached commits, initial commits, etc.), as well as it can have several parents (merges like cherry-picks). In the case of a merge, this kind of 1 to N relationship between a commit and its parents clearly shows that a commit parent is not a dependency for the commit to exist, it is part of the commit. The diff that is the body of that specific commit (the last patch applied on the commit tree) itself does not have any dependency at all. Again, these parent/commit links are there merely for refcounting and eventually garbage collecting the detached commits, certainly not to indicate any dependency in terms of “valid state”. You can easily detach a commit (you cannot reference it anymore other than by its hash, there is no more parent to the commit, it is not part of any branch), and then reuse it at will for merging, label it with a tag or branch name, etc. This shows clearly that there is no dependency tale told by git about the body of the commit, just structural information that tells about, I will use that once again, the history of the repo.

    As for your take on the strength of Git, I could not agree more: its beauty (like that of many other tools) lies exactly in what you describe. Which is exactly why I wanted to make it clear in the first place that there is no such thing as enforced or assumed dependencies between commits from Git.

  33. Tom Moertel said 1038 days later:

    @Jerome Martin:

    You wrote:

    You seem to agree that git does not enforce any dependency explicitly…

    But it does. It’s just that Git also allows you to rewrite those dependencies at virtually zero cost. (Until you’ve published your repo, that is. Then there’s a nearly prohibitive social cost to rewriting.)

    I think where we are getting hung up is that I’m talking about the semantics of the Git model and you’re talking about the physical representation. I understand that the physical representation allows you to do just about anything, but that’s not the meaning of the representation. The meaning is that a project contains commits, each representing a project state that evolves from earlier states, represented by their own commits, to form a graph. You say that, because the graph can be changed arbitrarily, the edges within it don’t represent real dependencies. I say that they are real to Git, that Git promises to respect them, and that, until you tell Git otherwise, a commit represents the movement to a particular project state from the one(s) that came before it.
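    As a rough illustration of the "semantics versus representation" point, here is a deliberately simplified Python sketch. It is not Git's actual object format; the `Commit` class, the placeholder tree ids, and the messages are all assumptions made for the example. What it borrows from Git is only the idea that a commit's identity is a hash over its contents, including the ids of its parents.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Commit:
    """Toy stand-in for a commit: a snapshot id, parent ids, and a message."""
    tree: str                 # placeholder for the project snapshot's id
    parents: Tuple[str, ...]  # ids of parent commits (empty for a root commit)
    message: str

    @property
    def id(self) -> str:
        # The parent ids are hashed together with everything else,
        # so the edges of the graph are part of the commit's identity.
        lines = [f"tree {self.tree}"]
        lines += [f"parent {p}" for p in self.parents]
        lines += ["", self.message]
        return hashlib.sha1("\n".join(lines).encode("utf-8")).hexdigest()


# X is recorded as coming after Y.
y = Commit(tree="tree-y", parents=(), message="Implement Y")
x = Commit(tree="tree-x", parents=(y.id,), message="Implement X")

# "Telling Git otherwise" means creating a new commit with a different parent.
# (A real rebase would usually change the tree as well; this sketch keeps it fixed.)
other = Commit(tree="tree-other", parents=(), message="Something else")
x_rewritten = Commit(tree="tree-x", parents=(other.id,), message="Implement X")
```

    In this model the recorded edge from X to Y stays in place until a new commit is written, and the rewritten commit gets a different id. That is one way to read "real, but changeable".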

    So, again, getting back to “In Git’s mind, X depends on Y”, what I mean is that when I tell Git that X depends on Y, Git believes that X depends on Y and will persist in that belief until I tell it otherwise. Further, Git doesn’t really offer any convenient means to represent more-complex evolutionary relationships such as “here is a collection of mutually independent changes.” But that’s okay. Git forces me to write those changes into a commit graph that represents an ordering over them. I know that I can reorder them, but Git doesn’t. Git doesn’t let me represent (by practical means) that they are orderless. It requires me to carry this knowledge myself. But it does let me act upon this knowledge: it lets me reorder them whenever I want.

    Do you see the difference? You’re saying that because I can reorder them, the ordering dependencies in the repo aren’t real. I’m saying that they are real – but changeable.

    Cheers,
    Tom

  34. Jerome Martin said 1039 days later:

    @Tom Moertel:

    OK, you are calling dependencies the edges of the DAG, and I am, schematically, stating that the application of the body of a commit does not depend on anything as far as git is concerned, which you never disagreed upon. You talk about structure, I talk about real-world impact and constraints enforced by git on the user.

    I agree that we should stop arguing about that because you are right, we are probably not talking about the same thing :-) and I am not contesting the description of Git’s repo structure you have made. As both of us seem to understand it, it would be futile to dig more into that.

    I just hope that this (somewhat pointless it seems towards the end) debate will have raised some interesting enough points that people digging out that thread from now on will at least get food for thought.

    The only point I will hold on to is that the way you used the word (in)dependent, both in the early comments and in the article (at the beginning of the “Learning to love Git” section), is IMO misleading and seems to express a constraint for the user, not a structural relationship. But we went through this already, so let’s say that we diverge on the perceived meaning of it and that this comment thread holds all the disambiguation info that potential readers might need (and more).

  35. Tom Moertel said 1039 days later:

    @Jerome Martin:

    Thanks for staying with the conversation. :-)

    On your last point, I now see what you mean. I agree that, from the perspective of a Darcs user coming to Git, it seems that Git requires rigid ordering of commits where Darcs is happy to consider patches as independent. I hope, however, that I was able to explain during the remainder of the article that this perception, while correct in a static sense, isn’t limiting because Git’s commit graph is so easily changed with tools like interactive rebasing.

    In other words, Darcs’s “graph” is harder to change and therefore Darcs tries to represent weaker constraints within it (e.g., commutativity) so that you’re not overly constrained. Git, in contrast, makes you declare an ordering over the graph, even if the underlying work doesn’t impose one. Git prevents this representation from becoming overly constraining in a different way: by letting you easily change the graph.
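    The contrast can be sketched with a toy model in Python. This is not Darcs's patch theory or Git's storage; the names `apply_all` and `commute` and the rule "changes commute when they touch disjoint files" are simplifying assumptions for illustration only.

```python
from typing import Dict, List

State = Dict[str, str]   # file name -> contents
Change = Dict[str, str]  # files a change overwrites, with their new contents


def apply_all(state: State, changes: List[Change]) -> State:
    """Apply changes in the given order and return the resulting state."""
    result = dict(state)
    for change in changes:
        result.update(change)
    return result


def commute(a: Change, b: Change) -> bool:
    """In this toy model, two changes commute when they touch disjoint files."""
    return not (a.keys() & b.keys())


base = {"README": "v1"}
feature_x = {"x.py": "feature X"}
feature_y = {"y.py": "feature Y"}
fix_for_x = {"x.py": "feature X, fixed"}

# Independent work: the order recorded in history is arbitrary.
same_either_way = (
    apply_all(base, [feature_x, feature_y]) == apply_all(base, [feature_y, feature_x])
)

# Dependent work: the order matters.
order_matters = (
    apply_all(base, [feature_x, fix_for_x]) != apply_all(base, [fix_for_x, feature_x])
)
```

    A Darcs-like tool tracks something like `commute` for you. A Git-like tool records one particular order and leaves it to you to know which reorderings are safe.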

    Thanks again for staying in the conversation. I learned something because of it.

    Cheers,
    Tom

  36. Jerome Martin said 1039 days later:

    @Tom Moertel:

    On an unrelated note, I have had a thought in mind since what you said earlier about Git’s strength (end of comment 31), but I kept it to myself as it is highly subjective and I did not want to mix arguing about Git semantics with the risk of starting a flamewar.

    However, your last comment about darcs vs Git constraints/graph semantics triggered the same thought again, so I’ll say it out loud: the differences you express between Git and Darcs make me think of the philosophical differences between statically typed languages like haskell and dynamically typed ones like python (or even erlang for that matter).

    On the one hand you try to prevent a whole class of mistakes from being possible to make at all, at the cost of some rigidity (and trying to approach asymptotically a mathematically impossible goal, as Gödel, Church and Turing all demonstrated); on the other hand you try to be as unobtrusive and natural as possible, at the cost of losing formal expressiveness.

    I do wonder if there is a general process to place the cursor between those two extremes in a given situation. Surely the latest tendencies in programming methodologies (extreme, scrum, agile, etc.) put the emphasis on flexibility over rigidity, the darcs vs git example seems to lean that way too, and so do even much older in-depth analyses of large software projects (The Mythical Man-Month being a good example). However, this is just a collection of examples and opinions, certainly not a general rule.

    Do you, as a Haskell programmer, have any thoughts to share about this other than just personal preferences?

  37. Jerome Martin said 1039 days later:

    Actually I might add that I left the impact of static vs dynamic typing on possible code performance optimization aside, as I think that often leads to the “save time while prototyping” vs “get a better runtime performance” argument, which is not the point here I think. Or is it?

  38. Tom Moertel said 1039 days later:

    @Jerome Martin,

    You raise an interesting question: Is there a sweet spot between a powerful (yet constraining) static type system and a permissive (yet flexible) run-time system, or if not, is there a way to choose points along that spectrum best for each task?

    In answer, I offer this. I write a lot of code in Haskell, Erlang, Python, and Perl. I’ve noticed that the code I write in Erlang, Python, and Perl would, for the most part, type-check under a type system like Haskell’s. Where it wouldn’t, I usually have misaligned some concepts subtly and would actually appreciate having a type system tap me on the shoulder to point out the mistake.

    In other words, I rarely have (legitimate) need to do anything that a Haskell-like type system would impede. So, for me, Haskell’s system is close to a sweet spot for day-to-day coding tasks.

    Now, the lack of a static type system in Erlang, Python, and Perl doesn’t impede me much in day-to-day coding, either. So, for the day to day, their typing strategy also represents a sweet spot and thus I don’t see much of a practical difference between Haskell-style and “dynamic”-style typing.

    But I do see a difference – a big difference – when I need to do something requiring rigor. Then I can “dial up” Haskell’s type system to make it carry part of my proof burden for me. I design a lot of logic in Haskell that is destined to be implemented in Erlang and Python, simply because I get very strong guarantees for my effort. When I need strong guarantees, a Haskell-like typing strategy pays off – to the point where I’m willing to implement something once in Haskell and then port it to the target language because it’s cheaper than writing it directly in the target language and trying to purchase those guarantees some other way.

    In sum, I think the Haskell style of typing is hard to beat. It stays out of the way for day-to-day work and “dials up” for rigorous work. The “dynamic” strategy, however, can’t be dialed up when needed and, for the work I do at least, it’s a limitation that I notice semi-regularly.

    One other thing I’ll mention: Even for day-to-day work, I notice a benefit of the Haskell-like strategy: it teaches discipline. When coding in Erlang, Python, and Perl, for example, I can avoid many subtle misalignments of concepts because coding in Haskell has forced me to become good at spotting misalignments. In Haskell, you know the type system is always there to bust you, so you learn to keep your concepts clean. That learning carries over when coding in other languages because you can’t turn off that feeling of a type system watching over your shoulder for mistakes.

    Cheers,
    Tom

  39. Jerome Martin said 1040 days later:

    @Tom Moertel:

    I realize now that my parallel might not have been a proper one, because of Haskell’s type inference which, according to your description, you seem to leverage. I rarely do, as the explicit typing actually helps me switch to “haskell mode” – but I am just a beginner in haskell.

    What I can add to build on your description of daily use of typing, is a description of how my usage of python (these days the language I use the most) evolved over the years:

    - At first, coming from a C/C++ background, I kept doing paranoia checks on function/method arguments, with lots of if statements starting the code blocks. I even ignored duck typing, checking the actual parenthood of objects.

    - Then, I sort of relaxed a bit regarding this, and started to use try..except statements instead of if, actually reducing the number of tests, but still catching many typing problems in the functions/methods code directly. This is more pythonic, but still too systematic (typically lots of try: x=int(y) kind of stuff).

    - Now, I feel that I have adopted a more relaxed way of doing it. Even if this is not used in the proper context, it sounds a bit like the erlang “Just let it crash.” In python, this translates into higher-level try..except, with typically per-module sorting of custom Exceptions and generic exceptions.
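    The three stages above might look roughly like this in Python. This is only a sketch: the function names, the `ConfigError` exception, and the "port" example are hypothetical and chosen for illustration.

```python
class ConfigError(Exception):
    """Hypothetical per-module exception used by the higher-level handler."""


def parse_port_paranoid(value):
    # Stage 1: explicit checks on the argument before doing any work.
    if not isinstance(value, str):
        raise TypeError("value must be a str")
    if not value.isdigit():
        raise ValueError("value must contain only digits")
    return int(value)


def parse_port_local_try(value):
    # Stage 2: a try/except wrapped around each individual conversion.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_port(value):
    # Stage 3: no local checks; a bad value simply raises.
    return int(value)


def load_config(raw):
    # Stage 3, continued: one higher-level handler translates whatever
    # went wrong below into the module's own exception type.
    try:
        return {"port": parse_port(raw["port"])}
    except (KeyError, TypeError, ValueError) as exc:
        raise ConfigError(f"bad configuration: {exc}") from exc
```

    In the third style the low-level helper stays small, and the decision about what counts as an error lives in one place per module.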

    All in all, I feel that this more relaxed way of doing things is lighter than, and as powerful as, a paranoid systematic type check, but I have to say that this is valid only in the context of dynamically-typed languages. With static typing (and a good type system to define as custom types all the other data checks that one does in a dynamic language), I feel that most of my code would actually lose a notable amount of error-handling and exception definitions, plus gain compactness and better semantics by doing custom type checking for what it is, instead of mixing it up with other code.

    I got attracted to Haskell exactly for that reason, because the type system looks amazing (which is even more important in a FL, this is my no2 rant about erlang, no1 being the prolog syntax), but as I am still discovering daily usage of it and side effects (no pun intended), I am in the “questioning phase” about it :-)

  40. Det said 1121 days later:

    Kurt, that’s ludicrous. If “Arch” and “Git” sound bad to you (in fact they sound bad to no one else except some fanboys maybe) I wonder why you would even want to discuss this matter.

    These two are what they are exactly because they are better that way. Neither project’s purpose is to be as user friendly as possible. They are not (and never will be) designed with users preferring easy-to-use software in mind, for chrissake..

    And Git undoubtedly has some very advanced features that are – and always will be – difficult for beginners to use/understand. It seems you are one of them. But what you are suggesting is that these features should also be made easier to use so that new users are more comfortable with them.. or something? I mean seriously. Wake up. It’s Git, not Ubuntu.

    The most hilarious thing you came up with was that because Git developers are kernel developers they don’t care about putting off other people because what they already have still works OK for themselves. You are really starting to sound like a fanboy.

    I have never used Git or Darcs myself but the only point you are making is that “oh Git is so hard to use and the kernel devs are lazy”. Congratulations on making your point. It was such a wonderful waste.

  41. eric.des.courtis@gmail.com said 1262 days later:

    Your darcs workflow should not be used in git; it makes things difficult for no reason.

    How about using feature branches instead of developing everything on the same branch?

    You should create a branch for each feature:

    - branch X
    - branch Y
    - branch Z

    You can move from one branch to the other and amend the commits, no problem. When the time comes you can successively rebase, merge or cherry-pick the commits onto the master branch.

  42. trans said 1425 days later:

    I’ve used both Darcs and Git. And I miss Darcs. I use Git pretty much for one reason: GitHub.

    I have found Git to be a pain in the ass. I’ve lost files for reasons I may never understand. The ui at times can be likened to a form of witchcraft. And the in-directory branch management is a mixed bag, good in some ways not in others.

    The biggest issue I have with Git is the one the author of this post ends on. It’s too difficult to isolate change-sets. I often just say forget it and commit mixed changes, oh well. Rebase is fragile and its effects can be irreversible. Topic branches help, but don’t offer a solution post-merge. Nor do they take real development into consideration: one often works on multiple issues at once.

    I guess it’s too bad that instead of fighting over which vcs is better the two projects can’t find common synergy to build a vcs beyond either of them.
