Commit b0af9a9

rn-127: add article about blob/treeless clones
1 parent c993836 commit b0af9a9

1 file changed: +97 -2 lines changed

rev_news/drafts/edition-127.md

@@ -25,9 +25,104 @@ This edition covers what happened during the months of August and September 2025
### Reviews
-->

### Support

* [Doing blobless clone by default; switching between blobless, treeless and full clones by a command](https://lore.kernel.org/git/[email protected]/)

Dilyan Palauzov (Дилян Палаузов) sent an email to the Git mailing
list proposing that blobless cloning (`--filter=blob:none`) become
the default behavior of `git clone`, controlled by a global
configuration option, so that running just `git clone URL` would
produce a blobless clone. He also suggested adding a command to
download all locally missing history, and a command to convert a
repository to a pure treeless or pure blobless clone.
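
The three kinds of clone discussed in this thread differ only in the
`--filter` argument passed to `git clone`. As a rough illustration, the
Python sketch below builds the corresponding command lines. The
`clone_command` helper and the `CLONE_FILTERS` table are hypothetical
names introduced here; `--filter=blob:none` and `--filter=tree:0` are
existing `git clone` options.

```python
# Hypothetical helper (not part of Git): maps the clone kinds discussed
# above to the `git clone` arguments that produce them.
CLONE_FILTERS = {
    "full": None,             # no filter: all commits, trees and blobs
    "blobless": "blob:none",  # commits and trees; blobs fetched on demand
    "treeless": "tree:0",     # commits only; trees and blobs fetched on demand
}


def clone_command(url, kind="full"):
    """Return the argv list for cloning `url` as a full, blobless or treeless clone."""
    if kind not in CLONE_FILTERS:
        raise ValueError(f"unknown clone kind: {kind!r}")
    argv = ["git", "clone"]
    spec = CLONE_FILTERS[kind]
    if spec is not None:
        argv.append(f"--filter={spec}")
    argv.append(url)
    return argv
```

Calling `clone_command(url, "blobless")` gives the invocation Dilyan
wanted to become the default; the returned list can be handed to
`subprocess.run` to actually perform the clone.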

He said that most users clone to build or change software, not to
immediately analyze history with commands like `git log`. Therefore,
a reduced data download would speed up initialization, save
bandwidth, and reduce server load.

Kristoffer Haugsbakk replied that the proposed command to
"download all locally missing history" for treeless and blobless
clones "sounds like git-backfill(1)". He also noted that he had
"never used blob/treeless" clones himself.

Derrick Stolee, who likes to be called just "Stolee", and who
contributed the `git backfill` command, replied to Kristoffer,
confirming that `git backfill` is intended to help download the
missing blobs in a blobless partial clone.

Regarding treeless clones, however, he noted that `git backfill` is
not optimized for them, and that treeless clones are generally not
intended for "refilling", as downloading missing trees is
"particularly expensive".
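
As a hedged illustration of the distinction Stolee draws, the sketch
below returns a `git backfill` command line for a blobless clone and
declines to suggest one for a treeless clone. The function name
`refill_command` and the choice to raise an error are assumptions made
for this example. `git backfill` and its `--min-batch-size` option
exist in recent Git releases, but `git help backfill` is the
authoritative source for a given version.

```python
def refill_command(clone_kind, min_batch_size=None):
    """Return an argv list that downloads missing blobs in a blobless clone."""
    if clone_kind == "treeless":
        # Assumption for this sketch: treat refilling a treeless clone as
        # unsupported, since the thread calls it "particularly expensive".
        raise ValueError("treeless clones are not intended for refilling")
    if clone_kind != "blobless":
        # A full clone already has every object, so there is nothing to do.
        raise ValueError(f"nothing to backfill for clone kind {clone_kind!r}")
    argv = ["git", "backfill"]
    if min_batch_size is not None:
        argv.append(f"--min-batch-size={min_batch_size}")
    return argv
```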

Instead of making blobless cloning the default, Stolee suggested
using `scalar clone`, which already ships with Git. It was
contributed partly to let users opt into a version of `git clone`
that incorporates "best practices and advanced features as they are
developed", while `git clone` itself maintains backward
compatibility. He recognized, though, that `scalar clone` might not
be "discoverable enough".

Junio Hamano replied to Stolee's suggestion that a future command
like `git big-clone` could emerge from the feedback on
`scalar clone`. He said a separate command like `git big-clone`
would not be discoverable enough either. Instead, as a new feature
matures, it should be a welcome change for `git clone` to borrow it
as a new option. Such optimizations (like those for large repos)
could be automatically enabled based on the repository's size,
provided it's done with end-user consent.
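
Junio's idea of enabling such optimizations based on repository size,
with end-user consent, can be sketched as a small decision function.
Everything in it is hypothetical: the thread does not describe an
implementation, and the name `pick_clone_filter`, the 1 GiB threshold,
and the boolean consent flag are assumptions chosen only for
illustration.

```python
LARGE_REPO_BYTES = 1 << 30  # hypothetical threshold (1 GiB), not a Git setting


def pick_clone_filter(repo_size_bytes, user_consented, threshold=LARGE_REPO_BYTES):
    """Return a partial-clone filter spec, or None for a regular full clone."""
    if not user_consented:
        # Without explicit consent, keep the backward-compatible behavior.
        return None
    if repo_size_bytes >= threshold:
        # Large repository and the user agreed: use a blobless clone.
        return "blob:none"
    return None
```

The consent check comes first so that the size of the repository alone
can never change what a plain clone does.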

Patrick Steinhardt replied to Stolee about treeless clones. He
agreed that the existing command `git backfill` is not optimized for
refilling treeless clones, and proposed an idea to backfill trees by
batching based on depth, but concluded that this method is
"definitely not ideal" and would perform "way worse compared to
backfilling blobs".

Patrick also said that for these reasons he generally recommends
against using treeless clones at all.

Stolee replied to Patrick, agreeing with the general caution
regarding treeless clones and saying that they are "not a good
approach for doing ongoing work as a human".

However, he noted that they are useful if a user needs the speed of
a shallow clone combined with the ability to analyze commit history
(though with no path history) for an "ephemeral scenario like a CI
build". But they are a "tool for a very narrow case" and should only
be used by those who understand how to avoid their pitfalls. Patrick
then agreed with that point of view.

Konstantin Ryabitsev, the system administrator for kernel.org,
replied to the original email from Dilyan about making blobless
clones the default behavior for `git clone`. He said a
counter-rationale to this proposal was that shallow clones (which
include blobless clones) generate significantly more load on the
server side.

The reason is that for these partial clones, no pre-existing packs
are available for the operation, requiring more computation from the
server. So changing the default behavior for `git clone` would
likely result in slower clones for everyone and lead to more
unavailable servers due to the high load.

Ben Knoble also replied to Dilyan's original email, opposing the
proposal to make blobless clones the default behavior while agreeing
that managing this preference via a config option was a reasonable
compromise.

Ben's opinion was that such a default behavior would defeat the
"tremendous advantage of distributed version control", which is
about having the whole repository available independently. It would
also make some of his use cases more difficult, as he frequently
clones repositories specifically to run "history spelunking
searches".

He noted that he primarily deals with repositories where the issue
isn't about clones, but about mismanaging large binary files in
history, which creates large blobs and long clone times.

## Developer Spotlight: Toon Claes
