Skip to content

[3.0] Adds support for diffs in the package manager#8620

Merged
Sesquipedalian merged 2 commits intoSimpleMachines:release-3.0from
Sesquipedalian:3.0/diff_support
May 3, 2025
Merged

[3.0] Adds support for diffs in the package manager#8620
Sesquipedalian merged 2 commits intoSimpleMachines:release-3.0from
Sesquipedalian:3.0/diff_support

Conversation

@Sesquipedalian
Copy link
Copy Markdown
Member

These changes add support in the Package Manager for diff files as an alternative to modification.xml files.

The exciting part that you came here for:

This means, among other things, that creating patch files for SMF 3.0 will be as simple as running git diff and putting the output directly into our patch packages. We will no longer need to laboriously craft a modification.xml, nor monkey around with scripts that try to convert a diff into a modification.xml. We'll just use the diff directly. This will be both incredibly easier and infinitely more reliable.

Modification developers can also benefit greatly from this. For example, when a mod developer is ready to create an install package for their mod, they could simply run diff -u on the command line to compare an unmodified copy of SMF with their modified copy, and then use the output directly in their package.

Here is the documentation for PackageManager::parseDiff() and FullDiff::createFromRaw(), which explain how to generate diffs for use in the Package Manager:

	/**
	 * Parses a diff-based modification.
	 *
	 * Supports diffs in unified format and context format, although unified
	 * format is recommended.
	 *
	 * Diffs can be generated using the standard Unix `diff` tool or by common
	 * version control software such as Subversion or Git.
	 *
	 * Recommended `diff` command to generate diffs for SMF's package manager:
	 *
	 *    diff -u /path/to/unmodified/SMF /path/to/modified/SMF
	 *
	 * This will create a unified diff by comparing an unmodified copy of SMF
	 * to a copy with modified files. Note that standard `diff` never generates
	 * the extended "rename" and "copy" headers, so mod authors may wish to
	 * insert them manually.
	 *
	 * Recommended Git command to generate diffs for SMF's package manager:
	 *
	 *    git diff -p -M -C -C -B --default-prefix --no-relative <ref1>...<ref2>
	 *
	 * This creates a unified diff by comparing two commits in a Git repository.
	 * The diff will include "rename" and "copy" headers wherever appropriate.
	 * (Note: the `-C` option is intentionally included twice.)
	 *
	 * In order to support the package manager's ability to mark certain changes
	 * as optional, one or more extended "optional" headers can be manually
	 * inserted into a diff at appropriate locations in order to indicate to the
	 * package manager that the changes to one or more particular files can
	 * safely be ignored if they cannot be applied.
	 *
	 * @see \SMF\Diff\FullDiff::createFromRaw() for more info on the extended
	 *    "optional", "rename", and "copy" headers and how to use them.
	 *
	 * @param string $raw_diff The content of a diff file.
	 * @param bool $testing Whether we're just doing a test.
	 * @param bool $undo If true, specifies that the modifications should be
	 *    undone. Used when uninstalling.
	 * @param array $theme_paths An array of information about custom themes
	 *    to apply the changes to.
	 * @return array An array of those changes made.
	 */
	/**
	 * Parses raw diff data to make one or more instances of this class.
	 *
	 * Supports raw diffs in unified format, context format, and normal format.
	 * Does not support ed script format.
	 *
	 * If the raw diff contains info about multiple files, the returned array
	 * will contain multiple instances of this class.
	 *
	 * In addition to the standard diff headers for the unified and context
	 * formats, the following extended headers are also recognized:
	 *
	 *    optional
	 *    rename from <old path>
	 *    rename to <new path>
	 *    copy from <old path>
	 *    copy to <new path>
	 *
	 * The "optional" header can be manually inserted into the raw diff in order
	 * to indicate that the changes to a particular file can be ignored if they
	 * cannot be applied. This header should be inserted before the standard
	 * diff headers and, where applicable, before the "rename" and "copy"
	 * extended headers for the file that it refers to. If a mod wants to make
	 * both mandatory and optional changes in the same file, two diffs for the
	 * file can be included: first a diff containing the mandatory changes to
	 * the file, and then a second diff with the "optional" header containing
	 * the optional changes.
	 *
	 * The "rename" and "copy" headers are generated by `git diff` when a file
	 * is copied or renamed. They can also be manually inserted into the raw
	 * diff by mod authors. They must always occur in pairs with the "from"
	 * header line immediately followed by the corresponding "to" header line.
	 * This information is used to indicate the correct file operations to
	 * perform when applying the diff to files.
	 *
	 * @see \SMF\PackageManager\PackageUtils::parseDiff() for more information
	 *    on how to generate raw diffs for use in SMF's package manager.
	 *
	 * @param string $raw_diff The raw diff.
	 * @throws \ValueError if $raw_diff does not contain valid raw diff data.
	 * @return array Instances of this class.
	 */

A further benefit of using diffs in our packages is that packages will no longer need to include separate copies of any new files added to SMF by the package. One of the limitations of the modification.xml format is that it can only apply changes to existing files; in order to add a new file, we had to instead use <require-file> in package-info.xml. But when using diffs, any and all new text-based files are included in the diff content and do not need to be handled separately. (Unfortunately, new or updated binary files cannot be handled by diffs and still need to be handled using <require-file> in package-info.xml.)

"But wait," you say, "it looks like more is going on here..."

Why, yes. Yes, there is. The observant will notice that the first commit implements not only (1) the abstract SMF\Diff class and (2) the SMF\FullDiff class that is used for the Package Manager, but also (3) the SMF\EditDiff class which doesn't appear to be used for anything. What's going on here?

SMF\EditDiff is unused in this PR, but a future PR will use it to add support for tracking the edit history of posts, the registration agreement, and the privacy policy. The code for all of that is already written (it has been for a while now, actually). However, database changes are required in order to use this new edit history feature. That in turn means that before anyone besides me can start testing this edit history feature, we need to finish dealing with the upgrader rewrite. Then the rest of you will be able to run the upgrader in order to make the necessary database changes so you can test.

@Sesquipedalian
Copy link
Copy Markdown
Member Author

I will add more info and some testing instructions soon. But right now it is very late here, so I am going to sleep.

@jdarwood007
Copy link
Copy Markdown
Member

I want to review this when I get time to do so.

I was thinking. Is there any difficulty splitting up boardmod, XML and diff logic into 3 classes that can then be used for parsing install/uninstall?
The generation of diff part can be left as separate classes, because we don't want to, but could add logic for building XML files then.

@Sesquipedalian Sesquipedalian force-pushed the 3.0/diff_support branch 2 times, most recently from 3847272 to 29e9fd3 Compare April 30, 2025 16:39
@Sesquipedalian
Copy link
Copy Markdown
Member Author

Sesquipedalian commented Apr 30, 2025

How to test:

  1. Checkout this branch.
  2. Create a new branch based on this branch.
  3. Apply this patch to the new branch using git am: SMF-a462a62-Dummy.patch
  4. Make a copy of your working tree so that you can run a comparison at the end.
  5. Switch back to this branch.
  6. Using the SMF Package Manager, install this package: DiffModTest.tar.gz
  7. Compare your working tree after installing the package to the copy of the working tree that you created in step 4. They should be the same.

Regarding the content of the package file

  1. The test package's diff file was created in two steps:
    1. First, I ran git diff -p -M -C -C -B --default-prefix --no-relative '3.0/diff_support'...HEAD > test.patch.
    2. Then, in order to demonstrate the use of our optional diff header (see documentation for the SMF\Diff\FullDiff::createFromRaw() method), I appended two additional diff blobs that are intentionally meant to fail gracefully and be skipped. The first of these tries to target non-existent content in an existing file (Config.php), while the second tries to target a non-existent file (derp.php). Because both of these blobs have the optional diff header, the Package Manager skips them when they fail.
  2. The binary file (Crash_Test_Dummy-4126344717.jpg) cannot be applied using the diff in the package, so it still has to be installed using package-info.xml's <require-file> directive and removed using <remove-file>.
  3. Note how the package creates a new PHP file, but that file does not need to be added using <require-file>. This is because the full contents of the file are included within the diff file and are created as part of applying the diff during package install. It is likewise removed when the diff is reverted during package uninstall.

Here's the list of files inside the package archive:

Crash_Test_Dummy-4126344717.jpg
license.txt
package-info.xml
readme.txt
test.patch

Here is the content of the test package's package-info.xml file:

<?xml version="1.0"?>
<!DOCTYPE package-info SYSTEM "http://www.simplemachines.org/xml/package-info">
<package-info xmlns="http://www.simplemachines.org/xml/package-info" xmlns:smf="http://www.simplemachines.org/">
	<name>DiffModTest</name>
	<id>Jon_Stovell:DiffModTest</id>
	<type>modification</type>
	<version>1.0</version>

	<install for="3.0 Alpha 2 - 3.0.99">
		<readme type="file" parsebbc="true">readme.txt</readme>
		<modification format="diff">test.patch</modification>
		<require-file name="Crash_Test_Dummy-4126344717.jpg" destination="$imagesdir" />
	</install>
	<uninstall for="3.0 Alpha 2 - 3.0.99">
		<modification format="diff" reverse="true">test.patch</modification>
		<remove-file name="$imagesdir/Crash_Test_Dummy-4126344717.jpg" />
	</uninstall>
</package-info>

Here's a screenshot showing the expected install actions for the test package:
Screenshot 2025-04-30 at 10 41 34 AM

Note how the package will create ./Sources/DummyClass.php even though there is no DummyClass.php file inside the archive. Also note the skipped changes to Config.php and derp.php. Those are the two diff blobs that have optional headers.

@Sesquipedalian
Copy link
Copy Markdown
Member Author

Sesquipedalian commented Apr 30, 2025

I was thinking. Is there any difficulty splitting up boardmod, XML and diff logic into 3 classes that can then be used for parsing install/uninstall?

We could, but I don't see much reason to do so. With this change, I fully expect the XML and boardmod formats to rapidly fall into disuse, because why would anyone want to bother with those labour-intensive formats when they can just run a diff command and be done? That means that there's not much point in putting more work into SMF\PackageUtils::parseModification() and SMF\PackageUtils::parseBoardMod().

Meanwhile, SMF\PackageUtils::parseDiff() is a relatively simple method. The heavy lifting is all done by SMF\FullDiff::createFromRaw(), whereas SMF\PackageUtils::parseDiff() just translates the result into an $actions array to be returned and, when appropriate, tells SMF\PackageUtils::packagePutContents() to actually make the changes in the file system. If we moved PackageUtils::parseDiff() into its own class, that class would just contain that one method and would need to call back to the other PackageUtils methods anyway.

The generation of diff part can be left as separate classes, because we don't want to, but could add logic for building XML files then.

We won't need to generate diffs for packages from within SMF, nor will we need to convert them into XML. The diffs will be created on developers' computers and then included in the package archives.

We do already have code for generating a diff between two strings. We do it by feeding the two strings to SMF\Diff::__construct(). We will use it in order to generate diffs for recording the edit history of posts, etc., once I push the PR for that. However, we don't need to generate diffs in the package manager; we just need to parse and apply them. So that ability isn't relevant in this PR.

@jdarwood007
Copy link
Copy Markdown
Member

We could, but I don't see much reason to do so. With this change, I fully expect the XML and boardmod formats to rapidly fall into disuse, because why would anyone want to bother with those labour-intensive formats when they can just run a diff command and be done? That means that there's not much point in putting more work into SMF\PackageUtils::parseModification() and SMF\PackageUtils::parseBoardMod().

We had discussed years ago about removing boardmod and ultimately kept it as it took no effort to maintain it. We have removed references to it in various places, and I don't think any mode since 1.x has used it. Its still around.

The idea of classes was to make it easier to maintain them, and also we can easily drop support or add support for other methods. Not that I expect any new support, really, since diff is widely popular and supported by version control software.

Copy link
Copy Markdown
Member

@jdarwood007 jdarwood007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All I had time to peak at.

Comment thread Sources/Diff/Diff.php Outdated
Comment thread Sources/Diff/FullDiff.php
@Oldiesmann
Copy link
Copy Markdown
Contributor

The only downside I see to this is manual installation. The XML format is relatively easy to read and is what people are used to. Reading diffs isn't that difficult once you know how, but everyone is used to the XML format, so we'll need to either implement our own diff parser on the mod site or point users to an option (Diffchecker, Online Diff or a similar option) so we can still support easy manual installation in the few instances where we still run into a conflict somewhere.

@live627
Copy link
Copy Markdown
Contributor

live627 commented May 1, 2025 via email

@jdarwood007
Copy link
Copy Markdown
Member

The mod site can easily be extended to support diffs. The core logic of the mod site already uses the SMF logic to parse out the changes. That would have to happen after we upgrade the main site to SMF 3.0.

@Sesquipedalian Sesquipedalian force-pushed the 3.0/diff_support branch 2 times, most recently from 0e94442 to e7622d7 Compare May 3, 2025 01:17
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
Signed-off-by: Jon Stovell <jonstovell@gmail.com>
@Sesquipedalian Sesquipedalian merged commit 64f08b5 into SimpleMachines:release-3.0 May 3, 2025
6 checks passed
@Sesquipedalian Sesquipedalian deleted the 3.0/diff_support branch May 3, 2025 02:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants