
AdultTime / Algolia API: Handle affiliate sites better#2709

Open
nrg101 wants to merge 3 commits into stashapp:master from nrg101:handle-affiliate-sites-better


@nrg101
Contributor

@nrg101 nrg101 commented Apr 2, 2026

Scraper type(s)

  • sceneByURL

Examples to test

Short description

The sites above are affiliate sites that are not officially run by the corresponding studio. However, they do consistently have the correct title, date, cover image, description, etc.

Rather than just xpath scraping the page at the affiliate site, this change does a simple scrape of the scene at the affiliate site, and then uses the title/slug and date to search Algolia API for the same scene. This results in a proper studio scene result including studio scene URL.
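The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the function names, the `title`, `date`, and `release_date` keys, and the `max_day_shift` parameter are all hypothetical, and the list of API hits is assumed to have been fetched already.

```python
import re
from datetime import date
from typing import List, Optional


def slugify(title: str) -> str:
    """Lowercase the title and collapse every non-alphanumeric run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def find_studio_scene(
    affiliate: dict, hits: List[dict], max_day_shift: int = 0
) -> Optional[dict]:
    """Return the first hit whose slug matches the affiliate scene's slug and
    whose release date is within max_day_shift days of the affiliate date.

    Both dict shapes are assumptions made for this sketch:
      affiliate -> {"title": str, "date": "YYYY-MM-DD"}
      hit       -> {"title": str, "release_date": "YYYY-MM-DD", ...}
    """
    wanted_slug = slugify(affiliate["title"])
    wanted_date = date.fromisoformat(affiliate["date"])
    for hit in hits:
        if slugify(hit["title"]) != wanted_slug:
            continue
        hit_date = date.fromisoformat(hit["release_date"])
        if abs((hit_date - wanted_date).days) <= max_day_shift:
            return hit
    return None  # no confident match; the caller decides how to fall back
```

With the default `max_day_shift=0` the dates must match exactly; a caller could loosen it if the affiliate copy's date is suspected to drift slightly from the studio's.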

@feederbox826
Member

I'm not sure how I feel about this. They are technically studio-run sites but "unofficial" and the parent site has been extremely... hostile to scrape

@nrg101
Contributor Author

nrg101 commented Apr 4, 2026

It was because of a comment by AdultSun at stashdb.org saying that a create Edit shouldn't include the secondary site, as it's an affiliate rather than the official studio site.

I couldn't really confirm one way or the other but the secondary site footers don't seem to have the same official ownership etc. mentioned.

I like the improvement here where a secondary site scene URL can effectively find the official studio API entry and therefore get all the info and official/primary scene URL, I think that part should stand.

The part about not adding secondary (promo/affiliate) site URLs when scraping the scene at the API, I think that could be debatable. I'd say it's harmless and the secondary sites are very consistent and usually provide much longer previews.

Maybe this needs escalating to AdultSun and the Ministry Of Truth?

@AdultSun

AdultSun commented Apr 5, 2026

These kinds of affiliate sites have been around for a long time. From my understanding, they are all built and managed by 3rd parties who get paid for every signup funneled through their referral links, which really isn't much different from Amazon's affiliate program. From memory, this particular network of affiliate sites (don't know the owner but they all use identical formatting) covers more than just Adult Time / Gamma, which isn't unusual at all for affiliate operators. I can't come up with any examples off the top of my head though.

At best, I think these can be considered semi-official, or maybe 2nd party sources. These ones appear to have permission to populate their sites with data through the studios' API. But for me, scraping them directly feels similar to scraping TPDB or Data18 instead of the studio directly: sure it'll be mostly the same since those databases scraped the studio too, but you also have to expect minor shifts in the data when you grab a copy of a copy. The edit nrg saw earlier was an example of that shift. I think scraping the affiliate grabbed a smaller cover image and the release date shifted by a day compared to scraping the studio directly. It was the same "birthday boy" link used at the top of this thread if you want to re-scrape to check though, I might have the details wrong.

On the StashDB side of things, the consensus early on was only to use these affiliate sites as a last resort. For example, if a scene was purged from a MindGeek site and we couldn't find anything through the Wayback Machine, a few editors started using an affiliate site called NewBrazz as a placeholder studio link and primary source. Again, their data mostly matched the source, but I remember their cover images specifically were a random gallery image instead. Pretty sure scene aliases were often different too, but I can't remember if the release dates were accurate or not.

In short, as a source they're often better than nothing, but I would never use them over an actual studio link. I'm also pretty sure most of the editors who have submitted these Adult Time affiliate links have zero idea they're not actually official, which makes sense since the affiliate operators try very hard to make them seem official too. But adding the affiliate URLs through the scraper just makes it seem like we think these are official too, which is misleading for both local scrapers and StashDB editors.

@AdultSun

For reference, I found an example of a TeamSkeet / Reptyle studio from what looks like the same affiliate network as the examples above:
https://mypervmom.com/

So this is just confirmation that these affiliate sites are not exclusive to Adult Time / Gamma studios and are definitely owned and operated by a 3rd party.


3 participants