[personal profile] shadowscast
Regarding getting the data from Yahoo groups:

I finally got the data I had requested directly from Yahoo by following their not-very-clear "request your data" instructions (which involve going through the "privacy dashboard"). I requested the data on 2019-10-16 and received it on 2019-10-26.

It arrived as a zip file, which, when unzipped, had a folder for each of the groups I was a member of (not just the one I moderated).

Each of the group folders had three folders in it: files, links, and messages (also zipped, initially).
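If there are a lot of groups, unzipping each inner archive by hand gets tedious. Here is a minimal Python sketch that extracts every zip file found under the unzipped export folder. It assumes the inner archives simply end in ".zip" (I have not confirmed the exact file names Yahoo uses), and the function name is made up for illustration.

```python
import zipfile
from pathlib import Path


def extract_nested_zips(root):
    """Extract every .zip found under `root` into a folder next to it.

    For example, <group>/messages.zip is unpacked into <group>/messages/.
    Returns the list of folders that were created.
    """
    extracted = []
    # sorted() collects the archive list up front, so zips that appear
    # during extraction are not picked up in the same pass.
    for archive in sorted(Path(root).rglob("*.zip")):
        target = archive.with_suffix("")  # drop the ".zip" suffix
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Calling `extract_nested_zips("path/to/unzipped/export")` once should leave a plain folder beside each inner zip file; the original archives are left in place.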

Unzipping the "files" folder yielded all of the files that had been in the group's files section, with subfolders intact. So that's pretty handy, I guess.

Unzipping the "links" folder yielded files which Windows tells me are of the type "internet shortcut", but which I cannot figure out how to open. Anyway, each one is about 200 bytes.
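A note on those files: Windows "internet shortcut" (.url) files are normally tiny plain-text files in INI format, with an [InternetShortcut] section containing a URL= line, so opening one in a text editor such as Notepad should show the link. The Python sketch below reads the address out of such a file. It assumes the files in the Yahoo export follow that usual layout, which I have not verified; the function name is hypothetical.

```python
import configparser


def read_shortcut_url(path):
    """Return the URL stored in a Windows .url shortcut, or None if absent."""
    # interpolation=None keeps '%' characters in URLs from being
    # treated as configparser placeholders.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # utf-8-sig tolerates a byte-order mark at the start of the file
    # (the encoding itself is an assumption).
    parser.read(path, encoding="utf-8-sig")
    return parser.get("InternetShortcut", "URL", fallback=None)
```

Running it over every file in the unzipped "links" folder would give a plain list of the group's saved links.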

Unzipping the "messages" folder yielded one file with a name like: 2215412.mbox.00001
(Update: Whoops, when I wrote that I had only looked at the smaller group. For the larger groups, there are multiple mbox files, with the suffixes 00001, 00002, etc. Each one had a maximum file size of about 10,244 KB—so I guess the bigger message archives got broken into chunks, which makes sense.)

At first I wasn't sure what that was, but after some googling I figured out that it was a saved-emails file format.

I downloaded Thunderbird, an email client, and followed the instructions found on this page:
https://www.wintips.org/how-to-open-mbox-files-in-thunderbird/

After doing that, I was able to view all of the group's email messages in Thunderbird.
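For anyone who would rather not install an email client, mbox files can also be read with Python's standard `mailbox` module. The sketch below walks every chunk in an unzipped "messages" folder in suffix order (00001, 00002, and so on) and lists each message's date, sender, and subject. The glob pattern is based on the file name shown above, and the function names are made up for illustration.

```python
import mailbox
from pathlib import Path


def iter_messages(messages_dir):
    """Yield every message from the mbox chunks in `messages_dir`, in order."""
    # Zero-padded suffixes (00001, 00002, ...) sort correctly as text.
    for chunk in sorted(Path(messages_dir).glob("*.mbox.*")):
        box = mailbox.mbox(str(chunk), create=False)
        try:
            for message in box:
                yield message
        finally:
            box.close()


def summarize(messages_dir):
    """Return a (date, sender, subject) tuple for each message."""
    return [
        (m.get("Date"), m.get("From"), m.get("Subject"))
        for m in iter_messages(messages_dir)
    ]
```

This only reads headers; it does not decode attachments or unusual character sets, so Thunderbird is still the easier way to actually browse the archive.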

Update: After writing this post, I saw that there's now a Yahoo Groups Fandom Rescue Project Tumblr, which has a post with essentially the same information I just discovered for myself: Yahoo Groups Deletion: Requesting Your Groups

But they mention that not every file from a group's "files" section is necessarily included in the export, so that's a warning.

And [personal profile] morgandawn has the same info in a Dreamwidth post: Yahoo Groups Deletion: Requesting Your Groups

Update 2: I think I've figured out what's up with the missing photos. See this comment below.

(no subject)

Date: 2019-10-28 01:06 am (UTC)
From: [personal profile] catwoman69y2k
Okay, this makes me feel better. Using the Yahoo tools (from the Get My Data section in Groups), I had the same problem: the zip file contained my files, but the photos zip file inside it was empty. Question: What did PGOffline do regarding the photos' captions? Or is that something you are going to have to manually pair back up (if you even want to) from one of the json files?

Since it seems more and more that PGOffline is the way people are going to have to go, would you have any background info on this tool as far as whether it can be trusted? One thing that has made me hesitant about which alternatives I employ is the fact that some of my groups' members are no longer with us, so I cannot get consent from all my members regarding the possibility of sharing or reposting the content. However, I *do* need to make sure I get all the photos before 12/14.

(no subject)

Date: 2019-10-28 05:20 am (UTC)
From: [personal profile] catwoman69y2k
One more question... were you ever working with PGOffline and a private group? This is one concern I have, as the groups I'm trying to save did get put under the adult classification (per Yahoo policy at some of those evolutions in the timeline). I think that is how we became private. I realize that could be a problem for some tools (and I'm wondering if that's somehow why photos were the one thing that Yahoo's data export couldn't retrieve for me).
