Document Freedom Day 2025

Laura_Michaels · January 7, 2025, 6:31pm

The Digital Freedom Foundation (#DFF) is currently gearing up for Document Freedom Day 2025. #DFD2025 celebrates open standards, Free document and multimedia formats, and even CC and Public Domain content that uses open formats. Hope you’ll join in to celebrate on March 26th. More information at: About Document Freedom Day - Digital Freedom Foundation

nevj · January 7, 2025, 10:51pm

This is an important issue.
I wonder how we stand with software that is itself Open Source but stores its files in its own binary format. For example, the R language stores its workspace in a file called .RData, which is only readable by R. The format of .RData is not secret; one could decode it by hand if need be.
Are we to assume that because software is Open Source it will always be available to read its files?

I would think the above would be less of an issue than proprietary data files, which are both unreadable except by proprietary software, and secret.

As one who has spent countless hours in my remote past writing code to unscramble weird magnetic tape formats, I appreciate modern data portability and the standards that are required to obtain it.

Laura_Michaels · March 24, 2025, 12:43pm

Unfortunately, some FLOSS applications prefer to work with proprietary formats. It makes them more competitive with commercial projects. Luckily, since the application provides the source, someone can go in and use the source to access the format however they’d like. Some projects create their own Open format options with Open standards. LibreOffice will be talking about their ODT format on Document Freedom Day. Check their site for information on their virtual talks.

I think if a user is interested in freedom with regard to their data and how it’s stored, it’s important to choose Free/Open Source tools that provide access to that data in ways the user is most comfortable with. It’s interesting to see how many plain text, todo.txt and Org mode style software solutions there are out there. These types of programs give easy access to the actual data both through the software and through a standard text editor.

xahodo · March 24, 2025, 3:15pm

One annoying issue about LO Writer is that everything is stored in one XML file. Images? Base64 encoded, etc.

Why, oh, why can’t The Document Foundation (the maker of LibreOffice) use the zip format (which is used to compress documents) to its fullest and store images as separate files inside the zip archive?
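How a particular document is actually stored can be checked directly, since a regular .odt is a zip package whose entries can be listed with any zip tool. Below is a minimal Python sketch, assuming the file is a zipped ODF package and not the flat-XML .fodt variant (which is the single-file form with base64 data). The function names and the `Pictures/` prefix check are illustrative assumptions, not part of any LibreOffice API.

```python
import zipfile


def list_package_members(odt_file):
    """Return (name, compressed_size, uncompressed_size) for each zip entry.

    odt_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(odt_file) as package:
        return [
            (info.filename, info.compress_size, info.file_size)
            for info in package.infolist()
        ]


def embedded_images(odt_file):
    """Return entry names under Pictures/ (assumed image folder), skipping directory entries."""
    return [
        name
        for name, _, _ in list_package_members(odt_file)
        if name.startswith("Pictures/") and not name.endswith("/")
    ]
```

Calling `embedded_images("report.odt")` on a hypothetical file would show whether the images sit in the archive as separate entries. An empty list for a document that clearly contains pictures would suggest a different layout or a flat-XML file.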

Laura_Michaels · March 24, 2025, 3:27pm

The ebook format does that. Maybe a format like that would be more useful for you?

Laura_Michaels · March 24, 2025, 3:29pm

LibreOffice developers will be answering questions on Document Freedom Day. Maybe you can ask them in person? REMINDER: Document Freedom Day @ LibreOffice - The Document Foundation Blog

nevj · March 24, 2025, 10:24pm

I might have a bit of a look around plain text formats for documents and for data. They are by far the least troublesome. We need a list.

TeX and LaTeX, of course, are the classic case for documents.

Laura_Michaels · March 25, 2025, 3:59pm

I think I’ve already done some of the work on that. I was doing some research on it and we also had a nice discussion about it on the Software Freedom Day mailing list. The emails are archived if anyone wants to read them. I’ll share some of the links I found interesting:
plaintext-everything
plaintext productivity dot net
plain text accounting
recutils
I also have a list of some low dependency, Open Source C command line todo programs here: To Do Lists and Personal Information Managers. I have another list of some CSV related programs and resources too. Some people find HTML a useful alternative to LaTeX, especially if they already know the HTML syntax and can edit an HTML document in a text editor. I read about someone generating a book using HTML and CSS on the A List Apart blog. I have an article on generating PDF files from HTML. I’ve also experimented with creating ebooks using a text editor and a zip program.

nevj · March 25, 2025, 10:59pm

Thanks for the links.
I have used recutils.

A lot of the data I use is in an archaic punched card format with no spaces between the fields and no embedded decimal points.
One needs external information on where the fields are to be able to read it.
I usually read it with either R or a Fortran format statement and convert to spaced fields and insert decimal points.
You must have encountered punched card data somewhere? Any data before about 1970 would most likely be in that form.
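The conversion described above (slice fixed columns, then put the implied decimal point back) can be sketched in Python as well. This is only an illustration under assumptions: the `LAYOUT` table, its field names and column positions are hypothetical, standing in for the external information about where the fields are. It does not handle overpunched signs. An implied-decimal field is treated much as a Fortran F5.2 edit descriptor treats input that has no explicit decimal point.

```python
from decimal import Decimal
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    start: int     # 1-based first column, as on a card layout sheet
    width: int     # number of columns in the field
    decimals: int  # implied digits after the decimal point


# Hypothetical layout: this has to come from external documentation.
LAYOUT = [
    Field("plot", 1, 3, 0),
    Field("year", 4, 4, 0),
    Field("yield_t", 8, 5, 2),
]


def parse_card(line, layout=LAYOUT):
    """Split one card image into named fields and restore implied decimals."""
    record = {}
    for field in layout:
        first = field.start - 1
        digits = line[first:first + field.width].strip()
        if not digits:
            record[field.name] = None  # blank field: treated as missing here
            continue
        # Shift the decimal point left by the implied number of places.
        record[field.name] = Decimal(int(digits)).scaleb(-field.decimals)
    return record


def to_spaced(record):
    """Render a parsed record as space-separated text with explicit decimals."""
    return " ".join("NA" if value is None else str(value) for value in record.values())
```

With this layout, the card image `012196801234` splits into columns 1-3, 4-7 and 8-12, and the last field `01234` is read as 12.34.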

Punched cards were manipulated with what was called ‘Unit Record Equipment’: machines called tabulator, sorter, collator, reproducer and so on. They were programmed with plugboards, i.e. analog computers.
I think the name ‘recutils’ refers to the ‘unit record’ concept.
The techniques of sort, match and merge developed for cards are still used today, but in software of course.
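As a rough software picture of the match step, here is a minimal Python sketch that walks two decks in key order and pairs up records with equal keys. It assumes both inputs are already sorted by key and that each key appears at most once per deck; the function name and the record shape (key in the first position) are made up for illustration.

```python
def match_merge(left, right, key=lambda rec: rec[0]):
    """Yield (left_rec, right_rec) pairs with equal keys.

    Both inputs must already be sorted by key, with unique keys in each.
    """
    left_iter, right_iter = iter(left), iter(right)
    l_rec = next(left_iter, None)
    r_rec = next(right_iter, None)
    while l_rec is not None and r_rec is not None:
        l_key, r_key = key(l_rec), key(r_rec)
        if l_key < r_key:
            l_rec = next(left_iter, None)   # unmatched left record, skip it
        elif l_key > r_key:
            r_rec = next(right_iter, None)  # unmatched right record, skip it
        else:
            yield l_rec, r_rec              # keys match: emit the pair
            l_rec = next(left_iter, None)
            r_rec = next(right_iter, None)
```

Each deck is read once from front to back, which is why the sort has to happen first.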